Re: [HACKERS] 10 beta docs: different replication solutions
On Sun, Jul 30, 2017 at 8:34 PM, Steve Singer wrote: > > We don't seem to describe logical replication on > > https://www.postgresql.org/docs/10/static/different-replication-solutions.html > > The attached patch adds a section. This is a good catch. Two quick observations: 1) Super pedantic point: I don't like the 'repl.' abbreviation in the 'most common implementation' both for the existing hs/sr and for the newly added logical. 2) This lingo: + Logical replication allows the data changes from individual tables + to be replicated. Logical replication doesn't require a particular server + to be designated as a master or a slave but allows data to flow in multiple + directions. For more information on logical replication, see . Is good, but I would revise it just a bit to emphasize the subscription nature of logical replication to link the concepts expressed strongly in the main section. For example: Logical replication allows the data changes [remove: "from individual tables to be replicated"] to be published to subscriber nodes. Data can flow in any direction between nodes on a per-table basis; there is no concept of a master server. Conflict resolution must be handled completely by the application. For more information on... what do you think? merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add Roman numeral conversion to to_number
On Mon, Aug 14, 2017 at 2:48 PM, Peter Eisentraut wrote: > On 8/3/17 13:45, Robert Haas wrote: >> On Thu, Aug 3, 2017 at 9:25 AM, Oliver Ford wrote: >>> Adds to the to_number() function the ability to convert Roman numerals >>> to a number. This feature is on the formatting.c TODO list. It is not >>> currently implemented in either Oracle, MSSQL or MySQL so gives >>> PostgreSQL an edge :-) >> I kind of put my head in my hands when I saw this. I'm not really >> sure it's worth complicating the code for something that has so little >> practical utility, but maybe other people will feel differently. > > I can't get excited about it. to_number() and such usually mirror the > Oracle implementation, so having something that is explicitly not in > Oracle goes a bit against its mission. > > One of the more interesting features of to_number/to_char is that it has > a bunch of facilities for formatting decimal points, leading/trailing > zeros, filling in spaces and signs, and so on. None of that applies > naturally to Roman numerals, so there isn't a strong case for including > that into these functions, when a separate function or module could do. Well, doesn't that also apply to scientific notation ()? 'RN' is documented as an accepted formatting string, and nowhere does it mention that it only works for input. So we ought to allow for it to be fixed or at least document that it does not work. It's nothing but a curio obviously, but it's kind of cool IMO. merlin
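For what it's worth, a rough standalone sketch (Python, purely illustrative, not the formatting.c code) of the subtractive parsing that an 'RN' input conversion would need:

```python
# Hypothetical sketch of Roman-numeral input conversion, the operation
# to_number(..., 'RN') would perform.  Not the PostgreSQL implementation.
ROMAN_VALUES = {'I': 1, 'V': 5, 'X': 10, 'L': 50, 'C': 100, 'D': 500, 'M': 1000}

def roman_to_int(s: str) -> int:
    """Convert a Roman numeral to an integer using the subtractive rule."""
    total = 0
    prev = 0
    for ch in reversed(s.upper()):
        val = ROMAN_VALUES[ch]
        if val < prev:
            total -= val      # e.g. the I in IV or IX
        else:
            total += val
            prev = val
    return total

print(roman_to_int('XIV'))     # 14
print(roman_to_int('MMXVII'))  # 2017
```

A real implementation would also have to reject malformed input (e.g. 'IIII' vs 'IV'), which is where most of the complexity Robert worried about lives.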
Re: [HACKERS] emergency outage requiring database restart
On Thu, Aug 10, 2017 at 12:01 PM, Ants Aasma <ants.aa...@eesti.ee> wrote: > On Wed, Jan 18, 2017 at 4:33 PM, Merlin Moncure <mmonc...@gmail.com> wrote: >> On Wed, Jan 18, 2017 at 4:11 AM, Ants Aasma <ants.aa...@eesti.ee> wrote: >>> On Wed, Jan 4, 2017 at 5:36 PM, Merlin Moncure <mmonc...@gmail.com> wrote: >>>> Still getting checksum failures. Over the last 30 days, I see the >>>> following. Since enabling checksums FWICT none of the damage is >>>> permanent and rolls back with the transaction. So creepy! >>> >>> The checksums still only differ in least significant digits which >>> pretty much means that there is a block number mismatch. So if you >>> rule out filesystem not doing its job correctly and transposing >>> blocks, it could be something else that is resulting in blocks getting >>> read from a location that happens to differ by a small multiple of >>> page size. Maybe somebody is racily mucking with table fd's between >>> seeking and reading. That would explain the issue disappearing after a >>> retry. >>> >>> Maybe you can arrange for the RelFileNode and block number to be >>> logged for the checksum failures and check what the actual checksums >>> are in data files surrounding the failed page. If the requested block >>> number contains something completely else, but the page that follows >>> contains the expected checksum value, then it would support this >>> theory. >> >> will do. Main challenge is getting hand compiled server to swap in >> so that libdir continues to work. Getting access to the server is >> difficult as is getting a maintenance window. I'll post back ASAP. > > As a new datapoint, we just had a customer with an issue that I think > might be related. The issue was reasonably repeatable by running a > report on the standby system. 
Issue manifested itself by first "could > not open relation" and/or "column is not in index" errors, followed a > few minutes later by a PANIC from startup process due to "specified > item offset is too large", "invalid max offset number" or "page X of > relation base/16384/1259 is uninitialized". I took a look at the xlog > dump and it was completely fine. For instance in the "specified item > offset is too large" case there was a INSERT_LEAF redo record > inserting the preceding offset just a couple hundred kilobytes back. > Restarting the server sometimes successfully applied the offending > WAL, sometimes it failed with other corruption errors. The offending > relations were always pg_class or pg_class_oid_index. Replacing plsh > functions with dummy plpgsql functions made the problem go away, > reintroducing plsh functions made it reappear. Fantastic. I was never able to attempt to apply the O_CLOEXEC patch (see upthread) due to the fact that access to the system is highly limited and compiling a replacement binary was a bit of a headache. IIRC this was the best theory on the table as to the underlying cause and we ought to try that first, right? Reminder: I was able to completely eliminate all damage (but had to handle occasional unexpected rollback) via enabling checksums. merlin
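For anyone following along: the reason a block-number mismatch shows up as a near-miss checksum is that the page checksum folds the block number in, so a page read back from the wrong offset fails verification even though its bytes are internally intact. A toy Python sketch of the effect (CRC used as a stand-in for the real FNV-based algorithm in checksum_impl.h):

```python
import zlib

def page_checksum(data: bytes, blkno: int) -> int:
    # Toy model: mix the block number into the page checksum, as the real
    # algorithm does.  Illustrative only, not PostgreSQL's actual math.
    return (zlib.crc32(data) ^ blkno) & 0xFFFF

page = b'some page image' * 100
stored = page_checksum(page, blkno=7)

# The same bytes interpreted as a neighbouring block: the stored and
# computed checksums differ only in the low bits (7 ^ 8 = 15), matching
# the "differ in least significant digits" symptom described upthread.
misread = page_checksum(page, blkno=8)
print(stored != misread, stored ^ misread)  # True 15
```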
Re: [HACKERS] JSONB - JSONB operator feature request
On Tue, Jul 18, 2017 at 12:49 PM, David Fetter wrote: > On Tue, Jul 18, 2017 at 01:36:32PM +0200, david.tu...@linuxbox.cz wrote: >> Hi, >> >> some users and me used hstore - hstore for example storing only changed >> rows in trigger like: >> >> hsore(NEW) - hstore(OLD) >> >> There isn't same operator/function in JSON/JSONB. We can only remove keys >> from JSONB, but not equal key-value pairs. Is there any chance to have >> same feature with JSON/JSONB in postgres core? > > Here's one slightly modified from > http://coussej.github.io/2016/05/24/A-Minus-Operator-For-PostgreSQLs-JSONB/ > > CREATE OR REPLACE FUNCTION jsonb_minus ( arg1 jsonb, arg2 jsonb ) > RETURNS jsonb > LANGUAGE sql > AS $$ > SELECT > COALESCE(json_object_agg( > key, > CASE > -- if the value is an object and the value of the second argument > is > -- not null, we do a recursion > WHEN jsonb_typeof(value) = 'object' AND arg2 -> key IS NOT NULL > THEN jsonb_minus(value, arg2 -> key) > -- for all the other types, we just return the value > ELSE value > END > ), '{}')::jsonb > FROM > jsonb_each(arg1) > WHERE > arg1 -> key IS DISTINCT FROM arg2 -> key > $$; > > CREATE OPERATOR - ( > PROCEDURE = jsonb_minus, > LEFTARG = jsonb, > RIGHTARG = jsonb > ); > > I suspect that there's a faster way to do the jsonb_minus function > internally. yes, please! I also sorely miss the hstore 'slice' function which is very similar. The main remaining disadvantage with jsonb WRT hstore is that you can't do simple retransformations that these operations allow for. Too often you end up doing multiple '->' operations against the same object followed by a rebundling which is a real performance killer. I understand that there are more edge cases due to the flexible json structure but I'd be quite happy returning NULL or erroring when you can't arrive at a sensible extraction.
merlin
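To illustrate the intended semantics, here is a sketch of the same recursive minus over plain Python dicts (illustrative only, not the proposed internal implementation):

```python
def jsonb_minus(a: dict, b: dict) -> dict:
    """Keep the pairs of a whose values differ from b, recursing into
    nested objects -- mirroring the SQL jsonb_minus above."""
    out = {}
    for key, value in a.items():
        if key in b and b[key] == value:
            continue  # identical key/value pair: dropped, like hstore '-'
        if isinstance(value, dict) and isinstance(b.get(key), dict):
            out[key] = jsonb_minus(value, b[key])  # recurse into objects
        else:
            out[key] = value
    return out

# The audit-trigger use case: store only what changed between OLD and NEW.
old = {'id': 1, 'name': 'ann', 'addr': {'city': 'x', 'zip': '1'}}
new = {'id': 1, 'name': 'bob', 'addr': {'city': 'x', 'zip': '2'}}
print(jsonb_minus(new, old))  # {'name': 'bob', 'addr': {'zip': '2'}}
```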
Re: [HACKERS] lag(bigint,int,int), etc?
On Tue, Jun 27, 2017 at 10:01 AM, Colin 't Hart wrote: > Hi, > > The following rather contrived example illustrates that lag(), lead() > (and probably other functions) can't automatically cast an integer to > a bigint: > > select lag(sum,1,0) over () from (select sum(generate_series) over > (order by generate_series) from generate_series(1,10)) x; > ERROR: function lag(bigint, integer, integer) does not exist > LINE 1: select lag(sum,1,0) over () from (select sum(generate_series... >^ > HINT: No function matches the given name and argument types. You > might need to add explicit type casts. > > > I guess this is because the lag() and lead() functions take any type, > and hence the default must be of the same type. > This had me stumped for a few while until I realised that the types > were different. > > Would there be any way to implement an automatic conversion? > > On the off-chance that this is actually a bug, this is on 9.6.3, but > it also occurs on 9.3.17 Why not cast the arguments? The first and the third argument have to be the same, and the second argument is always int. merlin
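In other words, the immediate fix is an explicit cast on the default, e.g. lag(sum, 1, 0::bigint). A small Python sketch of lag()'s semantics shows where the default slots in: it replaces rows that have no predecessor, which is exactly why it must share the column's type:

```python
def lag(values, offset=1, default=None):
    # Sketch of window lag(): each row sees the value `offset` rows back;
    # rows with no predecessor get `default` -- hence default and column
    # must be the same type in SQL's strictly-typed version.
    return [values[i - offset] if i - offset >= 0 else default
            for i in range(len(values))]

running_sums = [1, 3, 6, 10]       # the bigint running sums from the example
print(lag(running_sums, 1, 0))     # [0, 1, 3, 6]
```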
Re: [HACKERS] Segmentation fault in libpq
On Thu, Jun 29, 2017 at 8:23 AM, Michal Novotny <michal.novo...@greycortex.com> wrote: > Hi, > > comments inline ... > > > > On 06/29/2017 03:08 PM, Merlin Moncure wrote: >> >> On Thu, Jun 29, 2017 at 4:01 AM, Michal Novotny >> <michal.novo...@greycortex.com> wrote: >>> >>> Hi all, >>> >>> we've developed an application using libpq to access a table in the PgSQL >>> database but we're sometimes experiencing segmentation fault on >>> resetPQExpBuffer() function of libpq called from PQexecParams() with >>> prepared query. >>> >>> PostgreSQL version is 9.6.3 and the backtrace is: >>> >>> Core was generated by `/usr/ti/bin/status-monitor2 -m >>> /usr/lib64/status-monitor2/modules'. >>> Program terminated with signal 11, Segmentation fault. >>> #0 resetPQExpBuffer (str=str@entry=0x9f4a28) at pqexpbuffer.c:152 >>> 152 str->data[0] = '\0'; >>> >>> Thread 1 (Thread 0x7fdf68de3840 (LWP 3525)): >>> #0 resetPQExpBuffer (str=str@entry=0x9f4a28) at pqexpbuffer.c:152 >>> No locals. >>> #1 0x7fdf66e0333d in PQsendQueryStart (conn=conn@entry=0x9f46d0) at >>> fe-exec.c:1371 >>> No locals. >>> #2 0x7fdf66e044b9 in PQsendQueryParams (conn=conn@entry=0x9f46d0, >>> command=command@entry=0x409a98 "SELECT min, hour, day, month, dow, >>> sensor, >>> module, params, priority, rt_due FROM sm.cron WHERE sensor = $1 ORDER BY >>> priority DESC", nParams=nParams@entry=1, paramTypes=paramTypes@entry=0x0, >>> paramValues=paramValues@entry=0xa2b7b0, >>> paramLengths=paramLengths@entry=0x0, >>> paramFormats=paramFormats@entry=0x0, resultFormat=resultFormat@entry=0) >>> at >>> fe-exec.c:1192 >>> No locals. >>> #3 0x7fdf66e0552b in PQexecParams (conn=0x9f46d0, command=0x409a98 >>> "SELECT min, hour, day, month, dow, sensor, module, params, priority, >>> rt_due >>> FROM sm.cron WHERE sensor = $1 ORDER BY priority DESC", nParams=1, >>> paramTypes=0x0, paramValues=0xa2b7b0, paramLengths=0x0, paramFormats=0x0, >>> resultFormat=0) at fe-exec.c:1871 >>> No locals. 
>>> >>> Unfortunately we didn't have more information from the crash, at least >>> for >>> now. >>> >>> Is this a known issue and can you help me with this one? >> >> Is your application written in C? We would need to completely rule >> out your code (say, by double freeing result or something else nasty) >> before assuming problem was withing libpq itself, particularly in this >> area of the code. How reproducible is the problem? >> >> merlin > > > The application is written in plain C. The issue is it happens just > sometimes - sometimes it happens and sometimes it doesn't. Once it happens > it causes the application crash but as it's systemd unit with > Restart=on-failure flag it's automatically being restarted. > > What's being done is: > 1) Ensure connection already exists and create a new one if it doesn't exist > yet > 2) Run PQexecParams() with specified $params that has $params_cnt elements: > > res = PQexecParams(conn, prepared_query, params_cnt, NULL, (const char > **)params, NULL, NULL, 0); > > 3) Check for result and report error and exit if "PQresultStatus(res) != > PGRES_TUPLES_OK" > 4) Do some processing with the result > 5) Clear result using PQclear() > > It usually works fine but sometimes it's crashing and I don't know how to > investigate further. > > Could you please help me based on information provided above? You might want to run your code through some analysis tools (for example, valgrind). Short of that, to get help here you need to post the code for review. How big is your application? merlin
Re: [HACKERS] Segmentation fault in libpq
On Thu, Jun 29, 2017 at 4:01 AM, Michal Novotny wrote: > Hi all, > > we've developed an application using libpq to access a table in the PgSQL > database but we're sometimes experiencing segmentation fault on > resetPQExpBuffer() function of libpq called from PQexecParams() with > prepared query. > > PostgreSQL version is 9.6.3 and the backtrace is: > > Core was generated by `/usr/ti/bin/status-monitor2 -m > /usr/lib64/status-monitor2/modules'. > Program terminated with signal 11, Segmentation fault. > #0 resetPQExpBuffer (str=str@entry=0x9f4a28) at pqexpbuffer.c:152 > 152 str->data[0] = '\0'; > > Thread 1 (Thread 0x7fdf68de3840 (LWP 3525)): > #0 resetPQExpBuffer (str=str@entry=0x9f4a28) at pqexpbuffer.c:152 > No locals. > #1 0x7fdf66e0333d in PQsendQueryStart (conn=conn@entry=0x9f46d0) at > fe-exec.c:1371 > No locals. > #2 0x7fdf66e044b9 in PQsendQueryParams (conn=conn@entry=0x9f46d0, > command=command@entry=0x409a98 "SELECT min, hour, day, month, dow, sensor, > module, params, priority, rt_due FROM sm.cron WHERE sensor = $1 ORDER BY > priority DESC", nParams=nParams@entry=1, paramTypes=paramTypes@entry=0x0, > paramValues=paramValues@entry=0xa2b7b0, paramLengths=paramLengths@entry=0x0, > paramFormats=paramFormats@entry=0x0, resultFormat=resultFormat@entry=0) at > fe-exec.c:1192 > No locals. > #3 0x7fdf66e0552b in PQexecParams (conn=0x9f46d0, command=0x409a98 > "SELECT min, hour, day, month, dow, sensor, module, params, priority, rt_due > FROM sm.cron WHERE sensor = $1 ORDER BY priority DESC", nParams=1, > paramTypes=0x0, paramValues=0xa2b7b0, paramLengths=0x0, paramFormats=0x0, > resultFormat=0) at fe-exec.c:1871 > No locals. > > Unfortunately we didn't have more information from the crash, at least for > now. > > Is this a known issue and can you help me with this one? Is your application written in C?
We would need to completely rule out your code (say, by double freeing result or something else nasty) before assuming problem was within libpq itself, particularly in this area of the code. How reproducible is the problem? merlin
Re: [BUGS] [HACKERS] Segmentation fault in libpq
On Thu, Jun 29, 2017 at 9:12 AM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Merlin Moncure <mmonc...@gmail.com> writes: >> On Thu, Jun 29, 2017 at 8:23 AM, Michal Novotny >> <michal.novo...@greycortex.com> wrote: >>> Could you please help me based on information provided above? > >> You might want to run your code through some analysis tools (for >> example, valgrind). > > valgrind is not a perfect tool for finding that kind of problem, > especially if you can't reproduce the crash reliably; but at least > valgrind is readily available and easy to use, so you might as > well start there and see if it finds anything. If you have access > to any sort of static analysis tool (eg, Coverity), that might be > more likely to help. Or you could fall back on manual code > auditing, if the program isn't very big. clang static analyzer is another good tool to check out https://clang-analyzer.llvm.org/ merlin
Re: [HACKERS] [BUGS] Crash observed during the start of the Postgres process
On Tue, Apr 25, 2017 at 8:44 AM, K S, Sandhya (Nokia - IN/Bangalore) wrote: > Hello, > > Did you get a chance to take a look into the issue? > > Please consider it with high priority. We will be awaiting your inputs. This email is heavily cross posted, which is obnoxious. Can you paste the relevant log snippet? merlin
Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan
On Tue, Sep 19, 2017 at 1:37 PM, Robert Haas wrote: > On Tue, Sep 19, 2017 at 12:45 PM, Pavel Stehule > wrote: >>> You can already set a GUC with function scope. I'm not getting your >>> point. >> >> yes, it is true. But implementation of #option is limited to PLpgSQL - so >> there is not any too much questions - GUC is global - there is lot of >> points: >> >> * what is correct impact on PREPARE >> * what is correct impact on EXECUTE >> * what should be done if this GUC is changed .. > > For better or for worse, as a project we've settled on GUCs as a way > to control behavior. I think it makes more sense to try to apply that > option to new behaviors we want to control than to invent some new > system. This seems very sensible. We also have infrastructure at the SQL level (SET) to manage the GUC. Tom upthread weighed in (for pretty good reasons) on extending SET to pl/pgsql-specific scoping, but TBH I'm struggling as to why we need to implement new syntax for this; the only thing missing is being able to scope SET statements to a code block FWICT. merlin
Re: [HACKERS] PoC plpgsql - possibility to force custom or generic plan
On Fri, Sep 8, 2017 at 2:48 PM, Pavel Stehule wrote: > > > 2017-09-08 21:21 GMT+02:00 Daniel Gustafsson : >> >> > On 08 Sep 2017, at 19:14, Simon Riggs wrote: >> > >> > On 6 September 2017 at 07:43, Robert Haas wrote: >> > >> >> LET custom_plan_tries = 0 IN SELECT ... >> > >> > Tom has pointed me at this proposal, since on another thread I asked >> > for something very similar. (No need to reprise that discussion, but I >> > wanted prepared queries to be able to do SET work_mem = X; SELECT). >> > This idea looks a good way forward to me. >> > >> > Since we're all in roughly the same place, I'd like to propose that we >> > proceed with the following syntax... whether or not this precisely >> > solves OP's issue on this thread. >> > >> > 1. Allow SET to set multiple parameters... >> > SET guc1 = x, guc2 = y >> > This looks fairly straightforward >> > >> > 2. Allow a SET to apply only for a single statement >> > SET guc1 = x, guc2 = y FOR stmt >> > e.g. SET max_parallel_workers = 4 FOR SELECT count(*) FROM bigtable >> > Internally a GUC setting already exists for a single use, via >> > GUC_ACTION_SAVE, so we just need to invoke it. >> >> This syntax proposal makes sense, +1. My immediate thought was that the >> per-statement GUCs were sort of like options, and most options in our >> syntax >> are enclosed with (), like: SET (guc1 = x, guc2 = y) FOR SELECT ..; > > we newer support this syntax in combination with SET keyword > > see - CREATE FUNCTION command > > personally I prefer syntax without FOR keyword - because following keyword > must be reserved keyword > > SET x = .., y = .. SELECT ... ; This seems pretty ugly from a syntax perspective. We already have 'SET LOCAL', which manages scope to the current transaction. How about SET BLOCK which would set until you've left the current statement block? merlin
Re: [HACKERS] WIP: Aggregation push-down
On Thu, Aug 17, 2017 at 10:22 AM, Antonin Houska wrote: > Antonin Houska wrote: > output type. For other aggregates (like avg()) the remote nodes will have to > return the transient state value in an appropriate form (maybe bytea type), > which does not depend on PG version. Hm, that seems like an awful lot of work (new version agnostic serialization format) for very little benefit (version independent type serialization for remote aggregate pushdown). How about forcing the version to be the same for the feature to be used? merlin
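For context, a minimal sketch of why avg() can't just ship its output type: partial aggregation has to move the transient state (a sum/count pair) between nodes, and it is the serialization of that state that would have to stay version-stable. Hedged Python illustration, not PostgreSQL code:

```python
def avg_transfn(state, value):
    """Accumulate avg()'s transient state: a (sum, count) pair."""
    s, n = state
    return (s + value, n + 1)

def avg_combinefn(a, b):
    """Merge two partial states, as the pushed-down aggregate would after
    receiving serialized state from remote nodes."""
    return (a[0] + b[0], a[1] + b[1])

def avg_finalfn(state):
    """Produce the final output value from the combined state."""
    s, n = state
    return s / n

# Two "remote nodes" each aggregate locally, then ship their state up.
# Note neither partial result alone is an average -- only the combined
# (sum, count) state can yield the correct answer.
node1 = avg_transfn(avg_transfn((0, 0), 10), 20)   # state (30, 2)
node2 = avg_transfn((0, 0), 30)                    # state (30, 1)
print(avg_finalfn(avg_combinefn(node1, node2)))    # 20.0
```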
Re: [HACKERS] SQL procedures
On Wed, Nov 8, 2017 at 9:13 AM, Peter Eisentraut wrote: > I have already submitted a separate patch that addresses these questions. Maybe I'm obtuse, but I'm not seeing it? I'm very interested in the general approach to transaction management; if you've described it in the patch I'll read it there. Thanks for doing this. merlin
Re: [HACKERS] SQL procedures
On Tue, Oct 31, 2017 at 12:23 PM, Peter Eisentraut wrote: > - Transaction control in procedure bodies This feature is really key, since it enables via SQL lots of things that are not possible without external coding, including: *) very long running processes in a single routine *) transaction isolation control inside the procedure (currently client app has to declare this) *) certain error handling cases that require client side support *) simple in-database threading *) simple construction of daemon scripts (yeah, you can use bgworker for this, but pure sql daemon with a cron heartbeat hook is hard to beat for simplicity) I do wonder how transaction control could be added later. The last time I (lightly) looked at this, I was starting to think that working transaction control into the SPI interface was the wrong approach; pl/pgsql would have to adopt a very different set of behaviors depending on whether it was called in a function or a proc. If you restricted language choice to purely SQL, you could work around this problem; SPI languages would be totally abstracted from those sets of considerations and you could always call an arbitrary language function if you needed to. SQL has no flow control but I'm not too concerned about that. merlin
Re: [HACKERS] SQL procedures
On Wed, Nov 8, 2017 at 11:03 AM, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote: > On 11/8/17 11:11, Merlin Moncure wrote: >> On Wed, Nov 8, 2017 at 9:13 AM, Peter Eisentraut >> <peter.eisentr...@2ndquadrant.com> wrote: >>> I have already submitted a separate patch that addresses these questions. >> >> Maybe I'm obtuse, but I'm not seeing it? In very interested in the >> general approach to transaction management; if you've described it in >> the patch I'll read it there. Thanks for doing this. > > https://www.postgresql.org/message-id/178d3380-0fae-2982-00d6-c43100bc8...@2ndquadrant.com All right, thanks. So, *) Are you sure you want to go the SPI route? 'sql' language (non-spi) procedures might be simpler from implementation standpoint and do not need any language adjustments? *) Is it possible to jump into SPI without having a snapshot already set up? For example, if I wanted to set isolation level in a procedure, would I get impacted by this error? ERROR: SET TRANSACTION ISOLATION LEVEL must be called before any query merlin
Re: [HACKERS] PostgreSQL port to pure Java?
Robert Treat wrote: Someone did it but it didn't catch fire. I think what will catch fire in a big way is plphp. Managers will like an all php platform that is extremely capable and productive. Developers will enjoy php's natural syntax and agnostic approach to programming. PHP5, when it becomes production ready, will offer high level language features that compete with Java, C#, and Object Pascal. Merlin ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Encoding problems in PostgreSQL with XML data
Peter Eisentraut wrote: The central problem I have is this: How do we deal with the fact that an XML datum carries its own encoding information? Maybe I am misunderstanding your question, but IMO postgres should be treating xml documents as if they were binary data, unless the server takes on the role of a parser, in which case it should handle unspecified/unknown encodings just like a normal xml parser would (and this does *not* include changing the encoding!). According to me, an XML parser should not change one bit of a document, because that is not a 'parse', but a 'transformation'. Rewriting the <?xml?> declaration seems like a workable solution, but it would break the transparency of the client/server encoding conversion. Also, some people might dislike that their documents are being changed as they are stored. Right, your example begs the question: why does the server care what the encoding of the documents is (perhaps indexing)? XML validation is a standardized operation which the server (or psql, I suppose) can subcontract out to another application. Just a side thought: what if the xml encoding type was built into the domain type itself? create domain xml_utf8 ... Which allows casting, etc. which is more natural than an implicit transformation. Regards, Merlin
Re: [HACKERS] Encoding problems in PostgreSQL with XML data
Andrew Dunstan wrote: I think I agree with Rod's opinion elsewhere in this thread. I guess the philosophical question is this: If 2 XML documents with different encodings have the same canonical form, or perhaps produce the same DOM, are they equivalent? Merlin appears to want to say no, and I think I want to say yes. Er, yes, except for canonical XML. Canonical XML neatly bypasses all the encoding issues that I can see. Maybe I am still not getting the basic point, but the part I was not quite clear on is why the server would need to parse the document at all, much less change the encoding. Sure, it doesn't necessarily hurt to do it, but why bother? An external parser could handle both the parsing and the validation. Reading Peter's post, he seems to be primarily concerned with an automatic XML validation trigger that comes built in with the XML 'type'. *unless* 1. The server needs to parse the document and get values from the document for indexing/key generation purposes, now the encoding becomes very important (especially considering joins between XML to non XML data types). 2. There are plans to integrate Xpath expressions into queries. 3. The server wants to compose generated XML documents from stored XML/non XML sources, with (substantial) additions to the query language to facilitate this, i.e. a nested data extraction replacement for psql. But, since I'm wishing for things, I may as well ask for a hockey rink in my living room :) Merlin
Re: [HACKERS] Encoding problems in PostgreSQL with XML data
Hannu Krosing wrote: IIRC, the charset transformations are done as a separate step in the wire protocol _before_ any parser has chance transform or not. Yep. My point is that this is wrong. I think of XML the same way I think of a zip file containing a text document. Postgres does not unzip a text file to change the char encoding any more than it should parse an XML document and change the encoding unless this is the specific intent of the user for a specific purpose. Validation alone does not qualify as a reason because an XML parser (xerces) can do validation server-side without mucking with the document. Postgres need only be aware of the fact that the data is XML and should be validated. If postgres needs to be aware of internal document contents (for indexing, for example), XSLT can be used for that purpose. Regards, Merlin
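A minimal sketch of the "be aware of, but don't touch" approach being argued for here: peek at the declared encoding in the <?xml?> declaration without transforming the document. Illustrative Python only, not a proposal for the server code:

```python
import re

def declared_encoding(doc: bytes) -> str:
    # Read the encoding out of the <?xml ... encoding="..."?> declaration
    # without modifying a single byte of the stored document.
    m = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', doc)
    return m.group(1).decode('ascii') if m else 'UTF-8'  # XML's default

doc = b'<?xml version="1.0" encoding="ISO-8859-1"?>\n<d/>'
print(declared_encoding(doc))      # ISO-8859-1
print(declared_encoding(b'<d/>'))  # UTF-8
```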
Re: [HACKERS] [pgsql-hackers-win32] Win32 signal code - first try
Claudio Natoli wrote: FWIW, in a multithreaded version of postgres I'm fooling around with, I replaced the recv call (where backends spend most of their time waiting) with a select(small timeout)/SleepEx(0) busy loop, which calls recv when ready. Works just fine. Ok, that makes perfect sense. Simply checking pending signals in this loop and just after a command is received will catch most of them, and provide a suitable testing platform. IMO, it's time for a second run of the code, and a functional test which simulates the command processing loop which should include: 1. setjmp/longjmp stack manipulation (i.e. ELOG) 2. in process/out of process generated signals 3. all thread mechanisms. under heavy load conditions. We should be especially watching for deadlocks, stack corruption, and memory leaks... If everything goes ok, I think we'll have a good 'proof of concept' signaling mechanism. After that, it's time to start submitting patches to the hackers for review... Merlin
Re: [HACKERS] Encoding problems in PostgreSQL with XML data
Hannu Krosing wrote: In that case, treat the XML document like a binary stream, using PQescapeBytea, etc. to encode if necessary pre-query. Also, the XML domain should inherit from bytea, not varchar. why ? the allowed characters repertoire in XML is even less than in varchar. Yes, that is correct. I was resorting to hyperbole...see my reasoning below. The document should be stored bit for bit as was submitted. Or in some pre-parsed form which allows restoration of submitted form, which could be more for things like xpath queries or subtree extraction. This is the crucial point: I'll try and explain my thinking better. OTOH, if we are transforming the document down to a more generic format (either canonical or otherwise), then the xml could be dealt with like text in the usual way. Of course, then we are not really storing xml, more like 'meta' xml ;) On the contrary! If there is DTD or Schema or other structure definition for XML, then we know which whitespace is significant and can do whatever we like with insignificant whitespace. According to the XML standard, whitespace is always significant unless it is outside an element or attribute and thus not part of the real data. A DTD or Schema adds constraints, not removes them. I'm nitpicking, but this is extra evidence to my philosophy of xml storage that I'll explain below. select '<d/>'::xml == '<?xml version="1.0" encoding="utf-8"?>\n<d/>\n'::xml Right: I understand your reasoning here. Here is the trick: select '[...]'::xml introduces a casting step which justifies a transformation. The original input data is not xml, but varchar. Since there are no arbitrary rules on how to do this, we have some flexibility here to do things like change the encoding/mess with the whitespace. I am trying to find a way to break the assumption that my xml data necessarily has to be converted from raw text. My basic point is that we are confusing the roles of storing and parsing/transformation.
The question is: are we storing xml documents or the metadata that makes up xml documents? We need to be absolutely clear on which role the server takes on...in fact both roles may be appropriate for different situations, but should be represented by a different type. I'll try and give examples of both situations.

If we are strictly storing documents, IMO the server should perform zero modification on the document. Validation could be applied conceptually as a constraint (and, possibly, XSLT/XPath to allow a fancy type of indexing). However there is no advantage that I can see to manipulating the document except to break the 'C' of ACID. My earlier comments wrt binary encoding are that there simply has to be a way to prevent the server mucking with my document. For example, if I was using postgres to store XML-EDI documents in a DX system, this is the role I would prefer. Validation and indexing are useful, but my expected use of the server is a type of electronic xerox of the incoming document. I would be highly suspicious of any modification the server made to my document for any reason.

Now, if we are storing xml as content, say for a content database backing a web page, the server takes on the role of a meta-data storage system. Now, it is reasonable to assume the server might do additional processing besides storage and validation. The character encoding of the incoming data is of little or no importance because the xml will almost certainly undergo an additional transformation step after extraction from the database. Flexibility, simplicity, and utility are the most important requirements, so text transformation to a default encoding would be quite useful.

Based on your suggestions I think you are primarily concerned with the second example. However, in my work I do a lot of DX and I see the xml document as a binary object. Server-side validation would be extremely helpful, but please don't change my document!
So, I submit that we are both right for different reasons. Regards, Merlin ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
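The round-trip concern above is easy to demonstrate outside of postgres: merely parsing and re-serializing a document changes its bytes. A minimal Python sketch (illustrative only; the stdlib parser here stands in for any transform-on-input storage scheme, not anything PostgreSQL actually does):

```python
import xml.etree.ElementTree as ET

# The exact bytes the client submitted:
doc = b"<?xml version='1.0' encoding='utf-8'?>\n<d/>\n"

# Parse and re-serialize, as any transform-on-input storage scheme would:
rt = ET.tostring(ET.fromstring(doc))

# The XML declaration and trailing newline are gone after the round trip:
print(rt == doc)  # False
```

Equal documents, unequal bytes: exactly the distinction between storing xml documents and storing the metadata that makes them up.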
Re: [HACKERS] Allow backend to output result sets in XML
Greg Stark wrote: Personally I don't see any point in xml, but if there was a standard query protocol then a client could send queries to any database that supported it without using any libraries. That might be useful. Of course you could do that without xml, but people seem to get more excited about complying with standards when they invoke xml.

hm. I have to deal with xml quite frequently because I do a lot of DX with the gov't and other entities that are rapidly standardizing on xml. I like Oracle's approach to xml, using object relational mappings to allow composition of documents server side based on natural data relationships. The XML document becomes something like a specialized view. It would save me tons of app-level coding if the server could do this for me. Since postgres is already fairly Oracle-ish in design, IMO this is definitely the way to go (XQuery = Insanity.). A FE/BE protocol revision would be useful but not necessary...the XML doc could be returned as a scalar. Right now I think all xml processing is done in app-level code, because the server (due to limitations of sql) is usually unable to return data the way you want it...so simply adding xml output from psql would be fairly useless for most real tasks (if it wasn't, someone would have done it a long time ago). Also, contrib/xml can already handle most of the simple things. Merlin ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Allow backend to output result sets in XML
Peter Eisentraut wrote: I think output XML is just buzz. Give us a real use scenario and an indication that a majority also has that use scenario (vs. the other ones listed above), then we can talk.

Consider:

create table person (name varchar primary key, age int);
create table account (number varchar primary key, name varchar references person);
insert into person values ('Fred', 35);
insert into person values ('Barney', 37);
insert into account values ('1234', 'Fred');
insert into account values ('5678', 'Fred');
insert into account values ('abcd', 'Barney');

test=# select * from person into xml natural;
<?xml version="1.0" encoding="UTF-8" ?>
<result>
  <row n="1">
    <person>
      <name>Fred</name>
      <age>35</age>
      <account>
        <number>1234</number>
      </account>
      <account>
        <number>5678</number>
      </account>
    </person>
  </row>
  <row n="2">
  [...]
</result>

now consider:

select * from person into xml natural namespace is 'some_uri' schema is 'person.xsd';

this returns the result set above, but with schema and namespace declarations included. Of course, there is tons of complexity hiding in there, but is this worth considering? Merlin ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[HACKERS] What's left?
Hello, I think it's safe to say there is a working implementation of a signal handler. The one tricky part left is to identify some smart places for the backend to check the awaiting signal queue. The first one is easy: switch recv() with select() with a reasonable timeout and a poll. If and when the signal handler gets patched into CVS, is there anything else left that prevents the cvs version from compiling and linking? From what I understand, Claudio's fork/exec implementation is either complete or nearly complete. I would like very much to help any way possible in solving any last remaining issues. Once the CVS sources are compilable, it will be easier to make meaningful contributions. I'm really looking forward to testing and benchmarking the win32 port. A big thanks to all who continue to work so hard on this project. Merlin ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
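For the curious, the recv()-to-select() idea can be sketched in a few lines. This is an illustrative model only (a pipe stands in for the client socket and a plain list stands in for the win32 signal queue), not the actual backend patch:

```python
import os
import select

pending_signals = []  # stands in for the queued-signal list

def wait_readable(fd, poll_interval=0.1):
    """Block until fd is readable, draining queued signals on each timeout."""
    while True:
        ready, _, _ = select.select([fd], [], [], poll_interval)
        if ready:
            return True                # data ready: caller can recv()
        while pending_signals:         # timeout expired: dispatch signals
            handler = pending_signals.pop()
            handler()

r, w = os.pipe()
os.write(w, b"x")
print(wait_readable(r))  # True
```

The timeout bounds how long a queued signal can sit undelivered while the backend is idle in recv(), which is exactly the "smart place to poll" being discussed.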
Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint
Kevin Brown wrote: I have no idea whether or not this approach would work in Windows. The win32 API has ReadFileScatter/WriteFileScatter, which was developed to handle these types of problems. These two functions were added for the sole purpose of making SQL server run faster. They are always asynchronous and are very efficient. Perhaps the win32 port could just deal with the synchronization with an eye for future optimizations down the line? Merlin ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
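For readers on the POSIX side: preadv()/pwritev() are the closest portable analogue to ReadFileScatter/WriteFileScatter. A toy Python sketch of the same one-call, many-buffers idea (illustrative only; the filename is made up and this has nothing to do with the actual port):

```python
import os

fd = os.open("scatter_demo.bin", os.O_RDWR | os.O_CREAT | os.O_TRUNC)

# gather-write: two separate buffers flushed in a single system call
os.pwritev(fd, [b"page1---", b"page2---"], 0)

# scatter-read: one call fills two separate buffers
b1, b2 = bytearray(8), bytearray(8)
os.preadv(fd, [b1, b2], 0)

print(bytes(b1), bytes(b2))
os.close(fd)
os.remove("scatter_demo.bin")
```

The win32 calls add async completion on top of this; the sketch only shows the multi-buffer part.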
Re: [HACKERS] MS SQL features for new version
Most of the new features are new database tools, etc. That has always been a strong point with SQL server. IMO, the weak point of the database (aside from the vendor and the price) is a lack of flexibility in the client APIs and the stored procedure syntax. The interesting features are the xml querying, recursive querying, and 'multiple active result sets'. I think MS is laying down a more relational foundation which would allow for such features. They have been working on this product for years and years...it will be fun to see how it turns out. Merlin ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] ISAM driver for PostgreSQL
Does anybody think there might be some interest in an ISAM driver for PostgreSQL? I've written a functional alpha that allows PostgreSQL to be a drop-in (or as close as reasonably possible) replacement for an ISAM file system driving a COBOL application. It is an STL-based thin wrapper around libpq that generates queries on the fly from traditional ISAM verbs (read, write, start, etc.). It was made to be generic, allowing porting to various COBOL vendors' compilers with minimal effort. Error conditions are returned as traditional COBOL error codes. The main drawback to the system is that performance is highly sensitive to network latency (as are most COBOL applications). Unlike traditional hybrid COBOL systems which use ESQL, this system could run native COBOL code with reasonable performance in any COBOL application which allows linking to an external ISAM file system. This is very much a work in progress, built strictly for my own needs, but could be made to be useful in a more general sense. I'm thinking about cleaning up the code and setting up a project on sourceforge. PostgreSQL is uniquely suited for this purpose because of its portability and rich syntax. A good example is postgres's array type, which maps very elegantly to COBOL's OCCURS syntax. Regards, Merlin ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
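To make the "queries generated on the fly" idea concrete, here is a hedged sketch of how a single ISAM verb might translate to SQL. The table and key names are hypothetical and this is not the actual driver code (which is C++/STL, not Python):

```python
def isam_start(table, key_col, key_val):
    """Translate the ISAM START verb: position at the first record whose
    key is >= the supplied value. Returns parameterized SQL plus its
    bind values, so the driver never interpolates data into the query."""
    sql = (f"SELECT * FROM {table} WHERE {key_col} >= %s "
           f"ORDER BY {key_col} LIMIT 1")
    return sql, (key_val,)

sql, params = isam_start("cusfil", "cust_no", "1000")
print(sql)
# SELECT * FROM cusfil WHERE cust_no >= %s ORDER BY cust_no LIMIT 1
```

READ NEXT and friends would follow the same pattern, each verb a tiny query template, which is also why network latency dominates: every verb is a round trip.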
Re: [pgsql-hackers-win32] [HACKERS] What's left?
Greg Stark wrote: imposed no such conditions. If Microsoft wanted to release a Microsoft Postgresql under a completely proprietary license they would be free to do I have often wondered, in a completely off-topic and unproductive sort of way, if exactly that has not already been done by an unscrupulous or semi-scrupulous commercial vendor. This has been done in the past (a certain vendor's tcp/ip stack comes to mind), but I wonder if anyone ever pulled it with this project. Merlin ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Regression tests on Nintendo Game Cube
Today Michael Steil and I have tested PostgreSQL 7.4.1 on Nintendo Game Cubes. All regression test (but stats - stats collector was off instead of on) have passed successfully. What about the XBOX? :-) Merlin ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] Timing of 'SELECT 1'
Bruce Momjian wrote: I am timing small queries, and found that a PREPARE/EXECUTE of SELECT 1 takes about 1.2ms on my machine. A normal SELECT doesn't take much longer, so I am wondering why a simpler query isn't faster. Looking at log_executor_stats, I see the following. Execute shows nothing taking much time, mostly .2ms, but the total seems high. I wonder if one of our standard query start/stop functions is taking too long and can be optimized. There seems to be a 'PostgreSQL ping' time of about 1-2 ms in best case conditions which limits the amount of queries you can fire off in 1 second, no matter how simple. In certain rare cases this is something of a bottleneck. In my personal case it would be nice to see that time lower because converted COBOL applications tend to generate a lot of 'simple' queries. Merlin ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] optimizing impossible matches
Tom Lane wrote: The optimizer has no knowledge of specific operators except what it finds in the system catalogs. It has no way in general to determine that a comparison involving nonconstant values must always fail. Even if we could do it, I am dubious that it would be worth expending the cycles on every query to determine whether the situation holds. AFAICS those would be wasted cycles on the huge majority of queries.

Ok, fair enough. And queries that did have this issue could be easily rewritten... Still, there is something that doesn't sit quite right with me...my problem is with SQL really, not Postgres. For example, the server forbids 'abcd'::char(3) but allows comparing 'abcd' against a char(3) value, because the operator is not bound to the specific type, but to the general type, and ignores type constraints. In other words, SQL implicitly allows comparison between objects of different domains if the domains differ only by constraint (including size). Anyways, thanks for taking the time to answer. Merlin ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Timing of 'SELECT 1'
The problem with gprof is that I am going to see all the backend startup stuff too, no? Is there a way to get a dump just the run of the query? I was sort of lurking on this thread, waiting to see what became of it. Did nobody actually come to a conclusion on what that last msec was from? I think the consensus was it was coming from the network layer somehow. If that's the case (it probably is), there isn't a whole lot that can be done about it except to bypass it using server side functions and such. Merlin ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] 7.5 beta version
The compilers from Microsoft and Borland at least aren't compatible. But that shows up as link errors, not at runtime, right?

Correct. Microsoft and Borland use different library packaging formats, COFF and OMF. However, (non-C++) DLLs are compatible, and you can extract a static lib from a DLL in the format of your choice. Merlin ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] What can we learn from MySQL?
J. Andrew Rogers wrote: No. The greatest strength of Postgres, marketing-wise, is technical, and is what drives its growth today. I think most of the ease-of-use issues are in the packaging of the larger Postgres product and mid-level developer documentation, both of which seem to be eminently solvable problems. I think improved default product packaging would remove 80% plus...

Up to this point, AFAIK, the postgresql docs have not been quoted here: http://www.dbdebunk.com which speaks volumes ;) Merlin ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] ALTER TABLE TODO items
Two very minor points with the new alter table (not sure if they are even bugs or have already been addressed).

1. alter table alter type on a clustered index seems to drop the cluster (by design)?
2. alter table cluster on seems to give a strange error message if the index name is really the name of a table. Ex: alter table test cluster on test; returns: ERROR: cache lookup failed for index 19013

Merlin

Log:
test=# create table test (id int);
CREATE TABLE
test=# create index test_id_idx on test(id);
CREATE INDEX
test=# alter table test cluster on test;
ERROR: cache lookup failed for index 19046
test=# alter table test cluster on test_id_idx;
ALTER TABLE
test=# \d test
     Table "public.test"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
Indexes: test_id_idx btree (id) CLUSTER
test=# alter table test alter id type bigint;
ALTER TABLE
test=# \d test
    Table "public.test"
 Column |  Type  | Modifiers
--------+--------+-----------
 id     | bigint |
Indexes: test_id_idx btree (id)
test=#
---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
[HACKERS] alter table alter columns vs. domains
Is it feasible or practical to consider adding ALTER DOMAIN TYPE type? (basically following the same rules as ALTER TABLE). I don't mind bringing down all the views before I do this, it would be just great if you could change domains from a centralized location. Oracle offers this feature... Merlin ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] ALTER TABLE TODO items
Tom Lane wrote: I tweaked things so that the clustered flag is preserved for indexes that aren't directly affected by the ALTER TYPE. It would take more work to preserve the setting for an index that is rebuilt by ALTER TYPE, and I'm not even sure that it's sensible --- the new index could have a significantly different ordering from the old. What do you think? Fair enough. Perhaps a notice level log message would be appropriate? Merlin ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] alter table alter columns vs. domains
Yes. This is something I was going to look at doing in the next release. Quick question: With your potential changes, you would then be able to alter a domain that is involved in RI constraints between 2 or more tables without bringing down the constraints, yes? This would be great :) Merlin ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] alter table alter columns vs. domains
Tom Lane wrote: If we were willing to abuse the ALTER TABLE syntax some more, it would be possible to support changing the datatypes of f1 and f2 simultaneously, thereby allowing the above to work. The infrastructure for hacking multiple tables in parallel is already there in CVS tip, but it only gets exercised in cases involving inheritance.

Just a clarification: isn't ALTER DOMAIN the best place to do this? IMHO, this is one of those rare cases where extending PostgreSQL beyond the SQL spec is justified. Right now, as I understand it, the only way to do these types of things is to bring down the RI rules for a short time until the table manipulation is completed, which seems inelegant, not to mention the convenience factor. Plus, I see anything that encourages usage of domains as a good thing, as domains themselves are very good things (and quite underutilized by the unwashed masses, I expect). Merlin ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Usability, MySQL, Postgresql.org, gborg, contrib, etc.
[EMAIL PROTECTED] wrote: PostgreSQL. Maybe it is in the form of a web server like Samba's SWAT utility, I don't know (A SWAT type utility could run as the PostgreSQL

I've found webmin to be a pretty good SWAT-type tool...it's lacking some things to be a full postgres administration system, but I think they have the right idea...

(4) Blessed projects, let's play favorites. Let's find good and meaningful extensions on gborg and ./contrib and work with the authors and make them part of the PostgreSQL environment. Projects like replication, the .NET service provider, ODBC, pgAdmin, etc. are important and users need to find them close to PostgreSQL's main location. (5) Programming languages. We need to make a programming language standard in PostgreSQL. plpgsql is good, but isn't someone working on a Java language? That would be pretty slick.

Personally, I like the idea of plphp better. I haven't used either, though. plpgsql will always have value in converting oracle deployments, so it is truly hard to standardize on a favorite here. I think the latest versions of pgAdmin are just awesome. I never used to like it very much, but it really is great software. I think any effort expended on beefing that project up is not wasted. Merlin ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
[HACKERS] FW: getting a crash during initdb
[sorry for the repost...lists down? Problem may or may not be on win32 version only]

The catalog number was updated, so it was time to run initdb. Sometime over the weekend (or since the last initdb I ran, about a week ago), the source was updated, which caused an AV CRASH during initdb...specifically during the pg_depend step. Also, after initdb fails, the files are removed but the folders are not, causing subsequent attempts to initdb not to work...is this the expected behavior? Here is the log:

H:\>initdb
The files belonging to this database system will be owned by user "Merlin.Moncure".
This user must also own the server process.
The database cluster will be initialized with locale C.
fixing permissions on existing directory c:/postgres/data ... ok
creating directory c:/postgres/data/global ... ok
creating directory c:/postgres/data/pg_xlog ... ok
creating directory c:/postgres/data/pg_clog ... ok
creating directory c:/postgres/data/base ... ok
creating directory c:/postgres/data/base/1 ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 1000
creating configuration files ... ok
creating template1 database in c:/postgres/data/base/1 ... ok
initializing pg_shadow ... ok
enabling unlimited row size for system tables ... ok
initializing pg_depend [CRASH]...
initdb: child process was terminated by signal 5
initdb: failed
initdb: removing contents of data directory c:/postgres/data
Deleted file - c:\postgres\data\pg_hba.conf
[...]

Merlin ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] FW: getting a crash during initdb
I have no idea what caused the pg_depend stuff to crash. The AV is in postgres.exe following the first SQL call in setup_depend(). The problem is not in initdb (it hasn't changed) but something in the backend. Changing the SQL statement made no difference: I'd venture a guess that postgres.exe crashes when *any* statement is sent to it. About 20 files have changed since my last initdb; I have a list. Forced initdb + initdb crash = nasty bug, even if it's just the cvs version (specific only to windows?). I'll check why the rmdir command is not working as expected. Merlin ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] FW: getting a crash during initdb
I'll check why the rmdir command is not working as expected. I just poked around and couldn't figure out the cause. Initdb should either remove the directory if it created it, or remove everything _in_ the directory if the directory already existed. I tried the rmdir/del /s /q commands in a CMD window and it worked fine. Problem is backslash issue :) initdb is issuing del /s /q c:/postgres/data. Is there a library command to properly resolve slashes? Merlin ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] FW: getting a crash during initdb
Alvaro Herrera wrote: The AV is in postgres.exe following the first SQL call in setup_depend(). The problem is not in initdb (it hasn't changed) but something in the backend. Changing the SQL statement made no difference: I'd venture a guess that postgres.exe crashes when *any* statement is sent to it. About 20 files have changed since my last initdb; I have a list. Forced initdb + initdb crash = nasty bug, even if it's just the cvs version (specific only to windows?). I've initdb'd more than twenty times with CVS tip code, so it would seem to be Windows-specific. Thought as much. I posted to the win32 list but it hasn't shown up yet. Are you sure you're using a clean build and a really current checkout? Yes, 100%. Merlin ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] FW: getting a crash during initdb
make clean appears to have fixed the initdb crash :) sorry to bother... :) Merlin ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] FW: getting a crash during initdb
Bruce Momjian wrote: Merlin Moncure wrote: I'll check why the rmdir command is not working as expected. I just poked around and couldn't figure out the cause. Initdb should either remove the directory if it created it, or remove everything _in_ the directory if the directory already existed. I tried the rmdir/del /s /q commands in a CMD window and it worked fine. Problem is backslash issue :) initdb is issuing del /s /q c:/postgres/data. Is there a library command to properly resolve slashes? Uh, I just testd this using CMD and it worked: Yes, you are correct, I was jumping to conclusions. What's really strange is now initdb is properly cleaning up the folders. I examined my previous logs and apparently the value of rmtopdir parameter to rmtree was wrong... Note my previous initdb log said removing contents of ... (now says removing ...) Some more checking turns out it is all hooking on the return val of check_data_dir...perhaps the condition was hooking if I had the folder open in another window...so no worries there. Still having the problems with initdb. Merlin ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Call for 7.5 feature completion
Bruce Momjian wrote: Well, if Win32 doesn't complete by June 1, do we still do the feature freeze? I don't want to be adding features after the freeze, that is [...] As I remember, we decided that we should not make decisions to extend the feature freeze date just before the freeze date because it causes people to stop doing development, then we extend. I think we should go another week or two then decide if we should stay to June 1, and if we don't, we should schedule for July 1. Extending it by a week at a time, and almost before the deadline, has caused considerable waste of time in the past. I agree 100%. Right now the win32 side doesn't qualify for beta...just the date issue alone is a pretty big deal, IMO. There are pending patches for only about 50% of the outstanding issues. A tremendous amount of work has been done, but there is still quite a bit to be done to meet basic QC guidelines. So I suggest (my choices are of course subjective): Dividing win32 'should fix' (installer, /contrib, etc.) win32 'must fix' (psql query cancel, 1970 dates, non-cygwin regression) Apply freeze date to the 'must fix' items. 'Should fix' items can be delayed until the beta, dot release, or beyond. My personal estimation on completion date depends on what gets put in which category. If everything is designated 'must fix', I think July 1 is practical. Merlin ---(end of broadcast)--- TIP 8: explain analyze is your friend
[HACKERS] ~ crashes backend
In the current build on the anon cvs server, if I try to \d a table from psql, the backend comes down. I played with the query a bit and discovered any query using the '~' operator in the where clause on any table (catalog or otherwise) causes an immediate backend crash. Can anybody confirm that this is not happening on a win32/non-win32 build? (I had to change a couple of things to compile, just want to make sure I didn't break anything). I did a make clean and a brand new initdb just to be safe.

Merlin

LOG: statement: select * from pg_catalog.pg_class where relname ~ 'test';
LOG: server process (PID 4544) was terminated by signal 5
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
FATAL: the database system is in recovery mode
LOG: all server processes terminated; reinitializing
LOG: database system was interrupted at 2004-05-21 11:22:35 Eastern Daylight Time
LOG: checkpoint record is at 0/EEC050
LOG: redo record is at 0/EEC050; undo record is at 0/0; shutdown TRUE
LOG: next transaction ID: 7678; next OID: 33592
LOG: database system was not properly shut down; automatic recovery in progress
LOG: record with zero length at 0/EEC090
LOG: redo is not required
LOG: database system is ready
LOG: statement: select * from chevy.cusfil limit 1;
LOG: statement: select * from chevy.cusfil where cust_addr1 ~ 'test';
LOG: server process (PID 5500) was terminated by signal 5
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] ~ crashes backend
I have binary install from snapshot from the other night on WINDOWS 2000 PRO and had same problem.

well, I'm the one compiling the snapshot :), so I'm trying to confirm that I have not busted anything...when did you download the snapshot, though? I switched to gcc 3.4 yesterday, was it earlier than that? Merlin ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [pgsql-hackers-win32] ~ crashes backend
Merlin Moncure [EMAIL PROTECTED] writes: In the current build on the anon cvs server, if I try to \d a table from psql, the backend comes down. I played with the query a bit and discovered any query using '~' operator in the where clause on any table (catalog or otherwise) causes an immediate backend crash. Can anybody confirm that this is not happening on a win32/non-win32 build? CVS tip works fine for me on HPUX and Linux. The problem was gcc 3.4 for mingw. I reverted back to 3.3 and it works ok now. No more pre-compiled headers, oh well :( This is probably not a problem with gcc, but with mingw. I updated the binary build with the corrected version. Merlin ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[HACKERS] possible obvious bug?
I was playing with the Japanese win32 7.2.1 port and I noticed that select 0 / 0 caused the server to crash and restart. I understand that it is a totally unsupported version, but it should be easy enough to check vs. the current version. Note that select 0.0/0.0 worked fine! Merlin ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] PostGres and WIN32, a plea!
I had read on one of the newsgroups that there is a planned native port to the win32 platform, is this true? I read most of the win32 thread off of the dev site and it was not clear if this was true. In either case, I would like to advocate such a port to be done, and soon, not for any altruistic reasons, but simply on behalf of myself (a windows applications developer) and the many others who are like me. I personally believe that Postgres has a great deal of potential in the applications market, with the database server packaged along with the application. There is a great deal of need for this for medium to high end windows applications, because there is as of yet no existing Microsoft package that can handle it. Postgres is ideally suited for this need because of its rich server side programming interfaces, liberal licensing, and high performance. Mysql, despite their sucking up to the windows crowd, fails on all three of those counts. However they have realized the need for a database embedded application by allowing the mysql server to be linked directly with a windows app (at least, on a technical level), and have talked about providing a single user database .dll. I believe that mysql is not well suited for these types of applications though for stability and performance reasons. For all the talk of speed, I think postgres is the fastest database ever written for pc hardware, with the one possible exception of Microsoft Foxpro (note: not written by Microsoft). Sql server costs too much to ship with an app, and, quite frankly, is rather slow. Postgres could easily springboard into a very strong niche market in embedded applications. From there, with increased awareness and developer support on the windows side, it could start pecking at more traditional data services currently dominated by sql server and (yuck!) access, and their evil fraternal twin, visual basic.
Side note: good strategic positioning in this regard would be an XML shell for postgres (pgxml) and a data provider for .net. That's my .02$. Many great thanks to the dev team. Please don't let postgres continue to be the software world's best kept secret. Merlin ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] PostGres and WIN32, a plea!
Hmm, ever tried using a large multiuser database such as a finance system using a FoxPro database? Network managers have been known to murder for less... :-) Hmm, I have, and you can imagine the result :) It was a small system, really, and everything was fine until I added my 10th user. Then my data left me like the parting of the Red Sea :). Building a database system on lousy technology, only to rewrite it later, is something all database admins have to go through; I think it's kind of like coming of age. On the unix side of things, you have MySQL catching people the same way. FP did have a very nice query optimizer. Also, FP views optimized the where condition through the query, and have for quite some time (does PG do this yet?). I think the FP team was really on to something, till MS hamstrung the project. FP also had the ability to use user-defined functions in the query, something I thought I would have to give up forever, until I stumbled across PG (from the MySQL docs, go figure!) Merlin
Re: [HACKERS] [PERFORM] Realtime VACUUM, was: performance of insert/delete/update
How about OPTIMIZE? e.g. optimize customers instead of analyze; could be paired with aggressive, so: OPTIMIZE AGGRESSIVE. Very much a glass half empty, half full type thing. Vacuum is not a problem, it's a solution. Merlin Curtis Faith [EMAIL PROTECTED] wrote: tom lane wrote: Sure, it's just shuffling the housekeeping work from one place to another. The thing that I like about Postgres' approach is that we put the housekeeping in a background task (VACUUM) rather than in the critical path of foreground transaction commit. Thinking with my marketing hat on, MVCC would be a much bigger win if VACUUM was not required (or was done automagically). The need for periodic VACUUM just gives ammunition to the PostgreSQL opponents who can claim we are deferring work but that it amounts to the same thing. A fully automatic background VACUUM will significantly reduce but will not eliminate this perceived weakness. However, it always seemed to me there should be some way to reuse the space more dynamically and quickly than a background VACUUM, thereby reducing the percentage of tuples that are expired in heavy update cases. If only a very tiny number of tuples on the disk are expired, this will reduce the aggregate performance/space penalty of MVCC into insignificance for the majority of uses. Couldn't we reuse tuple and index space as soon as there are no transactions that depend on the old tuple or index values? I have imagined that this was always part of the long-term master plan. Couldn't we keep a list of dead tuples in shared memory and look in the list first when deciding where to place new values for inserts or updates, so we don't have to rely on VACUUM (even a background one)? If there are expired tuple slots in the list these would be used before allocating a new slot from the tuple heap. 
The only issue is determining the lowest transaction ID for in-process transactions, which seems relatively easy to do (if it's not already done somewhere). In the normal shutdown and startup case, a tuple VACUUM could be performed automatically. This would normally be very fast since there would not be many tuples in the list. Index slots would be handled differently since these cannot be substituted one for another. However, these could be recovered as part of every index page update. Pages would be scanned before being written and any expired slots that had transaction IDs lower than the lowest active slot would be removed. This could be done for non-leaf pages as well and would result in only reorganizing a page that is already going to be written, thereby not adding much to the overall work. I don't think that internal pages that contain pointers to values in nodes further down the tree that are no longer in the leaf nodes because of this partial expired entry elimination will cause a problem, since searches and scans will still work fine. Does VACUUM do something that could not be handled in this realtime manner? - Curtis
[HACKERS] win32 port --asynchronous I/O and memory
Just a quick question... are you guys using the C runtime or the win32 API to do things like file I/O and memory allocation? If you are using the win32 API, are you using asynchronous I/O? Generally, how much raw win32 code do you expect to write (assumption: as little as possible)? As for memory, what's the general allocation scheme? I have not looked at the source much, but I know Postgres has a very good memory manager. There are a few different ways of going about it. I wrote a database backend of sorts a while back and my experience was that you have to take certain precautions or you are in danger of thrashing the server, which in extreme cases is basically the same as crashing the system. Part of the danger is that memory allocations for the database sometimes compete with the file system cache, causing massive performance degradation. MSSQL avoids this because it is very tightly wound with the virtual allocation system. Merlin
[HACKERS] mysql -- cygwin
MySQL does not have cygwin in the server, either statically linked or otherwise. We looked at the code a while back and confirmed this. MySQL has a much smaller code base than pg. If it did, it would be a very strange deal, because you can link your app directly to the MySQL server (for $200... non-GPL), which would be a weird interaction with the cygwin license if it were statically linked to the MySQL server (it isn't). MySQL does have some win32-isms in the code, notably async file I/O in some cases (mostly the import), which gives very fast data import speed (amazingly fast, even). In testing, this turned out to be the only thing fast about MySQL on Windows, so we dropped the project. MySQL on Windows also seems to spend a lot of time waiting on mutexes and is not stable when canceling large queries in progress. Merlin
Re: [HACKERS] mysql -- cygwin
I'm not sure what version of MySQL you were looking at, but that certainly doesn't seem to be the case anymore. I just checked: MySQL 4.0.9 has ~435,000 LOC, PgSQL from CVS HEAD has ~372,000. Hmm, you got me there, though this was some time back, from the last version of the 3.x series. Merlin
[HACKERS] Win32 Technical Questions
Can the ConsoleApp thing be written in C so we don't have to get an extra C++ compiler for one file (for those who don't want to use the Microsoft toolchain)? Critical sections and semaphores and mutexes are all available from the win32 API. I agree with Peter: I am not sure it is a good idea to introduce an MFC dependency for process control and application initialization. This would allow compilation without having the MFC (although Borland supplies a version of the MFC with its non-free compilers). Also, a C++ compiler would not be required, although this is, to me, strictly an aesthetic point. Likewise, I think the API functions are the best choice for file I/O, for possible adaptation to async file I/O. Despite the name, apps written carefully using the win32 API should generally compile OK in a 64-bit environment. Merlin
Re: [HACKERS] Win32 and fsync()
For Win32, in order to emulate fsync() we will need to call FlushFileBuffers(): The supplied link does not work. FlushFileBuffers() is for flushing files opened with CreateFile() etc. For files opened with fopen(), call fflush(). For files opened with _open(), call _commit(). Likely the win32 port code, which I have yet to inspect, will use the second version. Merlin
Re: [HACKERS] Win32 and fsync()
I'm having difficulty digging up the reference, but I think I recall seeing something that said, roughly, on W32 there are 2 sets of buffers - those in the user level library and those in the kernel level driver, and FlushFileBuffers drains the first, while _commit drains both (it includes a call to FlushFileBuffers). You were correct: here is the source.

int __cdecl _commit (
    int filedes
    )
{
    int retval;

    /* if filedes out of range, complain */
    if ( ((unsigned)filedes >= (unsigned)_nhandle) ||
         !(_osfile(filedes) & FOPEN) )
    {
        errno = EBADF;
        return (-1);
    }

    _lock_fh(filedes);

    /* if filedes open, try to commit, else fall through to bad */
    if (_osfile(filedes) & FOPEN) {
        if ( !FlushFileBuffers((HANDLE)_get_osfhandle(filedes)) ) {
            retval = GetLastError();
        }
        else {
            retval = 0;     /* return success */
        }

        /* map the OS return code to C errno value and return code */
        if (retval == 0)
            goto good;
        else {
            _doserrno = retval;
            goto bad;
        }
    }

bad:
    errno = EBADF;
    retval = -1;
good:
    _unlock_fh(filedes);
    return (retval);
}
Re: [HACKERS] Win32 and fsync()
I think the fopen or _open family of functions all map directly to the win32 API. They add a little cruft, which generally makes using them pointless, because you have less control over security, caching, and other such things when opening the file. There is the slight overhead of the extra call, and the conversion of the C handle to the win32 handle. However, all you get with the API is ReadFile() and WriteFile(), which means workhorses like fprintf() and especially fgets() must be reimplemented (in lieu of using the C runtime). MS has always shipped the source code to their standard libraries on the CD. However, the compiler source remains closed. This seems insignificant, but sometimes functions are totally inlined within the compiler (e.g. math functions like sin() and exp()). They have a special calling convention, __declspec(naked), which strips the function prolog and epilog and allows you to write your own in assembly. You can use that to write very fast functions when you don't trust the optimizer to inline your functions. I think you are correct that some of the math functions are written this way. Merlin
Re: [pgsql-advocacy] [HACKERS] PostgreSQL Benchmarks
I've tested all the win32 versions of Postgres I can get my hands on (cygwin and not), and my general feeling is that they have problems with insert performance with fsync() turned on, probably the fault of the OS. Select performance is not so much affected. This is easily solved with transactions and other such things. Also, Postgres benefits from pl just like oracle. May I make a suggestion: maybe it is time to start thinking about tuning the default config file. IMHO it's just a little bit too conservative, and it's hurting you in benchmarks being run by idiots; it's still bad publicity. Any real database admin would know such tests are synthetic and not meaningful without having to look at the #s. This is irritating me so much that I am going to put together a benchmark of my own, a real world one, on (publicly available) real world data. MySQL is a real dog in a lot of situations. The FCC publishes a database of wireless transmitters that has tables with 10 million records in it. I'll pump that into pg, run some benchmarks, real world queries, and we'll see which the faster database *really* is. This is just a publicity issue, that's all. It's still annoying though. I'll even run an open challenge to database admins to beat the query performance of Postgres on such datasets, complex multi-table joins, etc. I'll even throw out the whole table locking issue and analyze single user performance. Merlin _ How much of the performance difference is from the RDBMS, how much from the middleware, and how much from the quality of implementation in the middleware? While I'm not surprised that the cygwin version of PostgreSQL is slow, those results don't tell me anything about the quality of the middleware interface between PHP and PostgreSQL. Does anyone know if we can rule out some of the performance loss by pinning it on a bad middleware implementation for PostgreSQL? 
Regards, -- Greg Copeland [EMAIL PROTECTED] Copeland Computer Consulting
[HACKERS] Windows SHMMAX (was: Default configuration)
Does anyone know whether cygwin has a setting comparable to SHMMAX, and if so what is its default value? How about the upcoming native Windows port --- any issues there? From a pure win32 point of view, a good approach would be to use the VirtualAlloc() memory allocation functions and set up a paged memory allocation system. From a very top-down point of view, this is the method of choice if portability is not an issue. An abstraction to use this technique within the pg context is probably complex and requires writing lots of win32 API code, which is obviously not desirable. Another way of looking at it is memory mapped files. This probably most closely resembles unix shared memory and is the de facto standard way for interprocess memory block sharing. Sadly, performance will suffer because you have to rely on the virtual memory system (think: writing to files) to do a lot of stupid stuff you don't necessarily want or need. The OS has to guarantee that the memory can be swapped out to file at any time and therefore mirrors the pagefile to the allocated memory blocks. With the C/C++ malloc/free API, you are supposed to be able to get some of the benefits of VirtualAlloc (in particular, setting a process memory allocation limit), but personal experience did not bear this out. However, this API sits directly over the virtual allocation system and is the most portable. The application has to guard against fragmentation and things like that in this case. In win32, server thrashing is public enemy #1 for database servers, mostly due to the virtual allocation system (which is quite fast when used right, btw). Merlin
FW: [HACKERS] Changing the default configuration (was Re:
True, but even so, 32 max connections is a bit light. I have more pgsql databases than that on my box now. My point in my previous answer to Tom was that you HAVE to shut down postgresql to change this. It doesn't allocate tons of semaphores on startup, [snip] Is this correct? I recall looking through the source and seeing comments to the effect that it is better to allocate them all (semaphores) up front in order to prevent runtime failed allocations. (Could be totally off base on this.) You don't have to be using apache to need more than 32 simultaneous connections. Heck, how many postgresql databases do you figure are in production with that setting still in there? My guess is not many. [snip] True, and it is not unheard of to put minimum specs for version x of the database, i.e. 7.4 requires kernel 2.x and so on. Here's the comment I was referring to:

/*
 * InitProcGlobal -
 *	initializes the global process table. We put it here so that
 *	the postmaster can do this initialization.
 *
 *	We also create all the per-process semaphores we will need to support
 *	the requested number of backends. We used to allocate semaphores
 *	only when backends were actually started up, but that is bad because
 *	it lets Postgres fail under load --- a lot of Unix systems are
 *	(mis)configured with small limits on the number of semaphores, and
 *	running out when trying to start another backend is a common failure.
 *	So, now we grab enough semaphores to support the desired max number
 *	of backends immediately at initialization --- if the sysadmin has set
 *	MaxBackends higher than his kernel will support, he'll find out sooner
 *	rather than later.
 *
 *	Another reason for creating semaphores here is that the semaphore
 *	implementation typically requires us to create semaphores in the
 *	postmaster, not in backends.
 */

Merlin
[HACKERS] win32 port
Has a final decision been made if the win32 port is going to be threaded or not? Merlin
Re: [HACKERS] PostgreSQL Windows port strategy
I think there should be a special mailing list set up called pgsql-win32-advocacy where people can continually harass the postgres dev team and debate the merits of the win32 operating system. In particular, make sure to insult them for being elitists (while at the same time asking them to work on software which you have not paid for). We can also route all the questions about how to set up cygwin and the ipc-daemon over there too. Merlin
Re: [HACKERS] Windows SHMMAX (was: Default configuration)
Another way of looking at it is memory mapped files. This probably most closely resembles unix shared memory and is the de facto standard way for interprocess memory block sharing. Sadly, performance will suffer because you have to rely on the virtual memory system (think: writing to files) to do a lot of stupid stuff you don't necessarily want or need. To the contrary, for the majority of the shared memory usage of postgres, which is cached file data, the virtual memory system is doing exactly what you want it to: managing the movement of data between memory and disk, and caching the more frequently accessed data to reduce the chances you will actually need to access the disk for it. Yes. Generally, I was trying to point out the disadvantages of memory mapped files compared to shared memory. In windows, there is no direct equivalent to shared memory; MMFs are very similar in usage. I suspect they might not perform quite as well as the shared memory functions. For example, if used in place of shared memory to cache static file data, you are maintaining: 1. the file itself, 2. the file cache handled by the OS, 3. the MMF memory-side cache (following a page fault), 4. the virtual memory space set aside for the OS to swap it out should the OS need more memory. MMFs are efficient when memory allocations are relatively static: they work especially well with a freestore memory allocation system (this minimizes movement inside the virtual memory pagefile). For example, the MMF is allocated at the startup of the backend and doled out to processes on an internal 'as needed' basis. This is equivalent in function to memory allocations using the VirtualAlloc() family except it's good for IPC. (IMHO, it will still run slower.) If memory allocations are frequent and dynamic, you start to run into problems with fragmentation of the pagefile and such. This is very undesirable. 
Also, if memory allocations are large, you could potentially run into the worst possible scenario: your file cache system is competing with the virtual memory system. This will cause the server to thrash. One workaround for that is to set up the files for sequential access: this minimizes OS caching of files. This also more or less removes 'double dipping' into the memory system to cache your static file data. The downside is that the work of maintaining an intelligent file cache has been offloaded from the OS to you, the programmer. I am not experienced enough with the postgres memory allocation system to say how well this would work for PostgreSQL. For shared memory used only for IPC, typically a VM system treats it no differently from any other non-shared memory, so if it's doing something you don't want or need (a proposition I quite heartily disagree with), it's going to be doing that for every piece of memory your application allocates and uses, shared or not. The OS has to guarantee that the memory can be swapped out to file at any time and therefore mirrors the pagefile to the allocated memory blocks. The OS does not need to write the pagefile. On modern Unix systems that are not allowing overcommit, the space will be allocated but never written unless there's a need to free up some physical memory, and the pages in question are used infrequently enough that the system decides that they are good candidates to be paged out. I would imagine that Windows does the same. In windows, things are backwards: the space is allocated in virtual memory *first* (i.e. the page file), then following a page fault it gets swapped into memory. The overhead I spoke of was related to the fact that windows always has to ensure space exists in the page file (or some user defined file) to swap the file back out. IMHO, *nix has a much superior approach to IPC in this context. It's much simpler and very straightforward. 
It also explains why in windows, most server apps are multithreaded, not multi-process. I agree with you on most salient points. The question is: are MMFs the proper analog of shared memory in a native port of postgres? My answer to that question is: it is by no means certain, but what else is there to use? Merlin cjs -- Curt Sampson [EMAIL PROTECTED] +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC
Re: [HACKERS] PostgreSQL Windows port strategy
Well said. I was just trying to be funny. Since we are all programmers here, I'll try and re-express my thought in a (somewhat) portable language:

char HackerThought[] = "Maybe there really is something to this windows "
                       "business. Perhaps it's time to consider a port.";
int iAnnoyanceLevel = 0;

void PerformDailyRoutine()
{
    char *EmailMessage;

    Think(HackerThought);
    while ((EmailMessage = GetEmailMessage(HACKER_LIST)))
    {
        if (!strcmp(EmailMessage, "Windows is an unstable, crappy OS"))
            iAnnoyanceLevel++;
        if (!strcmp(EmailMessage, "Windows is better than that pile of trash, linux"))
            iAnnoyanceLevel += 2;
        if (!strcmp(EmailMessage, "Can anybody here tell me how to install the IPC-Daemon as a service on my win98 machine?"))
            iAnnoyanceLevel += 100;
    }
}

Really, I'm new here, and I have no business telling anybody anything, anyway :) Merlin -Original Message- From: Jean-Michel POURE [mailto:[EMAIL PROTECTED]] Sent: Wednesday, February 12, 2003 11:26 AM To: [EMAIL PROTECTED] Subject: Re: [HACKERS] PostgreSQL Windows port strategy Le Mercredi 12 Février 2003 15:49, Merlin Moncure a écrit : I think there should be a special mailing list set up called pgsql-win32-advocacy where people can continually harass the postgres dev team and debate the merits of the win32 operating system. I realize my views about PostgreSQL are not shared by anyone, otherwise in a few hours' time I would have received several emails saying "I have the same feeling", etc. It seems like everyone is looking for something different, which ultimately turns out to become ... PostgreSQL.
[HACKERS] Can pessimistic locking be emulated?
I am trying to emulate a pessimistic locking system you would find in an old school database file system, for example cobol. Generally, when a cobol program tries to read a record that is locked by somebody else, the read fails and either a message is displayed to the user or an error handling procedure is executed. I would like to emulate this behavior for legacy code while using MVCC for newer procedures I write. 4 questions: 1. Can you query whether a tuple is locked by another transaction (the documentation unclearly suggests this can't be done via the pg_lock view) before executing select for update...? 2. If so, is this reasonably efficient to do, i.e. a straight join on oid/xid? 3. If so, is this possible to query without a race condition regarding the lock status? 4. If so, is this likely to be possible in future versions of postgres without non-trivial changes? In other words, if User B attempts to select for update a record that User A has selected for update, it would be nice if User B's query would fail with a NOTICE to act upon. Thanks in advance, Merlin
Re: [HACKERS] Can pessimistic locking be emulated?
That's my fallback position. Obviously, this will lead to false positives depending on server load. In my case, I'm targeting between 30-50 users, so it's likely to throw timeouts for various reasons other than locks, even though my queries of interest are generally select a from b where id = c type of thing. This is a kludgy solution but it's still better than writing cobol. The bigger issue is that a timeout will not return the reason the query timed out. There are cases where I would like to run a select for update over a range of records and handle the locked records and unlocked records differently. A query that could match locked oids vs. the oids I am interested in would be super. I could then aggregate my select for updates into larger queries and reap massive performance gains. Another way of putting it is this: waiting for your select to time out is kind of like parking in Manhattan: you back your car up until you hit the next car. I would sort of like to, uh, look in the rear view mirror first. Merlin In other words, if User B attempts to select for update a record that User A has selected for update, it would be nice if User B's query would fail with a NOTICE to act upon. No idea if this is of any help, but you may have a look into PostgreSQL 7.3 Documentation 3.4. Run-time Configuration: STATEMENT_TIMEOUT (integer) Aborts any statement that takes over the specified number of milliseconds. A value of zero turns off the timer. Regards, Christoph
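A sketch of the statement_timeout fallback being discussed (the table and column names are hypothetical, and the timeout value is arbitrary):

```sql
-- abort any statement, including a blocked SELECT ... FOR UPDATE,
-- after 500 milliseconds
SET statement_timeout = 500;

BEGIN;
-- if another session holds the row lock, this errors out when the
-- timer expires; note the error does not say *why* the query timed out,
-- which is exactly the limitation complained about above
SELECT a FROM b WHERE id = 42 FOR UPDATE;
COMMIT;
```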
Re: [HACKERS] Can pessimistic locking be emulated?
I was referring to 10.3 in the administrator's guide, regarding the pg_lock view. According to the documentation, the view only contains table level locks. However, the view also contains an xid for transactions. The unclear part, at least to me, was what the role of the xid was in the view and whether it could be used to produce a list of locked tuples somehow. The xid is referred to as a 'lockable object'. I wasn't sure of the xid's role in the mix. I see now how it all works together. In my case, being able to view outstanding row level locks would be enormously useful. I'm assuming this is not possible for structural or performance reasons. I'm aware of the possible nasty side effects of repeated query calls to the lock manager. I'm also aware that what I'm asking about may be folly or silly; my understanding of how MVCC and transactions work together is not very refined. A curious thought struck me: does the pg_lock view follow the MVCC rules, i.e. if you query the pg_lock view inside a transaction, and an external effect introduces new locks into the server, are you able to see those locks? Merlin 1. Can you query whether a tuple is locked by another transaction (the documentation unclearly suggests this can't be done via the pg_lock view) before executing select for update...? Where did you find this? Regards, Christoph
Re: [HACKERS] Can pessimistic locking be emulated?
This directly answers my question (I wasn't previously aware that the xid could be queried out in such a useful fashion). Not only does this accomplish what I need, but it now allows me to not use select ... for update and stick with a transaction-based locking mechanism. The 'why' isn't that interesting in my case: merely the knowledge that the record is involved in a transaction is enough. I've felt for a while that the descriptions of transactions, MVCC, and row level locking in the official docs could use a little better treatment (selfishly motivated, I could never figure them completely out!) but this is the wrong list for that :). Many thanks to the hackers for helping me with my problem. Merlin Actually, I don't think you need a dirty read at all. A locked row can't be deleted as well (because there's only one xmax slot), so if you can see it (ie, you think its xmin is committed) then you can in principle find out whether it's locked or not. We just don't expose the info at the moment. (You can see xmax at the user level, but you can't easily tell if xmax is trying to delete the row or just lock it, because you don't have access to the infomask bit that would tell you.) regards, tom lane
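A sketch of what is visible at the SQL level, per Tom's note (the table name is hypothetical): xmax can be selected like any system column, but the infomask bit distinguishing a deleter from a mere locker is not exposed:

```sql
-- a nonzero xmax means some transaction has deleted *or* locked this row;
-- from plain SQL you cannot tell which one it is
SELECT ctid, xmin, xmax, * FROM accounts WHERE id = 42;
```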
FW: [HACKERS] [PATCHES] XML ouput for psql
-Original Message- From: Merlin Moncure Sent: Monday, March 03, 2003 3:47 PM To: 'Peter Eisentraut' Subject: RE: [HACKERS] [PATCHES] XML ouput for psql My $.02: keep the xml formatting rules as simple as possible and rely on XSLT to do the document markup (going out) and schemas/XSLT to do the validation (going in). This would allow flexibility for any type of application. Without that stuff, just do a very basic <column>data</column>. There are many excellent free tools for xml manipulation, which imho are the best thing about xml. Xml datasets that are not nested tend to get verbose :( Merlin -Original Message- From: Peter Eisentraut [mailto:[EMAIL PROTECTED]] Sent: Monday, March 03, 2003 12:55 PM To: [EMAIL PROTECTED] Cc: PostgreSQL Development Subject: Re: [HACKERS] [PATCHES] XML ouput for psql [EMAIL PROTECTED] writes: I don't agree with this: XML and XHTML are two different things. No one claimed anything to the contrary. We could certainly upgrade the HTML portion, but I am pretty sure that the XML standard calls for this format: <columnname>data here</columnname> The XML standard does not call for any table format. But a number of table formats have been established within the XML framework. Some of them are formatting-oriented (e.g., the HTML model, or CALS, which is used in DocBook) and some of them are processing-oriented (e.g., SQL/XML). Which do we need? And which do we need from psql in particular (keeping in mind that psql is primarily for interactive use and shell-scripting)? In any case, it should most likely be a standard table model and not a hand-crafted one. (If, for whatever reason, we go the processing-oriented route, then I claim that there should not be a different output with and without \x mode.) 
-- Peter Eisentraut [EMAIL PROTECTED]
Re: [HACKERS] XML ouput for psql
XSLT could be used to convert virtually any xml table format directly into an insert statement. For me, this is better than using a programming language plus a parser. XSLT is quite powerful and fast and is built on top of xpath, and is a closer fit to the declarative programming model of sql. Validation could be done at the xslt stage or with schemas, which I prefer. Actually, the difficult part has been getting the information back into the database. Getting it out is a very simple query. I imagine that every language/environment has an SQL-XML library somewhere, but I wasn't able to find something that would go from XML to SQL.
Re: [HACKERS] XML ouput for psql
This is just about a total conversion of the backend to an xml document server. The marriage of xml and sql is awkward and not easily retrofitted to existing databases. It's pretty much proven that hierarchical storage techniques (xml included) are more difficult to manage and use than traditional sql databases. However, xml does have some very powerful supplemental technologies for document generation on the client end, especially xslt. Unless there is a compelling reason to use those tools, you are 99% likely better off not using xml at all. XML has also found a niche in the edi world, but in this case you can get away with using the blob technique below. Before thinking about xquery you have to think about the role tuples and relations play in storing xml data. The simplest route is to store your xml data in a blob object and use little tricks like xslt to extract elements out of the document into separate fields for index purposes and use vanilla sql queries to get the data. This is a very loose coupling of sql and xml and leads to very brittle designs because at the very least you have to store two separate definitions of your data as well as your marshalling xslt scripts. I wrote a little blob based xml server which you can see here if you are interested: http://www.radiosoft.com/freetools.php?Location=Directional+Patterns The real magic was in the xslt which you can't see, though. A tighter coupling would involve decomposing your xml structure into discrete objects and building xml power into the backend. I think it is yet to be proven if this is even reasonably possible. The big problem with xml is that there are too many ways to do things, for example the choice of dtd or schemas. I think the problem with this approach is that recomposing your document involves complex or inefficient queries. If you think this is a good idea, good luck; many companies have tried and I've personally not seen one that seems to work very well. 
The next major version of MS SQL server is rumored to be something of an xml document server, and that's been several years in development. Merlin -Original Message- From: Bob Calco [mailto:[EMAIL PROTECTED] Sent: Wednesday, March 05, 2003 9:10 AM To: Merlin Moncure; Alan Gutierrez Cc: [EMAIL PROTECTED] Subject: RE: [HACKERS] XML ouput for psql I would like to see PostgreSQL eventually support XQuery: http://www.w3.org/TR/xquery/ http://www.w3.org/TR/query-datamodel/ I see potentially an alternative front end called xsql, providing substantially the same functionality as psql, only using XQuery syntax and optionally returning recordsets as XML. Anybody want to put together a team to explore this seriously? There are probably several non-trivial semantic issues on the back end, but I only dimly grasp them at this point. - Bob Calco %% -Original Message- %% From: [EMAIL PROTECTED] %% [mailto:[EMAIL PROTECTED] Behalf Of Merlin Moncure %% Sent: Wednesday, March 05, 2003 8:16 AM %% To: Alan Gutierrez %% Cc: [EMAIL PROTECTED] %% Subject: Re: [HACKERS] XML ouput for psql %% %% %% XSLT could be used to convert virtually any xml table format directly %% into an insert statement. For me, this is better than using a %% programming language plus a parser. XSLT is quite powerful and fast and %% is build on top of xpath, and is a closer fit to the declarative %% programming model of sql. Validation could be done at the xslt stage or %% with schemas, which I prefer. %% %% %% Acually, the difficult part has been getting the information back %% into the database. Getting it out is a very simple query. I imagine %% that every language/environment has an SQL-XML library somewhere, %% but I wasn't able to find something that would go from XML to SQL. 
Re: [HACKERS] Win32 Powerfail testing
My experience with windows backend work is that you have to turn off all buffering and implement your own write cache of sorts. Flushing is not the only reason: heavy buffering of files (the default behavior) also tends to thrash the server, because the cache does not always release memory properly. Likewise, with memory for maximum results you have to go straight to VirtualAlloc() and avoid using the C run time to do any persistent memory allocation. Memory pages get mapped to file pages and all file reads/writes are on sector boundaries. Generally, it's a nightmare. Merlin -Original Message- From: Dave Page [mailto:[EMAIL PROTECTED] Sent: Thursday, March 06, 2003 11:02 AM To: Tatsuo Ishii Cc: [EMAIL PROTECTED] Subject: Re: [HACKERS] Win32 Powerfail testing -Original Message- From: Tatsuo Ishii [mailto:[EMAIL PROTECTED] Sent: 06 March 2003 15:17 To: Dave Page Cc: [EMAIL PROTECTED] Subject: Re: [HACKERS] Win32 Powerfail testing I'm sure FlushFileBuffers() is useless for files opened with open() too. As I said in the previous mails, open()+_commit() does the right job with the transaction log files. So I think I should probably stick with the open()+_commit() approach for ordinary table/index files too. Oh, I didn't see that message. So it's either: open() + _commit() Or CreateFile() + FlushFileBuffers() Magnus also mentioned using FILE_FLAG_NO_BUFFERING or FILE_FLAG_WRITE_THROUGH with CreateFile(). I was concerned about the additional complexity with FILE_FLAG_NO_BUFFERING, but FILE_FLAG_WRITE_THROUGH sounds like it might do the job, if a little sub-optimally. Is there really no way of allowing a decent write cache, but then being able to guarantee a flush at the required time? Sounds a little cuckoo to me but then it is Microsoft... Anyhoo, it sounds like open() and _commit() is the best choice as you say. Regards, Dave. 
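The open() + flush pattern discussed above maps onto the POSIX open() + fsync() idiom; on Win32, _commit() (or FlushFileBuffers() for CreateFile() handles) plays the role of fsync(). A minimal portable sketch (the function name write_durable is mine, not from any patch):

```c
#include <fcntl.h>
#include <unistd.h>

/* Write a buffer to a file and force it out of the OS cache before
   returning.  On POSIX the flush call is fsync(); on Win32 the
   equivalent is _commit() for open()'d descriptors.
   Returns 0 on success, -1 on failure. */
int write_durable(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, buf, len) != (ssize_t) len ||
        fsync(fd) != 0)          /* the crucial step for powerfail safety */
    {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

With FILE_FLAG_WRITE_THROUGH the flush step would be implicit in every write, which is the "a little sub-optimal" trade-off Dave mentions.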
Re: [HACKERS] Aggregate rollup
-Original Message- From: mlw [mailto:[EMAIL PROTECTED] Sent: Wednesday, March 05, 2003 3:47 PM To: [EMAIL PROTECTED] Subject: [HACKERS] Aggregate rollup I had written a piece of code about two years ago that used the aggregate feature of PostgreSQL to create an array of integers from an aggregate, as: select int_array_aggregate( column ) from table group by column Do I understand correctly that this still follows the normal rules for grouping, so that only like values are put in the array? Example: column has values 1,1,1,2,2 spread over 5 rows. Your query returns two rows with row1={1,1,1} and row2={2,2}...is this correct? Also, what if your aggregate column is different from the group column: Table t with columns c1, c2 with 5 rows:

C1  C2
1,  1
1,  2
1,  3
2,  1
2,  2

Does select C1, int_array_aggregate( C2 ) from t group by C1 return

1, {1, 2, 3}
2, {1, 2}

?? FWIW, I think that's a pretty cool function. This allows the backend to telescope 1 dimension (only) out of a dataset, the most detailed one. In certain situations with large datasets over slow connections, this could be a big payoff. Also, all this talk about XML has got me thinking about how to allow basic query features to provide simple nesting services. consider: select C1, C2 from t for xml; returns:

<t>
<C1>1</C1><C2>1</C2>
<C1>1</C1><C2>2</C2>
<C1>1</C1><C2>3</C2>
<C1>2</C1><C2>1</C2>
<C1>2</C1><C2>2</C2>
</t>

select C1, xml_aggregate(C2) from t for xml; returns:

<t>
<C1 value="1"><C2>1</C2><C2>2</C2><C2>3</C2></C1>
<C1 value="2"><C2>1</C2><C2>2</C2></C1>
</t>

create table fast_lookup as select reference, int_array_aggregate(result) from table group by result The question is, would a more comprehensive solution be wanted? Possible? Something like: Any thoughts? I think I need to fix the code in the current /contrib/intagg anyway, so is it worth doing the extra work to include multiple data types? Yes. Just a thought. Merlin
Re: [HACKERS] What's up with www.postgresql.org?
I haven't been able to get to it all morning. I think it's time to consider migrating off of mysql on the web server. Merlin
Re: [HACKERS] talking to postgresql from C/C++
mlw [mailto:[EMAIL PROTECTED] wrote: I use PostgreSQL with C++ all the time. I actually have a SQL class that abstracts libpq and ODBC, so I'm pretty much past a lot of the how I want to use it stuff. What about libpq++? I have not used the thing, but if he absolutely insists on using C++ in his database interface that's at least worth checking out. Same for embedded C. I often use the zeos toolkit for postgres, which works with C++ Builder, Delphi, and Kylix. If you use those tools I can vouch that they are a good way to write apps with postgres. The zeos connection toolkit is an order of magnitude faster than pgodbc. For tight oo integration with the database, I would take either Java or (if you hail from *nix and can deal with mono) C#. Merlin ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Who puts the Windows binaries on the FTP server?
Justin Clift wrote: This is the 'proof of concept' cygwin windows build. Strangely, I have a newer build than the one on the ftp server. Is there a binary version of postgres with Jan's patch available? Uh Oh. When you say newer version, what gives the feeling of it being newer? The timestamp of the file on the ftp server is 1/28/03. The timestamp of file I previously dl'd (which I collected from whatever link you posted on this list) is 2/3/03. However I downloaded the older version and they are the same (same number of bytes, at least). Merlin ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] [GENERAL] division by zero
Tom Lane wrote: I checked into this, and indeed OS X 10.2 is behaving funny: integer divide by zero doesn't raise any signal, it just returns a bogus answer. They're within their rights to do so according to the ANSI C spec (wherein division by zero is stated to have undefined behavior). But since other BSD-derived Unixen all seem to raise SIGFPE, I can't help wondering if this shouldn't be considered a bug. FWIW, this also is a problem with some of the windows ports. For example, 'select 0/0' is unpredictable and can cause the server to gpf and restart. This does not include the SRA port, because I don't have it. Merlin ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
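Since the C standard leaves integer division by zero undefined, the only portable fix is to guard the operation itself rather than rely on SIGFPE or platform exceptions; a minimal sketch (the helper name checked_div is mine):

```c
#include <limits.h>
#include <stdbool.h>

/* Portable guard for integer division.  C gives undefined behavior both
   for division by zero and for INT_MIN / -1, which is exactly why the
   observed results differ across OS X, the BSDs, and the Windows ports.
   Returns false on the undefined cases instead of letting the platform
   decide whether to trap, crash, or return garbage. */
bool checked_div(int num, int den, int *result)
{
    if (den == 0)
        return false;                   /* 0/0, n/0: undefined in C */
    if (num == INT_MIN && den == -1)
        return false;                   /* overflows on two's complement */
    *result = num / den;
    return true;
}
```

The cost is a branch per division, but it makes 'select 0/0' raise a clean error on every platform instead of a gpf on some of them.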
Re: [HACKERS] Roadmap for FE/BE protocol redesign
Justin Clift wrote: PostgreSQL 8.0 ** + Includes PITR and the Win32 port *snip* I feel like the upcoming 7.4 is the most important release since the introduction of toast, maybe even since the introduction of the sql language. I wholeheartedly agree with your proposition. Merlin ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] [GENERAL] division by zero
Tom Lane wrote: This is not C. I can't argue that; but it will compile on a C compiler on the Microsoft platform. I'm not sure if you were answering tongue-in-cheek, so for the benefit of the group: __try and __except, as far as I can tell are the only way to gracefully handle certain events. There is also a __finally. This is very much a Microsoft hack to C and not C++. GetExceptionCode() is from the win32 api. In C++, you get to use the much more standard try/catch system. Katie mentioned a while back using CWinApp from MFC for the windows port. I advised against this based on it requiring a C++ compiler and the MFC libs. However, if the win32 port is going that route maybe introducing a little c++ exception handling might be the best solution to the int/0 problem. Barring that, it comes down to a choice of two not very pleasant scenarios: either adopting the __try abomination or standardizing on non-microsoft implementation of the C run time. You can forget using anything from MFC in this case. The only other solution is a #ifdef win32 around places that potentially use integers in the divisor and do some nasty hacking. I would prefer to use some type of signaling or 'exception' handling to that. The end justifies the means, I suppose. Merlin ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] [GENERAL] division by zero
Doug Royer wrote: No, try/catch does not trap division by zero unless the underlying implementation throws an error there is nothing to catch. I am absolutely 100% sure that you can catch int/0 with a try catch handler (in c++) on windows platforms (when compiled with ms/borland compiler). All these weird issues are a direct result of windows's dos legacy. Try it and see. Merlin ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] [INTERFACES] Roadmap for FE/BE protocol redesign
If the backend can not handle the version I request, but can handle a prior version, I'd like to know. I am planning on having handlers for multiple protocol versions in the same memory space (I'm using Smalltalk, BTW) so that one application can talk to various databases of various vintages. how about a system table, say pg_features which can be queried by the fe interface? This could be a table of bools with named features and/or other little widgets for compatibility. No protocol modification required, unless the scope of the protocol change is such that you can't execute a connection and a query. Merlin ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Roadmap for FE/BE protocol redesign
Justin Clift wrote: confidentiality level of the Win32/PITR patches at present, but I'd guess there would be at least a few solid volunteers willing to contribute to the Win32/PITR ports if we asked for people to step forwards. I'd like to help. I've been following the list for several months now. I'd like to submit a patch or two and get a little active if I can cut the mustard, so to speak. If not, I can always supply a little testing and benchmarking. I have some experience writing backends, GIS, xml, and all manner of database. Unfortunately I have very low familiarity with unix tools, except cvs. Is there a non cygwin version of the source that will compile (or not) on a win32 machine? That would be a great place to start. Merlin ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Roadmap for FE/BE protocol redesign
Peter Eisentraut writes: Dave Page writes: Well what I *really* need has been made quite clear in other posts, but, when I say resultset in the same sentence as pgAdmin, I'm referring to the ability to enter an arbitrary SQL query, have the results displayed in a grid, which can then be edited. To do this pgAdmin needs to be able to figure out enough info about the source of the data to generate the required insert/update/delete statements. Right. But since you can't really write a literal SQL statement that does an update that refers to a previous query, you are already doing a fair amount of internal magic anyway, so if the meta-data is determined by magic as well, that seems consistent. While this may be true, it is possible to build a client side system that can do this for you. Views and cursors are great, but they are not always the best tool for the job. What you need is an updateable cursor on the server side. It has all the facilities you need, including standardized ways to find out the updatability metadata. Please concentrate on that and do not attempt to clutter the wire protocol with data that will not withstand a thorough investigation of semantics. It's not foolproof and may even be foolhardy, but there are certain advantages to client-side decision making. A couple of integers or so for each attribute is not a terribly high price to pay. If a compelling case can be made that it can be put to good use, why not do it? Merlin
Re: [HACKERS] [GENERAL] division by zero
The big question is how to fix this on Win32. Is a test in the integer division routines enough? Is there a signal to catch on Win32? After fighting with the docs a little bit, here is how to handle an int/0 in a C application.

#include <stdio.h>
#include <excpt.h>
#include <windows.h>

int HandleException(int iExcept);

int main(int argc, char* argv[])
{
    int b = 0;
    int a;

    puts("hello");
    __try
    {
        puts("in try");
        a = 0 / b;
    }
    __except (HandleException(GetExceptionCode()))
    {
        puts("in except");
    }
    puts("world");
    return 0;
}

int HandleException(int iExcept)
{
    if (iExcept == EXCEPTION_INT_DIVIDE_BY_ZERO)
    {
        puts("Handled int/0 exception");
        return EXCEPTION_EXECUTE_HANDLER;
    }
    /* call the system handler and crash */
    return EXCEPTION_CONTINUE_SEARCH;
}

Merlin
[HACKERS] Problems with win32 patch/build
I am working my way through getting the postgres win32 port (7.2.1) compiled under MSVC. I got through the patch and the compilation, but am having problems with the linker. Trying to link the backend, I get 43 linker errors like: unresolved external symbol GUC-scanstr unresolved external symbol ProcessConfigFile unresolved external symbol num_columns_read unresolved external symbol Int_yychar These symbols are listed in the postgres.def file. Removing them removes the errors. If I do so, I get a new set of errors: Linking... Creating library ..\..\Debug/postgres.lib and object ..\..\Debug/postgres.exp postmaster.obj : error LNK2001: unresolved external symbol _ProcessConfigFile bootstrap.lib(bootstrap.obj) : error LNK2001: unresolved external symbol _ProcessConfigFile tcop.lib(postgres.obj) : error LNK2001: unresolved external symbol _ProcessConfigFile bootstrap.lib(bootstrap.obj) : error LNK2001: unresolved external symbol _Int_yyparse ..\..\Debug\backend.dll : fatal error LNK1120: 2 unresolved externals Error executing link.exe Also, when trying to compile ecpg, I get flex: could not create '.\preproc\pgc.c' There were some weird things that went on during my first compile run that I fixed as I went and I'm afraid I broke something. Any suggestions? If I can get through this I'll start hitting patch #2 and start running tests versus 7.3.x source. Merlin ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Problems with win32 patch/build
I am working my way through getting the postgres win32 port (7.2.1) compiled under MSVC. I got through the patch and the compilation, but am having problems with the linker. Trying to link the backend, I get 43 linker errors like: unresolved external symbol GUC-scanstr unresolved external symbol ProcessConfigFile unresolved external symbol num_columns_read unresolved external symbol Int_yychar These symbols are listed in the postgres.def file. Removing them removes the errors. If I do so, I get a new set of errors: Linking... Creating library ..\..\Debug/postgres.lib and object ..\..\Debug/postgres.exp postmaster.obj : error LNK2001: unresolved external symbol _ProcessConfigFile bootstrap.lib(bootstrap.obj) : error LNK2001: unresolved external symbol _ProcessConfigFile tcop.lib(postgres.obj) : error LNK2001: unresolved external symbol _ProcessConfigFile bootstrap.lib(bootstrap.obj) : error LNK2001: unresolved external symbol _Int_yyparse ..\..\Debug\backend.dll : fatal error LNK1120: 2 unresolved externals Error executing link.exe Also, when trying to compile ecpg, I get flex: could not create '.\preproc\pgc.c' There were some weird things that went on during my first compile run that I fixed as I went and I'm afraid I broke something. Any suggestions? If I can get through this I'll start hitting patch #2 and start running tests versus 7.3.x source. I wrote: I think I figured it out. lexx has a problem with the guc_file.l. I should be able to tackle that: sorry to pester. Merlin ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] another optimizer bug?
Following example is with latest anonymous cvs of 7.5. I can't get LIKE to use an index even with seq_scan = off. I'm using the default locale and hchassis.vin_no is defined as char(17). The hchassis table has about 250k rows in it. The non-aggregate versions of the selects have the same results WRT the optimizer. Varying the VIN makes no difference. Simple form: select a from b where a like 'k%'; Am I crazy? This is a query I would normally expect to always use the index. Merlin Log: [first two queries with like, second two with =]

cpc=# explain select count(vin_no) from hchassis where vin_no like '2FTZX08W8WCA21580%';
                             QUERY PLAN
---------------------------------------------------------------------
 Aggregate  (cost=19576.22..19576.22 rows=1 width=21)
   ->  Seq Scan on hchassis  (cost=0.00..19576.21 rows=1 width=21)
         Filter: (vin_no ~~ '2FTZX08W8WCA21580%'::text)
(3 rows)

cpc=# select count(vin_no) from hchassis where vin_no like '2FTZX08W8WCA21580%';
 count
-------
     1
(1 row)

cpc=# explain select count(vin_no) from hchassis where vin_no = '2FTZX08W8WCA21580';
                             QUERY PLAN
---------------------------------------------------------------------
 Aggregate  (cost=5.61..5.61 rows=1 width=21)
   ->  Index Scan using hchassis_vin_no_idx on hchassis  (cost=0.00..5.60 rows=1 width=21)
         Index Cond: (vin_no = '2FTZX08W8WCA21580'::bpchar)
(3 rows)

 count
-------
     1
(1 row)
[HACKERS] select like...not using index
Sometime between yesterday and today queries in the form of select * from t where t.f like 'k%' have been broken so that they never use the index (on win32, not sure about others). On win32, at least, they have been broken for a while but this was due to a known issue based on the locales. AFAICT, the current cvs has addressed this issue and (show lc_collate returns C) there seems to be no reason why the queries aren't working properly. Merlin ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [pgsql-hackers-win32] [HACKERS] select like...not using index
Sometime between yesterday and today queries in the form of select * from t where t.f like 'k%' have been broken so that they never use the index (on win32, not sure about others). On win32, at least, they have been broken for a while but this was due to a known issue based on the locales. AFAICT, the current cvs has addressed this issue and (show lc_collate returns C) there seems to be no reason why the queries aren't working properly. Merlin Did you do an ANALYZE on the table? Yes. Just for kicks, I also drop/rc the index...no help. Following that, I ran a fresh initdb which reported: The database cluster will be initialized with locale English_United States.1252. I then ran I just recently had the same issue (due to locale problems). This was recently fixed in cvs and replaced the hack I was using to work around the problem. The index search no longer works and I am very suspicious about a locale related issue. This is all off of a fresh copy of 7.5devel from the anonymous cvs server. Are there a lot of duplicate keys? How big is the table? About 250k with less than 1% duplication. What does the explain look like?

cpc=# explain select * from hchassis where vin_no = '2FTZX08W8WCA24365';
                             QUERY PLAN
--------------------------------------------------------------------
 Index Scan using hchassis_vin_no_idx on hchassis  (cost=0.00..8.94 rows=2 width=437)
   Index Cond: (vin_no = '2FTZX08W8WCA24365'::bpchar)
(2 rows)

cpc=# explain select * from hchassis where vin_no like '2FTZX08W8WCA24365%';
                             QUERY PLAN
--------------------------------------------------------------------
 Seq Scan on hchassis  (cost=0.00..19577.70 rows=1 width=437)
   Filter: (vin_no ~~ '2FTZX08W8WCA24365%'::text)
(2 rows)

cpc=#

Merlin
Re: [pgsql-hackers-win32] [HACKERS] select like...not using index
It looks to me like you have an index of type bpchar but are searching with type text. I find type conversions very limited with LIKE. I would create an index on 'vin_no' using a cast to TEXT. This should work on both queries. Not in this case. Just to be sure, I created a new column as text type, created index, analyzed, and searched and got the same behavior. Furthermore, I did this:

cpc=# show lc_collate;
 lc_collate
------------
 C
(1 row)

cpc=# show lc_ctype;
 lc_ctype
----------
 C
(1 row)

followed by this:

C:\postgres\pgsql\src\test\locale> pg_controldata
[...]
LC_COLLATE: English_United States.1252
LC_CTYPE: English_United States.1252

At this point I'm about 90% sure I've turned up a locale related bug...initdb warned me wrt the locale but psql is still reporting 'C'. Plus, my queries don't work where they used to about a week ago. My next step is to initdb --locale=C to confirm this. I've informed Magnus about this and he is looking into it. Merlin
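The connection between collation and the LIKE index scan is worth spelling out: in the "C" locale, string comparison is plain byte order, so the planner can rewrite col LIKE 'k%' into an index range scan (col >= 'k' AND col < 'l'); under any other collation strcoll() may interleave strings and that rewrite is unsafe, so the planner falls back to a seq scan. A tiny standalone illustration (not PostgreSQL source, just the invariant the planner relies on):

```c
#include <locale.h>
#include <string.h>

/* Under the "C" locale, every string with prefix "k" must collate in
   the half-open range ['k', 'l'), which is exactly what makes the
   LIKE-prefix-to-range-scan rewrite valid.  Returns nonzero if the
   invariant holds in the current environment. */
int like_prefix_range_ok(void)
{
    if (setlocale(LC_COLLATE, "C") == NULL)
        return 0;
    return strcoll("k", "kzzz") <= 0 &&   /* prefix sorts at or before */
           strcoll("kzzz", "l") < 0;      /* ...and strictly before 'l' */
}
```

This is why an initdb under English_United States.1252 silently kills these plans even though equality searches still use the index.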
Re: [pgsql-hackers-win32] [HACKERS] select like...not using index
Hmmm, snipped from your reply was the explain plan from the query where it was clear you were using two different character data types: bpchar and text. That, alone, may have been a problem. Looking at your defaults, did you do: initdb --locale=C somepath I reran initdb --locale=C yesterday and that fixed the problem. Since I am doing the nightly win32 builds I run initdb each night around 1am and I missed the locale warning. I had a feeling it was something like this. The part I don't understand is why psql was saying the locale (show lc_ctype) was 'C' when pg_controldata was not. This, along with recent code revisions, tricked me for a while (not to mention the default locale being changed). Here is what I think happened (this might be a bug, might not): Each night I run initdb but I use a special postgresql.conf which is optimized for quick data loading. This is copied over the default one after the server is started. This contains the locale information which is 'initialized by initdb'. These were still 'C' because this file was generated before the default locale was changed. psql shows this information when you ask it for the locale info even if it is incorrect. The real settings are of course built into the database itself. This stuff is all new to me, I've never really had to deal with locales before. Personally, I think, if I do not specify a locale, I don't want a specific locale. Period. I haven't been paying too close attention to the hackers list to say when this happened, but it bit me a couple times. I now accept this as dogma :) Merlin
Re: [HACKERS] Compile failure with SSL
Dave Page wrote: OK, looks like the error below is a Win32 thing. The patch attached #ifdef'd out the permissions check on the private key file as it won't work under Windows anyway (a similar check in postmaster.c has already been ifdef'd out for the same reason). Incidentally, the consts are also used in initdb.c where they work just fine - can't seem to figure out where it gets them from though (must be missing something). Normally they're in sys/stat.h, but not in mingw (where they can only be found in glibc/sys/stat.h). Magnus also mentioned to me that Merlin's snapshots are building fine without this patch - dunno if he has a different version of sys/stat.h... Anyway, regardless of that I think the patch is relevant. I had made the same changes as you to be_secure...(actually, I recently stopped doing that, I need to update the snapshot info page). The real problem is that the SSL based connection does not receive signals while waiting on select() like the regular connection does. IMO, the SSL patch should not be applied until this has been resolved. Merlin
Re: [HACKERS] PREPARE and transactions
Now, here's a scenario that has us worried:

BEGIN
PREPARE foo AS ...
...
[error]
DEALLOCATE foo   [fails: already aborted by previous error]
ABORT
BEGIN
PREPARE foo AS ...   [fails: foo is already defined!]
EXECUTE foo          [fails: already aborted by previous error]
COMMIT               [fails: already aborted by previous error]

Part of the problem is that PREPARE has no provision to overwrite an existing plan (CREATE OR REPLACE). I run into this all the time because I make heavy use of prepared statements to emulate an ISAM file system. I have to jump through hoops to keep track of what statements are already prepared to keep from bouncing the current transaction. However, at least for me, nested x basically solves this problem. I'll just always wrap the prepare statement with a sub-transaction and commit/rollback as necessary. This is odd because the rollback does nothing other than guard the following statements from the prepare failure to execute. So, you do:

BEGIN
BEGIN
PREPARE foo AS ...
COMMIT/ROLLBACK
...
[error]
DEALLOCATE foo   [fails: already aborted by previous error]
ABORT
BEGIN
BEGIN
PREPARE foo AS ...   [fails: foo is already defined!]
COMMIT/ROLLBACK
EXECUTE foo   [will now always run if prepare is aborted]
COMMIT        [commit executes]

To me, this is good style and it looks like nested x is going to make 7.5. I have no opinion on whether rollback should affect prepare/deallocate. Merlin
Re: [HACKERS] PREPARE and transactions
Jeroen T. Vermeulen wrote: Well, except prepared statements apparently; I'm not sure why they are an exception. When I say within a transaction as opposed to outside a transaction, I mean of course an explicit transaction. If you want a prepared statement to last throughout the session, I'd say it stands to reason that you create it outside a transaction--in unfettered session context, so to speak. I can't see how that would be either less intuitive or harder to program in the client. I disagree. Lots of people use prepared statements for all kinds of different reasons. A large percentage of them do not need or make use of explicit transactions. Having to continually rebuild the statement would be a hassle. The caching mechanism also seems like extra work for little result (to be fair, I like the idea of multiple backends being able to make use of the same plan). Generic routines can just always wrap the prepare statement in a subtransaction, which now allows safety until such time that a create or replace version becomes available. Merlin p.s. Is this correct behavior? A DROP TABLE gives a missing oid error which is fine, but I don't like this much:

cpc=# create table test (a int, b int, c int);
CREATE TABLE
cpc=# prepare p (int) as select * from test;
PREPARE
cpc=# execute p(0);
 a | b | c
---+---+---
(0 rows)

cpc=# alter table test drop column a;
ALTER TABLE
cpc=# execute p(0);
 a | b | c
---+---+---
(0 rows)
Re: [HACKERS] PREPARE and transactions
>> I disagree. Lots of people use prepared statements for all kinds of
>> different reasons. A large percentage of them do not need or make use
>> of explicit transactions. Having to continually rebuild the statement
>> would be a hassle. The caching mechanism also seems like extra work for

> I think we're talking at cross purposes here... If the client doesn't
> use explicit transactions, as you say is common, then you're obviously
> not defining prepared statements inside explicit transactions either.
> And so you're certainly not going to be bothered by what happens at the
> end of a transaction! In that case, what I'm saying doesn't affect you
> at all, in any way.

Ok, I am starting to get your point and perhaps agree with you. Let me give a little more detail about where I am coming from, and frame it inside your logic. Also, my situation is probably highly unusual and maybe unimportant in the grander scheme of things.

I am using PostgreSQL as a backend for legacy COBOL applications and have written a driver which maps the COBOL I/O statements to SQL statements. To save a little bit on parsing time, and for various other reasons, these SQL statements are handled as prepared queries. Each COBOL file has a corresponding SQL table in the database, and each table can have up to 7 prepared statements that the application creates when it needs them. Unless I am misunderstanding things, if you change the prepared statement's lifetime I am forced to prepare a bunch of statements all at once instead of when they are needed. I am prepared to do this, however (pun intended).

My driver has to be transactionally agnostic: the application that uses my driver might or might not be in a transaction at any particular point in time. I can, however, keep a flag which tracks whether I am in a transaction. If my driver guesses wrong, I get an SQL error which could potentially bounce the transaction which I may or may not be in.
With nested transactions, I can guard this with a subtransaction (only necessary when I'm in a transaction), but I get in trouble if the app opens a transaction manually through direct SQL. I do not under any circumstances want to keep re-preparing the statement, so having the prepared statement take on a transaction-determined lifetime under any circumstances is kind of difficult for me to deal with. I could keep a flag which tells me if I am inside a transaction (and thus turn off assumptions about future use of the statement), but there are subtle complexities with this approach (which get worse with nested transactions) that I'd like to avoid if at all possible.

Merlin
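The "flag which tells me if I am inside a transaction" could look something like the following driver-side sketch in Python. The function name and the set of statements it recognizes are assumptions for illustration; note that it also demonstrates the failure mode described above, since only statements routed through the driver are visible to it:

```python
def track_txn_state(in_txn, sql):
    """Update a client-side 'in explicit transaction' flag by inspecting
    the statement text being sent (hypothetical driver helper).  A BEGIN
    issued behind the driver's back -- direct SQL from the application --
    defeats this bookkeeping entirely."""
    first = sql.strip().split(None, 1)[0].rstrip(";").upper()
    if first in ("BEGIN", "START"):
        return True
    if first in ("COMMIT", "ROLLBACK", "ABORT", "END"):
        return False
    return in_txn  # any other statement leaves the state unchanged
```

A driver would thread this flag through every statement it sends and use it to decide whether a subtransaction guard is needed around a PREPARE.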
Re: [Re] Re: [HACKERS] PREPARE and transactions
Jeroen wrote:
> Granted, that's probably going to force the issue. I do wonder though:
> one of the arguments in favour of the current semantics is that the
> problems can be worked around using nested transactions. Then what were
> people doing before nested transactions, in Tom's scenario where the
> programmer doesn't know where transactions begin?

The trick is that with the current semantics you don't have to watch transaction activity, just the PREPARE statements. You know if and when (from the client/driver's point of view) a prepared statement exists, because you created it, and you don't have to be concerned about its lifetime. If you guys change the lifetime, it becomes difficult or impossible to keep a flag on the client which guarantees the prepared statement's existence. This means I have to wrap the statement execution in a subtransaction or run the risk of bouncing the current transaction.

Currently, in the applications I write, 70% of all I/O goes through prepared statements...the reason to do this was to reduce statement turnaround latency, which is the main driving performance factor in COBOL applications.

I would be fine with changing the lifetime if an EXECUTE failure did not abort the current transaction. Then I could simply watch the return code of the statement execution and prepare the statement on demand...from my point of view, this would actually be the most elegant scenario.

Merlin
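The prepare-on-demand flow in that last paragraph can be sketched as follows. This is a toy Python model, not a real driver: the `StatementMissing` exception and `DemandSession` class are invented names, and the sketch assumes exactly the behavior being asked for, namely that a failed EXECUTE leaves the transaction usable:

```python
class StatementMissing(Exception):
    """Raised when EXECUTE names a statement that was never prepared."""


class DemandSession:
    """Minimal stand-in for a session where a failed EXECUTE does NOT
    abort the transaction -- the behavior requested above."""
    def __init__(self):
        self.plans = {}

    def prepare(self, name, sql):
        self.plans[name] = sql

    def execute(self, name, params):
        if name not in self.plans:
            raise StatementMissing(name)
        return (self.plans[name], params)  # pretend result set


def execute_on_demand(session, name, sql, params):
    """Try EXECUTE first; if the statement is unknown, PREPARE it and
    retry.  This only works if the failed EXECUTE does not bounce the
    enclosing transaction."""
    try:
        return session.execute(name, params)
    except StatementMissing:
        session.prepare(name, sql)
        return session.execute(name, params)
```

The appeal of this pattern is that the driver never needs to track transaction state or prepared-statement lifetimes at all: the error path itself is the bookkeeping.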