Re: [HACKERS] WORM and Read Only Tables (v0.1)
Simon,

Use Case: VLDB with tons of (now) read only data, some not. Data needs to be accessible, but the data itself is rarely touched, allowing storage costs to be minimised via a storage hierarchy of progressively cheaper storage.

There are actually 2 cases to optimize for:
1) write-once-read-many (WORM)
2) write-once-read-seldom (WORS)

The 2nd case is becoming extremely popular due to the presence of government-mandated records databases. For example, I'm currently working on one call completion records database which will hold 75TB of data, of which we expect less than 1% to *ever* be queried.

One of the other things I'd like to note is that for WORM, conventional storage is never going to approach column-store DBs for general performance. So, should we be working on incremental improvements like the ones you propose, or should we be working on integrating a c-store into PostgreSQL on a per-table basis?

--
Josh the Fuzzy Berkus
San Francisco

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org
Re: [HACKERS] WORM and Read Only Tables (v0.1)
On Tue, 2007-12-11 at 20:30 -0800, Josh Berkus wrote:

> Use Case: VLDB with tons of (now) read only data, some not. Data needs to be accessible, but data itself is rarely touched, allowing storage costs to be minimised via a storage hierarchy of progressively cheaper storage. There's actually 2 cases to optimize for: 1) write-once-read-many (WORM) 2) write-once-read-seldom (WORS) The 2nd case is becoming extremely popular due to the presence of government-mandated records databases. For example, I'm currently working on one call completion records database which will hold 75TB of data, of which we expect less than 1% to *ever* be queried.

Well, that's exactly the use case I'm writing for. I called that an archival data store in my post on VLDB Features. WORM is a type of storage that might be used, so it would be somewhat confusing if we use it as the name of a specific use case.

Getting partitioning/read-only right will allow 70+TB of that to be on tape or similar, which with compression can be reduced to maybe 20TB? I don't want to promise any particular compression ratio, but it will make a substantial difference, as I'm sure you realise.

> One of the other things I'd like to note is that for WORM, conventional storage is never going to approach column-store DBs for general performance. So, should we be working on incremental improvements like the ones you propose, or should we be working on integrating a c-store into PostgreSQL on a per-table basis?

What I'm saying is that there are some features that all VLDBs need. If we had a column store DB we would still need partitioning as well, or the data structures would become unmanageable. Plus partitioning can allow the planner to avoid de-archiving/spinning up data and help reduce storage costs.

Radical can be good, but it can take more time too. I dare say it would also be harder for the community to accept. So I look for worthwhile change in acceptable-size chunks.
--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
Re: [HACKERS] [EMAIL PROTECTED]: Re: [pgsql-es-ayuda] SLL error 100% cpu]
On Wed, Dec 12, 2007 at 12:30:50AM -0500, Tom Lane wrote:

> Trevor Talbot [EMAIL PROTECTED] writes:
> > On 12/11/07, Tom Lane [EMAIL PROTECTED] wrote:
> > > I dunno anything about how to fix the real problem (what's winsock error 10004?),
> >
> > WSAEINTR: A blocking operation was interrupted by a call to WSACancelBlockingCall.
>
> Oh, then it's exactly the same thing as our bug #2829. I opined in that thread that OpenSSL was broken because it failed to treat this as a retryable case like EINTR. But not being much of a Windows person, that might be mere hot air. Someone with a Windows build environment should try patching OpenSSL to treat WSAEINTR the same as Unix EINTR and see what happens ...

When I last looked at this (and this was some time ago), I suspected (and still do) that the problem is in the interaction between our socket-emulation stuff (for signals) and OpenSSL. I'm not entirely sure, but I wanted to rewrite the SSL code so that *our* code is responsible for calling the actual send()/recv(), and not OpenSSL. This would also fix the fact that if an OpenSSL network operation ends up blocking, that process can't receive any signals...

I didn't have time to get this done before feature freeze though, and I believe the changes are large enough to qualify as such.

//Magnus
Re: [HACKERS] VLDB Features
Hi,

On Wednesday 12 December 2007, Josh Berkus wrote:

> > I'm curious what you feel is missing that pgloader doesn't fill that requirement: http://pgfoundry.org/projects/pgloader/
>
> Because pgloader is implemented in middleware, it carries a very high overhead if you have bad rows. As little as 1% bad rows will slow down loading by 20% due to retries.

Not that much, in fact, I'd say. pgloader allows its user to configure how large a COPY buffer to use (a global parameter as of now; it could easily be a per-section configuration knob, I just didn't see any need for this yet). It's the 'copy_every' parameter as seen on the man page here:
http://pgloader.projects.postgresql.org/#toc4

pgloader will obviously prepare an in-memory buffer of copy_every tuples to give to COPY, and in case of error will cut it and retry — the classic dichotomy approach, from the initial implementation by Jan Wieck. So you can easily balance the error recovery costs against the COPY bulk size.

Note also that the overall loading time with pgloader does not scale the same as the COPY buffer size; the optimal choice depends on the dataset --- and the data massaging pgloader has to do on it --- and I've experienced best results with 1 and 15000 tuple buffers so far.

FYI, now that the pgloader topic is on the table, the next items I think I'm gonna develop for it are configurable behavior on error tuples (load to another table on a pk error, e.g.), and some limited DDL-partitioning support. I'm playing with the idea for pgloader to be able to read some partitioning schemes (parsing CHECK constraints on inherited tables) and load directly into the right partitions. That would of course be done only when configured this way, and if constraints are misread it would only result in a lot more rejected rows than expected, and you still can retry using your insert trigger instead of pgloader's buggy smartness.

Comments welcome, regards,
--
dim
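The dichotomy retry described above can be sketched as follows. This is a simplified illustration, not pgloader's actual code; `copy_rows` stands in for an all-or-nothing COPY of one buffer, and `reject` for writing a row to the reject file:

```python
def load_with_retry(copy_rows, reject, buf):
    """Try to COPY a buffer of rows; on failure, split the buffer in
    half and retry each half, isolating bad rows one by one."""
    try:
        copy_rows(buf)              # one COPY call for the whole buffer
    except Exception:
        if len(buf) == 1:
            reject(buf[0])          # a single bad row: send it to the reject file
        else:
            mid = len(buf) // 2     # classic dichotomy: split and recurse
            load_with_retry(copy_rows, reject, buf[:mid])
            load_with_retry(copy_rows, reject, buf[mid:])
```

With a larger copy_every, the happy path makes fewer COPY round-trips, but each bad row costs O(log copy_every) retries — which is the balance Dimitri describes.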
[HACKERS] little correction
Bruce,

in http://www.postgresql.org/docs/8.3/static/textsearch-dictionaries.html#TEXTSEARCH-THESAURUS I think

ALTER TEXT SEARCH CONFIGURATION russian
ADD MAPPING FOR asciiword, asciihword, hword_asciipart WITH thesaurus_simple;

should be

ALTER TEXT SEARCH CONFIGURATION russian
ALTER MAPPING FOR asciiword, asciihword, hword_asciipart WITH thesaurus_simple;

since configuration russian already exists.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
Re: [HACKERS] WORM and Read Only Tables (v0.1)
> There are a number of nasty limitations for partitions currently (not the least of which is that real uniqueness guarantees are impractical),

Just to add another opinion to this statement, because it imho sounds overly pessimistic:

Uniqueness is currently perfectly practical when the unique index contains the column[s] that is/are used in a non-overlapping partitioning scheme. If you cannot create separate unique indexes on each partition that guarantee global uniqueness because of the chosen partitioning scheme, you can often reconsider your scheme (e.g. use natural keys instead of serials).

Other db software allows creating global indexes, or indexes with separate partitioning schemes, but this is then often a pain. When you drop/attach/detach a partition such an index needs to be recreated or reorganized. This then makes a large slow transaction out of attach/detach partition. If you don't need to attach/detach, there is still one other argument against the huge global index, which is fault isolation.

There is imho large room to make it better than others :-) And I think we should not regard them as positive examples, because that narrows the view.

Andreas
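A minimal sketch of the scheme Andreas describes, using the inheritance-based partitioning of the era (table and column names are hypothetical): each per-partition unique index covers the partitioning column, so local uniqueness plus disjoint CHECK ranges gives global uniqueness of (call_date, call_id).

```sql
CREATE TABLE calls (call_id bigint NOT NULL, call_date date NOT NULL);

-- Each partition constrains call_date to a non-overlapping range ...
CREATE TABLE calls_2007
    (CHECK (call_date >= '2007-01-01' AND call_date < '2008-01-01'))
    INHERITS (calls);
CREATE TABLE calls_2008
    (CHECK (call_date >= '2008-01-01' AND call_date < '2009-01-01'))
    INHERITS (calls);

-- ... and each local unique index includes the partitioning column.
-- Local uniqueness + disjoint ranges => global uniqueness, no global index.
CREATE UNIQUE INDEX calls_2007_uq ON calls_2007 (call_date, call_id);
CREATE UNIQUE INDEX calls_2008_uq ON calls_2008 (call_date, call_id);
```

Note that this does not make call_id alone globally unique — which is exactly the case where you'd reconsider the scheme, as Andreas suggests.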
Re: [HACKERS] WORM and Read Only Tables (v0.1)
> Getting partitioning/read-only right will allow 70+TB of that to be on tape or similar, which with compression can be reduced to maybe 20TB? I don't want to promise any particular compression ratio, but it will make a substantial difference, as I'm sure you realise.

Wouldn't one very substantial requirement of such storage be to have it independent of db version, or even db product? Keeping old hardware and software around can be quite expensive.

So, wouldn't a virtual table interface be a better match for such a problem? Such a virtual table should be allowed to be part of a partitioning scheme, have native or virtual indexes, ...

Andreas
[HACKERS] ScalarArrayOp advancing array keys
Consider a situation where there's an index on x,y and we're processing a where clause like:

    WHERE x IN (1,2,3,4,5) AND y IN ('A','B','C','D')

Assuming we use the index, we loop through doing an index lookup for every combination of the two (generated) arrays. Except, if I understand ExecIndexAdvanceArrayKeys() correctly, we do it in what seems like a non-optimal order. It seems we do index lookups in the order:

    1,A 2,A 3,A 4,A 5,A 1,B 2,B 3,B 4,B 5,B 1,C ...

Is that right? Or are these array index info structs themselves in reverse order anyway? Wouldn't it make more sense, and perhaps perform slightly better, to iterate in the other order? That is, 1,A 1,B 1,C... ?

I've been discussing here what it would take to be able to use posix_fadvise() in the index lookup itself. The only reasonable proposal we have so far to do this would be to buffer up some number of index probes like this and send them off to index_getmulti() as a group. That would make it more important to do the above in the right order, since a big part of the advantage of doing that would be avoiding the redundant index descents for adjacent index keys.

--
Gregory Stark
EnterpriseDB  http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!
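The two iteration orders can be sketched as odometer loops (an illustration of the ordering question only, not the actual executor code): advancing the first array key fastest yields the order described above, while advancing the last key fastest keeps lookups that share the same leading key adjacent.

```python
from itertools import product

xs = [1, 2, 3, 4, 5]
ys = ['A', 'B', 'C', 'D']

# The order described above: the x (first) key advances fastest.
first_key_fastest = [(x, y) for y in ys for x in xs]

# The proposed order: the y (last) key advances fastest, so probes that
# share the same x -- and hence likely the same index subtree -- are adjacent.
last_key_fastest = list(product(xs, ys))

print(first_key_fastest[:6])  # (1,A) (2,A) ... (5,A) (1,B)
print(last_key_fastest[:6])   # (1,A) (1,B) ... (1,D) (2,A)
```

Both orders visit the same 20 combinations; only the adjacency of probes with equal leading key differs, which is what matters for avoiding redundant index descents.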
[HACKERS] foreign keys
Hi,

How hard/generally useful would it be to allow the target of a foreign key to be a set of columns where only a subset of them actually have a unique constraint? For example:

    CREATE TABLE base (
        id INTEGER NOT NULL PRIMARY KEY,
        type INTEGER NOT NULL
    );

    CREATE TABLE type1info (
        id INTEGER NOT NULL PRIMARY KEY,
        type INTEGER NOT NULL CHECK (type = 1),
        FOREIGN KEY (id, type) REFERENCES base (id, type)
    );

It's possible to create a UNIQUE constraint on base(id,type), but it seems redundant as the PRIMARY KEY constraint on id already ensures uniqueness.

Somewhat independently, it would be nice to allow constant expressions to be used on the left-hand side of foreign key constraints. Allowing them on the RHS seems nice for completeness, but appears completely useless in practical terms. The second table would simply become:

    CREATE TABLE type1info (
        id INTEGER NOT NULL PRIMARY KEY,
        FOREIGN KEY (id, 1) REFERENCES base (id, type)
    );

Sam
Re: [HACKERS] VLDB Features
On Tue, 2007-12-11 at 15:31 -0800, Josh Berkus wrote:

> Simon, we should start a VLDB-Postgres developer wiki page.

http://developer.postgresql.org/index.php/DataWarehousing

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
Re: [HACKERS] WORM and Read Only Tables (v0.1)
On Wed, Dec 12, 2007 at 12:14:43PM +0100, Zeugswetter Andreas ADI SD wrote:

> Uniqueness is currently perfectly practical, when the unique index contains the column[s] that is/are used in a non overlapping partitioning scheme.

Well, yes, assuming you have no bugs. Part of the reason I want the database to handle this for me is because, where I've come from, the only thing I can be sure of is that there will be bugs. There'll even be bugs before there is running code. One bug I can easily imagine is that the non-overlapping partitioning scheme has a bug in it, such that it turns out there _is_ an overlap some time.

All of that said, I agree with you, particularly about the alternative ways things can suck instead :-/

A
[HACKERS] result of convert_to is bytea
Hello

documentation fix: the result of convert_to is bytea, not text.

Regards
Pavel Stehule

*** ./src/sgml/func.sgml.orig	2007-12-12 17:18:55.0 +0100
--- ./src/sgml/func.sgml	2007-12-12 17:19:56.0 +0100
***************
*** 1386,1392 ****
         <literal><function>convert_to</function>(<parameter>string</parameter> <type>text</type>,
         <parameter>dest_encoding</parameter> <type>name</type>)</literal>
        </entry>
!       <entry><type>text</type></entry>
        <entry>
         Convert string to <parameter>dest_encoding</parameter>.
        </entry>
--- 1386,1392 ----
         <literal><function>convert_to</function>(<parameter>string</parameter> <type>text</type>,
         <parameter>dest_encoding</parameter> <type>name</type>)</literal>
        </entry>
!       <entry><type>bytea</type></entry>
        <entry>
         Convert string to <parameter>dest_encoding</parameter>.
        </entry>
Re: [HACKERS] WORM and Read Only Tables (v0.1)
On Wed, Dec 12, 2007 at 12:58:11PM +0100, Zeugswetter Andreas ADI SD wrote:

> Wouldn't one very substantial requirement of such storage be to have it independent of db version, or even db product? Keeping old hardware and software around can be quite expensive.

This was one of the explicit requirements I had when I wrote my pie-in-the-sky outline. Hrm. I wonder if I can get permission to post it. Let me find out.

The requirement was, anyway, that we be able to read old versions of archived rows. IIRC there was an implementation choice: whether we would _never_ allow such rows to be SET READ WRITE, or whether they'd be immediately upgraded to the present format on SET READ WRITE.

A
[HACKERS] test
test
Re: [HACKERS] VLDB Features
Hi,

Josh Berkus wrote:
> Here's the other VLDB features we're missing:
>
> Parallel Query

Uh.. this only makes sense in a distributed database, no? I've thought about parallel querying on top of Postgres-R. Does it make sense implementing some form of parallel querying apart from the distribution or replication engine?

> Windowing Functions

Isn't Gavin Sherry working on this? Haven't read anything from him lately...

> Parallel Index Build (not sure how this works exactly, but it speeds Oracle up considerably)

Sounds interesting *turns-away-to-google*

> On-disk Bitmap Index (anyone game to finish GP patch?)

Anybody have an idea of what's missing there (besides good use cases, which some people doubt)? Again: Gavin?

> Simon, we should start a VLDB-Postgres developer wiki page.

Thanks, Simon, the wiki page looks good!

Regards

Markus
Re: [HACKERS] WORM and Read Only Tables (v0.1)
On Wed, 2007-12-12 at 11:22 -0500, Andrew Sullivan wrote:

> On Wed, Dec 12, 2007 at 12:14:43PM +0100, Zeugswetter Andreas ADI SD wrote:
> > Uniqueness is currently perfectly practical, when the unique index contains the column[s] that is/are used in a non overlapping partitioning scheme.
>
> Well, yes, assuming you have no bugs. Part of the reason I want the database to handle this for me is because, where I've come from, the only thing I can be sure of is that there will be bugs. There'll even be bugs before there is running code. One bug I can easily imagine is that the non-overlapping partitioning scheme has a bug in it, such that it turns out there _is_ an overlap some time.

Enforcing uniqueness with a global index has a number of disadvantages. The worst of these is that the index continues to get bigger and bigger as the total data volume increases. You have to index all partitions, plus each index entry needs to include a partition id as well as the index key. So not only is it big, it's huge. Huge indexes are slow, so an index with terabytes of data in it is going to be almost unusable.

The best thing to do would be to sit down and work out exactly how big and deep such an index would be in the case you're thinking of, so we can tell whether it is very bad or merely bad. I seem to be the only one saying global indexes are bad, so if the people that want them can do the math and honestly say they want them, then I will listen.

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
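The "do the math" exercise can be sketched with a back-of-envelope B-tree estimate. All parameters here are illustrative assumptions, not measured values: 8 KB pages, a 16-byte entry (key + partition id + item pointer), and roughly 70% page fill.

```python
import math

def btree_estimate(n_rows, entry_bytes=16, page_bytes=8192, fill=0.7):
    """Rough size (bytes) and depth of a B-tree indexing n_rows entries."""
    entries_per_page = int(page_bytes * fill / entry_bytes)  # ~358 per page here
    leaf_pages = math.ceil(n_rows / entries_per_page)
    depth, pages, level = 1, leaf_pages, leaf_pages
    while level > 1:                     # add internal levels until one root page
        level = math.ceil(level / entries_per_page)
        pages += level
        depth += 1
    return pages * page_bytes, depth

# Applying it to something like the 75TB example upthread (row count assumed):
size, depth = btree_estimate(75 * 10**9)
print(size / 2**40, depth)   # on the order of 1.6 TiB of index, 5 levels deep
```

Under these assumptions a global index over ~75 billion rows comes out to roughly 1.6 TiB and 5 levels — which gives a concrete sense of the "very bad or merely bad" question.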
Re: [HACKERS] VLDB Features
Markus,

> > Parallel Query
>
> Uh.. this only makes sense in a distributed database, no? I've thought about parallel querying on top of Postgres-R. Does it make sense implementing some form of parallel querying apart from the distribution or replication engine?

Sure. Imagine you have a 5TB database on a machine with 8 cores and only one concurrent user. You'd like to have 1 core doing I/O, and say 4-5 cores dividing the scan and join processing into 4-5 chunks.

I'd say implementing a separate I/O worker would be the first step towards this; if we could avoid doing I/O in the same process/thread where we're doing row parsing, it would speed up large scans by 100%. I know Oracle does this, and their large-table I/O is 30-40% faster than ours despite having less efficient storage. Maybe Greenplum or EnterpriseDB will contribute something. ;-)

> > Windowing Functions
>
> Isn't Gavin Sherry working on this? Haven't read anything from him lately...

Me neither. Swallowed by Greenplum and France.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [HACKERS] WORM and Read Only Tables (v0.1)
On Wed, 2007-12-12 at 10:48 -0800, Josh Berkus wrote:

> Andrew,
> > The requirement was, anyway, that we be able to read old versions of archived rows. IIRC there was an implementation choice, whether we would _never_ allow such rows to be SET READ WRITE or whether they'd be immediately upgraded to the present format on SET READ WRITE.
>
> Well, in theory we need this capability for upgrade-in-place too. While that project has kind of stalled for the moment, we'll pick it back up again soon.

Who was working on it?

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
Re: [HACKERS] WORM and Read Only Tables (v0.1)
Andrew,

> The requirement was, anyway, that we be able to read old versions of archived rows. IIRC there was an implementation choice, whether we would _never_ allow such rows to be SET READ WRITE or whether they'd be immediately upgraded to the present format on SET READ WRITE.

Well, in theory we need this capability for upgrade-in-place too. While that project has kind of stalled for the moment, we'll pick it back up again soon.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [HACKERS] VLDB Features
Hi Josh,

Josh Berkus wrote:
> Sure. Imagine you have a 5TB database on a machine with 8 cores and only one concurrent user. You'd like to have 1 core doing I/O, and say 4-5 cores dividing the scan and join processing into 4-5 chunks.

Ah, right, thanks for the enlightenment. Heck, I'm definitely too focused on replication and distributed databases :-) However, there's certainly a great deal of intersection between parallel processing on different machines and parallel processing on multiple CPUs - especially considering NUMA architectures. *comes-to-think-again*...

> > Isn't Gavin Sherry working on this? Haven't read anything from him lately...
>
> Me neither. Swallowed by Greenplum and France.

Hm.. good for him, I guess!

Regards

Markus
Re: [HACKERS] test
Joshua D. Drake wrote:
> test

Does anybody see any value in having [EMAIL PROTECTED] be an alias for pgsql-hackers?

--
Alvaro Herrera  http://www.flickr.com/photos/alvherre/
"Postgres is bloatware by design: it was built to house PhD theses."
(Joey Hellerstein, SIGMOD annual conference 2002)
Re: [HACKERS] test
On Dec 12, 2007 11:37 AM, Alvaro Herrera [EMAIL PROTECTED] wrote:
> Joshua D. Drake wrote:
> > test
>
> Does anybody see any value in having [EMAIL PROTECTED] be an alias for pgsql-hackers?

No, but I see some mild irritation in having to modify my rules to tag a second address with the pgsql-hackers label.

Andrew
[HACKERS] Trigger problem - conclusion
Good Day,

I recently posted a message here (http://archives.postgresql.org/pgsql-hackers/2007-12/msg00340.php) saying that the trigger does not work as it should from time to time. The trigger is now written in C; before, it was in Tcl, and it had the same problem. As the trigger works all right in 99.999% of cases (or even more), we can conclude that the trigger is written correctly, and it was written in complete accordance with the documentation that is available.

I suggest writing down in the documentation for PostgreSQL that during big loads triggers can fail in some cases.

Nik
Re: [HACKERS] WORM and Read Only Tables (v0.1)
On Wed, 12 Dec 2007 19:07:57 +0000, Simon Riggs [EMAIL PROTECTED] wrote:

> I seem to be the only one saying global indexes are bad, so if people that want them can do the math and honestly say they want them, then I will listen.

Global indexes are bad for certain situations; for others they are required. Constraint exclusion/partitioning is not only for ginormous tables. It can also be used for maintenance efficiency, micro-optimizations and just general data architecture.

Joshua D. Drake

--
The PostgreSQL Company: Since 1997, http://www.commandprompt.com/
Sales/Support: +1.503.667.4564  24x7/Emergency: +1.800.492.2240
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'
Re: [HACKERS] WORM and Read Only Tables (v0.1)
Simon,

> Who was working on it?

Zdenec and Dhanaraj from Sun, and someone from EDB (I'm not sure who, maybe Korry?). Unfortunately, both companies have shifted staff around and we need to re-start work.

Of course, if hackers other than those from EDB and Sun want to attack the problem, the more the merrier.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
[HACKERS] PGparam proposal v2
Here is our updated PGparam extension to the libpq api:
http://archives.postgresql.org/pgsql-hackers/2007-12/msg00356.php

We have a patch implementing the following which we are cleaning up. We are also kicking around some ideas for arrays and possibly composite types, which we may consider if the community wants to move forward with this proposal.

Tom made a number of comments, some of which we have addressed:

*) Separate PGparam from PGconn: we agree with this and separated them.
*) Changes to existing API functions: we agreed and moved new behavior to new functions.
*) 3rd party types: we now support this through a type registration interface.
*) Internal type changes: we think changes to binary format are fairly rare and easily addressed.
*) Type confusion was removed by giving each type its own specifier.
*) Objections to printf: we agreed in part: we moved to natural names, from %n4 to %pgint for example. This addressed scalability concerns and should be less cryptic to use.
*) Argument passing in putf and getf is identical to the previous proposal. All we changed was the naming schema for the %spec, and putf now takes a PGparam rather than a PGconn.

* API INTERFACE

  /* opaque */
  typedef struct pg_param PGparam;

  PGparam *PQparamCreate(PGconn *conn);

  /* manually reset a param struct.  This is done by
   * all execution functions for you. */
  void PQparamReset(PGparam *param);

  /* free a PGparam */
  void PQparamClear(PGparam *param);

  int PQputf(PGparam *param, const char *typeSpec, ...);

  int PQgetf(const PGresult *res, int tup_num, const char *fieldSpec, ...);

  /* PGparam Execution Functions */
  PGresult *PQparamExec(PGconn *conn, PGparam *param,
    const char *command, int resultFormat);

  int PQparamSendQuery(PGconn *conn, PGparam *param,
    const char *command, int resultFormat);

  PGresult *PQparamExecPrepared(PGconn *conn, PGparam *param,
    const char *stmtName, int resultFormat);

  int PQparamSendQueryPrepared(PGconn *conn, PGparam *param,
    const char *stmtName, int resultFormat);

  /* All in wonder, no PGparam needed */
  PGresult *PQexecParamsf(PGconn *conn, const char *commandSpec,
    int resultFormat, ...);

  /* All in wonder, no PGparam needed */
  int PQsendQueryParamsf(PGconn *conn, const char *commandSpec,
    int resultFormat, ...);

  /* All in wonder, no PGparam needed */
  PGresult *PQexecPreparedf(PGconn *conn, const char *stmtName,
    const char *typeSpec, int resultFormat, ...);

  /* All in wonder, no PGparam needed */
  int PQsendQueryPreparedf(PGconn *conn, const char *stmtName,
    const char *typeSpec, int resultFormat, ...);

  /* gets the PGparam error message */
  char *PQparamErrorMessage(const PGparam *param);

* TYPE ALIAS SPECIFIERS

The convention for postgresql built-in types is a % followed by the type alias. Every pgtype begins with pg. For example: %pgint4, %pgpolygon, %pgbox. 3rd party types can register their own specifiers, which is discussed in the TYPE HANDLER SYSTEM section. Type aliases must be unique.

* TYPE HANDLER SYSTEM

  typedef struct pg_typeputargs {
    /* The out buffer will be at least 16K.  If more room is needed,
     * use PQ_TYPE_SETOUT to grow the buffer.  In most cases,
     * 16K is plenty of room. */
    char *out;

    /* the size in bytes of the out buffer */
    int outl;

    /* Should not use directly, see PQ_TYPE_SETOUT.  For the brave,
     * set to 1 if you point the out buffer at memory that should be
     * freed after your put callback returns. */
    int free_out;

    /* The arguments to putf.  Use PQ_TYPE_NEXTARG. */
    va_list *ap;

    /* The type's alias name, like 'pgint8'. */
    const char *type_alias;

    /* Sets an error message.  This msg shows up in
     * PQparamErrorMessage(). */
    int (*seterr)(struct pg_typeputargs *args, const char *format, ...);
  } PGtypePutArgs;

  typedef struct pg_typegetargs {
    const PGresult *res;
    int tup_num;
    int field_num;

    /* pointer to the output of PQgetvalue for this tup+field */
    char *value;

    /* The arguments to getf.  Use PQ_TYPE_NEXTARG.  NOTE: the field_num
     * supplied to getf has already been pulled out of the va_list and
     * assigned to this struct's field_num member. */
    va_list *ap;

    /* The type's alias name, like 'pgint8'. */
    const char *type_alias;

    /* Sets an error message.  This msg shows up in
     * PQresultErrorMessage(). */
    int (*seterr)(struct pg_typegetargs *args, const char *format, ...);
  } PGtypeGetArgs;

  #define PQ_TYPE_NEXTARG(typeArgs, type) va_arg(*(typeArgs)->ap, type)

  /* makes sure that putArgs->out is large enough for new_outl */
  #define PQ_TYPE_SETOUT(putArgs, new_outl) do{ \
    if((new_outl) > (putArgs)->outl) \
    { \
      (putArgs)->out = (char *)malloc(new_outl); \
      if(!(putArgs)->out) \
        return -1; \
      *(putArgs)->out = 0; \
      (putArgs)->outl = (new_outl); \
      (putArgs)->free_out = 1; \
    } \
  } while(0)

  /*
   * Returns - the number of bytes put or -1 for error.
   */
Re: [HACKERS] VLDB Features
On Wed, Dec 12, 2007 at 08:26:16PM +0100, Markus Schiltknecht wrote:

> > > Isn't Gavin Sherry working on this? Haven't read anything from him lately...
> >
> > Me neither. Swallowed by Greenplum and France.
>
> Hm.. good for him, I guess!

Yes, I'm around -- just extremely busy with a big release at Greenplum as well as other Real Life stuff.

Thanks,

Gavin
Re: [HACKERS] Trigger problem - conclusion
Nikolay Grebnev wrote:
> I recently posted a message here (http://archives.postgresql.org/pgsql-hackers/2007-12/msg00340.php) that the trigger does not work as it should from time to time. Now the trigger works on C, before it was on Tcl and it had the same problem. As the trigger works all right in 99.999% of cases (or even more), we can conclude that the trigger is written all right. And it was written in complete accordance with the documentation that is possible to find. I suggest to write down in documentation for PostgreSQL that during big loads triggers can fail in some cases.

I think you need to provide a reproducible, self-contained test case, if possible. Your previous mail said:

> In php where the queries are formed we see that the query is sent for execution and executed and the base was modified, but the trigger seems just to skip it (does not work with the query).

It would be far better to check in the log that the statement has executed, after turning on log_statement or log_min_duration_statement. Does it fail on inserts, updates, deletes, or all three? What platform are you using? And why are you compiling with -O3?

cheers

andrew
Re: [HACKERS] VLDB Features
> Greenplum as well as other Real Life stuff.

For those of us here who have no idea what you are talking about, can you define what Real Life is like?

Joshua D. Drake
Re: [HACKERS] Current code for function j2date does not have the same correct dynamic range as older code.
From: Dann Corbit
Sent: Tuesday, December 11, 2007 5:58 PM
To: pgsql-hackers@postgresql.org
Subject: Current code for function j2date does not have the same correct dynamic range as older code.

> It may not matter to the PostgreSQL group, since nothing goes wrong until the year is 1,465,002 or larger. It may also be an artifact of the Microsoft Visual C++ compiler. At any rate, the modular math to compute the year, month and day does not remain accurate nearly as long for the new code as for the old. The old code here is j2dateOld(), and the new code is j2dateNew3(). I made a few tests with other internal types to try to extend the range of the new code, but neither long long nor using mostly unsigned seems to restore the old range, because the mathematics are not identical. At any rate, here is a unit test driver you can fiddle with, if you so choose.

Correction: It is the new routines that are sound. The old routines had overflow problems. So *cough* nevermind.

[snip of code]
[HACKERS] Recreating archive_status
During recovery procedures, there is a step that says: "If you didn't archive pg_xlog/ at all, then recreate it, and be sure to recreate the subdirectory pg_xlog/archive_status/ as well." If you forget to do this, you may not realise until the recovering server comes up and tries writing to the directory. The message that is spat out when this happens is:

LOG: could not create archive status file pg_xlog/archive_status/000103CE009E.ready: No such file or directory

We could check this just as the server comes up and then re-create it if necessary. So we have one less step in the process to remember. Existing scripts which perform this automatically will not need changing. We can keep the message in case something removes the directory later.

Views?

-- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
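A minimal sketch of what that startup-time check could look like (the function name and return convention here are hypothetical, not from an actual patch):

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Ensure <xlogdir>/archive_status exists, recreating it if the admin
 * forgot to restore it.  Hypothetical sketch: name and return convention
 * are invented.  Returns 0 if the directory is present or was created,
 * -1 on failure.
 */
static int
ensure_archive_status(const char *xlogdir)
{
	char		path[1024];

	snprintf(path, sizeof(path), "%s/archive_status", xlogdir);

	if (mkdir(path, 0700) == 0)
	{
		/* the case the docs currently make the admin handle by hand */
		fprintf(stderr, "LOG: recreated missing directory \"%s\"\n", path);
		return 0;
	}
	if (errno == EEXIST)
		return 0;				/* normal case: nothing to do */

	return -1;					/* e.g. xlogdir itself is missing */
}
```

Called once as the server comes up, this removes the manual step while leaving the existing LOG message useful for the case where something removes the directory later.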
Re: [HACKERS] Recreating archive_status
On Wed, 12 Dec 2007 22:28:58 + Simon Riggs [EMAIL PROTECTED] wrote: We could check this just as the server comes up and then re-create it if necessary. So we have one less step in the process to remember. Existing scripts which perform this automatically will not need changing. We can keep the message in case something removes the directory later. Views?

+1

Sincerely, Joshua D. Drake

-- The PostgreSQL Company: Since 1997, http://www.commandprompt.com/ Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'
Re: [HACKERS] [DOCS] distributed checkpoint
All, Just FYI, it's going to be difficult to replace the name of the feature in the PR docs at this point; I already have 11 translations. What's *wrong* with Load Distributed Checkpoint, which is what we've been calling it for 6 months? -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco
Re: [HACKERS] Recreating archive_status
Simon Riggs wrote: We could check this just as the server comes up and then re-create it if necessary. So we have one less step in the process to remember. Existing scripts which perform this automatically will not need changing. Oh please yes -- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] WORM and Read Only Tables (v0.1)
On Wed, Dec 12, 2007 at 07:07:57PM +, Simon Riggs wrote: Enforcing uniqueness with a global index has a number of disadvantages. This is why I was trying to talk about constraints rather than global indexes. Just because we happen to implement them that way today does not mean that such constraints need be implemented that way in every case. I think especially for the sort of detached rows scenario I was dreaming about, a global index is never going to be good. A
Re: [HACKERS] VLDB Features
Josh Berkus [EMAIL PROTECTED] writes: Markus, Parallel Query Uh.. this only makes sense in a distributed database, no? I've thought about parallel querying on top of Postgres-R. Does it make sense implementing some form of parallel querying apart from the distribution or replication engine? Yes, but not for the reasons Josh describes. I'd say implementing a separate I/O worker would be the first step towards this; if we could avoid doing I/O in the same process/thread where we're doing row parsing it would speed up large scans by 100%. I know Oracle does this, and their large-table I/O is 30-40% faster than ours despite having less efficient storage.

Oracle is using Direct I/O, so they need the reader and writer threads to avoid blocking on I/O all the time. We count on the OS doing readahead and buffering our writes, so we don't have to. Direct I/O and needing some way to do asynchronous writes and reads are directly tied.

Where parallel query is useful is when you have queries that involve a substantial amount of CPU resources, especially if you have a very fast I/O system which can saturate the bandwidth to a single CPU. So, for example, if you have a merge join which requires sorting both sides of the query, you could easily have subprocesses handle those sorts, allowing you to bring two processors to bear on the problem instead of being limited to a single processor.

On Oracle, Parallel Query goes great with partitioned tables. Their query planner will almost always turn the partition scans into parallel scans and use separate processors to scan different partitions.

-- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
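The merge-join case can be sketched concretely. The toy below only illustrates the idea of sorting both inputs on separate CPUs before the single merge pass; the types and names are invented, and a real executor would of course be structured very differently:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* One input to the join: an array of join keys to be sorted. */
typedef struct
{
	int		   *vals;
	size_t		n;
} join_side;

static int
cmp_int(const void *a, const void *b)
{
	int			x = *(const int *) a;
	int			y = *(const int *) b;

	return (x > y) - (x < y);
}

/* Thread body: sort one side while the other side sorts elsewhere. */
static void *
sort_side(void *arg)
{
	join_side  *s = arg;

	qsort(s->vals, s->n, sizeof(int), cmp_int);
	return NULL;
}

/*
 * Sort both sides concurrently, then count matches with one merge pass.
 * Simplistic on purpose: assumes distinct keys on each side.
 */
static size_t
merge_join_count(join_side *outer, join_side *inner)
{
	pthread_t	t1, t2;
	size_t		i = 0, j = 0, matches = 0;

	pthread_create(&t1, NULL, sort_side, outer);	/* processor 1 */
	pthread_create(&t2, NULL, sort_side, inner);	/* processor 2 */
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);

	while (i < outer->n && j < inner->n)
	{
		if (outer->vals[i] < inner->vals[j])
			i++;
		else if (outer->vals[i] > inner->vals[j])
			j++;
		else
		{
			matches++;
			i++;
			j++;
		}
	}
	return matches;
}
```

The sorts dominate the cost here, which is exactly why handing them to separate processors pays off even though the final merge is serial.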
Re: [HACKERS] [DOCS] distributed checkpoint
Josh Berkus wrote: All, Just FYI, it's going to be difficult to replace the name of the feature in the PR docs at this point; I already have 11 translations. What's *wrong* with Load Distributed Checkpoint, which is what we've been calling it for 6 months? There was nothing *wrong* with the old wording, but the new wording is clearer? Do you disagree it is clearer? I don't think it makes sense to keep less-clear wording just to match press release translations. It is not like we are changing the wording 24 hours before final release. There will perhaps be other adjustments that might be needed for the press release. Also, the non-English press release isn't going to match the English release notes word-for-word anyway (they aren't in English) so is the new naming that big an issue? I suggest you update the English press release and ask as many translators who want to update theirs. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] [DOCS] distributed checkpoint
On Wed, 12 Dec 2007, Josh Berkus wrote: What's *wrong* with Load Distributed Checkpoint, which is what we've been calling it for 6 months?

One issue was that "distributed" has some association with distributed computing, which isn't actually what the feature does. "Spread" is also more descriptive of what actually ended up being committed. Those are fairly subtle wording issues that I wouldn't necessarily expect to survive translation.

The other problem was that the original description over-sold the feature a bit. It said it would prevent I/O spikes, when it actually just reduces them: a spike is still possible, though it probably won't be as big. Your call on whether correcting that mischaracterization is worth bothering the translators over.

-- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
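For context on "reduces" rather than "prevents": the committed behaviour paces buffer writes so the checkpoint finishes by a fraction of the interval. A simplified sketch of the pacing decision (not the actual bgwriter code; checkpoint_completion_target is the real GUC, the rest is invented, and the real code also tracks WAL-volume progress):

```c
#include <assert.h>

/*
 * Is the checkpointer ahead of schedule?  `written` of `total` dirty
 * buffers are flushed, `elapsed` seconds into a checkpoint interval of
 * `interval` seconds, aiming to finish by the `target` fraction of the
 * interval (checkpoint_completion_target).  When ahead, the checkpointer
 * can nap instead of writing -- that pause is what spreads the I/O out,
 * though a burst of dirty buffers can still produce a (smaller) spike.
 */
static int
ahead_of_schedule(int written, int total,
				  double elapsed, double interval, double target)
{
	double		done = (double) written / total;
	double		scheduled = elapsed / (interval * target);

	return done >= scheduled;
}
```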
Re: [HACKERS] little correction
Oleg Bartunov [EMAIL PROTECTED] writes: in http://www.postgresql.org/docs/8.3/static/textsearch-dictionaries.html#TEXTSEARCH-THESAURUS I think ALTER TEXT SEARCH CONFIGURATION russian ADD MAPPING FOR asciiword, asciihword, hword_asciipart WITH thesaurus_simple; should be ALTER TEXT SEARCH CONFIGURATION russian ALTER MAPPING FOR asciiword, asciihword, hword_asciipart WITH thesaurus_simple; since configuration russian already exists. Done. I notice that if you try to do it the other way, you get: regression=# ALTER TEXT SEARCH CONFIGURATION russian ADD MAPPING FOR asciiword, asciihword, hword_asciipart WITH simple; ERROR: duplicate key value violates unique constraint pg_ts_config_map_index This is not very good --- we usually try to provide a more friendly error message than unique constraint violation for duplicate system catalog entries. Not sure how hard it is to fix. regards, tom lane
Re: [HACKERS] [DOCS] distributed checkpoint
Just FYI, it's going to be difficult to replace the name of the feature in the PR docs at this point; I already have 11 translations. What's *wrong* with Load Distributed Checkpoint, which is what we've been calling it for 6 months?

Are you saying the PR was 'string frozen' before rc1? And before the actual backend? I wonder how reasonable that really is... That said, we shouldn't change things around for no reason. In this case I think there was good motivation.

/Magnus