Re: [HACKERS] [GENERAL] Performance while loading data and indexing
I'll preface this by saying that while I have a large database, it doesn't require quite the performance you're talking about here.

On Thu, Sep 26, 2002 at 02:05:44PM +0530, Shridhar Daithankar wrote:
> 1) Database load time from flat file using copy is very high
> 2) Creating index takes huge amount of time.
> 3) Any suggestions for runtime as data load and query will be going in parallel.

You're loading all the data in one copy. I find that INSERTs are mostly limited by indexes. While index lookups are cheap, they are not free, and each index needs to be updated for each row. I found that using partial indexes to index only the rows you actually use can help with the loading. It's a bit obscure, though.

As for parallel loading, you'll be limited mostly by your I/O bandwidth. Have you measured it to make sure it's up to speed?

Now the details. Note that this is a test run only.

> Platform: 4x Xeon 2.4GHz / 4GB RAM / 4x48 SCSI RAID5 / 72GB SCSI / RedHat 7.2 / PostgreSQL 7.1.3
> Database in flat file: 125,000,000 records of around 100 bytes each.
> Flat file size: 12GB
> Load time: 14581 sec / ~8600 rows per sec / ~1MB of data per sec.
> Create unique composite index on 2 char and a timestamp field: 25226 sec.
> Database size on disk: 26GB
> Select query: 1.5 sec. for approx. 150 rows.

So you're loading at a rate of 860KB per sec. That's not too fast. How many indexes are active at that time? Triggers and foreign keys also take their toll.

> Important postgresql.conf settings:
> sort_mem = 12000
> shared_buffers = 24000
> fsync = true (Sad but true. Left untouched.. Will that make a difference on SCSI?)
> wal_buffers = 65536
> wal_files = 64

fsync IIRC only affects the WAL buffers now, but it may be quite expensive, especially considering it's running on every transaction commit. Oh, are your WAL files on a separate disk from the data?

> Initial flat data load: 250GB of data. This has gone up since the last query. It was 150GB earlier.
> Ongoing inserts: 5000/sec.
> Number of queries: 4800 queries/hour
> Query response time: 10 sec.
That looks quite achievable.

> 1) Instead of copying from a single 12GB data file, will a parallel copy from say 5 files speed up the things?

Limited by I/O bandwidth. On Linux, vmstat can tell you how many blocks are being loaded and stored per second. Try it. As long as sync() doesn't get done too often, it should help. A couple of MB of data per sec. to disk is just not saturating it.

> It's a RAID 5 setup..

No, it's not. You should be able to do better.

> 2) Sort mem = 12K i.e. 94MB, sounds good enough to me. Does this need further addition to improve create index performance?

Should be fine. Admittedly, your indexes are taking rather long to build.

> 3) 5K concurrent inserts with an index on, will this need additional CPU power? Like deploying it on dual RISC CPUs etc?

It shouldn't. Do you have an idea of what your CPU usage is? ps aux should give you a decent idea.

> 4) Query performance is not a problem. Though 4.8K queries per sec. expected response time from each query is 10 sec. But my guess is some serious CPU power will be chewed there too..

Should be fine.

> 5) Will upgrading to 7.2.2/7.3 beta help?

Possibly, though it may be worth it just for the features/bugfixes.

> All in all, in the test, we didn't see the performance where hardware is saturated to its limits. So effectively we are not able to get postgresql making use of it. Just pushing WAL and shared buffers does not seem to be the solution. If you guys have any suggestions, let me know. I need them all..

Find the bottleneck: CPU, I/O or memory?

> Mysql is almost out because it's been creating an index for the last 17 hours. I don't think it will keep up with 5K inserts per sec. with an index. SAP DB is under evaluation too. But postgresql is most favourite as of now because it works. So I need to come up with solutions to problems that will occur in the near future.. ;-)

17 hours! Ouch. Either way, you should be able to do much better.
Hope this helps, -- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/ There are 10 kinds of people in the world, those that can do binary arithmetic and those that can't. ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] [HACKERS] Performance while loading data and indexing
On Thu, Sep 26, 2002 at 03:01:35PM +0530, Shridhar Daithankar wrote:
> The index creation query was CREATE INDEX index1 ON tablename (esn, min, datetime); What if I put datetime ahead? It's likely that the datetime field will have a high degree of locality, being log data..

The order of fields depends on what you're using it for. For example, you can use the above index for a query using the conditions:

esn = 'aaa'
esn = 'bbb' and min = 'xxx'

but not for queries with only:

datetime = '2002-09-26'
min = 'ddd' and datetime = '2002-10-02'

The fields can only be used left to right. This is where a single multicolumn index differs from multiple indexes on different columns. Have you used EXPLAIN ANALYSE to determine whether your indexes are being used optimally?

-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
Re: [HACKERS] Rép
On Tue, Oct 08, 2002 at 12:35:22PM +0200, Erwan DUROSELLE wrote:
> What I understood from the Administrator's Guide is:
> - Yes, PostgreSQL provides hot backup: it's the pg_dump utility. It's hot because users can still be connected and work while pg_dump is running (though they will be slowed down). (See Administrator's Guide ch. 9)

Correct.

> - No, PostgreSQL does NOT provide a way to restore a database up to the last committed transaction, with a reapply of the WAL, as Oracle or SQL Server (and others, I guess) do. That would be a VERY good feature. See Administrator's Guide ch. 11

Umm, I thought the whole point of WAL was that if the database crashed, the WAL would provide the info to replay to the last committed transaction.

http://www.postgresql.org/idocs/index.php?wal.html

> ... because we know that in the event of a crash we will be able to recover the database using the log: ...

These docs seem to corroborate this.

> So, with Pg, if you back up your db every night with pg_dump, and your server crashes during the day, you will lose up to one day of work.

I've never lost any data with postgres, even if it's crashed, even without WAL.

-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
Re: [HACKERS] [GENERAL] Large databases, performance
On Mon, Oct 07, 2002 at 05:42:12PM +0200, Zeugswetter Andreas SB SD wrote:
> > Hackers, do you think it's possible to hack together a quick and dirty patch, so that string length is represented by one byte? IOW can a database be built that doesn't contain any char/varchar/text value longer than 255 characters in the catalog?
> Since he is only using fixchar, how about doing a fixchar implementation that does not store length at all? It is the same for every row anyways!

Remember that in Unicode, 1 char != 1 byte. In fact, any encoding that's not Latin will have a problem.

> I guess you could put a warning on it: not for use with Asian character sets.

So what do you do if someone tries to insert such a string anyway? Perhaps a better approach is to vary the number of bytes used for the length. So one byte for lengths < 64, two bytes for lengths < 16384. Unfortunately, two bits in the length are already used (IIRC) for other things, making it a bit more tricky.

-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
Re: [HACKERS] [pgsql-performance] [GENERAL] Large databases, performance
On Tue, Oct 08, 2002 at 11:14:11AM +0530, Shridhar Daithankar wrote:
> On 7 Oct 2002 at 11:21, Tom Lane wrote:
> > Shridhar Daithankar [EMAIL PROTECTED] writes:
> > > I say if it's a char field, there should be no indicator of length as it's not required. Just store that many characters straight ahead..
> > Your assumption fails when considering UNICODE or other multibyte character encodings.
> Correct, but is it possible to have a real char string when the database is not unicode, or when the locale defines the size of a char, to be exact? In my case varchar does not make sense, as all strings are guaranteed to be of defined length. While the argument you have put is correct, it's causing a disk space leak, so to say.

Well, maybe. But since 7.1 or so, char() and varchar() simply became text with some length restrictions. This was one of the reasons. It also simplified a lot of code.

-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
Re: [HACKERS] [GENERAL] I cant find it or I'm just lazy ?
On Mon, Feb 24, 2003 at 07:53:05PM +0000, Darko Prenosil wrote:
> I need two answers I did not find in the documentation: How can I get the exact number of rows in a DECLARED CURSOR? OK, I can FETCH until NULL, but this does not fit my needs!

You need to move to the end of the cursor. When you declare a cursor it doesn't run the query yet. You have to tell it to run the query before it can tell you how many rows there are. I think the command is MOVE.

-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/ Support bacteria! They're the only culture some people have.
Re: [HACKERS] On-Disk Tuple Size
On Sat, Apr 20, 2002 at 05:22:20PM +0900, Curt Sampson wrote:
> > ...and 6 bytes for the CTID, which I guess may be unnecessary.
> Really? How would things work without it?

Well, from my examination of the on-disk data, the CTID stored there is the same as its location in the file, so it could just be filled in while reading. Unless I'm misunderstanding its purpose.

-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/ Canada, Mexico, and Australia form the Axis of Nations That Are Actually Quite Nice But Secretly Have Nasty Thoughts About America
[HACKERS] Documentation on page files
[Please CC any replies, I'm subscribed nomail]

Hi,

Chapter 7 of the Developer's Guide is about the Page Format on disk, and it's a little out of date, not to mention somewhat incomplete.

1. Is there documentation elsewhere (other than the source)?
2. If not, would patches be accepted to correct the situation?

I've been looking into it a bit recently, so I think I may be able to whip something useful up.

Thanks in advance,
-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
Re: [HACKERS] Documentation on page files
On Sun, Apr 21, 2002 at 03:46:07PM +0900, Curt Sampson wrote:
> On Sat, 20 Apr 2002, Tom Lane wrote:
> > Martijn van Oosterhout [EMAIL PROTECTED] writes:
> > > 2. If not, would patches be accepted to correct the situation?
> > Go for it.
> Yes, please! I'd be happy to review an updated version.

Ok, my first attempt can be seen here:

http://svana.org/kleptog/pgsql/page.sgml.txt

I don't know whatever SGML format this is using, so the layout is not great, but the information should be accurate. I used it to create a program to dump the datafiles directly without the postmaster :). I'll submit a proper patch once we have something useful.

> One thing that would be good, since this is a developers' guide, would be to include references to the source files and dates from which the information comes. That way one could see if updates are necessary by doing a diff on those files between the given date and the head, to see what changes have been made since the description was written. Also good would be to have the data structures explicitly named, so that when one dives into the source, one already has a good idea of what one's looking at.

Well, I have included the names of the structures involved. Do you think it's worth adding filenames, given that TAGS makes tracking them down easy enough? I can put in dates if you like.

Issues to be dealt with:
- Do I need to say more about TOAST?
- Indexes?
- Split into sections
- How much detail is enough/too much?

Have a nice day,
-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
Re: [HACKERS] Documentation on page files
On Mon, Apr 22, 2002 at 11:14:36AM -0500, Ross J. Reedstrom wrote:
> On Sun, Apr 21, 2002 at 07:28:32PM +1000, Martijn van Oosterhout wrote:
> > http://svana.org/kleptog/pgsql/page.sgml.txt
> > I don't know whatever SGML format this is using, so the layout is not great, but the information should be accurate. I used it to create a program to dump the datafiles directly without the postmaster :).
> Excellent - since this is a FRP (Frequently Requested Program), how do you feel about dumping it in contrib? Even if it's hardcoded for your particular table structure, it could serve as a starting point for some poor DBA who's got to recover from a lost xlog, for example.

Actually, it reads the table structure from the catalog. It will also find the right files to open. It reads files from both PG 6.5 and 7.2, although it shouldn't be too hard to make it work for other versions. And if you people don't reorder the first few fields in pg_attribute, it will work for all future versions too.

The dumping is more of an extra; the original idea was to check for errors in the datafiles. Hence the working name of pgfsck. At the moment the dumping dumps only tuples where xmax == 0, but I'm not sure if that's correct. It doesn't handle compressed tuples nor toasted ones, though that's more advanced really. And of course, outputting data in human-readable format has to be added for each type.

I only started writing it on Sunday, so let me give it a usable interface and I'll let people try it out.

-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
Re: [HACKERS] Documentation on page files
On Tue, Apr 23, 2002 at 09:15:22AM +0200, Hannu Krosing wrote:
> On Tue, 2002-04-23 at 01:29, Martijn van Oosterhout wrote:
> > The dumping is more of an extra; the original idea was to check for errors in the datafiles. Hence the working name of pgfsck. At the moment the dumping dumps only tuples where xmax == 0, but I'm not sure if that's correct.
> AFAIK it is not. As Tom once explained to me, it is ok for a tuple's xmax to be != 0 and still have a valid tuple. The validity is determined by some bits in the tuple header.

Well, from my thinking about how you would use these fields in a logical way, it seems it's possible for xmax to be non-zero if the transaction numbered xmax was not committed. But in that case (unless it was a delete) there would be a newer tuple with the same oid but xmax == 0 (and this uncommitted transaction as xmin).

The problem is that inside the DB, you have a current transaction plus a list of committed transactions. Externally, you have no idea, so xmax == 0 is as valid a view as any other. This would have the effect of dumping out whatever would be visible if every transaction were committed. I think. If anyone knows a good document on MVCC implementations, let me know.

> But I think the most useful behaviour would be to dump system fields too, so a mildly knowledgeable sysadmin can import the dump and do the right thing afterwards (like restore data as it was before transaction nr 7000).

Well, I didn't think you could have statements of the form:

insert into table (xmin,xmax,cmin,cmax,...) values (...);

So you would have to leave it as a comment. In which case someone would have to go and work out by hand what would be in or out. I can make it an option, but I don't think it would be particularly useful. Maybe --pretend-uncommitted xact.

Just a thought: if I did a "delete from table" accidentally, and stopped the postmaster and twiddled the xlog for that transaction, would that have the effect of undeleting those tuples?
-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
[HACKERS] Table checking/dumping program
[Please CC any replies, I'm subscribed nomail]

As promised, I've given it a bit of polish and it's actually almost useful. You can have a look at it at http://svana.org/kleptog/pgsql/pgfsck.html

Just unpack the files into a directory. It's just a perl script with two modules, so no compiling is necessary. You can download the package directly at http://svana.org/kleptog/pgsql/pgfsck-0.01.tar.gz

I've tested it on versions 6.5, 7.0 and 7.2 and it works. It shouldn't crash, no matter how bad a file you feed it. It can also output insert statements to help reconstruction. Here is an example of the program being run over a suitably hexedited file.

# ./pgfsck -r 16559 kleptog website
-- Detected database format 7.2
-- Table pg_class(1259):Page 1:Tuple 0: Unknown type _aclitem (1034)
-- Table pg_class(1259):Page 1:Tuple 49: Unknown type _aclitem (1034)
-- Table website(16559):Page 0:Tuple 7: Tuple incorrect length (parsed data=57,length=1638)
-- Table website(16559):Page 0:Tuple 44: Decoding tuple runs off end: 627338916 69
-- Table website(16559):Page 0:Tuple 70: Bad tuple offset. Should be: 3784 = 11592 8192

Currently the following features are not supported:
- Toasted / compressed tuples
- Checking indexes doesn't work (should it?)
- Views just produce empty output (because they are)
- Arrays don't work
- Since each type output has to be written, many types are not correctly output
- Split tables (>1GB) are not supported past the first part.
- Some system tables in some versions have a strange layout. You may get many harmless warnings about the formats of pg_class, pg_attribute and/or pg_type. Most of these are basically because I don't know how they work, but with a bit of work some of these should be fixable.
Have a nice day,
-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
Re: [GENERAL] [HACKERS] Linux Largefile Support In Postgresql RPMS
On Mon, Aug 12, 2002 at 11:30:36AM -0400, Andrew Sullivan wrote:
> The problem is not just a system-level one, but a filesystem-level one. Enabling 64 bits by default might be dangerous, because a DBA might think "oh, it supports largefiles by default" and therefore not notice that the filesystem itself is not mounted with largefile support. But I suspect that the developers would welcome autoconf patches if someone offered them.

Are there any filesystems in common use (not including Windows ones) that don't support >32-bit file sizes? Linux (ext2), I know, supports by default at least up to 2TB (2^32 x 512 bytes), probably much more. What about the BSDs? XFS? etc.

-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
Re: [GENERAL] [HACKERS] Linux Largefile Support In Postgresql RPMS
On Tue, Aug 13, 2002 at 08:02:05AM -0500, Larry Rosenman wrote:
> On Tue, 2002-08-13 at 03:42, Mark Kirkwood wrote:
> > Other operating systems where 64-bit file access can be disabled or unconfigured require more care - possibly (sigh) 2 binary RPMS with a distinctive 32 and 64 bit label... (I think the big O does this for Solaris).
> Then, of course, there are systems where largefile support is a filesystem-by-filesystem (read mountpoint-by-mountpoint) option (e.g. OpenUNIX). I think this is going to be a pandora's box.

I don't understand. Why would you want large-file support enabled on a per-filesystem basis? All your system programs would have to support the lowest common denominator (i.e., with large file support). Is it to make the kernel enforce a limit for the purposes of compatibility?

I'd suggest making it as simple as --enable-large-files and making it the default in a year or two.

-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
Re: [HACKERS] Re: [PATCH] To remove EXTEND INDEX
On Fri, Jul 13, 2001 at 05:49:56PM -0400, Tom Lane wrote:
> Let's drop the meta-discussions and cut to the chase: given that we are about to re-enable partial indexes, should we try to make EXTEND INDEX work too, or just remove it?

Just a few clarifications:

* The reason it didn't go to -hackers was because I wasn't subscribed to it and hence couldn't post to it. The only reason I can now is because I subscribed (nopost) about 2 minutes ago.
* I discussed this with Tom Lane on -general a few days ago. I'm not sure how many people saw that, though. Are most of the people on -hackers subscribed to -general as well?
* I agree with Tom's assertion that it's an awful lot of complexity for such a marginal gain. Look at the size of the patch and the fact that it has all been useless for the last few years.
* I didn't send it to -patches because it's not ready yet.
* I only posted a URL, not the patch itself.

Sorry for the confusion. Tom actually suggested doing this at the same time as re-enabling partial indices, but I favoured a separate patch considering the large number of scattered changes. Anyway, is there a consensus, or shall I forget the whole thing?

-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/ It would be nice if someone came up with a certification system that actually separated those who can barely regurgitate what they crammed over the last few weeks from those who command secret ninja networking powers.
Re: [HACKERS] Re: [PATCH] To remove EXTEND INDEX
On Fri, Jul 13, 2001 at 06:34:22PM -0400, Bruce Momjian wrote:
> > Let's drop the meta-discussions and cut to the chase: given that we are about to re-enable partial indexes, should we try to make EXTEND INDEX work too, or just remove it?
> We don't let people add columns to an existing index, so I don't see why we should have EXTEND INDEX, unless index twiddling is more common with partial indexes.

We currently don't allow people to fiddle with indices at all. I don't understand the origin of EXTEND INDEX, since I can't think of a situation where it would actually be useful.

-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
[HACKERS] Radical suggestion for plan executor?
I notice that the query executor currently has a lot of switch statements on the type of node it is descending to. This means you get a call tree like:

ExecProcNode
  ExecNestLoop
    ExecProcNode
      ExecMergeJoin
      ...

Wouldn't it be nicer if the Plan had access to function pointers that already referred to the right function? So instead of:

result = ExecProcNode(a, b);

you get:

a->procs.exec(b);

It compresses the call tree down a bit. However, I'm not sure if it has many benefits other than maintainability. OTOH, you could keep ExecProcNode and just replace the switch with a function call. Any thoughts?

-- Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
[HACKERS] Determining return type of polymorphic function
[Please CC any replies so I don't have to follow them via the archives]

Hi,

I'm trying to create a set of types that are going to share the INPUT and OUTPUT functions (written in C). For output you can determine the type from the arguments, but for INPUT you can't. The prototype is restricted (by CREATE TYPE) and you can't specify anyelement as the return type because none of the arguments use it.

My current way around that is to create an alias to the function with different names for each type, but get_fn_expr_rettype() doesn't appear to be filled in anyway (fcinfo->flinfo->fn_expr == NULL). What I'm trying to do now is use fcinfo->flinfo->fn_oid to look up pg_proc and get the return type from there, but something tells me there must be an easier way.

Or to put it another way, if I define a function like:

CREATE FUNCTION myfunction(cstring, oid, integer) RETURNS mytype AS 'mylib.so' LANGUAGE 'C';

how can I determine that I'm supposed to return a mytype? I'm running 7.4 if it matters...

Thanks in advance,
-- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them.
Re: [HACKERS] Determining return type of polymorphic function
[Please CC replies, thanks]

On Thu, Aug 11, 2005 at 02:17:30PM -0400, Tom Lane wrote:
> Martijn van Oosterhout kleptog@svana.org writes:
> > What I'm trying to do now is use fcinfo->flinfo->fn_oid to look up pg_proc and get the return type from there, but something tells me there must be an easier way.
> No, I think you're stuck. The internal calls for type I/O routines don't set up fn_expr (since there is no expression tree). One possibility, depending on your time horizon for this, is to change the getTypeIOParam rules so that ordinary types get their own OID as second argument.

Hmm, I was thinking about that. While reading the documentation I was thinking "surely they'd pass their own oid, giving zero would be silly", so I was kind of surprised when I did get zero.

I was thinking of actually also storing the oid in the typelem field, but the docs imply this does something fancy with subscripting. I haven't traced the code paths for that yet. At the very least, I think it would confuse anything looking for arrays. I also thought about typmod (the third argument) but that seems to almost always be -1.

Would a patch to change the rules be accepted, or would it be considered an unnecessary backward-incompatible change?

Thanks in advance,
-- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] Determining return type of polymorphic function
On Thu, Aug 11, 2005 at 02:51:11PM -0400, Tom Lane wrote:
> > Would a patch to change the rules be accepted, or would it be considered an unnecessary backward-incompatible change?
> I wouldn't back-patch it, but it seems like something we could still put in for 8.1.

Ok, here's a patch (with documentation update). I checked the regression tests (looked over, not run) but nothing there appears to test this anyway. I looked through all the datatype input functions but none of them even use the second argument except array and record types, and they're explicitly unchanged.

Note: the logic could be simplified if we could assume composite types can't have a non-zero typelem. From looking at the code, I think it may be assumed in places and I'm fairly sure it's nonsensical, but is it explicitly forbidden?

I thought of writing a few simple tests, but no language will accept cstring arguments except C. It can be added if you think it's worth regression testing.

Unless there are other comments I'll post this to pgsql-patches later...

Have a nice day,
-- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] [PATCH] Determining return type of polymorphic function
Forgot to attach it, oops.

On Fri, Aug 12, 2005 at 09:56:47AM +0200, Martijn van Oosterhout wrote:
> Ok, here's a patch (with documentation update). I checked the regression tests (looked over, not run) but nothing there appears to test this anyway. I looked through all the datatype input functions but none of them even use the second argument except array and record types, and they're explicitly unchanged.
> Note: the logic could be simplified if we could assume composite types can't have a non-zero typelem. From looking at the code, I think it may be assumed in places and I'm fairly sure it's nonsensical, but is it explicitly forbidden?
> I thought of writing a few simple tests, but no language will accept cstring arguments except C. It can be added if you think it's worth regression testing.
> Unless there are other comments I'll post this to pgsql-patches later...

-- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/

Index: doc/src/sgml/ref/create_type.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/ref/create_type.sgml,v
retrieving revision 1.57
diff -u -r1.57 create_type.sgml
--- doc/src/sgml/ref/create_type.sgml	14 Jul 2005 06:17:36 -0000	1.57
+++ doc/src/sgml/ref/create_type.sgml	12 Aug 2005 12:52:17 -0000
@@ -102,9 +102,8 @@
    declared as taking one argument of type <type>cstring</type>,
    or as taking three arguments of types <type>cstring</type>,
    <type>oid</type>, <type>integer</type>.
-   The first argument is the input text as a C string, the second
-   argument is the element type's OID in case this is an array type
-   (or the type's own OID for a composite type),
+   The first argument is the input text as a C string, the second argument is
+   the OID of the type, except for arrays where it is the element type's OID,
    and the third is the <literal>typmod</> of the destination column, if known
    (-1 will be passed if not).  The input function must return a value of the
    data type itself.
Index: src/backend/utils/cache/lsyscache.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/cache/lsyscache.c,v
retrieving revision 1.116
diff -u -r1.116 lsyscache.c
--- src/backend/utils/cache/lsyscache.c	29 Aug 2004 05:06:50 -0000	1.116
+++ src/backend/utils/cache/lsyscache.c	12 Aug 2005 12:52:17 -0000
@@ -1220,12 +1220,14 @@
 	/*
 	 * Composite types get their own OID as parameter; array types get
-	 * their typelem as parameter; everybody else gets zero.
+	 * their typelem as parameter; everybody else gets their own oid.
 	 */
 	if (typeStruct->typtype == 'c')
 		return HeapTupleGetOid(typeTuple);
-	else
+	else if (typeStruct->typelem != 0)
 		return typeStruct->typelem;
+	else
+		return HeapTupleGetOid(typeTuple);
 }
 
 /*
[HACKERS] SPI: ERROR: no snapshot has been set
[Please CC any replies, thanks] Hi, I got this nice error, and according to Tom Lane[1]: Depending on what PG version you're using (which you did not say, naughty naughty) there are paths in interactive query entry that might try to execute datatype input functions before setting the query snapshot. But I don't believe it can happen down inside SPI. [1] http://www.dbforums.com/archive/index.php/t-1065541.html Good guess, I am inside a type input function trying to use SPI. My questions are:

- I'm using 7.4, is this fixed/changed in later versions?
- I'm triggering it with: select 'hello'::mytype; Is there an easy way of testing my function without triggering this?

  select (select 'hello'::mytype);  -- Same problem
  select (select 'hello')::mytype   -- gives something about casting unknown
  \copy .. from stdin               -- Works but is inconvenient

- Is there a way to detect if a snapshot has been started and create one if there hasn't been? I've seen the snapshot-related functions but I don't feel comfortable sticking them in my type input function just to fix this. Is it safe?

Thanks in advance,
Re: [HACKERS] SPI: ERROR: no snapshot has been set
On Sat, Aug 13, 2005 at 12:36:42PM -0400, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: - Is there a way to detect if a snapshot has been started and creating one if there hasn't. I've seen the snapshot related functions but I don't feel comfortable sticking them in my type input function just to fix this. Is it safe? This would be a pretty bad idea IMHO. The setting of the first snapshot in a transaction is a significant event, and it should not happen in random places like type-specific input functions. I figured as much. In that case I've been thinking something like:

if (SerializableSnapshot == NULL)
    elog(ERROR, "Sorry, 'const'::mytype constructs are not supported in first transaction, please use mytype('const'::text)");

I hope this works; I haven't gotten around to the casts yet, and I hope non-immutable casts are called late enough. Annoyingly, it breaks a straight: INSERT INTO table VALUES ('const','const','const',...) Which is what GUI frontends are going to try to do... My alternative is to cache results but that would make whether it works sporadic and unpredictable. A long-term solution to this might be to do what's envisioned in this comment in parse_coerce.c: snip I'm a bit concerned about the possible side-effects of this on existing behavior, though. In particular constructs like 'now'::timestamp would change behavior subtly. It's a good idea, eventually. All it would do is change the time it's evaluated, and in that example it doesn't change since 'now' is the transaction start time. However, in general it is a change of behaviour. Thanks in advance,
Re: [HACKERS] SPI: ERROR: no snapshot has been set
For future reference, I got around this error (no snapshot has been set) by removing the use of SPI and just using heap_open / heap_beginscan / heap_endscan / heap_close. It's only slightly more code but it works irrespective of the state of the backend. Have a nice day, On Sat, Aug 13, 2005 at 03:59:56PM +0200, Martijn van Oosterhout wrote: Good guess, I am inside a type input function trying to use SPI. My questions are:
[HACKERS] Typmod for user-defined types
[Please CC any replies, thanks] Hi, I've got a situation where I'd really like to be able to have a typmod for a user-defined type. In particular, I'd like to make use of the coerce_to_target_type()/coerce_type_typmod() chain. This only works if you have a typmod != -1. Even if you set the typmod in pg_type, when you create a table it's not copied to the attribute but set back to -1. This has been discussed before[1] and I notice Tom Lane posted a message about this back in June last year. I was wondering if it would be possible to allow user-defined types to declare a typmod function. As an experiment I added arguments to GenericType but that gave reduce/reduce conflicts. OTOH if you just add to SimpleTypename it just works. I imagine this limits the places it can work. Tom Lane's patch[2] looks like it may work, but would a mechanism to allow user-defined types to have a typmod function be accepted? Have a nice day, [1] http://archives.postgresql.org/pgsql-hackers/2004-06/msg00923.php [2] http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php
Re: [HACKERS] Typmod for user-defined types
AIUI, the issue is statements like:

# select numeric(6,3) '5.6';

Bison can't look far enough ahead to determine whether the numeric is a type or a function. However, the desired grammar itself is not ambiguous; it's just that Bison is not smart enough to make it work. It works now only because we hardwire the types that are allowed to have arguments. Wouldn't a way around this be to have the grammar allow a type to look like a function call but convert the FuncExpr node to a TypeName node if the grammar determines it is a type after all. Something like:

SimpleTypename: type_name { $$ = $1 }
    | func_expr { $$ = MakeTypeFromFuncCall( $1 ) }

That function could then check that only constant integer arguments have been used, and then dispatch to a type-specific function to turn the arg list into a typmod. Among other things, this would allow us to produce better error messages for things like bit(6,3) and numeric(l) instead of just "syntax error". I was thinking of adding to each type a typmod_in and a typmod_out function. The first takes an array of integers in and returns an int32, the latter does the reverse. This is however quite a dramatic overhaul of the grammar and I'm not sure it's actually conflict free. But numeric, float, decimal, etc. would no longer need to be matched by the lexer and you could call the timestamp function without double quotes. At least in preliminary tests it seems OUT as a function name would be a casualty due to the new IN/OUT parameters:

function ( field IN OUT type )

It can't determine soon enough that OUT can't be a type in this context. And with this change type/function names would be the same list. I wonder if there is a way around this. Anyway, I'll try out some more stuff. Have a nice day, On Sat, Aug 27, 2005 at 11:09:09AM -0400, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Tom Lane's patch[2] looks like it may work, but would a mechanism to allow user-defined types to have a typmod function be accepted?
[2] http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php Well, the question still stands: are we going to paint ourselves into a corner by requiring type and function names to be equivalent? regards, tom lane
[HACKERS] Simple tester for MVCC in PostgreSQL
[Please CC any replies, thanks] Hi, I saw the discussion about a tester for MVCC. Since I'd never done anything with asynchronous queries before, I figured I'd try to write something useful with it. The result is at: http://svana.org/kleptog/pgsql/mvcctest.tar.gz It's a tester that takes a testing script and runs the queries one at a time through as many connections as you ask. It uses asynchronous queries so it can handle queries that don't return immediately (using a wait model a la UNIX). It doesn't use threads. It can assign variables, handle expected errors and test that queries unblock at the right time. Whether this is enough for serious testing I have no idea. I've included 5 selftests and 3 real tests (which about covers my knowledge of transaction isolation levels). I hope it provides a useful basis for a real tool. It uses Perl and the Pg module from CPAN (included in many distributions, it's basically a wrapper for libpq). For more details see the README. A simple test harness type program is included. Just typing make will set it going. As an example I include a sample test script below (included as tests/in/test3.test). It opens two connections and tries several concurrent updates. In the first block they don't conflict, so there is no delay. In the second block the second update has to wait for the rollback; in the third it causes a serialization failure. I will not be able to answer my email for the next few hours due to sleep. Please be patient if it doesn't work straight away.
Have a nice day,

# This test tests simple serialisation failure
1I drop table t
1 SET default_transaction_isolation = serializable
2 SET default_transaction_isolation = serializable
1 create table t (grp text, value int4)
1 insert into t values ('a',10)
2 insert into t values ('a',20)
1 insert into t values ('b',30)
2 insert into t values ('b',40)

# These updates don't conflict, all should be fine
1 begin
2 begin
1 update t set value = value + 10 where grp = 'a'
2 update t set value = value + 10 where grp = 'b'
1 commit
2 commit

# These updates do conflict, but the second can proceed once first transaction aborts
1 begin
2 begin
1 update t set value = value + 10 where grp = 'a'
2 update t set value = value + 10 where grp = 'a'
1 abort
2 wait
2 commit

# These updates conflict, the second transaction aborts
1 begin
2 begin
1 update t set value = value + 10 where grp = 'a'
2e update t set value = value + 10 where grp = 'a'
1 commit
2 wait
2 abort

1 drop table t
[HACKERS] On hardcoded type aliases and typmod for user types
Hi, As part of previous discussions about typmod for user types, Tom mentioned that you would need to make type and function names equivalent. As it turns out, if you refactor a few rules, you can actually make it work and manage them separately. For this to work the current col_name_keyword non-terminal has to be divided into a few more categories. The criterion is mostly whether it is followed by a left parenthesis.

1. Some can be moved to unreserved words, mostly the types BIT, VARCHAR, BIGINT etc. since they don't need special rules anymore.
2. Words that have special productions (eg CONVERT, COALESCE etc); these can still only be used as column names, not types or (user-defined) functions.
3. Words which can be column names and functions but not types. These never appear normally with a parenthesis, so they will be interpreted as a function if there is one. (eg SETOF, NATIONAL, etc)
4. Words that can be column names and types but not functions. These are artifacts of the grammar due to the fact that VARYING is unrestricted. These are BIT, NCHAR, CHAR_P and CHARACTER.

After this, you can have user-functions that take an arbitrary set of parameters. There'll need to be a bit of code to verify the arguments. It results in a net gain of 15 keywords for functions and about 30 for types. My question is, should users be able to create types schema.int4 and schema.integer simultaneously. Currently it allows you but it's not handled very well (\dT doesn't list both). Should this be allowed? Should aliasing for DEC and DECIMAL -> NUMERIC be done for user-defined types? Have a nice day,
Re: [HACKERS] On hardcoded type aliases and typmod for user types
On Wed, Aug 31, 2005 at 11:11:04AM -0400, Tom Lane wrote: IMHO, ideally the aliasing should *only* apply to the built-in types. The current hack only approximates this (IIRC, the translation happens for any unqualified type name, independently of one's search path). One possible approach is to remove the aliasing translation from the grammar altogether, and add a notion of alias entries in pg_type that would be found through normal lookup and then replaced by the underlying type by parse analysis rather than by the grammar. We could not do this in the existing system because of the need to handle typmods for some of the aliases ... but maybe it would work given generalized typmod support. There are still a few special cases though, like CHARACTER VARYING, which seems like it more or less has to be wired into the grammar. Yeah, I was thinking about alias entries. I was thinking that domains might already do a lot of the work. But then it's not really aliasing anymore. BTW, the proposed refactoring sounds messier to me than does decreeing type and function names equivalent ... Actually, it's not that bad. The non-terminals relating to types collapse into about three, with only xxx VARYING and DOUBLE PRECISION and a few others remaining. The keywords are split into three sets like I mentioned. Then you make the rules for GenericType look enough like the expansion of func_expr that bison can consider the following context before deciding whether it's a function or a type. The changes are not that great. The number of rules (according to the -v output) goes from 1610 to 1601 and states from 2810 to 2777. OTOH, a chunk of code moves from gram.y to the adt directory I guess. The grammar is just the beginning of the work. Though maybe the point is that we can take the easy way and implement the slightly more difficult one if it turns out to be necessary. Have a nice day,
Re: [HACKERS] On hardcoded type aliases and typmod for user types
On Wed, Aug 31, 2005 at 02:25:54PM -0400, Tom Lane wrote: I still like the idea of pushing the aliasing out of the grammar, though. Come to think of it, we could probably even handle the multiple-word stuff that way: let the grammar convert CHARACTER VARYING to character varying and have an alias with that name in the catalog. Currently, a user-defined type, function, table etc. is one IDENT; I don't see any point in changing that. The standard wants character varying and we can support that, but we don't need to extend that to user-defined types. If people really want that they can quote it. Besides, the character stuff is somewhat special as it has the character set stuff so will need special grammar anyway. One thing you'd need to look at is that format_type is aware of the special properties of the alias names: at present they never need to be schema-qualified, but this would no longer be certainly the case with the aliasing approach. A possible answer is for format_type to work by replacing (say) INT4OID with the OID of the alias type that has the desired spelling, and then use the same TypeIsVisible test as is applied to any user type. Another thing that is involved there is not double-quoting the generated names ... we don't want it to emit character varying but the user-type path would do that. I was thinking actually of setting the type searching code to search pg_catalog before the normal search_path. The types being hardwired into the grammar essentially implied this, so I thought I would avoid surprises. This of course would mean that all built-in types would automatically override user-defined ones, which actually sucks if PostgreSQL keeps including more types by default. OTOH, then types in pg_catalog never need to be qualified, making it easy for format_type. Now I think about it, it may not be a good idea, for all its benefits. Perhaps only doing it for multiword types. Damn special casing. Hmm...
actually there's a bit of an issue here, which is that it's not clear whether schema qualification makes sense for the multi-word type names. For instance pg_catalog.character varying It doesn't work. The current grammar, even now, treats anything schema-qualified as non-special. You can't schema-qualify char(4) even if you want to. Incidentally, these typmod changes for user types would make this work as a side-effect. Have a nice day,
Re: [HACKERS] On hardcoded type aliases and typmod for user types
On Wed, Aug 31, 2005 at 05:14:13PM -0400, Tom Lane wrote: That strikes me as an unnecessary reduction in flexibility. As long as we make the hardwired type names translate to qualified names (same as they do now) we don't have to assume any such thing. Ack, there are fortunately only a handful of those. The point about character sets is a bit distressing; here we are designing a new general-purpose mechanism and we can already see cases it doesn't handle. Can we fix that? Err, well. My thought was that a certain group of type-suffix options would be permitted (only zero or one at a time), for example:

WITH TIME ZONE
WITHOUT TIME ZONE
CHARACTER SET xxx

And have the grammar accept these after any type. For example, the type NUMERIC WITH TIME ZONE would be syntactically valid but the code would then reject it. You have an issue then because the typmod function should then be able to return a completely different type, because the system looked up timestamp and now the function determines that with that option it should actually be timestamptz. As for the specific mechanism, well, my options were (in the TYPE declaration statement):

TYPMODFUNC = function( intarray [, sometype] ) RETURNS int32 or intarray

This restricts the arguments between the brackets to integers; is this reasonable? The sometype would be something to handle the suffix options. (Text pair? option,value). Returning an intarray if a new type is allowed.

TYPMODFUNC = function( recordtype [, sometype ] ) RETURNS int32 or intarray

The record type could then indicate what's supported, except you can't pass a variable number of arguments (for NUMERIC). How about fill up from the front, leave NULLs for all the unfilled ones. The STRICT flag could tell if all fields need to be filled (ugh).

TYPMODFUNC = function( internal [, sometype ] ) RETURNS int32 or intarray

Simply pass the (Node*) from the parser and let the function sort it out itself. Except now they have to be written in C.
Is this unreasonable? It's called fairly early on; all the issues with no valid snapshot apply here, and you can't defer the evaluation till later. I'm not sure how to choose. They all handle the current situation fine, but what do we want to allow users to do in the future? Is the SQL standard likely to come up with SOMETYPE(ident) as a declaration, in which case we need the second or third options? Ident can be converted to a constant string for these purposes. And then there's output to consider; currently timestamp etc. have special cases. But if you're going to allow CHARACTER SET xxx to apply to any type, you need a way to reconstruct the values for output. Requiring the user to provide an inverse function is one (possibly unreliable) way. Storing the arguments directly is another. And is one int32 typmod sufficient? This character set per column has been talked about for a while, but where was the information going to be stored? There are several issues to be sorted out yet, I fear. Have a nice day,
Re: [HACKERS] On hardcoded type aliases and typmod for user types
On Thu, Sep 01, 2005 at 11:18:04AM +0200, Dennis Bjorklund wrote: String types have 3 modifiers: the length, the charset and the collation. The syntax of these is defined by the standard so at least that syntax ought to be allowed (even if there is more work to actually do anything with charset and collation info). From a quick browse in Google it looks like:

CHAR(4) COLLATE xxx CHARACTER SET yyy

is valid syntax. So I guess that becomes 0 or more modifiers. And a single int32 is not going to cut it. More fields (typmod2, typmod3)? Fields explicitly for this purpose (typcollate and typcharset)? An array of int32? Which would make the second argument to the typmod function ARRAY OF (textpair) or some such. If the function doesn't accept that then this is a quick indicator that no options are allowed. Quick thing: should 'mytype' and 'mytype()' be considered the same, and should they default to typmod -1? Currently '()' is not even accepted. Have a nice day,
Re: [HACKERS] On hardcoded type aliases and typmod for user types
On Thu, Sep 01, 2005 at 11:12:26AM +0300, Hannu Krosing wrote: Maybe make the last one WITH CHARACTER SET xxx and promote WITH to a real keyword. It seems a good idea to have WITH as a real keyword anyway, as at least ANSI/ISO syntax for recursive queries seems to require it too. Sorry, CHARACTER SET is defined by the SQL standard. I don't understand what it is there for though; I thought the point of UNICODE/UTF-8 was to get rid of all this crap. I also can't find the bit that explains what should happen if two strings of different character sets are concatenated. The only thing I can think this is useful for is default input/output charset, overriding client_encoding, with everything internally still UNICODE. The COLLATE stuff is neat, if we can get it to work. Maybe CHARSET is a roundabout way to specify the COLLATE order? Incidentally, I just downloaded the SQL99 spec and am slightly confused by some of the things they'd added. Am I the only one?
Re: [HACKERS] On hardcoded type aliases and typmod for user types
On Thu, Sep 01, 2005 at 08:50:27AM -0400, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Simply pass the (Node*) from the parser and let the function sort it out itself. Except now they have to be written in C. Is this unreasonable? Nope. You're not going to be writing any interesting datatypes without using C, anyway. I'd go with this one to minimize flexibility problems. Ack. I'd be inclined to stick with the convention that internally typmod is a single int32; that is really wired into way too many APIs to consider changing. varchar could do something like using 24 bits for the length and 8 bits for an encoded indication of the charset. With the unfortunate effect that strings are limited to 16MB instead of 1GB. Not sure if people will be happy with that one. For my locale experiments I used my taggedtypes module to embed the locale into the data itself; I imagine something similar could be used. Have a nice day,
[HACKERS] Using multi-locale support in glibc
Browsing the glibc stuff for locales I noticed that glibc does actually allow you to specify the collation order to strcoll and friends. The feature is however marked with: Attention: all these functions are *not* standardized in any form. This is a proof-of-concept implementation. They do however work fine. I used my taggedtypes module to create a type that binds the collation order to the text strings and the results can be seen below.

1. Is something supported by glibc usable for us (re portability to non-glibc platforms)?
2. Should we be trying to use an interface that's specifically marked as unstable?
3. What's the plan to support multiple collate orders? There was a message about it last year but I don't see much progress.
4. It makes some things more difficult. For example, my database is UNICODE and until I specified a UTF8 locale it didn't come out right. AFAIK the only easy way to determine if something is UTF8 compatible is to use locale -k charmap. The C interface is hidden. It should be possible to compile a list of locales and allow only ones matching the database. Or automatically convert the strings; the conversion functions exist.
5. Maybe we should evaluate the interface and give feedback to the glibc developers to see if it can be made more stable.

If you want to have a look to see what's available, use: rgrep -3 locale_t /usr/include/ |less Have a nice day, PS.
The code to test this can be found at: http://svana.org/kleptog/pgsql/taggedtypes.html

--- TEST OUTPUT ---
test=# select strings from taggedtypes.locale_test order by locale_text( strings, 'C' );
 strings
---------
 Test2
 Tést1
 Tëst1
 test1
 tèst2
(5 rows)

test=# select strings from taggedtypes.locale_test order by locale_text( strings, 'en_US' );
 strings
---------
 Tëst1
 Tést1
 tèst2
 test1
 Test2
(5 rows)

test=# select strings from taggedtypes.locale_test order by locale_text( strings, 'nl_NL' );
ERROR: Locale 'nl_NL' not supported by library

test=# select strings from taggedtypes.locale_test order by locale_text( strings, 'en_AU.UTF-8' );
 strings
---------
 test1
 Tést1
 Tëst1
 Test2
 tèst2
(5 rows)
Re: [HACKERS] Using multi-locale support in glibc
On Thu, Sep 01, 2005 at 01:46:00PM -0400, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: 1. Is something supported by glibc usable for us (re portability to non-glibc platforms)? Nope. Sorry. Do we have some platforms that don't have any multi-language support? I mean, we don't have a complete thread library but a wrapper around the ones used on the platform. Couldn't we make a similar wrapper that used glibc if it's available, Windows native if it's available, etc.? That way we conform to the platform rather than to a version of the Unicode collating set that PostgreSQL happens to ship with. For example, Windows doesn't use standard Unicode sorting rules; do we care if people come complaining that PostgreSQL sorts differently from their app?
[HACKERS] Proof of concept COLLATE support with patch
Supports any glibc platform and possibly Win32. Adds:

SELECT ... ORDER BY expr COLLATE 'locale'
CREATE INDEX locale_index ON table(expr COLLATE 'locale')
Index scan used when COLLATE order permits

This is just a proof-of-concept patch. I didn't send it to -patches because, as Tom pointed out, there's no hope of it getting in due to platform-dependent behaviour. This patch does not use setlocale and is completely orthogonal to any locale support already in the backend. As it turns out, meaningful locale support only needs a handful of support functions to work. These are listed at the bottom. My patch only uses the first two, but the third will be needed at some stage. The use of the last one depends on how the backend ends up supporting locales. Both glibc and Win32 have locale-sensitive versions of many functions including: toupper_l, tolower_l, strfmon_l, strtoul_l, strtof_l, strftime_l, is*_l A Windows function list is at: http://msdn2.microsoft.com/library/wyzd2bce(en-us,vs.80).aspx Patch available here: http://svana.org/kleptog/pgsql/collate1.patch Implementation notes follow and the table of functions is at the bottom. I hope this helps whenever someone gets around to full COLLATE support. Have a nice day, Notes:

* It works by replacing (expr COLLATE 'locale') with pg_strxfrm(expr, pg_findlocale(locale)) in the parsetree. pg_findlocale returns an opaque pointer to the locale. It is STRICT IMMUTABLE and is optimised away in the final query. pg_strxfrm takes the string and the locale and returns a bytea. bytea comparison uses memcmp so is safe from other locale effects in the backend.
* Use of COLLATE for an index will probably double the diskspace required for that index due to the strxfrm.
* I had to add the functions to pg_proc.h because CREATE FUNCTION couldn't find them. So they have OIDs I made up. You may need to initdb, I'm not sure. You can compile pg_xlocale.c as a shared object and load them that way too if you want to avoid the initdb.
* Internally they are defined as taking and returning internal. CREATE FUNCTION doesn't like that so specify opaque or oid instead. The declarations are:

create function pg_findlocale(text) returns oid as 'pg_findlocale' language internal strict immutable;
create function pg_strxfrm(text,oid) returns bytea as 'pg_strxfrm' language internal strict immutable;

* The clause ORDER BY 1 COLLATE 'en_AU' breaks; it treats the 1 like a constant. I couldn't quickly work out how to reference the columns the right way. Long term that code should be in the sorting code anyway.
* The locale needs to be in quotes, otherwise the parser converts it to lower-case. Locale names are case-sensitive on many systems.
* There is a text function strcoll_l for testing collation:

create function pg_strcoll_l(text,text,text) returns int4 as 'pg_strcoll_l' language internal strict immutable;

* Yes this is the easy way out; implementing the inheritance of the COLLATE attribute will be much more invasive. This gives most people what they want though.
* Although these functions are documented on Windows, they are not for glibc, so it is an unstable interface.

Function Needed                  glibc           Win32
-------------------------------------------------------
Function returning opaque        newlocale       _create_locale
  pointer to locale data
strxfrm with locale parameter    strxfrm_l       _strxfrm_l
Method finding encoding for      nl_langinfo_l   ???
  locale
strcoll with locale parameter    strcoll_l       _strcoll_l
Re: [HACKERS] Proof of concept COLLATE support with patch
On Fri, Sep 02, 2005 at 03:04:20PM +0200, Martijn van Oosterhout wrote: Supports any glibc platform and possibly Win32. MacOS X [1] supports this also, apparently. And for glibc it appears to have been accepted as part of the API since 2.3.2 and formally accepted into LSB 3.0. Win32 claims to have supported this since '98. But even though the MacOS X manpage says BSD Library Functions at the top of the page, neither FreeBSD nor OpenBSD appears to have it at all. Not really a lot of chance that we could pull portions of the Darwin libc into PostgreSQL, huh? Maybe the easiest thing would be to download the libc locale support of one of the BSDs, remove the global variable and use that... [1] http://www.hmug.org/man/3/newlocale.php Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
Re: [HACKERS] Proof of concept COLLATE support with patch
On Fri, Sep 02, 2005 at 12:44:00PM -0400, Tom Lane wrote: Hmm, the more general page seems to be http://www.hmug.org/man/3/xlocale.php This seems to be pretty much exactly what we want, at least API-wise. Now, if we can find an implementation of this with a BSD license ;-) ... Yes it is, it's exactly the same interface as glibc. Windows has them all with an underscore prefix. [ I don't recall at the moment whether Apple publishes all of Darwin under a straight BSD license, but that would surely be a good place to look first. ] libc is listed as APSL licence, whatever that means. Something with that many clauses can't be BSD-compatible. What I wonder is how come Apple implemented all this in their version yet none of the BSDs got around to it. I've looked around for Citrus; it appears that NetBSD contains the latest version and while there's a lot of stuff for LC_CTYPE and charset conversion, LC_COLLATE didn't appear to be high on their priorities. I especially liked these fragments from the OpenBSD and NetBSD CVS repositories. Tom, you've convinced me, relying on the platform is silly. We have platforms that don't support LC_COLLATE in one locale, let alone multiple. FreeBSD thankfully does support it.

http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libc/string/strcoll.c?rev=HEAD
http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libc/string/strcoll.c?rev=HEAD

--- snip ---
/*
 * Compare strings according to LC_COLLATE category of current locale.
 */
int
strcoll(s1, s2)
	const char *s1, *s2;
{
	_DIAGASSERT(s1 != NULL);
	_DIAGASSERT(s2 != NULL);

	/* LC_COLLATE is unimplemented, hence always C */
	return (strcmp(s1, s2));
}

-- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
Locale implementation questions (was: [HACKERS] Proof of concept COLLATE support with patch)
On Fri, Sep 02, 2005 at 11:42:21AM -0400, Tom Lane wrote: The objection is fundamentally that a platform-specific implementation cannot be our long-term goal, and so expending effort on creating one seems like a diversion. If there were a plan put forward showing how this is just a useful way-station, and we could see how we'd later get rid of the glibc dependency without throwing away the work already done, then it would be a different story. Well, my patch showed that useful locale work can be achieved with precisely two functions: newlocale and strxfrm_l. I'm going to talk about two things: one, the code from Apple. Two, how we present locale support to users. --- Now, it would be really nice to take Apple's implementation in Darwin and use that. What I don't understand is the licence of the code in Darwin. My interpretation is that stuff in: http://darwinsource.opendarwin.org/10.4.2/Libc-391/locale/ is Apple stuff under APSL, useless to us. And that stuff in: http://darwinsource.opendarwin.org/10.4.2/Libc-391/locale/FreeBSD/ are just patches to FreeBSD and thus under the normal BSD license (no big header claiming the licence change). The good news is that the majority of what we need is in patch form. The bad news is that the hub of the good stuff (newlocale, duplocale, freelocale) is under a big fat APSL licence. Does anyone know if this code can be used at all by BSD projects or did they blanket relicence everything? --- Now, I want to bring up some points relating to including a locale library in PostgreSQL. Given that none of the BSDs seem really interested in fixing the issue, we'll have to do it ourselves (I don't see anyone else doing it). We can save ourselves effort by basing it on FreeBSD's locale code, because then we can use their datafiles, which we *definitely* don't want to maintain ourselves. Now:

1. FreeBSD's locale list is short, some 48 compared with glibc's 217. Hopefully Apple can expand on that in a way we can use. 
But given the difference we should probably give people a way of falling back to the system libraries in case there's a locale we don't support. On the other hand, lots of locales are similar, so maybe people can find ones close enough to work. No, glibc and FreeBSD use different file formats, so you can't copy them. Do we want this locale data just for collation, or do we want to be able to use it for formatting monetary amounts too? This is even more info to store. Lots of languages use ISO/IEC 14651 for order.

2. Locale data needs to be combined with a charset and compiled to work with the library. PostgreSQL supports at least 15 charsets but we don't want to ship compiled versions of all of these (Debian learnt that the hard way). So, how do we generate the files people need?

  a. Auto-compile on demand. First time a locale is referenced, spawn the compiler to create the locale, then continue. (Ugh)
  b. Add a CREATE LOCALE english AS 'en_US' WITH CHARSET 'utf8'. Then require the COLLATE clause to refer to this identifier. This has some appeal, separating the system names from the PostgreSQL names. It also gives some info regarding charsets.
  c. Should users be allowed to define new locales?
  d. Should admins be required to create the external files using a program, say pg_createlocale?

Remember, if you use a latin1 locale to sort utf8 you'll get the wrong result, so we want to avoid that.

3. Compiled locale files are large. One UTF-8 locale datafile can exceed a megabyte. Do we want the option of disabling it for small systems?

4. Do we want the option of running system locales in parallel with the internal ones?

5. I think we're going to have to deal with the very real possibility that our locale database will not be as good as some of the system-provided ones. The question is how. This is quite unlike timezones, which are quite standardized and rarely change. That database is quite well maintained. 
Would people object to a configure option that selected:

  --with-locales=internal  (use pg database)
  --with-locales=system    (use system database for win32, glibc or MacOS X)
  --with-locales=none      (what we support now, which is neither)

I don't think it will be much of an issue to support this; all the functions take the same parameters and have almost the same names.

6. Locales for SQL_ASCII. Seems to me you have two options: either reject COLLATE altogether unless they specify a charset, or don't care and let the user shoot themselves in the foot if they wish...

BTW, this MacOS locale support seems to be new for 10.4.2 according to the CVS log info, can anyone confirm this? Anyway, I hope this post didn't bore too much. Locale support has been one of those things that has bugged me for a long time and it would be nice if there could be some real movement. Have a nice weekend, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them.
Re: Locale implementation questions (was: [HACKERS] Proof of concept COLLATE support with patch)
On Sat, Sep 03, 2005 at 05:44:50PM -0400, Greg Stark wrote: [...] Nor is it simpler for sysadmins to have to maintain an entirely separate set of locales independently from the system locales. Indeed, I was already coming up with mechanisms to determine what locales the system uses and try to autogenerate them. I agree though, it's not useful for systems that already have complete locale support. Why add to the burden? Anyway, my reading of the specs says that we must support the syntax. It doesn't say we need to support any orderings other than the default (ie what we do now). If you really are unhappy enough with OS setlocale implementations to want to try to do this then it would be more helpful to do it outside of Postgres. Package up the Apple setlocale library as a separate package that anyone can install on Solaris, BSD, Linux or whatever. Then Postgres can just say it works fine with your OS library but your OS library might be very slow. Here's a third-party library that you can install that is fast and may relieve any problems you have with collation performance. That's why I asked about the patches and files that Apple wrote. What are the licence restrictions? Would we be able to download the, what, 20 files and distribute it as a library. Being APSL we couldn't include it in the tarball, but it could be a pgfoundry project or something. If somebody knows a reason why this could not be done, speak up now because my reading of the APSL licence tells me it's fine. But I think that's getting ahead of things. Until Postgres even supports collations using the OS libraries you won't even know if that's even necessary. Well, I added COLLATE support for ORDER BY and CREATE INDEX and it worked in under 200 lines. I'm thinking ahead and I don't think the COLLATE rules are that hard. Implementing them seems a bit fiddly. It may be easiest to consider COLLATE a non-associative operator. 
I'm still unsure if I should turn the string comparison operators into three-argument functions. Anyway, I'll look into the library issue first. -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
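As a concrete sketch of what a three-argument string comparison would look like on an xlocale platform (glibc 2.3+, Darwin; Win32 has the same functions with an underscore prefix), here is the interface under discussion. The function name is mine, and the test uses the "C" locale only because it is guaranteed to exist; real use would pass the locale named in the COLLATE clause.

```c
#define _GNU_SOURCE          /* expose strcoll_l/newlocale on older glibc */
#include <assert.h>
#include <locale.h>
#include <string.h>

/* A comparison that carries the collation explicitly, instead of
 * reading it from global state: the two strings plus an opaque
 * locale_t obtained from newlocale(). */
static int compare_with_locale(const char *a, const char *b, locale_t loc)
{
    return strcoll_l(a, b, loc);
}
```

Because the locale travels as an argument, two comparisons in the same query can use different collations without any setlocale() calls, which is exactly what per-expression COLLATE needs.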
Re: [HACKERS] Locale implementation questions
On Sun, Sep 04, 2005 at 10:25:36PM +0900, Tatsuo Ishii wrote: 3. Compiled locale files are large. One UTF-8 locale datafile can exceed a megabyte. Do we want the option of disabling it for small systems? To avoid the problem, you could dynamically load the compiled tables. The charset conversion tables are handled in a similar way. That's not the point; of course they are loaded dynamically. The question is, when do we create the files in the first place. There are 48*15 = 720 combinations, which would amount to tens of megabytes of essentially useless data. *When* you create the files is an important question. Compile time is out. Charset conversion is completely different, there just aren't that many combinations. Also I think it's important to allow user-defined collate data. To implement the CREATE COLLATE syntax, we need to have that capability anyway. Most OS's allow you to create collate data yourself anyway, why do we need to implement this too? To be honest, I don't understand why we have to rely on (often broken) system locales. I don't think building our own locale data is too hard, and once we make it up, the maintenance cost will be very small since it should not be changed regularly. Moreover we could enjoy the benefit that PostgreSQL handles collations in a correct manner on any platform which PostgreSQL supports. You say building our own locale data is not hard. I disagree, it's a waste of time we can do without. Unless you know the language yourself you cannot check changes made by anybody else. If there's an error in locale ordering, take it up with your OS distributor. I also think we open ourselves to questions like: 1. My locale is supported by the system but not by PostgreSQL, why? 2. My locale was supported last release but not this one, why? 3. Why does PostgreSQL sort differently from 'sort' or any other app on my system? Right. We Japanese (and probably Chinese too) have been bugged by the broken multibyte locales for a long time. 
Using C locale helps us to a certain extent, but for Unicode we need correct locale data, otherwise the sorted data will be complete chaos. Ok, is glibc still wrong, or are they just implementing the Unicode standard and that's what's wrong? All I'm saying is that we need to allow use of system locales until our native locale support is mature. In the end something like ICU (http://icu.sourceforge.net/) will end up obsoleting us. Nobody (in free software anyway) uses it yet, but eventually it may be viable to require that to allow system-independent locales. -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
[HACKERS] Install Darwin's locale library on your system :)
Well, it was pointed out the other day that the Darwin C library supports the non-standard extensions to the POSIX locale interface and that this might be ported to other systems so PostgreSQL could use it. So, I have written a few scripts which download the libc and locale library from darwinsource, shuffle some files around and build the result into a library called libdummylocale.so. It basically completely replaces your locale support on whatever system you use it on. It's all under the APSL, though some parts may be BSD licenced. Let me say right now, the locale support here sucks, no two ways about it. It doesn't support a single UTF-8 locale. Oh, it lets you specify them, but when you ask for the CHARSET it still says US-ASCII. It does support a number of other different charsets. (Not for collation though). So my challenge to those people who think maintaining a locale library is easy: make *one* locale in FreeBSD (or Darwin or this lib) support full UTF-8 collation in whatever locale and/or charset you choose. It's all downhill from there. While it builds simple programs, I don't think it's totally safe. You'd need to rename the headers at least. And building on Darwin will probably blow up due to the way it plays fast and loose with Darwin-specific #defines. But it's a beginning if anyone is interested. It builds on my glibc system. I'm going to drop the idea of making a locale library, there's just nothing good enough. glibc is the only thing that comes close. From here on I'm going to work on COLLATE for systems that support xlocale, with an eye on ICU if/when it becomes standard enough. Download: http://svana.org/kleptog/pgsql/dummylocale.tar.gz Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
[HACKERS] Mysteriously lost values in nodes
[Please CC any replies, thanks] I added a field to each of Var, Const, FuncExpr and OpExpr which is set during parse_expr. But somewhere between the parsing and execution the values of these fields get reset back to NULL. But only for FuncExpr and OpExpr, for Var and Const it all works as expected. I've traced with the debugger and confirmed that the field is set but that it's copied somewhere before execution and that copy didn't copy this field. The copyFuncExpr worked, it's just that another place did a copy some other way. grep reveals several places where new nodes are created but rather than just changing them all, is there a particular phase where these changes are made that I should be looking at? Thanks in advance, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
Re: [HACKERS] Simple tester for MVCC in PostgreSQL
On Tue, Sep 06, 2005 at 03:51:41PM +, Matt Miller wrote: On Tue, 2005-08-30 at 00:56 +0200, Martijn van Oosterhout wrote: http://svana.org/kleptog/pgsql/mvcctest.tar.gz I've started using it in some simple cases and it seems to be a good tool. The feature set looks to me to be a pretty solid core on which to build. Very nice. I too think the base is sufficient for quite complicated tests. I actually wrote a script which tested all pairs of locks to ensure they blocked exactly as the documentation said they should. And it passed. Thanks for the feedback, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
Re: [HACKERS] Mysteriously lost values in nodes
On Tue, Sep 06, 2005 at 01:51:25PM -0400, Tom Lane wrote: Grepping for makeNode(OpExpr) might help you. Offhand I'd finger eval_const_expressions as the likely culprit. clauses.c has some other code you'd better look at too. Yeah, eval_const_expressions was the culprit in this case, though I think operators might have some more. Personally, when I want to add a field to a node, I grep for every reference to one or two of the existing fields to make sure I've found all the places I need to touch. So there's no shortcut, I'll remember that :) Thanks for the help, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
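A toy illustration of the failure mode in this thread, with made-up names: a node struct gains a field, but one hand-rolled copy path never learned about it, so the value silently resets to zero after the copy.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node: a field was added after the copy code was written. */
typedef struct ToyFuncExpr
{
    int funcid;
    int newfield;   /* the field added later */
} ToyFuncExpr;

/* A copy routine that predates newfield and was never updated --
 * the equivalent of a makeNode()/field-by-field copy site that
 * grep didn't find. */
static ToyFuncExpr *stale_copy(const ToyFuncExpr *src)
{
    ToyFuncExpr *dst = calloc(1, sizeof(ToyFuncExpr));
    dst->funcid = src->funcid;
    /* newfield is not copied: calloc leaves it zero */
    return dst;
}
```

This is why the grep-every-field-reference habit matters: the compiler can't warn about a copy function that merely omits a member.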
[HACKERS] Suggestion to simplify installation of external modules
There's been some discussion about loading of external modules and getting the right function definitions. I was wondering if it would be an idea to have the install scripts inside the module itself. Create a command:

  INSTALL word | 'full path'

It would search for $libdir/{word}.so (or the full path if specified) or whatever is appropriate for the platform. The module would then be queried for the installation script. I made a little script sql2install.pl [1] which takes an SQL file and turns it into a C file that exports a single function, __PG_INSTALL_FUNCTION. This returns a list of strings which are the commands in the SQL file. Just link this into the final module. While this does simplify installation, there are other benefits:

- There could be a maintained list of installed modules, for frontends.
- If the process created dependencies between the module and the functions created, pg_dump could avoid outputting those functions and types and instead emit INSTALL blah in the dump. This makes upgrades much easier because the definitions are no longer in the dump.
- Dumps become more portable across architectures, because the library name is not stored directly.
- Saner error messages.

I guess in the specific case of languages, the proposed table would be redundant as the CREATE LANGUAGE definition would be hardcoded neither in the backend nor in the dump, but in the module that implements it. Note, it would be just as easy to return a structure with author name, copyright info, version, CATVERSION, etc. Ideally, the module author would provide a properly formatted file and the pgxs scripts would take care of the conversion to C and linking. Note: I also thought of UNINSTALL but that would seem to be a great shoot-in-foot capability. Thoughts? [1] http://svana.org/kleptog/pgsql/sql2install.pl -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. 
A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
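A minimal sketch of what the module side might look like. The symbol name __PG_INSTALL_FUNCTION is from the post; the exact convention of returning a NULL-terminated array of SQL command strings is my assumption, and the command text is a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What sql2install.pl would generate from the module's SQL file:
 * the install commands baked into the shared object itself.
 * (Placeholder command; a real module would list its CREATE
 * FUNCTION / CREATE TYPE statements here.) */
static const char *install_commands[] = {
    "CREATE FUNCTION mymod_fn(text) RETURNS text"
    " AS '$libdir/mymod' LANGUAGE C STRICT;",
    NULL
};

/* The single exported hook the INSTALL command would look up
 * after dlopen()ing $libdir/{word}.so. */
const char **__PG_INSTALL_FUNCTION(void)
{
    return install_commands;
}
```

The backend would dlopen the module, resolve this symbol, and execute each returned string; because the SQL lives in the module, a dump only has to record "INSTALL mymod" rather than the definitions themselves.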
Re: [HACKERS] Case insensitive indexing in posgtres?
On Fri, Sep 09, 2005 at 04:00:57PM +0530, sandeep satpal wrote: Dear , One of the differences between MySQL and PostgreSQL is case-insensitivity. I want to modify the code of PostgreSQL so that it can support case-insensitivity in indexing and in join operations. Use citext: http://gborg.postgresql.org/project/citext/projdisplay.php -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
Re: [HACKERS] random system table corruption ...
On Sun, Sep 11, 2005 at 01:12:34PM +0200, Hans-Jürgen Schönig wrote: in the past we have faced a couple of problems with corrupted system tables. this seems to be a version-independent problem which comes up on -hackers from time to time. i have checked a broken file and i have seen that the corrupted page has actually been zeroed out. Near as I can tell, the only times pages are zeroed out is if zero_damaged_pages is set (destroying the evidence) or during WAL recovery. my question is: are there any options to implement something which makes system tables more robust? the problem is: the described error happens only once in a while and cannot be reproduced. maybe there is a way to add some more sanity checks before the page is actually written. Well, the most common cause is dodgy memory. Other than that I guess you could arrange for bgwriter to check the pages it is writing. I imagine it already does check the header; checking the data requires knowledge about the actual table and attributes. And about the only thing that says I'm broken is a varlena value with a long value. As they say, the only thing sure would be to have a backup. The only thing I can imagine being really useful would be a restore mode where you feed it the schema so it can reconstruct pg_class and pg_attribute just enough for you to dump it to reconstruct everything... You know, VACUUM FREEZE BACKUP on pg_catalog, physically copy the datafiles and offer the option to blat your catalog with an old one... -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
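Since the reported corruption was pages zeroed out, the cheapest pre-write sanity check is just: refuse to treat an all-zero page as live data. A sketch of the idea (the function name is mine, and wiring it into bgwriter is speculation, not existing behaviour):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BLCKSZ 8192   /* PostgreSQL's default block size */

/* A freshly extended page is legitimately all zeros, but a page that
 * claims to hold catalog rows cannot be: it must at least have a
 * nonzero header.  Scanning 8 KB is cheap next to the write itself. */
static bool page_is_all_zero(const unsigned char *page, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (page[i] != 0)
            return false;
    return true;
}
```

This wouldn't catch bit flips from bad memory in general, only the specific zeroed-page signature seen here, which is why deeper checks need knowledge of the table's attributes.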
Re: [HACKERS] -fPIC
On Sun, Sep 11, 2005 at 10:32:51PM -0400, Tom Lane wrote: Greg Stark [EMAIL PROTECTED] writes: ... mixing -fpic and -fPIC libraries is a problem. Is it? I would think having two options would be essentially unworkable if so. The thing is, on i386 it makes no difference; it's only on some architectures that it matters. And it has to do with both the size of the symbol table and the size of the code. Given that you don't know what you need to use until you compile it, if people are compiling all their stuff with -fPIC you can at least be sure that it won't break on other architectures. The new gcc visibility stuff gives you a way of shrinking the symbol table and improving performance. There is a performance difference between -fpic and -fPIC, whether it's big enough to care about... You can shrink the symbol table with ld's --version-script; you provide a script like:

  {
    global:
      pg_finfo_*;
      /* other exported symbols */
    local: *;
  };

Whether it's enough... For people who want to know the gory details, read this (by Ulrich Drepper). http://people.redhat.com/drepper/dsohowto.pdf -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
Re: [HACKERS] Hard drive failure leads to corrupt db
Your problem is that your database was initialised with locale 'en_US.ISO8859-1' but your system no longer recognises it. You need to create the locale somehow. On Linux it's /etc/locale.gen but you should probably search the locale manpage for how to do it on Solaris. Changing the locale requires a pg_dump and restore... Hope this helps, On Tue, Sep 13, 2005 at 08:49:23AM -0400, Brusser, Michael wrote: Just occurred to me: perhaps we don't have a database corruption, instead after replacement of the boot drive the locale on the host changed from en_US.ISO8859-1 to 'C' Still I am not sure what to do. Is changing the locale back to en_US.ISO8859-1 the right thing to do now? Mike. -Original Message- Our customer reported a problem resulting from the hard drive failure. Database server would not start, generating this message: PANIC: The database cluster was initialized with LC_COLLATE 'en_US.ISO8859-1', which is not recognized by setlocale(). It looks like you need to initdb. They are running v.7.3.2 on Solaris, where all locale parameters are set to C. Is there anything we can do to restore the database without data loss? Thank you, Michael. -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
Re: [HACKERS] Spinlocks, yet again: analysis and proposed patches
On Tue, Sep 13, 2005 at 08:14:47PM -0400, Stephen Frost wrote: I suppose another option would be to provide separate packages... Could this be done as a shared library so it's more 'plug-and-play' to switch between the two? I dunno, just trying to think about how to deal with this without making it suck terribly bad for me (as a Debian user and maintainer). Note that the Linux kernel has played with moving spinlock code out of line. Due to the effect of having so many, it ended up that the memory saved by moving the code out of line actually benefitted overall. An unconditional call to a function can't be that expensive, surely. However, it would *have* to be in the same compile unit, no shared libraries. ELF imposes some overhead for calls in different compile units, and using function pointers won't save you (see Ulrich Drepper's paper on it). However, to make it flexible you would need a pointer to a function pointer. Fortunately these variables won't change often, so the function pointer and the function itself should be in the cache if used often enough. Finally, the kernel solves the problem by saying: if you compile for uniprocessor, optimize the code away, since a lot of the issues don't apply. My main concern is how you even detect the number of processors in a portable way... Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
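On the processor-detection point: there is no fully portable answer, but a widespread approximation is sysconf(). Note the hedge in the comment — _SC_NPROCESSORS_ONLN is a glibc/BSD extension rather than a POSIX guarantee, which is exactly the portability worry.

```c
#include <assert.h>
#include <unistd.h>

/* Count online CPUs where the platform supports it.
 * _SC_NPROCESSORS_ONLN is not mandated by POSIX, so sysconf() may
 * return -1; in that case assume a uniprocessor, which at worst
 * picks the conservative spinlock strategy. */
static long online_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return (n > 0) ? n : 1;
}
```

Falling back to 1 is the safe default here: a multiprocessor treated as a uniprocessor merely loses an optimisation, whereas the reverse could choose spinlock behaviour that is wrong for the machine.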
Re: [HACKERS] Constraint Type Coercion issue?
On Wed, Sep 14, 2005 at 02:23:29PM -0400, Tom Lane wrote: I've been thinking about this off and on, and would like to solve it in the 8.2 time frame, but it's not happening for 8.1. At a minimum it'll require some significant changes in our concept of what an operator class is. The half-jelled ideas I have involve inventing [snip] How much discussion has there been on this? I've been working my way through COLLATE support and indexes and realised that what I really want is to allow the comparison functions in operator classes to be three-argument functions: the two things to compare and the collate order. A descending index is really just another collate order, albeit one easily imposed from the outside. Although numbers tend not to have many interesting collate orders, complex numbers do, as obviously do strings. To some extent, collate implies a sort of parameterised operator class... Definitely 8.2 stuff, and it's not simple stuff either... -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
Re: [HACKERS] Constraint Type Coercion issue?
On Wed, Sep 14, 2005 at 05:28:42PM -0400, Tom Lane wrote: To some extent, collate implies a sort of parameterised operator class... Hmm. But an index couldn't support more than one collation order AFAICS. It'd probably make more sense to create operators and an operator class for each collation you want to support; the mapping to a call of a common support function would be embedded inside the operator definition. Otherwise we have to pass around an additional parameter through an awful lot of places... Well yes, but given the number of possible locales, creating one class for each seems excessive. And each class would have to create 5 operators (with underlying functions) and 1 comparator function. Unless you could shortcut something like:

  CREATE OPERATOR CLASS ...
    OPERATOR 1 (text,text,'en_US')
    ...
    FUNCTION 1 mycompare(text,text,'en_US')
    ...
    COLLATE en_us;

Otherwise you end up with lots of functions which have to be created on the fly as the user decides what collate orders he wants. Invoking SQL functions in the btree index create cycle doesn't seem efficient. You would have to come up with new names for the operators each time because the argument types are going to be the same. Although I guess you could call it OPERATOR(en_us), it's not like people are going to use it directly. Maybe it should be that we allow users to specify three-argument operators, and have an extra entry in the operator class which defines the extra argument to pass. It's not easy; like you say, there are a lot of places where an extra argument would need to be passed... Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. 
Re: [HACKERS] Constraint Type Coercion issue?
On Wed, Sep 14, 2005 at 10:42:36PM -0400, Tom Lane wrote: The thing that's still fairly unclear to me is whether the collation information is attached to the operators/functions or to the data. I recall there's been some discussion of sticking collation IDs into individual text Datums, which is a completely different path than what you are positing above. Does the SQL spec mandate one or the other of these approaches? If it does, do we want to believe it? (The more I read of SQL2003, the less impressed I get...) The standard doesn't care, I think. My reading is that the COLLATE status is determined at parse time. From there you can plan however you like. AFAIUI, collate is a parameter/property of fields and domains and affects operators and functions, not the data. It only applies to comparisons, not the output. You could add it as a property to the data. I wrote a module, taggedtypes [1], which basically implemented this. My main issue with it is that for the '<' operator, the same collate property has to be on both arguments or it has to bail. The only reason why you can attach COLLATE to fields and domains is to give a default in case the user doesn't specify anything. But if the COLLATE is given explicitly, it overrides anything. By way of example:

  CREATE TABLE t (a text collate c1, b text collate c2);
  select * from t where a < b;              -- ERROR: Indeterminate collate
  select * from t where a < b COLLATE c3;   -- Order by c3

My worry about adding the collate to the Datum is that your execution tree becomes more complex. The two types, with and without COLLATE data, are not interchangeable and you would have to add or remove them continuously. In the above example, you would have to create an executor node whose sole purpose is to add the collate bit to a and b before passing it to the '<' operator. An index can only support one collation at a time also. I don't think there is an easy way out... 
[1] http://svana.org/kleptog/pgsql/taggedtypes.html
[HACKERS] Does anybody use ORDER BY x USING y?
Hi, PostgreSQL's grammar allows you to specify the operator to sort with in the ORDER BY clause. Various bits of the backend support this feature, yet it appears to be partially undocumented. I can't find it in the ORDER BY [1] section, but there is a paragraph on it under the SELECT documentation [2].

I'm asking because SQL COLLATE support is really doing something similar. I was wondering if, instead of adding something in parallel, we could just replace sortop with collateid. This means all the code relating to pathkeys won't need to change since we still use OIDs for the pathkeys; they're just not operator oids anymore. We can continue to support USING [op] as long as [op] is one of the GT or LT operators in the OPERATOR CLASS. This restriction may exist already; I can't tell. All we lose is the ability to say USING [arbitrary op].

Does anybody use this? Would people object to requiring the operator after USING to be part of an operator class?

Have a nice day,
[1] http://www.postgresql.org/docs/8.0/interactive/queries-order.html
[2] http://www.postgresql.org/docs/8.0/interactive/sql-select.html
Re: [HACKERS] Does anybody use ORDER BY x USING y?
On Sun, Sep 18, 2005 at 12:34:10PM -0700, Josh Berkus wrote:

> > All we lose is the ability to say USING [arbitrary op]. Does anybody use this? Would people object to requiring the operator after USING to be part of an operator class?
> Hmmm ... would this prevent the hackish workaround for case-insensitive sort?

Err, which hackish workaround would that be? The right solution is citext, which creates its own operator class. This doesn't have anything to do with functional indexes either. I've been using Google to find any interesting use of the USING clause but haven't found any yet.

Have a nice day,
Re: [HACKERS] Does anybody use ORDER BY x USING y?
On Sun, Sep 18, 2005 at 04:19:06PM -0400, Andrew Dunstan wrote:

> > Err, which hackish workaround would that be? The right solution is citext, which creates its own operator class. This doesn't have anything to do with functional indexes either.
> Last time I looked it appeared to have significant limitations, and some considerable inefficiencies (e.g., copying the strings and folding them to canonical case on every comparison). I would certainly be extremely wary of just saying that's the solution.

Ok, so citext has its limitations. Case-insensitive sort is hard [1]. My real question was: what was the solution he was referring to using the USING clause?

[1] http://lafstern.org/matt/col2_new.pdf
Re: [HACKERS] Does anybody use ORDER BY x USING y?
On Mon, Sep 19, 2005 at 06:26:10AM +1000, John Hansen wrote:

> I was actually of the impression that that was exactly what it was for: specifying what op(class) to use for the sort in case you wanted to use a non-default opclass for the type, and/or if the less-than operator wasn't called '<'.

That's my thought. However, the code doesn't seem to restrict you to that, so I was wondering if there was any other use out there that we should consider supporting...
Re: [HACKERS] Does anybody use ORDER BY x USING y?
On Sun, Sep 18, 2005 at 11:23:01PM -0400, Tom Lane wrote:

> [snip] class families to relate opclasses for different datatypes. Basically I'd like to solve most of these issues by constructing a new layer atop opclasses, not by deciding that an opclass doesn't convey the full story about the behavior of an index column.

Where I'm currently going is creating a table of COLLATE orders. These collate orders would refer to operator classes but tweak them. For example, things like:

- Sort ascending or descending (descending reverses the bt*cmp test)
- NULLs first or last
- Locale for text types
- etc

They could be declared in the operator class definition, or generated automatically. You could then do things like:

CREATE INDEX ... (field1 COLLATE ascending, field2 COLLATE descending)

for those queries where you want ascending on one column and descending on another. Or perhaps:

CREATE INDEX ... (textfield COLLATE ignore_case)
CREATE INDEX ... (textfield COLLATE locale_us)
CREATE INDEX ... (textfield COLLATE optimise_regex)
CREATE INDEX ... (point COLLATE distance)

However, I can't see how this can relate families of operator classes like you talk about, Tom. ISTM that needs to be dealt with somewhere else, given that it's unrelated to order. This is going way out of spec though...
Re: [HACKERS] Does anybody use ORDER BY x USING y?
On Mon, Sep 19, 2005 at 11:13:05AM +0300, Hannu Krosing wrote:

> > (1) IS NULL is not an indexable operation, so no, not without significant overhaul of the index AM API.
> But we do store NULLs in indexes, so why is it not indexable? This is either an interface bug (not making use of stored info) or a storage bug (wasting space storing unnecessary info).

Err, indexes used to not store NULLs to save space. However, it turns out that SQL UNIQUE has something to say about NULLs in unique columns, so they had to be included. However, the machinery that decides if an index is usable assumes that usable operators have two arguments, and IS NULL isn't really an operator in the PostgreSQL sense and doesn't have two arguments either. *If* that can be fixed, then we can be more flexible. But if it were easy it would have been done long ago...

Have a nice day,
Re: [HACKERS] Does anybody use ORDER BY x USING y?
[much discussion on collation and ordering]

I'm going to take from this discussion that there is no use for the USING clause with operators not in an operator class, and that if this changes we won't be seriously inconveniencing anybody.

Have a nice day,
Re: [HACKERS] passing parameters to CREATE INDEX
On Tue, Sep 20, 2005 at 11:26:26PM +0400, Oleg Bartunov wrote:

> it's desirable to be able to pass parameters to CREATE INDEX for GiST indices. Does the SQL standard have something about that, so we could implement it for 8.2?

As has been pointed out, INDEXes aren't in the SQL spec at all, so you can do just about anything.

> Example from a real life project - the performance of tsearch2 could be greatly improved if we decrease the signature size in gistidx.h, which is currently hardcoded, so one has to compile and install tsearch2 into a different location and use it for a specific database. It's impossible to have different signature lengths for different fts indices because we have no possibility to pass parameters to the CREATE INDEX command.

What syntax were you envisioning? Does this value just need to be passed to GiST at the creation of the index, or does it actually need to be remembered by the backend and passed on each call? At the moment there is some discussion on changes to the index interface, so now is the time to ask for what you want...
Re: [HACKERS] passing parameters to CREATE INDEX
On Wed, Sep 21, 2005 at 08:47:04AM +0300, Hannu Krosing wrote:

> On K, 2005-09-21 at 09:01 +0400, Oleg Bartunov wrote:
> > it'd be nice if parameters could be passed at creation time only and somehow stored, so other functions could retrieve them. It's not convenient but also safe.
> If not changing syntax is essential, then these could be passed by some GUC variables at index create time, then stored. This way one could have as many configurables as one likes.

The only major problem with that is that the parameters won't survive a dump/restore. I don't know enough about what it's needed for to know if that's a problem... So even if an index can store the parameter itself, there would need to be a way for pg_dump to extract it.
Re: [HACKERS] Table Partitioning is in 8.1
On Wed, Sep 21, 2005 at 06:10:15PM +0100, Simon Riggs wrote:

> Table Partitioning is in 8.1. I've just read Peter Eisentraut's presentation to the Dutch gov (very good BTW). On the last page I read that Table Partitioning is a future for PostgreSQL, which is strange because Constraint Exclusion is an 8.1 feature.

I think it depends on how closely you relate Table Partitioning and Constraint Exclusion. While you can use the latter to implement the former, I think people expect a database that supports Table Partitioning to have a CREATE PARTITION command (or something similar).

> Is it possible that the Release Notes do not fully explain the Constraint Exclusion feature? Or is it the consensus that it works but not quite well enough to make a song and dance about yet?

IMHO, I think Constraint Exclusion is more useful in general, and once it is in and tested, future releases will have a partitioning system based upon it.

Have a nice day,
Re: [HACKERS] Releasing memory during External sorting?
On Fri, Sep 23, 2005 at 06:39:35PM +0200, Pailloncy Jean-Gerard wrote:

> > On most platforms it's quite unlikely that any memory would actually get released back to the OS before transaction end, because the memory blocks belonging to the tuplesort context will be intermixed with blocks belonging to other contexts. So I think this is pretty pointless. (If you can't afford to have the sort using all of sort_mem, you've set sort_mem too large, anyway.)
> On OpenBSD 3.8 malloc uses mmap(2) and no longer sbrk. So, as soon as the block is free, it returns to the OS. Access to the freed pointer crashes immediately.

Interesting point. Glibc also uses mmap(), but only for allocations greater than a few KB; otherwise it's a waste of space. I guess you would have to look into the PostgreSQL allocator to see if it doesn't divide the mmap()ed space up between multiple contexts. Large allocations certainly appear to be passed off to malloc(), but I don't think the sort code allocates all its space in one go; it just counts the space allocated by palloc(). So, unless someone changes the tuplesort code to allocate big blocks and use them only for tuples, I think you're going to run into issues with interleaved data, meaning not much to give back to the OS...
Re: [HACKERS] stack depth limit exceeded problem.
On Sat, Sep 24, 2005 at 10:34:42AM +0200, Thomas Hallgren wrote:

> Oliver Jowett wrote:
> > I assume this means you have a single lock serializing requests to the backend?
> Yes, of course. I also make sure that the main thread cannot return until another thread that is servicing a backend request has completed. There's absolutely no way two threads can execute backend code simultaneously.

Ok, I have a question. PostgreSQL uses sigsetjmp/siglongjmp to handle errors in the backend. If you're changing the stack, how do you avoid the siglongjmp jumping back to a different stack? Or do you somehow avoid this problem altogether?

> I thought about that. The drawback is that each and every call must spawn a new thread, no matter how trivial that call might be. If you do a select from a table with 10,000 records and execute a function for each record, you get 20,000 context switches. Avoiding that kind of overhead is one of the motivating factors for keeping the VM in-process.

Well, on Linux at least context switches are quite cheap. However, how does Java handle the possibility that functions never return? Do you wrap each call in a PG_TRY/PG_CATCH to propagate errors? Tricky issues...
Re: [HACKERS] stack depth limit exceeded problem.
On Sat, Sep 24, 2005 at 12:26:58PM +0200, Thomas Hallgren wrote:

> Yes. All backend exceptions are caught in a PG_CATCH and then propagated to Java as a ServerException. If there's no catch in the Java code, they are rethrown by the java_call_handler, this time with the jump buffer that was set up by the backend when it invoked the call_handler. There's also a barrier that will prevent any further calls from the Java code once an exception has been thrown by the backend, unless that call was wrapped in a savepoint construct. A savepoint rollback will unlock the barrier (this is not related to the thread issue of course).

Well, you seem to have dealt with the obvious issues I can see. I imagine you also need to worry about things like signal handling. Is there no way to reserve a stack just for PostgreSQL and switch to that stack, rather than switch threads (although the stack is really the only thing that differentiates threads anyway...)?

Linux has sigaltstack so you can catch the stack overflow signal (and other signals obviously, but that's its main use), but it's not terribly portable. What you really need to do is set the stack_base_ptr every time you execute postgres with a new stack; that preserves existing semantics. Signals are the only way the kernel can pass control unexpectedly, so if you handle those, postgres would never know it's threaded. I do wonder if there are any other assumptions made...

Have a nice day,
Re: [HACKERS] stack depth limit exceeded problem.
On Sat, Sep 24, 2005 at 02:38:35PM +0200, Thomas Hallgren wrote:

> Martijn van Oosterhout wrote:
> > Linux has sigaltstack so you can catch the stack overflow signal (and other signals obviously, but that's its main use), but it's not terribly portable.
> I rely on the signal handler that the JVM uses for page-faults (which a stack overflow generally amounts to) and fpe exceptions, so I know that they will generate Java exceptions in a controlled way (which I in turn translate to elog(ERROR) on the main thread).

Well, actually, what I was thinking is: if someone sends a -INT or -TERM to the backend, which thread will catch it? You have to block it in every thread except the one you want to catch it in, if you want to control it. This means that for any signal handler that PostgreSQL installs, you need to intercept it with a wrapper function to make sure it runs on the right stack.

Actually, while running backend code you're probably fine, since the elog stuff will handle it. But if a signal is received while the JVM is running, the signal handler will get the stack of the JVM. Now, PostgreSQL's signal handlers tend not to do much, so you may be safe. They tend not to throw errors, but who knows... Still, this is all solvable I think...
[HACKERS] \x output blowing up
Hi, On CVS tip, whenever I do \x output, it dies with an internal error in glibc on free(). If you run it under valgrind, it complains about these lines of code:

700    {
701        char *my_cell = pg_local_malloc(cell_w[i] + 1);
702
703        strcpy(my_cell, *ptr);                 /* [Invalid write of 1 byte] */
704        if (opt_align[i % col_count] == 'r' && opt_numeric_locale)
705            format_numeric_locale(my_cell);
706        if (opt_border < 2)
707            fprintf(fout, "%s\n", my_cell);
708        else
709            fprintf(fout, "%-s%*s |\n", my_cell, dwidth - cell_w[i], "");  /* [Invalid read of 1 byte] */
710        free(my_cell);
711    }

Now, apart from the fact that the cell width != strlen() in multibyte encodings, there must be something else, because this is just "select * from pg_proc" and there are no multibyte characters there AFAIK. I can't see it though.

Hope this helps,
Re: [HACKERS] \x output blowing up
On Sat, Sep 24, 2005 at 11:45:08PM +0200, Martijn van Oosterhout wrote:

> Hi, On CVS tip, whenever I do \x output, it dies with an internal error in glibc on free(). If you run it under valgrind, it complains about these lines of code: [snip]

Ok, I worked out the direct cause: pg_wcswidth only returns the length up to the first newline, and the line it breaks on is the multiline definition in _pg_expandarray. The quick fix should be to only allocate memory if it's going to call format_numeric_locale(), since then you know it's a number. It makes the code slightly more convoluted, but it should also be slightly more efficient.

I actually have a working psql that handles and displays newlines properly, but it's too late for 8.1. It fixes all these issues properly.

Have a nice day,
Re: [HACKERS] \x output blowing up
On Sat, Sep 24, 2005 at 07:18:16PM -0400, Bruce Momjian wrote:

> Well, it seems we are going to have to fix it somehow for 8.1. It is not crashing here so I can't work up a patch. Can you submit a minimal fix for 8.1? Thanks.

Ah, it would only happen if your encoding was UTF-8, since that's the only case psql handles differently. I've attached a patch which fixes it. With a bit more rearrangement you could probably simplify it a bit, but this works.

Have a nice day,

Index: print.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/bin/psql/print.c,v
retrieving revision 1.74
diff -c -r1.74 print.c
*** print.c	24 Sep 2005 17:53:27 -0000	1.74
--- print.c	25 Sep 2005 09:16:01 -0000
***************
*** 697,714 ****
  			else
  				fputs(" ", fout);
  
  			{
  				char	   *my_cell = pg_local_malloc(cell_w[i] + 1);
  
  				strcpy(my_cell, *ptr);
! 				if (opt_align[i % col_count] == 'r' && opt_numeric_locale)
! 					format_numeric_locale(my_cell);
  				if (opt_border < 2)
  					fprintf(fout, "%s\n", my_cell);
  				else
  					fprintf(fout, "%-s%*s |\n", my_cell, dwidth - cell_w[i], "");
  				free(my_cell);
  			}
  		}
  
  		if (opt_border == 2)
--- 697,726 ----
  			else
  				fputs(" ", fout);
  
+ 			/* We separate the cases for numeric_locale because for UTF-8 (and
+ 			 * other) encodings, cell_w != strlen(*ptr). We can't just use
+ 			 * strlen() in the malloc because there needs to be enough room for
+ 			 * format_numeric_locale() to store its result. */
+ 
+ 			if (opt_align[i % col_count] == 'r' && opt_numeric_locale)
  			{
  				char	   *my_cell = pg_local_malloc(cell_w[i] + 1);
  
  				strcpy(my_cell, *ptr);
! 				format_numeric_locale(my_cell);
  				if (opt_border < 2)
  					fprintf(fout, "%s\n", my_cell);
  				else
  					fprintf(fout, "%-s%*s |\n", my_cell, dwidth - cell_w[i], "");
  				free(my_cell);
  			}
+ 			else
+ 			{
+ 				if (opt_border < 2)
+ 					fprintf(fout, "%s\n", *ptr);
+ 				else
+ 					fprintf(fout, "%-s%*s |\n", *ptr, dwidth - cell_w[i], "");
+ 			}
  		}
  
  		if (opt_border == 2)
[HACKERS] Questions about proper newline handling in psql output
Hi, I basically have a functional version for aligned output; examples at the bottom of this email. It handles multiline data values and multiline headers. However, there are some areas where I could use some input.

1. To be able to control the spacing, psql now has to be very careful about its output, e.g. \r is printed as \r, ASCII control characters are output \x00 style and other control chars \u style. This is a change from previous behaviour, yet you're pretty much forced into it if you want to control the output. Is this change acceptable?

2. Currently I've changed the aligned outputs but not the unaligned ones. Given you're not worrying about alignment there anyway, why do the work? Also, we recommend unaligned output for script users, so I don't feel right changing it. Is this distinction acceptable?

3. How to show that a value is continued? As you can see below, I use ':' before columns that have continued data. This obviously doesn't work for the first column if there's no outer border. If your border style is 0 you're totally out of luck. I remember a discussion on this before but couldn't find it in the archives. Either a reference or some other hints would be appreciated.

4. Some system output like pg_views has really, really long strings; would it be acceptable to change the output there to add newlines at various places to make it display more nicely with this change?

5. Auto string folding. If a string is really long, fold it so it fits in a screen width, perhaps with '\' continuation. I haven't done this, but I can imagine some people (including me) would love it.

6. Currently I've implemented support for UTF-8 and all ASCII-compatible single-byte encodings. Given that psql didn't support the others anyway maybe no-one cares, but I have to ask: does anyone care? If so, I need info on *how* to support an encoding.

Thanks for your attention. See you tomorrow.

Query is: select oid, prosrc as "HdrLine1
HdrLine2", proacl from pg_proc limit 1;

Border style is 1.
oid | HdrLine1 | proacl | HdrLine2 | ---+-+ 17009 | select 1 union all select 2 union all select 3 union all| : select 4 union all select 5 union all select 6 union all : select 7 union all select 8 union all select 9 union all : select 10 union all select 11 union all select 12 union all : select 13 union all select 14 union all select 15 union all : select 16 union all select 17 union all select 18 union all : select 19 union all select 20 union all select 21 union all : select 22 union all select 23 union all select 24 union all : select 25 union all select 26 union all select 27 union all : select 28 union all select 29 union all select 30 union all : select 31 union all select 32 (1 row) Expanded display is on. -[ RECORD 1 ]- oid | 17009 HdrLine1 | select 1 union all select 2 union all select 3 union all HdrLine2 : select 4 union all select 5 union all select 6 union all : select 7 union all select 8 union all select 9 union all : select 10 union all select 11 union all select 12 union all : select 13 union all select 14 union all select 15 union all : select 16 union all select 17 union all select 18 union all : select 19 union all select 20 union all select 21 union all : select 22 union all select 23 union all select 24 union all : select 25 union all select 26 union all select 27 union all : select 28 union all select 29 union all select 30 union all : select 31 union all select 32 proacl | # select chr(8); chr -- \x08 (1 row)
Re: [HACKERS] expected authentication request from server, but received...
On Mon, Sep 26, 2005 at 09:59:53AM +0200, Michał Jęczalik wrote:

> Connection with psql works OK. Connecting from different (older) hosts with DBI/PHP works perfectly as well. Something must have changed in recent versions of libpq, and now it doesn't work. I looked into the source code and found that the server's message at the very start of the conversation appears to be something else than what libpq expects. But how to fix it?! Has the protocol changed or what?

Umm, I think the protocol version is up to 3 or 4 now. I think libpq supports all the way back to 7.0; I don't know if anyone tests earlier than that. If you really need this to work, I suggest pulling the libpq from that release (CVS or tarball) or slightly later and installing that on the new machines. But you should upgrade; 6.4 hasn't had support for years, who knows how many bugs...
Re: [HACKERS] Questions about proper newline handling in psql output
On Tue, Sep 27, 2005 at 03:02:29PM +0200, Peter Eisentraut wrote:

> I don't think this is necessary. If you put control characters into your text, then you clearly don't care about aligned output, so you don't get any. About \r, I think that needs to be figured into the alignment calculation, to work well on Windows and Mac.

Good point. On UNIX, however, outputting a \r will jump back to the beginning of the line, overwriting whatever was there. But you raise a good question: should the output of psql be console specific? I.e. if you insert a value with Unix newlines, then on a Mac the output won't have linebreaks. However, part of this discussion was because we were considering changing the output of \df to display more info using multiple lines. And if so, we want it to display consistently on all platforms, right?

My argument for the other control characters is: given that the work to make this work for \r, \t, and \n needs all this anyway, why not just fix it for *all* control characters in one go and be completely rid of the problem, for now and forever? If someone embeds control characters that change the title of your xterm, change the font, clear the screen, etc., should psql just blat that out? I realise it does that now and that it's not a strong argument, but since we're here already...

> > 3. How to show that a value is continued? As you can see below I use ':' before columns that have continued data. This obviously doesn't work for the first column if there's no outer border. If your border style is 0 you're totally out of luck.
> I think you need to keep the normal delimiter and need some extra mark within the table cells. (Think about how it would have to look in an HTML table.)

Umm, I wasn't thinking of changing the HTML output at all; it doesn't need it, since whatever displays the HTML will take care of alignment. Same for troff and CSV. Straight aligned text output is the only one we care about AFAICS (and the only one that requires work to make it happen).
Have a nice day,
Re: [HACKERS] Questions about proper newline handling in psql output
On Tue, Sep 27, 2005 at 12:12:15AM -0000, Greg Sabino Mullane wrote:

> > 4. Some system output like pg_views has really, really long strings; would it be acceptable to change the output there to add newlines at various places to make it display more nicely with this change?
> I'd say no, until you propose a more concrete set of rules on how and when.

Well, I was thinking before the keywords SELECT, FROM, WHERE, AND, ORDER BY, GROUP BY and HAVING. For bonus points, indent subqueries also. But I'm not too fussed; it was just a thought. In fact, we already do it for the output of \d for views, so maybe take a hint from there...

> > Query is: select oid, prosrc as "HdrLine1 HdrLine2", proacl from pg_proc limit 1;
> Some smaller samples with a third column (and a large 2nd one) might be nice.

Ok, multiline strings in pg_catalog are thin on the ground; I was hoping to use real data rather than stuff I made up. (Real data has a habit of showing weaknesses far better than things you dream up.) But I'll give it a shot.
Re: [HACKERS] effective SELECT from child tables
On Wed, Sep 28, 2005 at 11:00:46PM +0400, Ilia Kantor wrote:

> Your suggestion is essentially the same as mine... There exists tableoid, pretty much suited to tell between tables in the case of inheritance... I can't see a real need to add a special class classifier to each table... This solution is a workaround. It will work; I just can't make myself love it.

I wonder if it would be possible to tweak the constraint exclusion code so that if it sees something of the form tableoid = X, it excludes the other tables... You know, assume each table has a constraint tableoid = OID. Still, it is a fairly unusual feature.
Re: [HACKERS] [PATCHES] Proposed patch for sequence-renaming problems
On Wed, Sep 28, 2005 at 10:41:12PM -0400, Bruce Momjian wrote: Also, why is the nextval ::text casting output by pg_dump anyway? AFAICS, pg_dump outputs serial (at least in 7.4.7, which is what I have to hand) when it should, meaning that dumps, once restored, will get the new syntax anyway. Or am I missing something? Isn't this what adddepend was for? I can believe there are version dependencies here... -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] Socket problem using beta2 on Windows-XP
On Thu, Sep 29, 2005 at 08:50:30AM +0200, Thomas Hallgren wrote: Hi, I've installed PostgreSQL 8.1-beta2 as a service on my Windows-XP box. It runs fine but I get repeated messages like this in the log: 2005-09-29 00:41:09 FATAL: could not duplicate socket 1880 for use in backend: error code 10038 That's from postmaster.c:write_inheritable_socket(). Error 10038 is WSAENOTSOCK. Very odd, time to get out the debugger? Get a backtrace at least. Hope this helps, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] Found small issue with OUT params
On Fri, Sep 30, 2005 at 10:20:34AM -0500, Tony Caduto wrote: Tom, I hardly think the overhead would be significant on modern processors, I don't think the majority of users are running on Pentium 90s. (I am assuming you mean a performance overhead.) Um, please read the documentation. Returning a tuple is *significantly* more expensive than returning a single value. You have to get the tuple descriptor, allocate memory for the tuple, fill in all the fields with your data... For a single value you just return it. See here for all the details; you really don't want to do it if you don't need to. http://www.postgresql.org/docs/8.0/interactive/xfunc-c.html#AEN30497 Now, you could fudge the parser to automatically alter the name of the value in the function, but I have no idea how hard that would be... -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] effective SELECT from child tables
On Fri, Sep 30, 2005 at 06:30:10PM -0500, Jim C. Nasby wrote: On Wed, Sep 28, 2005 at 07:25:46PM +0100, Simon Riggs wrote: Include the Discriminator as a column in A and it will be inherited by all A1, A2, A3. e.g. concrete_class char(1) not null snip This will add 1 byte per row in your superclass... and requires no I thought char was actually stored variable-length...? I know there's a type that actually acts like char does on most databases, but I can't remember what it is off-hand (it should be mentioned in section 8.3 of the docs...) IIRC, this is the difference between "char" and char(1). The latter is variable length and can store any character in the current encoding, hence the variable length. "char", on the other hand, is a one-byte (presumably ASCII) character. It's used mainly in the system catalogs... -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] effective SELECT from child tables
On Sat, Oct 01, 2005 at 10:05:22AM -0400, [EMAIL PROTECTED] wrote: It has the 'side or additional benefit' being requested here. The ability to filter the child table by some attribute. For example, if the child tables are used for partitioning, and the attribute were to keep a date range, the field restriction optimization could be used to automatically determine the set of tables to use for the date range specified. With such a change, it would even work automatically if the date ranges overlapped for some reason. Selecting a table name by date is hacky. This sort of solution would be a general solution to the problem. This is what Constraint Exclusion does. It uses CHECK constraints on a table to filter out tables that obviously don't apply to a query. It's just that the specific case of tableoid = XXX is not supported right now. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] [PERFORM] A Better External Sort?
On Sat, Oct 01, 2005 at 10:22:40AM -0400, Ron Peacetree wrote: Assuming we get the abysmal physical IO performance fixed... (because until we do, _nothing_ is going to help us as much) I'm still not convinced this is the major problem. For example, in my totally unscientific tests on an oldish machine I have here:

Direct filesystem copy to /dev/null: 21MB/s, 10% user, 50% system (dual CPU, so the system is using a whole CPU)
COPY TO /dev/null WITH binary: 13MB/s, 55% user, 45% system (ergo, CPU bound)
COPY TO /dev/null: 4.4MB/s, 60% user, 40% system
\copy to /dev/null in psql: 6.5MB/s, 60% user, 40% system

This machine has a somewhat strange setup; I'm not sure why fs copy is so slow. As to why \copy is faster than COPY, I have no idea, but it is repeatable. And actually turning the tuples into a printable format is the most expensive part. But it does point out that the whole process is probably CPU bound more than anything else. So, I don't think physical I/O is the problem. It's something further up the call tree. I wouldn't be surprised at all if it had to do with the creation and destruction of tuples. The cost of comparing tuples should not be underestimated. -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] effective SELECT from child tables
Hmm, I'm trying to understand here. If every row in log_8 should have the same project_id, couldn't this be achieved by having each row in log_other contain the tableoid of the table it refers to? Then a join will return the info you're looking for. Or am I missing something? On Sat, Oct 01, 2005 at 10:57:27AM -0500, Jim C. Nasby wrote: To clarify, this is a hard-coded implementation of what I'm asking for: http://cvs.distributed.net/viewcvs.cgi/stats-sql/logdb/ In a nutshell:

CREATE TABLE log_other ( project_id smallint NOT NULL ... )
CREATE TABLE log_8 ( -- No project_id ... )
CREATE TABLE log_24, log_25, log_5...
CREATE VIEW log AS SELECT * FROM log_other UNION ALL SELECT 8 AS project_id, * FROM log_8 ...

So the end result is that for cases where project_id is 5, 8, 24, or 25, the data will be stored in tables that don't have the project_id. -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] [PERFORM] A Better External Sort?
[removed -performance, not subscribed] On Sat, Oct 01, 2005 at 01:42:32PM -0400, Ron Peacetree wrote: You have not said anything about what HW, OS version, and pg version used here, but even at that can't you see that something Smells Wrong? Somewhat old machine running 7.3 on Linux 2.4. Not exactly speed daemons, but it's still true that the whole process would be CPU bound *even* if the O/S could idle while it's waiting. PostgreSQL used a *whole CPU*, which is its limit. My point is that trying to reduce I/O by increasing CPU usage is not going to be beneficial; we need CPU usage down also. Anyway, to bring some real info, I just profiled PostgreSQL 8.1beta doing an index create on a 2960296 row table (3 columns, table size 317MB). The number 1 bottleneck, with 41% of user time, is comparetup_index. It was called 95,369,361 times (about 2*ln(N)*N). It used 3 tapes. Another 15% of time went to tuplesort_heap_siftup. The thing is, I can't see anything in comparetup_index() that could take much time. The actual comparisons are accounted elsewhere (inlineApplySortFunction), which amounted to 10% of total time. Since nocache_index_getattr doesn't feature, I can't imagine index_getattr being a big bottleneck. Any ideas what's going on here? Other interesting features: - ~4 memory allocations per tuple, nearly all of which were explicitly freed - Things I thought would be expensive, like heapgettup and myFunctionCall2, didn't really count for much.
Have a nice weekend,

  %    cumulative    self                self    total
 time   seconds    seconds     calls   s/call   s/call  name
 43.63   277.81     277.81  95370055     0.00     0.00  comparetup_index
 16.24   381.24     103.43   5920592     0.00     0.00  tuplesort_heap_siftup
  3.76   405.17      23.93  95370055     0.00     0.00  inlineApplySortFunction
  3.18   425.42      20.26  95370056     0.00     0.00  btint4cmp
  2.82   443.37      17.95  11856219     0.00     0.00  AllocSetAlloc
  2.52   459.44      16.07  95370055     0.00     0.00  myFunctionCall2
  1.71   470.35      10.91   2960305     0.00     0.00  heapgettup
  1.26   478.38       8.03  11841204     0.00     0.00  GetMemoryChunkSpace
  1.14   485.67       7.29   5920592     0.00     0.00  tuplesort_heap_insert
  1.11   492.71       7.04   2960310     0.00     0.00  index_form_tuple
  1.09   499.67       6.96  11855105     0.00     0.00  AllocSetFree
  0.97   505.83       6.17  23711355     0.00     0.00  AllocSetFreeIndex
  0.84   511.19       5.36   5920596     0.00     0.00  LogicalTapeWrite
  0.84   516.51       5.33   2960314     0.00     0.00  slot_deform_tuple

-- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] [PERFORM] A Better External Sort?
On Sat, Oct 01, 2005 at 11:26:07PM -0400, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Anyway, to bring some real info I just profiled PostgreSQL 8.1beta doing an index create on a 2960296 row table (3 columns, table size 317MB). 3 columns in the index you mean? What were the column datatypes? Any null values? Nope, three columns in the table, one column in the index, no nulls. The indexed column was integer. I did it once with around 6500 values repeated over and over, lots of duplicate keys. And once on a serial column, but it made no discernible difference either way. Although the comparison function was called less (only 76 million times), presumably because it was mostly sorted already. The number 1 bottleneck with 41% of user time is comparetup_index. ... The thing is, I can't see anything in comparetup_index() that could take much time. The index_getattr and heap_getattr macros can be annoyingly expensive. And yet they are optimised for the common case. nocache_index_getattr was only called 7 times, which is about what you expect. I'm getting annotated output now, to determine which line takes the time... Actually, my previous profile overstated stuff a bit. Profiling turned off optimisation, so I put it back; you get better results but the order doesn't change much. By-line results are below. The top two are the index_getattr calls in comparetup_index. Third and fourth are the HEAPCOMPAREs in tuplesort_heap_siftup. Then comes the inlineApplySortFunction call (which isn't being inlined, despite suggesting it should be; -Winline warns about this). Looks to me that there are no real gains to be made in this function. What is needed is an algorithmic change to call this function less often...
Have a nice weekend,

  %    cumulative    self                self     total
 time   seconds    seconds     calls  ms/call  ms/call  name
  9.40    22.56      22.56                             comparetup_index (tuplesort.c:2042 @ 8251060)
  5.07    34.73      12.17                             comparetup_index (tuplesort.c:2043 @ 82510c0)
  4.73    46.09      11.36                             tuplesort_heap_siftup (tuplesort.c:1648 @ 825074d)
  3.48    54.45       8.36                             tuplesort_heap_siftup (tuplesort.c:1661 @ 82507a9)
  2.80    61.18       6.73                             comparetup_index (tuplesort.c:2102 @ 8251201)
  2.68    67.62       6.44                             comparetup_index (tuplesort.c:2048 @ 8251120)
  2.16    72.82       5.20                             tuplesort_heap_siftup (tuplesort.c:1652 @ 825076d)
  1.88    77.34       4.52  76025782     0.00     0.00  comparetup_index (tuplesort.c:2016 @ 8251010)
  1.82    81.70       4.36  76025782     0.00     0.00  inlineApplySortFunction (tuplesort.c:1833 @ 8251800)
  1.73    85.85       4.15                             readtup_heap (tuplesort.c:2000 @ 8250fd8)
  1.67    89.86       4.01                             AllocSetAlloc (aset.c:568 @ 824bec0)
  1.61    93.72       3.86                             comparetup_index (tuplesort.c:2025 @ 825102f)
  1.47    97.25       3.53  76025785     0.00     0.00  btint4cmp (nbtcompare.c:74 @ 80924a0)
  1.11    99.92       2.67                             readtup_datum (tuplesort.c:2224 @ 82517c4)
  1.10   102.55       2.64                             comparetup_index (tuplesort.c:2103 @ 82511e7)

  %    cumulative    self                self    total
 time   seconds    seconds     calls   s/call   s/call  name
 28.34    68.01      68.01  76025782     0.00     0.00  comparetup_index
 13.56   100.54      32.53   7148934     0.00     0.00  tuplesort_heap_siftup
  8.66   121.33      20.79  76025782     0.00     0.00  inlineApplySortFunction
  4.43   131.96      10.63  13084567     0.00     0.00  AllocSetAlloc
  3.73   140.90       8.94  76025785     0.00     0.00  btint4cmp
  2.15   146.07       5.17   6095625     0.00     0.00  LWLockAcquire
  2.02   150.92       4.85   2960305     0.00     0.00  heapgettup
  1.98   155.66       4.74   7148934     0.00     0.00  tuplesort_heap_insert
  1.78   159.94       4.28   2960312     0.00     0.00  slot_deform_tuple
  1.73   164.09       4.15                             readtup_heap
  1.67   168.09       4.00   6095642     0.00     0.00  LWLockRelease
  1.53   171.76       3.68   2960308     0.00     0.00  index_form_tuple
  1.44   175.21       3.45  13083442     0.00     0.00  AllocSetFree
  1.28   178.28       3.07   8377285     0.00     0.00  LogicalTapeWrite
  1.25   181.29       3.01   8377285     0.00     0.00  LogicalTapeRead
  1.11   183.96       2.67                             readtup_datum
  1.06   186.51       2.55         1     2.55   123.54  IndexBuildHeapScan

-- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] effective SELECT from child tables
On Sun, Oct 02, 2005 at 03:57:37PM +0300, Hannu Krosing wrote: On P, 2005-10-02 at 01:24 -0400, Greg Stark wrote: Of course that only works if the reason they want to fill the rows with the default value isn't precisely because NULL is a perfectly reasonable thing for the column to have (but not what they want for the existing rows). Sure. What would be needed for adding new columns with default filling would be some end-of-tuple marker, or stored column count, or tuple version nr, and then a rule (or just default behaviour) of showing the default value for *missing* columns (colno > nr of stored columns). Actually, PostgreSQL does know the number of columns in a tuple. It would be possible to change heap_getattr to return the default value. However, from a semantic point of view, it would be a bit strange. If you added a column, updated some rows, then set a default, that default might end up applying to every row except the ones you already modified. With careful coding you may be able to get around this. However, a good argument can be made that setting the DEFAULT for a column shouldn't change data anywhere. What about if I want to change the default for new values but not for old ones? That wouldn't work if the database starts adding values randomly, depending on when they are read... Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
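[Editor's note: the mechanism Hannu sketches — a per-tuple stored column count, with a fixed add-time default substituted for columns the tuple doesn't physically contain — can be illustrated in miniature. Everything below (MiniTuple, MiniDesc, get_attr_or_default) is a hypothetical sketch of the idea, not PostgreSQL code.]

```c
/* Hypothetical miniature of a tuple that records how many attributes it
 * physically stores; the descriptor may know about more (added later). */
typedef struct MiniTuple {
    int natts_stored;    /* attributes present in this on-disk tuple */
    int values[8];       /* attribute values actually stored */
} MiniTuple;

typedef struct MiniDesc {
    int natts;           /* attributes the table currently has */
    int defaults[8];     /* fixed "add-time" default for each attribute */
} MiniDesc;

/* Return the stored value, or the column's add-time default when the
 * tuple predates the column - the behaviour sketched in the mail. */
int get_attr_or_default(const MiniTuple *tup, const MiniDesc *desc, int attno)
{
    if (attno < tup->natts_stored)
        return tup->values[attno];
    return desc->defaults[attno];
}
```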
Re: [HACKERS] Socket problem using beta2 on Windows-XP
On Sun, Oct 02, 2005 at 12:20:05PM +0200, Thomas Hallgren wrote: I added some traces to the code. I know that the following happens when I start a postmaster. snip In the second iteration of ServerLoop, pgstat_forkexec will again call internal_forkexec. This time it fails. According to the log it fails on the line: write_inheritable_socket(&param->pgStatSock, pgStatSock, childPid); Well, pgStatSock is the only SOCK_DGRAM socket, all the others are SOCK_STREAM; maybe that's the difference? It's also connected to itself, although for DGRAM sockets that's not that special. The documentation isn't totally clear about this. The error thrown should terminate the process, yet it obviously doesn't. Very odd. Any Windows programmers with ideas? -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] [PERFORM] A Better External Sort?
Ok, I tried two optimisations: 1. By creating a special version of comparetup_index for single key integer indexes. Create an index_get_attr with byval and len args. By using fetch_att and specifying the values at compile time, gcc optimises the whole call to about 12 instructions of assembly rather than the usual mess. 2. By specifying: -Winline -finline-limit=1500 (only on tuplesort.c). This causes inlineApplySortFunction() to be inlined, like the code obviously expects it to be.

default build (baseline)                   235 seconds
-finline only                              217 seconds (7% better)
comparetup_index_fastbyval4 only           221 seconds (6% better)
comparetup_index_fastbyval4 and -finline   203 seconds (13.5% better)

This is indexing the integer sequence column on a 2.7 million row table. The times are as given by gprof and so exclude system call time. Basically, I recommend adding -Winline -finline-limit=1500 to the default build while we discuss other options. comparetup_index_fastbyval4 patch attached for example. -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/

--- pgsql-clean/src/include/access/itup.h	2005-10-02 21:30:20.327464320 +0200
+++ pgsql-sort/src/include/access/itup.h	2005-10-02 16:04:00.0 +0200
@@ -126,6 +126,34 @@
 	) \
 )

+#define index_get_attr(tup, attnum, tupleDesc, attbyval, attlen, isnull) \
+( \
+	AssertMacro(PointerIsValid(isnull) && (attnum) > 0), \
+	*(isnull) = false, \
+	!IndexTupleHasNulls(tup) ? \
+	( \
+		(tupleDesc)->attrs[(attnum)-1]->attcacheoff >= 0 ? \
+		( \
+			fetch_att((char *) (tup) + IndexInfoFindDataOffset((tup)->t_info) \
+			+ (tupleDesc)->attrs[(attnum)-1]->attcacheoff, attbyval, attlen) \
+		) \
+		: \
+			nocache_index_getattr((tup), (attnum), (tupleDesc), (isnull)) \
+	) \
+	: \
+	( \
+		(att_isnull((attnum)-1, (char *)(tup) + sizeof(IndexTupleData))) ? \
+		( \
+			*(isnull) = true, \
+			(Datum)NULL \
+		) \
+		: \
+		( \
+			nocache_index_getattr((tup), (attnum), (tupleDesc), (isnull)) \
+		) \
+	) \
+)
+
 /* routines in indextuple.c */
 extern IndexTuple index_form_tuple(TupleDesc tupleDescriptor,

--- pgsql-clean/src/backend/utils/sort/tuplesort.c	2005-09-24 23:23:39.0 +0200
+++ pgsql-sort/src/backend/utils/sort/tuplesort.c	2005-10-02 21:29:39.349086302 +0200
@@ -375,6 +375,8 @@
 				unsigned int len);
 static int	comparetup_index(Tuplesortstate *state, const void *a, const void *b);
+static int	comparetup_index_fastbyval4(Tuplesortstate *state, const void *a, const void *b);
 static void *copytup_index(Tuplesortstate *state, void *tup);
 static void writetup_index(Tuplesortstate *state, int tapenum, void *tup);
 static void *readtup_index(Tuplesortstate *state, int tapenum,
@@ -498,8 +500,12 @@
 	int workMem, bool randomAccess)
 {
 	Tuplesortstate *state = tuplesort_begin_common(workMem, randomAccess);
+	TupleDesc	tupDes = RelationGetDescr(indexRel);

-	state->comparetup = comparetup_index;
+	if (tupDes->natts == 1 && tupDes->attrs[0]->attbyval == 1 && tupDes->attrs[0]->attlen == 4)
+		state->comparetup = comparetup_index_fastbyval4;
+	else
+		state->comparetup = comparetup_index;
 	state->copytup = copytup_index;
 	state->writetup = writetup_index;
 	state->readtup = readtup_index;
@@ -2102,6 +2108,92 @@
 	return 0;
 }

+static int
+comparetup_index_fastbyval4(Tuplesortstate *state, const void *a, const void *b)
+{
+	/*
+	 * This is almost the same as _bt_tuplecompare(), but we need to keep
+	 * track of whether any null fields are present. Also see the special
+	 * treatment for equal keys at the end.
+	 */
+	IndexTuple	tuple1 = (IndexTuple) a;
+	IndexTuple	tuple2 = (IndexTuple) b;
+	Relation	rel = state->indexRel;
+	ScanKey		scankey = state->indexScanKey;
+	TupleDesc	tupDes;
+	bool		equal_hasnull = false;
+
+	tupDes = RelationGetDescr(rel);
+
+	ScanKey		entry = &scankey[0];
+	Datum		datum1,
+				datum2;
+	bool		isnull1,
+				isnull2;
+	int32		compare;
+
+	datum1 = index_get_attr(tuple1, 1, tupDes, 1, 4, &isnull1
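[Editor's note: the general trick behind comparetup_index_fastbyval4 — not the patch itself — is that a comparator whose key type and width are fixed at compile time collapses to a few instructions, while the generic path pays for indirection and descriptor lookups on every call. A standalone sketch of that principle around qsort; all names below are made up for illustration.]

```c
#include <stdlib.h>

/* Generic comparator: goes through an indirect "extract the key" step,
 * standing in for index_getattr + the fmgr call in the real code. */
typedef int (*key_getter)(const void *);
static key_getter extract_key;

static int compare_generic(const void *a, const void *b)
{
    int ka = extract_key(a), kb = extract_key(b);
    return (ka > kb) - (ka < kb);
}

static int get_int(const void *p) { return *(const int *) p; }

/* Specialized comparator: key type and width known at compile time,
 * so the compiler can inline everything into a handful of instructions. */
static int compare_int32_fast(const void *a, const void *b)
{
    int ka = *(const int *) a, kb = *(const int *) b;
    return (ka > kb) - (ka < kb);
}

void sort_generic(int *v, size_t n)
{
    extract_key = get_int;
    qsort(v, n, sizeof(int), compare_generic);
}

void sort_fast(int *v, size_t n)
{
    qsort(v, n, sizeof(int), compare_int32_fast);
}
```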
Re: [HACKERS] effective SELECT from child tables
On Sun, Oct 02, 2005 at 11:51:27PM +0300, Hannu Krosing wrote: Right. Actually the default value returned for missing columns should be different from the default for new values and should be settable only once, when adding the column, else issues would become really really weird. Right, the only way I could really imagine it working is to have a flag, attautodefault, which if set would return the default instead of NULL. It would only ever be set if the ADD COLUMN and SET DEFAULT happened together. But does this mean you have one magic default fixed for the life of the column and the actual default which can be changed anytime? Seems messy, though possible... Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] PG Killed by OOM Condition
On Mon, Oct 03, 2005 at 11:03:06PM +1000, John Hansen wrote: Might it be worthwhile protecting the postmaster from an OOM kill on Linux by setting /proc/{pid}/oom_adj to -17? (Described vaguely in mm/oom_kill.c) Has it actually happened to you? PostgreSQL is pretty good about its memory usage. Besides, it seems to me it should be a system administrator decision. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] Tuning current tuplesort external sort code for 8.2
On Mon, Oct 03, 2005 at 09:35:30PM +0100, Simon Riggs wrote: Based upon profiling of the initial stage of external sorting, it seems that this stage is overall CPU bound, with hotspots in comparetup_* accounting for around 50% of CPU time; let's just call that too much, since your exact experience may vary. Indeed; however, as I pointed out, if you arrange for inlineApplySortFunction() actually to be inlined, you can cut costs, especially in the index creation case. snip values from that could be cached ready for the next call. Caching would reduce the number of *_getattr calls from 2N to N+1, where N is likely to go My profiling indicates that the second getattr is half the cost of the first, gcc optimisation at work. Note that setting CFLAGS=-pg for configure disables optimisation; I missed that the first time. Of course, every call saved is time saved. 2. In comparetup_* the second attr value is always fetched, even when the first attr is null. When the first attr is null the value of the second need never be checked, just whether the second attr is null or not, so the full cost of the *_getattr need not actually be paid at all. The relevance of this is not reduced as a result of the caching suggested in (1). Actually, attribute-is-null is the cheap case, because you only need to check the bitmap. But you could optimise stuff by expanding the *_getattr calls and optimising directly. Possible problem with caching: if you're called by the system qsort, can you assume anything about the order of the comparisons? Please note: if inlineApplySortFunction() is actually inlined (it isn't by default), gcc does get very smart about this and sometimes optimises out the Datum fetches depending on the isNull flags. So we need to check we're actually making an improvement over the compiler. snip is a subset of the PK (a typical one-many relationship) and groupings also. In the majority of cases, these attrs are at the start of a tuple.
The *_getattr macros are particularly poor at handling NULLs. When *_getattr sees *any* NULL is present for a tuple it checks the nullability of all attrs up to the current attrnum before returning using the cached offsets. The macro could be altered so that if the current attrnum < firstNullableAttrnum (which we can set once for the Maybe easier, in the macro use: bitmap & ((1<<attnum)-1) to quickly check that no nulls precede the value we're looking for and hence we can use the fast path anyway. Along the lines of:

#define index_getattr(tup, attnum, tupleDesc, isnull) \
( \
	AssertMacro(PointerIsValid(isnull) && (attnum) > 0), \
	*(isnull) = false, \
	!IndexTupleHasNulls(tup) || ((attnum) < 32 && (NullBitmap(tup) & ((1<<(attnum))-1)) == 0) ? \
	( \
		(tupleDesc)->attrs[(attnum)-1]->attcacheoff >= 0 ? \

Nice ideas though; a separate run just for NULL keys is interesting. If you only have one sort key it becomes a whole tape which doesn't need to be sorted any more, just emitted at the beginning or end. Could be helpful. Mind you, if you start creating separate routines for different cases you can go a long way. Elsewhere on this list I created a special case for single-key integer index columns and got an 8% speed increase. Not exactly a viable solution though. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
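[Editor's note: the bitmap shortcut suggested above boils down to a single mask test: if no attribute before the one being fetched is null, the cached-offset fast path is still safe. A standalone sketch, using the mail's convention that a set bit means "this attribute is null" — the opposite of PostgreSQL's on-disk t_bits sense.]

```c
#include <stdint.h>
#include <stdbool.h>

/* Bitmap convention as in the mail's sketch: bit i set means attribute i
 * (0-based) is NULL. (PostgreSQL's real t_bits uses the opposite sense.) */
static bool no_nulls_before(uint32_t null_bitmap, int attnum)
{
    /* ((1 << attnum) - 1) masks bits 0 .. attnum-1, i.e. the attributes
     * preceding the one we want. If none of them is null, cached offsets
     * are still valid and the *_getattr fast path can be taken. */
    return attnum < 32 && (null_bitmap & ((1u << attnum) - 1)) == 0;
}
```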
Re: [HACKERS] PG Killed by OOM Condition
On Mon, Oct 03, 2005 at 11:47:57PM -0700, Jeff Davis wrote: I think that I've run into the OOM killer without a fork() being involved, but I could be wrong. Is it possible to be hit by the OOM killer if no applications use fork()? fork() is the obvious overcommitter. If Netscape wants to spawn a new process, it first has to fork 50MB of memory, then free probably most of it because it execs some little plugin. Processes that mmap() a large block and then don't use it until later. Similar idea with brk(). If you run out of swap at the wrong moment... Recent versions are more clever about whom to kill. Sometimes you just get unlucky... It's always killed the right process for me (a Mozilla derivative leaking masses of memory over a long period). -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] Tuning current tuplesort external sort code for 8.2
On Tue, Oct 04, 2005 at 12:37:51AM +0100, Simon Riggs wrote: On Mon, 2005-10-03 at 23:25 +0200, Martijn van Oosterhout wrote: Please note: if inlineApplySortFunction() is actually inlined (it isn't by default) Can you explain your last post some more? That's not what I get. The inline keyword is just a flag that you would like the compiler to inline it. GCC will decide itself. It has a limit on the size of functions that it will consider for inlining. To quote the gcc manual: -finline-limit=n: By default, gcc limits the size of functions that can be inlined. This flag allows the control of this limit for functions that are explicitly marked as inline (i.e. marked with the inline keyword or defined within the class definition in C++). n is the size of functions that can be inlined in number of pseudo instructions (not counting parameter handling). It goes on to say that the limit is 10000 for gcc 2.95, but if you examine the manual for gcc 3.3 it has the limit at 600. So it's entirely possible that at the time the person wrote that code, it *was* being inlined, but it sure isn't on some versions of some compilers. I experimented and found that -finline-limit=1500 causes it to start inlining. The -Winline flag causes gcc to print a warning if you specified inline but gcc didn't inline it, for whatever reason. Hopefully this clears that up. -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] [PERFORM] A Better External Sort?
On Mon, Oct 03, 2005 at 10:51:32PM +0100, Simon Riggs wrote: Basically, I recommend adding -Winline -finline-limit=1500 to the default build while we discuss other options. I add -Winline but get no warnings. Why would I use -finline-limit=1500? I'm interested, but uncertain as to what difference this makes. Surely using -O3 works fine? Different versions of gcc have different ideas of when a function can be inlined. From my reading of the documentation, this decision is independent of optimisation level. Maybe your gcc version has a limit higher than 1500 by default. -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] Tuning current tuplesort external sort code for 8.2
On Tue, Oct 04, 2005 at 11:55:58AM +0200, Martijn van Oosterhout wrote: It goes on to say that the limit is 10000 for gcc 2.95, but if you examine the manual for gcc 3.3 it has the limit at 600. So it's entirely possible that at the time the person wrote that code, it *was* being inlined, but it sure isn't on some versions of some compilers. I experimented and found that -finline-limit=1500 causes it to start inlining. From searching the web, it appears the inline limit was dropped from 10000 to 600 between gcc 3.0.0 and 3.0.1 in response to complaints about gcc memory usage. Any function that could be inlined needed to be kept in memory in semi-compiled form, and in C++, where lots of inlinable functions call each other, the memory usage blew up completely. The difference between -O2 and -O3 is that the latter will consider any function for inlining, even if you didn't ask. For C programs that basically means any function declared static. Also, the number is pseudo-instructions, and their meaning can change from version to version. Since we're pretty much relying on gcc to inline for performance, I still think we should add -Winline by default so we can tell when it's not doing what we want. Hope this helps, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] memory bug debugging
On Tue, Oct 04, 2005 at 01:11:41PM +0200, Markus Schiltknecht wrote:
> Hello hackers, I'm fiddling around with the backend and have created a
> memory bug. I guess I'm overwriting a palloc'd chunk, but can't figure
> out where.

There are some defines (MEMORY_CONTEXT_CHECKING and CLOBBER_FREED_MEMORY)
which can help find problems like this. Valgrind works too, with a bit of
attention. I was going to add some directives to allow Valgrind to handle
PostgreSQL's memory allocator, but haven't got there yet.
Re: [HACKERS] [PERFORM] A Better External Sort?
On Tue, Oct 04, 2005 at 12:24:54PM +0100, Simon Riggs wrote:
> How did you determine the 1500 figure? Can you give some more info to
> surround that recommendation to allow everybody to evaluate it?

[EMAIL PROTECTED]:~/dl/cvs/pgsql-local/src/backend/utils/sort$ gcc \
  -finline-limit-1000 -Winline -O2 -Wall -Wmissing-prototypes \
  -Wpointer-arith -Wendif-labels -fno-strict-aliasing -g \
  -I../../../../src/include -D_GNU_SOURCE -c -o tuplesort.o tuplesort.c
tuplesort.c: In function 'applySortFunction':
tuplesort.c:1833: warning: inlining failed in call to 'inlineApplySortFunction'
tuplesort.c:1906: warning: called from here
tuplesort.c: In function 'comparetup_heap':
tuplesort.c:1833: warning: inlining failed in call to 'inlineApplySortFunction'
tuplesort.c:1937: warning: called from here
tuplesort.c: In function 'comparetup_index':
tuplesort.c:1833: warning: inlining failed in call to 'inlineApplySortFunction'
tuplesort.c:2048: warning: called from here
tuplesort.c: In function 'comparetup_datum':
tuplesort.c:1833: warning: inlining failed in call to 'inlineApplySortFunction'
tuplesort.c:2167: warning: called from here

[EMAIL PROTECTED]:~/dl/cvs/pgsql-local/src/backend/utils/sort$ gcc \
  -finline-limit-1500 -Winline -O2 -Wall -Wmissing-prototypes \
  -Wpointer-arith -Wendif-labels -fno-strict-aliasing -g \
  -I../../../../src/include -D_GNU_SOURCE -c -o tuplesort.o tuplesort.c

no warnings

A quick binary search puts the cutoff between 1200 and 1300. Given
version variation I picked a nice round number, 1500. Ugh, that's for
-O2; for -O3 and above it needs to be 4100 to work. Maybe we should go
for 5000 or so.

I'm using: gcc (GCC) 3.3.5 (Debian 1:3.3.5-13)

Have a nice day,
Re: [HACKERS] [PERFORM] A Better External Sort?
On Tue, Oct 04, 2005 at 10:06:24AM -0400, Tom Lane wrote:
> Martijn van Oosterhout kleptog@svana.org writes:
> > I'm using: gcc (GCC) 3.3.5 (Debian 1:3.3.5-13)
>
> I don't know what the units of this number are, but it's apparently far
> too gcc-version-dependent to consider putting into our build scripts.
> Using gcc version 4.0.1 20050727 (current Fedora Core 4 compiler) on
> i386, and compiling tuplesort.c as you did, I find:
> -O2: warning goes away between 800 and 900
> -O3: warning is always there (tried values up to 1000)
> (the latter behavior may indicate a bug, not sure).

Fascinating. The fact that the warning goes away if you don't specify
-finline-limit seems to indicate they've gotten smarter. Or a bug. We'd
have to check the asm code to see if it's actually inlined or not.

Two options:

1. Add -Winline so we can at least be aware of when it's (not) happening.
2. If we can't get gcc to reliably inline, maybe we need to consider
   other options? In particular, move the isNull test statements out,
   since they are the ones the optimiser can use to best effect.

And if we put in -Winline, it would be visible to users while compiling,
so they can tweak their own build options (if they care).
Re: [HACKERS] [PERFORM] A Better External Sort?
On Tue, Oct 04, 2005 at 03:56:53PM +0100, Simon Riggs wrote:
> I've been using gcc 3.4 and saw no warning when using either -Winline
> or -O3 -Winline.

Ok, I've just installed 3.4 and verified that. I examined the asm code
and gcc is inlining it. I concede; at this point just throw in -Winline
and monitor the situation.

As an aside, the *_getattr calls end up a bit suboptimal though. It's
producing code like:

        cmp  attlen, 4
        je   $elsewhere1
        cmp  attlen, 2
        je   $elsewhere2
        ld   byte
here:

--- much later ---

elsewhere1:
        ld   integer
        jmp  $here
elsewhere2:
        ld   short
        jmp  $here

No idea whether we want to go down the path of hinting to gcc which size
will be the most common.

Have a nice day,