Re: [HACKERS] ANSI-strict pointer aliasing rules
On 4/27/06, Zeugswetter Andreas DCP SD [EMAIL PROTECTED] wrote: Can you please explain what exactly was not working ? xlc has in the past shown warnings that were actually problematic code that gcc did not show (and the cc variant of xlc also does not show). This has nothing to do with warnings. With xlc version 6, this code: Value * makeString(char *str) { Value *v = makeNode(Value); v-type = T_String; v-val.str = str; return v; } Will return objects whose type field is T_Value (650), because the compiler reorders the assignment that makeNode makes with that of the main function. -- Taral [EMAIL PROTECTED] You can't prove anything. -- Gödel's Incompetence Theorem ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] ANSI-strict pointer aliasing rules
On 4/27/06, Tom Lane [EMAIL PROTECTED] wrote: Greg Stark [EMAIL PROTECTED] writes: There are other ways of achieving the same thing. Structs containing a union for the subclass fields for example. Doesn't achieve the same thing, unless you mandate that every part of the system use the identical massively-overloaded union struct to refer to every node. If we do subclassing like this: struct Node { ... }; struct Value { struct Node; ... }; etc. do we still run into the alias problem? -- Taral [EMAIL PROTECTED] You can't prove anything. -- Gödel's Incompetence Theorem ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] ANSI-strict pointer aliasing rules
On 4/27/06, Taral [EMAIL PROTECTED] wrote: If we do subclassing like this: struct Node { ... }; struct Value { struct Node; ... }; etc. do we still run into the alias problem? Nope, it appears to get rid of the alias problem completely. But it requires anonymous structure support (C99?) to work without changing anything other than headers. As a bonus, if we ever change Node, we don't have to update any other structures... -- Taral [EMAIL PROTECTED] You can't prove anything. -- Gödel's Incompetence Theorem ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] ANSI-strict pointer aliasing rules
On 27 Apr 2006 15:25:45 -0400, Greg Stark [EMAIL PROTECTED] wrote: It would be pretty cool to have a type-safe codebase. It just seems like too an awful lot of work for a mostly aesthetic improvement. Does anyone have some benchmarks I can run? I can run tests to see if this aliasing makes a noticeable difference or not... -- Taral [EMAIL PROTECTED] You can't prove anything. -- Gödel's Incompetence Theorem ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] ANSI-strict pointer aliasing rules
I ran afoul of these rules the other day when compiling pgsql 8.1 on AIX. The configure scripts are set up to look for xlc instead of cc, and that command invokes cc with -qalias=ansi, the ANSI-strict pointer aliasing mode. Specifically, in this mode, the compiler assumes that accesses via pointers of different types never alias. Unfortunately, this breaks all of the Value construction code, because we end up with (for example): Node *n = palloc0fast(...); n-tag = T_VALUE; Value *v = (Value *) n; v-tag = T_STRING; And aggressive compiler optimization can reorder these two tag assignments, resulting in the bizarre Unrecognized node type: 650 message. The fix is one of two things: 1. Change the tag element of structures to be a Node, and access the tag via it: Major code change, allows Node to change in the future for instrumentation et cetera. 2. Make the makeNode macro cast to the derived structure before assigning the tag: Minor code change, makes assumptions about derived structures. 3. Get configure to select cc instead of xlc: No code change, loses some performance. -- Taral [EMAIL PROTECTED] You can't prove anything. -- Gödel's Incompetence Theorem ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] ANSI-strict pointer aliasing rules
On 4/26/06, Martijn van Oosterhout kleptog@svana.org wrote: Well, there are a number of fixes, it's questionable whether it's worth the effort. In GCC you can mark structures that are aliased to avoid the problem (attribute((may_alias)) iirc), but we don't do that. There's also C99's restrict. 4. Find the option for disabling strict alias and get configure to add that. You'll still lose performance, but the option is -qalias=noansi. -- Taral [EMAIL PROTECTED] You can't prove anything. -- Gödel's Incompetence Theorem ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: No index maximum? (was Re: [HACKERS] No merge sort?)
On Mon, Mar 17, 2003 at 11:23:47AM -0600, Taral wrote: Yes, that's exactly it. It's an index _scan_. It should simply be able to read the maximum straight from the btree. Still doesn't work, even with rewritten query. It sort a Limit(Sort(Index Scan)), with 1333 rows being pulled from the index. -- Taral [EMAIL PROTECTED] This message is digitally signed. Please PGP encrypt mail to me. Most parents have better things to do with their time than take care of their children. -- Me pgp0.pgp Description: PGP signature
Re: No index maximum? (was Re: [HACKERS] No merge sort?)
On Sat, Mar 15, 2003 at 09:23:28AM -0600, Bruno Wolff III wrote: On Fri, Mar 14, 2003 at 14:19:46 -0600, Taral [EMAIL PROTECTED] wrote: Same setup, different query: test= explain select max(time) from test where id = '1'; NOTICE: QUERY PLAN: Aggregate (cost=5084.67..5084.67 rows=1 width=0) - Index Scan using idx on test (cost=0.00..5081.33 rows=1333 width=0) Since the index is (id, time), why isn't the index being used to retrieve the maximum value? It looks like an index scan is being done. If the index was on (time, id) instead of (id, time), then you could get a further speed up by rewriting the query as: select time from test where id = '1' order by time desc limit 1; Yes, that's exactly it. It's an index _scan_. It should simply be able to read the maximum straight from the btree. -- Taral [EMAIL PROTECTED] This message is digitally signed. Please PGP encrypt mail to me. Most parents have better things to do with their time than take care of their children. -- Me pgp0.pgp Description: PGP signature
No index maximum? (was Re: [HACKERS] No merge sort?)
Same setup, different query: test= explain select max(time) from test where id = '1'; NOTICE: QUERY PLAN: Aggregate (cost=5084.67..5084.67 rows=1 width=0) - Index Scan using idx on test (cost=0.00..5081.33 rows=1333 width=0) Since the index is (id, time), why isn't the index being used to retrieve the maximum value? On Thu, Mar 13, 2003 at 03:10:49PM -0600, Taral wrote: I have a table test that looks like this: CREATE TABLE test ( id BIGINT, time INTEGER ); There is an index: CREATE INDEX idx ON test(id, time); The table has been loaded with 2M rows, where time ranges sequentially from 0 to 199 and id is random values from 0 to 4. -- Taral [EMAIL PROTECTED] This message is digitally signed. Please PGP encrypt mail to me. Most parents have better things to do with their time than take care of their children. -- Me pgp0.pgp Description: PGP signature
Re: [HACKERS] No merge sort?
On Fri, Mar 14, 2003 at 10:43:30PM -0500, Tom Lane wrote: Sure. That's why we have a planner that distinguishes between startup cost and total cost, and interpolates when a LIMIT is involved. But if this mergesort idea only helps for small-limit cases, that's another restriction on its scope of usefulness... I don't think so, since even in the non-limit case it avoids having to do a full sort if the number of initial streams is finite and small (as in the case I demonstrated), reducing time complexity to O(N). -- Taral [EMAIL PROTECTED] This message is digitally signed. Please PGP encrypt mail to me. Most parents have better things to do with their time than take care of their children. -- Me pgp0.pgp Description: PGP signature
[HACKERS] No merge sort?
I tried general, but no response. Anyone here can shed some light on the issue? Do I need to code merge sort into postgresql? - Forwarded message from Taral [EMAIL PROTECTED] - From: Taral [EMAIL PROTECTED] To: [EMAIL PROTECTED] Date: Wed, 12 Mar 2003 17:54:35 -0600 Subject: [GENERAL] No merge sort? Message-ID: [EMAIL PROTECTED] I have a table test that looks like this: CREATE TABLE test ( id BIGINT, time INTEGER ); There is an index: CREATE INDEX idx ON test(id, time); The table has been loaded with 2M rows, where time ranges sequentially from 0 to 199 and id is random values from 0 to 4. This query: SELECT * FROM idx WHERE id IN (...) AND time 198000 AND time 199800 ORDER BY time DESC LIMIT 20; has an EXPLAIN ANALYZE of: Limit (cost=3635.28..3635.28 rows=20 width=12) (actual time=22.94...22.96 rows=14 loops=1) - Sort (cost=3635.28..3635.28 rows=23 width=12) (actual time=22.93..22.93 rows=14 loops=1) - Index Scan using idx, idx, ..., idx, idx on test (cost=0.00...3634.77 rows=23 width=12) (actual time=1.01..22.10 rows=14 loops=1) Total runtime: 29.12 msec This query: SELECT * FROM idx WHERE id IN (...) AND time 199800 ORDER BY time DESC LIMIT 20; has an EXPLAIN ANALYZE of: Limit (cost=14516.46..14516.46 rows=20 width=12) (actual time=1448..83..1448.86 rows=20 loops=1) - Sort (cost=14516.46..14516.46 rows=2527 width=12) (actual time=1448.82..1448.83 rows=21 loops=1) - Index Scan using idx, idx, ..., idx, idx on test (cost=0.00...14373.67 rows=2527 width=12) (actual time=0.14..1437.33 rows=2048 loops=1) Total runtime: 1454.62 msec Since the index will output 'time' sorted data for each 'id', why isn't a merge sort being used here? A merge sort would reduce the execution time back to 30 ms. -- Taral [EMAIL PROTECTED] This message is digitally signed. Please PGP encrypt mail to me. Most parents have better things to do with their time than take care of their children. -- Me pgp0.pgp Description: PGP signature
Re: [HACKERS] No merge sort?
On Thu, Mar 13, 2003 at 04:28:34PM -0500, Tom Lane wrote: Seems like a waste of effort to me. I find this example less than compelling --- the case that could be sped up is quite narrow, and the potential performance gain not all that large. (A sort is a sort however you slice it, with O(N log N) runtime...) Actually, it's O(N) time. The index can produce time sorted data for each id in linear time, and the merge sort can merge them in linear time. Also, the existing system insists on loading _all_ candidate rows whereas this method can benefit from the limit. If you don't want to code it, I will. I need it for the livejournal mysql-postgresql transition. (No, mysql doesn't do it right either.) But a few pointers to the right places to look in the code would be helpful. Also, the direction we'd likely be going in in future is to merge multiple indexscans using bitmap techniques, so that the output ordering of the scans couldn't be counted on anyway. I don't understand this. What do these bitmap techniques do? -- Taral [EMAIL PROTECTED] This message is digitally signed. Please PGP encrypt mail to me. Most parents have better things to do with their time than take care of their children. -- Me pgp0.pgp Description: PGP signature
Re: [HACKERS] No merge sort?
On Thu, Mar 13, 2003 at 10:30:27PM -0500, Tom Lane wrote: The idea is you look at the index to make a list of main-table tuple positions you are interested in, which you represent compactly as a compressed bitmap. (There is some finagling needed because PG actually uses block/line number rather than a pure tuple number to identify tuples, but you can fake it with a reasonably small amount of overhead.) Then you can combine multiple index inputs by ANDing or ORing bitmaps (the OR case applies to your example). Finally, you traverse the heap, accessing the desired rows in heap-location order. This loses in terms of producing presorted output --- but it often saves enough in I/O costs to more than justify doing the sort in memory. And it loses bigtime in the case of LIMIT. If the unlimited query returns 4,000 records and I only want 20, you're retrieving 200x too much data from disk. -- Taral [EMAIL PROTECTED] This message is digitally signed. Please PGP encrypt mail to me. Most parents have better things to do with their time than take care of their children. -- Me pgp0.pgp Description: PGP signature