Re: [PATCHES] WIP: avoiding tuple construction/deconstruction overhead
Tom Lane wrote: a_ogawa [EMAIL PROTECTED] writes: (1)We can improve compare_heap() by using TableTupleSlot instead of HeapTuple. Please see attached patch. Did you measure any performance improvement from that? I considered it but thought it would likely be a wash or a loss, because in most cases only one attribute will be pulled from a tuple during comparetup_heap. slot_getattr cannot improve on heap_getattr in that case, and is quite likely to be slower. I measured performance of heap_getattr and slot_getattr in comparetup_heap. I made the table which had ten varchar attributes, and registered data for tests. (Attached file includes SQL doing this.) I carried out the following tests. (case 1) test1: select * from sort_test order by v1 limit 100; test2: select * from sort_test order by v1, v2 limit 100; test3: select * from sort_test order by v1, v2, v3 limit 100; test4: select * from sort_test order by v1, v2, v3, v4 limit 100; test5: select * from sort_test order by v1, v2, v3, v4, v5 limit 100; result:test1test2test3test4test5 --- heap_getattr 2.149s 2.602s 3.204s 3.830s 4.159s slot_getattr 2.523s 3.422s 3.977s 4.453s 4.721s (case 2) test1: select * from sort_test order by v10 limit 100; test2: select * from sort_test order by v10, v9 limit 100; test3: select * from sort_test order by v10, v9, v8 limit 100; test4: select * from sort_test order by v10, v9, v8, v7 limit 100; test5: select * from sort_test order by v10, v9, v8, v7, v6 limit 100; result:test1test2test3test4test5 --- heap_getattr 3.654s 5.549s 6.575s 7.367s 7.870s slot_getattr 4.027s 4.930s 5.249s 5.555s 5.756s (case 3) test1: select * from sort_test order by v5 limit 100; test2: select * from sort_test order by v5, v6 limit 100; test3: select * from sort_test order by v5, v6, v7 limit 100; test4: select * from sort_test order by v5, v6, v7, v8 limit 100; test5: select * from sort_test order by v5, v6, v7, v8, v9 limit 100; result:test1test2test3test4test5 --- heap_getattr 2.657s 4.207s 5.194s 6.179s 6.662s slot_getattr 3.126s 4.233s 4.806s 5.271s 5.557s In most cases, heap_getattr is fast. When the following conditions occurred, slot_getattr is fast. (1)Tuple have varlen attributes. (2)Sort key have more than two attributes. (3)A position of a sort key is far from the head of tuple. (4)As for the data of a sort key, there be many repetition. Actually it will be rare that these conditions are occurred. Thinking from a result, I think that we had better continue using heap_getattr in comparetup_heap. regards, --- Atsushi Ogawa make_test_data.sql Description: Binary data ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PATCHES] WIP: avoiding tuple construction/deconstruction overhead
I am very excited there has been so much reduction in tuple processing overhead in the past few weeks. This is always and area I thought needed improvement, and its great to see it. We will certainly have some big performance improvements in 8.1 because we already have several (e.g. SMP) and we have many more months to go. --- a_ogawa wrote: Tom Lane wrote: a_ogawa [EMAIL PROTECTED] writes: (1)We can improve compare_heap() by using TableTupleSlot instead of HeapTuple. Please see attached patch. Did you measure any performance improvement from that? I considered it but thought it would likely be a wash or a loss, because in most cases only one attribute will be pulled from a tuple during comparetup_heap. slot_getattr cannot improve on heap_getattr in that case, and is quite likely to be slower. I measured performance of heap_getattr and slot_getattr in comparetup_heap. I made the table which had ten varchar attributes, and registered data for tests. (Attached file includes SQL doing this.) I carried out the following tests. (case 1) test1: select * from sort_test order by v1 limit 100; test2: select * from sort_test order by v1, v2 limit 100; test3: select * from sort_test order by v1, v2, v3 limit 100; test4: select * from sort_test order by v1, v2, v3, v4 limit 100; test5: select * from sort_test order by v1, v2, v3, v4, v5 limit 100; result:test1test2test3test4test5 --- heap_getattr 2.149s 2.602s 3.204s 3.830s 4.159s slot_getattr 2.523s 3.422s 3.977s 4.453s 4.721s (case 2) test1: select * from sort_test order by v10 limit 100; test2: select * from sort_test order by v10, v9 limit 100; test3: select * from sort_test order by v10, v9, v8 limit 100; test4: select * from sort_test order by v10, v9, v8, v7 limit 100; test5: select * from sort_test order by v10, v9, v8, v7, v6 limit 100; result:test1test2test3test4test5 --- heap_getattr 3.654s 5.549s 6.575s 7.367s 7.870s slot_getattr 4.027s 4.930s 5.249s 5.555s 5.756s (case 3) test1: select * from sort_test order by v5 limit 100; test2: select * from sort_test order by v5, v6 limit 100; test3: select * from sort_test order by v5, v6, v7 limit 100; test4: select * from sort_test order by v5, v6, v7, v8 limit 100; test5: select * from sort_test order by v5, v6, v7, v8, v9 limit 100; result:test1test2test3test4test5 --- heap_getattr 2.657s 4.207s 5.194s 6.179s 6.662s slot_getattr 3.126s 4.233s 4.806s 5.271s 5.557s In most cases, heap_getattr is fast. When the following conditions occurred, slot_getattr is fast. (1)Tuple have varlen attributes. (2)Sort key have more than two attributes. (3)A position of a sort key is far from the head of tuple. (4)As for the data of a sort key, there be many repetition. Actually it will be rare that these conditions are occurred. Thinking from a result, I think that we had better continue using heap_getattr in comparetup_heap. regards, --- Atsushi Ogawa [ Attachment, skipping... ] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] WIP: avoiding tuple construction/deconstruction overhead
Tom Lane wrote: Attached is the current state of a patch to reduce the overhead of passing tuple data up through many levels of plan nodes. It is a good idea. I think that this patch improves performance of the whole executor. I have three comments. (1)We can improve compare_heap() by using TableTupleSlot instead of HeapTuple. Please see attached patch. (2)In ExecStoreTuple(), we can omit initialization of slot-tts_nvalid. If slot-tts_isempty is false, tts_nvalid is initialized by ExecClearTuple(). If it is true, tts_nvalid is always zero. (3)There is a description of slot-val in comment of execTuple.c. This had better revise it. Finally, I have made some progress towards making the tuple access routines consistently use "bool isNull" arrays as null markers, instead of the char 'n' or ' ' convention that was previously used in some but not all contexts. I agree. I think that this progress improves readability. regards, --- Atsushi Ogawa compare_heap.patch Description: Binary data ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [PATCHES] WIP: avoiding tuple construction/deconstruction overhead
a_ogawa [EMAIL PROTECTED] writes: (1)We can improve compare_heap() by using TableTupleSlot instead of HeapTuple. Please see attached patch. Did you measure any performance improvement from that? I considered it but thought it would likely be a wash or a loss, because in most cases only one attribute will be pulled from a tuple during comparetup_heap. slot_getattr cannot improve on heap_getattr in that case, and is quite likely to be slower. (3)There is a description of slot-val in comment of execTuple.c. This had better revise it. Drat, how'd I miss that? Thanks. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster