Re: [HACKERS] Bitmap index thoughts (another segfault)
On Sat, 7 Apr 2007, Mark Kirkwood wrote:

> Mark Kirkwood wrote:
> bitmap=# SELECT count(*) FROM bitmaptest
> WHERE val1 in (1,7)
> AND val0 IN (4,3);
>
> ERROR:  XX000: unknown stream type 2
> LOCATION:  stream_add_node, tidbitmap.c:1033

Thanks. Turned out to be a typo in stream_add_node()! I'll post a new patch soon.

Thanks for the test kit and your testing!

Gavin

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

       http://www.postgresql.org/docs/faq
Re: [HACKERS] Bitmap index thoughts (another segfault)
> I'm seeing a segfault on a TPC-H size 10 database. The patch and code are:
> - bitmap patch from 12 Mar
> - 8.3 dev from 27 Mar

Thanks Mark. I tracked this down. I'll post a new patch soon.

Gavin
Re: [HACKERS] Bitmap index thoughts (another segfault)
Mark Kirkwood wrote:

I'm seeing a segfault on a TPC-H size 10 database. The patch and code are:

- bitmap patch from 12 Mar
- 8.3 dev from 27 Mar

SELECT count(distinct(o_orderkey))
FROM orders orders_alias
WHERE o_orderpriority IN ('1-URGENT', '3-MEDIUM') AND o_orderstatus='P';

(gdb) bt
#0  0x in ?? ()
#1  0x08155eb5 in bitmap_stream_free (opaque=0x84e4070) at tidbitmap.c:1336
#2  0x08142914 in ExecEndBitmapHeapScan (node=0x8405548) at nodeBitmapHeapscan.c:463
#3  0x0813789a in ExecutorEnd (queryDesc=0x83ed948) at execMain.c:992
#4  0x081134ef in PortalCleanup (portal=0x83ee018) at portalcmds.c:302
#5  0x0823e2d2 in PortalDrop (portal=0x83ee018, isTopCommit=0 '\0') at portalmem.c:382
#6  0x081b2182 in exec_simple_query (
    query_string=0x83a8018 "SELECT count(distinct(o_orderkey))\nFROM orders orders_alias\nWHERE o_orderpriority IN ('1-URGENT', '3-MEDIUM') AND o_orderstatus='P';")
    at postgres.c:964
#7  0x081b4350 in PostgresMain (argc=4, argv=0x833a638, username=0x833a610 "postgres") at postgres.c:3488
#8  0x0818faab in ServerLoop () at postmaster.c:2985
#9  0x081911b1 in PostmasterMain (argc=1, argv=0xbfbfec30) at postmaster.c:967
#10 0x08153592 in main (argc=1, argv=0xbfbfec30) at main.c:188

Not a SIGSEGV, but another stream-related issue:

bitmap=# \d bitmaptest;
    Table "public.bitmaptest"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer | not null
 val0   | integer |
 val1   | integer |
 val2   | integer |
 fil    | text    |
Indexes:
    "bitmaptest_id" btree (id)
    "bitmaptest_val0" bitmap (val0)
    "bitmaptest_val1" bitmap (val1)
    "bitmaptest_val2" bitmap (val2)

bitmap=# SELECT count(*) FROM bitmaptest
WHERE val1 in (1,7)
AND val0 IN (4,3);

ERROR:  XX000: unknown stream type 2
LOCATION:  stream_add_node, tidbitmap.c:1033

I could not reproduce this with the TPC-H dataset, so here's a link to the files to generate this one:

http://homepages.paradise.net.nz/markir/download/bizgres/bitmaptest.tar.gz

Cheers

Mark
Re: [HACKERS] Bitmap index thoughts (another segfault)
I'm seeing a segfault on a TPC-H size 10 database. The patch and code are:

- bitmap patch from 12 Mar
- 8.3 dev from 27 Mar

SELECT count(distinct(o_orderkey))
FROM orders orders_alias
WHERE o_orderpriority IN ('1-URGENT', '3-MEDIUM') AND o_orderstatus='P';

(gdb) bt
#0  0x in ?? ()
#1  0x08155eb5 in bitmap_stream_free (opaque=0x84e4070) at tidbitmap.c:1336
#2  0x08142914 in ExecEndBitmapHeapScan (node=0x8405548) at nodeBitmapHeapscan.c:463
#3  0x0813789a in ExecutorEnd (queryDesc=0x83ed948) at execMain.c:992
#4  0x081134ef in PortalCleanup (portal=0x83ee018) at portalcmds.c:302
#5  0x0823e2d2 in PortalDrop (portal=0x83ee018, isTopCommit=0 '\0') at portalmem.c:382
#6  0x081b2182 in exec_simple_query (
    query_string=0x83a8018 "SELECT count(distinct(o_orderkey))\nFROM orders orders_alias\nWHERE o_orderpriority IN ('1-URGENT', '3-MEDIUM') AND o_orderstatus='P';")
    at postgres.c:964
#7  0x081b4350 in PostgresMain (argc=4, argv=0x833a638, username=0x833a610 "postgres") at postgres.c:3488
#8  0x0818faab in ServerLoop () at postmaster.c:2985
#9  0x081911b1 in PostmasterMain (argc=1, argv=0xbfbfec30) at postmaster.c:967
#10 0x08153592 in main (argc=1, argv=0xbfbfec30) at main.c:188
Re: [HACKERS] Bitmap index thoughts
On Thu, 8 Feb 2007, Heikki Linnakangas wrote:

> Gavin Sherry wrote:
> > I will update the code tomorrow. The focus will be cleaning up the
> > executor modifications. Please look elsewhere for now.
>
> I'm getting a segfault with this test script:
>
> CREATE TABLE bmtest (i int);
>
> INSERT INTO bmtest SELECT 1 FROM generate_series(1,10) a;
> INSERT INTO bmtest SELECT 10 FROM generate_series(1,100) a;
> DELETE FROM bmtest WHERE i = 1;
> VACUUM bmtest;
>
> CREATE INDEX i_bmtest ON bmtest USING bitmap (i);
>
> INSERT INTO bmtest SELECT 10 FROM generate_series(1,10) a;

Hmm... this triggers a bug in the newly rewritten update code, I think. I'll post a fix soon.

Thanks for testing!

Gavin
Re: [HACKERS] Bitmap index thoughts
Gavin Sherry wrote:
> I will update the code tomorrow. The focus will be cleaning up the
> executor modifications. Please look elsewhere for now.

I'm getting a segfault with this test script:

CREATE TABLE bmtest (i int);

INSERT INTO bmtest SELECT 1 FROM generate_series(1,10) a;
INSERT INTO bmtest SELECT 10 FROM generate_series(1,100) a;
DELETE FROM bmtest WHERE i = 1;
VACUUM bmtest;

CREATE INDEX i_bmtest ON bmtest USING bitmap (i);

INSERT INTO bmtest SELECT 10 FROM generate_series(1,10) a;

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Bitmap index thoughts
Gavin Sherry wrote:
> On Thu, 1 Feb 2007, Bruce Momjian wrote:
> > Where are we on this patch? Does it have performance tests to show where it is beneficial? Is it ready to be reviewed?
>
> Here's an updated patch:
>
> http://www.alcove.com.au/~swm/bitmap-2007-02-02.patch
>
> In this patch, I rewrote the index build system. It was fast before for well-clustered data, but for poorly clustered data it was very slow. Now it is pretty good for each distribution type.
>
> I have various test cases, but the one which showed bitmap in a poor light was a table of 600M rows. The key to the table had a cardinality of 100,000. When the table was loaded with keys clustered, the build time was 1000 seconds with bitmap (2200 with btree). With poorly clustered data (e.g., the index key was (1, 2, 3, ..., 6000, 1, 2, 3, ...)), the build time for bitmap was 14000 seconds! So, I rewrote this to compress data using HRL encoding (the same scheme we use in the bitmap AM itself). Now, clustered data is just as fast and unclustered data is 2000 seconds.
>
> The select performance at a cardinality of 100,000 is similar to btree but faster with lower cardinalities.
>
> Jie also contributed a rewrite of the WAL code to this patch. Not only is the code faster now, but it handles the notion of incomplete actions -- like btree and friends do.
>
> The executor code still needs some work from me -- Jie and I have dirtied things up while experimenting -- but we would really like some review of the code so that this can get squared away well before the approach of 8.3 feature freeze.
>
> One of the major deficiencies remaining is the lack of VACUUM support. Heikki put his hand up for this and I'm holding him to it! ;-)

Thanks :). I'll take a look at it. I'm a bit worried that vacuuming can get complicated if an index is in fact an index + a heap + a btree. To remove empty LOV items and the entries in the auxiliary heap and b-tree, you need to:

1. Memorize empty LOV items
2. Scan the heap, and mark the heap tuples corresponding to the empty LOV items as dead
3. Scan the b-tree, removing pointers to dead heap tuples
4. Remove dead heap tuples
5. Remove empty LOV items

Maybe it's possible to call the existing vacuuming code recursively, but it feels quite horrible.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
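[The five cleanup steps Heikki lists can be modeled on toy in-memory structures to show why the ordering matters: b-tree pointers must be removed before their heap tuples, and heap tuples before the LOV items they reference, so no stage follows a dangling link. This is only an illustrative sketch with plain Python dicts standing in for the auxiliary heap, b-tree, and LOV items, not the patch's actual vacuum code.]

```python
def vacuum_bitmap(lov_items, aux_heap, btree):
    """Toy model of the five-step bitmap-index cleanup.

    lov_items: {lov_id: bitmap vector}, empty vector means the item is dead
    aux_heap:  {heap_tid: lov_id}      (the auxiliary heap)
    btree:     {index_key: heap_tid}   (the b-tree over the auxiliary heap)
    """
    # 1. memorize empty LOV items
    empty = {lid for lid, vec in lov_items.items() if not vec}
    # 2. mark heap tuples pointing at empty LOV items as dead
    dead_heap = {hid for hid, lid in aux_heap.items() if lid in empty}
    # 3. remove b-tree pointers to dead heap tuples
    for key in [k for k, hid in btree.items() if hid in dead_heap]:
        del btree[key]
    # 4. remove the dead heap tuples themselves
    for hid in dead_heap:
        del aux_heap[hid]
    # 5. finally remove the empty LOV items
    for lid in empty:
        del lov_items[lid]
```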
Re: [HACKERS] Bitmap index thoughts
On Thu, 1 Feb 2007, Bruce Momjian wrote:

> Where are we on this patch? Does it have performance tests to show where it is beneficial? Is it ready to be reviewed?

Here's an updated patch:

http://www.alcove.com.au/~swm/bitmap-2007-02-02.patch

In this patch, I rewrote the index build system. It was fast before for well-clustered data, but for poorly clustered data it was very slow. Now it is pretty good for each distribution type.

I have various test cases, but the one which showed bitmap in a poor light was a table of 600M rows. The key to the table had a cardinality of 100,000. When the table was loaded with keys clustered, the build time was 1000 seconds with bitmap (2200 with btree). With poorly clustered data (e.g., the index key was (1, 2, 3, ..., 6000, 1, 2, 3, ...)), the build time for bitmap was 14000 seconds! So, I rewrote this to compress data using HRL encoding (the same scheme we use in the bitmap AM itself). Now, clustered data is just as fast and unclustered data is 2000 seconds.

The select performance at a cardinality of 100,000 is similar to btree but faster with lower cardinalities.

Jie also contributed a rewrite of the WAL code to this patch. Not only is the code faster now, but it handles the notion of incomplete actions -- like btree and friends do.

The executor code still needs some work from me -- Jie and I have dirtied things up while experimenting -- but we would really like some review of the code so that this can get squared away well before the approach of 8.3 feature freeze.

One of the major deficiencies remaining is the lack of VACUUM support. Heikki put his hand up for this and I'm holding him to it! ;-)

I will update the code tomorrow. The focus will be cleaning up the executor modifications. Please look elsewhere for now.

Thanks,

Gavin
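[The HRL encoding Gavin refers to is a hybrid run-length scheme: the bitmap is cut into fixed-size words; a run of identical all-zero or all-one words collapses into a single fill word, while mixed words are stored literally. The toy Python sketch below illustrates only this general principle; the patch's actual on-disk word layout is not reproduced here, and the 8-bit word size is purely for readability.]

```python
WORD_BITS = 8  # toy word size; a real implementation would use machine words

def hrl_compress(bits):
    """Compress a list of 0/1 bits into fill and literal words."""
    words = [bits[i:i + WORD_BITS] for i in range(0, len(bits), WORD_BITS)]
    out = []
    for w in words:
        if len(w) == WORD_BITS and len(set(w)) == 1:
            # all-zero or all-one word: extend or start a fill run
            if out and out[-1][0] == 'fill' and out[-1][1] == w[0]:
                out[-1] = ('fill', w[0], out[-1][2] + 1)
            else:
                out.append(('fill', w[0], 1))
        else:
            out.append(('literal', tuple(w)))
    return out

def hrl_decompress(words):
    """Expand fill and literal words back into the original bit list."""
    bits = []
    for w in words:
        if w[0] == 'fill':
            bits.extend([w[1]] * (WORD_BITS * w[2]))
        else:
            bits.extend(w[1])
    return bits
```

This is why poorly clustered keys hurt: scattered matches produce literal words instead of long fill runs, so far less compresses away.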
Re: [HACKERS] Bitmap index thoughts
On Thu, 1 Feb 2007, Bruce Momjian wrote:

> Where are we on this patch? Does it have performance tests to show where it is beneficial? Is it ready to be reviewed?

I've got an updated patch which adds significant performance improvements for worst-case data distributions. It also contains a rewrite of the WAL code to handle incomplete actions. I haven't worked on the stuff discussed below with Heikki. It's a lot of work and probably more suitable for a second generation.

I've just got to finish testing the merge of Tom's operator family stuff, and then I'll send off the patch and performance figures.

Thanks,

Gavin
Re: [HACKERS] Bitmap index thoughts
Where are we on this patch? Does it have performance tests to show where it is beneficial? Is it ready to be reviewed?

---

Heikki Linnakangas wrote:
> I've been skimming through the bitmap index patch...
>
> A scan needs to access at least five pages:
>
> 1. B-tree index (root+others, depending on depth)
> 2. The auxiliary heap page
> 3. bitmap index meta page
> 4. LOV page
> 5. bitmap page
>
> That seems like a lot of indirection. A high startup cost is probably ok for typical bitmap index use cases and most of the needed pages should stay in memory, but could we simplify this? Why do we need the auxiliary heap, couldn't we just store the blk+offset of the LOV item directly in the b-tree index item?
>
> And instead of having separate LOV pages that store a number of LOV items, how about storing each LOV item on a page of its own, and using the rest of the page to store the last chunk of the bitmap. That would eliminate one page access, but more importantly, maybe we could then get rid of all the bm_last_* attributes in BMLOVItemData that complicate the patch quite a bit, while preserving the performance.
>
> --
>   Heikki Linnakangas
>   EnterpriseDB   http://www.enterprisedb.com

--
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] Bitmap index thoughts
On 12/27/06 3:16 AM, "Gavin Sherry" <[EMAIL PROTECTED]> wrote:

> On Wed, 27 Dec 2006, Heikki Linnakangas wrote:
>
>> Jie Zhang wrote:
>>> The "bitmap data segment" sounds good in terms of space. The problem is that one bitmap is likely to occupy more pages than before, which may hurt the query performance.
>>
>> We could have segments of, say, 1/5 of a page. When a bitmap grows larger than that, the bitmap would be moved to a page of its own. That way we wouldn't get unnecessary fragmentation with large bitmaps, but small bitmaps would be stored efficiently.
>
> Yes. I see. This sounds good. I will think more about this.

Thanks,

Jie
Re: [HACKERS] Bitmap index thoughts
On 12/27/06 3:10 AM, "Gavin Sherry" <[EMAIL PROTECTED]> wrote:

> On Wed, 27 Dec 2006, Heikki Linnakangas wrote:
>
>> Gavin Sherry wrote:
>>> On Tue, 26 Dec 2006, Heikki Linnakangas wrote:
>>>> for typical bitmap index use cases and most of the needed pages should stay in memory, but could we simplify this? Why do we need the auxiliary heap, couldn't we just store the blk+offset of the LOV item directly in the b-tree index item?
>>>
>>> The problem is, the b-tree code is very much tied to the heap. I don't want to modify the b-tree code to make bitmap indexes work (better). What's really tempting is to just manage our own balanced tree within the bitmap index file(s) itself. It would start from the metapage and simply spill to other 'special' index pages when necessary. The problem is, we do not have b-tree code generic enough that it would allow us to do this trivially -- consider concurrency and WAL in particular, which we currently get for free. I guess this is why I've been ignoring this issue :-).
>>
>> Maybe we could reuse the code in ginbtree.c. Looks like Teodor & Oleg had the same problem :).
>>
>> Modifying the nbtree code doesn't seem that difficult either. AFAICS, the only places where the heap is accessed from within nbtree code are in index building and uniqueness checks.
>
> I'll take a look and think about it.

I will take a look at it as well. I would also like to eventually add the values of bm_last_tid_location for all bitmap pages to the LOV heap. Combining this with the attribute values, we can easily jump to a specific bitmap page. This will be useful during vacuuming and insertion after vacuuming -- we can jump to a page to modify a bit. I am not sure if this will affect the use of the btree code without the heap. But if we can get rid of the LOV heap, that would be great.

>>>> And instead of having separate LOV pages that store a number of LOV items, how about storing each LOV item on a page of its own, and using the rest of the page to store the last chunk of the bitmap. That would eliminate one page access, but more importantly, maybe we could then get rid of all the bm_last_* attributes in BMLOVItemData that complicate the patch quite a bit, while preserving the performance.
>>>
>>> That's an interesting approach. We would still need a concept of last_word, at the very least, and probably last_comp_word for convenience.
>>
>> Why?
>
> The two last words of the bitmap vector, stored in the LOV item, provide a few things: they buffer the addition of new matches in the vector and they ease the calculation of the current position of the end of the vector. Jie, do you want to add more?
>
> I think we could do this differently by calculating everything based on the other data stored in the LOV item and data page (last_tid_location and num_hrl_words_used, from memory). But, I'm not sure. Jie?

Sure. When appending a new bit to an existing bitmap, the last two words are needed to compute the new HRL words, and they are the only words affected by the newly appended bit. So we need a way to get at these two words easily. Like Gavin said, separating them from the other words is for convenience.

Also, we won't be able to get rid of bm_last_setbit. We need bm_last_setbit to tell us how many zeros there are between two set bits. There is no other way to know this.

Thanks,

Jie
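[Jie's point, that only the tail of the vector changes on append, can be illustrated with a toy sketch: given the position of the last set bit, appending a new set bit only requires emitting the intervening run of zeros, without touching any earlier compressed words. Names like last_setbit are illustrative stand-ins, not the patch's actual fields.]

```python
class BitmapTail:
    """Toy run-length bitmap that only ever touches its tail on append."""

    def __init__(self):
        self.runs = []        # finished (bit_value, run_length) pairs
        self.last_setbit = 0  # 1-based position of the last appended set bit

    def append_setbit(self, tid):
        """Set the bit at position tid; tids must arrive in increasing order."""
        assert tid > self.last_setbit, "TIDs must arrive in order"
        zeros = tid - self.last_setbit - 1
        if zeros:
            # the gap since the last set bit becomes a run of zeros
            self.runs.append((0, zeros))
        self.runs.append((1, 1))
        self.last_setbit = tid

    def materialize(self):
        """Expand the runs back into a plain bit list (for checking)."""
        bits = []
        for value, length in self.runs:
            bits.extend([value] * length)
        return bits
```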
Re: [HACKERS] Bitmap index thoughts
On Wed, 2006-12-27 at 22:16 +1100, Gavin Sherry wrote:

> > But actually I'm not convinced we need to worry about efficient storage of small bitmaps at all. The typical use case for bitmap indexes is large tables with a small number of distinct values, and the problem doesn't really arise in that scenario. Let's keep it simple for now, we can enhance it in later releases.
>
> The scenario I'm concerned about is where a sales database, say, has 100,000 products. However, only 500 or 1000 products are popular. They dominate, say, >99% of the sales. The other 99,900 products consume a little bit over 8K each for very little benefit :-(.
>
> This is pretty contrived but it seems real-world enough...

Well, that seems to me to be the typical case. It's called a Zipf distribution and applies to categorical data everywhere. The main question is how you design your database. We might think of indexing the product group (a higher level of the Product Dimension), since these tend to have a low number of values, but these may have been normalised (or "placed into a Dimension table"). This might leave you with only the ProductId values in the large Fact table, which, as Gavin points out, may be sparsely populated.

Another idea might be to store the first N heap pointers in the auxiliary heap, rather than allocating a whole page for the first value. That would be even more space efficient than allocating a fixed-size part of a page. At least that would provide some utility for that part of the storage mechanism.

However, I think Heikki's KISS approach seems good for now. The infrequent values will probably be infrequently accessed, so we may never need to read them at all (wishful thinking?). If we can get something ready for commit soon, that will leave lots of time before the 8.3 freeze to measure where things can be further improved in terms of space and performance.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Bitmap index thoughts
On Wed, Dec 27, 2006 at 03:42:36PM +, Heikki Linnakangas wrote:

> > I wonder what would happen if somebody implemented automatic index exclusion conditions after use of an INDEX proved to be in the realm of the worst case scenario? :-)
>
> I'm sorry, I don't understand that sentence...

I was joking about a rather magical automatic INDEX restriction modifier. For example, if the index becomes large enough to matter (100 Mbytes?) and any single key takes up more than, say, 20% of the index, it might be cool if it would automatically add the value to the restriction list, and prune the now-wasted index records.

Sorry for inserting a silly joke in a serious discussion... :-)

Cheers,
mark

--
[EMAIL PROTECTED] / Ottawa, Ontario, Canada
http://mark.mielke.cc/
Re: [HACKERS] Bitmap index thoughts
[EMAIL PROTECTED] wrote:

> On Wed, Dec 27, 2006 at 10:16:54PM +1100, Gavin Sherry wrote:
>> On Wed, 27 Dec 2006, Heikki Linnakangas wrote:
>>> But actually I'm not convinced we need to worry about efficient storage of small bitmaps at all. The typical use case for bitmap indexes is large tables with a small number of distinct values, and the problem doesn't really arise in that scenario. Let's keep it simple for now, we can enhance it in later releases.
>>
>> The scenario I'm concerned about is where a sales database, say, has 100,000 products. However, only 500 or 1000 products are popular. They dominate, say, >99% of the sales. The other 99,900 products consume a little bit over 8K each for very little benefit :-(.
>>
>> This is pretty contrived but it seems real-world enough...
>
> Seems like a good candidate for CREATE INDEX WHERE :-)

Yeah, that was my first thought as well. However, in Gavin's example it would be a nightmare to manually update the list of popular products, and recreate the index when it changes. Something clever inside the bitmap indexam would clearly be better. But even in that scenario, 99000*8k pages ~= 750 megabytes of wasted space might still be acceptable. Or not.

> I wonder what would happen if somebody implemented automatic index exclusion conditions after use of an INDEX proved to be in the realm of the worst case scenario? :-)

I'm sorry, I don't understand that sentence...

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Bitmap index thoughts
On Wed, Dec 27, 2006 at 10:16:54PM +1100, Gavin Sherry wrote:

> On Wed, 27 Dec 2006, Heikki Linnakangas wrote:
> > But actually I'm not convinced we need to worry about efficient storage of small bitmaps at all. The typical use case for bitmap indexes is large tables with a small number of distinct values, and the problem doesn't really arise in that scenario. Let's keep it simple for now, we can enhance it in later releases.
>
> The scenario I'm concerned about is where a sales database, say, has 100,000 products. However, only 500 or 1000 products are popular. They dominate, say, >99% of the sales. The other 99,900 products consume a little bit over 8K each for very little benefit :-(.
>
> This is pretty contrived but it seems real-world enough...

Seems like a good candidate for CREATE INDEX WHERE :-)

I wonder what would happen if somebody implemented automatic index exclusion conditions after use of an INDEX proved to be in the realm of the worst case scenario? :-)

Cheers,
mark

--
[EMAIL PROTECTED] / Ottawa, Ontario, Canada
http://mark.mielke.cc/
Re: [HACKERS] Bitmap index thoughts
On Wed, 27 Dec 2006, Heikki Linnakangas wrote:

> Jie Zhang wrote:
> > The "bitmap data segment" sounds good in terms of space. The problem is that one bitmap is likely to occupy more pages than before, which may hurt the query performance.
>
> We could have segments of, say, 1/5 of a page. When a bitmap grows larger than that, the bitmap would be moved to a page of its own. That way we wouldn't get unnecessary fragmentation with large bitmaps, but small bitmaps would be stored efficiently.

Yes.

> > I have been thinking along the lines of increasing the number of last bitmap words stored in each LOV item, but not to occupy one page. This may prevent some cases Gavin indicated here, but not all.
>
> That sounds like more special cases and complexity. I like the segment idea more.
>
> But actually I'm not convinced we need to worry about efficient storage of small bitmaps at all. The typical use case for bitmap indexes is large tables with a small number of distinct values, and the problem doesn't really arise in that scenario. Let's keep it simple for now, we can enhance it in later releases.

The scenario I'm concerned about is where a sales database, say, has 100,000 products. However, only 500 or 1000 products are popular. They dominate, say, >99% of the sales. The other 99,900 products consume a little bit over 8K each for very little benefit :-(.

This is pretty contrived but it seems real-world enough...

Gavin
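[As a back-of-envelope check of the space concern above, assuming 8K pages and one mostly-empty bitmap page per rare product; the exact per-value overhead in the patch is an assumption here.]

```python
# Roughly 99,000 rarely used keys, each pinning a mostly empty 8 KB page.
RARE_PRODUCTS = 100_000 - 1_000
PAGE_BYTES = 8 * 1024

wasted = RARE_PRODUCTS * PAGE_BYTES
print(wasted / (1024 ** 2))  # -> 773.4375, i.e. roughly 750-780 MB
```

This lines up with the "99000*8k pages ~= 750 megabytes" figure quoted elsewhere in the thread.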
Re: [HACKERS] Bitmap index thoughts
On Wed, 27 Dec 2006, Heikki Linnakangas wrote:

> Gavin Sherry wrote:
> > On Tue, 26 Dec 2006, Heikki Linnakangas wrote:
> >> for typical bitmap index use cases and most of the needed pages should stay in memory, but could we simplify this? Why do we need the auxiliary heap, couldn't we just store the blk+offset of the LOV item directly in the b-tree index item?
> >
> > The problem is, the b-tree code is very much tied to the heap. I don't want to modify the b-tree code to make bitmap indexes work (better). What's really tempting is to just manage our own balanced tree within the bitmap index file(s) itself. It would start from the metapage and simply spill to other 'special' index pages when necessary. The problem is, we do not have b-tree code generic enough that it would allow us to do this trivially -- consider concurrency and WAL in particular, which we currently get for free. I guess this is why I've been ignoring this issue :-).
>
> Maybe we could reuse the code in ginbtree.c. Looks like Teodor & Oleg had the same problem :).
>
> Modifying the nbtree code doesn't seem that difficult either. AFAICS, the only places where the heap is accessed from within nbtree code are in index building and uniqueness checks.

I'll take a look and think about it.

> >> And instead of having separate LOV pages that store a number of LOV items, how about storing each LOV item on a page of its own, and using the rest of the page to store the last chunk of the bitmap. That would eliminate one page access, but more importantly, maybe we could then get rid of all the bm_last_* attributes in BMLOVItemData that complicate the patch quite a bit, while preserving the performance.
> >
> > That's an interesting approach. We would still need a concept of last_word, at the very least, and probably last_comp_word for convenience.
>
> Why?

The two last words of the bitmap vector, stored in the LOV item, provide a few things: they buffer the addition of new matches in the vector and they ease the calculation of the current position of the end of the vector. Jie, do you want to add more?

I think we could do this differently by calculating everything based on the other data stored in the LOV item and data page (last_tid_location and num_hrl_words_used, from memory). But, I'm not sure. Jie?

Thanks,

Gavin
Re: [HACKERS] Bitmap index thoughts
Jie Zhang wrote:
> The "bitmap data segment" sounds good in terms of space. The problem is that one bitmap is likely to occupy more pages than before, which may hurt the query performance.

We could have segments of, say, 1/5 of a page. When a bitmap grows larger than that, the bitmap would be moved to a page of its own. That way we wouldn't get unnecessary fragmentation with large bitmaps, but small bitmaps would be stored efficiently.

> I have been thinking along the lines of increasing the number of last bitmap words stored in each LOV item, but not to occupy one page. This may prevent some cases Gavin indicated here, but not all.

That sounds like more special cases and complexity. I like the segment idea more.

But actually I'm not convinced we need to worry about efficient storage of small bitmaps at all. The typical use case for bitmap indexes is large tables with a small number of distinct values, and the problem doesn't really arise in that scenario. Let's keep it simple for now, we can enhance it in later releases.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Bitmap index thoughts
Gavin Sherry wrote:
> On Tue, 26 Dec 2006, Heikki Linnakangas wrote:
>> for typical bitmap index use cases and most of the needed pages should stay in memory, but could we simplify this? Why do we need the auxiliary heap, couldn't we just store the blk+offset of the LOV item directly in the b-tree index item?
>
> The problem is, the b-tree code is very much tied to the heap. I don't want to modify the b-tree code to make bitmap indexes work (better). What's really tempting is to just manage our own balanced tree within the bitmap index file(s) itself. It would start from the metapage and simply spill to other 'special' index pages when necessary. The problem is, we do not have b-tree code generic enough that it would allow us to do this trivially -- consider concurrency and WAL in particular, which we currently get for free. I guess this is why I've been ignoring this issue :-).

Maybe we could reuse the code in ginbtree.c. Looks like Teodor & Oleg had the same problem :).

Modifying the nbtree code doesn't seem that difficult either. AFAICS, the only places where the heap is accessed from within nbtree code are in index building and uniqueness checks.

>> And instead of having separate LOV pages that store a number of LOV items, how about storing each LOV item on a page of its own, and using the rest of the page to store the last chunk of the bitmap. That would eliminate one page access, but more importantly, maybe we could then get rid of all the bm_last_* attributes in BMLOVItemData that complicate the patch quite a bit, while preserving the performance.
>
> That's an interesting approach. We would still need a concept of last_word, at the very least, and probably last_comp_word for convenience.

Why?

> PS: Another version of the patch shall be forthcoming shortly. I've been working on compressing the data in memory during CREATE INDEX instead of just managing arrays of TIDs in memory as we did previously. The array of TIDs works great for well-clustered data but it stinks for poorly clustered data as we approach maintenance_work_mem and have to swap a lot.

Ok, sounds good.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Bitmap index thoughts
>> And instead of having separate LOV pages that store a number of LOV items, how about storing each LOV item on a page of its own, and using the rest of the page to store the last chunk of the bitmap. That would eliminate one page access, but more importantly, maybe we could then get rid of all the bm_last_* attributes in BMLOVItemData that complicate the patch quite a bit, while preserving the performance.
>
> That's an interesting approach. We would still need a concept of last_word, at the very least, and probably last_comp_word for convenience. Your idea doesn't require any extra space, either, which is good.
>
> Something I've been working through is the idea of a 'bitmap data segment'. Currently, we store the HRL compressed bitmap data to the extent of the page. That is, sizeof(BMBitmap) is as close to BLCKSZ as we can make it. The problem is that we may have some bitmaps where a few values occur only a small number of times and are well clustered at the beginning of the heap. In that circumstance, we use a whole page to store a small number of words and the free space cannot be used by any other vector. Now, if a segment were some fraction of the size of BLCKSZ, we would use less space for those vectors with few tuples in the heap. Just an idea...

The "bitmap data segment" sounds good in terms of space. The problem is that one bitmap is likely to occupy more pages than before, which may hurt the query performance. I have been thinking along the lines of increasing the number of last bitmap words stored in each LOV item, but not so many as to occupy a whole page. This may prevent some of the cases Gavin indicated here, but not all.

Thanks,

Jie
Re: [HACKERS] Bitmap index thoughts
Hey Heikki,

On Tue, 26 Dec 2006, Heikki Linnakangas wrote:

> I've been skimming through the bitmap index patch...
>
> A scan needs to access at least five pages:
>
> 1. B-tree index (root+others, depending on depth)
> 2. The auxiliary heap page
> 3. bitmap index meta page
> 4. LOV page
> 5. bitmap page
>
> That seems like a lot of indirection. A high startup cost is probably ok

You're right, this is excessive, and it was something I'd hoped to have ripped out, but...

> for typical bitmap index use cases and most of the needed pages should stay in memory, but could we simplify this? Why do we need the auxiliary heap, couldn't we just store the blk+offset of the LOV item directly in the b-tree index item?

The problem is, the b-tree code is very much tied to the heap. I don't want to modify the b-tree code to make bitmap indexes work (better). What's really tempting is to just manage our own balanced tree within the bitmap index file(s) itself. It would start from the metapage and simply spill to other 'special' index pages when necessary. The problem is, we do not have b-tree code generic enough that it would allow us to do this trivially -- consider concurrency and WAL in particular, which we currently get for free. I guess this is why I've been ignoring this issue :-).

> And instead of having separate LOV pages that store a number of LOV items, how about storing each LOV item on a page of its own, and using the rest of the page to store the last chunk of the bitmap. That would eliminate one page access, but more importantly, maybe we could then get rid of all the bm_last_* attributes in BMLOVItemData that complicate the patch quite a bit, while preserving the performance.

That's an interesting approach. We would still need a concept of last_word, at the very least, and probably last_comp_word for convenience. Your idea doesn't require any extra space, either, which is good.

Something I've been working through is the idea of a 'bitmap data segment'. Currently, we store the HRL compressed bitmap data to the extent of the page. That is, sizeof(BMBitmap) is as close to BLCKSZ as we can make it. The problem is that we may have some bitmaps where a few values occur only a small number of times and are well clustered at the beginning of the heap. In that circumstance, we use a whole page to store a small number of words and the free space cannot be used by any other vector. Now, if a segment were some fraction of the size of BLCKSZ, we would use less space for those vectors with few tuples in the heap. Just an idea...

Thanks,

Gavin

PS: Another version of the patch shall be forthcoming shortly. I've been working on compressing the data in memory during CREATE INDEX instead of just managing arrays of TIDs in memory as we did previously. The array of TIDs works great for well-clustered data but it stinks for poorly clustered data as we approach maintenance_work_mem and have to swap a lot.
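[The build-time change described in the PS can be sketched as follows: instead of buffering a raw TID array per key (cheap for clustered input, huge for scattered input), compress each key's TID stream into contiguous runs as it arrives, so memory tracks the number of runs rather than the number of TIDs. This is a toy Python illustration of the idea only, not the patch's actual in-memory structures.]

```python
def buffer_raw(tids_per_key):
    """Naive build buffer: one full TID list per key."""
    return {k: list(tids) for k, tids in tids_per_key.items()}

def buffer_compressed(tids_per_key):
    """Run-compressed build buffer: consecutive TIDs collapse to (start, end)."""
    out = {}
    for k, tids in tids_per_key.items():
        runs, start, prev = [], None, None
        for t in sorted(tids):
            if prev is not None and t == prev + 1:
                prev = t          # extend the current run
                continue
            if start is not None:
                runs.append((start, prev))  # close the previous run
            start = prev = t
        if start is not None:
            runs.append((start, prev))
        out[k] = runs
    return out
```

For well-clustered keys each key collapses to a handful of runs, which is why the compressed buffer approaches the memory limit far more slowly.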