Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-06-07 Thread Michael Paquier
On Wed, Jun 8, 2016 at 1:23 AM, Tom Lane  wrote:
> "David G. Johnston"  writes:
>> On Tue, Jun 7, 2016 at 1:35 AM, Michael Paquier 
>> wrote:
>>> I have finally given a shot at improving the docs with the attached.
>>> Comments are welcome.
>
>> [ assorted comments ]
>
> I adopted most of David's suggestions, whacked it around a bit further
> myself, and committed.  See what you think.

That looks better, thanks.

>> It would be nice to give guidance on selecting a bit size for columns and
>> a signature length.  Yes, Wikipedia covers the topic but to get the reader
>> started some discussion of the relevant trade-offs when using larger
>> numbers than the default would be nice.  I don't suspect using smaller the
>> default values is apt to be worthwhile...
>
> Agreed, but I didn't want to write such text myself.  There's room for
> further improvement here.  I did add a note in the main example about
> what happens with a non-default signature length, but that hardly
> constitutes guidance.
>
> BTW, it seemed to me while generating the example that the planner's
> costing for bloom index searches was unduly pessimistic; maybe there's
> work to do there?

I wanted them to do so to prove that index rechecks are necessary as
false positives can be returned when scanning the index. We could add
an extra example with an index that has a longer signature size... I
am not sure that's worth the complication though.

I am marking this item as closed, in my view things are looking far better.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-06-07 Thread Tom Lane
"David G. Johnston"  writes:
> Do I understand the process correctly?  The current 9.6 docs reflect what
> was committed and branched as beta1.  Ongoing work is done against master
> (devel docs).  When beta2 is released it is branched from the current
> master; not beta1.  At that point the current devel docs will become the
> new 9.6 docs.

There is no beta1 branch.  9.6 is still the master branch, and beta2 will
be stamped on master, as will at least the next couple of betas.  We will
branch off REL9_6_STABLE whenever we're ready to open the tree for
9.7^H^H^H10 development, which likely won't be till September.  (The
point here is to try to keep people focused on stabilizing 9.6 for
awhile longer yet.  Last year we opened new development in the summer,
and that had a detrimental effect on getting 9.5 out the door.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-06-07 Thread David G. Johnston
On Tue, Jun 7, 2016 at 12:23 PM, Tom Lane  wrote:

> "David G. Johnston"  writes:
> > On Tue, Jun 7, 2016 at 1:35 AM, Michael Paquier <
> michael.paqu...@gmail.com>
> > wrote:
> >> I have finally given a shot at improving the docs with the attached.
> >> Comments are welcome.
>
> > [ assorted comments ]
>
> I adopted most of David's suggestions, whacked it around a bit further
> myself, and committed.  See what you think.
>

Looks good though I'm waiting for the website to update.

Do I understand the process correctly?  The current 9.6 docs reflect what
was committed and branched as beta1.  Ongoing work is done against master
(devel docs).  When beta2 is released it is branched from the current
master; not beta1.  At that point the current devel docs will become the
new 9.6 docs.

​Thanks!​

​David J.
​


Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-06-07 Thread Tom Lane
"David G. Johnston"  writes:
> On Tue, Jun 7, 2016 at 1:35 AM, Michael Paquier 
> wrote:
>> I have finally given a shot at improving the docs with the attached.
>> Comments are welcome.

> [ assorted comments ]

I adopted most of David's suggestions, whacked it around a bit further
myself, and committed.  See what you think.

> ​It would be nice to give guidance on selecting a bit size for columns and
> a signature length​.  Yes, Wikipedia covers the topic but to get the reader
> started some discussion of the relevant trade-offs when using larger
> numbers than the default would be nice.  I don't suspect using smaller the
> default values is apt to be worthwhile...

Agreed, but I didn't want to write such text myself.  There's room for
further improvement here.  I did add a note in the main example about
what happens with a non-default signature length, but that hardly
constitutes guidance.

BTW, it seemed to me while generating the example that the planner's
costing for bloom index searches was unduly pessimistic; maybe there's
work to do there?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-06-07 Thread David G. Johnston
On Tue, Jun 7, 2016 at 1:35 AM, Michael Paquier 
wrote:

> On Fri, Jun 3, 2016 at 11:25 PM, Tom Lane  wrote:
> > Michael Paquier  writes:
> >> Actually, the docs could be more polished.
> >
> > I think the docs could stand to be rewritten from scratch ;-).  But
> > upthread there was an offer to work on them if we made the code behavior
> > saner.  I've done the latter part, I don't want to do the former.
>
> I have finally given a shot at improving the docs with the attached.
> Comments are welcome.


​It would be nice to give guidance on selecting a bit size for columns and
a signature length​.  Yes, Wikipedia covers the topic but to get the reader
started some discussion of the relevant trade-offs when using larger
numbers than the default would be nice.  I don't suspect using smaller the
default values is apt to be worthwhile...

David J.


Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-06-07 Thread David G. Johnston
On Tue, Jun 7, 2016 at 1:35 AM, Michael Paquier 
wrote:

> On Fri, Jun 3, 2016 at 11:25 PM, Tom Lane  wrote:
> > Michael Paquier  writes:
> >> Actually, the docs could be more polished.
> >
> > I think the docs could stand to be rewritten from scratch ;-).  But
> > upthread there was an offer to work on them if we made the code behavior
> > saner.  I've done the latter part, I don't want to do the former.
>
> I have finally given a shot at improving the docs with the attached.
> Comments are welcome.
>

​Looks good.  Thanks!​

​Some minor word-smithing​ related stuff and one definitional concern:

​"of all indexed attributes and so it can report false positives" -> of all
indexed attributes and as such is prone to reporting false positives;

​"in the set, however" -> "in the set although"
​
​"one only needs a single bloom index (default 80, maximum 4096)" -> ​the
default seems like it would be better placed in the first paragraph of the
intro where "whose size in calculated in bits" is mentioned; or better yet
dropped altogether since the parameters section covers the defaults.

*** "to the number of the column for" - the examples imply that each
parameter refers to columns by name, not number.

"a bloom index  representing first the advantage to be more" - this intro
to the example needs some work.  maybe: "Here is a more complete example of
index definition and usage, as well as a comparison with the equivalent
btree index.  The bloom index is considerably smaller as well as performs
better than the btree index.

---As an aside, is a multi-column index really a fair comparison here?

---Leaving a sequential scan explain analyze in place should be considered.

​"The Bloom opclass interface" -> The Bloom opclass interface requires a
hash function for the indexing datatype and an equality operator for
searching.  The example...(drop the simple conclusion the word the equality
operator part better).

​"are implemented with the module" - are supplied by this module. (side
question, for 10.0 how about we call these extensions instead of modules?)

David J.


Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-06-06 Thread Michael Paquier
On Fri, Jun 3, 2016 at 11:25 PM, Tom Lane  wrote:
> Michael Paquier  writes:
>> Actually, the docs could be more polished.
>
> I think the docs could stand to be rewritten from scratch ;-).  But
> upthread there was an offer to work on them if we made the code behavior
> saner.  I've done the latter part, I don't want to do the former.

I have finally given a shot at improving the docs with the attached.
Comments are welcome.
-- 
Michael
diff --git a/doc/src/sgml/bloom.sgml b/doc/src/sgml/bloom.sgml
index 8667763..c1e204f 100644
--- a/doc/src/sgml/bloom.sgml
+++ b/doc/src/sgml/bloom.sgml
@@ -8,43 +8,38 @@
  
 
  
-  bloom is a module that implements an index access method.  It comes
-  as an example of custom access methods and generic WAL record usage.  But it
-  is also useful in itself.
+  bloom provides an index access method usable as a
+  http://en.wikipedia.org/wiki/Bloom_filter;>Bloom filter.
  
 
- 
-  Introduction
-
-  
-   The implementation of a
-   http://en.wikipedia.org/wiki/Bloom_filter;>Bloom filter
-   allows fast exclusion of non-candidate tuples via signatures.
-   Since a signature is a lossy representation of all indexed attributes,
-   search results must be rechecked using heap information.
-   The user can specify signature length in bits (default 80, maximum 4096)
-   and the number of bits generated for each index column (default 2,
-   maximum 4095).
-  
+ 
+  A bloom filter is a space-efficient data structure that is used to test
+  whether an element is a member of a set.  In the case of an index access
+  method, it allows fast exclusion of non-candidate tuples via signatures
+  whose size is calculated in bits and is controllable at index creation.
+ 
 
-  
-   This index is useful if a table has many attributes and queries include
-   arbitrary combinations of them.  A traditional btree index is
-   faster than a bloom index, but it can require many indexes to support all
-   possible queries where one needs only a single bloom index.  A Bloom index
-   supports only equality comparison.  Since it's a signature file, and not a
-   tree, it always must be read fully, but sequentially, so that index search
-   performance is constant and doesn't depend on a query.
-  
- 
+ 
+  A signature is a lossy representation of all indexed attributes and
+  so it can report false positives (it is reported that an element exists in
+  the set, however it does not), so search results must be rechecked using
+  heap information.
+ 
+ 
+ 
+  This type of index is useful if a table has many attributes and queries
+  include arbitrary combinations of them.  A traditional btree
+  index is faster than a bloom index, but it can require many indexes to
+  support all possible queries where one needs only a single bloom index
+  (default 80, maximum 4096).
+ 
 
  
   Parameters
 
   
-   bloom indexes accept the following parameters in the
-   WITH
-   clause.
+   A bloom index accepts the following parameters in its
+   WITH clause.
   
 

@@ -52,7 +47,8 @@
 length
 
  
-  Length of signature in bits
+  Length of signature in bits. Default is 80 and maximum is
+  4096.
  
 

@@ -62,7 +58,9 @@
 col1  col32
 
  
-  Number of bits generated for each index column
+  Number of bits generated for each index column. Each parameter refers
+  to the number of the column for the relation on which an index is
+  defined.
  
 

@@ -82,91 +80,88 @@ CREATE INDEX bloomidx ON tbloom USING bloom (i1,i2,i3)
 
 
   
-   Here, we created a bloom index with a signature length of 80 bits,
-   and attributes i1 and i2 mapped to 2 bits, and attribute i3 mapped to 4 bits.
+   Here, a bloom index is created with a signature length of 80 bits, and
+   attributes i1 and i2 mapped to 2 bits, and attribute i3 mapped to 4 bits.
   
 
   
-   Here is a fuller example of index definition and usage:
+   Here is a more complete example of index definition and usage, a bloom
+   index  representing first the advantage to be more space-efficient
+   than an equivalent btree index:
   
 
 
-CREATE TABLE tbloom AS
-SELECT
-random()::int as i1,
-random()::int as i2,
-random()::int as i3,
-random()::int as i4,
-random()::int as i5,
-random()::int as i6,
-random()::int as i7,
-random()::int as i8,
-random()::int as i9,
-random()::int as i10,
-random()::int as i11,
-random()::int as i12,
-random()::int as i13
-FROM
-generate_series(1,1000);
-CREATE INDEX bloomidx ON tbloom USING
- bloom (i1, i2, i3, i4, i5, i6, i7, i8, i9, i10, i11, i12);
-SELECT pg_relation_size('bloomidx');
-CREATE index btree_idx ON tbloom(i1,i2,i3,i4,i5,i6,i7,i8,i9,i10,i11,i12);
-SELECT pg_relation_size('btree_idx');
-
-
-
-=# EXPLAIN ANALYZE SELECT * FROM tbloom WHERE i2 = 20 AND i10 = 15;
-   QUERY PLAN

Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-06-03 Thread Tom Lane
Jeff Janes  writes:
> One thing from the commit-message:
> "On-disk, we can still store it in words, so as to not break on-disk
> compatibility with beta1."

> Hasn't that ship already sailed?

No.

> Or is the concern about intra-version pg_upgrade rather than direct
> on-disk compatibility?

This.  You can pg_upgrade across a catversion bump, but not across
changes in user table or index contents.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-06-03 Thread Jeff Janes
On Thu, Jun 2, 2016 at 9:03 PM, Tom Lane  wrote:
> I wrote:
>> Jeff Janes  writes:
>>> My biggest gripe with it at the moment is that the signature size should be
>>> expressed in bits, and then internally rounded up to a multiple of 16,
>>> rather than having it be expressed in 'uint16'.
>>> If that were done it would be easier to fix the documentation to be more
>>> understandable.
>
>> +1 ... that sort of definition seems much more future-proof, too.
>> IMO it's not too late to change this.  (We probably don't want to change
>> the on-disk representation of the reloptions, but we could convert from
>> bits to words in bloptions().)
>
> There were no objections to this, but also no action.  Attached is a draft
> patch ... any complaints?

One thing from the commit-message:

"On-disk, we can still store it in words, so as to not break on-disk
compatibility with beta1."

Hasn't that ship already sailed?

from beta1 to HEAD:

The database cluster was initialized with CATALOG_VERSION_NO
201605051, but the server was compiled with CATALOG_VERSION_NO
201605191.

Or is the concern about intra-version pg_upgrade rather than direct
on-disk compatibility?

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-06-03 Thread Tom Lane
Michael Paquier  writes:
> Actually, the docs could be more polished.

I think the docs could stand to be rewritten from scratch ;-).  But
upthread there was an offer to work on them if we made the code behavior
saner.  I've done the latter part, I don't want to do the former.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-06-03 Thread Michael Paquier
On Fri, Jun 3, 2016 at 3:11 PM, Michael Paquier
 wrote:
> On Fri, Jun 3, 2016 at 1:03 PM, Tom Lane  wrote:
>> I wrote:
>>> Jeff Janes  writes:
 My biggest gripe with it at the moment is that the signature size should be
 expressed in bits, and then internally rounded up to a multiple of 16,
 rather than having it be expressed in 'uint16'.
 If that were done it would be easier to fix the documentation to be more
 understandable.
>>
>>> +1 ... that sort of definition seems much more future-proof, too.
>>> IMO it's not too late to change this.  (We probably don't want to change
>>> the on-disk representation of the reloptions, but we could convert from
>>> bits to words in bloptions().)
>>
>> There were no objections to this, but also no action.  Attached is a draft
>> patch ... any complaints?
>
> None. This looks rather sane to me.

Actually, the docs could be more polished. "Limitation" should be
changed to "Limitations", and "Opclass interface" to "Operator Class
Interface". The current wording "Seqscan is slow" is not clear, this
should mention a sequential scan instead. And it is not that slow,
just slower than the heap index scan of bloom..
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-06-03 Thread Michael Paquier
On Fri, Jun 3, 2016 at 1:03 PM, Tom Lane  wrote:
> I wrote:
>> Jeff Janes  writes:
>>> My biggest gripe with it at the moment is that the signature size should be
>>> expressed in bits, and then internally rounded up to a multiple of 16,
>>> rather than having it be expressed in 'uint16'.
>>> If that were done it would be easier to fix the documentation to be more
>>> understandable.
>
>> +1 ... that sort of definition seems much more future-proof, too.
>> IMO it's not too late to change this.  (We probably don't want to change
>> the on-disk representation of the reloptions, but we could convert from
>> bits to words in bloptions().)
>
> There were no objections to this, but also no action.  Attached is a draft
> patch ... any complaints?

None. This looks rather sane to me.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-06-02 Thread Tom Lane
I wrote:
> Jeff Janes  writes:
>> My biggest gripe with it at the moment is that the signature size should be
>> expressed in bits, and then internally rounded up to a multiple of 16,
>> rather than having it be expressed in 'uint16'.
>> If that were done it would be easier to fix the documentation to be more
>> understandable.

> +1 ... that sort of definition seems much more future-proof, too.
> IMO it's not too late to change this.  (We probably don't want to change
> the on-disk representation of the reloptions, but we could convert from
> bits to words in bloptions().)

There were no objections to this, but also no action.  Attached is a draft
patch ... any complaints?

regards, tom lane

diff --git a/contrib/bloom/bloom.h b/contrib/bloom/bloom.h
index c21eebf..a9daf63 100644
*** a/contrib/bloom/bloom.h
--- b/contrib/bloom/bloom.h
*** typedef BloomPageOpaqueData *BloomPageOp
*** 79,96 
  #define BLOOM_HEAD_BLKNO		(1)		/* first data page */
  
  /*
!  * Maximum of bloom signature length in uint16. Actual value
!  * is 512 bytes
   */
! #define MAX_BLOOM_LENGTH		(256)
  
  /* Bloom index options */
  typedef struct BloomOptions
  {
  	int32		vl_len_;		/* varlena header (do not touch directly!) */
! 	int			bloomLength;	/* length of signature in uint16 */
! 	int			bitSize[INDEX_MAX_KEYS];		/* signature bits per index
!  * key */
  }	BloomOptions;
  
  /*
--- 79,108 
  #define BLOOM_HEAD_BLKNO		(1)		/* first data page */
  
  /*
!  * We store Bloom signatures as arrays of uint16 words.
   */
! typedef uint16 BloomSignatureWord;
! 
! #define SIGNWORDBITS ((int) (BITS_PER_BYTE * sizeof(BloomSignatureWord)))
! 
! /*
!  * Default and maximum Bloom signature length in bits.
!  */
! #define DEFAULT_BLOOM_LENGTH	(5 * SIGNWORDBITS)
! #define MAX_BLOOM_LENGTH		(256 * SIGNWORDBITS)
! 
! /*
!  * Default and maximum signature bits generated per index key.
!  */
! #define DEFAULT_BLOOM_BITS		2
! #define MAX_BLOOM_BITS			(MAX_BLOOM_LENGTH - 1)
  
  /* Bloom index options */
  typedef struct BloomOptions
  {
  	int32		vl_len_;		/* varlena header (do not touch directly!) */
! 	int			bloomLength;	/* length of signature in words (not bits!) */
! 	int			bitSize[INDEX_MAX_KEYS];	/* signature bits per index key */
  }	BloomOptions;
  
  /*
*** typedef struct BloomState
*** 143,154 
  /*
   * Tuples are very different from all other relations
   */
- typedef uint16 SignType;
- 
  typedef struct BloomTuple
  {
  	ItemPointerData heapPtr;
! 	SignType	sign[FLEXIBLE_ARRAY_MEMBER];
  }	BloomTuple;
  
  #define BLOOMTUPLEHDRSZ offsetof(BloomTuple, sign)
--- 155,164 
  /*
   * Tuples are very different from all other relations
   */
  typedef struct BloomTuple
  {
  	ItemPointerData heapPtr;
! 	BloomSignatureWord	sign[FLEXIBLE_ARRAY_MEMBER];
  }	BloomTuple;
  
  #define BLOOMTUPLEHDRSZ offsetof(BloomTuple, sign)
*** typedef struct BloomTuple
*** 156,162 
  /* Opaque data structure for bloom index scan */
  typedef struct BloomScanOpaqueData
  {
! 	SignType   *sign;			/* Scan signature */
  	BloomState	state;
  }	BloomScanOpaqueData;
  
--- 166,172 
  /* Opaque data structure for bloom index scan */
  typedef struct BloomScanOpaqueData
  {
! 	BloomSignatureWord   *sign;			/* Scan signature */
  	BloomState	state;
  }	BloomScanOpaqueData;
  
*** extern void BloomFillMetapage(Relation i
*** 170,176 
  extern void BloomInitMetapage(Relation index);
  extern void BloomInitPage(Page page, uint16 flags);
  extern Buffer BloomNewBuffer(Relation index);
! extern void signValue(BloomState * state, SignType * sign, Datum value, int attno);
  extern BloomTuple *BloomFormTuple(BloomState * state, ItemPointer iptr, Datum *values, bool *isnull);
  extern bool BloomPageAddItem(BloomState * state, Page page, BloomTuple * tuple);
  
--- 180,186 
  extern void BloomInitMetapage(Relation index);
  extern void BloomInitPage(Page page, uint16 flags);
  extern Buffer BloomNewBuffer(Relation index);
! extern void signValue(BloomState * state, BloomSignatureWord * sign, Datum value, int attno);
  extern BloomTuple *BloomFormTuple(BloomState * state, ItemPointer iptr, Datum *values, bool *isnull);
  extern bool BloomPageAddItem(BloomState * state, Page page, BloomTuple * tuple);
  
diff --git a/contrib/bloom/blscan.c b/contrib/bloom/blscan.c
index aebf32a..0c954dc 100644
*** a/contrib/bloom/blscan.c
--- b/contrib/bloom/blscan.c
*** blgetbitmap(IndexScanDesc scan, TIDBitma
*** 93,99 
  		/* New search: have to calculate search signature */
  		ScanKey		skey = scan->keyData;
  
! 		so->sign = palloc0(sizeof(SignType) * so->state.opts.bloomLength);
  
  		for (i = 0; i < scan->numberOfKeys; i++)
  		{
--- 93,99 
  		/* New search: have to calculate search signature */
  		ScanKey		skey = scan->keyData;
  
! 		so->sign = palloc0(sizeof(BloomSignatureWord) * so->state.opts.bloomLength);
  
  		for (i = 0; i < 

Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-05-25 Thread Tom Lane
Jeff Janes  writes:
> Given what a Bloom filter is/does, I'm having a hard time seeing how it
> makes much sense to support the boolean type.

> My biggest gripe with it at the moment is that the signature size should be
> expressed in bits, and then internally rounded up to a multiple of 16,
> rather than having it be expressed in 'uint16'.

> If that were done it would be easier to fix the documentation to be more
> understandable.

+1 ... that sort of definition seems much more future-proof, too.
IMO it's not too late to change this.  (We probably don't want to change
the on-disk representation of the reloptions, but we could convert from
bits to words in bloptions().)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-05-25 Thread Jeff Janes
On May 24, 2016 5:27 PM, "David G. Johnston" 
wrote:
>
> Moving my griping to -hackers only
>
> On Tue, May 24, 2016 at 8:08 PM, Tom Lane  wrote:
>>
>> dig...@126.com writes:
>> > postgres=# create unlogged table u_tbl (id int);
>> > CREATE TABLE
>> > postgres=# create index idx_u_tbl on u_tbl using bloom (id);
>> > ERROR:  index "idx_u_tbl" already contains data
>>
>> Yeah, it looks like nobody ever tested bloom's unlogged-index support;
>> it doesn't work or even come very close to working.  Will fix, thanks
>> for the report!
>
>
> ​I'll tack on my own gripe here, just because.
>
> It doesn't give me a lot of confidence in what was committed when the
summary sentence for the module says:
>
> "
> bloom is a module which implements an index access method. It comes as an
example of custom access methods and generic WAL records usage. But it is
also useful in itself.
> ​"​
>
>
> ​Honestly, as a user I couldn't care less that bloom is "an example
custom access method"​.  I want to know what it does and that it does so
reliably, and has a easy-to-use interface.  I complained earlier about its
lack of direct support for the boolean type.  Teodor's response on the
thread wasn't particularly encouraging:
>

Given what a Bloom filter is/does, I'm having a hard time seeing how it
makes much sense to support the boolean type.

My biggest gripe with it at the moment is that the signature size should be
expressed in bits, and then internally rounded up to a multiple of 16,
rather than having it be expressed in 'uint16'.

If that were done it would be easier to fix the documentation to be more
understandable.

On the positive side, I've done extensive crash-recovery testing (not with
unlogged tables, obviously) and that part seems solid.

Cheers,

Jeff


Re: [HACKERS] [BUGS] BUG #14155: bloom index error with unlogged table

2016-05-24 Thread David G. Johnston
Moving my griping to -hackers only

On Tue, May 24, 2016 at 8:08 PM, Tom Lane  wrote:

> dig...@126.com writes:
> > postgres=# create unlogged table u_tbl (id int);
> > CREATE TABLE
> > postgres=# create index idx_u_tbl on u_tbl using bloom (id);
> > ERROR:  index "idx_u_tbl" already contains data
>
> Yeah, it looks like nobody ever tested bloom's unlogged-index support;
> it doesn't work or even come very close to working.  Will fix, thanks
> for the report!
>

​I'll tack on my own gripe here, just because.

It doesn't give me a lot of confidence in what was committed when the
summary sentence for the module says:

"
bloom is a module which implements an index access method. It comes as an
example of custom access methods and generic WAL records usage. But it is
also useful in itself.
​"​


​Honestly, as a user I couldn't care less that bloom is "an example custom
access method"​.  I want to know what it does and that it does so reliably,
and has a easy-to-use interface.  I complained earlier about its lack of
direct support for the boolean type.  Teodor's response on the thread
wasn't particularly encouraging:

https://www.postgresql.org/message-id/5718a59d.4090...@sigaev.ru

I also see that the following -hacker thread didn't get resolved:

https://www.postgresql.org/message-id/cakfquwykrepeselfwb0b85dat466lely8ao-okpwaqpwtmg...@mail.gmail.com

I would not be surprised to see additional problems crop up in the module.
Tom's characterization above just reinforces that.

David J.