Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-16 Thread Amit Kapila
On Sat, Nov 8, 2014 at 1:26 AM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:


 I just pushed this, after some more minor tweaks.  Thanks, and please do
 continue testing!


Few typo's and few questions

1.  * range.  Need to an extra flag in mmtuples for that.
 */
Datum
brinbulkdelete(PG_FUNCTION_ARGS)

Isn't the part of comment referring *mmtuples* require some change,
as I think mmtuples was used in initial version of patch.

2.
/* ---
 * mt_info is laid out in the following fashion:
 *
 * 7th (high)
bit: has nulls
 * 6th bit: is placeholder tuple
 * 5th bit: unused
 * 4-0 bit: offset of data
 * ---
 */
uint8 bt_info;
} BrinTuple;

Here in comments, bt_info is referred as mt_info.

3.
/*
 * t_info manipulation macros
 */
#define BRIN_OFFSET_MASK 0x1F

I think in above comment it should be bt_info, rather than t_info.

4.
static void
revmap_physical_extend(BrinRevmap *revmap)
{
..
..
START_CRIT_SECTION();

/* the rm_tids array is initialized to all invalid by PageInit */
brin_page_init(page, BRIN_PAGETYPE_REVMAP);
MarkBufferDirty(buf);

metadata-lastRevmapPage = mapBlk;
MarkBufferDirty(revmap-rm_metaBuf);
..
}

Can't we update revmap-rm_lastRevmapPage along with metadata-lastRevmap?

5.
typedef struct BrinMemTuple
{
bool bt_placeholder; /* this is a placeholder tuple */
BlockNumber bt_blkno; /* heap blkno that the tuple is for */
MemoryContext bt_context; /*
memcxt holding the dt_column values */
..
}

How is this memory context getting used?

I could see that this is used brin_deform_tuple() which gets called from
3 other places in core code bringetbitmap(), brininsert() and union_tuples()
and in all the 3 places there is already another temporaray memory context
used to avoid any form of memory leaks.

6.
Is there anyway to force brin index to be off, if not, then do we need it
as it is present for other type of scan's.

like set enable_indexscan=off;


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-11 Thread Greg Stark
On Tue, Nov 11, 2014 at 2:14 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 I'm not sure why you say opaque is not associated with the specific
 scan.  Are you thinking we could reuse opaque for a future scan?  I
 think we could consider that opaque *is* the place to cache things such
 as the hashed value of the qual constants or whatever.

Oh. I guess this goes back to my original suggestion that the API docs
need to explain some sense of when OpcInfo is called. I didn't realize
it was tied to a specific scan. This does raise the question of why
the scan information isn't available in OpcInfo though. That would let
me build the hash value in a natural place instead of having to do it
lazily which I find significantly more awkward.

Is it possible for scan keys to change between calls for nested loop
joins or quirky SQL with volatile functions in the scan or anything? I
guess that would prevent the index scan from being used at all. But I
can be reassured the Opcinfo call will be called again when a cached
plan is reexecuted? Stable functions might have new values in a
subsequent execution even if the plan hasn't changed at all for
example.


 That's to test the Union procedure; if you look at the code, it's just
 used in assert-enabled builds.  Now that I think about it, perhaps this
 can turn out to be problematic for your bloom filter opclass.  I
 considered the idea of allowing the opclass to disable this testing
 procedure, but it isn't done (yet.)

No, it isn't a problem for my opclass other than performance, it was
quite helpful in turning up bugs early in fact. It was just a bit
confusing because I was trying to test things one by one and it turned
out the assertion checks meant a simple insert turned up bugs in Union
which I hadn't expected. But it seems perfectly sensible in an
assertion check.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-11 Thread Alvaro Herrera
Greg Stark wrote:
 On Tue, Nov 11, 2014 at 2:14 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  I'm not sure why you say opaque is not associated with the specific
  scan.  Are you thinking we could reuse opaque for a future scan?  I
  think we could consider that opaque *is* the place to cache things such
  as the hashed value of the qual constants or whatever.
 
 Oh. I guess this goes back to my original suggestion that the API docs
 need to explain some sense of when OpcInfo is called. I didn't realize
 it was tied to a specific scan. This does raise the question of why
 the scan information isn't available in OpcInfo though. That would let
 me build the hash value in a natural place instead of having to do it
 lazily which I find significantly more awkward.

Hmm.  OpcInfo is also called in contexts other than scans, though, so
passing down scan keys into it seems wrong.  Maybe we do need another
amproc that initializes the scan for the opclass, which would get
whatever got returned from opcinfo as well as scankeys.  There you would
have the opportunity to run the hash and store it into the opaque.

 Is it possible for scan keys to change between calls for nested loop
 joins or quirky SQL with volatile functions in the scan or anything? I
 guess that would prevent the index scan from being used at all. But I
 can be reassured the Opcinfo call will be called again when a cached
 plan is reexecuted? Stable functions might have new values in a
 subsequent execution even if the plan hasn't changed at all for
 example.

As far as I understand, the scan keys don't change within any given
scan; if they do, the rescan AM method is called, at which point we
should reset whatever is cached about the previous scan.

  That's to test the Union procedure; if you look at the code, it's just
  used in assert-enabled builds.  Now that I think about it, perhaps this
  can turn out to be problematic for your bloom filter opclass.  I
  considered the idea of allowing the opclass to disable this testing
  procedure, but it isn't done (yet.)
 
 No, it isn't a problem for my opclass other than performance, it was
 quite helpful in turning up bugs early in fact. It was just a bit
 confusing because I was trying to test things one by one and it turned
 out the assertion checks meant a simple insert turned up bugs in Union
 which I hadn't expected. But it seems perfectly sensible in an
 assertion check.

Great, thanks.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-11 Thread Greg Stark
On Tue, Nov 11, 2014 at 12:12 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 As far as I understand, the scan keys don't change within any given
 scan; if they do, the rescan AM method is called, at which point we
 should reset whatever is cached about the previous scan.

But am I guaranteed that rescan will throw away the opcinfo struct and
its opaque element? I guess that's the heart of the uncertainty I had.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-11 Thread Alvaro Herrera
Greg Stark wrote:
 On Tue, Nov 11, 2014 at 12:12 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  As far as I understand, the scan keys don't change within any given
  scan; if they do, the rescan AM method is called, at which point we
  should reset whatever is cached about the previous scan.
 
 But am I guaranteed that rescan will throw away the opcinfo struct and
 its opaque element? I guess that's the heart of the uncertainty I had.

Well, it should, and if not that's a bug, which should be fixed by the
attached (untested) patch.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index bd35cf6..441127f 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -519,6 +519,7 @@ brinrescan(PG_FUNCTION_ARGS)
 {
 	IndexScanDesc scan = (IndexScanDesc) PG_GETARG_POINTER(0);
 	ScanKey		scankey = (ScanKey) PG_GETARG_POINTER(1);
+	BrinOpaque *opaque;
 
 	/* other arguments ignored */
 
@@ -526,6 +527,14 @@ brinrescan(PG_FUNCTION_ARGS)
 		memmove(scan-keyData, scankey,
 scan-numberOfKeys * sizeof(ScanKeyData));
 
+	/*
+	 * Reinitialize the opaque struct; some opclasses might choose to
+	 * cache per-scan info in there.
+	 */
+	opaque = (BrinOpaque *) scan-opaque;
+	brin_free_desc(opaque-bo_bdesc);
+	opaque-bo_bdesc = brin_build_desc(scan-indexRelation);
+
 	PG_RETURN_VOID();
 }
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-11 Thread Greg Stark
It might be clearer to have an opclassinfo and a scaninfo which can
store information in separate opc_opaque and scan_opaque fields with
distinct liftetimes.

In the bloom filter case the longlived info is the (initial?) size of
the bloom filter and the number of hash functions. But I still haven't
determined how much it will cost to recalculate them. Right now
they're just hard coded so it doesn't hurt to do it on every rescan
but if it involves peeking at the index reloptions or stats that might
be impractical.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-11 Thread Alvaro Herrera
Greg Stark wrote:
 It might be clearer to have an opclassinfo and a scaninfo which can
 store information in separate opc_opaque and scan_opaque fields with
 distinct liftetimes.
 
 In the bloom filter case the longlived info is the (initial?) size of
 the bloom filter and the number of hash functions. But I still haven't
 determined how much it will cost to recalculate them. Right now
 they're just hard coded so it doesn't hurt to do it on every rescan
 but if it involves peeking at the index reloptions or stats that might
 be impractical.

Patches welcome :-)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-11 Thread Greg Stark
On Tue, Nov 11, 2014 at 1:04 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 It might be clearer ...

 Patches welcome :-)

Or perhaps there could still be a single opaque field but have two
optional opclass methods scaninit and rescan which allow the op
class to set or reset whichever fields inside opaque that need to be
reset.


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-10 Thread Alvaro Herrera
Fujii Masao wrote:

 I got the following PANIC error in the standby server when I set up
 the replication servers and ran make installcheck. Note that I was
 repeating the manual CHECKPOINT every second while installcheck
 was running. Without the checkpoints, I could not reproduce the
 problem. I'm not sure if CHECKPOINT really triggers this problem, though.
 Anyway BRIN seems to have a problem around its WAL replay.

Hm, I think I see what's happening.  The xl_brin_update record
references two buffers, one which is target for the updated tuple and
another which is the revmap buffer.  When the update target buffer is
being first used we set the INIT bit which removes the buffer reference
from the xlog record; in that case, if the revmap buffer is first being
modified after the prior checkpoint, that revmap buffer receives backup
block number 0; but the code hardcodes it as 1 on the expectation that
the buffer that's target for the update will receive 0.  The attached
patch should fix this.

I cannot reproduce the issue after applying this patch, can you please
confirm that it fixes the issue for you as well?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/access/brin/brin_xlog.c b/src/backend/access/brin/brin_xlog.c
index ebef984..2937068 100644
--- a/src/backend/access/brin/brin_xlog.c
+++ b/src/backend/access/brin/brin_xlog.c
@@ -60,9 +60,11 @@ brin_xlog_insert_update(XLogRecPtr lsn, XLogRecord *record,
 	 */
 	if (record-xl_info  XLOG_BRIN_INIT_PAGE)
 	{
-		XLogReadBufferForRedoExtended(lsn, record, 0,
-	  xlrec-node, MAIN_FORKNUM, blkno,
-	  RBM_ZERO, false, buffer);
+		/*
+		 * No full-page image here.  Don't try to read it, because there
+		 * might be one for the revmap buffer, below.
+		 */
+		buffer = XLogReadBuffer(xlrec-node, blkno, true);
 		page = BufferGetPage(buffer);
 		brin_page_init(page, BRIN_PAGETYPE_REGULAR);
 		action = BLK_NEEDS_REDO;
@@ -97,7 +99,9 @@ brin_xlog_insert_update(XLogRecPtr lsn, XLogRecord *record,
 		UnlockReleaseBuffer(buffer);
 
 	/* update the revmap */
-	action = XLogReadBufferForRedo(lsn, record, 1, xlrec-node,
+	action = XLogReadBufferForRedo(lsn, record,
+   record-xl_info  XLOG_BRIN_INIT_PAGE ? 0 : 1,
+   xlrec-node,
    xlrec-revmapBlk, buffer);
 	if (action == BLK_NEEDS_REDO)
 	{

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-10 Thread Alvaro Herrera
Alvaro Herrera wrote:

 Hm, I think I see what's happening.  The xl_brin_update record
 references two buffers, one which is target for the updated tuple and
 another which is the revmap buffer.  When the update target buffer is
 being first used we set the INIT bit which removes the buffer reference
 from the xlog record; in that case, if the revmap buffer is first being
 modified after the prior checkpoint, that revmap buffer receives backup
 block number 0; but the code hardcodes it as 1 on the expectation that
 the buffer that's target for the update will receive 0.  The attached
 patch should fix this.

Pushed, thanks for the report.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-10 Thread Alvaro Herrera
Greg Stark wrote:

 1) The manual describes the exensibility API including the BrinOpcInfo
 struct -- but it doesn't define the BrinDesc struct that every API
 method takes. It's not clear what exactly that argument is for or how
 to make use of it.

Hm, I guess this could use some expansion.

 2) The mention about additional opclass operators and to number them
 from 11 up is fine -- but there's no explanation of how to decide what
 operators need to be explicitly added like that. Specifically I gather
 from reading minmax that = is handled internally by Brin and you only
 need to add any other operators aside from = ? Is that right?

I think I already replied to this in the other email.

 3) It's not entirely clear in the docs when each method is will be
 invoked. Specifically it's not clear whether opcInfo is invoked once
 when the index is defined or every time the definition is loaded to be
 used. I gather it's the latter? Perhaps there needs to be a method
 that's invoked specifically when the index is defined? I'm wondering
 where I'm going to hook in the logic to determine the size and number
 of hash functions to use for the bloom filter which needs to be
 decided once when the index is created and then static for the index
 in the future.

Every time the index is accessed, yeah.  I'm not sure about figuring the
initial creation details.  Do you think we need another support
procedure to help with that?  We can add it if needed; minmax would just
define it to InvalidOid.

 4) It doesn't look like BRIN handles cross-type operators at all.

The idea here is that there is a separate opclass to handle cross-type
operators, which would be together in the same opfamily as the opclass
used to create the index.  I haven't actually tried this yet, mind you.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-10 Thread Alvaro Herrera
Greg Stark wrote:
 On Sun, Nov 9, 2014 at 5:57 PM, Greg Stark st...@mit.edu wrote:
  2) The mention about additional opclass operators and to number them
  from 11 up is fine -- but there's no explanation of how to decide what
  operators need to be explicitly added like that. Specifically I gather
  from reading minmax that = is handled internally by Brin and you only
  need to add any other operators aside from = ? Is that right?
 
 I see I totally misunderstood the use of the opclass procedure
 functions. I think I understand now but just to be sure -- If I can
 only handle BTEqualStrategyNumber keys then is it adequate to just
 define the opclass containing only the equality operator?

Yes.

I agree that this deserves some more documentation.  In a nutshell, the
opclass must provide three separate groups of items:

1. the mandatory support functions, opcInfo, addValue, Union,
Consistent.  opcInfo is invoked each time the index is accessed
(including during index creation).

2. the additional support functions; normally these are called from
within addValue, Consistent, Union.  For minmax, what we provide is the
functions that implement the inequality operators for the type, that is
 = = and .  Since minmax tries to be generic and support a whole lot
of types, this is the way that the mandatory support functions know what
functions to call to compare two given values.  If the opclass is
specific to one data type, you might not need anything here; or perhaps
you have other ways to figure out a hash function to call, etc.

3. the operators.  We only use these so that the optimizer picks up the
index for queries.


 Somehow I got confused between the amprocs that minmax uses to
 implement the consistency function and the amops that the brin index
 supports.

I think it is somewhat confusing, yeah.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-10 Thread Greg Stark
On Mon, Nov 10, 2014 at 9:31 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Every time the index is accessed, yeah.  I'm not sure about figuring the
 initial creation details.  Do you think we need another support
 procedure to help with that?  We can add it if needed; minmax would just
 define it to InvalidOid.

I have a working bloom filter with hard coded filter size and hard
coded number of hash functions. I need to think about how I'm going to
make it more general now. I think the answer is that I should have an
index option that specifies the false positive rate and calculates the
optimal filter size and number of hash functions. It might possibly
need to peek at the table statistics to determine the population size
though. Or perhaps I should bite the bullet and size the bloom filters
based on the actual number of rows in a chunk since the BRIN
infrastructure does allow each summary to be a different size.

There's another API question I have. To implement Consistent I need to
call the hash function which in the case of functions like hashtext
could be fairly expensive and I even need to generate multiple hash
values(though currently I'm slicing them all from the integer hash
value so that's not too bad) and then test each of those bits. It
would be natural to call hashtext once at the start of the scan and
possibly build a bitmap and compare all of them in a single 
operation. But afaict there's no way to hook the beginning of the scan
and opaque is not associated with the specific scan so I don't think I
can cache the hash value of the scan key there safely. Is there a good
way to do it with the current API?

On a side note I'm curious about something, I was stepping through the
my code in gdb and discovered that a single row insert appeared to
construct a new summary then union it into the existing summary
instead of just calling AddValue on the existing summary. Is that
intentional? What led to that?

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-10 Thread Alvaro Herrera
Greg Stark wrote:

 There's another API question I have. To implement Consistent I need to
 call the hash function which in the case of functions like hashtext
 could be fairly expensive and I even need to generate multiple hash
 values(though currently I'm slicing them all from the integer hash
 value so that's not too bad) and then test each of those bits. It
 would be natural to call hashtext once at the start of the scan and
 possibly build a bitmap and compare all of them in a single 
 operation. But afaict there's no way to hook the beginning of the scan
 and opaque is not associated with the specific scan so I don't think I
 can cache the hash value of the scan key there safely. Is there a good
 way to do it with the current API?

I'm not sure why you say opaque is not associated with the specific
scan.  Are you thinking we could reuse opaque for a future scan?  I
think we could consider that opaque *is* the place to cache things such
as the hashed value of the qual constants or whatever.

 On a side note I'm curious about something, I was stepping through the
 my code in gdb and discovered that a single row insert appeared to
 construct a new summary then union it into the existing summary
 instead of just calling AddValue on the existing summary. Is that
 intentional? What led to that?

That's to test the Union procedure; if you look at the code, it's just
used in assert-enabled builds.  Now that I think about it, perhaps this
can turn out to be problematic for your bloom filter opclass.  I
considered the idea of allowing the opclass to disable this testing
procedure, but it isn't done (yet.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-09 Thread Heikki Linnakangas

On 11/09/2014 08:06 AM, Michael Paquier wrote:

On Sat, Nov 8, 2014 at 5:40 PM, David Rowley dgrowle...@gmail.com wrote:

Please find attached another small fix. This time it's just a small typo in
the README, and just some updates to some, now outdated docs.

Speaking about the feature... The index operators are still named with
minmax, wouldn't it be better to switch to brin?


All the built-in opclasses still implement the min-max policy - they 
store the min and max values. BRIN supports other kinds of opclasses, 
like storing a containing box for points, but no such opclasses have 
been implemented yet.


Speaking of which, Alvaro, any chance we could get such on opclass still 
included into 9.5? It would be nice to have one, just to be sure that 
nothing minmax-specific has crept into the BRIN code.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-09 Thread Fujii Masao
On Sat, Nov 8, 2014 at 4:56 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 I just pushed this, after some more minor tweaks.

Nice!

 Thanks, and please do continue testing!

I got the following PANIC error in the standby server when I set up
the replication servers and ran make installcheck. Note that I was
repeating the manual CHECKPOINT every second while installcheck
was running. Without the checkpoints, I could not reproduce the
problem. I'm not sure if CHECKPOINT really triggers this problem, though.
Anyway BRIN seems to have a problem around its WAL replay.

2014-11-09 22:19:42 JST sby1 WARNING:  page 547 of relation
base/16384/30878 does not exist
2014-11-09 22:19:42 JST sby1 CONTEXT:  xlog redo BRIN/UPDATE: rel
1663/16384/30878 heapBlk 6 revmapBlk 1 pagesPerRange 1 old TID (3,2)
TID (547,2)
2014-11-09 22:19:42 JST sby1 PANIC:  WAL contains references to invalid pages
2014-11-09 22:19:42 JST sby1 CONTEXT:  xlog redo BRIN/UPDATE: rel
1663/16384/30878 heapBlk 6 revmapBlk 1 pagesPerRange 1 old TID (3,2)
TID (547,2)
2014-11-09 22:19:47 JST sby1 LOG:  startup process (PID 15230) was
terminated by signal 6: Abort trap
2014-11-09 22:19:47 JST sby1 LOG:  terminating any other active server processes

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-09 Thread Greg Stark
On Sun, Nov 9, 2014 at 9:18 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 Speaking of which, Alvaro, any chance we could get such on opclass still
 included into 9.5? It would be nice to have one, just to be sure that
 nothing minmax-specific has crept into the BRIN code.

I'm trying to do a bloom filter Brin index. I'm already a bit puzzled
by a few things but I've just started so maybe it'll become clear.

From what I've seen so far it feels more likely there's the opposite.
There's some boilerplate that I'm doing that feels like it could be
pushed down into general Brin code since it'll be the same for every
access method.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-09 Thread Greg Stark
On Sun, Nov 9, 2014 at 5:06 PM, Greg Stark st...@mit.edu wrote:
 I'm trying to do a bloom filter Brin index. I'm already a bit puzzled
 by a few things but I've just started so maybe it'll become clear.


So some quick comments from pretty early goings -- partly because I'm
afraid once I get past them I'll forget what it was I was confused
by


1) The manual describes the exensibility API including the BrinOpcInfo
struct -- but it doesn't define the BrinDesc struct that every API
method takes. It's not clear what exactly that argument is for or how
to make use of it.

2) The mention about additional opclass operators and to number them
from 11 up is fine -- but there's no explanation of how to decide what
operators need to be explicitly added like that. Specifically I gather
from reading minmax that = is handled internally by Brin and you only
need to add any other operators aside from = ? Is that right?

3) It's not entirely clear in the docs when each method is will be
invoked. Specifically it's not clear whether opcInfo is invoked once
when the index is defined or every time the definition is loaded to be
used. I gather it's the latter? Perhaps there needs to be a method
that's invoked specifically when the index is defined? I'm wondering
where I'm going to hook in the logic to determine the size and number
of hash functions to use for the bloom filter which needs to be
decided once when the index is created and then static for the index
in the future.

4) It doesn't look like BRIN handles cross-type operators at all. For
example this query with btree indexes can use the index just fine
because it looks up the operator based on both the left and right
operands:

::***# explain select * from data where i = 1::smallint;
┌─┐
│ QUERY PLAN  │
├─┤
│ Index Scan using btree_i on data  (cost=0.42..8.44 rows=1 width=14) │
│   Index Cond: (i = 1::smallint) │
└─┘
(2 rows)

But Minmax opclasses don't contain the cross-type operators and in
fact looking at the code I don't think minmax would be able to cope
(minmax_get_procinfo doesn't even get passed the type int he qual,
only the type of the column).

::***# explain select * from data2 where i = 1::smallint;
┌──┐
│QUERY PLAN│
├──┤
│ Seq Scan on data2  (cost=0.00..18179.00 rows=1 width=14) │
│   Filter: (i = 1::smallint)  │
└──┘
(2 rows)

Time: 0.544 ms

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-09 Thread Greg Stark
On Sun, Nov 9, 2014 at 5:57 PM, Greg Stark st...@mit.edu wrote:
 2) The mention about additional opclass operators and to number them
 from 11 up is fine -- but there's no explanation of how to decide what
 operators need to be explicitly added like that. Specifically I gather
 from reading minmax that = is handled internally by Brin and you only
 need to add any other operators aside from = ? Is that right?

I see I totally misunderstood the use of the opclass procedure
functions. I think I understand now but just to be sure -- If I can
only handle BTEqualStrategyNumber keys then is it adequate to just
define the opclass containing only the equality operator?

Somehow I got confused between the amprocs that minmax uses to
implement the consistency function and the amops that the brin index
supports.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-09 Thread Alvaro Herrera
Heikki Linnakangas wrote:

 Speaking of which, Alvaro, any chance we could get such on opclass still
 included into 9.5? It would be nice to have one, just to be sure that
 nothing minmax-specific has crept into the BRIN code.

Emre Hasegeli contributed a patch for range types.  I am hoping he will
post a rebased version that we can consider including.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-09 Thread Amit Langote
On Sun, Nov 9, 2014 at 10:30 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Sat, Nov 8, 2014 at 4:56 AM, Alvaro Herrera alvhe...@2ndquadrant.com 
 wrote:

 I just pushed this, after some more minor tweaks.

 Nice!

 Thanks, and please do continue testing!

 I got the following PANIC error in the standby server when I set up
 the replication servers and ran make installcheck. Note that I was
 repeating the manual CHECKPOINT every second while installcheck
 was running. Without the checkpoints, I could not reproduce the
 problem. I'm not sure if CHECKPOINT really triggers this problem, though.
 Anyway BRIN seems to have a problem around its WAL replay.

 2014-11-09 22:19:42 JST sby1 WARNING:  page 547 of relation
 base/16384/30878 does not exist
 2014-11-09 22:19:42 JST sby1 CONTEXT:  xlog redo BRIN/UPDATE: rel
 1663/16384/30878 heapBlk 6 revmapBlk 1 pagesPerRange 1 old TID (3,2)
 TID (547,2)
 2014-11-09 22:19:42 JST sby1 PANIC:  WAL contains references to invalid pages
 2014-11-09 22:19:42 JST sby1 CONTEXT:  xlog redo BRIN/UPDATE: rel
 1663/16384/30878 heapBlk 6 revmapBlk 1 pagesPerRange 1 old TID (3,2)
 TID (547,2)
 2014-11-09 22:19:47 JST sby1 LOG:  startup process (PID 15230) was
 terminated by signal 6: Abort trap
 2014-11-09 22:19:47 JST sby1 LOG:  terminating any other active server 
 processes


I could reproduce this using the same steps. It's the same page 547
here too if that's any helpful.

Thanks,
Amit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-08 Thread David Rowley
On Sat, Nov 8, 2014 at 8:56 AM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:


 I just pushed this, after some more minor tweaks.  Thanks, and please do
 continue testing!


Please find attached another small fix. This time it's just a small typo in
the README, and just some updates to some, now outdated docs.

Kind Regards

David Rowley
diff --git a/doc/src/sgml/ref/create_index.sgml 
b/doc/src/sgml/ref/create_index.sgml
index ead3375..18bd0d3 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -57,8 +57,8 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
replaceable class=
 
   para
productnamePostgreSQL/productname provides the index methods
-   B-tree, hash, GiST, SP-GiST, and GIN.  Users can also define their own index
-   methods, but that is fairly complicated.
+   B-tree, hash, GiST, SP-GiST, GIN, and BRIN.  Users can also define their own
+   index methods, but that is fairly complicated.
   /para
 
   para
@@ -166,7 +166,8 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS 
] replaceable class=
para
 The name of the index method to be used.  Choices are
 literalbtree/literal, literalhash/literal,
-literalgist/literal, literalspgist/ and literalgin/.
+literalgist/literal, literalspgist/, literalgin/, and
+literalbrin/.
 The default method is literalbtree/literal.
/para
   /listitem
@@ -492,7 +493,7 @@ Indexes:
   /caution
 
   para
-   Currently, only the B-tree, GiST and GIN index methods support
+   Currently, only the B-tree, GiST, GIN, and BRIN index methods support
multicolumn indexes. Up to 32 fields can be specified by default.
(This limit can be altered when building
productnamePostgreSQL/productname.)  Only B-tree currently
diff --git a/src/backend/access/brin/README b/src/backend/access/brin/README
index 2619be8..636d965 100644
--- a/src/backend/access/brin/README
+++ b/src/backend/access/brin/README
@@ -126,7 +126,7 @@ that page range is complete and new tuples belong in a new 
page range that
 hasn't yet been summarized.  Those insertions do not create a new index
 entry; instead, the page range remains unsummarized until later.
 
-Wehn VACUUM is run on the table, all unsummarized page ranges are
+Whenever VACUUM is run on the table, all unsummarized page ranges are
 summarized.  This action can also be invoked by the user via
 brin_summarize_new_values().  Both these procedures scan all the
 unsummarized ranges, and create a summary tuple.  Again, this includes the

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-08 Thread Michael Paquier
On Sat, Nov 8, 2014 at 5:40 PM, David Rowley dgrowle...@gmail.com wrote:
 Please find attached another small fix. This time it's just a small typo in
 the README, and just some updates to some, now outdated docs.
Speaking about the feature... The index operators are still named with
minmax, wouldn't it be better to switch to brin?
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-07 Thread Alvaro Herrera

I just pushed this, after some more minor tweaks.  Thanks, and please do
continue testing!

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-07 Thread David Rowley
On Sat, Nov 8, 2014 at 8:56 AM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:


 I just pushed this, after some more minor tweaks.  Thanks, and please do
 continue testing!


I'm having problems getting this to compile on MSVC. Attached is a patch
which fixes the problem.

There also seems to be a bit of a problem with:

brin.c(250): warning C4700: uninitialized local variable 'newsz' used

/*
 * Before releasing the lock, check if we can attempt a same-page
 * update.  Another process could insert a tuple concurrently in
 * the same page though, so downstream we must be prepared to cope
 * if this turns out to not be possible after all.
 */
samepage = brin_can_do_samepage_update(buf, origsz, newsz);

LockBuffer(buf, BUFFER_LOCK_UNLOCK);

newtup = brin_form_tuple(bdesc, heapBlk, dtup, newsz);

Here newsz is passed to brin_can_do_samepage_update before being
initialised. I'm not quite sure of the solution here as I've not spent much
time looking at it, but perhaps brin_form_tuple needs to happen before
brin_can_do_samepage_update, then the lock should be released? I didn't
change this in the patch as I'm not sure if that's the proper fix or not.

The attached should fix the build problem that anole is having:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anoledt=2014-11-07%2022%3A04%3A03

Regards

David Rowley
diff --git a/src/include/access/brin_internal.h 
b/src/include/access/brin_internal.h
index 651ab5f..220e9a9 100644
--- a/src/include/access/brin_internal.h
+++ b/src/include/access/brin_internal.h
@@ -78,7 +78,7 @@ typedef struct BrinDesc
 #if defined(BRIN_DEBUG)  defined(__GNUC__)
 #define BRIN_elog(level, ...)  elog(level, __VA_ARGS__)
 #else
-#define BRIN_elog(a)   void(0)
+#define BRIN_elog(a, ...)  ((void)0)
 #endif
 
 /* brin.c */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-07 Thread Tom Lane
David Rowley dgrowle...@gmail.com writes:
 I'm having problems getting this to compile on MSVC. Attached is a patch
 which fixes the problem.

The committed code is completely broken on compilers that don't accept
varargs macros, and this patch will not make them happier.

Probably what needs to happen is to put extra parentheses into the call
sites, along the lines of

   #ifdef BRIN_DEBUG
   #define BRIN_elog(args) elog args
   #else
   #define BRIN_elog(args) ((void) 0)
   #endif


   BRIN_elog((LOG, fmt, ...));


Or we could decide we don't need this debugging crud anymore and just
nuke it all.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-07 Thread Alvaro Herrera
Tom Lane wrote:
 David Rowley dgrowle...@gmail.com writes:
  I'm having problems getting this to compile on MSVC. Attached is a patch
  which fixes the problem.
 
 The committed code is completely broken on compilers that don't accept
 varargs macros, and this patch will not make them happier.

I tried to make it fire only on GCC, which is known to support variadic
macros, but I evidently failed.

 Probably what needs to happen is to put extra parentheses into the call
 sites, along the lines of
 
#ifdef BRIN_DEBUG
#define BRIN_elog(args) elog args
#else
#define BRIN_elog(args) ((void) 0)
#endif
 
 
BRIN_elog((LOG, fmt, ...));

That works for me, thanks for the suggestion.

 Or we could decide we don't need this debugging crud anymore and just
 nuke it all.

I'm removing one which seems pointless, but keeping the others for now.
We can always remove them later.  (I also left BRIN_DEBUG turned on by
default; I'm turning it off.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-07 Thread David Rowley
On Sat, Nov 8, 2014 at 8:56 AM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:


 I just pushed this, after some more minor tweaks.  Thanks, and please do
 continue testing!


Here's another small fix for some unused variable warnings. Unfortunately
this Microsoft compiler that I'm using does not know about
__attribute__((unused)), so some warnings are generated for these:

BrinTuple  *tmptup PG_USED_FOR_ASSERTS_ONLY;
BrinMemTuple *tmpdtup PG_USED_FOR_ASSERTS_ONLY;
Size tmpsiz PG_USED_FOR_ASSERTS_ONLY;

The attached patch moves these into within the #ifdef USE_ASSERT_CHECKING
section.

I know someone will ask so, let me explain: The reason I don't see a bunch
of other warnings for PG_USED_FOR_ASSERTS_ONLY vars when compiling without
assert checks, is that this Microsoft compiler seems to be ok with
variables being assigned values and the values never being used, but if the
variable is never assigned a value, then it'll warn you of that.

Regards

David Rowley
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index ae65549..7042f67 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -107,9 +107,6 @@ brininsert(PG_FUNCTION_ARGS)
BrinMemTuple *dtup;
BlockNumber heapBlk;
int keyno;
-   BrinTuple  *tmptup PG_USED_FOR_ASSERTS_ONLY;
-   BrinMemTuple *tmpdtup PG_USED_FOR_ASSERTS_ONLY;
-   Size tmpsiz PG_USED_FOR_ASSERTS_ONLY;
 
CHECK_FOR_INTERRUPTS();
 
@@ -139,6 +136,10 @@ brininsert(PG_FUNCTION_ARGS)
 
 #ifdef USE_ASSERT_CHECKING
{
+   BrinTuple  *tmptup;
+   BrinMemTuple *tmpdtup;
+   Size tmpsiz;
+
/*
 * When assertions are enabled, we use this as an 
opportunity to
 * test the union method, which would otherwise be 
used very

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-06 Thread Alvaro Herrera
Jeff Janes wrote:
 On Wed, Nov 5, 2014 at 12:54 PM, Alvaro Herrera alvhe...@2ndquadrant.com
 wrote:
 
 Thanks for the updated patch.
 
 Now when I run the test program (version with better error reporting
 attached), it runs fine until I open a psql session and issue:
 
 reindex table foo;

Interesting.  This was a more general issue actually -- if you dropped
the index at that point and created it again, the resulting index would
also be corrupt in the same way.  Inspecting with the supplied
pageinspect functions made the situation pretty obvious.  The old code
was skipping page ranges in which it could not find any tuples, but
that's bogus and inefficient.  I changed an if into a loop that
inserts intermediary tuples, if any are needed.  I cannot reproduce that
problem anymore.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


brin-24.patch.gz
Description: application/gzip

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-05 Thread Jeff Janes
On Tue, Nov 4, 2014 at 2:28 PM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:


 I was clearly too careless about testing the xlog code --- it had
 numerous bugs.  This version should be a lot better, but there might be
 problems lurking still as I don't think I covered it all.  Let me know
 if you see anything wrong.



At line 252 of brin_xlog.c, should the UnlockReleaseBuffer(metabuf) be
protected by a BufferIsValid?

XLogReadBufferForRedo says that it might return an invalid buffer under
some situations.  Perhaps it is known that those situations can't apply
here?


Now I am getting segfaults during normal (i.e. no intentional crashes)
operations.  I think I was seeing them sometimes before as well, I just
wasn't looking for them.

The attached script invokes the segfault within a few minutes.  A lot of
the stuff in the script is probably not necessary, I just didn't spend the
time to pair it down to the essentials.  It does not need to be in
parallel, I get the crash when invoked with only one job (perl ~/
brin_crash.pl 1).

I think this is related to having block ranges which have no tuples in them
when they are first summarized.  If I take out the with t as (delete from
foo returning *) insert into foo select * from t, then I don't see the
crashes


#0  0x0089ed3e in pg_detoast_datum_packed (datum=0x0) at fmgr.c:2270
#1  0x00869be9 in text_le (fcinfo=0x7fff1bf6b9f0) at varlena.c:1661
#2  0x0089cfc7 in FunctionCall2Coll (flinfo=0x297e640,
collation=100, arg1=0, arg2=43488216) at fmgr.c:1324
#3  0x004678f8 in minmaxConsistent (fcinfo=0x7fff1bf6be40) at
brin_minmax.c:213
#4  0x0089d0c9 in FunctionCall3Coll (flinfo=0x297b830,
collation=100, arg1=43509512, arg2=43510296, arg3=43495856) at fmgr.c:1349
#5  0x00462484 in bringetbitmap (fcinfo=0x7fff1bf6c310) at
brin.c:469
#6  0x0089cfc7 in FunctionCall2Coll (flinfo=0x28f2440, collation=0,
arg1=43495712, arg2=43497376) at fmgr.c:1324
#7  0x004b3fc9 in index_getbitmap (scan=0x297b120,
bitmap=0x297b7a0) at indexam.c:651
#8  0x0062ece0 in MultiExecBitmapIndexScan (node=0x297af30) at
nodeBitmapIndexscan.c:89
#9  0x00619783 in MultiExecProcNode (node=0x297af30) at
execProcnode.c:550
#10 0x0062dea2 in BitmapHeapNext (node=0x2974750) at
nodeBitmapHeapscan.c:104

Cheers,

Jeff


brin_crash.pl
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-05 Thread Jeff Janes
On Wed, Nov 5, 2014 at 12:54 PM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:

Thanks for the updated patch.

Now when I run the test program (version with better error reporting
attached), it runs fine until I open a psql session and issue:

reindex table foo;

Then it immediately falls over with some rows no longer being findable
through the index.

-- use index
select count(*) from foo where text_array = md5(4611::text);
0

-- use seq scan
select count(*) from foo where text_array||'' = md5(4611::text);
1

Where the number '4611' was taken from the error message of the test
program.


brin_crash.pl
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-04 Thread Simon Riggs
On 3 November 2014 22:18, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 So here's v21.  I also attach a partial diff from v20, just in case
 anyone wants to give it a look.

Looks really good.

I'd like to reword this sentence in the readme, since one of the main
use cases would be tables without btrees
   It's unlikely that BRIN would be the only
+ indexes in a table, though, because primary keys can be btrees only, and so
+ we don't implement this optimization.

I don't see a regression test. Create, use, VACUUM, just so we know it
hasn't regressed after commit.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-04 Thread Jeff Janes
On Mon, Nov 3, 2014 at 2:18 PM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:


 So here's v21.  I also attach a partial diff from v20, just in case
 anyone wants to give it a look.


This needs a bump to 1.3, or the extension won't install:

contrib/pageinspect/pageinspect.control


During crash recovery, I am getting a segfault:

#0  0x0067ee35 in LWLockRelease (lock=0x0) at lwlock.c:1161
#1  0x00664f4a in UnlockReleaseBuffer (buffer=0) at bufmgr.c:2888
#2  0x00465a88 in brin_xlog_revmap_extend (lsn=value optimized
out, record=value optimized out) at brin_xlog.c:261
#3  brin_redo (lsn=value optimized out, record=value optimized out) at
brin_xlog.c:284
#4  0x004ce505 in StartupXLOG () at xlog.c:6795

I failed to preserve the data directory, I'll try to repeat this later this
week if needed.

Cheers,

Jeff


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-03 Thread Jeff Janes
On Mon, Nov 3, 2014 at 2:18 PM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:

 Robert Haas wrote:
  [lots]

 I have fixed all these items in the attached, thanks -- most
 user-visible change was the pageinspect 1.3 thingy.  pg_upgrade from 1.2
 works fine now.  I also fixed some things Heikki noted, mainly avoid
 retail pfree where possible, and renumber the support procs to leave
 room for future expansion of the framework.  XLog replay code is updated
 too.

 Also, I made the summarization step callable directly from SQL without
 having to invoke VACUUM.

 So here's v21.  I also attach a partial diff from v20, just in case
 anyone wants to give it a look.


I get a couple compiler warnings with this:

brin.c: In function 'brininsert':
brin.c:97: warning: 'tupcxt' may be used uninitialized in this function
brin.c:98: warning: 'oldcxt' may be used uninitialized in this function

Also, I think it is missing a cat version bump.  It let me start the
patched server against an unpatched initdb run, but once started it didn't
find the index method.

What would it take to make CLUSTER work on a brin index?  Now I just added
a btree index on the same column, clustered on that, then dropped that
index.

Thanks,

Jeff


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-11-03 Thread Alvaro Herrera
Jeff Janes wrote:
 On Mon, Nov 3, 2014 at 2:18 PM, Alvaro Herrera alvhe...@2ndquadrant.com
 wrote:

 I get a couple compiler warnings with this:
 
 brin.c: In function 'brininsert':
 brin.c:97: warning: 'tupcxt' may be used uninitialized in this function
 brin.c:98: warning: 'oldcxt' may be used uninitialized in this function

Ah, that's easily fixed.  My compiler (gcc 4.9 from Debian Jessie
nowadays) doesn't complain, but I can see that it's not entirely
trivial.

 Also, I think it is missing a cat version bump.  It let me start the
 patched server against an unpatched initdb run, but once started it didn't
 find the index method.

Sure, that's expected (by me at least).  I'm too lazy to maintain
catversion bumps in the patch before pushing, since that generates
constant conflicts as I rebase.

 What would it take to make CLUSTER work on a brin index?  Now I just added
 a btree index on the same column, clustered on that, then dropped that
 index.

Interesting question.  What's the most efficient way to pack a table to
minimize the intervals covered by each index entry?  One thing that
makes this project a bit easier, I think, is that CLUSTER has already
been generalized so that it supports either an indexscan or a
seqscan+sort.  If anyone wants to work on this, be my guest; I'm
certainly not going to add it to the initial commit.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-10-30 Thread Robert Haas
+   acronymBRIN/acronym indexes can satisfy queries via the bitmap
+   scanning facility, and will return all tuples in all pages within

The bitmap scanning facility?  Does this mean a bitmap index scan?
Or something novel to BRIN?  I think this could be clearer.

+   This enables them to work as very fast sequential scan helpers to avoid
+   scanning blocks that are known not to contain matching tuples.

Hmm, but they don't actually do anything about sequential scans per
se, right?  I'd say something like: Because a BRIN index is very
small, scanning the index adds little overhead compared to a
sequential scan, but may avoid scanning large parts of the table that
are known not to contain matching tuples.

+   depend on the operator class selected for the data type.

The operator class is selected for the index, not the data type.

+   The size of the block range is determined at index creation time with
+   the literalpages_per_range/ storage parameter.
+   The smaller the number, the larger the index becomes (because of the need to
+   store more index entries), but at the same time the summary data stored can
+   be more precise and more data blocks can be skipped during an index scan.

I would insert a sentence something like this: The number of index
entries will be equal to the size of the relation in pages divided by
the selected value for pages_per_range.  Therefore, the smaller the
number  At least, I would insert that if it's actually true.  My
point is that I think the effect of pages_per_range could be made more
clear.

+   The core productnamePostgreSQL/productname distribution includes
+   includes the acronymBRIN/acronym operator classes shown in
+   xref linkend=gin-builtin-opclasses-table.

Shouldn't that say brin, not gin?

+   requiring the access method implementer only to implement the semantics

The naming of the reverse range map seems a little weird.  It seems
like most operations go through it, so it feels more like the forward
direction.  Maybe I'm misunderstanding.  (I doubt it's worth renaming
it at this point either way, but I thought I'd mention it.)

+  errmsg(unlogged BRIN indexes are not supported)));

Why not?  Shouldn't be particularly hard, I wouldn't think.

I'm pretty sure you need to create a pageinspect--1.3.sql, not just
update the 1.2 file.  Because that's in 9.4, and this won't be.

I'm pretty excited about this feature.  I think it's going to be very
good for PostgreSQL.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-10-29 Thread Alvaro Herrera
Heikki Linnakangas wrote:
 On 10/07/2014 01:33 AM, Alvaro Herrera wrote:

 I added an USE_ASSERTION-only block in brininsert that runs the union
 support proc and compares the output with the one from regular addValue.
 I haven't tested this too much yet.
 
 Ok, that's better than nothing. I wonder if it's too strict, though. It uses
 brin_tuple_equal(), which does a memcmp() on the tuples. That will trip for
 any non-meaningful differences, like the scale in a numeric.

True.  I'm not real sure how to do better, though.  For types that have
a btree opclass it's easy, because we can just use the btree equality
function to compare the values.  But most interesting cases would not
have btree opclasses; those are covered by the minmax family of
opclasses.

 It would be wise to reserve some more support procedure numbers, for future
 expansion. Currently, support procs 1-4 are used by BRIN itself, and higher
 numbers can be used by the opclass. minmax opclasses uses 5-8 for the , =,
 = and  operators. If we ever want to add a new, optional, support function
 to BRIN, we're out of luck. Let's document that e.g. support procs  10 are
 reserved for BRIN.

Sure.  I hope we never need to add a seventh optional support function ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-10-07 Thread Heikki Linnakangas

On 10/07/2014 01:33 AM, Alvaro Herrera wrote:

Heikki Linnakangas wrote:

On 09/23/2014 10:04 PM, Alvaro Herrera wrote:

+ Open questions
+ --
+
+ * Same-size page ranges?
+   Current related literature seems to consider that each index entry in a
+   BRIN index must cover the same number of pages.  There doesn't seem to be a


What is the related literature? Is there an academic paper or
something that should be cited as a reference for BRIN?


I the original minmax-proposal file, I had these four URLs:

: Other database systems already have similar features. Some examples:
:
: * Oracle Exadata calls this storage indexes
:   http://richardfoote.wordpress.com/category/storage-indexes/
:
: * Netezza has zone maps
:   http://nztips.com/2010/11/netezza-integer-join-keys/
:
: * Infobright has this automatically within their data packs according to a
:   May 3rd, 2009 blog post
:   
http://www.infobright.org/index.php/organizing_data_and_more_about_rough_data_contest/
:
: * MonetDB also uses this technique, according to a published paper
:   http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.108.2662
:   Cooperative Scans: Dynamic Bandwidth Sharing in a DBMS

I gave them all a quick look and none of them touches the approach in
detail; in fact other than the Oracle Exadata one, they are all talking
about something else and mention the minmax stuff only in passing.  I
don't think any of them is worth citing.


I think the current related literature phrase should be removed, if 
there isn't in fact any literature on this. If there's any literature 
worth referencing, should add a proper citation.



I added an USE_ASSERTION-only block in brininsert that runs the union
support proc and compares the output with the one from regular addValue.
I haven't tested this too much yet.


Ok, that's better than nothing. I wonder if it's too strict, though. It 
uses brin_tuple_equal(), which does a memcmp() on the tuples. That will 
trip for any non-meaningful differences, like the scale in a numeric.



* clarify the memory context stuff of support functions that we also
discussed earlier


I re-checked this stuff.  Turns out that the support functions don't
palloc/pfree memory too much, except to update the stuff stored in
BrinValues, by using datumCopy().  This memory is only freed when we
need to update a previous Datum.  There's no way for the brin.c code to
know when the Datum is going to be released by the support proc, and
thus no way for a temp context to be used.

The memory context experiments I alluded to earlier are related to
pallocs done in brininsert / bringetbitmap themselves, not in the
opclass-provided support procs.


At the very least, it needs to be documented.


All in all, I don't think there's much
room for improvement, other than perhaps doing so in brininsert/
bringetbitmap.  Don't really care too much about this either way.


Doing it in brininsert/bringetbitmap seems like the right approach. 
GiST, GIN, and SP-GiST all use a temporary memory context like that.



It would be wise to reserve some more support procedure numbers, for 
future expansion. Currently, support procs 1-4 are used by BRIN itself, 
and higher numbers can be used by the opclass. minmax opclasses uses 5-8 
for the , =, = and  operators. If we ever want to add a new, 
optional, support function to BRIN, we're out of luck. Let's document 
that e.g. support procs  10 are reserved for BRIN.


The redo routines should be updated to follow the new 
XLogReadBufferForRedo idiom (commit 
f8f4227976a2cdb8ac7c611e49da03aa9e65e0d2).


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-09-26 Thread Erik Rijkers
On Tue, September 23, 2014 21:04, Alvaro Herrera wrote:
 Alvaro Herrera wrote:

 [minmax-19.patch]
 [minmax-19a.patch]

Although admittedly it is not directly likely for us to need it, and although I 
see that there is a BRIN Extensibility
chapter added (good!), I am still a bit surprised by the absence of a built-in 
BRIN operator class for bigint, as the BRIN
index type is specifically useful for huge tables (where after all huge values 
are more likely to occur).

Will a brin int8 be added operator class for 9.5? (I know, quite some time 
left...)

(btw, so far the patch proves quite stable under my abusive testing...)


thanks,


Erik Rijkers









-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-09-24 Thread Heikki Linnakangas

On 09/23/2014 10:04 PM, Alvaro Herrera wrote:

+  para
+   The acronymBRIN/acronym implementation in 
productnamePostgreSQL/productname
+   is primarily maintained by Aacute;lvaro Herrera.
+  /para


We don't usually have such verbiage in the docs. The GIN and GiST pages 
do, but I think those are a historic exceptions, not something we want 
to do going forward.



+   variablelist
+varlistentry
+ termfunctionBrinOpcInfo *opcInfo(void)//term
+ listitem
+  para
+   Returns internal information about the indexed columns' summary data.
+  /para
+ /listitem
+/varlistentry


I think you should explain what that internal information is. The 
minmax-19a.patch adds the type OID argument to this; remember to update 
the docs.


In SP-GiST, the similar function is called config. It might be good to 
use the same name here, for consistency across indexams (although I 
actually like the opcInfo name better than config)


The docs for the other support functions need to be updated, now that 
you changed the arguments from DeformedBrTuple to BrinValues.



+ !-- this needs improvement ... --
+   To implement these methods in a generic ways, normally the opclass
+   defines its own internal support functions.  For instance, minmax
+   opclasses add the support functions for the four inequality operators
+   for the datatype.
+   Additionally, the operator class must supply appropriate
+   operator entries,
+   to enable the optimizer to use the index when those operators are
+   used in queries.


The above needs improvement ;-)


+BRIN indexes (a shorthand for Block Range indexes)
+store summaries about the values stored in consecutive table physical 
block ranges.


consecutive table physical block ranges is quite a mouthful.


+For datatypes that have a linear sort order, the indexed data
+corresponds to the minimum and maximum values of the
+values in the column for each block range,
+which support indexed queries using these operators:
+
+simplelist
+ memberliterallt;/literal/member
+ memberliterallt;=/literal/member
+ memberliteral=/literal/member
+ memberliteralgt;=/literal/member
+ memberliteralgt;/literal/member
+/simplelist


That's the built-in minmax indexing strategy, yes, but you could have 
others, even for datatypes with a linear sort order.



+ To find out the index tuple for a particular page range, we have an internal


s/find out/find/


+ new heap tuple contains null values but the index tuple indicate there are no


s/indicate/indicates/


+ Open questions
+ --
+
+ * Same-size page ranges?
+   Current related literature seems to consider that each index entry in a
+   BRIN index must cover the same number of pages.  There doesn't seem to be a


What is the related literature? Is there an academic paper or something 
that should be cited as a reference for BRIN?



+  * TODO
+  ** ScalarArrayOpExpr (amsearcharray - SK_SEARCHARRAY)
+  ** add support for unlogged indexes
+  ** ditto expressional indexes


We don't have unlogged indexes in general, so no need to list that here. 
What would be needed to implement ScalarArrayOpExprs?


I didn't realize that expression indexes are still not supported. And I 
see that partial indexes are not supported either. Why not? I wouldn't 
expect BRIN to need to care about those things in particular; the 
expressions for an expressional or partial index are handled in the 
executor, no?



+ /*
+  * A tuple in the heap is being inserted.  To keep a brin index up to date,
+  * we need to obtain the relevant index tuple, compare its stored values with
+  * those of the new tuple; if the tuple values are consistent with the summary
+  * tuple, there's nothing to do; otherwise we need to update the index.


s/compare/and compare/. Perhaps replace one of the semicolons with a 
full stop.



+  * If the range is not currently summarized (i.e. the revmap returns 
InvalidTid
+  * for it), there's nothing to do either.
+  */
+ Datum
+ brininsert(PG_FUNCTION_ARGS)


There is no InvalidTid, as a constant or a #define. Perhaps replace with 
invalid item pointer.



+   /*
+* XXX We need to know the size of the table so that we know how long to
+* iterate on the revmap.  There's room for improvement here, in that we
+* could have the revmap tell us when to stop iterating.
+*/


The revmap doesn't know how large the table is. Remember that you have 
to return all blocks that are not in the revmap, so you can't just stop 
when you reach the end of the revmap. I think the current design is fine.


I have to stop now to do some other stuff. Overall, this is in pretty 
good shape. In addition to little cleanup of things I listed above, and 
similar stuff elsewhere that I didn't read through right now, there are 
a few medium-sized items I'd still like to see addressed before you 
commit this:


* 

Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-09-23 Thread Alvaro Herrera
Alvaro Herrera wrote:

 Heikki Linnakangas wrote:

  If you add a new datatype, and define b-tree operators for it, what
  is required to create a minmax opclass for it? Would it be possible
  to generalize the functions in brin_minmax.c so that they can be
  reused for any datatype (with b-tree operators) without writing any
  new C code? I think we're almost there; the only thing that differs
  between each data type is the opcinfo function. Let's pass the type
  OID as argument to the opcinfo function. You could then have just a
  single minmax_opcinfo function, instead of the macro to generate a
  separate function for each built-in datatype.
 
 Yeah, that's how I had that initially.  I changed it to what it's now as
 part of a plan to enable building cross-type opclasses, so you could
 have WHERE int8col=42 without requiring a cast of the constant to type
 int8.  This might have been a thinko, because AFAICS it's possible to
 build them with a constant opcinfo as well (I changed several other
 things to support this, as described in a previous email.)  I will look
 into this later.

I found out that we don't really throw errors in such cases anymore; we
insert casts instead.  Maybe there's a performance argument that it
might be better to use existing cross-type operators than casting, but
justifying this work just turned a lot harder.  Here's a patch that
reverts opcinfo into a generic function that receives the type OID.

I will look into adding some testing mechanism for the union support
proc; with that I will just consider the patch ready for commit and will
push.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services
*** a/src/backend/access/brin/brin.c
--- b/src/backend/access/brin/brin.c
***
*** 886,894  brin_build_desc(Relation rel)
  
  		opcInfoFn = index_getprocinfo(rel, keyno + 1, BRIN_PROCNUM_OPCINFO);
  
- 		/* actually FunctionCall0 but we don't have that */
  		opcinfo[keyno] = (BrinOpcInfo *)
! 			DatumGetPointer(FunctionCall1(opcInfoFn, InvalidOid));
  		totalstored += opcinfo[keyno]-oi_nstored;
  	}
  
--- 886,894 
  
  		opcInfoFn = index_getprocinfo(rel, keyno + 1, BRIN_PROCNUM_OPCINFO);
  
  		opcinfo[keyno] = (BrinOpcInfo *)
! 			DatumGetPointer(FunctionCall1(opcInfoFn,
! 		  tupdesc-attrs[keyno]-atttypid));
  		totalstored += opcinfo[keyno]-oi_nstored;
  	}
  
*** a/src/backend/access/brin/brin_minmax.c
--- b/src/backend/access/brin/brin_minmax.c
***
*** 50,87  typedef struct MinmaxOpaque
  	bool		inited[MINMAX_NUM_PROCNUMS];
  } MinmaxOpaque;
  
! #define OPCINFO(typname, typoid)			\
! PG_FUNCTION_INFO_V1(minmaxOpcInfo_##typname);\
! Datum		\
! minmaxOpcInfo_##typname(PG_FUNCTION_ARGS)	\
! {			\
! 	BrinOpcInfo *result;	\
! 			\
! 	/*		\
! 	 * opaque-operators is initialized lazily, as indicated by 'inited'	\
! 	 * which is initialized to all false by palloc0.\
! 	 */		\
! 			\
! 	result = palloc0(MAXALIGN(SizeofBrinOpcInfo(2)) +		\
! 	 sizeof(MinmaxOpaque));	\
! 	result-oi_nstored = 2;	\
! 	result-oi_opaque = (MinmaxOpaque *)	\
! 		MAXALIGN((char *) result + SizeofBrinOpcInfo(2));	\
! 	result-oi_typids[0] = typoid;			\
! 	result-oi_typids[1] = typoid;			\
! 			\
! 	PG_RETURN_POINTER(result);\
! }
  
! OPCINFO(int4, INT4OID)
! OPCINFO(numeric, NUMERICOID)
! OPCINFO(text, TEXTOID)
! OPCINFO(time, TIMEOID)
! OPCINFO(timetz, TIMETZOID)
! OPCINFO(timestamp, TIMESTAMPOID)
! OPCINFO(timestamptz, TIMESTAMPTZOID)
! OPCINFO(date, DATEOID)
! OPCINFO(char, CHAROID)
  
  /*
   * Examine the given index tuple (which contains partial status of a certain
--- 50,77 
  	bool		inited[MINMAX_NUM_PROCNUMS];
  } MinmaxOpaque;
  
! PG_FUNCTION_INFO_V1(minmaxOpcInfo);
! Datum
! minmaxOpcInfo(PG_FUNCTION_ARGS)
! {
! 	Oid		typoid = PG_GETARG_OID(0);
! 	BrinOpcInfo *result;
! 
! 	/*
! 	 * opaque-operators is initialized lazily, as indicated by 'inited'
! 	 * which is initialized to all false by palloc0.
! 	 */
  
! 	result = palloc0(MAXALIGN(SizeofBrinOpcInfo(2)) +
! 	 sizeof(MinmaxOpaque));
! 	result-oi_nstored = 2;
! 	result-oi_opaque = (MinmaxOpaque *)
! 		MAXALIGN((char *) result + SizeofBrinOpcInfo(2));
! 	result-oi_typids[0] = typoid;
! 	result-oi_typids[1] = typoid;
! 
! 	PG_RETURN_POINTER(result);
! }
  
  /*
   * Examine the given index tuple (which contains partial status of a certain
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
***
*** 436,514  DATA(insert (	4017   25 25 5 4031 ));
  DATA(insert (   4054   23 23 1 3383 ));
  DATA(insert (   4054   23 23 2 3384 ));
  DATA(insert (   4054   23 23 3 3385 ));
! DATA(insert (   4054   23 23 4 3394 ));
  DATA(insert (   4054   23 23 5   66 ));
  

Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-09-23 Thread Robert Haas
On Tue, Sep 23, 2014 at 3:04 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Alvaro Herrera wrote:
 Heikki Linnakangas wrote:
  If you add a new datatype, and define b-tree operators for it, what
  is required to create a minmax opclass for it? Would it be possible
  to generalize the functions in brin_minmax.c so that they can be
  reused for any datatype (with b-tree operators) without writing any
  new C code? I think we're almost there; the only thing that differs
  between each data type is the opcinfo function. Let's pass the type
  OID as argument to the opcinfo function. You could then have just a
  single minmax_opcinfo function, instead of the macro to generate a
  separate function for each built-in datatype.

 Yeah, that's how I had that initially.  I changed it to what it's now as
 part of a plan to enable building cross-type opclasses, so you could
 have WHERE int8col=42 without requiring a cast of the constant to type
 int8.  This might have been a thinko, because AFAICS it's possible to
 build them with a constant opcinfo as well (I changed several other
 things to support this, as described in a previous email.)  I will look
 into this later.

 I found out that we don't really throw errors in such cases anymore; we
 insert casts instead.  Maybe there's a performance argument that it
 might be better to use existing cross-type operators than casting, but
 justifying this work just turned a lot harder.  Here's a patch that
 reverts opcinfo into a generic function that receives the type OID.

 I will look into adding some testing mechanism for the union support
 proc; with that I will just consider the patch ready for commit and will
 push.

With all respect, I think this is a bad idea.  I know you've put a lot
of energy into this patch and I'm confident it's made a lot of
progress.  But as with Stephen's patch, the final form deserves a
thorough round of looking over by someone else before it goes in.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-09-23 Thread Michael Paquier
On Wed, Sep 24, 2014 at 8:23 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Sep 23, 2014 at 3:04 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 Alvaro Herrera wrote:
 I will look into adding some testing mechanism for the union support
 proc; with that I will just consider the patch ready for commit and will
 push.

 With all respect, I think this is a bad idea.  I know you've put a lot
 of energy into this patch and I'm confident it's made a lot of
 progress.  But as with Stephen's patch, the final form deserves a
 thorough round of looking over by someone else before it goes in.

Would this person be it an extra committer or an simple reviewer? It
would give more insurance if such huge patches (couple of thousands of
lines) get an extra +1 from another committer, proving that the code
has been reviewed by people well-experienced with backend code. Now as
this would put more pressure in the hands of committers, an extra
external pair of eyes, be it non-committer but let's say a seasoned
reviewer would be fine IMO.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-09-23 Thread Alvaro Herrera
Robert Haas wrote:

 With all respect, I think this is a bad idea.  I know you've put a lot
 of energy into this patch and I'm confident it's made a lot of
 progress.  But as with Stephen's patch, the final form deserves a
 thorough round of looking over by someone else before it goes in.

As you can see in the thread, Heikki's put a lot of review effort into
it (including important code contributions); I don't feel I'm rushing it
at this point.  If you or somebody else want to give it a look, I have
no problem waiting a bit longer.  I don't want to delay indefinitely,
though, because I think it's better shipped early in the release cycle
than later, to allow for further refinements and easier testing by other
interested parties.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-09-23 Thread Robert Haas
On Tue, Sep 23, 2014 at 7:35 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 Would this person be it an extra committer or an simple reviewer? It
 would give more insurance if such huge patches (couple of thousands of
 lines) get an extra +1 from another committer, proving that the code
 has been reviewed by people well-experienced with backend code. Now as
 this would put more pressure in the hands of committers, an extra
 external pair of eyes, be it non-committer but let's say a seasoned
 reviewer would be fine IMO.

If you're volunteering, I certainly wouldn't say no.  The more the
merrier.  Same with anyone else.  Since Heikki looked at it before, I
also think it would be appropriate to give him a bit of time to see if
he feels satisfied with it now - nobody on this project has more
experience with indexing than he does, but he may not have the time,
and even if he does, someone else might spot something he misses.

Alvaro's quite right to point out that there is no sense in waiting a
long time for a review that isn't coming.  That just backs everything
up against the end of the release cycle to no benefit.  But if there's
review available from experienced people within the community, taking
advantage of that now might find things that could be much harder to
fix later.  That's a win for everybody.  And it's not like we're
pressed up against the end of the cycle, nor is it as if this feature
has been through endless rounds of review already.  It's certainly had
some, and it's gotten better as a result.  But it's also changed a lot
in the process.

And much of the review to date has been high-level design review, like
how should the opclasses look? and what should we call this thing
anyway?.  Going through it for logic errors, documentation
shortcomings, silly thinkos, etc. has not been done too much, I think,
and definitely not on the latest version.  So, some of that might not
be out of place.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-09-23 Thread Robert Haas
On Tue, Sep 23, 2014 at 9:23 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Robert Haas wrote:
 With all respect, I think this is a bad idea.  I know you've put a lot
 of energy into this patch and I'm confident it's made a lot of
 progress.  But as with Stephen's patch, the final form deserves a
 thorough round of looking over by someone else before it goes in.

 As you can see in the thread, Heikki's put a lot of review effort into
 it (including important code contributions); I don't feel I'm rushing it
 at this point.

Yeah, I was really glad Heikki looked at it.  That seemed good.

 If you or somebody else want to give it a look, I have
 no problem waiting a bit longer.  I don't want to delay indefinitely,
 though, because I think it's better shipped early in the release cycle
 than later, to allow for further refinements and easier testing by other
 interested parties.

I agree with that.  I'd like to look at it, and I will if I get time,
but as I said elsewhere, I also think it's appropriate to give a
little time around the final version of any big, complex patch just
because people may have thoughts, and they may not have time to
deliver those thoughts the minute the patch hits the list.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-09-23 Thread Alvaro Herrera
Robert Haas wrote:
 On Tue, Sep 23, 2014 at 9:23 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:

  If you or somebody else want to give it a look, I have
  no problem waiting a bit longer.  I don't want to delay indefinitely,
  though, because I think it's better shipped early in the release cycle
  than later, to allow for further refinements and easier testing by other
  interested parties.
 
 I agree with that.  I'd like to look at it, and I will if I get time,
 but as I said elsewhere, I also think it's appropriate to give a
 little time around the final version of any big, complex patch just
 because people may have thoughts, and they may not have time to
 deliver those thoughts the minute the patch hits the list.

Fair enough -- I'll keep it open for the time being.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-09-08 Thread Erik Rijkers
On Mon, September 8, 2014 18:02, Alvaro Herrera wrote:
 Here's version 18.  I have renamed it: These are now BRIN indexes.


I get into a BadArgument after:


$ cat crash.sql

-- drop table if exists t_100_000_000 cascade;
   create table t_100_000_000 as select cast(i as integer) from 
generate_series(1, 1) as f(i) ;

-- drop index if exists t_100_000_000_i_brin_idx;
   create index t_100_000_000_i_brin_idx on t_100_000_000 using 
brin(i); select
pg_size_pretty(pg_relation_size('t_100_000_000_i_brin_idx'));

select i from t_100_000_000 where i between 1 and 100; -- ( + 99 )


Log file says:

TRAP: BadArgument(!(((context) != ((void *)0)  (const 
Node*)((context)))-type) == T_AllocSetContext, File:
mcxt.c, Line: 752)
2014-09-08 19:54:46.071 CEST 30151 LOG:  server process (PID 30336) was 
terminated by signal 6: Aborted
2014-09-08 19:54:46.071 CEST 30151 DETAIL:  Failed process was running: select 
i from t_100_000_000 where i between 1
and 100;



The crash is caused by the last select statement; the table and index create 
are OK.

it only happens with a largish table; small tables are OK.



Linux / Centos / 32 GB.

 PostgreSQL 9.5devel_minmax_20140908_1809_0640c1bfc091 on 
x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.9.1, 64-bit

 setting  |  current_setting
--+
 autovacuum   | off
 port | 6444
 shared_buffers   | 100MB
 effective_cache_size | 4GB
 work_mem | 10MB
 maintenance_work_mem | 1GB
 checkpoint_segments  | 20
 data_checksums   | on
 server_version   | 9.5devel_minmax_20140908_1809_0640c1bfc091
 pg_postmaster_start_time | 2014-09-08 19:53 (uptime: 0d 0h 6m 54s)

'--prefix=/var/data1/pg_stuff/pg_installations/pgsql.minmax' 
'--with-pgport=6444'
'--bindir=/var/data1/pg_stuff/pg_installations/pgsql.minmax/bin'
'--libdir=/var/data1/pg_stuff/pg_installations/pgsql.minmax/lib' 
'--enable-depend' '--enable-cassert' '--enable-debug'
'--with-perl' '--with-openssl' '--with-libxml' 
'--with-extra-version=_minmax_20140908_1809_0640c1bfc091'


pgpatches/0095/minmax/20140908/minmax-18.patch


thanks,


Erik Rijkers




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BRIN indexes - TRAP: BadArgument

2014-09-08 Thread Alvaro Herrera
Erik Rijkers wrote:

 Log file says:
 
 TRAP: BadArgument(!(((context) != ((void *)0)  (const 
 Node*)((context)))-type) == T_AllocSetContext, File:
 mcxt.c, Line: 752)
 2014-09-08 19:54:46.071 CEST 30151 LOG:  server process (PID 30336) was 
 terminated by signal 6: Aborted
 2014-09-08 19:54:46.071 CEST 30151 DETAIL:  Failed process was running: 
 select i from t_100_000_000 where i between 1
 and 100;

A double-free mistake -- here's a patch.  Thanks.


-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index c89a167..6ac012c 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -388,10 +388,7 @@ bringetbitmap(PG_FUNCTION_ARGS)
 			PointerGetDatum(key));
 	addrange = DatumGetBool(add);
 	if (!addrange)
-	{
-		brin_free_tuple(tup);
 		break;
-	}
 }
 			}
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers