Re: [HACKERS] Must be owner to truncate?

2005-08-23 Thread Andreas Seltenreich
Bruce Momjian schrob:

 Stephen Frost wrote:
 -- Start of PGP signed section.
 * Jim C. Nasby ([EMAIL PROTECTED]) wrote:
  On Thu, Jul 07, 2005 at 01:48:59PM -0400, Tom Lane wrote:
   I don't really agree with the viewpoint that truncate is just a quick
   DELETE, and so I do not agree that DELETE permissions should be enough
   to let you do a TRUNCATE.
  
  What about adding a truncate permission? I would find it useful, as it
  seems would others.
 
 That would be acceptable for me as well.  I'd prefer it just work off
 delete, but as long as I can grant truncate to someone w/o giving them
 ownership rights on the table I'd be happy.

 Added to TODO:

   * Add TRUNCATE permission
   
 Currently only the owner can TRUNCATE a table because triggers are not
 called, and the table is locked in exclusive mode.

Is anyone working on this yet? I looked at the code involved, and it
seems there are just a couple of lines needed, some regression test
and documentation updates, and most importantly, tab-completion
updates.

However, a question arose quickly: According to the standard, revoking
INSERT, UPDATE and DELETE after GRANT ALL PRIVILEGES would leave the
relation read-only, but with the TRUNCATE privilege lying around, this
would no longer be true for PostgreSQL. Would this open a security
hole or is it okay as far as extensions to the standard go?

regards,
Andreas
-- 

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] localization problem (and solution)

2005-12-20 Thread Andreas Seltenreich
Tom Lane writes:

 I looked into this a bit more, and it seems the issue is that libperl
 will do
   setlocale(LC_ALL, );
 the first time any locale-related Perl function is invoked.  To defend
 ourselves against that, we'd have to set more environment variables than
 just LC_COLLATE and LC_CTYPE.

 What I'm thinking about is:
 * during startup, putenv(LC_ALL=C) and unsetenv any other LC_ variables
   that may be lurking, except LC_MESSAGES.
 * copy LC_COLLATE and LC_CTYPE into the environment when we get them
   from pg_control, as Manuel suggested.

I'm afraid having LC_ALL in the environment at this time would still
do the wrong thing on setlocale(LC_ALL, ); since a LC_ALL
environment variable overrides the other categories. Maybe setting
LANG instead would be a better choice?

regards,
Andreas
-- 

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] psql readline win32

2006-01-02 Thread Andreas Seltenreich
Magnus Hagander writes:

 2) Should we ship a file of standard bindings. We're not 
 going to get it complete, but we could get some of the most 
 common ones in europe at least (in sweden, this would for 
 example include [EMAIL PROTECTED]|). Which would help people a lot.
   
 
 
 Yes we should, at least for Windows - put it in share along 
 with other samples, maybe.

 That was my second question on that - where to put it, and how to find it.

IMHO an elegant solution would be to expose readline's macro/bind
mechanism with a psql command, e.g. like bash's bind does. This way
one could put the macros into the system-wide psqlrc.

regards,
Andreas

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] PG Extensions: Must be statically linked?

2006-03-03 Thread Andreas Seltenreich
Mark Dilger writes:

 I have had the same concern, though never any hard evidence of a
 problem.  If the C++ functions are wrapped with extern C, and all
 exceptions caught (perhaps converted into error numbers which are then
 returned from the wrapper functions to the plain-C calling functions),
 are there any remaining known problems?  I have often considered
 making a C++ allocator which wrapped palloc and pfree, so that I could
 then use the STL within the backend...

 Has anyone tried this?

I did some experiments on a C++ language handler last year (including
an allocator and a class loader to spare the extern Cs/name
mangling):

http://archives.postgresql.org/pgsql-general/2005-10/msg01570.php
news:[EMAIL PROTECTED]

The remaining issue is the impedance mismatch between longjmp()ing and
exceptions.

regards,
Andreas
-- 

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] PL/pgSQL 'i = i + 1' Syntax

2006-05-18 Thread Andreas Seltenreich
Mark Dilger schrob:

 Tom Lane wrote:
 No it isn't.  The plpgsql scanner treats := and = as *the same token*.
 They can be interchanged freely.  This has nothing to do with the case
 of modifying a loop variable in particular.

 I disagree.  If the scanner treated them the same, then

   if i := 1 then ...

 would work, but it doesn't.  The := is rejected in a conditional.  Try the
 following code if you don't believe me:

You're confusing the PL/pgSQL scanner with the SQL
scanner. Expressions in PL/pgSQL are handed down to the SQL parser.

regards,
andreas

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] GIN stuck in loop during PITR

2006-05-25 Thread Andreas Seltenreich
I'm just experimenting a bit with GIN, and it is occasionally getting
stuck looping in findParents() during WAL replay.

The attached patch seems to fix it. I also had to set ptr-off as
advertised in the comment above the function to avoid triggering
assertions.

GIN isn't fully transparent to me yet, so it is quite likely that I am
missing something...

regards,
andreas

Index: ginbtree.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/gin/ginbtree.c,v
retrieving revision 1.1
diff -c -r1.1 ginbtree.c
*** ginbtree.c  2 May 2006 11:28:54 -   1.1
--- ginbtree.c  25 May 2006 18:12:13 -
***
*** 202,208 
for(;;) {
buffer = ReadBuffer(btree-index, blkno);
LockBuffer(buffer, GIN_EXCLUSIVE);
!   page = BufferGetPage(root-buffer);
if ( GinPageIsLeaf(page) )
elog(ERROR, Lost path);
  
--- 202,208 
for(;;) {
buffer = ReadBuffer(btree-index, blkno);
LockBuffer(buffer, GIN_EXCLUSIVE);
!   page = BufferGetPage(buffer);
if ( GinPageIsLeaf(page) )
elog(ERROR, Lost path);
  
***
*** 224,229 
--- 224,230 
ptr-blkno = blkno;
ptr-buffer = buffer;
ptr-parent = root; /* it's may be wrong, but in next 
call we will correct */
+   ptr-off = offset;
stack-parent = ptr;
return;
}

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] GIN stuck in loop during PITR

2006-05-26 Thread Andreas Seltenreich
Andreas Seltenreich schrob:

 Teodor Sigaev schrob:

 Thanks a lot, applied. Can you describe test suite? It may be useful
 for test more...

 Here's a shell script that triggers the bug when I revert the patch.

Just tried the script on HEAD, and it was triggering an assertion. I
guess it is because we are still returning InvalidOffsetNumber in the
trivial case (looks like a typo to me). I've attached a patch.

regards,
andreas

Index: ginbtree.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/access/gin/ginbtree.c,v
retrieving revision 1.2
diff -c -r1.2 ginbtree.c
*** ginbtree.c  26 May 2006 08:01:17 -  1.2
--- ginbtree.c  26 May 2006 20:09:45 -
***
*** 189,195 
Assert( !GinPageIsLeaf(page) );
  
/* check trivial case */
!   if ( (root-off != btree-findChildPtr(btree, page, stack-blkno, 
InvalidOffsetNumber)) != InvalidBuffer ) {
stack-parent = root;
return;
}
--- 189,195 
Assert( !GinPageIsLeaf(page) );
  
/* check trivial case */
!   if ( (root-off = btree-findChildPtr(btree, page, stack-blkno, 
InvalidOffsetNumber)) != InvalidOffsetNumber ) {
stack-parent = root;
return;
}

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] backup + restore fails

2006-10-20 Thread Andreas Seltenreich
Holger Schoenen writes:

 ERROR:  invalid byte sequence for encoding UTF8: 0xe46973
 Command was: --
[...]
 -- Started on 2006-09-15 14:56:51 Westeuropäische Normalzeit

The same problem was recently reported on the pgsql-de-allgemein list.

Would just avoiding %Z in Win32's strftime be an acceptable solution?
elog.c is already doing this, however because of the length of the
zone names, not the localization problem.  The attached patch is
completely untested because I don't have access to a win32 box.

regards,
andreas

Index: src/bin/pg_dump/pg_backup_archiver.c
===
RCS file: /projects/cvsroot/pgsql/src/bin/pg_dump/pg_backup_archiver.c,v
retrieving revision 1.137
diff -c -r1.137 pg_backup_archiver.c
*** src/bin/pg_dump/pg_backup_archiver.c14 Oct 2006 23:07:22 -  
1.137
--- src/bin/pg_dump/pg_backup_archiver.c20 Oct 2006 18:59:11 -
***
*** 2780,2785 
  {
charbuf[256];
  
!   if (strftime(buf, 256, %Y-%m-%d %H:%M:%S %Z, localtime(tim)) != 0)
ahprintf(AH, -- %s %s\n\n, msg, buf);
  }
--- 2780,2793 
  {
charbuf[256];
  
!   if (strftime(buf, 256,
!/* Win32 timezone names are long and localized 
and
! * can interfere with utf-8 dumps */
! #ifndef WIN32
!%Y-%m-%d %H:%M:%S %Z,
! #else
!%Y-%m-%d %H:%M:%S,
! #endif
!localtime(tim)) != 0)
ahprintf(AH, -- %s %s\n\n, msg, buf);
  }
Index: src/bin/pg_dump/pg_dumpall.c
===
RCS file: /projects/cvsroot/pgsql/src/bin/pg_dump/pg_dumpall.c,v
retrieving revision 1.84
diff -c -r1.84 pg_dumpall.c
*** src/bin/pg_dump/pg_dumpall.c7 Oct 2006 20:59:05 -   1.84
--- src/bin/pg_dump/pg_dumpall.c20 Oct 2006 18:59:12 -
***
*** 1320,1325 
charbuf[256];
time_t  now = time(NULL);
  
!   if (strftime(buf, 256, %Y-%m-%d %H:%M:%S %Z, localtime(now)) != 0)
printf(-- %s %s\n\n, msg, buf);
  }
--- 1320,1333 
charbuf[256];
time_t  now = time(NULL);
  
!   if (strftime(buf, 256,
!/* Win32 timezone names are long and localized 
and
! * can interfere with utf-8 dumps */
! #ifndef WIN32
!%Y-%m-%d %H:%M:%S %Z,
! #else
!%Y-%m-%d %H:%M:%S,
! #endif
!localtime(now)) != 0)
printf(-- %s %s\n\n, msg, buf);
  }

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Writing triggers in C++

2007-02-14 Thread Andreas Seltenreich
Florian G. Pflug writes:

 Maybe we should create some wiki page or pgfoundry project that collects
 all glue code, tipps and tricks that people invented to glue C++ into
 the postgres backend.

 If it can be made to work, sure; in techdocs.

 I was thinking that two pairs of macros,
 PG_BEGIN_CPP, PG_END_CPP and
 PG_CPP_BEGIN_BACKEND, PG_CPP_END_BACKEND
 should be able to take care of the exception handling issues.

 You'd need to wrap any code-block that calls postgres functions that
 might do an elog(ERROR) inside PG_CPP_BEGIN_BACKEND,
 PG_CPP_END_BACKEND.

 Vice versa, any block of c++ code that is called from the backend would
 need to start with PG_BEGIN_CPP, and end with PG_END_CPP.

I've made positive experiences with such a setup, although I've spared
the PG_BEGIN_CPP/PG_END_CPP by doing the exception conversion in a C++
language handler that instantiates functors using the portable class
loading technique described in this paper:

http://www.s11n.net/papers/classloading_cpp.html

I'd be glad to help out on a pgfoundry project to make C++ a better
citizen for extending postgres.

regards,
andreas

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PATCH] Add error handling to byteaout.

2015-06-03 Thread Andreas Seltenreich
Alvaro Herrera writes:

 Why not just use an unsigned 64 bit variable?  Also, perhaps
 palloc_huge() avoids the whole problem in the first place ...

Ja, that crossed my mind too, but the current limit is already far
beyond anything that is usually configured for per-backend memory use,
so I dismissed it.

 though it might only move the issue around, if you cannot ship the
 longer-than-1GB resulting escaped value.

For example, when client and server encodings do not match:

,[ mbutils.c ]
|   result = palloc(len * MAX_CONVERSION_GROWTH + 1);
`

This results in the fun fact that the maximum size for bytea values that
are guaranteed to be pg_dumpable regardless of encoding/escaping
settings is lower than 64MB.

One thing that would also mitigate the problem is supporting a more
efficient output format.  For example, there's already means for
base64-encoding in the backend:

self=# select c, length(encode(mkbytea(28),c)) from (values ('hex'),('base64')) 
as v(c);
   c|  length
+---
 hex| 536870912
 base64 | 362623337
(2 rows)

Maybe it is reasonable to make it available as another option for use in
bytea_output?

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add error handling to byteaout.

2015-06-03 Thread Andreas Seltenreich
Piotr Stefaniak writes:
 s/int/Size/ doesn't fix anything on 32-bit machines.

 Postgres requires twos-complement representation, so that the
 assumption that signed integer types wrap around on overflow can be
 safely made.

Thanks for the clarification!


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Add error handling to byteaout.

2015-06-02 Thread Andreas Seltenreich
Hi,

when dealing with bytea values that are below the 1GB limit, but too
large to be escaped, the error messages emitted are not very helpful for
users.  Especially if they appear in an unrelated context such as during
pg_dump.  I've attached a patch that adds ereport()ing that would have
prevented some confusion.

Example:

,[ master ]
| ase=# select mkbytea(29);
| ERROR:  invalid memory alloc request size 1073741827
| ase=# set bytea_output to escape;
| ase=# select mkbytea(29);
| ERROR:  invalid memory alloc request size 18446744071562067969
`

The scary one is due to an integer overflow the attached patch also
fixes.  I don't see any security implications though as it's only the
sign bit that is affected.

,[ with patch applied ]
| ase=# set bytea_output to 'escape';
| ase=# select mkbytea(29);
| ERROR:  escaped bytea value would be too big
| DETAIL:  Value would require 2147483649 bytes.
| HINT:  Use a different bytea_output setting or binary methods such as COPY 
BINARY.
`

regards,
Andreas

create function mkbytea(power int, part bytea default '\x00')
   returns bytea as
   $$select case when power0 then mkbytea(power-1,part||part) else part 
end;$$
   language sql;

From f62a101a690fc9251c4c2de9c87323cedd0e9a54 Mon Sep 17 00:00:00 2001
From: Andreas Seltenreich andreas.seltenre...@credativ.de
Date: Mon, 1 Jun 2015 16:17:21 +0200
Subject: [PATCH] Add error handling to byteaout.

Emit a comprehensible error message when escaping fails due to
MaxAllocSize instead of waiting for palloc to fail.  Also use size_t
for size computations to prevent integer overflow in
BYTEA_OUTPUT_ESCAPE branch.
---
 src/backend/utils/adt/varlena.c | 28 +---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 779729d..ec2594e 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -347,8 +347,21 @@ byteaout(PG_FUNCTION_ARGS)
 
if (bytea_output == BYTEA_OUTPUT_HEX)
{
+   Size len;
/* Print hex format */
-   rp = result = palloc(VARSIZE_ANY_EXHDR(vlena) * 2 + 2 + 1);
+   len = (Size)VARSIZE_ANY_EXHDR(vlena) * 2 + 2 + 1;
+
+   if (!AllocSizeIsValid(len)) {
+   ereport(ERROR,
+   (errcode(ERRCODE_OUT_OF_MEMORY),
+errmsg(escaped bytea value would be 
too big),
+errdetail(Value would require %zu 
bytes.,
+  len),
+errhint(Use a different bytea_output 
setting or
+ binary methods such 
as COPY BINARY.)));
+   }
+   
+   rp = result = palloc(len);
*rp++ = '\\';
*rp++ = 'x';
rp += hex_encode(VARDATA_ANY(vlena), VARSIZE_ANY_EXHDR(vlena), 
rp);
@@ -357,8 +370,8 @@ byteaout(PG_FUNCTION_ARGS)
{
/* Print traditional escaped format */
char   *vp;
-   int len;
-   int i;
+   Sizelen;
+   Sizei;
 
len = 1;/* empty string has 1 
char */
vp = VARDATA_ANY(vlena);
@@ -371,6 +384,15 @@ byteaout(PG_FUNCTION_ARGS)
else
len++;
}
+   if (!AllocSizeIsValid(len)) {
+   ereport(ERROR,
+   (errcode(ERRCODE_OUT_OF_MEMORY),
+errmsg(escaped bytea value would be 
too big),
+errdetail(Value would require %zu 
bytes.,
+  len),
+errhint(Use a different bytea_output 
setting or
+ binary methods such 
as COPY BINARY.)));
+   }
rp = result = (char *) palloc(len);
vp = VARDATA_ANY(vlena);
for (i = VARSIZE_ANY_EXHDR(vlena); i != 0; i--, vp++)
-- 
2.1.4



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add error handling to byteaout.

2015-06-02 Thread Andreas Seltenreich
Tom Lane t...@sss.pgh.pa.us writes:

 Andreas Seltenreich andreas.seltenre...@credativ.de writes:
 The scary one is due to an integer overflow the attached patch also
 fixes.

 s/int/Size/ doesn't fix anything on 32-bit machines.

Well, it changes the signedness of the computation on 32-bit, and in
combination with the fact that len is always smaller than 2^32, but
may exceed 2^31-1, the change avoids the dependency on the undefined
behavior of signed integer overflows in C on 32-bit as well.  But I
admit that this might be a rather academic point...

Anyway, my motivation for the patch was the improved error reporting.
Is the drive-by type change a problem here?

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-07-31 Thread Andreas Seltenreich
Hi,

sqlsmith triggered the following assertion in master (c188204).

TRAP: FailedAssertion(!(!bms_overlap(joinrelids, sjinfo-min_lefthand)), 
File: joinrels.c, Line: 500)

As usual, the query is against the regression database.  It is rather
unwieldy… I wonder if I should stop working on new grammar rules and
instead work on some post-processing that prunes the AST as much as
possible while maintaining the failure mode.

regards,
andreas

select
  subq_647409.c0 as c0,
  subq_647409.c0 as c1
from
  public.customer as rel_4116461
  left join public.clstr_tst_s as rel_4116555
  left join information_schema.columns as rel_4116556
  on (rel_4116555.rf_a = rel_4116556.ordinal_position )
right join pg_catalog.pg_roles as rel_4116557
on (rel_4116556.maximum_cardinality = rel_4116557.rolconnlimit )
  on (rel_4116461.passwd = rel_4116557.rolpassword )
left join (select
subq_647410.c8 as c0
  from
public.char_tbl as rel_4116611,
lateral (select
  rel_4116612.name as c0,
  rel_4116612.comment as c1,
  rel_4116612.nslots as c2,
  rel_4116612.comment as c3,
  rel_4116612.nslots as c4,
  rel_4116612.nslots as c5,
  rel_4116612.comment as c6,
  rel_4116612.comment as c7,
  rel_4116612.nslots as c8,
  rel_4116612.nslots as c9,
  rel_4116612.nslots as c10
from
  public.hub as rel_4116612
where rel_4116612.comment ~=~ rel_4116612.comment
fetch first 116 rows only) as subq_647410
  where (subq_647410.c7 !~~* subq_647410.c7)
or ((subq_647410.c3 = subq_647410.c3)
  and ((subq_647410.c7 ~* subq_647410.c6)
and (subq_647410.c7 @@ subq_647410.c7)))
  fetch first 152 rows only) as subq_647409
  inner join public.int4_tbl as rel_4116661
  inner join public.shoe as rel_4116662
  on (rel_4116661.f1 = rel_4116662.sh_avail )
inner join public.rtest_vview3 as rel_4116663
on (rel_4116661.f1 = rel_4116663.a )
  on (subq_647409.c0 = rel_4116662.sh_avail )
on (rel_4116555.b = rel_4116661.f1 )
where ((rel_4116557.rolvaliduntil is NULL)
or (rel_4116663.b !~ rel_4116461.name))
  or (rel_4116661.f1 is not NULL)
fetch first 80 rows only;


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] ERROR: failed to build any %d-way joins

2015-08-08 Thread Andreas Seltenreich
Hi,

there's a 1/1e6 chance that a sqlsmith query on the regression db of
master (c124cef) fails with

ERROR:  failed to build any {4..8}-way joins

They all appear to work fine on REL9_5_STABLE.

sample query:

select 1 from
  information_schema.collations as rel_60113935
  inner join information_schema.sql_sizing_profiles as rel_60113936
  on (rel_60113935.pad_attribute = rel_60113936.sizing_name )
inner join information_schema.foreign_tables as rel_60113937
on (rel_60113936.profile_id = rel_60113937.foreign_table_catalog )
  right join information_schema.constraint_table_usage as rel_60113938
  on (rel_60113935.collation_schema = rel_60113938.table_catalog )
left join public.btree_tall_tbl as rel_60113939
on (rel_60113936.required_value = rel_60113939.id )
where rel_60113938.table_name is not NULL;

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] ERROR: failed to build any %d-way joins

2015-08-08 Thread Andreas Seltenreich
I wrote:

 They all appear to work fine on REL9_5_STABLE.

oops, that branch was slightly stale.  Updating it with the latest
planner changes (8dccf03..0853388), the queries fail there as well.

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] subplan variable reference / unassigned NestLoopParams

2015-08-08 Thread Andreas Seltenreich
Tom Lane writes:

 Andreas Seltenreich seltenre...@gmx.de writes:
 Tom Lane writes:
 Well, I certainly think all of these represent bugs:
 3 | ERROR:  plan should not reference subplan's variable
 2 | ERROR:  failed to assign all NestLoopParams to plan nodes

 These appear to be related.  The following query produces the former,
 but if you replace the very last reference of provider with the literal
 'bar', it raises the latter error.

 Fixed that, thanks for the test case!

I haven't seen the former since your commit, but there recently were
some instances of the latter.  The attached queries all throw the error
against master at 8752bbb.

regards,
Andreas

-- ERROR:  failed to assign all NestLoopParams to plan nodes
select
  subq_19608926.c0 as c0,
  rel_124211605.objname as c1
from
  (select
   rel_124211602.numeric_precision as c0
 from
   information_schema.sequences as rel_124211602
 where EXISTS (
   select
   rel_124211603.unique_constraint_name as c0
 from
   information_schema.referential_constraints as rel_124211603
 where rel_124211603.unique_constraint_schema is not NULL)) as 
subq_19608926
 left join public.tab1 as rel_124211604
 on (subq_19608926.c0 = rel_124211604.a )
  right join pg_catalog.pg_seclabels as rel_124211605
 right join public.phone as rel_124211606
 on (rel_124211605.objtype = rel_124211606.comment )
  on (rel_124211604.b = rel_124211605.objtype )
inner join public.bt_i4_heap as rel_124211729
  inner join public.bt_i4_heap as rel_124211730
  on (rel_124211729.random = rel_124211730.seqno )
on (subq_19608926.c0 = rel_124211730.seqno )
where rel_124211606.comment = rel_124211604.b;

-- ERROR:  failed to assign all NestLoopParams to plan nodes
select
  subq_53656269.c0 as c0
from
  (select
 rel_339945676.id3c as c0
   from
 public.rule_and_refint_t3 as rel_339945676
   where rel_339945676.data !~~ rel_339945676.data) as subq_53656269
  inner join public.dropcolumn as rel_339945677
  on (subq_53656269.c0 = rel_339945677.b )
inner join public.bt_name_heap as rel_339945678
  left join public.rtest_order2 as rel_339945705
   inner join information_schema.sequences as rel_339945706
   on (rel_339945705.a = rel_339945706.numeric_precision )
 inner join public.num_result as rel_339945707
 on (rel_339945705.b = rel_339945707.id1 )
  on (rel_339945678.random = rel_339945706.numeric_precision )
on (rel_339945677.b = rel_339945706.numeric_precision )
where rel_339945678.seqno ~* rel_339945705.c
fetch first 45 rows only;

-- ERROR:  failed to assign all NestLoopParams to plan nodes
select
  rel_273437910.name as c0,
  rel_273437908.sequence_catalog as c1,
  rel_273437865.b as c2,
  rel_273437910.location as c3,
  rel_273437864.id2c as c4,
  rel_273437908.start_value as c5
from
  public.rule_and_refint_t2 as rel_273437864
  inner join public.ruletest_tbl as rel_273437865
  on (rel_273437864.id2c = rel_273437865.a )
inner join information_schema.sequences as rel_273437908
 inner join public.rules_log as rel_273437909
 on (rel_273437908.numeric_precision_radix = rel_273437909.f1 )
  right join public.emp as rel_273437910
  on (rel_273437909.tag = rel_273437910.name )
on (rel_273437865.a = rel_273437908.numeric_precision )
where rel_273437909.tag !~* rel_273437909.tag
fetch first 110 rows only;

-- ERROR:  failed to assign all NestLoopParams to plan nodes
select
  rel_156464410.minimum_value as c0
from
  public.rule_and_refint_t1 as rel_156464330
  inner join public.rules_src as rel_156464331
  on (rel_156464330.id1b = rel_156464331.f1 )
inner join public.main_table as rel_156464401
  inner join pg_catalog.pg_file_settings as rel_156464402
 left join public.person as rel_156464409
   inner join information_schema.sequences as rel_156464410
   on (rel_156464409.age = rel_156464410.numeric_precision )
 on (rel_156464402.sourceline = rel_156464409.age )
  on (rel_156464401.a = rel_156464402.sourceline )
on (rel_156464331.f1 = rel_156464410.numeric_precision )
where rel_156464402.sourcefile @@ rel_156464409.name
fetch first 155 rows only;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-01 Thread Andreas Seltenreich
Tom Lane writes:

 Well, I certainly think all of these represent bugs:

 [...]

thanks for priorizing them.  I'll try to digest them somewhat before
posting.

 This one's pretty darn odd, because 2619 is pg_statistic and not an index
 at all:

  4 | ERROR:  cache lookup failed for index 2619

This is actually the one from the README :-).  Quoting to spare media
discontinuity:

--8---cut here---start-8---
Taking a closer look at it reveals that it happens when you query a
certain catalog view like this:

  self=# select indexdef from pg_catalog.pg_indexes where indexdef is not NULL;
  FEHLER:  cache lookup failed for index 2619

This is because the planner then puts pg_get_indexdef(oid) in a context
where it sees non-index-oids, which causes it to croak:

 QUERY PLAN

 Hash Join  (cost=17.60..30.65 rows=9 width=4)
   Hash Cond: (i.oid = x.indexrelid)
   -  Seq Scan on pg_class i  (cost=0.00..12.52 rows=114 width=8)
 Filter: ((pg_get_indexdef(oid) IS NOT NULL) AND (relkind = 
'i'::char))
   -  Hash  (cost=17.31..17.31 rows=23 width=4)
 -  Hash Join  (cost=12.52..17.31 rows=23 width=4)
   Hash Cond: (x.indrelid = c.oid)
   -  Seq Scan on pg_index x  (cost=0.00..4.13 rows=113 width=8)
   -  Hash  (cost=11.76..11.76 rows=61 width=8)
 -  Seq Scan on pg_class c  (cost=0.00..11.76 rows=61 
width=8)
   Filter: (relkind = ANY ('{r,m}'::char[]))
--8---cut here---end---8---

thanks,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] subplan variable reference / unassigned NestLoopParams (was: [sqlsmith] Failed assertion in joinrels.c)

2015-08-02 Thread Andreas Seltenreich
Tom Lane writes:

 Well, I certainly think all of these represent bugs:

  3 | ERROR:  plan should not reference subplan's variable
  2 | ERROR:  failed to assign all NestLoopParams to plan nodes

These appear to be related.  The following query produces the former,
but if you replace the very last reference of provider with the literal
'bar', it raises the latter error.

select 1 from
  pg_catalog.pg_shseclabel as rel_09
  inner join public.rtest_view2 as rel_32
  left join pg_catalog.pg_roles as rel_33
  on (rel_32.a = rel_33.rolconnlimit )
  on (rel_09.provider = rel_33.rolpassword )
left join pg_catalog.pg_user as rel_35
on (rel_33.rolconfig = rel_35.useconfig )
where ( ((rel_09.provider ~~ 'foo')
  and (rel_35.usename ~* rel_09.provider)));

,[ FWIW: git bisect run ]
| first bad commit: [e83bb10d6dcf05a666d4ada00d9788c7974ad378]
| Adjust definition of cheapest_total_path to work better with LATERAL.
`

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-05 Thread Andreas Seltenreich
Tom Lane writes:

 On 08/01/2015 05:59 PM, Tom Lane wrote:
 Well, I certainly think all of these represent bugs:
 1 | ERROR:  could not find pathkey item to sort

 Hmm ... I see no error with these queries as of today's HEAD or
 back-branch tips.  I surmise that this was triggered by one of the other
 recently-fixed bugs, though the connection isn't obvious offhand.

I still see this error in master as of b8cbe43, but the queries are
indeed a pita to reproduce.  The one below is the only one so far that
is robust against running ANALYZE on the regression db, and also
reproduces when I run it as an EXTRA_TEST with make check.

regards,
Andreas

select
  rel_217088662.a as c0,
  rel_217088554.a as c1,
  rel_217088662.b as c2,
  subq_34235266.c0 as c3,
  rel_217088660.id2 as c4,
  rel_217088660.id2 as c5
from
  public.clstr_tst as rel_217088554
inner join (select
   rel_217088628.a as c0
 from
   public.rtest_vview3 as rel_217088628
 where (rel_217088628.b !~ rel_217088628.b)
   and rel_217088628.b ~~* rel_217088628.b)
 or (rel_217088628.b ~* rel_217088628.b))
   or (rel_217088628.b  rel_217088628.b))
 or (rel_217088628.b = rel_217088628.b))) as subq_34235266
 inner join public.num_exp_mul as rel_217088660
   inner join public.onek2 as rel_217088661
   on (rel_217088660.id1 = rel_217088661.unique1 )
 on (subq_34235266.c0 = rel_217088660.id1 )
  inner join public.main_table as rel_217088662
  on (rel_217088661.unique2 = rel_217088662.a )
on (rel_217088554.b = rel_217088660.id1 )
where rel_217088554.d = rel_217088554.c
fetch first 94 rows only;


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-03 Thread Andreas Seltenreich
Peter Geoghegan writes:

 On Fri, Jul 31, 2015 at 5:56 PM, Andreas Seltenreich seltenre...@gmx.de 
 wrote:
 sqlsmith triggered the following assertion in master (c188204).

 Thanks for writing sqlsmith. It seems like a great tool.

 I wonder, are you just running the tool with assertions enabled when
 PostgreSQL is built?

Right.  I have to admit my testing setup is still more tailored towards
testing sqlsmith than postgres.

 If so, it might make sense to make various problems more readily
 detected.  As you may know, Clang has a pretty decent option called
 AddressSanitizer that can detect memory errors as they occur with an
 overhead that is not excessive.

I didn't known this clang feature yet, thanks for pointing it out.  I
considered running some instances under valgrind to detect these, but
the performance penalty seemed not worth it.

 One might use the following configure arguments when building
 PostgreSQL to use AddressSanitizer:

 ./configure CC=clang CFLAGS='-O1 -g -fsanitize=address
 -fno-omit-frame-pointer -fno-optimize-sibling-calls' --enable-cassert

A quick attempt to sneak these in made my ansible playbooks unhappy due
to make check failures and other generated noise.  I'll try to have an
instance with the AddressSanitizer active soon though.

 Of course, it remains to be seen if this pays for itself. Apparently
 the tool has about a 2x overhead [1]. I'm really not sure that you'll
 find any more bugs this way, but it's certainly possible that you'll
 find a lot more. Given your success in finding bugs without using
 AddressSanitizer, introducing it may be premature.

Piotr also suggested on IRC to run coverage tests w/ sqlsmith.  This
could yield valuable hints in which direction to extend sqlsmith's
grammar.

Thanks,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c

2015-08-01 Thread Andreas Seltenreich
Tom Lane writes:

 What concerns me more is that what you're finding is only cases that trip
 an assertion sanity check.  It seems likely that you're also managing to
 trigger other bugs with less drastic consequences, such as could not
 devise a query plan failures or just plain wrong answers.

Ja, some of these are logged as well[1], but most of them are really as
undrastic as can get, and I was afraid reporting them would be more of a
nuisance.  I analysed a couple of the cache lookup failures, and they
all had a similar severreness than the example in the README[2].  The
operator ones I analysed seem due to intentionally broken operators in
the regression db.  The NestLoopParams and subplan reference one sound
interesting though…

 I'm not sure how we could identify wrong answers automatically :-(

Csmith isn't doing this either.  They discuss differential testing
though in their papers, i.e., comparing the results of different
products.  Maybe a simple metric like numbers of rows returned might
already be valuable for correctness checks.

I also thought about doing some sampling on the data and simulating
relational operations and check for witness tuples, but it is probably
not appropriate to start implementing a mini-rdbms on the client side.

 but it might be worth checking for XX000 SQLSTATE responses, since
 generally that should be a can't-happen case.  (Or if it can happen,
 we need to change the errcode.)

The sqlstate is currently missing in the reports because libpqxx is not
putting it in it's exceptions :-/.

regards,
Andreas

Footnotes: 
[1]  smith=# select * from report24h;
 count |  error 
  
---+--
 43831 | ERROR:  unsupported XML feature
 39496 | ERROR:  invalid regular expression: quantifier operand invalid
 27261 | ERROR:  canceling statement due to statement timeout
 21386 | ERROR:  operator does not exist: point = point
  8580 | ERROR:  cannot compare arrays of different element types
  5019 | ERROR:  invalid regular expression: brackets [] not balanced
  4646 | ERROR:  could not determine which collation to use for string 
comparison
  2583 | ERROR:  invalid regular expression: nfa has too many states
  2248 | ERROR:  operator does not exist: xml = xml
  1198 | ERROR:  operator does not exist: polygon = polygon
  1171 | ERROR:  cache lookup failed for index 16862
   677 | ERROR:  invalid regular expression: parentheses () not balanced
   172 | ERROR:  cache lookup failed for index 257148
84 | ERROR:  could not find member 1(34520,34520) of opfamily 1976
55 | ERROR:  missing support function 1(34516,34516) in opfamily 1976
42 | ERROR:  operator does not exist: city_budget = city_budget
13 | ERROR:  could not find commutator for operator 34538
10 | ERROR:  could not identify a comparison function for type xid
 4 | Connection to database failed
 4 | ERROR:  cache lookup failed for index 2619
 3 | ERROR:  plan should not reference subplan's variable
 2 | ERROR:  cache lookup failed for index 12322
 2 | ERROR:  failed to assign all NestLoopParams to plan nodes
 2 | ERROR:  invalid regular expression: invalid character range
 1 | ERROR:  could not find pathkey item to sort
(25 rows)
Time: 1158,990 ms

[2]  https://github.com/anse1/sqlsmith/blob/master/README.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] ERROR: too late to create a new PlaceHolderInfo

2015-08-07 Thread Andreas Seltenreich
Hi,

on master at 36d9345, the attached queries raised too late to create a
new PlaceHolderInfo.

regards,
Andreas

select
   subq_218206.c0 as c0
 from
   (select
 rel_1375794.sl_name as c0,
 coalesce(rel_1375793.f2, rel_1375793.f2) as c1
   from
 public.subselect_tbl as rel_1375793
   inner join public.shoe_ready as rel_1375794
   on (rel_1375793.f1 = rel_1375794.sh_avail )
   where 34  21
   fetch first 99 rows only) as subq_218205,
   lateral (select
 rel_1375795.b as c0
   from
 public.rtest_t5 as rel_1375795
   where rel_1375795.b  coalesce(rel_1375795.b, rel_1375795.b)
   fetch first 122 rows only) as subq_218206
 where EXISTS (
   select
   coalesce(rel_1375802.tablename, rel_1375802.schemaname) as c0
 from
   (select
   rel_1375801.b as c0,
   rel_1375801.b as c1,
   rel_1375801.b as c2,
   rel_1375801.a as c3
 from
   public.rtest_t9 as rel_1375801
 where rel_1375801.b ~~ rel_1375801.b
 fetch first 180 rows only) as subq_218208
 left join pg_catalog.pg_policies as rel_1375802
 on (subq_218208.c2 = rel_1375802.cmd )
 where rel_1375802.qual !~* rel_1375802.cmd
 fetch first 118 rows only)
 fetch first 154 rows only;
 select
   coalesce(subq_734225.c1, subq_734225.c1) as c0,
   rel_4650803.srvversion as c1
 from
   (select
   rel_4650756.name as c0,
   subq_734226.c0 as c1
 from
   public.street as rel_4650756,
   lateral (select
 rel_4650757.conproc as c0
   from
 pg_catalog.pg_conversion as rel_4650757
   where rel_4650757.conowner = rel_4650757.connamespace
   fetch first 52 rows only) as subq_734226
 where rel_4650756.name ~~* coalesce(rel_4650756.cname, 
rel_4650756.cname)) as subq_734225
 right join (select
 rel_4650790.thepath as c0,
 rel_4650790.name as c1
   from
 public.road as rel_4650790
   where rel_4650790.thepath  rel_4650790.thepath) as subq_734232
   inner join pg_catalog.pg_policies as rel_4650802
   right join pg_catalog.pg_foreign_server as rel_4650803
   on (rel_4650802.tablename = rel_4650803.srvname )
 inner join pg_catalog.pg_roles as rel_4650804
 on (rel_4650803.srvtype = rel_4650804.rolpassword )
   on (subq_734232.c1 = rel_4650802.cmd )
 on (subq_734225.c0 = rel_4650803.srvtype )
 where (((rel_4650804.rolname !~ subq_734225.c0)
   and (((rel_4650802.tablename is NULL)
   or (rel_4650802.qual !~* subq_734225.c0))
 and (subq_734225.c0 ~ subq_734225.c0)))
 and ((rel_4650802.policyname is not NULL)
   or (rel_4650802.roles is NULL)))
   and (rel_4650802.with_check ~~ subq_734225.c0)
 fetch first 63 rows only;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Failed assertion in analyzejoins.c

2015-08-06 Thread Andreas Seltenreich
Hi,

this one was in today's sqlsmith harvest.  It triggers an assertion in
current master (c030dc4).

regards,
Andreas

-- TRAP: FailedAssertion(!(!bms_is_empty(phinfo-ph_eval_at)), File: 
analyzejoins.c, Line: 474)
select
  rel_141618057.srvfdw as c0,
  rel_141618057.srvversion as c1
from
  public.tv as rel_141617979
right join pg_catalog.pg_foreign_server as rel_141618057
  left join pg_catalog.pg_roles as rel_141618058
  on (rel_141618057.srvname = rel_141618058.rolname )
on (rel_141617979.type = rel_141618058.rolpassword )
where rel_141618057.srvfdw is not NULL;


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

2015-07-26 Thread Andreas Seltenreich
Hi,

when running my random query generator contraption[1] against the
regression database of 9.5 or master, it occasionally triggers one of
the following three assertions.  Someone more knowledgeable might want
to take a look at them...

-- FailedAssertion(!(outer_rel-rows  0), File: indxpath.c, Line: 1911)
-- sample query:
select
  rel1925354.loid as c0,
  rel1925353.version as c1
from
  (select
rel1925352.aa as c0,
rel1925352.aa as c1
  from
public.b as rel1925352
  where (rel1925352.bb is NULL)
and (rel1925352.bb  rel1925352.bb)) as subq_303136
  inner join pg_catalog.pg_stat_ssl as rel1925353
  on (subq_303136.c0 = rel1925353.pid )
right join pg_catalog.pg_largeobject as rel1925354
on (subq_303136.c0 = rel1925354.pageno )
where (rel1925353.clientdn !~ rel1925353.clientdn)
  and (rel1925353.cipher = rel1925353.clientdn);

,[ git bisect ]
|   first bad commit: [3f8c23c4d31d4a0e801041733deb2c7cfa577b32] Improve
|   predtest.c's ability to reason about operator expressions.
`

-- FailedAssertion(!(!bms_is_empty(phinfo-ph_eval_at)), File: 
placeholder.c, Line: 109)
-- sample query:
select
  rel1600276.viewowner as c0,
  rel1600274.maxwritten_clean as c1,
  rel1600275.n_tup_hot_upd as c2
from
  pg_catalog.pg_stat_bgwriter as rel1600274
  inner join pg_catalog.pg_stat_xact_all_tables as rel1600275
  on (rel1600274.maxwritten_clean = rel1600275.seq_scan )
right join pg_catalog.pg_views as rel1600276
  right join pg_catalog.pg_operator as rel1600277
  on (rel1600276.viewname = rel1600277.oprname )
on (rel1600275.relname = rel1600277.oprname )
where 3 is not NULL;

,[ git bisect ]
|   first bad commit: [f4abd0241de20d5d6a79b84992b9e88603d44134] Support
|   flattening of empty-FROM subqueries and one-row VALUES tables.
`

-- FailedAssertion(!(key-sk_flags  0x0080), File: brin_minmax.c, Line: 
177)
-- sample query:
select
  rel167978.namecol as c0
from
  information_schema.parameters as rel167972
left join public.student as rel167977
  inner join public.brintest as rel167978
  on (rel167977.age = rel167978.int4col )
on (rel167972.interval_precision = rel167977.age )
where rel167977.name  rel167977.name;

regards,
andreas

Footnotes: 
[1]  https://github.com/anse1/sqlsmith


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

2015-07-27 Thread Andreas Seltenreich
Tom Lane writes:

 Andreas Seltenreich seltenre...@gmx.de writes:
 when running my random query generator contraption[1] against the
 regression database of 9.5 or master, it occasionally triggers one of
 the following three assertions.

 I've fixed the first two of these --- thanks for the report!

I let sqlsmith run during the night, and it did no longer trigger the
first two.  During roughly a million random queries it triggered the
already mentioned brin one 10 times, but there was also one instance of
this new one in the log:

TRAP: FailedAssertion(!(join_clause_is_movable_into(rinfo, joinrel-relids, 
join_and_req)), File: relnode.c, Line: 987)
LOG:  server process (PID 12851) was terminated by signal 6: Aborted
DETAIL:  Failed process was running: select  
  rel65543066.tmplname as c0, 
  rel65543064.umuser as c1
from 
  public.dept as rel65543059
inner join pg_catalog.pg_user_mappings as rel65543064
left join pg_catalog.pg_enum as rel65543065
on (rel65543064.srvname = rel65543065.enumlabel )
  inner join pg_catalog.pg_ts_template as rel65543066
  on (rel65543065.enumtypid = rel65543066.tmplnamespace )
on (rel65543059.dname = rel65543064.srvname )
where ((rel65543059.mgrname = rel65543059.mgrname) 
and (rel65543064.usename = rel65543066.tmplname)) 
  and (rel65543059.mgrname ~~ rel65543059.mgrname)
fetch first 128 rows only;

 ,[ git bisect ]
 |   first bad commit: [3f8c23c4d31d4a0e801041733deb2c7cfa577b32] Improve
 |   predtest.c's ability to reason about operator expressions.
 `

 I'm a bit confused about this aspect of your report though, because in
 my hands that example fails clear back to 9.2.  It doesn't seem to require
 the predtest.c improvement to expose the fault.

Hmm, I actually used a different, uglier query to trigger this assertion
for the bisection run.  I'll attach it[1] along with the complete git
bisect log[2].

regards,
andreas

Footnotes: 
[1]  select  subq_717608.c3 as c0, rel4551421.inhrelid as c1, 
rel4551421.inhrelid as c2, subq_717608.c3 as c3 from 
information_schema.foreign_tables as rel4551363 right join public.hash_f8_heap 
as rel4551366 inner join pg_catalog.pg_constraint as rel4551419 inner join 
(select  rel4551420.bb as c0, rel4551420.aa as c1, rel4551420.aa as c2, 
rel4551420.aa as c3 from public.b as rel4551420 where ( 
rel4551420.bbrel4551420.bb ) and ( rel4551420.bbrel4551420.bb ) ) as 
subq_717608 on (rel4551419.coninhcount = subq_717608.c1 ) left join 
pg_catalog.pg_inherits as rel4551421 on (subq_717608.c1 = rel4551421.inhseqno ) 
on (rel4551366.seqno = subq_717608.c1 ) on (rel4551363.foreign_table_schema = 
rel4551419.conname ) where ( ( rel4551419.contypidrel4551419.connamespace ) 
and ( rel4551419.connamespace=rel4551419.conrelid ) ) and ( 
rel4551421.inhparentrel4551419.contypid )  fetch first 9 rows only ;

[2] git bisect start
# bad: [3b5a89c4820fb11c337838c1ad71e8e93f2937d1] Fix resource leak pointed out 
by Coverity.
git bisect bad 3b5a89c4820fb11c337838c1ad71e8e93f2937d1
# good: [e6df2e1be6330660ba4d81daa726ae4a71535aa9] Stamp 9.4beta1.
git bisect good e6df2e1be6330660ba4d81daa726ae4a71535aa9
# bad: [68e66923ff629c324e219090860dc9e0e0a6f5d6] Add missing volatile 
qualifier.
git bisect bad 68e66923ff629c324e219090860dc9e0e0a6f5d6
# bad: [a16bac36eca8158cbf78987e95376f610095f792] Remove dependency on 
wsock32.lib in favor of ws2_32
git bisect bad a16bac36eca8158cbf78987e95376f610095f792
# good: [55d5b3c08279b487cfa44d4b6e6eea67a0af89e4] Remove unnecessary output 
expressions from unflattened subqueries.
git bisect good 55d5b3c08279b487cfa44d4b6e6eea67a0af89e4
# bad: [1cbc9480106241aaa8db112331e93d0a265b6db0] Check interrupts during 
logical decoding more frequently.
git bisect bad 1cbc9480106241aaa8db112331e93d0a265b6db0
# bad: [686f362bee126e50280bcd3b35807b02f18a8966] Fix 
contrib/pg_upgrade/test.sh for $PWD containing spaces.
git bisect bad 686f362bee126e50280bcd3b35807b02f18a8966
# bad: [be76a6d39e2832d4b88c0e1cc381aa44a7f86881] Secure Unix-domain sockets of 
make check temporary clusters.
git bisect bad be76a6d39e2832d4b88c0e1cc381aa44a7f86881
# good: [6554656ea2043c5bb877b427237dc5ddd7c5e5c8] Improve tuplestore's error 
messages for I/O failures.
git bisect good 6554656ea2043c5bb877b427237dc5ddd7c5e5c8
# bad: [a7205d81573cb0c979f2d463a1d9edb6f97c94aa] Adjust 9.4 release notes.
git bisect bad a7205d81573cb0c979f2d463a1d9edb6f97c94aa
# bad: [3f8c23c4d31d4a0e801041733deb2c7cfa577b32] Improve predtest.c's ability 
to reason about operator expressions.
git bisect bad 3f8c23c4d31d4a0e801041733deb2c7cfa577b32
# good: [c81e63d85f0c2c39d3fdfd8b95fc1ead6fdcb89f] Fix pg_restore's processing 
of old-style BLOB COMMENTS data.
git bisect good c81e63d85f0c2c39d3fdfd8b95fc1ead6fdcb89f
# first bad commit: [3f8c23c4d31d4a0e801041733deb2c7cfa577b32] Improve 
predtest.c's ability to reason about operator

Re: [HACKERS] 9.3.9 and pg_multixact corruption

2015-09-13 Thread Andreas Seltenreich
Thomas Munro writes:

> In various places we have int pageno = offset / (uint32) 1636, expanded
> from this macro (which calls the offset an xid):

It appears to depend on the context it is expanded in, as some of the
code must have gotten the segment number right:

,[ ls -sh pg_multixact/members/ ]
| 256K 97E0
| [...]
| 256K A03B
|  24K A03C = -5FC4
|0 5FC4
`

> I don't really see how any uint32 value could produce such a pageno via
> that macro.  Even if called in an environment where (xid) is accidentally
> an int, the int / unsigned expression would convert it to unsigned first
> (unless (xid) is a bigger type like int64_t: by the rules of int promotion
> you'd get signed division in that case, hmm...).  But it's always called
> with a MultiXactOffset AKA uint32 variable.

I managed disassemble RecordNewMultiXact from the core dump using a
cross-binutils, and it reveals that the compiler[1] appears to have
indeed generated a signed division here.  I'm attaching a piece of C
code that does the same computation as the assembly (I think), as well
as the disassembly itself.

regards,
Andreas

Footnotes: 
[1]  Sun C 5.12 SunOS_sparc Patch 148917-07 2013/10/18, 64-bit

#include 
#include 
#include 

uint32_t offset2page(uint32_t offset)
{
  uint64_t l0, i3, i5, i4, l6_1, l6_2, o2, o4, o1, o3;
  l0 = offset;
  i3 = 0x5fc3e800ULL;
  i5 = i3 ^ -375;
  i4 = (int32_t)l0;

  l6_1 = i4 * i5;
  o2 = l6_1 >> 32;
  o4 = l0 + o2;
  o1 = ((int32_t)o4) >> 10;
  o3 = ((int32_t)l0) >> 31;
  l6_2 = o1 - o3;
  return l6_2;
}

int main(int argc, char *argv[])
{
  uint32_t page = offset2page(atol(argv[1]));
  printf("page: %d\n", page);
  printf("segment: %04X\n", (int32_t)page/32);
  return 0;
}

RecordNewMultiXact:
   100112be8:   07 00 04 01 sethi  %hi(0x100400), %g3
   100112bec:   9d e3 bf 30 save  %sp, -208, %sp
   100112bf0:   82 10 e2 bd or  %g3, 0x2bd, %g1
   100112bf4:   90 10 20 0e mov  0xe, %o0
   100112bf8:   a5 36 20 00 srl  %i0, 0, %l2
   100112bfc:   9f 28 70 0c sllx  %g1, 0xc, %o7
   100112c00:   a2 0c a7 ff and  %l2, 0x7ff, %l1
   100112c04:   b9 3e a0 00 sra  %i2, 0, %i4
   100112c08:   b4 03 ed d8 add  %o7, 0xdd8, %i2
   100112c0c:   b1 36 60 00 srl  %i1, 0, %i0
   100112c10:   f0 23 a8 af st  %i0, [ %sp + 0x8af ]
   100112c14:   b2 10 00 1b mov  %i3, %i1
   100112c18:   40 08 1d 66 call  0x10031a1b0 ; LWLockAcquire
   100112c1c:   92 10 20 00 clr  %o1
   100112c20:   93 34 a0 0b srl  %l2, 0xb, %o1
   100112c24:   90 10 00 1a mov  %i2, %o0
   100112c28:   97 34 a0 00 srl  %l2, 0, %o3
   100112c2c:   7f ff f6 bd call  0x100110720 ; SimpleLruReadPage
   100112c30:   94 10 20 01 mov  1, %o2
   100112c34:   fa 5e a0 00 ldx  [ %i2 ], %i5
   100112c38:   ab 3a 20 00 sra  %o0, 0, %l5
   100112c3c:   96 10 20 01 mov  1, %o3
   100112c40:   a9 2d 70 03 sllx  %l5, 3, %l4
   100112c44:   e0 5f 60 08 ldx  [ %i5 + 8 ], %l0
   100112c48:   a7 3c 60 00 sra  %l1, 0, %l3
   100112c4c:   ad 2c f0 02 sllx  %l3, 2, %l6
   100112c50:   da 5c 00 14 ldx  [ %l0 + %l4 ], %o5
   100112c54:   f0 23 40 16 st  %i0, [ %o5 + %l6 ]
   100112c58:   d8 5e a0 00 ldx  [ %i2 ], %o4
   100112c5c:   d4 5b 20 18 ldx  [ %o4 + 0x18 ], %o2
   100112c60:   d6 2a 80 15 stb  %o3, [ %o2 + %l5 ]
   100112c64:   40 08 1f 2d call  0x10031a918 ; LWLockRelease
   100112c68:   90 10 20 0e mov  0xe, %o0
   100112c6c:   90 10 20 0f mov  0xf, %o0
   100112c70:   40 08 1d 50 call  0x10031a1b0 ; LWLockAcquire
   100112c74:   92 10 20 00 clr  %o1
   100112c78:   80 a7 20 00 cmp  %i4, 0
   100112c7c:   04 40 00 7c ble,pn   %icc, 0x100112e6c
   100112c80:   90 07 3f ff add  %i4, -1, %o0
   100112c84:   e0 03 a8 af ld  [ %sp + 0x8af ], %l0
   100112c88:   37 17 f0 fa sethi  %hi(0x5fc3e800), %i3
   100112c8c:   86 10 3f ff mov  -1, %g3
   100112c90:   ba 1e fe 89 xor  %i3, -375, %i5
   100112c94:   d0 23 a8 b7 st  %o0, [ %sp + 0x8b7 ]
   100112c98:   a2 10 26 64 mov  0x664, %l1
   100112c9c:   c0 23 a8 b3 clr  [ %sp + 0x8b3 ]
   100112ca0:   37 10 1e 0b sethi  %hi(0x40782c00), %i3
   100112ca4:   b9 3c 20 00 sra  %l0, 0, %i4
   100112ca8:   ac 4f 00 1d mulx  %i4, %i5, %l6
   100112cac:   95 35 b0 20 srlx  %l6, 0x20, %o2
   100112cb0:   ba 10 26 63 mov  0x663, %i5
   100112cb4:   98 04 00 0a add  %l0, %o2, %o4
   100112cb8:   b8 10 20 01 mov  1, %i4
   100112cbc:   93 3b 20 0a sra  %o4, 0xa, %o1
   100112cc0:   97 3c 20 1f sra  %l0, 0x1f, %o3
   100112cc4:   ac 22 40 0b sub  %o1, %o3, %l6
   100112cc8:   89 2d a0 04 sll  %l6, 4, %g4
   100112ccc:   82 01 00 16 add  %g4, %l6, %g1
   100112cd0:   b1 28 60 02 sll  %g1, 2, %i0
   100112cd4:   84 26 00 01 sub  %i0, %g1, %g2
   100112cd8:   f0 03 a8 af ld  [ %sp + 0x8af ], %i0
   100112cdc:   9f 28 a0 03 sll  %g2, 3, %o7
   100112ce0:   90 05 80 0f add  %l6, 

[HACKERS] RemoveLocalLock pfree'ing NULL when out-of-memory

2015-09-20 Thread Andreas Seltenreich
Hi,

a memory-starved instance of sqlsmith just caught RemoveLocalLock
pfree'ing a NULL in locallock->lockOwners.  I think what happened is
that it was called to clean up after LockAcquireExtended's
MemoryContextAlloc failed.  The content of errordata seems consistent
with this.

Caught in master as of 85eda7e (sorry about the bogus hashes in earlier
reports, I had some cruft on my local branch).

regards,
Andreas

FailedAssertion("!(pointer != ((void *)0))", File: "mcxt.c", Line: 1002)

#3  0x007e1c80 in pfree (pointer=) at mcxt.c:1002
#4  0x006bdd24 in RemoveLocalLock (locallock=locallock@entry=0x3a90d68) 
at lock.c:1225
#5  0x006c1ceb in LockReleaseAll (lockmethodid=lockmethodid@entry=1, 
allLocks=1 '\001') at lock.c:2083
#6  0x006c3274 in ProcReleaseLocks (isCommit=isCommit@entry=0 '\000') 
at proc.c:752
#7  0x007e3700 in ResourceOwnerReleaseInternal 
(owner=owner@entry=0x208b488, phase=phase@entry=RESOURCE_RELEASE_LOCKS, 
isCommit=isCommit@entry=0 '\000', isTopLevel=isTopLevel@entry=1 '\001') at 
resowner.c:307
#8  0x007e381f in ResourceOwnerRelease (owner=0x208b488, 
phase=phase@entry=RESOURCE_RELEASE_LOCKS, 
isCommit=isCommit@entry=0 '\000', isTopLevel=isTopLevel@entry=1 '\001') at 
resowner.c:212
#9  0x004e903b in AbortTransaction () at xact.c:2557
#10 0x004e98ad in AbortCurrentTransaction () at xact.c:3003
#11 0x006d45a3 in PostgresMain (argc=1, argv=0x202e638, 
dbname=0x202e610 "regression", username=0x202e5f0 "smith")
at postgres.c:3856
#12 0x00466964 in BackendRun (port=0x204e080) at postmaster.c:4204
#13 BackendStartup (port=0x204e080) at postmaster.c:3880
#14 ServerLoop () at postmaster.c:1683
#15 0x0067867e in PostmasterMain (argc=argc@entry=3, 
argv=argv@entry=0x202d600) at postmaster.c:1292
#16 0x0046774d in main (argc=3, argv=0x202d600) at main.c:223

(gdb) p errordata[0]
$4 = {elevel = 20, output_to_server = 0 '\000', output_to_client = 1 '\001', 
show_funcname = 0 '\000', hide_stmt = 0 '\000', 
  hide_ctx = 0 '\000', filename = 0x9a1f80 "mcxt.c", lineno = 769, funcname = 
0x9a24b0 <__func__.5880> "MemoryContextAlloc", 
  domain = 0x9350f6 "postgres-9.6", context_domain = 0x9350f6 "postgres-9.6", 
sqlerrcode = 8389, 
  message = 0x296d020 "out of memory", detail = 0x296cfe8 "Failed on request of 
size 128.", detail_log = 0x0, hint = 0x0, 
  context = 0x0, schema_name = 0x0, table_name = 0x0, column_name = 0x0, 
datatype_name = 0x0, constraint_name = 0x0, cursorpos = 0, 
  internalpos = 0, internalquery = 0x0, saved_errno = 12, assoc_context = 
0x296a7a8}


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.3.9 and pg_multixact corruption

2015-09-25 Thread Andreas Seltenreich
[ adding Bjorn Munch to Cc ]

Jim Nasby writes:
> On 9/20/15 9:23 AM, Christoph Berg wrote:
>> a short update here: the customer updated the compiler to a newer
>> version, is now compiling using -O2 instead of -O3, and the code
>> generated now looks sane, so this turned out to be a compiler issue.
>> (Though it's unclear if the upgrade fixed it, or the different -O
>> level.)
>
> Do we officially not support anything > -O2? If so it'd be nice if
> configure threw at least a warning (if not an error that you had to
> explicitly over-ride).

At least the solaris binaries distributed via postgresql.org[1] have
been compiled with -xO3 according to pg_config.  And their code for
multixact.c looks inconspicuous.  To recap the data points:

| compiler  | flags | multixact.o |
|---+---+-|
| Sun C 5.12 SunOS_sparc Patch 148917-07 2013/10/18 | -xO3  | bad |
| Sun C 5.13 SunOS_Sparc 2014/10/20 | -xO2  | good|
| Sun C 5.8 Patch 121015-04 2007/01/10  | -xO3  | good|

regards,
Andreas

Footnotes: 
[1]  http://www.postgresql.org/download/solaris/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.3.9 and pg_multixact corruption

2015-09-25 Thread Andreas Seltenreich
Alvaro Herrera writes:

> Jim Nasby wrote:
>> Do we officially not support anything > -O2? If so it'd be nice if configure
>> threw at least a warning (if not an error that you had to explicitly
>> over-ride).
>
> Keep in mind this is Sun OS C -- not one of the most popular compilers
> in the world.  I don't know what you suggest: have a test program that
> configure runs and detects whether the compiler does the wrong thing?
> It doesn't seem a sane idea to maintain test cases for all known
> compiler bugs ...

I think the intention was to make configure complain if there's a -O > 2
in CFLAGS.

OTOH, a unit test for multixact.c that exercises the code including
wraparounds sounds like a desirable thing regardless of the fact that it
could have caught this miscompilation earlier than 6 months into
production.

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed to generate plan on lateral subqueries

2015-12-07 Thread Andreas Seltenreich
I wrote:

> Tom Lane writes:
>> Andreas Seltenreich <seltenre...@gmx.de> writes:
>>> I've added new grammar rules to sqlsmith and improved some older ones.
>>> This was rewarded with a return of "failed to generate plan" errors.
>>
>> I believe I've dealt with these cases now.  Thanks for the report!
>
> I no longer see "failed to build any n-way joins" after pulling, but
> there are still instances of "could not devise a query plan". Samples below.

sorry, I spoke too soon: nine of the former have been logged through the
night.  I'm attaching a larger set of sample queries this time in case
that there are still multiple causes for the observed errors.

regards,
Andreas

-- select query||';' from error e
--  where (msg ~~ 'ERROR:  failed to build any%')
--  and e.t > now() - interval '8 hours'
--  order by length(query) asc;

select  
  ref_0.collation_schema as c0
from 
  (select  
  sample_0.is_supported as c0
from 
  information_schema.sql_packages as sample_0 tablesample system (0.8) 
where 25 > 15
fetch first 60 rows only) as subq_0
right join information_schema.collations as ref_0
on (subq_0.c0 = ref_0.collation_catalog ),
  lateral (select  
ref_1.f2 as c0, 
ref_1.f1 as c1, 
ref_0.collation_name as c2, 
ref_0.pad_attribute as c3, 
subq_0.c0 as c4, 
subq_0.c0 as c5, 
subq_0.c0 as c6
  from 
public.func_index_heap as ref_1
  where ref_1.f2 ~ ref_1.f1
  fetch first 170 rows only) as subq_1,
  lateral (select  
sample_8.b as c0, 
subq_1.c1 as c1, 
sample_8.a as c2
  from 
public.clstr_tst as sample_8 tablesample system (8.8) 
left join public.f_star as sample_9 tablesample bernoulli (7.5) 
on (sample_8.a = sample_9.aa )
  left join public.rules_log as sample_10 tablesample bernoulli (7) 
  on (sample_8.b = sample_10.f1 )
  where sample_10.tag <> sample_8.c
  fetch first 126 rows only) as subq_2
where (ref_0.collation_name is NULL) 
  or (subq_2.c1 !~ subq_2.c1)
fetch first 56 rows only;
select  
  subq_16.c4 as c0
from 
  public.rule_and_refint_t1 as sample_18 tablesample system (9.7) 
left join (select  
sample_20.xx as c0, 
sample_20.xx as c1
  from 
public.inhf as sample_20 tablesample bernoulli (1) 
  where sample_20.xx is not NULL) as subq_11
  inner join public.main_table as sample_25 tablesample system (1.2) 
inner join (select  
  sample_26.b as c0, 
  subq_15.c1 as c1, 
  subq_15.c0 as c2, 
  32 as c3, 
  subq_15.c0 as c4
from 
  public.dropcolumn as sample_26 tablesample system (7.6) ,
  lateral (select  
sample_27.t as c0, 
sample_26.b as c1
  from 
public.radix_text_tbl as sample_27 tablesample system (4.6) 
  where (sample_26.b is not NULL) 
and (sample_27.t ~~ sample_27.t)) as subq_15
where subq_15.c0 !~ subq_15.c0) as subq_16
on (sample_25.b = subq_16.c0 )
  on (subq_11.c0 = subq_16.c2 )
on (sample_18.id1a = sample_25.a ),
  lateral (select  
subq_16.c3 as c0, 
subq_17.c1 as c1, 
sample_18.id1a as c2, 
coalesce(subq_16.c0, subq_16.c1) as c3
  from 
public.rtest_vcomp as ref_21,
lateral (select  
  sample_28.a as c0, 
  ref_21.size_in_cm as c1, 
  subq_11.c0 as c2, 
  sample_18.id1a as c3
from 
  public.tab1 as sample_28 tablesample system (8.2) 
where subq_11.c1 <= sample_28.b
fetch first 111 rows only) as subq_17
  where ref_21.size_in_cm is NULL
  fetch first 101 rows only) as subq_18
where subq_11.c0 ~>=~ subq_11.c1
fetch first 94 rows only;
select  
  subq_33.c0 as c0, 
  subq_33.c0 as c1
from 
  (select  
sample_95.y as c0, 
sample_95.z as c1
  from 
public.check2_tbl as sample_95 tablesample system (5.1) 
  where sample_95.y = sample_95.y) as subq_28
  right join pg_catalog.pg_user as ref_101
  on (subq_28.c0 = ref_101.passwd )
left join pg_catalog.pg_opfamily as sample_96 tablesample bernoulli (9.8) 
on (ref_101.usesysid = sample_96.opfmethod ),
  lateral (select  
subq_31.c0 as c0, 
sample_97.e as c1, 
sample_97.e as c2
  from 
public.dropcolumnchild as sample_97 tablesample system (2.4) ,
lateral (select  
  subq_30.c0 as c0
from 
  public.random_tbl as sample_98 tablesample system (8.4) ,
  lateral (select  
subq_28.c1 as c0, 
 

[HACKERS] [sqlsmith] Failed to generate plan on lateral subqueries

2015-12-06 Thread Andreas Seltenreich
Hi,

I've added new grammar rules to sqlsmith and improved some older ones.
This was rewarded with a return of "failed to generate plan" errors.
The failing queries all contain a lateral subquery.  The shortest of the
failing queries are below.  They were run against the regression db of
master as of db07236.

regards,
Andreas

smith=# select msg, query from error where
   (firstline(msg) ~~ 'ERROR:  failed to build any%'
   or firstline(msg) ~~ 'ERROR:  could not devise a query plan%')
  and t > now() - interval '1 day' order by length(query) asc limit 3;

ERROR:  failed to build any 8-way joins
select
  ref_96.foreign_table_schema as c0,
  sample_87.is_supported as c1
from
  information_schema.sql_packages as sample_87 tablesample system (0.2)
right join information_schema._pg_foreign_tables as ref_96
on (sample_87.feature_id = ref_96.foreign_table_catalog ),
  lateral (select
sample_87.is_verified_by as c0,
ref_97.indexed_col as c1,
coalesce(sample_87.feature_id, ref_96.foreign_server_name) as c2,
4 as c3
  from
public.comment_test as ref_97
  where ref_97.id ~>~ ref_97.indexed_col
  fetch first 73 rows only) as subq_33
where ref_96.foreign_table_name ~~ subq_33.c1

ERROR:  could not devise a query plan for the given query
select
  subq_43.c0 as c0
from
  (select
  ref_181.installed as c0
from
  pg_catalog.pg_available_extension_versions as ref_181,
  lateral (select
ref_181.name as c0,
ref_181.installed as c1
  from
pg_catalog.pg_conversion as ref_182
  where ref_182.conname ~~* ref_181.version
  fetch first 98 rows only) as subq_42
where (subq_42.c0 is not NULL)
  or (subq_42.c1 is NULL)) as subq_43
right join pg_catalog.pg_language as sample_177 tablesample system (2.8)
on (subq_43.c0 = sample_177.lanispl )
where sample_177.lanowner < sample_177.lanvalidator

ERROR:  failed to build any 5-way joins
select
  ref_239.id2 as c0,
  40 as c1,
  ref_239.id2 as c2,
  ref_238.aa as c3
from
  public.tt5 as sample_289 tablesample system (8.1)
inner join information_schema.element_types as ref_237
on (sample_289.x = ref_237.character_maximum_length )
  left join public.b as ref_238
  on (ref_237.character_maximum_length = ref_238.aa )
left join public.num_exp_mul as ref_239
on (ref_237.numeric_precision_radix = ref_239.id1 ),
  lateral (select
sample_290.b as c0,
sample_289.y as c1,
ref_239.id2 as c2
  from
public.rtest_t8 as sample_290 tablesample bernoulli (4.6)
  where (sample_290.b > ref_238.bb)
and (sample_289.y > ref_239.expected)
  fetch first 91 rows only) as subq_64
where (subq_64.c1 > sample_289.y)
  and (sample_289.y = ref_239.expected)
fetch first 133 rows only


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed to generate plan on lateral subqueries

2015-12-07 Thread Andreas Seltenreich
Tom Lane writes:

> Andreas Seltenreich <seltenre...@gmx.de> writes:
>> I've added new grammar rules to sqlsmith and improved some older ones.
>> This was rewarded with a return of "failed to generate plan" errors.
>
> I believe I've dealt with these cases now.  Thanks for the report!

I no longer see "failed to build any n-way joins" after pulling, but
there are still instances of "could not devise a query plan". Samples below.

regards,
Andreas

select
  ref_1.aa as c0,
  subq_1.c1 as c1,
  coalesce(ref_1.class, ref_1.class) as c2,
  subq_1.c0 as c3
from
  (select
  subq_0.c1 as c0,
  coalesce(sample_0.a, sample_1.i) as c1
from
  public.rtest_t9 as sample_0 tablesample bernoulli (5.6)
inner join public.iportaltest as sample_1 tablesample bernoulli 
(9.8)
on (sample_0.a = sample_1.i ),
  lateral (select
sample_1.d as c0,
ref_0.a as c1,
sample_1.p as c2,
ref_0.a as c3,
ref_0.a as c4,
sample_0.b as c5,
sample_1.i as c6
  from
public.rtest_view2 as ref_0
  where sample_0.b = sample_0.b
  fetch first 93 rows only) as subq_0
where sample_0.b ~<=~ sample_0.b) as subq_1
right join public.e_star as ref_1
on (subq_1.c0 = ref_1.aa )
where ref_1.cc < ref_1.cc
fetch first 59 rows only;

select
  sample_69.tmpllibrary as c0,
  coalesce(sample_69.tmplname, sample_69.tmplname) as c1,
  subq_33.c0 as c2
from
  (select
  coalesce(ref_53.provider, sample_68.typdefault) as c0
from
  pg_catalog.pg_type as sample_68 tablesample bernoulli (6.9)
inner join pg_catalog.pg_shseclabel as ref_53
on (sample_68.typowner = ref_53.objoid ),
  lateral (select
sample_68.typcategory as c0,
ref_54.speaker as c1,
ref_54.speaker as c2
  from
public.test_range_excl as ref_54
  where (ref_53.label >= ref_53.provider)
and (ref_53.label !~* ref_53.provider)
  fetch first 143 rows only) as subq_32
where ref_53.label ~>~ ref_53.label) as subq_33
right join pg_catalog.pg_pltemplate as sample_69 tablesample bernoulli (9.8)
on (subq_33.c0 = sample_69.tmplhandler )
where sample_69.tmplvalidator ~ subq_33.c0
fetch first 131 rows only;


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed to generate plan on lateral subqueries

2015-12-10 Thread Andreas Seltenreich
Tom Lane writes:

> [2. transitive-lateral-fixes-1.patch]

I was about to write that sqlsmith likes the patch, but after more than
10^8 ok queries the attached ones were generated.

regards,
Andreas



post-patch-errors.sql
Description: application/sql

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed to generate plan on lateral subqueries

2015-12-11 Thread Andreas Seltenreich
Tom Lane writes:
> [2. transitive-lateral-fixes-2.patch ]
> [2. remove-lateraljoininfo-2.patch ]

They seem to have fixed the issue for good now.  No errors have been
logged for 2e8 queries since applying the first patch.  (The second one
was applied later and didn't get as much exposure.)  I guess that means
I have to go back to extending the grammar again :-).

regards
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed to generate plan on lateral subqueries

2015-12-11 Thread Andreas Seltenreich
Peter Geoghegan writes:

> On Sun, Dec 6, 2015 at 9:52 AM, Andreas Seltenreich <seltenre...@gmx.de> 
> wrote:
>> I've added new grammar rules to sqlsmith and improved some older ones.
>
> Could you possibly teach sqlsmith about INSERT ... ON CONFLICT DO
> UPDATE/IGNORE? I think that that could be very helpful, especially if
> it could be done in advance of any stable release of 9.5.

In summary, it can't be added ad-hoc, but might still happen in advance
of the release of 9.5.

Adding upsert needs significiant effort because historically,
non-boolean value expression productions yield a random type.  This is
not a problem for generating queries, but it is for inserts.  Also,
sqlsmith can at the moment only generate sensible value expressions from
column references.  Generating a proper upsert would require supporting
type-constraining of productions as well as adding productions for
pulling values out of thin air (e.g., generating atomic value subselects
or calling argumentless functions).

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed to generate plan on lateral subqueries

2015-12-14 Thread Andreas Seltenreich
David Fetter writes:

> On Mon, Dec 14, 2015 at 03:06:18PM +0900, Michael Paquier wrote:
>> On Sun, Dec 13, 2015 at 10:14 AM, Andreas Seltenreich wrote:
>> > https://github.com/anse1/sqlsmith
>> 
>> I am in awe regarding this stuff, which has caught many bugs
>> already, it is a bit sad that it is released under the GPL license
>> preventing a potential integration into PostgreSQL core to
>> strengthen the test infrastructure,
>
> I suspect that a polite request to the Andreas that he change to a
> PostgreSQL-compatible license like one of (TPL, BSD2, MIT) might
> handle this part.

It probably would, but I never thought core integration would be a
desirable thing.  Like Csmith, SQLsmith is intended to be
product-agnostic.  That's not yet the case, but it's still on the
roadmap.

Further, the license shouldn't interfere with institutionalizing
sqlsmith somewhere to automatically send mails on failed assertions or
first sight of an error message.  Or providing a web interface around
the logging database of an instance where one can drill down on logged
errors/queries.

>> and it is even sadder to see a direct dependency with libpqxx :(
>
> I suspect this part is a SMOP, but I'm not quite sure what S might
> constitute in this case.

sqlsmith uses three connections in total:

 1. Connection for sending the generated queries to the DUT
 2. Connection for retrieving the schema from the DUT
 3. Logging connection

1 is trivial to change. 1+2 are intendend to be pluggable for supporting
other products.  Switching 1 to libpq is even on the todo list, as
libpqxx doesn't allow access to the SQLSTATE (There is a four year old
bug report about this by Andres).

2. Is more effort to change to libpq, as actual data is passed through
that connection.

3. Was intended to stay libpqxx-only even when testing other products.

Btw, I'm glad C++11 was no longer considered a blocker this thime :-)

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed to generate plan on lateral subqueries

2015-12-12 Thread Andreas Seltenreich
Greg Stark writes:

> There may be other errors that would be surprising for Tom or Robert. What
> I did with the string argument fuzzer was printed  counts for each sqlstate
> for the errors and watched for errors that only occurred occasionally or
> didn't make sense to me.
>
> Also, do you have any timeouts?

I currently set statement_timeout to 1s to avoid wasting time letting
postgres crunch numbers.  Less than 0.5% of the queries run into this
timeout.

> Do you have any stats on how long these queries are taking to plan?
> What's the longest query to plan you've found?

No, I'm currently not logging timing spects.  The schema I let the
instances log into is part of the repo[1].

> Do you have coverage data for the corpus?

I do have some older numbers for line coverage from before the recent
grammar extension:

| revision | overall | parser |
|--+-+|
| a4c1989  |26.0 |   20.4 |

regards,
Andreas

Footnotes: 
[1]  https://github.com/anse1/sqlsmith/blob/master/log.sql


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed to generate plan on lateral subqueries

2015-12-10 Thread Andreas Seltenreich
Tom Lane writes:

> Merlin Moncure  writes:
>> Aside from the functional issues, could your changes result in
>> performance regressions?
[...]
> It's a little bit harder to gauge the impact on planner speed.  The
> transitive closure calculation could be expensive in a query with many
> lateral references, but that doesn't seem likely to be common; and anyway
> we'll buy back some of that cost due to simpler tests later.  I'm
> optimistic that we'll come out ahead in HEAD/9.5 after the removal
> of LateralJoinInfo setup.  It might be roughly a wash in the back
> branches.

On the empirical side: I see a speedup of 0.4% in testing speed with the
patch applied.  It could very well be me venting the room one additional
time during the second session, resulting in the CPUs spending more time
in their opportunistic frequency range or something.

regards,
Andreas

smith=# select extract('day' from t),
   avg(generated/extract(epoch from updated - t)) as tps
from instance natural join stat
where generated > 100
  and hostname not in('dwagon','slugbug')
   -- these do stuff beside sqlsmith
  and t > now() - interval '3 days' group by 1 order by 1;
 date_part |   tps
---+--
 8 |  55.494181110456
 9 | 55.6902316869404


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed to generate plan on lateral subqueries

2015-12-12 Thread Andreas Seltenreich
Greg Stark writes:

> On Sat, Dec 12, 2015 at 8:30 PM, Andreas Seltenreich <seltenre...@gmx.de> 
> wrote:
> When you hit the timeout is this implemented in your fuzzer or using
> statement_timeout? If the former, can you add a statement_timeout of
> just short of the timeout in the fuzzer and find cases where the
> planner might not be calling CHECK_FOR_INTERRUPTS frequently enough?

It's the latter.  I don't think I can add a client-side timeout into
sqlsmith elegantly.  IMHO it's better to write another test tool that
just re-runs the queries that were logged with a timeout by sqlsmith and
investigates their timeout-behavior more closely.

>> I do have some older numbers for line coverage from before the recent 
>> grammar extension:
>
> If you have a corpus of queries in a simple format it would be pretty
> convenient to add them in a regression test and then run make coverage
> to get html reports.

Hmm, I thought I found a workflow that would yield sqlsmith's coverage
without integrating it into the regession tests.  This is what I did:

make install
initdb /tmp/gcov
pg_ctl -D /tmp/gcov start
make installcheck
pg_ctl -D /tmp/gcov stop
make coverage-clean
pg_ctl -D /tmp/gcov start
sqlsmith --target='dbname=regression' --max-queries=1
pg_ctl -D /tmp/gcov stop
make coverage-html

It seems to yield a pure sqlsmith-only coverage report, as a "make
coverage-html" before the "make coverage-clean" yields a report with
much higher score.  Maybe there are drawbacks to the workflow you are
suggesting?  I just re-did it with the current sqlsmith code, and it's
up by 25% compared to the latest tested revision:

| revision | overall | parser |
|--+-+|
| a4c1989  |26.0 |   20.4 |
| ee099e6  |33.8 |   25.8 |

I also put the report here, in case someone wants to look at certain
details, or make suggestions into what directions to best extend the
grammar to increase coverage.

http://ansel.ydns.eu/~andreas/coverage/
http://ansel.ydns.eu/~andreas/gcov.tar.xz

> Did you publish the source already? I haven't been following all
> along, sorry if these are all answered questions.

It's not had a proper release yet, but the code is available via github
in all its rapid-prototypesque glory:

https://github.com/anse1/sqlsmith

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade in 9.5 broken for adminpack

2016-01-03 Thread Andreas Seltenreich
Bruce Momjian writes:

> On Thu, Apr 16, 2015 at 11:29:07PM -0700, Jeff Janes wrote:
>> Of course after sending that it became obvious.  The C function is not 
>> getting
>> called because the SQL function is marked as being strict, yet is called with
>> NULL arguments.
>> 
>> Trivial patch attached to unset strict flag in pg_proc.h.
>> 
>> But  CATALOG_VERSION_NO probably needs another bump as well.
>
> Patch applied and catversion bumped.  Thanks.

Shouldn't there be some validation of arguments now that the function is
no longer marked strict?  Currently, unprivileged users can crash the
server calling binary_upgrade_create_empty_extension with null
arguments.  Found using sqlsmith.

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failing assertions in spgtextproc.c

2016-01-02 Thread Andreas Seltenreich
Tom Lane writes:

> Andreas Seltenreich <seltenre...@gmx.de> writes:
>> TRAP: FailedAssertion([...], File: "spgtextproc.c", Line: 424)
>> TRAP: FailedAssertion([...], File: "spgtextproc.c", Line: 564)
>
> Can you show us the definition of the index that's causing this,
> and some samples of the data you're putting in it?

Here's a recipe for triggering the former:

create table t(c text);
create index on t using spgist(c);
insert into t select '' from generate_series(1,1);
set enable_seqscan to off; select count(1) from t;

I still think it's just the assertions being too strict.

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Add STRICT to some regression test C functions.

2016-01-08 Thread Andreas Seltenreich
Hi,

some of the C functions in the regression test DB readily crash when
passing NULL input values.  The regression tests themselves do not pass
NULL values to them, but when the regression database is used as a basis
for fuzz testing, they cause a lot of noise.  Maybe someone can sneak
this patch in?

Thanks,
Andreas
>From 2711471d48c2e58809c2f4617d36352c5903bbd9 Mon Sep 17 00:00:00 2001
From: Andreas Seltenreich <seltenre...@gmx.de>
Date: Sun, 3 Jan 2016 19:05:06 +0100
Subject: [PATCH] Add STRICT to some regression test C functions.

These functions readily crash when passing NULL input values.  The
regression tests themselves do not pass NULL values to them, but when
the regression database is used as a basis for fuzz testing, they
cause a lot of noise.
---
 src/test/regress/input/create_function_2.source  | 10 +-
 src/test/regress/output/create_function_2.source | 10 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/test/regress/input/create_function_2.source b/src/test/regress/input/create_function_2.source
index 1b013ae..486803d 100644
--- a/src/test/regress/input/create_function_2.source
+++ b/src/test/regress/input/create_function_2.source
@@ -74,27 +74,27 @@ CREATE FUNCTION user_relns()
 CREATE FUNCTION pt_in_widget(point, widget)
RETURNS bool
AS '@libdir@/regress@DLSUFFIX@'
-   LANGUAGE C;
+   LANGUAGE C STRICT;
 
 CREATE FUNCTION overpaid(emp)
RETURNS bool
AS '@libdir@/regress@DLSUFFIX@'
-   LANGUAGE C;
+   LANGUAGE C STRICT;
 
 CREATE FUNCTION boxarea(box)
RETURNS float8
AS '@libdir@/regress@DLSUFFIX@'
-   LANGUAGE C;
+   LANGUAGE C STRICT;
 
 CREATE FUNCTION interpt_pp(path, path)
RETURNS point
AS '@libdir@/regress@DLSUFFIX@'
-   LANGUAGE C;
+   LANGUAGE C STRICT;
 
 CREATE FUNCTION reverse_name(name)
RETURNS name
AS '@libdir@/regress@DLSUFFIX@'
-   LANGUAGE C;
+   LANGUAGE C STRICT;
 
 CREATE FUNCTION oldstyle_length(int4, text)
RETURNS int4
diff --git a/src/test/regress/output/create_function_2.source b/src/test/regress/output/create_function_2.source
index 98e1c29..bdfc5be 100644
--- a/src/test/regress/output/create_function_2.source
+++ b/src/test/regress/output/create_function_2.source
@@ -58,23 +58,23 @@ CREATE FUNCTION user_relns()
 CREATE FUNCTION pt_in_widget(point, widget)
RETURNS bool
AS '@libdir@/regress@DLSUFFIX@'
-   LANGUAGE C;
+   LANGUAGE C STRICT;
 CREATE FUNCTION overpaid(emp)
RETURNS bool
AS '@libdir@/regress@DLSUFFIX@'
-   LANGUAGE C;
+   LANGUAGE C STRICT;
 CREATE FUNCTION boxarea(box)
RETURNS float8
AS '@libdir@/regress@DLSUFFIX@'
-   LANGUAGE C;
+   LANGUAGE C STRICT;
 CREATE FUNCTION interpt_pp(path, path)
RETURNS point
AS '@libdir@/regress@DLSUFFIX@'
-   LANGUAGE C;
+   LANGUAGE C STRICT;
 CREATE FUNCTION reverse_name(name)
RETURNS name
AS '@libdir@/regress@DLSUFFIX@'
-   LANGUAGE C;
+   LANGUAGE C STRICT;
 CREATE FUNCTION oldstyle_length(int4, text)
RETURNS int4
AS '@libdir@/regress@DLSUFFIX@'
-- 
2.1.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Failing assertions in spgtextproc.c

2015-12-18 Thread Andreas Seltenreich
I do see two assertions in spgtextproc.c fail on occasion when testing
with sqlsmith:

TRAP: FailedAssertion([...], File: "spgtextproc.c", Line: 424)
TRAP: FailedAssertion([...], File: "spgtextproc.c", Line: 564)

I can't reproduce it reliably but looking at the coredumps, the failing
part of the expression is always

in->level == 0 && DatumGetPointer(in->reconstructedValue) == NULL

In all of the dumps I looked at, in->reconstructedValue contains a
zero-length text instead of the asserted NULL, and the tuples fed to
leaf_consistent()/inner_consistent() look like the one below.

,
| (gdb) p *in
| $1 = {scankeys = 0x60a3ee0, nkeys = 1, reconstructedValue = 101373680, level 
= 0, 
|   returnData = 1 '\001', allTheSame = 1 '\001', hasPrefix = 0 '\000', 
prefixDatum = 0, nNodes = 8, 
|   nodeLabels = 0x37b6768}
| (gdb) x ((text *)in->reconstructedValue)->vl_len_
| 0x60ad6f0:0x0010
| (gdb) p *(text *)in->scankeys[0]->sk_argument
| $2 = {vl_len_ = "0\000\000", vl_dat = 0x855950c "sqlsmith~", '\177' , "\020 "}
| (gdb) p in->nodeLabels[0]
| $3 = 65535
`

Maybe these assertions are just too strict?  I don't see the code
misbehaving when relaxing them to

reconstrValue != NULL && VARSIZE_ANY_EXHDR(reconstrValue) == in->level
  || in->level == 0

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failing assertions in spgtextproc.c

2015-12-18 Thread Andreas Seltenreich
Peter Geoghegan writes:

> Can you do this?:
>
> (gdb) p debug_query_string

output below.  Since sqlsmith ist no longer restricted to read-only
statements, the chances for reproduction are low :-/.

select
  pg_catalog.pg_stat_get_buf_written_backend() as c0,
  subq_1.c0 as c1,
  subq_1.c0 as c2,
  subq_1.c0 as c3
from
  (select
(select ordinal_position from information_schema.parameters limit 1 
offset 12)
 as c0,
ref_2.t as c1
  from
public.radix_text_tbl as ref_2
  inner join pg_catalog.pg_stat_activity as ref_3
  on (ref_2.t = ref_3.application_name )
  where ref_2.t @@ cast(coalesce(ref_2.t, ref_3.client_hostname) as text)
  limit 111) as subq_1,
  lateral (select
subq_1.c0 as c0,
subq_2.c2 as c1,
56 as c2,
cast(coalesce(cast(coalesce((select pop from public.real_city limit 1 
offset 34)
, subq_1.c0) as integer), subq_2.c0) as integer) as c3,
74 as c4,
(select unique1 from public.onek2 limit 1 offset 17)
 as c5
  from
(select
  (select ordinal_position from information_schema.parameters limit 
1 offset 27)
 as c0,
  sample_2.umoptions as c1,
  sample_2.umserver as c2
from
  pg_catalog.pg_user_mapping as sample_2 tablesample system (6.2)
where 49 is NULL) as subq_2
  where cast(coalesce(subq_1.c0, subq_2.c0) as integer) is not NULL
  limit 105) as subq_3
where ((select x from public.tt0 limit 1 offset 12)
 <= subq_3.c0)
  and (subq_3.c4 <= 30);


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Failed assertion in postgres_fdw/deparse.c:1116

2016-06-05 Thread Andreas Seltenreich
Creating some foreign tables via postgres_fdw in the regression db of
master as of de33af8, sqlsmith triggers the following assertion:

TRAP: FailedAssertion("!(const Node*)(var))->type) == T_Var))", File: 
"deparse.c", Line: 1116)

gdb says var is holding a T_PlaceHolderVar instead.  In a build without
assertions, it leads to an error later:

ERROR:  cache lookup failed for type 0

Recipe:

--8<---cut here---start->8---
create extension postgres_fdw;
create server myself foreign data wrapper postgres_fdw;
create schema fdw_postgres;
create user mapping for public server myself options (user :'USER');
import foreign schema public from server myself into fdw_postgres;
select subq_0.c0 as c0 from
   (select 31 as c0 from fdw_postgres.a as ref_0
  where 93 >= ref_0.aa) as subq_0
   right join fdw_postgres.rtest_vview5 as ref_1
   on (subq_0.c0 = ref_1.a )
   where 92 = subq_0.c0;
--8<---cut here---end--->8---

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Don't generate parallel paths for rels with parallel-restricted

2016-06-11 Thread Andreas Seltenreich
Amit Kapila writes:

> I have moved it to CLOSE_WAIT state because we have derived our
> queries to reproduce the problem based on original report[1].  If next
> run of sqlsmith doesn't show any problem in this context then we will
> move it to resolved.

I don't have access to my testing horse power this weekend so I can
report on tuesday at the earliest.  Unless someone else feels like
running sqlsmith…

regards,
Andreas
-- 
SQLsmith error of the day: time zone "Bruce Momjian" not recognized.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] PANIC: failed to add BRIN tuple

2016-05-24 Thread Andreas Seltenreich
Alvaro Herrera writes:

> How long does it take for you to reproduce this panic in the unpatched
> code?

Very long, I'm afraid.  I only observed it once, and according to the
logging database, about 4e9 random queries were generated since testing
with 9.5 code.

I could probably speed it up by creating lots of additional BRIN indexes
in the regression database, and by compiling a sqlsmith that generates
update statements only.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ORDER/GROUP BY expression not found in targetlist

2016-05-26 Thread Andreas Seltenreich
Tom Lane writes:

> Andreas Seltenreich <seltenre...@gmx.de> writes:
>> Peter Geoghegan writes:
>>> It's surprising that SQL Smith didn't catch something with such simple
>>> steps to reproduce.
>
>> I removed distinct relatively early because it causes a large part of
>> queries to fail due to it not finding an equality operator it likes.  It
>> seems to be more picky about the equality operator than, say, joins.
>> I'm sure it has a good reason to do so?
>
> It's looking for an operator that is known to be semantically equality,
> by virtue of being the equality member of a btree or hash opclass.
> Type path has no such opclass unfortunately.

As do lots of data types in the regression db while still having an
operator providing semantic equivalence.  I was hoping for someone to
question that status quo.  Naively I'd say an equivalence flag is
missing in the catalog that is independent of opclasses.

regards
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] PANIC: failed to add BRIN tuple

2016-05-25 Thread Andreas Seltenreich
I wrote:
> Alvaro Herrera writes:
>> How long does it take for you to reproduce this panic in the unpatched
>> code?
>
> I could probably speed it up by creating lots of additional BRIN indexes
> in the regression database, and by compiling a sqlsmith that generates
> update statements only.

This actually worked.  Sqlsmith triggered the BRIN panic twice in 80e6
queries (vs. onece in 4e9 before).  I pushed the modified version on the
branch "modify-heavy".  Re-fuzzing now with your patch applied.

andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Should pg_export_snapshot() and currtid() be tagged parallel-unsafe?

2016-06-14 Thread Andreas Seltenreich
Digging through the sqlsmith logging db, I noticed the following errors:

ERROR:  cannot update SecondarySnapshot during a parallel operation
ERROR:  cannot assign XIDs during a parallel operation

Queries raising the first one always contain calls to currtid() or
currtid2().  Queries raising the second one always contain a call to
pg_export_snapshot().  Patch to tag them as unsafe attached.

regards,
Andreas
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index f33c3ff..6a65e77 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -1347,9 +1347,9 @@ DATA(insert OID = 1291 (  suppress_redundant_updates_trigger	PGNSP PGUID 12 1 0
 DESCR("trigger to suppress updates when new and old records match");
 
 DATA(insert OID = 1292 ( tideq			   PGNSP PGUID 12 1 0 0 0 f f f t t f i s 2 0 16 "27 27" _null_ _null_ _null_ _null_ _null_ tideq _null_ _null_ _null_ ));
-DATA(insert OID = 1293 ( currtid		   PGNSP PGUID 12 1 0 0 0 f f f f t f v s 2 0 27 "26 27" _null_ _null_ _null_ _null_ _null_ currtid_byreloid _null_ _null_ _null_ ));
+DATA(insert OID = 1293 ( currtid		   PGNSP PGUID 12 1 0 0 0 f f f f t f v u 2 0 27 "26 27" _null_ _null_ _null_ _null_ _null_ currtid_byreloid _null_ _null_ _null_ ));
 DESCR("latest tid of a tuple");
-DATA(insert OID = 1294 ( currtid2		   PGNSP PGUID 12 1 0 0 0 f f f f t f v s 2 0 27 "25 27" _null_ _null_ _null_ _null_ _null_ currtid_byrelname _null_ _null_ _null_ ));
+DATA(insert OID = 1294 ( currtid2		   PGNSP PGUID 12 1 0 0 0 f f f f t f v u 2 0 27 "25 27" _null_ _null_ _null_ _null_ _null_ currtid_byrelname _null_ _null_ _null_ ));
 DESCR("latest tid of a tuple");
 DATA(insert OID = 1265 ( tidne			   PGNSP PGUID 12 1 0 0 0 f f f t t f i s 2 0 16 "27 27" _null_ _null_ _null_ _null_ _null_ tidne _null_ _null_ _null_ ));
 DATA(insert OID = 2790 ( tidgt			   PGNSP PGUID 12 1 0 0 0 f f f t t f i s 2 0 16 "27 27" _null_ _null_ _null_ _null_ _null_ tidgt _null_ _null_ _null_ ));
@@ -3135,7 +3135,7 @@ DESCR("xlog filename, given an xlog location");
 DATA(insert OID = 3165 ( pg_xlog_location_diff		PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 1700 "3220 3220" _null_ _null_ _null_ _null_ _null_ pg_xlog_location_diff _null_ _null_ _null_ ));
 DESCR("difference in bytes, given two xlog locations");
 
-DATA(insert OID = 3809 ( pg_export_snapshot		PGNSP PGUID 12 1 0 0 0 f f f f t f v r 0 0 25 "" _null_ _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
+DATA(insert OID = 3809 ( pg_export_snapshot		PGNSP PGUID 12 1 0 0 0 f f f f t f v u 0 0 25 "" _null_ _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
 DESCR("export a snapshot");
 
 DATA(insert OID = 3810 (  pg_is_in_recovery		PGNSP PGUID 12 1 0 0 0 f f f f t f v s 0 0 16 "" _null_ _null_ _null_ _null_ _null_ pg_is_in_recovery _null_ _null_ _null_ ));

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Don't generate parallel paths for rels with parallel-restricted

2016-06-15 Thread Andreas Seltenreich
Amit Kapila writes:

> Right, so I have moved "Failed assertion in parallel worker
> (ExecInitSubPlan)" item to CLOSE_WAIT state as I don't think there is any
> known pending issue in that item.  I have moved it to CLOSE_WAIT state
> because we have derived our queries to reproduce the problem based on
> original report[1].  If next run of sqlsmith doesn't show any problem in
> this context then we will move it to resolved.

It ran for about 100e6 queries by now without tripping on any parallel
worker related assertions.  I tested with min_parallel_relation_size_v1.patch
applied and set to 64kB to have more exposure during testing.

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ORDER/GROUP BY expression not found in targetlist

2016-05-26 Thread Andreas Seltenreich
Peter Geoghegan writes:

> On Wed, May 25, 2016 at 7:12 PM, Andres Freund  wrote:
>>
>> =# CREATE TABLE twocol(col01 int, col02 int);
>> =# SELECT DISTINCT col01, col02, col01 FROM twocol ;
>> ERROR:  XX000: ORDER/GROUP BY expression not found in targetlist
>> LOCATION:  get_sortgroupref_tle, tlist.c:341
>>
>> which appears to be a 9.6 regression, presumable fallout from the path
>> restructuring.
>
> It's surprising that SQL Smith didn't catch something with such simple
> steps to reproduce.

I removed distinct relatively early because it causes a large part of
queries to fail due to it not finding an equality operator it likes.  It
seems to be more picky about the equality operator than, say, joins.
I'm sure it has a good reason to do so?

regression=> select distinct f1 from path_tbl;
ERROR:  could not identify an equality operator for type path
LINE 1: select distinct f1 from path_tbl;

regression=> \do =
-[ RECORD 38 ]-+
Schema | pg_catalog
Name   | =
Left arg type  | path
Right arg type | path
Result type| boolean
Description| equal



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] PANIC: failed to add BRIN tuple

2016-05-29 Thread Andreas Seltenreich
Alvaro Herrera writes:

> If you can re-run sqlsmith and see if you can find different bugs, I'd
> appreciate it.
[...]
> [2. text/x-diff; brincrash-2.patch]

BRIN is inconspicuous since applying this patch.  All coredumps I see
now are either due to the parallel worker shutdown issue or acl.c's
text/name confusion, both reported earlier.

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] OOM crash in plpgsql_extra_checks_check_hook

2016-06-20 Thread Andreas Seltenreich
Just had a parallel worker of a memory-starved instance of sqlsmith
crash.  plpgsql_extra_checks_check_hook forgot to check the result of
its malloc call here:

Core was generated by `postgres: bgworker: parallel worker for PID 5905 
   '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  plpgsql_extra_checks_check_hook (newvalue=, 
extra=0x7fff7fe31a58, source=) at pl_handler.c:113
113 *myextra = extrachecks;
(gdb) bt
#0  plpgsql_extra_checks_check_hook (newvalue=, 
extra=0x7fff7fe31a58, source=) at pl_handler.c:113
#1  0x0080173f in call_string_check_hook (newval=0x7fff7fe31a50, 
extra=, source=, elevel=15, conf=, 
conf=) at guc.c:9779
#2  0x008029b8 in InitializeOneGUCOption (gconf=0x4) at guc.c:4546
#3  0x00804dbc in define_custom_variable (variable=0x2cb6ef0) at 
guc.c:7466
#4  0x00805862 in DefineCustomStringVariable 
(name=name@entry=0x7f803cbfe011 "plpgsql.extra_warnings", 
short_desc=short_desc@entry=0x7f803cbfe1f8 "List of programming constructs that 
should produce a warning.", long_desc=long_desc@entry=0x0, 
valueAddr=valueAddr@entry=0x7f803ce070d8 , 
bootValue=bootValue@entry=0x7f803cbfdf78 "none", 
context=context@entry=PGC_USERSET, flags=1, check_hook=0x7f803cbe9700 
, assign_hook=0x7f803cbe96e0 
, show_hook=0x0) at guc.c:7733
#5  0x7f803cbe99ea in _PG_init () at pl_handler.c:173
#6  0x007f1bcb in internal_load_library 
(libname=libname@entry=0x7f8040cee14d ) at dfmgr.c:276
#7  0x007f2738 in RestoreLibraryState (start_address=0x7f8040cee14d 
) at dfmgr.c:741
#8  0x004e61c0 in ParallelWorkerMain (main_arg=) at 
parallel.c:985
#9  0x00684072 in StartBackgroundWorker () at bgworker.c:726
#10 0x0068f142 in do_start_bgworker (rw=0x2cb5230) at postmaster.c:5535
#11 maybe_start_bgworker () at postmaster.c:5709
#12 0x0068fb96 in sigusr1_handler (postgres_signal_arg=) 
at postmaster.c:4971
#13 
#14 0x7f8040091ac3 in __select_nocancel () at 
../sysdeps/unix/syscall-template.S:81
#15 0x0046c31f in ServerLoop () at postmaster.c:1657
#16 0x00690fc7 in PostmasterMain (argc=argc@entry=4, 
argv=argv@entry=0x2c8c620) at postmaster.c:1301
#17 0x0046d96d in main (argc=4, argv=0x2c8c620) at main.c:228


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Improve spinlock inline assembly for x86.

2016-01-17 Thread Andreas Seltenreich
Hi,

I'm currently experimenting with just-in-time compilation using libfirm.
While discussing issues with its developers, it was pointed out to me
that our spinlock inline assembly is less than optimal.  Attached is a
patch that addresses this.

,
| Remove the LOCK prefix from the XCHG instruction.  Locking is implicit
| with XCHG and the prefix wastes a byte.  Also remove the "cc" register
| from the clobber list as the XCHG instruction does not modify any flags.
| 
| Reported by Christoph Mallon.
`

regards,
Andreas

>From c836b4f3e0b60d070481d4061e6fe0ffbe488495 Mon Sep 17 00:00:00 2001
From: Andreas Seltenreich <seltenre...@gmx.de>
Date: Sun, 17 Jan 2016 11:51:53 +0100
Subject: [PATCH] Improve spinlock inline assembly for x86.

Remove the LOCK prefix from the XCHG instruction.  Locking is implicit
with XCHG and the prefix wastes a byte.  Also remove the "cc" register
from the clobber list as the xchg instruction does not modify any
flags.

Reported by Christoph Mallon.
---
 src/include/storage/s_lock.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
index 8b240cd..933bb76 100644
--- a/src/include/storage/s_lock.h
+++ b/src/include/storage/s_lock.h
@@ -158,7 +158,6 @@ tas(volatile slock_t *lock)
 	__asm__ __volatile__(
 		"	cmpb	$0,%1	\n"
 		"	jne		1f		\n"
-		"	lock			\n"
 		"	xchgb	%0,%1	\n"
 		"1: \n"
 :		"+q"(_res), "+m"(*lock)
@@ -226,11 +225,10 @@ tas(volatile slock_t *lock)
 	register slock_t _res = 1;
 
 	__asm__ __volatile__(
-		"	lock			\n"
 		"	xchgb	%0,%1	\n"
 :		"+q"(_res), "+m"(*lock)
 :		/* no inputs */
-:		"memory", "cc");
+:		"memory");
 	return (int) _res;
 }
 
-- 
2.1.4


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Failed assertion in BecomeLockGroupLeader

2016-04-29 Thread Andreas Seltenreich
Hi,

tonight's sqlsmith run yielded another core dump:

TRAP: FailedAssertion("!(MyProc->lockGroupLeader == ((void *)0))", File: 
"proc.c", Line: 1787)

I couldn't identifiy a query for it though: debug_query_string is empty.
Additionally, the offending query was not reported in the error context
as it typically is for non-parallel executor crashes.

regards,
Andreas

GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
Core was generated by `postgres: bgworker: parallel worker for PID 4706 
'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7ff1bda16067 in __GI_raise (sig=sig@entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56  ../nptl/sysdeps/unix/sysv/linux/raise.c: Datei oder Verzeichnis nicht 
gefunden.
#0  0x7ff1bda16067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7ff1bda17448 in __GI_abort () at abort.c:89
#2  0x007eaa11 in ExceptionalCondition 
(conditionName=conditionName@entry=0x988318 "!(MyProc->lockGroupLeader == 
((void *)0))", errorType=errorType@entry=0x82a45d "FailedAssertion", 
fileName=fileName@entry=0x8760e5 "proc.c", lineNumber=lineNumber@entry=1787) at 
assert.c:54
#3  0x006e3e7b in BecomeLockGroupLeader () at proc.c:1787
#4  0x004e6a59 in LaunchParallelWorkers (pcxt=pcxt@entry=0x1db05c8) at 
parallel.c:437
#5  0x005ef2d7 in ExecGather (node=node@entry=0x1d9d0b8) at 
nodeGather.c:168
#6  0x005dd788 in ExecProcNode (node=node@entry=0x1d9d0b8) at 
execProcnode.c:515
#7  0x005d999f in ExecutePlan (dest=0x1d7d310, direction=, numberTuples=0, sendTuples=, operation=CMD_SELECT, 
use_parallel_mode=, planstate=0x1d9d0b8, estate=0x1d9c858) at 
execMain.c:1567
#8  standard_ExecutorRun (queryDesc=0x1db0080, direction=, 
count=0) at execMain.c:338
#9  0x005dcb3f in ParallelQueryMain (seg=, 
toc=0x7ff1be507000) at execParallel.c:716
#10 0x004e608b in ParallelWorkerMain (main_arg=) at 
parallel.c:1033
#11 0x00683a42 in StartBackgroundWorker () at bgworker.c:726
#12 0x0068eb82 in do_start_bgworker (rw=0x1d24ec0) at postmaster.c:5531
#13 maybe_start_bgworker () at postmaster.c:5706
#14 0x0046c993 in ServerLoop () at postmaster.c:1762
#15 0x006909fe in PostmasterMain (argc=argc@entry=3, 
argv=argv@entry=0x1cfa560) at postmaster.c:1298
#16 0x0046d5ed in main (argc=3, argv=0x1cfa560) at main.c:228
(gdb) bt full
#3  0x006e3e7b in BecomeLockGroupLeader () at proc.c:1787
leader_lwlock = 
#4  0x004e6a59 in LaunchParallelWorkers (pcxt=pcxt@entry=0x1db05c8) at 
parallel.c:437
oldcontext = 0x1d9ced0
worker = {
  bgw_name = 
"\220\a\333\001\000\000\000\000\370\340L\276\361\177\000\000\370\373\025\264\361\177\000\000P\a\333\001\000\000\000\000X\310\331\001\000\000\000\000\004\000\000\000\000\000\000\000\310\005\333\001\000\000\000\000\233\222J\000\000\000\000",
 
  bgw_flags = 1, 
  bgw_start_time = BgWorkerStart_PostmasterStart, 
  bgw_restart_time = 1, 
  bgw_main = 0x0, 
  bgw_library_name = 
"\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000u\222J\000\000\000\000\000\350\336\331\001\000\000\000\000\360\260a\000\000\000\000\000\200\315]\000\000\000\000\000\300\330\367\217\375\177\000\000h\205\333\001\000\000\000",
 
  bgw_function_name = 
"`\346\327\001\000\000\000\000\004\000\000\000\000\000\000\000\360\260a\000\000\000\000\000\200\315]\000\000\000\000\000\300\330\367\217\375\177\000\000h\317\331\001\000\000\000\000\210\325\327\001\000\000\000\000\004\000\000\000\000\000\000",
 
  bgw_main_arg = 6402288, 
  bgw_extra = 
"X\310\331\001\000\000\000\000\000\000\000\000\000\000\000\000\060\233\333\001\000\000\000\000\001\000\000\000\000\000\000\000\004\000\000\000\000\000\000\000l\321]\000\000\000\000\000\000\000\000\000\000\000\000\000H-\334\001\000\000\000\000h\317\331\001\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\274\341M\276\361\177\000\000\310\005\333\001\000\000\000\000\004\000\000\000\000\000\000\000\310\005\333\001\000\000\000\000\000\000\000\000\000\000\000",
 
  bgw_notify_pid = 4
}
i = 
any_registrations_failed = 0 '\000'
#5  0x005ef2d7 in ExecGather (node=node@entry=0x1d9d0b8) at 
nodeGather.c:168
pcxt = 0x1db05c8
estate = 
gather = 0x1d7d440
fslot = 0x1d9ced0
i = 
resultSlot = 
isDone = ExprSingleResult
econtext = 
#6  0x005dd788 in ExecProcNode (node=node@entry=0x1d9d0b8) at 
execProcnode.c:515
result = 
__func__ = "ExecProcNode"
#7  0x005d999f in ExecutePlan (dest=0x1d7d310, direction=, numberTuples=0, sendTuples=, operation=CMD_SELECT, 
use_parallel_mode=, planstate=0x1d9d0b8, estate=0x1d9c858) at 
execMain.c:1567
slot = 
current_tuple_count = 0
#8  standard_ExecutorRun (queryDesc=0x1db0080, direction=, 

Re: [HACKERS] [sqlsmith] Failed assertion in BecomeLockGroupLeader

2016-04-29 Thread Andreas Seltenreich
Alvaro Herrera writes:
> Amit Kapila wrote:
>> It will be helpful if you can find the offending query or plan
>> corresponding to it?
>
> So I suppose the PID of the process starting the workers should be in
> the stack somewhere.

Ja, it's right on the top, but long gone by now…

> With that one should be able to attach to that process and get another
> stack trace.  I'm curious on whether you would need to have started
> the server with "postgres -T"

This sounds like it should work to capture more context when the
Assertion fails the next time.  I have to purge the catalogs a bit
though to avoid stopping early on boring core dumps.  Most of them are
currently caused by acl.c using text for syscache lookups and triggering
an NAMEDATALEN assertion.

E.g.: select 
has_language_privilege('smithsmithsmithsmithsmithsmithsmithsmithsmithsmithsmithsmithsmith',
 smith');

thanks,
andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in BecomeLockGroupLeader

2016-04-29 Thread Andreas Seltenreich
Simon Riggs writes:

> It's good that the input is fuzzed, but there needs to be a way to re-run
> identical fuzzing or a way to backtrack to find what broke. Not much point
> finding bugs we can't identify later.

sqlsmith is deterministic and allows re-generating a sequence of random
queries with the --seed argument.  Finding a testing methodology that
ensures a repeatable server-side is a harder problem though.

One would have to avoid touching any kind of concurrency, disable
autovacuum, autoanalyze and invoke explicit analyzes/vacuums in concert
with query generation.  Further, one would have to avoid any kind of
concurrency while testing.  Even then, 1% of the queries run into a
statement_timeout due to randomly generated excessive cross joins.  If a
timeout just barely happens, it might not do so on the repeated run and
the deterministic state is gone from then on.  I'm afraid this list is
not complete yet.

I didn't think the effort of creating this kind of clean-room testing
was worth it.  If reports of failed assertions with backtrace without a
recipe to reproduce them are a nuisance, I'll avoid them in the future.

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in BecomeLockGroupLeader

2016-04-30 Thread Andreas Seltenreich
Amit Kapila writes:

> On Fri, Apr 29, 2016 at 7:15 PM, Tom Lane  wrote:
>> but it might be worth copying over the full query from the parent
>> side.
>
> That would amount to couple of extra cycles considering we need to do it
> for each worker, but OTOH it might be a useful debugging information in the
> cases as reported in this thread.

Maybe only do it in assertion-enabled builds when performance is an
issue?

regards,
andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in BecomeLockGroupLeader

2016-04-30 Thread Andreas Seltenreich
Amit Kapila writes:

> On Sat, Apr 30, 2016 at 5:58 AM, Andreas Seltenreich <seltenre...@gmx.de> 
> wrote:
>> This sounds like it should work to capture more context when the
>> Assertion fails the next time.
>
> Sounds good.  So can we assume that you will try to get us the new report
> with more information?

Ja.  I do have a busy weekend+week ahead though, so no holding of
breath.

regards
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Crash in apply_projection_to_path

2016-04-28 Thread Andreas Seltenreich
Hi,

the following query against the regression database crashes master as of
23b09e15.

select 1 from depth0 inner join depth1 on (depth0.c = depth1.c)
 where depth0.c @@ depth1.c limit 1;

regards,
Andreas

Program terminated with signal SIGSEGV, Segmentation fault.
#0  create_projection_path (root=root@entry=0x6918e60,
rel=0x7f7f7f7f7f7f7f7f, subpath=0x69d6de8, target=target@entry=0x69d7428)
at pathnode.c:2160
2160pathnode->path.parallel_safe = rel->consider_parallel &&
#0  create_projection_path (root=root@entry=0x6918e60, rel=0x7f7f7f7f7f7f7f7f, 
subpath=0x69d6de8, target=target@entry=0x69d7428) at pathnode.c:2160
#1  0x0067841e in apply_projection_to_path (root=0x6918e60, 
rel=0x69d5e18, path=0x69d6ee0, target=0x69d7428) at pathnode.c:2251
#2  0x0065ff20 in grouping_planner (root=0x22e1850, 
root@entry=0x6918e60, inheritance_update=56 '8', inheritance_update@entry=0 
'\000', tuple_fraction=2615.2579411764709, tuple_fraction@entry=0) at 
planner.c:1737
#3  0x00661f44 in subquery_planner (glob=glob@entry=0x6918dc8, 
parse=parse@entry=0x235e3b0, parent_root=parent_root@entry=0x0, 
hasRecursion=hasRecursion@entry=0 '\000', 
tuple_fraction=tuple_fraction@entry=0) at planner.c:758
#4  0x00662e2b in standard_planner (parse=0x235e3b0, cursorOptions=256, 
boundParams=0x0) at planner.c:307
#5  0x006f40ed in pg_plan_query (querytree=0x235e3b0, 
cursorOptions=256, boundParams=0x0) at postgres.c:798
#6  0x006f41e4 in pg_plan_queries (querytrees=, 
cursorOptions=256, boundParams=0x0) at postgres.c:857
#7  0x006f5c93 in exec_simple_query (query_string=) at 
postgres.c:1022


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertions on parallel worker shutdown

2016-05-22 Thread Andreas Seltenreich
I wrote:

> There's another class of parallel worker core dumps when testing master
> with sqlsmith.  In these cases, the following assertion fails for all
> workers simulataneously:
>
> TRAP: FailedAssertion("!(mqh->mqh_partial_bytes <= nbytes)", File: 
> "shm_mq.c", Line: 386)

I no longer observe these after applying these two patches by Amit
Kapila:

avoid_restricted_clause_below_gather_v1.patch
Message-ID: 

Re: [HACKERS] [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

2016-05-22 Thread Andreas Seltenreich
Amit Kapila writes:

> avoid_restricted_clause_below_gather_v1.patch
> prohibit_parallel_clause_below_rel_v1.patch

I didn't observe any parallel worker related coredumps since applying
these.  The same amount of testing done before applying them yielded
about a dozend.

Dilip Kumar writes:

> So now its clear that because of sub query pullup, we may get expression in
> targetlist while creating single table path list. So we need to avoid
> parallel plan if it contains expression.

This sounds like a rather heavy restriction though…

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertions on parallel worker shutdown

2016-05-23 Thread Andreas Seltenreich
I wrote:
>> There's another class of parallel worker core dumps when testing master
>> with sqlsmith.  In these cases, the following assertion fails for all
>> workers simulataneously:
>>
>> TRAP: FailedAssertion("!(mqh->mqh_partial_bytes <= nbytes)", File: 
>> "shm_mq.c", Line: 386)
>
> I no longer observe these after applying these two patches by Amit
> Kapila

I spoke too soon: These still occur with the patches applied, but with
much lower probability. (one core dump per 20e6 random queries instead
of 1e6).

Most of the collected plans look inconspicuous to me now, except for one
that again had a subplan below a gather node (plan6).  Tarball of all
collected plans attached.

regards,
Andreas



plans.tar.gz
Description: application/gzip

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertions on parallel worker shutdown

2016-05-23 Thread Andreas Seltenreich
Amit Kapila writes:

> Earlier problems were due to the reason that some unsafe/restricted
> expressions were pushed below Gather node as part of target list whereas in
> the plan6, it seems some unsafe node is pushed below Gather node. It will
> be helpful if you can share the offending query?

plan6 corresponds to this query:

select
pg_catalog.anyarray_out(
cast((select most_common_vals from pg_catalog.pg_stats limit 1 offset 41)
 as anyarray)) as c0
 from
public.quad_point_tbl as ref_0 where ref_0.p ~= ref_0.p;

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertions on parallel worker shutdown

2016-05-24 Thread Andreas Seltenreich
Amit Kapila writes:

> On Mon, May 23, 2016 at 4:48 PM, Andreas Seltenreich <seltenre...@gmx.de>
> wrote:
>> plan6 corresponds to this query:
>>
> Are you sure that the core dumps you are seeing are due to plan6?

Each of the plans sent was harvested from a controlling process when the
above assertion failed in its workers.  I do not know whether the plans
themselves really are at fault, as most of the collected plans look ok
to me.  The backtrace in the controlling process always look like the
one reported. (Except when the coredumping took so long as to trigger a
statement_timeout in the still-running master. There are no
plans/queries available in this case, as the the state is no longer
available in an aborted transaction.)

> I have tried to generate a parallel plan for above query and it seems to me 
> that
> after applying the patches (avoid_restricted_clause_below_gather_v1.patch
> and prohibit_parallel_clause_below_rel_v1.patch), the plan it generates
> doesn't have subplan below gather node [1].

> Without patch avoid_restricted_clause_below_gather_v1.patch, it will allow to 
> push
> subplan below gather node, so I think either there is some other plan
> (query) due to which you are seeing core dumps or the above two patches
> haven't been applied before testing.

According to my notes, the patches were applied in the instance that
crashed.  The fact that I do not see the other variants of the crashes
the patches fix anymore, and the probability for this failed assertion
per random query is reduced by about a factor of 20 in contrast to
testing with the patches not applied, I'm pretty certain that this is
not a bookkeeping error on my part.

> Is it possible that core dump is due to plan2 or some other similar
> plan (I am not sure at this stage about the cause of the problem you
> are seeing, but if due to some reason PARAM_EXEC params are pushed
> below gather, then such a plan might not work)?  If you think plan
> other than plan6 can cause such a problem, then can you share the
> query for plan2?

Each of the sent plans was collected when a worker dumped core due to
the failed assertion.  More core dumps than plans were actually
observed, since with this failed assertion, multiple workers usually
trip on and dump core simultaneously.

The following query corresponds to plan2:

--8<---cut here---start->8---
select
  pg_catalog.pg_stat_get_bgwriter_requested_checkpoints() as c0,
  subq_0.c3 as c1, subq_0.c1 as c2, 31 as c3, 18 as c4,
  (select unique1 from public.bprime limit 1 offset 9) as c5,
  subq_0.c2 as c6
from
(select ref_0.tablename as c0, ref_0.inherited as c1,
ref_0.histogram_bounds as c2, 100 as c3
  from
pg_catalog.pg_stats as ref_0
  where 49 is not NULL limit 55) as subq_0
where true
limit 58;
--8<---cut here---end--->8---

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Failed assertions on parallel worker shutdown

2016-05-22 Thread Andreas Seltenreich
There's another class of parallel worker core dumps when testing master
with sqlsmith.  In these cases, the following assertion fails for all
workers simulataneously:

TRAP: FailedAssertion("!(mqh->mqh_partial_bytes <= nbytes)", File: "shm_mq.c", 
Line: 386)

The backtraces of the controlling process is always in
ExecShutdownGatherWorkers.  The queries always work fine on re-running,
so I guess there is some race condition on worker shutdown?  Backtraces
below.

regards
andreas

Core was generated by `postgres: bgworker: parallel worker for PID 30525   
'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7f5a3df91067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
56  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x7f5a3df91067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f5a3df92448 in __GI_abort () at abort.c:89
#2  0x007eabe1 in ExceptionalCondition 
(conditionName=conditionName@entry=0x984e10 "!(mqh->mqh_partial_bytes <= 
nbytes)", errorType=errorType@entry=0x82a75d "FailedAssertion", 
fileName=fileName@entry=0x984b8c "shm_mq.c", lineNumber=lineNumber@entry=386) 
at assert.c:54
#3  0x006d8042 in shm_mq_sendv (mqh=0x25f17b8, 
iov=iov@entry=0x7ffc6352af00, iovcnt=iovcnt@entry=1, nowait=) at 
shm_mq.c:386
#4  0x006d807d in shm_mq_send (mqh=, nbytes=, data=, nowait=) at shm_mq.c:327
#5  0x005d96b9 in ExecutePlan (dest=0x25f1850, direction=, numberTuples=0, sendTuples=, operation=CMD_SELECT, 
use_parallel_mode=, planstate=0x2612da8, estate=0x2612658) at 
execMain.c:1596
#6  standard_ExecutorRun (queryDesc=0x261a660, direction=, 
count=0) at execMain.c:338
#7  0x005dc7cf in ParallelQueryMain (seg=, 
toc=0x7f5a3ea6c000) at execParallel.c:735
#8  0x004e617b in ParallelWorkerMain (main_arg=) at 
parallel.c:1035
#9  0x00683862 in StartBackgroundWorker () at bgworker.c:726
#10 0x0068e9a2 in do_start_bgworker (rw=0x2590760) at postmaster.c:5531
#11 maybe_start_bgworker () at postmaster.c:5706
#12 0x0046cbba in ServerLoop () at postmaster.c:1762
#13 0x0069081e in PostmasterMain (argc=argc@entry=4, 
argv=argv@entry=0x256d580) at postmaster.c:1298
#14 0x0046d80d in main (argc=4, argv=0x256d580) at main.c:228
(gdb) attach 30525
0x7f5a3e044e33 in __epoll_wait_nocancel () at 
../sysdeps/unix/syscall-template.S:81
81  ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0  0x7f5a3e044e33 in __epoll_wait_nocancel () at 
../sysdeps/unix/syscall-template.S:81
#1  0x006d1b4e in WaitEventSetWaitBlock (nevents=1, 
occurred_events=0x7ffc6352aec0, cur_timeout=-1, set=0x44251c0) at latch.c:981
#2  WaitEventSetWait (set=set@entry=0x44251c0, timeout=timeout@entry=-1, 
occurred_events=occurred_events@entry=0x7ffc6352aec0, nevents=nevents@entry=1) 
at latch.c:935
#3  0x006d1f96 in WaitLatchOrSocket (latch=0x7f5a3d898494, 
wakeEvents=wakeEvents@entry=1, sock=sock@entry=-1, timeout=timeout@entry=-1) at 
latch.c:347
#4  0x006d205d in WaitLatch (latch=, 
wakeEvents=wakeEvents@entry=1, timeout=timeout@entry=-1) at latch.c:302
#5  0x004e6d64 in WaitForParallelWorkersToFinish (pcxt=0x442d4e8) at 
parallel.c:537
#6  0x005dcf84 in ExecParallelFinish (pei=0x441cab8) at 
execParallel.c:541
#7  0x005eeead in ExecShutdownGatherWorkers (node=node@entry=0x3e3a070) 
at nodeGather.c:416
#8  0x005ef389 in ExecShutdownGather (node=0x3e3a070) at 
nodeGather.c:430
#9  0x005dd03d in ExecShutdownNode (node=0x3e3a070) at 
execProcnode.c:807
#10 0x0061ad73 in planstate_tree_walker (planstate=0x3e361a8, 
walker=0x5dd010 , context=0x0) at nodeFuncs.c:3442
#11 0x0061ad73 in planstate_tree_walker (planstate=0xf323c30, 
walker=0x5dd010 , context=0x0) at nodeFuncs.c:3442
#12 0x0061ad73 in planstate_tree_walker (planstate=0xf323960, 
walker=0x5dd010 , context=0x0) at nodeFuncs.c:3442
#13 0x005d96da in ExecutePlan (dest=0xb826868, direction=, numberTuples=0, sendTuples=, operation=CMD_SELECT, 
use_parallel_mode=, planstate=0xf323960, estate=0xf322b28) at 
execMain.c:1576
#14 standard_ExecutorRun (queryDesc=0xddca888, direction=, 
count=0) at execMain.c:338
#15 0x006f6e88 in PortalRunSelect (portal=portal@entry=0x258ccc8, 
forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, 
dest=dest@entry=0xb826868) at pquery.c:946
#16 0x006f83ae in PortalRun (portal=0x258ccc8, 
count=9223372036854775807, isTopLevel=, dest=0xb826868, 
altdest=0xb826868, completionTag=0x7ffc6352b3d0 "") at pquery.c:787
#17 0x006f5c63 in exec_simple_query (query_string=) at 
postgres.c:1094
#18 PostgresMain (argc=39374024, argv=0x25ed130, dbname=0x256e480 "regression", 
username=0x25ed308 "0\321^\002") at postgres.c:4059
#19 0x0046c8b2 in BackendRun (port=0x25935d0) at postmaster.c:4258
#20 

[HACKERS] [sqlsmith] PANIC: failed to add BRIN tuple

2016-05-22 Thread Andreas Seltenreich
There was one instance of this PANIC when testing with the regression db
of master at 50e5315.

,
| WARNING:  specified item offset is too large
| PANIC:  failed to add BRIN tuple
| server closed the connection unexpectedly
`

It is reproducible with the query below on this instance only.  I've put
the data directory (20MB) here:

http://ansel.ydns.eu/~andreas/brincrash.tar.xz

The instance was running on Debian Jessie amd64.  Query and Backtrace
below.

regards,
Andreas

--8<---cut here---start->8---
update public.brintest set byteacol = null, charcol =
public.brintest.charcol, int2col = null, int4col =
public.brintest.int4col, textcol = public.brintest.textcol, oidcol =
cast(coalesce(cast(coalesce(null, public.brintest.oidcol) as oid),
pg_catalog.pg_my_temp_schema()) as oid), tidcol =
public.brintest.tidcol, float8col = public.brintest.float8col,
macaddrcol = null, cidrcol = public.brintest.cidrcol, datecol =
public.brintest.datecol, timecol = public.brintest.timecol,
timestamptzcol = pg_catalog.clock_timestamp(), intervalcol =
public.brintest.intervalcol, timetzcol = public.brintest.timetzcol,
bitcol = public.brintest.bitcol, varbitcol =
public.brintest.varbitcol, uuidcol = null returning
public.brintest.byteacol as c0;
--8<---cut here---end--->8---

Core was generated by `postgres: smith regression [local] UPDATE
   '.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fd2cda67067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
56  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x7fd2cda67067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7fd2cda68448 in __GI_abort () at abort.c:89
#2  0x007ec969 in errfinish (dummy=dummy@entry=0) at elog.c:557
#3  0x007f011c in elog_finish (elevel=elevel@entry=20, 
fmt=fmt@entry=0x82ca8f "failed to add BRIN tuple") at elog.c:1378
#4  0x00470618 in brin_doupdate (idxrel=0x101f4c0, pagesPerRange=1, 
revmap=0x10d20e50, heapBlk=8, oldbuf=2878, oldoff=9, origtup=0x10d864a8, 
origsz=6144, newtup=0x5328a88, newsz=6144, samepage=1 '\001') at 
brin_pageops.c:184
#5  0x0046e5bb in brininsert (idxRel=0x101f4c0, values=0x211b, 
nulls=0x6 , 
heaptid=0x, heapRel=0x7fd2ce6fd700, 
checkUnique=UNIQUE_CHECK_NO) at brin.c:244
#6  0x005d887f in ExecInsertIndexTuples (slot=0xe92a560, 
tupleid=0x10d21084, estate=0x9ed8a68, noDupErr=0 '\000', specConflict=0x0, 
arbiterIndexes=0x0) at execIndexing.c:383
#7  0x005f74d5 in ExecUpdate (tupleid=0x7ffe11ea74a0, oldtuple=0x211b, 
slot=0xe92a560, planSlot=0x, epqstate=0x7fd2ce6fd700, 
estate=0x9ed8a68, canSetTag=1 '\001') at nodeModifyTable.c:1015
#8  0x005f7b6c in ExecModifyTable (node=0x9ed8d28) at 
nodeModifyTable.c:1501
#9  0x005dd5d8 in ExecProcNode (node=node@entry=0x9ed8d28) at 
execProcnode.c:396
#10 0x005d962f in ExecutePlan (dest=0xde86040, direction=, numberTuples=0, sendTuples=, operation=CMD_UPDATE, 
use_parallel_mode=, planstate=0x9ed8d28, estate=0x9ed8a68) at 
execMain.c:1567
#11 standard_ExecutorRun (queryDesc=0xde860d8, direction=, 
count=0) at execMain.c:338
#12 0x006f74c9 in ProcessQuery (plan=, 
sourceText=0xd74e88 "update public.brintest[...]", params=0x0, dest=0xde86040, 
completionTag=0x7ffe11ea7670 "") at pquery.c:185
#13 0x006f775f in PortalRunMulti (portal=portal@entry=0xde8abf0, 
isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0xde86040, 
altdest=0xc96680 , 
completionTag=completionTag@entry=0x7ffe11ea7670 "") at pquery.c:1267
#14 0x006f7a0c in FillPortalStore (portal=portal@entry=0xde8abf0, 
isTopLevel=isTopLevel@entry=1 '\001') at pquery.c:1044
#15 0x006f845d in PortalRun (portal=0xde8abf0, 
count=9223372036854775807, isTopLevel=, dest=0x9ee76b8, 
altdest=0x9ee76b8, completionTag=0x7ffe11ea7a20 "") at pquery.c:782
#16 0x006f5c63 in exec_simple_query (query_string=) at 
postgres.c:1094
#17 PostgresMain (argc=233352176, argv=0xe8ad358, dbname=0xcf7508 "regression", 
username=0xe8ad3b0 "Xӊ\016") at postgres.c:4059
#18 0x0046c8b2 in BackendRun (port=0xd1c580) at postmaster.c:4258
#19 BackendStartup (port=0xd1c580) at postmaster.c:3932
#20 ServerLoop () at postmaster.c:1690
#21 0x0069081e in PostmasterMain (argc=argc@entry=4, 
argv=argv@entry=0xcf64f0) at postmaster.c:1298
#22 0x0046d80d in main (argc=4, argv=0xcf64f0) at main.c:228


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Just-in-time compiling things (was: asynchronous and vectorized execution)

2016-05-14 Thread Andreas Seltenreich
Konstantin Knizhnik writes:

> Latest information from ISP RAS guys: them have made good progress
> since February: them have rewritten most of methods of Scan, Aggregate
> and Join to LLVM API.

Is their work available somewhere?  I'm experimenting in that area as
well, although I'm using libFirm instead of LLVM.  I wonder what their
motivation to rewrite backend code in LLVM IR was, since I am following
the approach of keeping the IR around when compiling the vanilla
postgres C code, possibly inlining it during JIT and then doing
optimizations on this IR.  That way the logic doesn't have to be
duplicated.

regrads
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in BecomeLockGroupLeader

2016-05-05 Thread Andreas Seltenreich
> Amit Kapila writes:
>> Sounds good.  So can we assume that you will try to get us the new report
>> with more information?

I don't see these crashes anymore in c1543a8.  By the amount of fuzzing
done it should have happened a dozen times, so it's highly likely
something in 23b09e15..c1543a8 fixed it.

regards,
andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Failed assertion in parallel worker (ExecInitSubPlan)

2016-05-05 Thread Andreas Seltenreich
Hi,

when fuzz testing master as of c1543a8, parallel workers trigger the
following assertion in ExecInitSubPlan every couple hours.

TRAP: FailedAssertion("!(list != ((List *) ((void *)0)))", File: "list.c", 
Line: 390)

Sample backtraces of a worker and leader below, plan of leader attached.
The collected queries don't seem to reproduce it.  Curiously, running
explain on them on the failed instance after crash recovery never shows
any gather nodes…

regards,
andreas

Core was generated by `postgres: bgworker: parallel worker for PID 28062
   '.
Program terminated with signal SIGABRT, Aborted.
(gdb) bt
#3  0x0061bad2 in list_nth_cell (list=0x0, n=) at 
list.c:390
#4  0x0061bb26 in list_nth (list=, n=) at 
list.c:413
#5  0x005fe566 in ExecInitSubPlan (subplan=subplan@entry=0x1522a08, 
parent=parent@entry=0x1538188) at nodeSubplan.c:705
#6  0x005e3b54 in ExecInitExpr (node=0x1522a08, 
parent=parent@entry=0x1538188) at execQual.c:4724
#7  0x005e415c in ExecInitExpr (node=, 
parent=parent@entry=0x1538188) at execQual.c:5164
#8  0x005ff3fc in ExecInitAlternativeSubPlan 
(asplan=asplan@entry=0x1522060, parent=parent@entry=0x1538188) at 
nodeSubplan.c:1185
#9  0x005e35c4 in ExecInitExpr (node=0x1522060, 
parent=parent@entry=0x1538188) at execQual.c:4740
#10 0x005e3f8a in ExecInitExpr (node=0x1522978, 
parent=parent@entry=0x1538188) at execQual.c:4845
#11 0x005e415c in ExecInitExpr (node=, 
parent=parent@entry=0x1538188) at execQual.c:5164
#12 0x005e3687 in ExecInitExpr (node=0x1522920, 
parent=parent@entry=0x1538188) at execQual.c:4648
#13 0x005e330f in ExecInitExpr (node=0x15228c8, 
parent=parent@entry=0x1538188) at execQual.c:5132
#14 0x005e326f in ExecInitExpr (node=0x1522870, 
parent=parent@entry=0x1538188) at execQual.c:5152
#15 0x005e415c in ExecInitExpr (node=, 
parent=parent@entry=0x1538188) at execQual.c:5164
#16 0x005fbb62 in ExecInitSeqScan (node=0x1522728, estate=0x15379b8, 
eflags=16) at nodeSeqscan.c:192
#17 0x005dd567 in ExecInitNode (node=0x1522728, 
estate=estate@entry=0x15379b8, eflags=eflags@entry=16) at execProcnode.c:192
#18 0x005f12a5 in ExecInitHashJoin (node=0x1522530, estate=0x15379b8, 
eflags=16) at nodeHashjoin.c:489
#19 0x005dd497 in ExecInitNode (node=node@entry=0x1522530, 
estate=estate@entry=0x15379b8, eflags=eflags@entry=16) at execProcnode.c:275
#20 0x005dae6c in InitPlan (eflags=16, queryDesc=) at 
execMain.c:959
#21 standard_ExecutorStart (queryDesc=, eflags=16) at 
execMain.c:238
#22 0x005dcac4 in ParallelQueryMain (seg=, 
toc=0x7f442d27b000) at execParallel.c:729
#23 0x004e631b in ParallelWorkerMain (main_arg=) at 
parallel.c:1033
#24 0x00683af2 in StartBackgroundWorker () at bgworker.c:726
#25 0x0068ec32 in do_start_bgworker (rw=0x14c4a20) at postmaster.c:5531
#26 maybe_start_bgworker () at postmaster.c:5706
#27 0x0068f686 in sigusr1_handler (postgres_signal_arg=) 
at postmaster.c:4967
#28 
#29 0x7f442c839ac3 in __select_nocancel () at 
../sysdeps/unix/syscall-template.S:81
#30 0x0046c144 in ServerLoop () at postmaster.c:1654
#31 0x00690aae in PostmasterMain (argc=argc@entry=4, 
argv=argv@entry=0x14a1560) at postmaster.c:1298
#32 0x0046d78d in main (argc=4, argv=0x14a1560) at main.c:228
(gdb) frame 5
#5  0x005fe566 in ExecInitSubPlan (subplan=subplan@entry=0x1522a08, 
parent=parent@entry=0x1538188) at nodeSubplan.c:705
(gdb) list
704 /* Link the SubPlanState to already-initialized subplan */
705 sstate->planstate = (PlanState *) 
list_nth(estate->es_subplanstates,
706subplan->plan_id - 
1);

(gdb) attach 28062
Attaching to program: /home/smith/postgres/inst/master/bin/postgres, process 
28062
(gdb) bt
#0  0x7f442c840e33 in __epoll_wait_nocancel () at 
../sysdeps/unix/syscall-template.S:81
#1  0x006d1dde in WaitEventSetWaitBlock (nevents=1, 
occurred_events=0x7fffdedd75a0, cur_timeout=-1, set=0xe3eedb8) at latch.c:981
#2  WaitEventSetWait (set=set@entry=0xe3eedb8, timeout=timeout@entry=-1, 
occurred_events=occurred_events@entry=0x7fffdedd75a0, nevents=nevents@entry=1) 
at latch.c:935
#3  0x006d2226 in WaitLatchOrSocket (latch=0x7f442c0d9644, 
wakeEvents=wakeEvents@entry=1, sock=sock@entry=-1, timeout=-1, timeout@entry=0) 
at latch.c:347
#4  0x006d22ed in WaitLatch (latch=, 
wakeEvents=wakeEvents@entry=1, timeout=timeout@entry=0) at latch.c:302
#5  0x005ef4e3 in gather_readnext (gatherstate=0xe3d7ce8) at 
nodeGather.c:384
#6  gather_getnext (gatherstate=0xe3d7ce8) at nodeGather.c:283
#7  ExecGather (node=node@entry=0xe3d7ce8) at nodeGather.c:229
#8  0x005dd728 in ExecProcNode (node=node@entry=0xe3d7ce8) at 
execProcnode.c:515
#9  0x005f995c in ExecNestLoop (node=node@entry=0x10ef9d90) at 
nodeNestloop.c:174
#10 

Re: [HACKERS] [sqlsmith] Failed assertion in BecomeLockGroupLeader

2016-05-05 Thread Andreas Seltenreich
Alvaro Herrera writes:

> Robert Haas wrote:
>> On Thu, May 5, 2016 at 4:11 PM, Andreas Seltenreich <seltenre...@gmx.de> 
>> wrote:
>> > I don't see these crashes anymore in c1543a8.  By the amount of fuzzing
>> > done it should have happened a dozen times, so it's highly likely
>> > something in 23b09e15..c1543a8 fixed it.
>> 
>> Hmm, I'd guess c45bf5751b6338488bd79ce777210285531da373 to be the most
>> likely candidate.
>
> I thought so too, but then that patch change things in the planner side,
> but it seems to me that the reported crash is in the executor, unless I'm
> misreading.

Tom had a theory in Message-ID: <12751.1461937...@sss.pgh.pa.us> on how
the planner bug could cause the executor crash.

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] PANIC: failed to add BRIN tuple

2016-05-25 Thread Andreas Seltenreich
I wrote:

> Re-fuzzing now with your patch applied.

This so far yielded three BRIN core dumps on different machines with the
same backtraces.  Crash recovery fails in these cases.

I've put the data directory here (before recovery):

http://ansel.ydns.eu/~andreas/brincrash2-spidew.tar.xz (9.1M)

Corresponding backtraces of the backend and startup core files below.

regards,
Andreas

Core was generated by `postgres: smith brintest [local] UPDATE  
   '.
Program terminated with signal SIGABRT, Aborted.
#0  0x7f5a49a9c067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
56  ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x7f5a49a9c067 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f5a49a9d448 in __GI_abort () at abort.c:89
#2  0x007ec979 in errfinish (dummy=dummy@entry=0) at elog.c:557
#3  0x007f012c in elog_finish (elevel=elevel@entry=20, 
fmt=fmt@entry=0x989cf0 "incorrect index offsets supplied") at elog.c:1378
#4  0x006eda2f in PageIndexDeleteNoCompact 
(page=page@entry=0x7f5a48c6f200 "Y", itemnos=itemnos@entry=0x7ffe6d0d0abc, 
nitems=nitems@entry=1) at bufpage.c:1011
#5  0x00470119 in brin_doupdate (idxrel=0x1c5a530, pagesPerRange=1, 
revmap=0x91a7d90, heapBlk=2166, oldbuf=3782, oldoff=2, origtup=0x719db80, 
origsz=3656, newtup=0x754e188, newsz=3656, samepage=1 '\001') at 
brin_pageops.c:181
#6  0x0046e5db in brininsert (idxRel=0x1c5a530, values=0x6591, 
nulls=0x6 , 
heaptid=0x, heapRel=0x7f5a4a72f700, 
checkUnique=UNIQUE_CHECK_NO) at brin.c:244
#7  0x005d888f in ExecInsertIndexTuples (slot=0x91a4870, 
tupleid=0x91a7df4, estate=0x91b9b38, noDupErr=0 '\000', specConflict=0x0, 
arbiterIndexes=0x0) at execIndexing.c:383
#8  0x005f74e5 in ExecUpdate (tupleid=0x7ffe6d0d0ed0, oldtuple=0x6591, 
slot=0x91a4870, planSlot=0x, epqstate=0x7f5a4a72f700, 
estate=0x91b9b38, canSetTag=1 '\001') at nodeModifyTable.c:1015
#9  0x005f7b7c in ExecModifyTable (node=0x71861c0) at 
nodeModifyTable.c:1501
#10 0x005dd5e8 in ExecProcNode (node=node@entry=0x71861c0) at 
execProcnode.c:396
#11 0x005d963f in ExecutePlan (dest=0x17bb870, direction=, numberTuples=0, sendTuples=, operation=CMD_UPDATE, 
use_parallel_mode=, planstate=0x71861c0, estate=0x91b9b38) at 
execMain.c:1567
#12 standard_ExecutorRun (queryDesc=0x17bb908, direction=, 
count=0) at execMain.c:338
#13 0x006f74d9 in ProcessQuery (plan=, 
sourceText=0x1670e18 "update public.brintest set \n  charcol = null, \n  
namecol = pg_catalog.name(cast(null as character varying)\n), \n  int8col = 
null, \n  int2col = public.brintest.int2col, \n  textcol = null, \n  float4col 
= null, \n  float8col = cast(coalesce(null,\n\npublic.brintest.float8col) 
as double precision), \n  inetcol = public.brintest.inetcol, \n  cidrcol = 
public.brintest.cidrcol, \n  bpcharcol = null, \n  datecol = (select datecol 
from public.brintest limit 1 offset 25)\n, \n  timecol = (select timecol from 
public.brintest limit 1 offset 42)\n, \n  timetzcol = (select timetzcol from 
public.brintest limit 1 offset 3)\n, \n  numericcol = (select numericcol from 
public.brintest limit 1 offset 32)\n, \n  uuidcol = 
public.brintest.uuidcol\nreturning \n  public.brintest.datecol as c0;", 
params=0x0, dest=0x17bb870, completionTag=0x7ffe6d0d10a0 "") at pquery.c:185
#14 0x006f776f in PortalRunMulti (portal=portal@entry=0x1611b68, 
isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x17bb870, 
altdest=0xc96680 , 
completionTag=completionTag@entry=0x7ffe6d0d10a0 "") at pquery.c:1267
#15 0x006f7a1c in FillPortalStore (portal=portal@entry=0x1611b68, 
isTopLevel=isTopLevel@entry=1 '\001') at pquery.c:1044
#16 0x006f846d in PortalRun (portal=0x1611b68, 
count=9223372036854775807, isTopLevel=, dest=0x756c870, 
altdest=0x756c870, completionTag=0x7ffe6d0d1450 "") at pquery.c:782
#17 0x006f5c73 in exec_simple_query (query_string=) at 
postgres.c:1094
#18 PostgresMain (argc=23141224, argv=0x2cab518, dbname=0x15f3578 "brintest", 
username=0x2cab570 "\030\265\312\002") at postgres.c:4059
#19 0x0046c8d2 in BackendRun (port=0x1618880) at postmaster.c:4258
#20 BackendStartup (port=0x1618880) at postmaster.c:3932
#21 ServerLoop () at postmaster.c:1690
#22 0x006907fe in PostmasterMain (argc=argc@entry=4, 
argv=argv@entry=0x15f2560) at postmaster.c:1298
#23 0x0046d82d in main (argc=4, argv=0x15f2560) at main.c:228
(gdb) frame 4
#4  0x006eda2f in PageIndexDeleteNoCompact 
(page=page@entry=0x7f5a48c6f200 "Y", itemnos=itemnos@entry=0x7ffe6d0d0abc, 
nitems=nitems@entry=1) at bufpage.c:1011
1011elog(ERROR, "incorrect index offsets supplied");
(gdb) list
1006}
1007}
1008
1009/* this will catch invalid or out-of-order itemnos[] */
1010 

Re: [HACKERS] GiST index build versus NaN coordinates

2016-07-12 Thread Andreas Seltenreich
Tom Lane writes:

> More generally, this example makes me fearful that NaN coordinates in
> geometric values are likely to cause all sorts of issues.  It's too late
> to disallow them, probably, but I wonder how can we identify other bugs
> of this ilk.

Sounds like some fuzz testing with nan/infinity is in order.  sqlsmith
doesn't generate any float literals, but it calls functions to satisfy
its need for values of specific types.  Adding suitable functions[1] to
the regression db, I made the following observations:

The infinite loop from the bug report was triggered. Further, two
previously unseen errors are logged:

ERROR:  timestamp cannot be NaN
ERROR:  getQuadrant: impossible case

The first is porbably as boring as it gets, the second one is from the
getQuadrant() in spgquadtreeproc.c.

Curiously, the getQuadrant()s in geo_spgist.c and rangetypes_spgist.c do
not have such a check.  I guess the boxes will just end up in an
undefined position in the index for these.

regards
Andreas

Footnotes:
[1]
create function smith_double_inf() returns float as $$select 
'infinity'::float$$ language sql immutable;
create function smith_double_ninf() returns float as $$select 
'-infinity'::float$$ language sql immutable;
create function smith_double_nan() returns float as $$select 'nan'::float$$ 
language sql immutable;
create function smith_real_nan() returns real as $$select 'nan'::real$$ 
language sql immutable;
create function smith_real_inf() returns real as $$select 'infinity'::real$$ 
language sql immutable;
create function smith_real_ninf() returns real as $$select '-infinity'::real$$ 
language sql immutable;


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Issue in pg_catalog.pg_indexes view definition

2016-07-14 Thread Andreas Seltenreich
Tom Lane writes:

> Dilip Kumar  writes:
>> So I think changing the view definition and calling this function on
>> indexrelid will remove the error. So I think
>> correct fix is to change view definition, as I proposed in above patch.
[...]
> We've dealt with similar issues in places like pg_relation_size() by
> making the functions return NULL instead of throwing an error for an
> unmatched argument OID.

Note that Michael Paquier sent a patch implementing this in another
thread:

https://www.postgresql.org/message-id/cab7npqtxf5dtxjezb7xkjvowxx8d_2atxmtu3psnkhcwt_j...@mail.gmail.com

regards,
andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving executor performance

2016-07-14 Thread Andreas Seltenreich
Andres Freund writes:

> Having expression evaluation and slot deforming as a series of simple
> sequential steps, instead of complex recursive calls, would also make it
> fairly straightforward to optionally just-in-time compile those.

I don't think that JIT becomes easier by this change.  Constructing the
IR for LLVM, libFirm or any other JIT library from expression trees is
straightforward already.  It's probably more of a nuisance for those
that already have some code/design on JIT-compiling expressions
(vitessedb, ISP RAS, yours truly)

I like your patch for all the other reasons stated though!

regards
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving executor performance

2016-07-14 Thread Andreas Seltenreich
Andres Freund writes:

> The problem is that the previous form has a lot of ad-hoc analysis
> strewn in. The interesting part is getting rid of all that. That's what
> the new ExecInitExpr2() does. The target form can be both evaluated more
> efficiently in the dispatch manner in the patch, and quite simply
> converted to a JIT - without duplicating the analysis code.  I did write
> a small ad-hoc x86 jit, and it was really quite straightforward this
> way.

Ja, I see the advantage when doing ad-hoc-JIT compilation.

> What did you do with JIT and expression evaluation? You basically just
> replaced the toplevel ExprState note with a different evalfunc, pointing
> into your code?

That's the plan, yes.  I'm sorry there's no publishable code yet on the
the postgres side of things.  Using libFirm[1], the plan is to.

1. Automatically generate Firm-IR for the static C code around
   expression evaluation as well operators in the system catalog.

2. Construct IR for expression trees (essentially all the function calls
   the executor would do).

3. Push libFirm's optimize button.  At this stage, most of the
   dispatching goes away by inlining the calls including functions from
   the catalog implementing operators.

4. Generate code and replace the toplevel expression node with a funcall
   node.

I did implement this recipe with a toy Forth interpreter to see whether
libFirm was up to the job (Nobody's done JIT with libFirm before).  The
results were favorable[2].  Currently, a student at credativ is working
on applying these techniques to postgres.

regards,
Andreas

Footnotes: 
[1]  http://libfirm.org/

[2]  https://korte.credativ.com/~ase/firm-postgres-jit-forth.pdf



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving executor performance

2016-07-14 Thread Andreas Seltenreich
Andres Freund writes:

> On 2016-07-14 23:03:10 +0200, Andreas Seltenreich wrote:
>> That's the plan, yes.  I'm sorry there's no publishable code yet on the
>> the postgres side of things.  Using libFirm[1], the plan is to.
>
> Why libfirm?

- It has a more modern IR than LLVM (they're catching up though)
- It's very lightweight, compiles faster than LLVM's ./configure run
- I do have lots of experience with it and a committer bit

> It seems to only have x86 and sparc backends, and no windows support?

Ack, it's mostly used in research projects, that's why the number of
supported ISAs is small.  It's enough to answer the burning question
what speedup is to expected by jit-compiling things in the backend
though.  Also, if this thing actually takes off, adding more backends is
something that is easier with libFirm than LLVM, IMHO.

>> 1. Automatically generate Firm-IR for the static C code around
>>expression evaluation as well operators in the system catalog.
>
>> 2. Construct IR for expression trees (essentially all the function calls
>>the executor would do).
>
> But that essentially means redoing most of execQual's current code in IR
> - or do you want to do that via 1) above?

Manually re-doing backend logic in IR is a can of worms I do not want to
open.  This would guarantee bugs and result in a maintenance nightmare,
so doing 1) for the code is the only option when something turns out to
be a bottleneck IMHO.

> As long as the preparation code (which is currently intermixed with
> the execution phase) isn't separated, that means pulling essentially
> the whole backend given that we do catalog lookups and everything
> during that phase.

Right, the catalog lookups need to be done before JIT-compilation to
allow inlining operators.

>> Currently, a student at credativ is working on applying these
>> techniques to postgres.
>
> Are you planning to support this to postgres proper?

The goal is to publish it as an extension that sneaks into planner_hook.
I think BSD-licensing is ok as long as libfirm (LGPL) is kept as an
external dependency.

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] FailedAssertion("!(k == indices_count)", File: "tsvector_op.c", Line: 511)

2016-08-03 Thread Andreas Seltenreich
Hi,

the following statement triggers an assertion in tsearch:

select ts_delete(array_to_tsvector('{smith,smith,smith}'::text[]),  
'{smith,smith}'::text[]);
-- TRAP: FailedAssertion("!(k == indices_count)", File: "tsvector_op.c", Line: 
511)

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] FailedAssertion("!(XLogCtl->Insert.exclusiveBackup)", File: "xlog.c", Line: 10200)

2016-08-03 Thread Andreas Seltenreich
Hi,

testing with sqlsmith shows that the following assertion doesn't hold:

FailedAssertion("!(XLogCtl->Insert.exclusiveBackup)", File: "xlog.c", Line: 
10200)

The triggering statements always contain a call to pg_start_backup with
the third argument 'true', i.e. it's trying to start an exlusive backup.

I didn't manage to put together a stand-alone testcase yet.

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST index build versus NaN coordinates

2016-07-16 Thread Andreas Seltenreich
I wrote:

> Sounds like some fuzz testing with nan/infinity is in order.

related fallout: close_ps returns a NULL pointer with NaNs around:

select close_ps('(nan,nan)', '(nan,nan),(nan,nan)');
-- TRAP: FailedAssertion("!(result != ((void *)0))", File: "geo_ops.c", Line: 
2860)

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Crash in GetOldestSnapshot()

2016-08-06 Thread Andreas Seltenreich
Hi,

since updating master from c93d873..fc509cd, I see crashes in
GetOldestSnapshot() on update/delete returning statements.

I reduced the triggering statements down to this:

update clstr_tst set d = d returning d;

Backtrace below.

regards,
Andreas

Program received signal SIGSEGV, Segmentation fault.
(gdb) bt
#0  GetOldestSnapshot () at snapmgr.c:422
#1  0x004b8279 in init_toast_snapshot (toast_snapshot=0x7ffcd824b010) 
at tuptoaster.c:2314
#2  0x004b83bc in toast_fetch_datum (attr=) at 
tuptoaster.c:1869
#3  0x004b9ab5 in heap_tuple_untoast_attr (attr=0x18226c8) at 
tuptoaster.c:179
#4  0x007f71ad in pg_detoast_datum_packed (datum=) at 
fmgr.c:2266
#5  0x007cfc12 in text_to_cstring (t=0x18226c8) at varlena.c:186
#6  0x007f5735 in FunctionCall1Coll (flinfo=flinfo@entry=0x18221c0, 
collation=collation@entry=0, arg1=arg1@entry=25306824) at fmgr.c:1297
#7  0x007f68ee in OutputFunctionCall (flinfo=0x18221c0, val=25306824) 
at fmgr.c:1946
#8  0x00478bc1 in printtup (slot=0x1821f80, self=0x181ce48) at 
printtup.c:359
#9  0x006f9c8e in RunFromStore (portal=portal@entry=0x177cbf8, 
direction=direction@entry=ForwardScanDirection, count=count@entry=0, 
dest=0x181ce48) at pquery.c:1117
#10 0x006f9d52 in PortalRunSelect (portal=portal@entry=0x177cbf8, 
forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, 
dest=dest@entry=0x181ce48) at pquery.c:942
#11 0x006fb41e in PortalRun (portal=portal@entry=0x177cbf8, 
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', 
dest=dest@entry=0x181ce48, altdest=altdest@entry=0x181ce48, 
completionTag=completionTag@entry=0x7ffcd824b920 "") at pquery.c:787
#12 0x006f822b in exec_simple_query (query_string=0x17db878 "update 
clstr_tst set d = d returning d;") at postgres.c:1094
#13 PostgresMain (argc=, argv=argv@entry=0x1781ce0, 
dbname=0x1781b40 "regression", username=) at postgres.c:4074
#14 0x0046c9bd in BackendRun (port=0x1786920) at postmaster.c:4262
#15 BackendStartup (port=0x1786920) at postmaster.c:3936
#16 ServerLoop () at postmaster.c:1693
#17 0x00693044 in PostmasterMain (argc=argc@entry=3, 
argv=argv@entry=0x175d5f0) at postmaster.c:1301
#18 0x0046dd26 in main (argc=3, argv=0x175d5f0) at main.c:228
(gdb) list
417
418 if (OldestActiveSnapshot != NULL)
419 ActiveLSN = OldestActiveSnapshot->as_snap->lsn;
420
421 if (XLogRecPtrIsInvalid(RegisteredLSN) || RegisteredLSN > 
ActiveLSN)
422 return OldestActiveSnapshot->as_snap;
423
424 return OldestRegisteredSnapshot;
425 }
426
(gdb) bt full
#0  GetOldestSnapshot () at snapmgr.c:422
OldestRegisteredSnapshot = 
RegisteredLSN = 
ActiveLSN = 
#1  0x004b8279 in init_toast_snapshot (toast_snapshot=0x7ffcd824b010) 
at tuptoaster.c:2314
snapshot = 
#2  0x004b83bc in toast_fetch_datum (attr=) at 
tuptoaster.c:1869
toastrel = 0x7f8b4ca88920
toastidxs = 0x18447c8
toastkey = {
  sk_flags = 0,
  sk_attno = 1,
  sk_strategy = 3,
  sk_subtype = 0,
  sk_collation = 100,
  sk_func = {
fn_addr = 0x77c490 ,
fn_oid = 184,
fn_nargs = 2,
fn_strict = 1 '\001',
fn_retset = 0 '\000',
fn_stats = 2 '\002',
fn_extra = 0x0,
fn_mcxt = 0x18282a8,
fn_expr = 0x0
  },
  sk_argument = 34491
}
toastscan = 
ttup = 
toasttupDesc = 0x7f8b4ca88c50
result = 0x18422d8
toast_pointer = 
ressize = 5735
residx = 
nextidx = 0
numchunks = 3
chunk = 
isnull = 
chunkdata = 
chunksize = 
num_indexes = 1
validIndex = 0
SnapshotToast = {
  satisfies = 0x112,
  xmin = 3626283536,
  xmax = 32764,
  xip = 0xf8ac628,
  xcnt = 5221870,
  subxip = 0x0,
  subxcnt = 0,
  suboverflowed = 0 '\000',
  takenDuringRecovery = 0 '\000',
  copied = 0 '\000',
  curcid = 14,
  speculativeToken = 0,
  active_count = 260753304,
  regd_count = 0,
  ph_node = {
first_child = 0xf8ac680,
next_sibling = 0xa400112,
prev_or_parent = 0x0
  },
  whenTaken = 274,
  lsn = 0
}
__func__ = "toast_fetch_datum"
#3  0x004b9ab5 in heap_tuple_untoast_attr (attr=0x18226c8) at 
tuptoaster.c:179
No locals.
#4  0x007f71ad in pg_detoast_datum_packed (datum=) at 
fmgr.c:2266
No locals.
#5  0x007cfc12 in text_to_cstring (t=0x18226c8) at varlena.c:186
tunpacked = 
result = 
#6  0x007f5735 in FunctionCall1Coll (flinfo=flinfo@entry=0x18221c0, 

Re: [HACKERS] [sqlsmith] FailedAssertion("!(XLogCtl->Insert.exclusiveBackup)", File: "xlog.c", Line: 10200)

2016-08-06 Thread Andreas Seltenreich
Michael Paquier writes:

> Andreas, with the patch attached is the assertion still triggered?
> [2. text/x-diff; base-backup-crash-v2.patch]

I didn't observe the crashes since applying this patch.  There should
have been about five by the amount of fuzzing done.

regards
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Crash in pg_get_viewdef_name_ext()

2016-08-07 Thread Andreas Seltenreich
Hi,

sqlsmith just triggered a crash in pg_get_viewdef_name_ext().  Looks
like commit 976b24fb4 failed to update this caller of
pg_get_viewdef_worker().  Backtrace below.  Patch attached.

regards,
Andreas

Program terminated with signal SIGSEGV, Segmentation fault.
(gdb) bt
#0  strlen () at ../sysdeps/x86_64/strlen.S:106
#1  0x007cdf09 in cstring_to_text (s=s@entry=0x0) at varlena.c:152
#2  0x007a3409 in string_to_text (str=0x0) at ruleutils.c:10083
#3  pg_get_viewdef_name_ext (fcinfo=) at ruleutils.c:681
#4  0x005dfae2 in ExecMakeFunctionResultNoSets (fcache=0x403ed80, 
econtext=0x3fb0eb8, isNull=0x403e0a1 "", isDone=) at 
execQual.c:2041
[...]

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index d68ca7a..3c3fce4 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -671,6 +671,7 @@ pg_get_viewdef_name_ext(PG_FUNCTION_ARGS)
 	int			prettyFlags;
 	RangeVar   *viewrel;
 	Oid			viewoid;
+	char		*res;
 
 	prettyFlags = pretty ? PRETTYFLAG_PAREN | PRETTYFLAG_INDENT : PRETTYFLAG_INDENT;
 
@@ -678,7 +679,12 @@ pg_get_viewdef_name_ext(PG_FUNCTION_ARGS)
 	viewrel = makeRangeVarFromNameList(textToQualifiedNameList(viewname));
 	viewoid = RangeVarGetRelid(viewrel, NoLock, false);
 
-	PG_RETURN_TEXT_P(string_to_text(pg_get_viewdef_worker(viewoid, prettyFlags, WRAP_COLUMN_DEFAULT)));
+	res = pg_get_viewdef_worker(viewoid, prettyFlags, WRAP_COLUMN_DEFAULT);
+
+	if (res == NULL)
+		PG_RETURN_NULL();
+
+	PG_RETURN_TEXT_P(string_to_text(res));
 }
 
 /*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] crashes in RestoreSnapshot on hot standby

2016-06-30 Thread Andreas Seltenreich
Running sqlsmith on a streaming slave (master as of f8c5855) is
inconspicuous as long as the master is idle.  As soon as I start it on
the master as well, the standby frequently crashes in RestoreSnapshot.
It doesn't seem to be specific to the queries, as they don't trigger a
crash when re-run.

Backtraces always look like the ones below.

regards,
Andreas

 BEGIN BACKTRACE OF CORE FILE ./slave/postgres.9826@.core ON doombat 
Core was generated by `postgres: smith regression [local] SELECT
 '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __memcpy_sse2_unaligned () at 
../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:167
167 ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: Datei oder 
Verzeichnis nicht gefunden.
(gdb) bt
#0  __memcpy_sse2_unaligned () at 
../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:167
#1  0x00822032 in RestoreSnapshot 
(start_address=start_address@entry=0x7f2701d5a110 ) at snapmgr.c:2020
#2  0x004a934a in heap_beginscan_parallel (relation=0x2060a90, 
parallel_scan=parallel_scan@entry=0x7f2701d5a0f8) at heapam.c:1657
#3  0x005fbedf in ExecSeqScanInitializeDSM (node=0x1f5b470, 
pcxt=0x221af88) at nodeSeqscan.c:327
#4  0x005dd0ad in ExecParallelInitializeDSM 
(planstate=planstate@entry=0x1f5b470, d=d@entry=0x7ffd4ba200d0) at 
execParallel.c:245
#5  0x005dd425 in ExecInitParallelPlan (planstate=0x1f5b470, 
estate=estate@entry=0x1f5ab28, nworkers=2) at execParallel.c:477
#6  0x005ef4a4 in ExecGather (node=node@entry=0x1f5b048) at 
nodeGather.c:159
#7  0x005dda48 in ExecProcNode (node=node@entry=0x1f5b048) at 
execProcnode.c:515
#8  0x005f4b30 in ExecLimit (node=node@entry=0x1f5acd0) at 
nodeLimit.c:91
#9  0x005dd9d8 in ExecProcNode (node=node@entry=0x1f5acd0) at 
execProcnode.c:531
#10 0x005fef7c in ExecSetParamPlan (node=, 
econtext=0x1f5c138) at nodeSubplan.c:999
#11 0x005e28b5 in ExecEvalParamExec (exprstate=, 
econtext=, isNull=0x22045b0 "", isDone=) at 
execQual.c:1135
#12 0x005deb6d in ExecMakeFunctionResultNoSets (fcache=0x2204200, 
econtext=0x1f5c138, isNull=0x2203d98 "", isDone=) at 
execQual.c:2015
#13 0x005de29a in ExecEvalCoalesce (coalesceExpr=, 
econtext=0x1f5c138, isNull=0x2203d98 "", isDone=) at 
execQual.c:3446
#14 0x005deb6d in ExecMakeFunctionResultNoSets (fcache=0x22039e8, 
econtext=0x1f5c138, isNull=0x7ffd4ba203df "", isDone=) at 
execQual.c:2015
#15 0x005e4939 in ExecQual (qual=, 
econtext=econtext@entry=0x1f5c138, resultForNull=resultForNull@entry=0 '\000') 
at execQual.c:5269
#16 0x005faef1 in ExecResult (node=node@entry=0x1f5c020) at 
nodeResult.c:82
#17 0x005ddbf8 in ExecProcNode (node=node@entry=0x1f5c020) at 
execProcnode.c:392
#18 0x005d9c1f in ExecutePlan (dest=0x1ebb7d0, direction=, numberTuples=0, sendTuples=, operation=CMD_SELECT, 
use_parallel_mode=, planstate=0x1f5c020, estate=0x1f5ab28) at 
execMain.c:1567
#19 standard_ExecutorRun (queryDesc=0x1f5a718, direction=, 
count=0) at execMain.c:338
#20 0x006f7238 in PortalRunSelect (portal=portal@entry=0x1d13be8, 
forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, 
dest=dest@entry=0x1ebb7d0) at pquery.c:946
#21 0x006f875e in PortalRun (portal=0x1d13be8, 
count=9223372036854775807, isTopLevel=, dest=0x1ebb7d0, 
altdest=0x1ebb7d0, completionTag=0x7ffd4ba20840 "") at pquery.c:787
#22 0x006f6003 in exec_simple_query (query_string=) at 
postgres.c:1094
#23 PostgresMain (argc=30489576, argv=0x1ecfb08, dbname=0x1cf5a00 "regression", 
username=0x1ecfc20 "\b\373\354\001") at postgres.c:4074
#24 0x0046ca67 in BackendRun (port=0x1d17b50) at postmaster.c:4262
#25 BackendStartup (port=0x1d17b50) at postmaster.c:3936
#26 ServerLoop () at postmaster.c:1693
#27 0x00690ab7 in PostmasterMain (argc=argc@entry=3, 
argv=argv@entry=0x1cf45e0) at postmaster.c:1301
#28 0x0046d9cd in main (argc=3, argv=0x1cf45e0) at main.c:228
(gdb) p debug_query_string
$1 = 0x1d68a78 "select  \n  sample_0.j as c0\nfrom \n  public.testjsonb as 
sample_0 tablesample system (8) \nwhere cast(coalesce(pg_catalog.char_length(\n 
 cast((select comment from public.room limit 1 offset 20)\n as 
text)),\npg_catalog.pg_trigger_depth()) as integer) <> 3"

 BEGIN BACKTRACE OF CORE FILE ./slave/postgres.8104@.core ON marbit 
Core was generated by `postgres: bgworker: parallel worker for PID 2610 
 '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __memcpy_sse2_unaligned () at 
../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:167
167 ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S: Datei oder 
Verzeichnis nicht gefunden.
(gdb) bt
#0  __memcpy_sse2_unaligned () at 
../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:167
#1  0x00822032 in RestoreSnapshot (start_address=0x7f7b1ee4fa58 ) at snapmgr.c:2020
#2  

Re: [HACKERS] [sqlsmith] crashes in RestoreSnapshot on hot standby

2016-06-30 Thread Andreas Seltenreich
Amit Kapila writes:
> On Fri, Jul 1, 2016 at 9:38 AM, Thomas Munro  
> wrote:
>> Or maybe just like this?
>>
>> -   snapshot->subxip = snapshot->xip + serialized_snapshot->xcnt;
>> +   snapshot->subxip = ((TransactionId *) (snapshot + 1)) +
>> +   serialized_snapshot->xcnt;
>>
>
> This way it looks better to me.  Thanks for the patch.

I no longer see these crashes when testing with the patch applied.

thanks,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] ERROR: plan should not reference subplan's variable

2016-07-01 Thread Andreas Seltenreich
Updating master from f8c5855..1bdae16, sqlsmith triggers "failed to
generate plan" errors again.  Below is the smallest query logged so far.

regards,
Andreas

-- ERROR:  plan should not reference subplan's variable
set force_parallel_mode = 'on';
set max_parallel_workers_per_gather = '1';

explain WITH
jennifer_0 AS (select
(select b from public.rtest_v1 limit 1 offset 5)
   as c0,
pg_catalog.pg_current_xlog_location() as c1,
sample_0.a as c2,
sample_0.a as c3
  from
public.rtest_view4 as sample_0 tablesample system (5.9)
  where cast(null as bigint) = pg_catalog.hashinet(
  cast((select client_addr from pg_catalog.pg_stat_activity limit 1 offset 
35)
 as inet))
  limit 76)
select
ref_0.sl_name as c0
  from
public.shoelace as ref_0
  where (cast(null as anyrange) < cast(null as anyrange))
and (EXISTS (
  select
  39 as c0
from
  jennifer_0 as ref_1
where cast(null as real) = cast(null as real)
limit 81))
  limit 96;


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS planning

2017-01-19 Thread Andreas Seltenreich
Tom Lane writes:

> Thanks for reviewing --- I'll do something with that test case and
> push it.

sqlsmith doesn't seem to like 215b43cdc:

select 1 from information_schema.usage_privileges
where information_schema._pg_keysequal(
   (select null::smallint[]),
   '{16,25,23}');

-- TRAP: FailedAssertion("!(!and_clause((Node *) clause))", File: 
"restrictinfo.c", Line: 81)

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Failed assertion in numeric aggregate

2016-09-03 Thread Andreas Seltenreich
Hi,

updating master from be7f7ee..39b691f, the following assertion is
triggered frequently by sqlsmith:

TRAP: BadArgument("!(((context) != ((void *)0) && (const 
Node*)((context)))->type) == T_AllocSetContext", File: "mcxt.c", Line: 1010)

Digging in the coredumps, it looks like set_var_from_num() is invoked on
an uninitialized NumericVar.  Sample gdb session below.

Below is also one of the generated queries that eventually triggers it
for me when invoked a dozen times or so.

regards,
Andreas

--8<---cut here---start->8---
select
  subq_0.c0 as c0,
  subq_0.c0 as c1,
  5 as c2,
  (select pg_catalog.min(class) from public.f_star)
 as c3
from
  (select
sample_2.cc as c0
  from
public.shoelace_arrive as ref_0
  inner join public.hub as sample_1
right join public.e_star as sample_2
on (sample_1.name = sample_2.class )
  on (ref_0.arr_name = sample_2.class )
  limit 63) as subq_0
where ((subq_0.c0 is not NULL)
and ((select pg_catalog.var_pop(enumsortorder) from pg_catalog.pg_enum)
 is not NULL))
  and (((select pg_catalog.var_samp(random) from public.bt_txt_heap)
 is NULL)
or ((select m from public.money_data limit 1 offset 1)
 <> (select pg_catalog.min(salary) from public.rtest_empmass)
));
--8<---cut here---end--->8---

(gdb) bt
#0  0x7ff011f221c8 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:54
#1  0x7ff011f2364a in __GI_abort () at abort.c:89
#2  0x007ef1b1 in ExceptionalCondition 
(conditionName=conditionName@entry=0x9d26c8 "!(((context) != ((void *)0) && 
(const Node*)((context)))->type) == T_AllocSetContext", 
errorType=errorType@entry=0x835c25 "BadArgument", 
fileName=fileName@entry=0x9d2640 "mcxt.c", lineNumber=lineNumber@entry=1010) at 
assert.c:54
#3  0x00813561 in pfree (pointer=) at mcxt.c:1010
#4  0x00773169 in alloc_var (var=var@entry=0x7ffe3a6d18d0, 
ndigits=ndigits@entry=6) at numeric.c:5387
#5  0x00774230 in set_var_from_num (num=0x1e49180, dest=0x7ffe3a6d18d0) 
at numeric.c:5608
#6  0x0077be2c in numeric_poly_deserialize (fcinfo=) at 
numeric.c:4196
#7  0x005ec48c in combine_aggregates (aggstate=0x1e255d8, 
pergroup=) at nodeAgg.c:986
#8  0x005edcc5 in agg_retrieve_direct (aggstate=0x1e255d8) at 
nodeAgg.c:2095
#9  ExecAgg (node=node@entry=0x1e255d8) at nodeAgg.c:1837
#10 0x005e0078 in ExecProcNode (node=node@entry=0x1e255d8) at 
execProcnode.c:503
#11 0x0060173c in ExecSetParamPlan (node=, 
econtext=0x1e2e710) at nodeSubplan.c:995
#12 0x005e4f75 in ExecEvalParamExec (exprstate=, 
econtext=, isNull=0x7ffe3a6d1b3f "", isDone=) at 
execQual.c:1140
#13 0x005e14c6 in ExecEvalNullTest (nstate=0x1e2ec50, 
econtext=0x1e2e710, isNull=0x7ffe3a6d1b3f "", isDone=0x0) at execQual.c:3902
#14 0x005e0656 in ExecEvalOr (orExpr=, 
econtext=0x1e2e710, isNull=0x7ffe3a6d1b3f "", isDone=) at 
execQual.c:2809
#15 0x005e7089 in ExecQual (qual=, 
econtext=econtext@entry=0x1e2e710, resultForNull=resultForNull@entry=0 '\000') 
at execQual.c:5379
#16 0x005fd6b1 in ExecResult (node=node@entry=0x1e2e5f8) at 
nodeResult.c:82
#17 0x005e01f8 in ExecProcNode (node=node@entry=0x1e2e5f8) at 
execProcnode.c:392
#18 0x005dc27e in ExecutePlan (dest=0x7ff0129e22b0, 
direction=, numberTuples=0, sendTuples=, 
operation=CMD_SELECT, use_parallel_mode=, planstate=0x1e2e5f8, 
estate=0x1e1aba8) at execMain.c:1567
#19 standard_ExecutorRun (queryDesc=0x1d563b8, direction=, 
count=0) at execMain.c:338
#20 0x006faad8 in PortalRunSelect (portal=portal@entry=0x1def878, 
forward=forward@entry=1 '\001', count=0, count@entry=9223372036854775807, 
dest=dest@entry=0x7ff0129e22b0) at pquery.c:948
#21 0x006fc04e in PortalRun (portal=portal@entry=0x1def878, 
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001', 
dest=dest@entry=0x7ff0129e22b0, altdest=altdest@entry=0x7ff0129e22b0, 
completionTag=completionTag@entry=0x7ffe3a6d1fa0 "") at pquery.c:789
#22 0x006f8deb in exec_simple_query (query_string=0x1dc2e58 "select  \n 
 subq_0.c0 as c0, \n  subq_0.c0 as c1, \n  5 as c2, \n  (select 
pg_catalog.min(class) from public.f_star)\n as c3\nfrom \n  (select  \n 
   sample_2.cc as c0\n  from \npublic.shoel"...) at postgres.c:1094
#23 PostgresMain (argc=, argv=argv@entry=0x1d64730, 
dbname=0x1d64590 "regression", username=) at postgres.c:4070
#24 0x0046cf81 in BackendRun (port=0x1d4ffd0) at postmaster.c:4260
#25 BackendStartup (port=0x1d4ffd0) at postmaster.c:3934
#26 ServerLoop () at postmaster.c:1691
#27 0x00693634 in PostmasterMain (argc=argc@entry=3, 
argv=argv@entry=0x1d2c5d0) at postmaster.c:1299
#28 0x0046e0d6 in main (argc=3, argv=0x1d2c5d0) at main.c:228
(gdb) frame 5
#5  0x00774230 in 

[HACKERS] [sqlsmith] Missing CHECK_FOR_INTERRUPTS in tsquery_rewrite

2016-10-29 Thread Andreas Seltenreich
Hi,

testing with sqlsmith yielded an uncancellable backend hogging CPU time.
Gdb showed it was busy in findeq() of tsquery_rewrite.c.  This function
appears to have exponential complexity wrt. the size of the involved
tsqueries.  The following query runs for 12s on my machine with no way
to cancel it and incrementing the length of the first argument by 1
doubles this time.

select ts_rewrite(
  (select string_agg(i::text, '&')::tsquery from generate_series(1,32) g(i)),
  (select string_agg(i::text, '&')::tsquery from generate_series(1,19) g(i)),
  'foo');

The attached patch adds a CHECK_FOR_INTERRUPTS to make it cancellable.

regards,
Andreas

>From d9910a96c9bd73c16e29ecaa0577945d5e1c091c Mon Sep 17 00:00:00 2001
From: Andreas Seltenreich <seltenre...@gmx.de>
Date: Sun, 30 Oct 2016 03:25:55 +0100
Subject: [PATCH] Add CHECK_FOR_INTERRUPTS in tsquery_rewrite loop.

The loop in findeq() appears to have exponential complexity and
runtime becomes excessive for more than about 30 tokens in the
tsvectors.  Add a CHECK_FOR_INTERRUPTS to make it cancellable.
---
 src/backend/utils/adt/tsquery_rewrite.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/backend/utils/adt/tsquery_rewrite.c b/src/backend/utils/adt/tsquery_rewrite.c
index 28f328d..ef6444f 100644
--- a/src/backend/utils/adt/tsquery_rewrite.c
+++ b/src/backend/utils/adt/tsquery_rewrite.c
@@ -95,6 +95,10 @@ findeq(QTNode *node, QTNode *ex, QTNode *subs, bool *isfind)
 
 			do
 			{
+/* This loop is rather heavyweight, it better be
+ * cancellable. */
+CHECK_FOR_INTERRUPTS();
+
 tnode->sign = 0;
 for (i = 0; i < ex->nchild; i++)
 {
-- 
2.9.3


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Failed assertion in TS_phrase_execute

2016-11-26 Thread Andreas Seltenreich
Hi,

the query below triggers an assertion in TS_phrase_execute.  Testing was
done on master at dbdfd11.

regards,
Andreas

-- TRAP: FailedAssertion("!(curitem->qoperator.oper == 4)", File: 
"tsvector_op.c", Line: 1432)

select 'moscow' @@
   ts_rewrite('moscow', 'moscow',
  ts_rewrite(
 tsquery_phrase('moscow','moscow'),
 'moscow',
 $$ 'sanct' & 'peter'$$));


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Parallel worker crash on seqscan

2016-11-21 Thread Andreas Seltenreich
Hi,

Ashutosh Sharma writes:

>> the following query appears to reliably crash parallel workers on master
>> as of 0832f2d.

> As suggested, I have tried to reproduce this issue on *0832f2d* commit but
> could not reproduce it.

as Tom pointed out earlier, I had secretly set parallel_setup_cost and
parallel_tuple_cost to 0.  I assumed these were irrelevant when
force_parallel_mode is on.  I'll do less assuming and more testing on a
vanilla install on future reports.

Sorry for the inconvenience,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Failed assertion in parallel worker in ExecInitSubPlan

2016-11-24 Thread Andreas Seltenreich
Hi,

just caught another InitPlan below Gather with the recent patches in
(master as of 4cc6a3f).  Recipe below.

regards,
andreas

set max_parallel_workers_per_gather = 2;
set min_parallel_relation_size = 0;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;

explain select 1 from
   public.quad_point_tbl as ref_0,
   lateral (select
 ref_0.p as c3,
 sample_0.d as c5
   from
 public.nv_child_2010 as sample_0
   left join public.mvtest_tvv as ref_1
   on ('x'< (select contype from pg_catalog.pg_constraint limit 1))
   limit 82) as subq_0;

--QUERY PLAN
-- 

--  Gather  (cost=0.19..13727.52 rows=902246 width=4)
--Workers Planned: 2
--->  Nested Loop  (cost=0.19..13727.52 rows=902246 width=4)
--  ->  Parallel Seq Scan on quad_point_tbl ref_0  (cost=0.00..105.85 
rows=4585 width=16)
--  ->  Limit  (cost=0.19..1.33 rows=82 width=20)
--InitPlan 1 (returns $0)
--  ->  Limit  (cost=0.00..0.19 rows=1 width=1)
--->  Gather  (cost=0.00..10.22 rows=54 width=1)
--  Workers Planned: 2
--  ->  Parallel Seq Scan on pg_constraint  
(cost=0.00..10.22 rows=22 width=1)
--->  Seq Scan on nv_child_2010 sample_0  (cost=0.00..35.50 
rows=2550 width=20)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Crash on GUC serialization

2016-11-19 Thread Andreas Seltenreich
Hi,

sqlsmith just made a GUC that tricks the serialization code into
dereferencing a nullpointer.  Here's a recipe:

--8<---cut here---start->8---
set min_parallel_relation_size to 0;
set max_parallel_workers_per_gather to 2;
set force_parallel_mode to on;
begin;
select set_config('foo.baz', null, b) from (values (false), (true)) g(b);
commit;
select 1;
--8<---cut here---end--->8---

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Crash on GUC serialization

2016-11-19 Thread Andreas Seltenreich
Michael Paquier writes:

> [2. text/plain; fix-guc-string-eval.patch]

I'm afraid taking care of the length computation is not sufficient.
ISTM like it'll still try to serialize the NULL pointer later on in
serialize_variable:

,[ guc.c:9108 ]
| case PGC_STRING:
| {
|   struct config_string *conf = (struct config_string *) gconf;
|   do_serialize(destptr, maxbytes, "%s", *conf->variable);
`


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Parallel worker crash on seqscan

2016-11-20 Thread Andreas Seltenreich
Hi,

the following query appears to reliably crash parallel workers on master
as of 0832f2d.

--8<---cut here---start->8---
set max_parallel_workers_per_gather to 2;
set force_parallel_mode to 1;

select subq.context from pg_settings,
lateral (select context from pg_opclass limit 1) as subq
limit 1;
--8<---cut here---end--->8---

Backtrace of a worker below.

regards,
Andreas

Core was generated by `postgres: bgworker: parallel worker for PID 27448'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  MakeExpandedObjectReadOnlyInternal (d=0) at expandeddatum.c:100
100 if (!VARATT_IS_EXTERNAL_EXPANDED_RW(DatumGetPointer(d)))
(gdb) bt
#0  MakeExpandedObjectReadOnlyInternal (d=0) at expandeddatum.c:100
#1  0x563b0c9a4e98 in ExecTargetList (tupdesc=, 
isDone=0x7ffdd20400ac, itemIsDone=0x563b0e6a8b50, isnull=0x563b0e6a8ae0 "", 
values=0x563b0e6a8ac0, econtext=0x563b0e6a7db8, targetlist=0x563b0e6a8498) at 
execQual.c:5491
#2  ExecProject (projInfo=projInfo@entry=0x563b0e6a8368, 
isDone=isDone@entry=0x7ffdd20400ac) at execQual.c:5710
#3  0x563b0c9a514f in ExecScan (node=node@entry=0x563b0e6a8038, 
accessMtd=accessMtd@entry=0x563b0c9bc910 , 
recheckMtd=recheckMtd@entry=0x563b0c9bc900 ) at execScan.c:220
#4  0x563b0c9bc9c3 in ExecSeqScan (node=node@entry=0x563b0e6a8038) at 
nodeSeqscan.c:127
#5  0x563b0c99d6e8 in ExecProcNode (node=node@entry=0x563b0e6a8038) at 
execProcnode.c:419
#6  0x563b0c9995be in ExecutePlan (dest=0x563b0e67da40, 
direction=, numberTuples=0, sendTuples=, 
operation=CMD_SELECT, use_parallel_mode=, 
planstate=0x563b0e6a8038, estate=0x563b0e6a77f8) at execMain.c:1567
#7  standard_ExecutorRun (queryDesc=0x563b0e6a2828, direction=, 
count=0) at execMain.c:338
#8  0x563b0c99c911 in ParallelQueryMain (seg=, 
toc=0x7f55173aa000) at execParallel.c:745
#9  0x563b0c898b02 in ParallelWorkerMain (main_arg=) at 
parallel.c:1108
#10 0x563b0ca49cad in StartBackgroundWorker () at bgworker.c:726
#11 0x563b0ca55770 in do_start_bgworker (rw=0x563b0e621080) at 
postmaster.c:5535
#12 maybe_start_bgworker () at postmaster.c:5710
#13 0x563b0ca56238 in sigusr1_handler (postgres_signal_arg=) 
at postmaster.c:4959
#14 
#15 0x7f5516788293 in __select_nocancel () at 
../sysdeps/unix/syscall-template.S:84
#16 0x563b0c818249 in ServerLoop () at postmaster.c:1665
#17 0x563b0ca577e2 in PostmasterMain (argc=3, argv=0x563b0e5fa490) at 
postmaster.c:1309
#18 0x563b0c819f6d in main (argc=3, argv=0x563b0e5fa490) at main.c:228


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [sqlsmith] Failed assertion in parallel worker in ExecInitSubPlan

2016-11-20 Thread Andreas Seltenreich
Hi,

the query below triggers a parallel worker assertion for me when run on
the regression database of master as of 0832f2d.  The plan sports a
couple of InitPlan nodes below Gather.

regards,
Andreas

 Gather  (cost=1.64..84.29 rows=128 width=4)
   Workers Planned: 1
   Single Copy: true
   ->  Limit  (cost=1.64..84.29 rows=128 width=4)
 ->  Subquery Scan on subq_0  (cost=1.64..451.06 rows=696 width=4)
   Filter: (subq_0.c6 IS NOT NULL)
   ->  Nested Loop Left Join  (cost=1.64..444.07 rows=699 width=145)
 Join Filter: (sample_0.aa = sample_1.pageno)
 InitPlan 4 (returns $3)
   ->  Result  (cost=1.21..5.36 rows=15 width=0)
 One-Time Filter: ($0 AND ($1 = $2))
 InitPlan 1 (returns $0)
   ->  Result  (cost=0.00..0.00 rows=0 width=0)
 One-Time Filter: false
 InitPlan 2 (returns $1)
   ->  Limit  (cost=0.35..0.52 rows=1 width=4)
 ->  Gather  (cost=0.00..1.04 rows=6 
width=4)
   Workers Planned: 1
   ->  Parallel Seq Scan on reltime_tbl 
 (cost=0.00..1.04 rows=4 width=4)
 InitPlan 3 (returns $2)
   ->  Limit  (cost=0.52..0.69 rows=1 width=4)
 ->  Gather  (cost=0.00..1.04 rows=6 
width=4)
   Workers Planned: 1
   ->  Parallel Seq Scan on reltime_tbl 
reltime_tbl_1  (cost=0.00..1.04 rows=4 width=4)
 ->  Sample Scan on pg_foreign_data_wrapper 
sample_2  (cost=1.21..5.36 rows=15 width=0)
   Sampling: system ('3.1'::real)
 ->  Nested Loop  (cost=0.15..382.85 rows=699 width=4)
   ->  Sample Scan on pg_largeobject sample_1  
(cost=0.00..209.03 rows=699 width=8)
 Sampling: bernoulli ('2.9'::real)
 Filter: (pageno IS NOT NULL)
   ->  Index Only Scan using 
pg_foreign_table_relid_index on pg_foreign_table ref_0  (cost=0.15..0.24 rows=1 
width=4)
 Index Cond: (ftrelid = sample_1.loid)
 ->  Materialize  (cost=0.00..16.06 rows=4 width=4)
   ->  Append  (cost=0.00..16.04 rows=4 width=4)
 ->  Sample Scan on a sample_0  
(cost=0.00..4.01 rows=1 width=4)
   Sampling: system ('5'::real)
 ->  Sample Scan on b sample_0_1  
(cost=0.00..4.01 rows=1 width=4)
   Sampling: system ('5'::real)
 ->  Sample Scan on c sample_0_2  
(cost=0.00..4.01 rows=1 width=4)
   Sampling: system ('5'::real)
 ->  Sample Scan on d sample_0_3  
(cost=0.00..4.01 rows=1 width=4)
   Sampling: system ('5'::real)


--8<---cut here---start->8---
set force_parallel_mode to on;
set max_parallel_workers_per_gather to 2;

select
  91 as c0
from
  (select
(select pfname from public.pslot limit 1 offset 3)
   as c0,
ref_1.grandtot as c1,
(select pg_catalog.min(procost) from pg_catalog.pg_proc)
   as c2,
ref_0.ftoptions as c3,
ref_1.grandtot as c4,
sample_1.loid as c5,
pg_catalog.pg_rotate_logfile() as c6,
(select random from public.hash_i4_heap limit 1 offset 5)
   as c7,
sample_1.loid as c8
  from
public.a as sample_0 tablesample system (5)
right join pg_catalog.pg_largeobject as sample_1 tablesample 
bernoulli (2.9)
  inner join pg_catalog.pg_foreign_table as ref_0
  on (sample_1.loid = ref_0.ftrelid )
on (sample_0.aa = sample_1.pageno )
  left join public.mvtest_tvv as ref_1
  on (EXISTS (
  select
  sample_2.fdwoptions as c0,
  sample_2.fdwhandler as c1,
  (select during from public.test_range_excl limit 1 offset 89)
 as c2
from
  pg_catalog.pg_foreign_data_wrapper as sample_2 tablesample 
system (3.1)
where (EXISTS (
select
sample_3.b as c0,
(select grantee from information_schema.udt_privileges 
limit 1 offset 4)
   as c1,
sample_3.b as c2,
sample_3.rf_a as c3,
sample_3.b as c4,
 

[HACKERS] [sqlsmith] Infinite recursion in bitshift

2016-10-14 Thread Andreas Seltenreich
Hi,

sqlsmith just found another crasher:

select bit '1' >> (-2^31)::int;

This is due to an integer overflow in bitshiftright()/bitshiftleft()
leading to them recursively calling each other.  Patch attached.

regards,
Andreas
>From cfdc425f75da268e1c2af08f936c59f34b69e577 Mon Sep 17 00:00:00 2001
From: Andreas Seltenreich <seltenre...@gmx.de>
Date: Fri, 14 Oct 2016 20:52:52 +0200
Subject: [PATCH] Fix possible infinite recursion on bitshift.

bitshiftright() and bitshiftleft() would recursively call each other
infinitely when the user passed a MIN_INT for the shift amount due to
an integer overflow.  This patch reduces a negative shift amount to
-VARBITMAXLEN to avoid overflow.
---
 src/backend/utils/adt/varbit.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/backend/utils/adt/varbit.c b/src/backend/utils/adt/varbit.c
index 75e6a46..d8ecfb6 100644
--- a/src/backend/utils/adt/varbit.c
+++ b/src/backend/utils/adt/varbit.c
@@ -1387,9 +1387,15 @@ bitshiftleft(PG_FUNCTION_ARGS)
 
 	/* Negative shift is a shift to the right */
 	if (shft < 0)
+	{
+		/* Protect against overflow */
+		if (shft < -VARBITMAXLEN)
+			shft = -VARBITMAXLEN;
+
 		PG_RETURN_DATUM(DirectFunctionCall2(bitshiftright,
 			VarBitPGetDatum(arg),
 			Int32GetDatum(-shft)));
+	}
 
 	result = (VarBit *) palloc(VARSIZE(arg));
 	SET_VARSIZE(result, VARSIZE(arg));
@@ -1447,9 +1453,15 @@ bitshiftright(PG_FUNCTION_ARGS)
 
 	/* Negative shift is a shift to the left */
 	if (shft < 0)
+	{
+		/* Protect against overflow */
+		if (shft < -VARBITMAXLEN)
+			shft = -VARBITMAXLEN;
+
 		PG_RETURN_DATUM(DirectFunctionCall2(bitshiftleft,
 			VarBitPGetDatum(arg),
 			Int32GetDatum(-shft)));
+	}
 
 	result = (VarBit *) palloc(VARSIZE(arg));
 	SET_VARSIZE(result, VARSIZE(arg));
-- 
2.9.3


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [sqlsmith] Infinite recursion in bitshift

2016-10-14 Thread Andreas Seltenreich
Tom Lane writes:

>> This is due to an integer overflow in bitshiftright()/bitshiftleft()
>> leading to them recursively calling each other.  Patch attached.
>
> Seems sane, though I wonder if it'd be better to use -INT_MAX rather
> than -VARBITMAXLEN.

I am undecided between those two.  -INT_MAX might be a more precise fix
for the problem, but the extra distance to the danger zone was kind of
soothing :-).

regards,
Andreas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >