Re: [HACKERS] PoC: Partial sort

2013-12-28 Thread David Rowley
On Sat, Dec 28, 2013 at 9:28 PM, Alexander Korotkov aekorot...@gmail.com wrote:

 On Tue, Dec 24, 2013 at 6:02 AM, Andreas Karlsson andr...@proxel.se wrote:
 Attached revision of patch implements partial sort usage in merge joins.



I'm looking forward to doing a bit of testing on this patch. I think it is
a really useful feature to get a bit more out of existing indexes.

I was about to test it tonight, but I'm having trouble getting the patch to
compile... I'm really wondering which compiler you are using, as it seems
you're declaring your variables in some strange places. See nodeSort.c
line 101: these variables are declared after an if statement in the same
scope, which isn't valid C89. (The patch did, however, apply without any
complaints.)

Here's a list of the errors I get when compiling with Visual Studio on
Windows.

D:\Postgres\c\pgsql.sln (default target) (1) ->
D:\Postgres\c\postgres.vcxproj (default target) (2) ->
(ClCompile target) ->
  src\backend\executor\nodeSort.c(101): error C2275: 'Sort' : illegal use
of this type as an expression [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(101): error C2065: 'plannode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(102): error C2275: 'PlanState' : illegal
use of this type as an expression [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(102): error C2065: 'outerNode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(103): error C2275: 'TupleDesc' : illegal
use of this type as an expression [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(103): error C2146: syntax error : missing
';' before identifier 'tupDesc' [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(103): error C2065: 'tupDesc' : undeclared
identifier [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(120): error C2065: 'outerNode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(121): error C2065: 'tupDesc' : undeclared
identifier [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(121): error C2065: 'outerNode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(125): error C2065: 'tupDesc' : undeclared
identifier [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(126): error C2065: 'plannode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(126): error C2223: left of '->numCols'
must point to struct/union [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(127): error C2065: 'plannode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(127): error C2223: left of '->sortColIdx'
must point to struct/union [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(128): error C2065: 'plannode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(128): error C2223: left of
'->sortOperators' must point to struct/union
[D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(129): error C2065: 'plannode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(129): error C2223: left of '->collations'
must point to struct/union [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(130): error C2065: 'plannode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(130): error C2223: left of '->nullsFirst'
must point to struct/union [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(132): error C2198: 'tuplesort_begin_heap'
: too few arguments for call [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(143): error C2065: 'outerNode' :
undeclared identifier [D:\Postgres\c\postgres.vcxproj]
  src\backend\executor\nodeSort.c(167): error C2065: 'tupDesc' : undeclared
identifier [D:\Postgres\c\postgres.vcxproj]

13 Warning(s)
24 Error(s)


Regards

David Rowley


[HACKERS] control to don't toast one new type

2013-12-28 Thread Mohsen SM
I have created a type based on varlena, and I want to prevent it from being
TOASTed. How should I control the length: do I need to limit the total
varlena size, or only the size of the varlena's data part?
Thanks.


Re: [HACKERS] [bug fix] connection service file doesn't take effect with ECPG apps

2013-12-28 Thread Michael Meskes
On Sat, Dec 28, 2013 at 04:37:44PM +0900, MauMau wrote:
 You can confirm it by adding the following code fragment to
 ecpglib/connect.c.  The attached file contains this.
 ...

Sorry for not being precise enough. I did run some tests and it works like a 
charm for me.

Or in other words, I used the connect command you had in your email with a
services file pointing to a local database and it connected to that database.
Instead of adding an additional output I checked the log output which suggested
that host was NULL. 

While I cannot see how your patch could do any harm, I'd still like to figure
out why I don't see the same behaviour.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at gmail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] ECPG app crashes due to SIGBUS on SPARC Solaris

2013-12-28 Thread Michael Meskes
On Sat, Dec 28, 2013 at 08:04:09AM +0900, MauMau wrote:
 OK, I'll run the ECPG regression test on Solaris without the patch.
 Please wait until Jan 6 2014 or so, because we've just entered new
 year holidays here in Japan.

Sure, we're in no particular hurry.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at gmail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL




Re: [HACKERS] WITHIN GROUP patch

2013-12-28 Thread Andrew Gierth
 Tom == Tom Lane t...@sss.pgh.pa.us writes:

 Tom I've committed this after significant editorialization --- most
 Tom notably, I pushed control of the sort step into the aggregate
 Tom support functions.

Initial tests suggest that your version is ~40% slower than ours on
some workloads.

On my system, this query takes ~950ms using our dev branch of the code,
and ~1050ms on git master (using \timing in psql for timings, and taking
the best of many consecutive runs):

select count(*)
  from (select percentile_disc(0.5) within group (order by i)
          from generate_series(1,3) i, generate_series(1,10) j group by j) s;

About ~700ms of that is overhead, as tested by running this query with
enable_hashagg=false:

select count(*)
  from (select j
          from generate_series(1,3) i, generate_series(1,10) j group by j) s;

So your version is taking 350ms for the percentile calculations
compared to 250ms for ours.

-- 
Andrew (irc:RhodiumToad)




Re: [HACKERS] [PATCH] Regression tests in windows ignore white space

2013-12-28 Thread Amit Kapila
On Thu, Dec 26, 2013 at 3:02 PM, David Rowley dgrowle...@gmail.com wrote:
 In the following thread I discovered that my new regression tests worked
 perfectly on windows, but when they were run on linux they failed.

 http://www.postgresql.org/message-id/CAApHDvo_YCiPYGDz07MpX9o6EGg=3mmyJTb0ysPTwoTg3c=t...@mail.gmail.com

 After looking at pg_regress to see which options it passes to diff I
 discovered that it passes -w on windows to ignore ALL white space.

 The attached simple patch changes this so that it only ignores carriage
 returns. It does this by passing --strip-trailing-cr to diff instead of -w.
 This should help us few developers who use Windows to get our white space
 correct in our expected results so that the tests also pass on non-Windows
 platforms.


I also faced a similar issue a few days back and tried to search a
bit for a simpler option, but couldn't spend too much time on it.
I think it would be good if we can make it work without the -w option.

When I tried your patch on Windows, it resulted in the following:

== running regression test queries==
test tablespace   ... diff: unrecognized option `--strip-trailing-cr'
diff: Try `diff --help' for more information.
diff command failed with status 2: diff --strip-trailing-cr
e:/../postgresql/src/test/regress/expected/tablespace.out
e:/../postgresql/src/test/regress/results/tablespace.out >
e:/../postgresql/src/test/regress/results/tablespace.out.diff


Which version of diff are you using?

The version of diff on my machine is:
diff - GNU diffutils version 2.7

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] [bug fix] connection service file doesn't take effect with ECPG apps

2013-12-28 Thread MauMau

From: Michael Meskes mes...@postgresql.org

Or in other words, I used the connect command you had in your email with a
services file pointing to a local database and it connected to that
database. Instead of adding an additional output I checked the log output
which suggested that host was NULL.


Your test case seems different from the one in my original mail.  In my test
case, I want to connect to a database on another machine, not the local one.
For example:


1. The ECPG app runs on a machine called client-host.
2. The database server to connect to runs on a machine called server-host.
3. There's no database server running on client-host.
4. The ECPG app uses the connection service file whose contents are:
[my_service]
host=server-host
... other necessary parameters like port, dbname, etc.

The app mistakenly tries to connect to the database server on the local
machine (client-host) instead of the desired server-host, and fails to
connect because no database server is running on client-host.  This is
because ECPGconnect() passes an empty string (""), not NULL, to
PQconnectdbParams().


Regards
MauMau








Re: [HACKERS] [PATCH] Regression tests in windows ignore white space

2013-12-28 Thread David Rowley
On Sun, Dec 29, 2013 at 2:29 AM, Amit Kapila amit.kapil...@gmail.com wrote:

 On Thu, Dec 26, 2013 at 3:02 PM, David Rowley dgrowle...@gmail.com
 wrote:
  In the following thread I discovered that my new regression tests worked
  perfectly on windows, but when they were run on linux they failed.
 
 
 http://www.postgresql.org/message-id/CAApHDvo_YCiPYGDz07MpX9o6EGg=3mmyJTb0ysPTwoTg3c=t...@mail.gmail.com
 
  After looking at pg_regress to see which options it passes to diff I
  discovered that it passes -w on windows to ignore ALL white space.
 
  The attached simple patch changes this so that it only ignores carriage
  returns. It does this by passing --strip-trailing-cr to diff instead of
 -w.
  This should help us few developers who use Windows to get our white space
  correct in our expected results so that the tests also pass on non-Windows
  platforms.
 

 I also faced a similar issue a few days back and tried to search a
 bit for a simpler option, but couldn't spend too much time on it.
 I think it would be good if we can make it work without the -w option.

 When I tried your patch on Windows, it resulted in the following:

 == running regression test queries==
 test tablespace   ... diff: unrecognized option `--strip-trailing-cr'
 diff: Try `diff --help' for more information.
 diff command failed with status 2: diff --strip-trailing-cr
 e:/../postgresql/src/test/regress/expected/tablespace.out
 e:/../postgresql/src/test/regress/results/tablespace.out >
 e:/../postgresql/src/test/regress/results/tablespace.out.diff

 Which version of diff are you using?

 The version of diff on my machine is:
 diff - GNU diffutils version 2.7


I had a bit of a look around on the git repository for diffutils and I've
found at least part of the commit which introduced --strip-trailing-cr

http://git.savannah.gnu.org/cgit/diffutils.git/commit/?id=eefb9adae1642dcb0e2ac523c79998f466e94e77

Although here I can only see that they've added the command line arguments
and not the actual code which implements stripping the carriage returns.
Going by that, it seems this was added back in 2001, but the version you're
using is a bit older than that; it seems 2.7 is 19 years old!

http://git.savannah.gnu.org/cgit/diffutils.git/refs/tags

I'm on 2.8.7 which is only 4 years old.

I know we don't normally go with the bleeding edge, but I'm wondering if we
could make the minimum supported diff version at least 2.8 (at least for
Windows), which was released in 2002.


Regards

David Rowley


 With Regards,
 Amit Kapila.
 EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] truncating pg_multixact/members

2013-12-28 Thread Kevin Grittner
Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 1. slru.c doesn't consider file names longer than 4 hexadecimal chars.

 Fixing (1) is simple: we can have each SLRU user declare how many digits
 to have in file names.  All existing users but pg_multixact/members
 should declare 4 digits; that one should declare 5.  That way, the
 correct number of zeroes are allocated at the start point and we get
 nice, equal-width file names.  Eventually, predicate.c can change to
 wider file names and get rid of some strange code it has to deal with
 overrun.

That would be nice.

There would be the issue of how to deal with pg_upgrade, though. If
I remember correctly, there is no strong reason not to blow away
any existing files in the pg_serial subdirectory at startup (the
way NOTIFY code does), and at one point I had code to do that.  I
think we took that code out because the files would be deleted
soon enough anyway.  Barring objection, deleting them at startup
seems like a sane way to handle pg_upgrade issues when we do
increase the filename size.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] planner missing a trick for foreign tables w/OR conditions

2013-12-28 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Dec 17, 2013 at 12:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 So at this point I'm pretty much talked into it.  We could eliminate the
 dependence on indexes entirely, and replace this code with a step that
 simply tries to pull single-base-relation quals out of ORs wherever it can
 find one.  You could argue that the produced quals would sometimes not be
 worth testing for, but we could apply a heuristic that says to forget it
 unless the estimated selectivity of the extracted qual is less than,
 I dunno, 0.5 maybe.

 This is an area where I think things are very different from local and
 remote tables.  For a local table, the cost of transmitting a row from
 one plan node to another is basically zero.  For a remote table, it's
 potentially far higher, although unfortunately it's hard to know
 exactly.  But if the qual is cheap to evaluate, and we're getting back
 a lot of rows, I suspect even eliminating 5-10% of them could work out
 to a win.  With a local table, 50% sounds like a reasonable number.

I used 90% as a cutoff in the attached draft patch.  Not sure if it's
worth expending a lot of sweat on the point.

Preliminary testing of this makes it look like a good idea.  In
particular, it will now extract restriction clauses from ORs even if
those clauses have nothing to do with indexes, as exhibited in the
second new regression test case.  (I was unhappy to find that there were
no regression tests covering this behavior at all; though I guess that's
not so surprising given that orindxpath.c dates from before we could put
EXPLAIN output into the regression tests.)

A cosmetic issue that has to be resolved is where to put the new code.
orindxpath.c is clearly now a misnomer; in fact, I don't think that the
code in this form belongs in optimizer/path/ at all.  There's some
argument for putting it in initsplan.c, but that file is pretty big
already.  Maybe a new file in optimizer/util/?  Any thoughts on that?

Also, there's some janitorial cleanup work that remains to be done.
I believe that make_restrictinfo_from_bitmapqual is now dead code,
and generate_bitmap_or_paths could now be static-ified and probably
simplified by dropping its restriction_only option.  I've not investigated
in detail what we could get rid of once create_or_index_quals is gone.

regards, tom lane

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 96fe50f..88a566e 100644
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
*** set_plain_rel_size(PlannerInfo *root, Re
*** 362,378 
  
  	/* Mark rel with estimated output rows, width, etc */
  	set_baserel_size_estimates(root, rel);
- 
- 	/*
- 	 * Check to see if we can extract any restriction conditions from join
- 	 * quals that are OR-of-AND structures.  If so, add them to the rel's
- 	 * restriction list, and redo the above steps.
- 	 */
- 	if (create_or_index_quals(root, rel))
- 	{
- 		check_partial_indexes(root, rel);
- 		set_baserel_size_estimates(root, rel);
- 	}
  }
  
  /*
--- 362,367 
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 606734a..0e05107 100644
*** a/src/backend/optimizer/path/indxpath.c
--- b/src/backend/optimizer/path/indxpath.c
*** choose_bitmap_and(PlannerInfo *root, Rel
*** 1354,1360 
  	 * we can remove this limitation.  (But note that this also defends
  	 * against flat-out duplicate input paths, which can happen because
  	 * match_join_clauses_to_index will find the same OR join clauses that
! 	 * create_or_index_quals has pulled OR restriction clauses out of.)
  	 *
  	 * For the same reason, we reject AND combinations in which an index
  	 * predicate clause duplicates another clause.	Here we find it necessary
--- 1354,1361 
  	 * we can remove this limitation.  (But note that this also defends
  	 * against flat-out duplicate input paths, which can happen because
  	 * match_join_clauses_to_index will find the same OR join clauses that
! 	 * extract_restriction_or_clauses has pulled OR restriction clauses out
! 	 * of.)
  	 *
  	 * For the same reason, we reject AND combinations in which an index
  	 * predicate clause duplicates another clause.	Here we find it necessary
diff --git a/src/backend/optimizer/path/orindxpath.c b/src/backend/optimizer/path/orindxpath.c
index 16f29d3..5a8d73b 100644
*** a/src/backend/optimizer/path/orindxpath.c
--- b/src/backend/optimizer/path/orindxpath.c
***
*** 15,187 
  
  #include "postgres.h"
  
  #include "optimizer/cost.h"
  #include "optimizer/paths.h"
  #include "optimizer/restrictinfo.h"
  
  
! /*--
!  * create_or_index_quals
   *	  Examine join OR-of-AND quals to see if any useful restriction OR
   *	  clauses can be extracted.  If so, add them to the query.
   *
!  * Although a join clause must reference other relations overall,
!  * an OR of ANDs 

Re: [HACKERS] [BUG FIX] Version number expressed in octal form by mistake

2013-12-28 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 Oh, I just noticed that this is for the *pg_restore* code, not the
 pg_dump code, so there isn't necessarily a conflict with the docs.
 The pg_dump code does match the docs on its version check. The
 question becomes, for each supported version, what do we want to
set into AHX->minRemoteVersion before opening the connection to the
 target database?  Do we really want a 9.4 executable to be
 attempting to restore to a 7.1 database cluster?  What about
 backpatching?

On reflection, I'm not sure that pg_restore as such should be applying any
server version check at all.  pg_restore itself has precious little to do
with whether there will be a compatibility problem; that's mostly down to
the DDL that pg_dump put into the archive file.  And we don't have enough
information to be very sure about whether it will work, short of actually
trying it.  So why should the code arbitrarily refuse to try?

So I'm inclined to propose that we set min/max to 0 and 99 here.

regards, tom lane




Re: [HACKERS] truncating pg_multixact/members

2013-12-28 Thread Alvaro Herrera
Kevin Grittner wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 
  1. slru.c doesn't consider file names longer than 4 hexadecimal chars.
 
  Fixing (1) is simple: we can have each SLRU user declare how many digits
  to have in file names.  All existing users but pg_multixact/members
  should declare 4 digits; that one should declare 5.  That way, the
  correct number of zeroes are allocated at the start point and we get
  nice, equal-width file names.  Eventually, predicate.c can change to
  wider file names and get rid of some strange code it has to deal with
  overrun.
 
 That would be nice.
 
 There would be the issue of how to deal with pg_upgrade, though. If
 I remember correctly, there is no strong reason not to blow away
 any existing files in the pg_serial subdirectory at startup (the
 way NOTIFY code does), and at one point I had code to do that.  I
 think we took that code out because the files would be deleted
 soon enough anyway.  Barring objection, deleting them at startup
 seems like a sane way to handle pg_upgrade issues when we do
 increase the filename size.

Agreed.  It's easy to have the files deleted at startup now that the
truncation stuff uses a callback.  There is already a callback that's
used to delete all files, so you won't need to write any code to make it
behave that way.

FWIW for pg_multixact/members during pg_upgrade from 9.3 to 9.4 we will
need to rename existing files, prepending a zero to each file whose name
is four chars in length.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] trailing comment ghost-timing

2013-12-28 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-12-24 12:27:59 -0500, Tom Lane wrote:
 What I was proposing was that we do include comments in what we send,
 as long as those comments are embedded in the statement text, not
 on lines before it.

 The common way I've seen what I've described above done as is something
 like:
 /* File:path/to/some/file.whatever Function:foo::something()
  * Comment: Expensive, but that's ok!
  */
 SELECT here_comes FROM my WHERE some_sql;

 If I understood what you propose correctly, we'd now drop that comment,
 whereas we sent it before?

Yeah.  But I just noticed that this would cause the regression test
results to change dramatically --- right now, most -- comment comments
get echoed to the output, and that would stop.  So maybe it's not such
a great idea after all.

regards, tom lane




Re: [HACKERS] Fix typo in src/backend/utils/mmgr/README

2013-12-28 Thread Peter Eisentraut
On Wed, 2013-12-25 at 17:50 +0900, Etsuro Fujita wrote:
 Peter Eisentraut wrote:
  On 12/24/13, 1:33 AM, Etsuro Fujita wrote:
   This is a small patch to fix a typo in src/backend/utils/mmgr/README.
 
  I don't think that change is correct.
 
 I've fixed the patch, though it might still be wrong.  I'm not a native
 English speaker ;).  Please find attached an updated version of the patch.

Committed.





Re: [HACKERS] [PATCH] work_mem calculation possible overflow

2013-12-28 Thread Tom Lane
David Rowley dgrowle...@gmail.com writes:
 I was just looking through a few of the warnings flagged up by PVS Studio.
 I found some warnings around some calculations that were doing work_mem *
 1024L and comparing that to a double. On windows 64 sizeof(long) is 4 bytes.
 Currently work_mem's maximum value is INT_MAX / 1024, so this should not
 overflow on windows 64 at the moment, but perhaps if work_mem's maximum is
 raised in the future then it will. In any case, the L suffix on 1024 to
 widen the type here just seems a bit wrong, given that PostgreSQL supports
 platforms where sizeof(int) and sizeof(long) are the same.

This is not a bug.  Note the limitation on the allowed range of work_mem
in guc.c; that's designed precisely to ensure that work_mem * 1024L will
not overflow a long.  That's perhaps a bit klugy, but if we wanted to
remove that assumption we'd have to touch far more places than what you
have done here, and most of them would require much uglier changes.

regards, tom lane




[HACKERS] pgcrypto missing header file inclusions

2013-12-28 Thread Peter Eisentraut
While playing around with the pginclude tools, I noticed that pgcrypto
header files are failing to include some header files whose symbols they
use.  This change would fix it:

diff --git a/contrib/pgcrypto/pgp.h b/contrib/pgcrypto/pgp.h
index 3022abf..f856e07 100644
--- a/contrib/pgcrypto/pgp.h
+++ b/contrib/pgcrypto/pgp.h
@@ -29,6 +29,9 @@
  * contrib/pgcrypto/pgp.h
  */
 
+#include "mbuf.h"
+#include "px.h"
+
 enum PGP_S2K_TYPE
 {
PGP_S2K_SIMPLE = 0,

Does that look reasonable?






Re: [HACKERS] Planning time in explain/explain analyze

2013-12-28 Thread Andreas Karlsson
New version of the patch, with updated documentation, which does not
display the planning time when COSTS is off.


I will add it to the next commitfest.

--
Andreas Karlsson
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
new file mode 100644
index 2af1738..240a3d5
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
*** EXPLAIN SELECT * FROM tenk1;
*** 89,94 
--- 89,95 
   QUERY PLAN
  -
   Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
+  Planning time: 0.113 ms
 </screen>
</para>
  
*** EXPLAIN SELECT * FROM tenk1;
*** 162,167 
--- 163,175 
</para>
 
<para>
+ The <literal>Planning time</literal> shown is the time it took to generate
+ the query plan from the parsed query and optimize it. It does not include
+ rewriting and parsing. We will not include this line in later examples in
+ this section.
+</para>
+ 
+<para>
 Returning to our example:
  
  screen
*** WHERE t1.unique1 &lt; 10 AND t1.unique2
*** 564,569 
--- 572,578 
   Index Cond: (unique1 &lt; 10)
   -&gt;  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..7.91 rows=1 width=244) (actual time=0.021..0.022 rows=1 loops=10)
     Index Cond: (unique2 = t1.unique2)
+  Planning time: 0.181 ms
   Total runtime: 0.501 ms
 </screen>
 
*** WHERE t1.unique1 &lt; 100 AND t1.unique2
*** 612,617 
--- 621,627 
   Recheck Cond: (unique1 &lt; 100)
   -&gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0) (actual time=0.049..0.049 rows=100 loops=1)
     Index Cond: (unique1 &lt; 100)
+  Planning time: 0.194 ms
   Total runtime: 8.008 ms
 </screen>
 
*** EXPLAIN ANALYZE SELECT * FROM tenk1 WHER
*** 635,640 
--- 645,651 
   Seq Scan on tenk1  (cost=0.00..483.00 rows=7000 width=244) (actual time=0.016..5.107 rows=7000 loops=1)
     Filter: (ten &lt; 7)
     Rows Removed by Filter: 3000
+  Planning time: 0.083 ms
   Total runtime: 5.905 ms
 </screen>
 
*** EXPLAIN ANALYZE SELECT * FROM polygon_tb
*** 657,662 
--- 668,674 
   Seq Scan on polygon_tbl  (cost=0.00..1.05 rows=1 width=32) (actual time=0.044..0.044 rows=0 loops=1)
     Filter: (f1 @&gt; '((0.5,2))'::polygon)
     Rows Removed by Filter: 4
+  Planning time: 0.040 ms
   Total runtime: 0.083 ms
 </screen>
 
*** EXPLAIN ANALYZE SELECT * FROM polygon_tb
*** 675,680 
--- 687,693 
   Index Scan using gpolygonind on polygon_tbl  (cost=0.13..8.15 rows=1 width=32) (actual time=0.062..0.062 rows=0 loops=1)
     Index Cond: (f1 @&gt; '((0.5,2))'::polygon)
     Rows Removed by Index Recheck: 1
+  Planning time: 0.034 ms
   Total runtime: 0.144 ms
 </screen>
 
*** EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM
*** 705,710 
--- 718,724 
   -&gt;  Bitmap Index Scan on tenk1_unique2  (cost=0.00..19.78 rows=999 width=0) (actual time=0.227..0.227 rows=999 loops=1)
     Index Cond: (unique2 &gt; 9000)
     Buffers: shared hit=5
+  Planning time: 0.088 ms
   Total runtime: 0.423 ms
 </screen>
 
*** EXPLAIN ANALYZE UPDATE tenk1 SET hundred
*** 732,737 
--- 746,752 
   Recheck Cond: (unique1 &lt; 100)
   -&gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0) (actual time=0.043..0.043 rows=100 loops=1)
     Index Cond: (unique1 &lt; 100)
+  Planning time: 0.079 ms
   Total runtime: 14.727 ms
 
 ROLLBACK;
*** EXPLAIN ANALYZE SELECT * FROM tenk1 WHER
*** 817,822 
--- 832,838 
   Index Cond: (unique2 &gt; 9000)
   Filter: (unique1 &lt; 100)
   Rows Removed by Filter: 287
+  Planning time: 0.096 ms
   Total runtime: 0.336 ms
 </screen>
  
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
new file mode 100644
index 35264dc..4e3f571
*** a/doc/src/sgml/ref/explain.sgml
--- b/doc/src/sgml/ref/explain.sgml
*** ROLLBACK;
*** 145,151 
  <para>
    Include information on the estimated startup and total cost of each
    plan node, as well as the estimated number of rows and the estimated
!   width of each row.  This parameter defaults to <literal>TRUE</literal>.
   </para>
  </listitem>
 </varlistentry>
--- 145,152 
  <para>
    Include information on the estimated startup and total cost of each
    plan node, as well as the estimated number of rows and the estimated
!   width of each row.  Additionally it controls if the planning time is
!   displayed.  This parameter defaults to <literal>TRUE</literal>.
   </para>
  </listitem>
 </varlistentry>
*** EXPLAIN SELECT * FROM foo;
*** 289,295 
 QUERY PLAN
  -
   

[HACKERS] Bogus error handling in pg_upgrade

2013-12-28 Thread Tom Lane
A credulous person might suppose that this chunk of code is designed
to abort if pg_resetxlog fails:

prep_status("Setting next transaction ID for new cluster");
exec_prog(UTILITY_LOG_FILE, NULL, true,
          "\"%s/pg_resetxlog\" -f -x %u \"%s\"",
          new_cluster.bindir, old_cluster.controldata.chkpnt_nxtxid,
          new_cluster.pgdata);
check_ok();

In point of fact, it does no such thing, but blithely continues
(even though pg_resetxlog has corrupted things horribly before failing).

check_ok() is particularly badly named, since it contains not one iota
of error checking.  misleadingly_claim_ok() would be a better name.

If this isn't broken-by-design, I'd like an explanation why not.

In case you're wondering, I'm investigating the problem mentioned
at 1387636762.30013.13.ca...@vanquo.pezone.net.  I see this output:

Performing Upgrade
--
Analyzing all rows in the new cluster   ok
Freezing all rows on the new clusterok
Deleting files from new pg_clog ok
Copying old pg_clog to new server   ok
Setting next transaction ID for new cluster 
*failure*

Consult the last few lines of pg_upgrade_utility.log for
the probable cause of the failure.
ok
Deleting files from new pg_multixact/offsetsok
Copying old pg_multixact/offsets to new server  ok
Deleting files from new pg_multixact/membersok
Copying old pg_multixact/members to new server  ok
Setting next multixact ID and offset for new clusterok
Resetting WAL archives  
*failure*

Consult the last few lines of pg_upgrade_utility.log for
the probable cause of the failure.
ok


*failure*
Consult the last few lines of pg_upgrade_server_start.log or 
pg_upgrade_server.log for
the probable cause of the failure.

connection to database failed: could not connect to server: No such file or 
directory
Is the server running locally and accepting
connections on Unix domain socket
"/Users/tgl/pgsql/contrib/pg_upgrade/.s.PGSQL.57632"?


could not connect to new postmaster started with the command:
/Users/tgl/pgsql/contrib/pg_upgrade/tmp_check/install//Users/tgl/testversion/bin/pg_ctl
 -w -l pg_upgrade_server.log -D 
/Users/tgl/pgsql/contrib/pg_upgrade/tmp_check/data -o -p 57632 -b -c 
synchronous_commit=off -c fsync=off -c full_page_writes=off  -c 
listen_addresses='' -c unix_socket_permissions=0700 -c 
unix_socket_directories='/Users/tgl/pgsql/contrib/pg_upgrade' start
Failure, exiting
make: *** [check] Error 1

I think the actual problem is that pg_resetxlog rewrites pg_control, zaps
everything in pg_xlog/, and then fails before writing a new initial xlog
segment.  However, pg_upgrade isn't making this any easier to investigate
by failing to stop at the first sign of trouble.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-28 Thread Amit Kapila
On Tue, Dec 24, 2013 at 11:38 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Sun, Dec 22, 2013 at 8:32 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Sun, Dec 22, 2013 at 3:01 PM, Fujii Masao masao.fu...@gmail.com wrote:
 I found a bug in the ALTER SYSTEM SET patch. The procedure to reproduce
 it is here.

 $ psql
 =# ALTER SYSTEM SET shared_buffers = '512MB';
 ALTER SYSTEM
 =# \q
 $ pg_ctl -D data reload
 server signaled
 2013-12-22 18:24:13 JST LOG:  received SIGHUP, reloading configuration 
 files
 2013-12-22 18:24:13 JST LOG:  parameter shared_buffers cannot be
 changed without restarting the server
 2013-12-22 18:24:13 JST LOG:  configuration file X?? contains
 errors; unaffected changes were applied


 Your analysis is absolutely right.
 The reason for this issue is that we want to display the filename to which
 the erroneous parameter belongs, but in this case we free the parameter info
 before the actual error is reported.
 To fix it, we need to save the filename value before freeing the parameter list.

 Please find the attached patch to fix this issue.

 When I read ProcessConfigFile() more carefully, I found another related
 problem. The procedure to reproduce the problem is here.

 
 $ psql
 =# ALTER SYSTEM SET shared_buffers = '256MB';
 =# \q

 $ echo shared_buffers = '256MB' >> $PGDATA/postgresql.conf
 $ pg_ctl -D $PGDATA reload
 2013-12-25 03:05:44 JST LOG:  parameter shared_buffers cannot be
 changed without restarting the server
 2013-12-25 03:05:44 JST LOG:  parameter shared_buffers cannot be
 changed without restarting the server
 2013-12-25 03:05:44 JST LOG:  configuration file
 /dav/alter-system/data/postgresql.auto.conf contains errors;
 unaffected changes were applied
 

 Both postgresql.conf and postgresql.auto.conf contain the setting of
 the parameter that cannot be changed without restarting the server.
 But only postgresql.auto.conf was logged, which could be confusing,
 I'm afraid.

The reason for the above behaviour is that, while applying configuration
parameters, we note down only the last file containing an error and ignore the
others. If we just want the above case to display both files, then during
parameter processing we can save the errorconffile whenever it differs from the
previous one. However, if the user employs the include mechanism of conf files
and puts some of the changed parameters in a new file, the same behaviour can
be observed again.
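A minimal sketch of that idea (hypothetical names only, not the actual ProcessConfigFile() code): record every distinct file that produced an error, so the final "contains errors" message can name all of them instead of only the last.

```c
#include <assert.h>
#include <string.h>

/*
 * Sketch: remember every distinct file in which an erroneous parameter
 * was found, instead of overwriting a single "last error file" pointer.
 */
#define MAX_ERR_FILES 8

static const char *err_files[MAX_ERR_FILES];
static int	num_err_files = 0;

static void
note_error_file(const char *fname)
{
	for (int i = 0; i < num_err_files; i++)
		if (strcmp(err_files[i], fname) == 0)
			return;				/* already recorded */
	if (num_err_files < MAX_ERR_FILES)
		err_files[num_err_files++] = fname;
}
```

Duplicate reports for the same file (the shared_buffers case above) then collapse to one entry per file.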

I think the basic idea here is that users should avoid mixing multiple methods
of changing configuration parameters, although we cannot stop them from doing
so. In cases like the above, mixing methods simply adds work for the user in
understanding and locating the error.

A similar behaviour can also be observed when the include mechanism is used to
set some config parameters, even without any ALTER SYSTEM changes.

I think that this problem should be fixed together with the
 problem that I reported before. Thought?

Here I think you might be worried that fixing the behaviour reported in this
mail could overlap with the changes in the previous fix. So let us first decide
whether we want to keep the current behaviour as it is or change it, and if it
needs changing, what the new behaviour should be.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-28 Thread Amit Kapila
On Fri, Dec 27, 2013 at 5:01 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Dec 24, 2013 at 10:08 AM, Fujii Masao masao.fu...@gmail.com wrote:
 When I read ProcessConfigFile() more carefully, I found another related
 problem. The procedure to reproduce the problem is here.

 
 $ psql
 =# ALTER SYSTEM SET shared_buffers = '256MB';
 =# \q

 $ echo shared_buffers = '256MB' >> $PGDATA/postgresql.conf
 $ pg_ctl -D $PGDATA reload
 2013-12-25 03:05:44 JST LOG:  parameter shared_buffers cannot be
 changed without restarting the server
 2013-12-25 03:05:44 JST LOG:  parameter shared_buffers cannot be
 changed without restarting the server
 2013-12-25 03:05:44 JST LOG:  configuration file
 /dav/alter-system/data/postgresql.auto.conf contains errors;
 unaffected changes were applied
 

 Both postgresql.conf and postgresql.auto.conf contain the setting of
 the parameter that cannot be changed without restarting the server.
 But only postgresql.auto.conf was logged, which could be confusing,
 I'm afraid. I think that this problem should be fixed together with the
 problem that I reported before. Thought?

 I have run into this problem on many occasions because my benchmark
 scripts usually append a hunk of stuff to postgresql.conf, and any
 parameters that were already set generate this warning every time you
 reload, because postgresql.conf now mentions that parameter twice.
 I'm not quite sure what other problem you want to fix it together
 with,

The 2 problems are:

1. If the user changes a config parameter (e.g. shared_buffers) both by
manually editing postgresql.conf and by using the ALTER SYSTEM command, and
then reloads the config, the message for 'shared_buffers' is shown twice, but
only the last file in which it occurs is named. The detailed description of
the problem is in the current mail.

2. The other problem is that in some cases the file name displayed is junk;
the problem and fix can be found at this link:

http://www.postgresql.org/message-id/CAA4eK1+-yD2p=+6tdj_xpruhn4m_ckvfr51ah0gv3sxlgoa...@mail.gmail.com

and I'm not sure what the right fix is either, but +1 for coming
 up with some solution that's better than what we have now.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2013-12-28 Thread Peter Geoghegan
Attached revision only uses heavyweight page locks across complex
operations. I haven't benchmarked it, but it appears to perform
reasonably well. I haven't attempted to measure a regression for
regular insertion, but offhand it seems likely that any regression
would be well within the noise - more or less immeasurably small. I
won't repeat too much of what is already well commented in the patch.
For those that would like a relatively quick summary of what I've
done, I include inline a new section that I've added to the nbtree
README:

Notes about speculative insertion
-

As an amcanunique AM, the btree implementation is required to support
speculative insertion.  This means that the value locking method
through which unique index enforcement conventionally occurs is
extended and generalized, such that insertion is staggered:  the core
code attempts to get full consensus on whether values proposed for
insertion will not cause duplicate key violations.  Speculative
insertion is only possible for unique index insertion without deferred
uniqueness checking (since speculative insertion into a deferred
unique constraint's index is a contradiction in terms).

For conventional unique index insertion, the Btree implementation
exclusive locks a buffer holding the first page that the value to be
inserted could possibly be on, though only for an instant, during and
shortly after uniqueness verification.  It would not be acceptable to
hold this lock across complex operations for the duration of the
remainder of the first phase of speculative insertion.  Therefore, we
convert this exclusive buffer lock to an exclusive page lock managed
by the lock manager, thereby greatly ameliorating the consequences of
undiscovered deadlocking implementation bugs (though deadlocks are not
expected), and minimizing the impact on system interruptibility, while
not affecting index scans.

It may be useful to informally think of the page lock type acquired by
speculative insertion as similar to an intention exclusive lock, a
type of lock found in some legacy 2PL database systems that use
multi-granularity locking.  A session establishes the exclusive right
to subsequently establish a full write lock, without actually blocking
reads of the page unless and until a lock conversion actually occurs,
at which point both reads and writes are blocked.  Under this mental
model, buffer shared locks can be thought of as intention shared
locks.

As implemented, these heavyweight locks are only relevant to the
insertion case; at no other point are they actually considered, since
insertion is the only way through which new values are introduced.
The first page a value proposed for insertion into an index could be
on represents a natural choke point for our extended, though still
rather limited system of value locking.  Naturally, when we perform a
lock escalation and acquire an exclusive buffer lock, all other
buffer locks on the same buffer are blocked, which is how the
implementation localizes knowledge about the heavyweight lock to
insertion-related routines.  Apart from deletion, which is
concomitantly prevented by holding a pin on the buffer throughout, all
exclusive locking of Btree buffers happen as a direct or indirect
result of insertion, so this approach is sufficient. (Actually, an
exclusive lock may still be acquired without insertion to initialize a
root page, but that hardly matters.)

Note that all value locks (including buffer pins) are dropped
immediately as speculative insertion is aborted, as the implementation
waits on the outcome of another xact, or as insertion proper occurs.
These page-level locks are not intended to last more than an instant.
In general, the protocol for heavyweight locking Btree pages is that
heavyweight locks are acquired before any buffer locks are held, while
the locks are only released after all buffer locks are released.
While not a hard and fast rule, presently we avoid heavyweight page
locking more than one page per unique index concurrently.
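The ordering rule in the last paragraph can be sketched as follows. The lock calls below are stand-ins that merely record events for checking, not the real lmgr/bufmgr routines:

```c
#include <assert.h>

/* Event log used to check the ordering invariant below. */
enum lock_event
{
	HW_ACQUIRE, BUF_ACQUIRE, BUF_RELEASE, HW_RELEASE
};

static enum lock_event events[4];
static int	num_events = 0;

static void
log_event(enum lock_event e)
{
	events[num_events++] = e;
}

/*
 * Protocol from the README: the heavyweight page lock is acquired before
 * any buffer lock is held, and released only after all buffer locks are
 * released.
 */
static void
speculative_insert_protocol(void)
{
	log_event(HW_ACQUIRE);		/* lmgr page lock (stand-in) */
	log_event(BUF_ACQUIRE);		/* exclusive buffer lock (stand-in) */
	/* ... uniqueness verification would happen here ... */
	log_event(BUF_RELEASE);
	log_event(HW_RELEASE);
}
```

Keeping the heavyweight lock's lifetime strictly enclosing the buffer lock's is what lets the buffer-lock escalation localize knowledge of the page lock to the insertion routines.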


Happy new year
-- 
Peter Geoghegan


btreelock_insert_on_dup.v5.2013_12_28.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] PoC: Partial sort

2013-12-28 Thread David Rowley
On Sun, Dec 29, 2013 at 4:51 AM, Alexander Korotkov aekorot...@gmail.com wrote:

 I've compiled it with clang. Yeah, there were mixed declarations. I've
 rechecked it with gcc, and now it gives no warnings. I didn't try it with
 visual studio, but I hope it will be OK.


Thanks for the patch. It now compiles without any problems.
I've been doing a bit of testing with the patch across a few different
workloads. One thing I've found is that in my test case, when the table
contains only 1 tuple for any given set of presort columns, the query is
actually slower than when there are, say, 100 tuples to sort for any given
presort group.

Here is my test case:

DROP TABLE IF EXISTS temperature_readings;

CREATE TABLE temperature_readings (
  readingid SERIAL NOT NULL,
  timestamp TIMESTAMP NOT NULL,
  locationid INT NOT NULL,
  temperature INT NOT NULL,
  PRIMARY KEY (readingid)
);

INSERT INTO temperature_readings (timestamp,locationid,temperature)
SELECT ts.timestamp, loc.locationid, -10 + random() * 40
FROM generate_series('1900-04-01','2000-04-01','1 day'::interval)
ts(timestamp)
CROSS JOIN generate_series(1,1) loc(locationid);

VACUUM ANALYZE temperature_readings;

-- Warm buffers
SELECT AVG(temperature) FROM temperature_readings;

explain (buffers, analyze) select * from temperature_readings order by
timestamp,locationid; -- (seqscan -> sort) 70.805ms

-- create an index to allow presorting on timestamp.
CREATE INDEX temperature_readings_timestamp_idx ON temperature_readings
(timestamp);

-- warm index buffers
SELECT COUNT(*) FROM (SELECT timestamp FROM temperature_readings ORDER BY
timestamp) c;

explain (buffers, analyze) select * from temperature_readings order by
timestamp,locationid; -- (index scan -> partial sort) 253.032ms

The first query without the index to presort on runs in 70.805 ms, the 2nd
query uses the index to presort and runs in 253.032 ms.

I ran the code through a performance profiler and found that about 80% of
the time is spent in tuplesort_end and tuplesort_begin_heap.

If it were possible to devise some way to reuse the previous tuplesortstate,
perhaps by inventing a reset method which clears out the tuples, then we could
see performance exceed the standard seqscan -> sort. As it stands, the code
seems to look up the sort functions from the syscache for each group and then
allocate some sort space, so quite a bit of time is also spent in palloc0()
and pfree().
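In miniature, a reset method of the kind suggested could look like this (a hypothetical plain-C model, not tuplesort.c's real API): the workspace allocated for the first group is kept across groups, and reset only forgets the tuples, so no per-group comparator lookup or palloc/pfree cycle is needed.

```c
#include <assert.h>
#include <stdlib.h>

/* Miniature model of a reusable per-group sorter (hypothetical API). */
typedef struct GroupSorter
{
	int		   *items;			/* workspace, reused across groups */
	size_t		count;
	size_t		capacity;
} GroupSorter;

static GroupSorter *
group_sorter_begin(size_t capacity)
{
	GroupSorter *gs = malloc(sizeof(GroupSorter));

	gs->items = malloc(capacity * sizeof(int));
	gs->count = 0;
	gs->capacity = capacity;
	return gs;
}

static void
group_sorter_put(GroupSorter *gs, int v)
{
	if (gs->count < gs->capacity)
		gs->items[gs->count++] = v;
}

static int
cmp_int(const void *a, const void *b)
{
	return (*(const int *) a > *(const int *) b) -
		(*(const int *) a < *(const int *) b);
}

static void
group_sorter_sort(GroupSorter *gs)
{
	qsort(gs->items, gs->count, sizeof(int), cmp_int);
}

/* The suggested "reset": forget the tuples, keep the workspace. */
static void
group_sorter_reset(GroupSorter *gs)
{
	gs->count = 0;
}
```

The point of the model: begin once per scan, reset once per presort group, rather than begin/end once per group.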

If that is not possible, then maybe adding a cost based on the number of sort
groups would be better, so that the optimization is skipped when there are too
many sort groups.

Regards

David Rowley






 --
 With best regards,
 Alexander Korotkov.



Re: [HACKERS] [COMMITTERS] pgsql: Upgrade to Autoconf 2.69

2013-12-28 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On Fri, 2013-12-20 at 10:54 -0300, Alvaro Herrera wrote:
 Evidently something is not going well in ReadRecord.  It should have
 reported the read failure, but didn't.  That seems a separate bug that
 needs fixed.

 This is enabling large-file support on OS X, so that seems kind of
 important.  It's not failing with newer versions of OS X, so that leaves
 the following possibilities, I think:

[ gets out ancient PPC laptop ... ]

[ man, this thing is slow ... ]

[ hours later ... ]

There are three or four different bugs here, but the key points are:

1. pg_resetxlog is diligently trashing every single WAL file in pg_xlog/,
and then failing (by virtue of some ancient OS X bug in readdir()), so
that it doesn't get to the point of recreating a WAL file with a
checkpoint record.

2. pg_resetxlog already rewrote pg_control, which might not be such a hot
idea; but whether it had or not, pg_control's last-checkpoint pointer
would be pointing to a WAL file that is not in pg_xlog/ now.

3. pg_upgrade ignores the fact that pg_resetxlog failed, and keeps going.

4. The server tries to start, and fails because it can't find a WAL file
containing the last checkpoint record.  This is pretty unsurprising given
the facts above.  The reason you don't see any "no such file" report is
that XLogFileRead() will report any BasicOpenFile() failure *other than*
ENOENT.  And nothing else makes up for that.

Re point 4: the logic, if you can call it that, in xlog.c and xlogreader.c
is making my head spin.  There are about four levels of overcomplicated
and undercommented code before you ever get down to XLogFileRead, so I
have no idea which level to blame for the lack of error reporting in this
specific case.  But there are pretty clearly some cases in which ignoring
ENOENT in XLogFileRead isn't such a good idea, and XLogFileRead isn't
being told when to do that or not.
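One way to make the decision explicit (a sketch with invented names; XLogFileRead's real interface differs) is to pass down whether the segment is expected to exist, so ENOENT is swallowed only when the caller says a missing file is normal:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Sketch with invented names: decide whether an open() failure on a WAL
 * segment deserves a report.  ENOENT is silently tolerated only when the
 * caller has said a missing segment is expected (e.g. when probing
 * successive timelines), never when the segment must exist.
 */
static bool
should_report_open_failure(int saved_errno, bool missing_ok)
{
	if (saved_errno == ENOENT && missing_ok)
		return false;			/* probing; silence is fine */
	return true;				/* everything else must be reported */
}
```

Under that rule, the last-checkpoint read above would have produced a "no such file" report instead of failing silently.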

Re point 3: I already bitched about that.

Re point 2: I wonder whether pg_resetxlog shouldn't do things in a
different order.  Rewriting pg_control to point to a file that doesn't
exist yet doesn't sound great.  I'm not sure though if there's any
great improvement to be had here; basically, if we fail partway
through, we're probably screwed no matter what.
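One ordering that narrows (though cannot eliminate) the window is to create the new WAL segment first and only then replace pg_control, using the usual write-temp-then-rename dance so the old control file survives a crash mid-write. A generic sketch of that dance (hypothetical names, not pg_resetxlog's actual code):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Generic atomic-replace sketch: write the new contents to a temporary
 * file, then rename() it over the target.  A crash partway through the
 * write leaves the old file intact; rename() swaps atomically.
 */
static int
atomic_replace(const char *path, const char *contents)
{
	char		tmp[1024];
	FILE	   *fp;

	snprintf(tmp, sizeof(tmp), "%s.tmp", path);
	fp = fopen(tmp, "w");
	if (fp == NULL)
		return -1;
	if (fputs(contents, fp) == EOF)
	{
		fclose(fp);
		return -1;
	}
	if (fclose(fp) == EOF)
		return -1;
	/* (the real thing would fsync() the file and its directory here) */
	return rename(tmp, path) == 0 ? 0 : -1;
}
```

This only helps with torn writes of pg_control itself; as noted above, a failure between trashing pg_xlog/ and recreating the checkpoint record still leaves the cluster screwed.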

Re point 1: there are threads in our archives suggesting that EINVAL
from readdir() after unlinking a directory entry is a known problem
on some versions of OS X, eg
http://www.postgresql.org/message-id/flat/47c45b07-8459-48d8-8fbe-291864019...@me.com
and I also find stuff on the intertubes suggesting that renaming an entry
can cause it too, eg
http://www.dovecot.org/list/dovecot/2009-July/041009.html

We thought at the time that the readdir bug was new in OS X 10.6, but I'll
bet it was there in 10.5's 64-bit-inode support and was only exposed to
default builds when 10.6 turned on 64-bit-inodes in userland by default.
So Apple fixed it in 10.6.2 but never back-patched the fix into 10.5.x ---
shame on them.

I'm disinclined to contort our stuff to any great extent to fix it,
because all the OS X versions mentioned are several years obsolete.
Perhaps though we should override Autoconf's setting of
_DARWIN_USE_64_BIT_INODE, if we can do that easily?  It's clearly
not nearly as problem-free on 10.5 as the Autoconf boys believe,
and it's already enabled by default on the release series where it
does work.

regards, tom lane

