Re: [HACKERS] invalid magic number in log segment

2013-12-11 Thread Heikki Linnakangas

On 12/11/2013 09:44 AM, Erikjan Rijkers wrote:

I don't know whether the below constitutes a bug, but:

Daily (sometimes even more often) I recompile 9.4devel (after git pull)  to run 
a large dev database (100 GB or so).

To avoid frequent initdb and many-hour-restore of data, I do this only when the 
following two #defines are unchanged:
   CATALOG_VERSION_NO in src/include/catalog/catversion.h, and
   PG_CONTROL_VERSION in src/include/catalog/pg_control.h

the goal being to always run the latest db, without having to reload the ~100 
GB dev db unexpectedly at inconvenient times.

Generally, this works OK.

However, in the last few weeks I sometimes get, after such recompiling, 'invalid 
magic number' errors from which I don't know
how to recover (it means, apparently, an initdb is needed and I then have to 
reload the database).


2013-12-11 00:15:25.627 CET 25304 LOG:  received smart shutdown request
2013-12-11 00:15:25.631 CET 25306 LOG:  shutting down
2013-12-11 00:15:25.904 CET 25306 LOG:  database system is shut down
2013-12-11 08:11:59.858 CET 25490 LOG:  database system was shut down at 
2013-12-11 00:15:25 CET
2013-12-11 08:11:59.901 CET 25490 LOG:  invalid magic number D078 in log 
segment 000100630034, offset 0
2013-12-11 08:11:59.901 CET 25490 LOG:  invalid primary checkpoint record
2013-12-11 08:11:59.901 CET 25490 LOG:  invalid magic number D078 in log 
segment 000100630034, offset 0
2013-12-11 08:11:59.901 CET 25490 LOG:  invalid secondary checkpoint record
2013-12-11 08:11:59.901 CET 25490 PANIC:  could not locate a valid checkpoint 
record
2013-12-11 08:12:00.326 CET 25492 FATAL:  the database system is starting up
2013-12-11 08:12:01.328 CET 25493 FATAL:  the database system is starting up
2013-12-11 08:12:01.682 CET 25489 LOG:  startup process (PID 25490) was 
terminated by signal 6: Aborted
2013-12-11 08:12:01.682 CET 25489 LOG:  aborting startup due to startup process 
failure


My question is two-fold:

1. (general:)  is this 'invalid magic number' unexpected, and should it 
always be reported?


The magic number it's complaining about is the constant stored in the 
WAL logs, that acts as a version number. Think of it as the 
CATALOG_VERSION_NO, but for the WAL log format. It's bumped whenever the 
WAL format changes.



2. (for my setup specifically:)  is there any way that I can recognize, 
beforehand, at the code base level, such an
impending 'invalid magic number' state?


Keep an eye on the XLOG_PAGE_MAGIC constant in xlog_internal.h.


Can the db be recovered from easily? (although this dev database is expendable, 
it takes many hours to rebuild; I'd like to
avoid that if possible).


Yes, you can use pg_resetxlog to recover.

BTW, you could also use pg_upgrade. That should also work when 
catversion has been bumped.


- Heikki




Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Mark Kirkwood

On 11/12/13 19:34, Simon Riggs wrote:


Realistically, I never heard of an Oracle DBA doing advanced
statistical mathematics before setting the sample size on ANALYZE. You
use the default and bump it up if the sample is insufficient for the
data.



I'm not sure that Oracle's stats and optimizer design is an example to 
be envied - pretty much all Oracle DBAs I've encountered will apply 
hints to all queries to get the plan they want...


Regards

Mark




Re: [HACKERS] Time-Delayed Standbys

2013-12-11 Thread Simon Riggs
On 11 December 2013 06:36, KONDO Mitsumasa
kondo.mitsum...@lab.ntt.co.jp wrote:

 I think this feature will be used in a lot of scenarios in
 which PITR is currently used.

 We have to judge which is better: gaining some potential use, or
 protecting against foolish use.
 And we had better wait for the author's comment...

I'd say just document that it wouldn't make sense to use it for PITR.

There may be some use case we can't see yet, so specifically
prohibiting a use case that is not dangerous seems too much at this
point. I will no doubt be reminded of these words in the future...

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] logical changeset generation v6.8

2013-12-11 Thread Andres Freund
On 2013-12-10 19:11:03 -0500, Robert Haas wrote:
 Committed #1 (again).

Thanks!

  Regarding this:
 
+   /* XXX: we could also do this unconditionally, the space is used anyway */
+   if (copy_oid)
+   HeapTupleSetOid(key_tuple, HeapTupleGetOid(tp));
 
 I would like to put in a big +1 for doing that unconditionally.  I
 didn't make that change before committing, but I think it'd be a very
 good idea.

Ok. I wasn't sure if it wouldn't be weird to include the oid in the
tuple logged for a replica identity that doesn't cover the oid. But the
downside is pretty small...

Will send a patch.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Question about sorting internals

2013-12-11 Thread hubert depesz lubaczewski
Hi,

before I go any further - this is only a thought experiment. I do not
plan to use such queries in real-life applications. I was just presented
with a question that I can't answer in any logical way.

There are two simple queries:

#v+
with rok2005 (miesiac,wynik) as (VALUES (1,1),(2,2)  ,(4,4),(5,NULL),(6,6))
,rok2004 (miesiac,wynik) as (VALUES (1,3)  ,(3,3),(4,5) ,(6,6))
SELECT
distinct on (miesiac) *
FROM (
SELECT miesiac, 2005 as rok, wynik FROM rok2005
union all
SELECT miesiac, 2004 as rok, wynik FROM rok2004
) as polaczone
ORDER BY miesiac, wynik desc;
#v-

#v+
with rok2005 (miesiac,wynik) as (VALUES (1,1),(2,2)  ,(4,4),(5,NULL),(6,6))
,rok2004 (miesiac,wynik) as (VALUES (1,3)  ,(3,3),(4,5) ,(6,6))
SELECT
distinct on (miesiac) *
FROM (
SELECT miesiac, 2004 as rok, wynik FROM rok2004
union all
SELECT miesiac, 2005 as rok, wynik FROM rok2005
) as polaczone
ORDER BY miesiac, wynik desc;
#v-

They differ only in order of queries in union all part.

The thing is that they return the same result. Why isn't one of them returning
2005 for the 6th miesiac?

I know I'm not sorting using rok, which means I'm getting undefined
functionality. Fine. But what exactly is happening that, regardless of the
order of rows in the subquery, I get the same, always lower, rok in the output?

Best regards,

depesz

-- 
The best thing about modern society is how easy it is to avoid contact with it.
 http://depesz.com/




Re: [HACKERS] Question about sorting internals

2013-12-11 Thread Ashutosh Bapat
Hi depesz,
You might want to look at their EXPLAIN VERBOSE outputs. Since one of them
(the 2004 one) has a smaller number of rows, it might be getting picked as
the first relation of the union and thus ends up having its rows before the
second one's. The explain output would make this clearer. Also, try having
the same number of rows in both relations.


On Wed, Dec 11, 2013 at 3:26 PM, hubert depesz lubaczewski 
dep...@depesz.com wrote:

 Hi,

 before I go any further - this is only a thought experiment. I do not
 plan to use such queries in real-life applications. I was just presented
 with a question that I can't answer in any logical way.

 There are two simple queries:

 #v+
 with rok2005 (miesiac,wynik) as (VALUES (1,1),(2,2)
  ,(4,4),(5,NULL),(6,6))
 ,rok2004 (miesiac,wynik) as (VALUES (1,3)  ,(3,3),(4,5)
 ,(6,6))
 SELECT
 distinct on (miesiac) *
 FROM (
 SELECT miesiac, 2005 as rok, wynik FROM rok2005
 union all
 SELECT miesiac, 2004 as rok, wynik FROM rok2004
 ) as polaczone
 ORDER BY miesiac, wynik desc;
 #v-

 #v+
 with rok2005 (miesiac,wynik) as (VALUES (1,1),(2,2)
  ,(4,4),(5,NULL),(6,6))
 ,rok2004 (miesiac,wynik) as (VALUES (1,3)  ,(3,3),(4,5)
 ,(6,6))
 SELECT
 distinct on (miesiac) *
 FROM (
 SELECT miesiac, 2004 as rok, wynik FROM rok2004
 union all
 SELECT miesiac, 2005 as rok, wynik FROM rok2005
 ) as polaczone
 ORDER BY miesiac, wynik desc;
 #v-

 They differ only in order of queries in union all part.

 The thing is that they return the same result. Why isn't one of them
 returning 2005 for the 6th miesiac?

 I know I'm not sorting using rok, which means I'm getting undefined
 functionality. Fine. But what exactly is happening that, regardless of the
 order of rows in the subquery, I get the same, always lower, rok in the output?

 Best regards,

 depesz

 --
 The best thing about modern society is how easy it is to avoid contact
 with it.

 http://depesz.com/




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Completing PL support for Event Triggers

2013-12-11 Thread Dimitri Fontaine
Peter Eisentraut pete...@gmx.net writes:
 I think you are mistaken.  My patch includes all changes between your v1
 and v2 patch.

I mistakenly remembered that we did remove all the is_event_trigger
business from the plperl patch too, when it's not the case. Sorry about
this confusion.

My vote is for “ready for commit” then.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog

2013-12-11 Thread Haribabu kommi
On 10 December 2013 19:55 Alvaro Herrera wrote:
 Haribabu kommi wrote:
 
  To detect whether the provided data and xlog directories are the same
  or not, I reused the existing make_absolute_path() code as follows.
 
  1. Moved the make_absolute_path() function from miscinit.c to path.c
  and changed all error reporting functions. It now also returns NULL
  in case of any error.
 
  2. Added a new file called fe_path.c which contains
  make_absolute_path() only for frontend code.
 
 Whatever you do, please don't add #include lines to postgres_fe.h.  Add
 them to whatever .c files that need to include the new header, instead.
 (This results in a longer patch, yes, but that consideration shouldn't
 drive anything.  There is a desire to include as few headers as
 possible in each source file, and adding more include lines to
 postgres_fe.h means the new header will be included by every single
 frontend file, even those not in core.)
 
 See a nearby patch by Bruce Momjian to deal with getpwnam() and
 getpwuid() failures; perhaps the idea of returning an error string
 should be designed similarly in both these patches.  Also consider
 using the psprintf stuff, which works on both backend and frontend,
 avoiding malloc etc so that code can be shared by both frontend and
 backend, eliminating the duplicity.

The make_absolute_path() function was moved to port in a similar way to
Bruce Momjian's approach. psprintf is used to store the error string
produced in the function, but psprintf is not used for storing the
absolute path, because that caused problems freeing the allocated memory
in SelectConfigFiles: the same memory is allocated from guc_malloc in a
different code branch.

After adding the make_absolute_path() function with the psprintf stuff
to path.c, ecpg fails to link, and I was not able to find the cause.
So I added another file, abspath.c, in port, which contains these two
functions.

Updated patches are attached in the mail. Please provide your suggestions.

Regards,
Hari babu.


Attachment: same_dir_error_v2.patch
Attachment: make_abs_path_v2.patch



Re: [HACKERS] Question about sorting internals

2013-12-11 Thread hubert depesz lubaczewski
On Wed, Dec 11, 2013 at 03:34:38PM +0530, Ashutosh Bapat wrote:
 Hi depesz,
 You might want to look at their EXPLAIN VERBOSE outputs. Since one of them
 (the 2004 one) has a smaller number of rows, it might be getting picked as
 the first relation of the union and thus ends up having its rows before the
 second one's. The explain output would make this clearer. Also, try having
 the same number of rows in both relations.

Explains:

                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Unique  (cost=0.44..0.48 rows=9 width=12) (actual time=0.030..0.035 rows=6 loops=1)
   Output: rok2004.miesiac, (2004), rok2004.wynik
   CTE rok2004
     ->  Values Scan on *VALUES*  (cost=0.00..0.06 rows=5 width=8) (actual time=0.001..0.003 rows=5 loops=1)
           Output: *VALUES*.column1, *VALUES*.column2
   CTE rok2005
     ->  Values Scan on *VALUES*_1  (cost=0.00..0.05 rows=4 width=8) (actual time=0.000..0.001 rows=4 loops=1)
           Output: *VALUES*_1.column1, *VALUES*_1.column2
   ->  Sort  (cost=0.32..0.35 rows=9 width=12) (actual time=0.029..0.031 rows=9 loops=1)
         Output: rok2004.miesiac, (2004), rok2004.wynik
         Sort Key: rok2004.miesiac, rok2004.wynik
         Sort Method: quicksort  Memory: 25kB
         ->  Append  (cost=0.00..0.18 rows=9 width=12) (actual time=0.007..0.018 rows=9 loops=1)
               ->  CTE Scan on rok2004  (cost=0.00..0.10 rows=5 width=12) (actual time=0.006..0.011 rows=5 loops=1)
                     Output: rok2004.miesiac, 2004, rok2004.wynik
               ->  CTE Scan on rok2005  (cost=0.00..0.08 rows=4 width=12) (actual time=0.002..0.004 rows=4 loops=1)
                     Output: rok2005.miesiac, 2005, rok2005.wynik
 Total runtime: 0.077 ms
(18 rows)

                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Unique  (cost=0.44..0.48 rows=9 width=12) (actual time=0.024..0.027 rows=6 loops=1)
   Output: rok2005.miesiac, (2005), rok2005.wynik
   CTE rok2004
     ->  Values Scan on *VALUES*  (cost=0.00..0.06 rows=5 width=8) (actual time=0.001..0.003 rows=5 loops=1)
           Output: *VALUES*.column1, *VALUES*.column2
   CTE rok2005
     ->  Values Scan on *VALUES*_1  (cost=0.00..0.05 rows=4 width=8) (actual time=0.001..0.003 rows=4 loops=1)
           Output: *VALUES*_1.column1, *VALUES*_1.column2
   ->  Sort  (cost=0.32..0.35 rows=9 width=12) (actual time=0.023..0.024 rows=9 loops=1)
         Output: rok2005.miesiac, (2005), rok2005.wynik
         Sort Key: rok2005.miesiac, rok2005.wynik
         Sort Method: quicksort  Memory: 25kB
         ->  Append  (cost=0.00..0.18 rows=9 width=12) (actual time=0.004..0.015 rows=9 loops=1)
               ->  CTE Scan on rok2005  (cost=0.00..0.08 rows=4 width=12) (actual time=0.003..0.006 rows=4 loops=1)
                     Output: rok2005.miesiac, 2005, rok2005.wynik
               ->  CTE Scan on rok2004  (cost=0.00..0.10 rows=5 width=12) (actual time=0.001..0.006 rows=5 loops=1)
                     Output: rok2004.miesiac, 2004, rok2004.wynik
 Total runtime: 0.053 ms
(18 rows)

So, it looks like rowcount is the one thing that's different. I'm not
entirely sure what the logic is that would make the rowcount matter.

After some more talk on #postgresql, it looks like I will have to spend
some time with debugger to see what's happening there.

Best regards,

depesz





Re: [HACKERS] same-address mappings vs. relative pointers

2013-12-11 Thread Florian Pflug
On Dec5, 2013, at 15:44 , Andres Freund and...@2ndquadrant.com wrote:
 There might be some ugly compiler dependent magic we could do. Depending
 on how we decide to declare offsets. Like (very, very roughly)
 
 #define relptr(type, struct_name, varname) union struct_name##_##varname { \
    type relptr_type; \
    Offset relptr_off; \
 }
 
 And then, for accessing have:
 #define relptr_access(seg, off) \
  typeof(off.relptr_type)* (((char *)seg->base_address) + off.relptr_off)
 
 But boy, that's ugly.

Well, ugliness we can live with, especially if it's less ugly than the
alternatives. But I'm afraid it's also unportable - typeof() is a GCC
extension, not a part of ANSI C, no?

best regards,
Florian Pflug





Re: [HACKERS] same-address mappings vs. relative pointers

2013-12-11 Thread Andres Freund
On 2013-12-11 11:42:25 +0100, Florian Pflug wrote:
 On Dec5, 2013, at 15:44 , Andres Freund and...@2ndquadrant.com wrote:
  There might be some ugly compiler dependent magic we could do. Depending
  on how we decide to declare offsets. Like (very, very roughly)
  
  #define relptr(type, struct_name, varname) union struct_name##_##varname { \
     type relptr_type; \
     Offset relptr_off; \
  }
  
  And then, for accessing have:
  #define relptr_access(seg, off) \
   typeof(off.relptr_type)* (((char *)seg->base_address) + off.relptr_off)
  
  But boy, that's ugly.
 
 Well, ugliness we can live with, especially if it's less ugly than the
 alternatives. But I'm afraid it's also unportable - typeof() is a GCC
 extension, not a part of ANSI C, no?

Yes (although there's C11 stuff to do equivalent stuff afair) - I was
thinking of only doing it for compilers we support that dark magic for
and fall back to returning a void* for the others. We'll probably miss a
cast or two required on !gcc that way, but it's still likely to be less
error prone.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Greg Stark
On Wed, Dec 11, 2013 at 12:58 AM, Simon Riggs si...@2ndquadrant.com wrote:

 Yes, it is not a perfect statistical sample. All sampling is subject
 to an error that is data dependent.

Well there's random variation due to the limitations of dealing with a
sample. And then there's systemic biases due to incorrect algorithms.
You wouldn't be happy if the samples discarded every row with NULLs or
every row older than some date etc. These things would not be
corrected by larger samples. That's the kind of error we're talking
about here.

But the more I think about things the less convinced I am that there
is a systemic bias introduced by reading the entire block. I had
assumed larger rows would be selected against but that's not really
true, they're just selected against relative to the number of bytes
they occupy which is the correct frequency to sample.

Even blocks that are mostly empty don't really bias things. Picture a
table that consists of 100 blocks with 100 rows each (value A) and
another 100 blocks with only 1 row each (value B). Any given sampled
block has a 50% chance of containing value-B rows, which grossly
inflates B's presence among blocks; however, each selected block with
value A will produce 100 rows. So if you sample 10 blocks you'll get,
on average, 100x5 rows of A and 1x5 rows of B, which is the correct
100:1 proportion.

I'm not actually sure there is any systemic bias here. The larger
number of rows per block generates less precise results, but from my
thought experiments they seem to still be accurate?




Re: [HACKERS] Why the buildfarm is all pink

2013-12-11 Thread Andres Freund
On 2013-12-10 19:55:12 -0500, Tom Lane wrote:
 I was surprised to see that my back-patches of the recent SubLink
 unpleasantness were failing on many of the buildfarm members, but
 only in the 9.1 and 9.0 branches.  The difficulty appears to be
 that the EXPLAIN output for the new test query changes depending on
 whether or not tenk1 has been analyzed yet.  In 9.2 and up,
 it reliably has been, because create_index runs first and that script
 does this:

 create_index.sql:901:vacuum analyze tenk1; -- ensure we get consistent plans here

 so depending on timing, one of those might have gotten the job done,
 or maybe autovacuum would show up in time to save the day.

 We need a more consistent strategy for this :-(

Agreed, although I have no clue what it should look like. As a further
datapoint I'll add that installcheck already regularly fails in HEAD if
you have a HS standby connected via SR and hot_standby_feedback=on on
the standby. Some plans just change from index(only) scans to sequential
scans, presumably because the lower xmin horizon changed the
stats. Since there's nothing running on the standby in those cases,
there has to be a pretty damn tiny window here somewhere.

I've wondered whether we could fix that by a) more explicit
vacuum/analyzes b) a function waiting for quiescent state.

Arguably we could just define that as unsupported, but given that there
is no testing of recovery other than that at all, that doesn't seem like
a good idea.

There's also fun in running with vacuum_defer_cleanup_age > 0, but I
don't think there's much chance of supporting that.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Extension Templates S03E11

2013-12-11 Thread Dimitri Fontaine
Hi,

Stephen Frost sfr...@snowman.net writes:
 * Jeff Davis (pg...@j-davis.com) wrote:
 What is stopping Extension Templates, as proposed, from being this
 special extension creation mode? What would be a better design?

 The extra catalog tables which store SQL scripts in text columns is one
 of my main objections to the as-proposed Extension Templates.  I view
 those scripts as a poor man's definition of database objects which are
 defined properly in the catalog already.

I have a very hard time understanding this objection.

PL/SQL functions are just a SQL script stored as-is in the catalogs.
That applies the same way to any other PL language too, with scripts
stored as-is in the catalogs in different languages.

Even views are stored in a textual way in the catalogs, albeit in a
specific pre-processed format; it's still a text blob that could pass
for a script in a backend-specific language, parsed by the rewriter.

So while I hear your objection to the script-in-catalog idea, Stephen,
I think we should move forward. We don't have the luxury of only
applying patches where no compromise has to be made, where everyone is
fully happy with the solution we find as a community.

  The other big issue is that
 there isn't an easy way to see how we could open up the ability to
 create extensions to non-superusers with this approach.

The main proposal here is to only allow the owner of a template to
install it as an extension. For superusers, we can implement the needed
SET ROLE command automatically in the CREATE EXTENSION command.

Is there another security issue that this “same role” approach is not
solving? I don't think so.

 It seems like the porting issue is just a matter of finding someone to
 write a tool to reliably translate packages from PGXN into a form
 suitable to be sent using SQL commands; which we would need anyway for
 this special mode.

I already mentioned that's on my roadmap, part of the vision I'm trying
to implement here. My goal is to deliver the full solution for 9.4, and
this Extension Templates facility is the missing in-core bits of it.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] same-address mappings vs. relative pointers

2013-12-11 Thread Florian Pflug
On Dec11, 2013, at 11:47 , Andres Freund and...@2ndquadrant.com wrote:
 On 2013-12-11 11:42:25 +0100, Florian Pflug wrote:
 On Dec5, 2013, at 15:44 , Andres Freund and...@2ndquadrant.com wrote:
 There might be some ugly compiler dependent magic we could do. Depending
 on how we decide to declare offsets. Like (very, very roughly)
 
 #define relptr(type, struct_name, varname) union struct_name##_##varname { \
   type relptr_type; \
   Offset relptr_off; \
 }
 
 And then, for accessing have:
 #define relptr_access(seg, off) \
 typeof(off.relptr_type)* (((char *)seg->base_address) + off.relptr_off)
 
 But boy, that's ugly.
 
 Well, ugliness we can live with, especially if it's less ugly than the
 alternatives. But I'm afraid it's also unportable - typeof() is a GCC
 extension, not a part of ANSI C, no?
 
 Yes (although there's C11 stuff to do equivalent stuff afair) - I was
 thinking of only doing it for compilers we support that dark magic for
 and fall back to returning a void* for the others. We'll probably miss a
 cast or two required on !gcc that way, but it's still likely to be less
 error prone.


Would it? For this to catch type mismatches, you'd both need to develop
on a typeof-supporting compiler *and* not cast the result of relptr_access().
But you can't really do that, because the code will then fail on compilers
which don't support typeof()...

What we could do, I guess, is to pass the type to relptr_access() and to
relptr(), and let the compiler verify that they are the same.

Something like

#define relptr(type) union { \
  type relptr_type; \
  Offset relptr_off; \
}

#define relptr_access(type, seg, rptr) \
  (type)( \
(rptr.relptr_type - (type)0), \
    ((char*)seg->base_address) + rptr.relptr_off \
  )

And, yes, ouch ;-)

best regards,
Florian Pflug





Re: [HACKERS] same-address mappings vs. relative pointers

2013-12-11 Thread Andres Freund
On 2013-12-11 12:37:56 +0100, Florian Pflug wrote:
 On Dec11, 2013, at 11:47 , Andres Freund and...@2ndquadrant.com wrote:
  On 2013-12-11 11:42:25 +0100, Florian Pflug wrote:
  Yes (although there's C11 stuff to do equivalent stuff afair) - I was
  thinking of only doing it for compilers we support that dark magic for
  and fall back to returning a void* for the others. We'll probably miss a
  cast or two required on !gcc that way, but it's still likely to be less
  error prone.
 
 
 Would it? For this to catch type mismatches, you'd both need to develop
 on a typeof-supporting compiler *and* not cast the result of
 relptr_access().
 But you can't really do that, because the code will then fail on compilers
 which don't support typeof()...

Yea, right.

 What we could do, I guess, is to pass the type to relptr_access() and to
 relptr(), and let the compiler verify that they are the same.

Tom and I actually added a macro that's helpful for that recently:
AssertVariableIsOfType(). With that we should be able to get something
reasonable, failing at compile time, with a useful error message even ;)

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Greg Stark
On Wed, Dec 11, 2013 at 11:01 AM, Greg Stark st...@mit.edu wrote:
 I'm not actually sure there is any systemic bias here. The larger
 number of rows per block generates less precise results, but from my
 thought experiments they seem to still be accurate?

So I've done some empirical tests for a table generated by:
create table sizeskew as (select i,j,repeat('i',i) from
generate_series(1,1000) as i, generate_series(1,1000) as j);

I find that using the whole block doesn't cause any problem with the
avg_width field for the repeat column. That does reinforce my belief
that we might not need any particular black magic here.

It does however cause a systemic error in the histogram bounds. It
seems the median is systematically overestimated by more and more, the
larger the number of rows per block used:

1: 524
4: 549
8: 571
12: 596
16: 602
20: 618 (total sample slightly smaller than normal)
30: 703 (substantially smaller sample)

So there is something clearly wonky in the histogram stats that's
affected by the distribution of the sample. The only thing I can think
of is maybe the most common elements are being selected preferentially
from the early part of the sample which is removing a substantial part
of the lower end of the range. But even removing 100 from the
beginning shouldn't be enough to push the median above 550.
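For anyone reproducing this, a minimal sketch for pulling the stats in
question after an ANALYZE of the sizeskew table defined above (the middle
entry of histogram_bounds approximates the median, so for the uniformly
distributed i column it should sit near 500):

ANALYZE sizeskew;

SELECT avg_width, most_common_vals, most_common_freqs, histogram_bounds
FROM pg_stats
WHERE tablename = 'sizeskew' AND attname = 'i';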

-- 
greg




Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Greg Stark
On Wed, Dec 11, 2013 at 12:08 PM, Greg Stark st...@mit.edu wrote:
 The only thing I can think
 of is maybe the most common elements are being selected preferentially
 from the early part of the sample which is removing a substantial part
 of the lower end of the range. But even removing 100 from the
 beginning shouldn't be enough to push the median above 550.

Just to follow up here. I think what's going on is that not only are the
most_common_vals being preferentially taken from the beginning of the
sample but also their frequency is being massively overestimated. All
values have a frequency of about .001 but the head of the MCV has a
frequency as high as .10 in some of my tests.



-- 
greg




Re: [HACKERS] [PATCH] Add transforms feature

2013-12-11 Thread Robert Haas
On Tue, Dec 10, 2013 at 10:35 PM, Peter Eisentraut pete...@gmx.net wrote:
 On Fri, 2013-12-06 at 11:28 +0100, Dimitri Fontaine wrote:
  Here is an idea.  Add a GUC that basically says something like
  use_transforms = on|off.  You can then attach that to individual
  functions, which is the right granularity, because only the function
  knows whether its code expects transforms or not.  But you can use
 the
  full power of GUC to configure it any way you want.

 Here is an updated patch that implements this, makes some of the
 documentation improvements that you suggested, and rebases everything.

I'm still kinda unimpressed by this.  Behavior-changing GUC, uggh.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] should we add a XLogRecPtr/LSN SQL type?

2013-12-11 Thread Andres Freund
Hi,

There's already a couple of SQL functions dealing with XLogRecPtrs, and
the logical replication work will add a couple more. Currently each
of those functions taking/returning an LSN does sprintf/scanf to
print/parse the strings, which is both awkward and potentially
noticeable performance-wise.

It seems relatively simple to add a proper type, with implicit casts
from text, instead?
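For context, a sketch of the status quo described above - LSNs travel as
text today and get re-parsed on every call (function names as they exist
in 9.3):

SELECT pg_current_xlog_location();           -- returns text, e.g. '0/16B6C50'
SELECT pg_xlog_location_diff(pg_current_xlog_location(), '0/0');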

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Extra functionality to createuser

2013-12-11 Thread Robert Haas
On Tue, Dec 10, 2013 at 9:55 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Tue, Dec 10, 2013 at 12:20 AM, Robert Haas robertmh...@gmail.com wrote:
 On Sat, Dec 7, 2013 at 11:39 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Fri, Dec 6, 2013 at 10:31 AM, Peter Eisentraut pete...@gmx.net wrote:

 How about only one role name per -g option, but allowing the -g option
 to be repeated?

I think that might simplify the problem and patch, but do you think
 it is okay to have inconsistency
for usage of options between Create User statement and this utility?

 Yes.  In general, command-line utilities use a very different syntax
 for options-passing than SQL commands.  Trying to make them consistent
 feels unnecessary or perhaps even counterproductive.  And the proposed
 syntax is certainly a convention common to many other command-line
 utilities, so I think it's fine.

 Okay, the new way for syntax suggested by Peter has simplified the problem.
 Please find the updated patch and docs for multiple -g options.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Heikki Linnakangas

On 12/11/2013 02:08 PM, Greg Stark wrote:

On Wed, Dec 11, 2013 at 11:01 AM, Greg Stark st...@mit.edu wrote:

I'm not actually sure there is any systemic bias here. The larger
number of rows per block generates less precise results, but from my
thought experiments they seem to still be accurate?


So I've done some empirical tests for a table generated by:
create table sizeskew as (select i,j,repeat('i',i) from
generate_series(1,1000) as i, generate_series(1,1000) as j);

I find that using the whole block doesn't cause any problem with the
avg_width field for the repeat column. That does reinforce my belief
that we might not need any particular black magic here.


How large a sample did you use? Remember that the point of doing 
block-level sampling instead of the current approach would be to allow 
using a significantly smaller sample (in # of blocks), and still achieve 
the same sampling error. If the sample is large enough, it will mask 
any systemic bias caused by block-sampling, but the point is to reduce 
the number of sampled blocks.


The practical question here is this: What happens to the quality of the 
statistics if you read only 1/2 the number of blocks you normally 
would, but include all the rows in the blocks we read in the sample? 
How about 1/10?


Or to put it another way: could we achieve more accurate statistics by 
including all rows from the sampled blocks, while reading the same number 
of blocks? In particular, I wonder if it would help with estimating 
ndistinct. It generally helps to have a larger sample for ndistinct 
estimation, so it might be beneficial.


- Heikki




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2013-12-11 Thread Robert Haas
On Wed, Dec 11, 2013 at 7:41 AM, Andres Freund and...@2ndquadrant.com wrote:
 There's already a couple of SQL functions dealing with XLogRecPtrs, and
 the logical replication work will add a couple more. Currently each
 of those functions taking/returning an LSN does sprintf/scanf to
 print/parse the strings, which is both awkward and potentially
 noticeable performance-wise.

 It seems relatively simple to add a proper type, with implicit casts
 from text, instead?

I'm pretty sure that this was discussed last year, and I voted for it
-- except for the implicit casts part, perhaps -- but more people
voted against it, so it died.  I still think that was a mistake, but I
just work here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2013-12-11 Thread Euler Taveira
On 11-12-2013 09:41, Andres Freund wrote:
 There's already a couple of SQL functions dealing with XLogRecPtrs, and
 the logical replication work will add a couple more. Currently each
 of those functions taking/returning an LSN does sprintf/scanf to
 print/parse the strings, which is both awkward and potentially
 noticeable performance-wise.
 
While discussing the pg_xlog_location_diff function, Robert posted an lsn
datatype [1]. At that time we didn't want to go that far (a new datatype) to
cover only one function. If your proposal is just validation, I think
generic validation functions are the way to follow. However, if you are
thinking of adding operators, the lsn datatype should be implemented.

 It seems relatively simple to add a proper type, with implicit casts
 from text, instead?
 
Do you want to change the function signatures too?


[1]
http://www.postgresql.org/message-id/ca+tgmozrmnn0evesd-kxb9e-mvdmwoti6guujuvqp_8q2c5...@mail.gmail.com


-- 
   Euler Taveira   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento




Re: [HACKERS] -d option for pg_isready is broken

2013-12-11 Thread Robert Haas
On Tue, Dec 10, 2013 at 11:40 PM, Fujii Masao masao.fu...@gmail.com wrote:
 While I was investigating PQhost() for that approach, I found several
 problems with PQhost().

 (1) PQhost() can return the Unix-domain socket directory path even on a
 platform that doesn't support Unix-domain sockets.

 (2) On a platform that doesn't support Unix-domain sockets, when neither
 host nor hostaddr is specified, the default host 'localhost' is used to
 connect to the server and PQhost() must return that, but it doesn't.

I think changing PQhost() so that it returns DefaultHost rather than
conn->pgunixsocket when we don't HAVE_UNIX_SOCKETS is a back-patchable
bug fix, and I'd say go for it.

 (3) PQhost() cannot return the hostaddr.

However, I'm much less sure whether this is something that we want to
do at all.  It seems like this might be a redefinition of what the
function does, and I'm not sure whether the new definition is what
everyone will want.  On the other hand, I'm also not sure that it
isn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] -d option for pg_isready is broken

2013-12-11 Thread Fujii Masao
On Wed, Dec 11, 2013 at 10:26 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Dec 10, 2013 at 11:40 PM, Fujii Masao masao.fu...@gmail.com wrote:
 While I was investigating PQhost() for that approach, I found several
 problems with PQhost().

 (1) PQhost() can return the Unix-domain socket directory path even on a
 platform that doesn't support Unix-domain sockets.

 (2) On a platform that doesn't support Unix-domain sockets, when neither
 host nor hostaddr is specified, the default host 'localhost' is used to
 connect to the server and PQhost() must return that, but it doesn't.

 I think changing PQhost() so that it returns DefaultHost rather than
 conn->pgunixsocket when we don't HAVE_UNIX_SOCKETS is a back-patchable
 bug fix, and I'd say go for it.

Agreed.

 (3) PQhost() cannot return the hostaddr.

 However, I'm much less sure whether this is something that we want to
 do at all.  It seems like this might be a redefinition of what the
 function does, and I'm not sure whether the new definition is what
 everyone will want.  On the other hand, I'm also not sure that it
 isn't.

If we don't change PQhost() in that way, we cannot fix the following
problem of wrong \conninfo output, for example.

$ psql -d hostaddr=127.0.0.1
=# \conninfo
You are connected to database postgres as user postgres via socket
in /tmp at port 5432.

Regards.

-- 
Fujii Masao




Re: [HACKERS] autovacuum_work_mem

2013-12-11 Thread Robert Haas
On Sun, Oct 20, 2013 at 7:21 AM, Magnus Hagander mag...@hagander.net wrote:
 It seemed neater to me to create a new flag, so that in principle any
 vacuum() code path can request autovacuum_work_mem, rather than having
 lazyvacuum.c code call IsAutoVacuumWorkerProcess() for the same
 purpose. To date, that's only been done within vacuumlazy.c for things
 like logging.

 Hmm. I'm not entirely sure I agree that that makes it neater :)

 We could also look at autovacuum_vacuum_cost_limit etc above, but
 those just override what the non-autovac parameters do. But since the
 parameter is called maintenance_work_mem in that case, I think that
 would make it harder to read.

 But I'd suggest just a:
 int vac_work_mem = (IsAutoVacuumWorkerProcess() && autovacuum_work_mem
 != -1) ? autovacuum_work_mem : maintenance_work_mem;

 and not sending around a boolean flag through a bunch of places when
 it really means just the same thing,

+1 for that change.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] -d option for pg_isready is broken

2013-12-11 Thread Robert Haas
On Wed, Dec 11, 2013 at 8:45 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, Dec 11, 2013 at 10:26 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Dec 10, 2013 at 11:40 PM, Fujii Masao masao.fu...@gmail.com wrote:
 While I was investigating PQhost() for that approach, I found several
 problems with PQhost().

 (1) PQhost() can return the Unix-domain socket directory path even on a
 platform that doesn't support Unix-domain sockets.

 (2) On a platform that doesn't support Unix-domain sockets, when neither
 host nor hostaddr is specified, the default host 'localhost' is used to
 connect to the server and PQhost() must return that, but it doesn't.

 I think changing PQhost() so that it returns DefaultHost rather than
 conn->pgunixsocket when we don't HAVE_UNIX_SOCKETS is a back-patchable
 bug fix, and I'd say go for it.

 Agreed.

 (3) PQhost() cannot return the hostaddr.

 However, I'm much less sure whether this is something that we want to
 do at all.  It seems like this might be a redefinition of what the
 function does, and I'm not sure whether the new definition is what
 everyone will want.  On the other hand, I'm also not sure that it
 isn't.

 If we don't change PQhost() in that way, we cannot fix the following
 problem of wrong \conninfo output, for example.

 $ psql -d hostaddr=127.0.0.1
 =# \conninfo
 You are connected to database postgres as user postgres via socket
 in /tmp at port 5432.

Yeah, that's true.  But the whole point of having both host and
hostaddr seems to be that you can lie about where you're connecting.
If you set host=some.pretty.domain.name hostaddr=1.2.3.4, the point is
to say that you're connecting to the first while, under the covers,
actually connecting to the second.  Now, I am unclear what value this
has, but someone at some point evidently thought it was a good idea,
so we need to be careful about changing it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] Add transforms feature

2013-12-11 Thread Hannu Krosing
On 12/11/2013 01:40 PM, Robert Haas wrote:
 On Tue, Dec 10, 2013 at 10:35 PM, Peter Eisentraut pete...@gmx.net wrote:
 On Fri, 2013-12-06 at 11:28 +0100, Dimitri Fontaine wrote:
 Here is an idea.  Add a GUC that basically says something like
 use_transforms = on|off.  You can then attach that to individual
 functions, which is the right granularity, because only the function
 knows whether its code expects transforms or not.  But you can use
 the
 full power of GUC to configure it any way you want.
 Here is an updated patch that implements this, makes some of the
 documentation improvements that you suggested, and rebases everything.
 I'm still kinda unimpressed by this.  Behavior-changing GUC, uggh.

It should work ok if we could somehow check that the GUC is set
on the function and fall back to session GUC in case it is not.

Not sure if this is possible though.

The need for this arises from calling other functions from a new function.
At the moment, if a new function defined as

CREATE FUNCTION f_uses_xforms() AS $$ ... $$ SET use_transforms=on;

calls a legacy function that will break if transforms are used, then the
_old_ function's declaration needs to be modified to add (use_transforms=off).

It is much easier than debugging/rewriting the function, but this is
something I'd like us to be able to avoid.
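A minimal sketch of that modification, assuming the patch's proposed
use_transforms GUC; legacy_func here is a hypothetical old function:

-- Pin the pre-transform behavior onto the legacy function so that
-- transform-aware callers don't break it (use_transforms is the GUC
-- proposed by this patch; legacy_func is hypothetical):
ALTER FUNCTION legacy_func(integer) SET use_transforms = off;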

PS. Maybe we could resurrect the WITH (attribute, ...) clause available in
the CREATE FUNCTION syntax for passing function-specific flags?


Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] Question about sorting internals

2013-12-11 Thread Robert Haas
On Wed, Dec 11, 2013 at 4:56 AM, hubert depesz lubaczewski
dep...@depesz.com wrote:
 before I go any further - this is only a thought experiment. I do not
 plan to use such queries in real-life applications. I was just presented
 with a question that I can't answer in any logical way.

 There are two simple queries:

 #v+
 with rok2005 (miesiac,wynik) as (VALUES (1,1),(2,2)  
 ,(4,4),(5,NULL),(6,6))
 ,rok2004 (miesiac,wynik) as (VALUES (1,3)  ,(3,3),(4,5) 
 ,(6,6))
 SELECT
 distinct on (miesiac) *
 FROM (
 SELECT miesiac, 2005 as rok, wynik FROM rok2005
 union all
 SELECT miesiac, 2004 as rok, wynik FROM rok2004
 ) as polaczone
 ORDER BY miesiac, wynik desc;
 #v-

 #v+
 with rok2005 (miesiac,wynik) as (VALUES (1,1),(2,2)  
 ,(4,4),(5,NULL),(6,6))
 ,rok2004 (miesiac,wynik) as (VALUES (1,3)  ,(3,3),(4,5) 
 ,(6,6))
 SELECT
 distinct on (miesiac) *
 FROM (
 SELECT miesiac, 2004 as rok, wynik FROM rok2004
 union all
 SELECT miesiac, 2005 as rok, wynik FROM rok2005
 ) as polaczone
 ORDER BY miesiac, wynik desc;
 #v-

 They differ only in order of queries in union all part.

 The thing is that they return the same result. Why isn't one of them returning
 2005 for the 6th miesiac?

The query planner sees that in order for the output ordering to match
the ORDER BY clause, it's got to sort by miesiac, wynik desc.  The
DISTINCT ON clause can be implemented very cheaply after that - every
time the value of miesiac changes, it emits only the first of the rows
with the new value.  So it's a good plan.  However, because the sort
happens before the unique step, the results you get are dependent on
what order the sort happens to emit the rows.  Our sort algorithms are
not stable, so there's no particular guarantee about the order in
which rows will pop out, beyond the fact that they must obey the sort
key.
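A minimal sketch of how to make the choice deterministic: add rok to the
sort key as a tie-breaker after the DISTINCT ON expression, so equal
(miesiac, wynik) pairs are resolved predictably instead of by whatever
order the sort happens to emit:

#v+
with rok2005 (miesiac,wynik) as (VALUES (1,1),(2,2),(4,4),(5,NULL),(6,6))
,rok2004 (miesiac,wynik) as (VALUES (1,3),(3,3),(4,5),(6,6))
SELECT
distinct on (miesiac) *
FROM (
SELECT miesiac, 2005 as rok, wynik FROM rok2005
union all
SELECT miesiac, 2004 as rok, wynik FROM rok2004
) as polaczone
ORDER BY miesiac, wynik desc, rok desc;  -- miesiac=6 now reliably yields rok=2005
#v-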

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Why the buildfarm is all pink

2013-12-11 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:

 I haven't touched matview.sql here; that seems like a distinct issue.


I'll fix that.

--

Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Florian Pflug
On Dec10, 2013, at 15:32 , Claudio Freire klaussfre...@gmail.com wrote:
 On Tue, Dec 10, 2013 at 11:02 AM, Greg Stark st...@mit.edu wrote:
 
 On 10 Dec 2013 08:28, Albe Laurenz laurenz.a...@wien.gv.at wrote:
 
 
 Doesn't all that assume a normally distributed random variable?
 
 I don't think so because of the law of large numbers. If you have a large
 population and sample it, the sample behaves like a normal distribution
 even if the distribution of the population isn't.
 
 No, the law of large numbers says that if you have an AVERAGE of many
 samples of a random variable, the random variable that is the AVERAGE
 behaves like a normal.

Actually, that's the central limit theorem, and it doesn't hold for all
random variables, only for those with finite expected value and variance.

The law of large numbers, in contrast, only tells you that the AVERAGE of
n samples of a random variable will converge to the random variables'
expected value as n goes to infinity (there are different versions of the
law which guarantee different kinds of convergence, weak or strong).
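For reference, the two statements side by side (standard formulations,
assuming i.i.d. samples $X_1, \dots, X_n$ with mean $\mu$, plus a finite
variance $\sigma^2$ for the CLT):

\[
\text{LLN:}\quad \bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i \longrightarrow \mu,
\qquad
\text{CLT:}\quad \sqrt{n}\,(\bar{X}_n - \mu) \xrightarrow{d} \mathcal{N}(0, \sigma^2).
\]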

best regards,
Florian Pflug





Re: [HACKERS] In-Memory Columnar Store

2013-12-11 Thread Merlin Moncure
On Mon, Dec 9, 2013 at 1:40 PM, knizhnik knizh...@garret.ru wrote:
 Hello!

 I want to annouce my implementation of In-Memory Columnar Store extension
 for PostgreSQL:

  Documentation: http://www.garret.ru/imcs/user_guide.html
  Sources: http://www.garret.ru/imcs-1.01.tar.gz

 Any feedbacks, bug reports and suggestions are welcome.

 Vertical representation of data is stored in PostgreSQL shared memory.
 This is why it is important to be able to utilize all available physical
 memory.
 Now servers with a TB or more of RAM are not something exotic, especially
 in the financial world.
 But there is a limitation in Linux with standard 4kb pages on the maximal
 size of a mapped memory segment: 256GB.
 It is possible to overcome this limitation either by creating multiple
 segments - but that requires too many changes in the PostgreSQL memory
 manager - or by just setting the MAP_HUGETLB flag (assuming that huge
 pages were allocated in the system).

 I found several messages related with MAP_HUGETLB flag, the most recent one
 was from 21 of November:
 http://www.postgresql.org/message-id/20131125032920.ga23...@toroid.org

 I wonder what is the current status of this patch?

I looked over your extension.  I think it's a pretty amazing example
of the postgres extension and type systems -- up there with postgis.
Very well done.  How long did this take you to write?

MAP_HUGETLB patch was marked 'returned with feedback'.
https://commitfest.postgresql.org/action/patch_view?id=1308.  It seems
likely to be revived, perhaps in time for 9.4.

Honestly, I think your efforts here provide more argument for adding
huge TLB support.

merlin




Re: [HACKERS] -d option for pg_isready is broken

2013-12-11 Thread Andres Freund
On 2013-12-11 08:56:43 -0500, Robert Haas wrote:
  $ psql -d hostaddr=127.0.0.1
  =# \conninfo
  You are connected to database postgres as user postgres via socket
  in /tmp at port 5432.
 
 Yeah, that's true.  But the whole point of having both host and
 hostaddr seems to be that you can lie about where you're connecting.
 If you set host=some.pretty.domain.name hostaddr=1.2.3.4, the point is
 to say that you're connecting to the first while, under the covers,
 actually connecting to the second.  Now, I am unclear what value this
 has, but someone at some point evidently thought it was a good idea,
 so we need to be careful about changing it.

One use case is accessing a particular host when using DNS round robin
to standbys in combination with SSL.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] autovacuum_work_mem

2013-12-11 Thread Simon Riggs
On 25 November 2013 21:51, Peter Geoghegan p...@heroku.com wrote:
 On Sun, Nov 24, 2013 at 9:06 AM, Simon Riggs si...@2ndquadrant.com wrote:
 VACUUM uses 6 bytes per dead tuple. And autovacuum regularly removes
 dead tuples, limiting their numbers.

 In what circumstances will the memory usage from multiple concurrent
 VACUUMs become a problem? In those circumstances, reducing
 autovacuum_work_mem will cause more passes through indexes, dirtying
 more pages and elongating the problem workload.

 Yes, of course, but if we presume that the memory for autovacuum
 workers to do everything in one pass simply isn't there, it's still
 better to do multiple passes.

That isn't clear to me. It seems better to wait until we have the memory.

My feeling is this parameter is a fairly blunt approach to the
problems of memory pressure on autovacuum and other maint tasks. I am
worried that it will not effectively solve the problem. I don't wish
to block the patch; I wish to get to an effective solution to the
problem.

A better approach to handling memory pressure would be to globally
coordinate workers so that we don't oversubscribe memory, allocating
memory from a global pool.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Add transforms feature

2013-12-11 Thread Robert Haas
On Wed, Dec 11, 2013 at 9:19 AM, Hannu Krosing ha...@2ndquadrant.com wrote:
 On 12/11/2013 01:40 PM, Robert Haas wrote:
 On Tue, Dec 10, 2013 at 10:35 PM, Peter Eisentraut pete...@gmx.net wrote:
 On Fri, 2013-12-06 at 11:28 +0100, Dimitri Fontaine wrote:
 Here is an idea.  Add a GUC that basically says something like
 use_transforms = on|off.  You can then attach that to individual
 functions, which is the right granularity, because only the function
 knows whether its code expects transforms or not.  But you can use
 the
 full power of GUC to configure it any way you want.
 Here is an updated patch that implements this, makes some of the
 documentation improvements that you suggested, and rebases everything.
 I'm still kinda unimpressed by this.  Behavior-changing GUC, uggh.

 It should work ok if we could somehow check that the GUC is set
 on the function and fall back to session GUC in case it is not.

 Not sure if this is possible though.

 The need for this arises from calling other functions from a new function.
 At the moment, if a new function defined as

 CREATE FUNCTION f_uses_xforms() AS $$ ... $$ SET use_transforms=on;

 calls a legacy function that will break if transforms are used, then the
 _old_ function's declaration needs to be modified to add (use_transforms=off)

Yeah, exactly.

 It is much easier than debugging/rewriting the function, but this is
 something I'd like us to be able to avoid.

 PS. maybe we could resurrect the  WITH (attribute, ...) available in
 CREATE FUNCTION syntax for passing function-specific flags ?

It's a thought.  Or you could put some annotation in the function
body, as we do in PL/pgsql with the #option syntax.

Of course, making everyone decorate their new functions with
references to the transforms they want to use isn't wonderful either,
but it might be good at least to have the option.  You could allow the
use of all installed transforms by default, but let people say WITH
(transforms='') if they don't want to use them or WITH
(transforms='comma, separated, list') if they want to require certain
ones.

Unfortunately, that'll probably mean that virtually all portable code
for procedural languages has to include some form of this incantation,
just as nearly any PL/pgsql function has to include SET
search_path = '' if it wants to not be trivially subvertible.  It's
annoying to grow more such decoration, but the alternative seems to be
hoping that nobody wants to write portable code that uses non-core
types, and that doesn't seem better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] autovacuum_work_mem

2013-12-11 Thread Robert Haas
On Wed, Dec 11, 2013 at 9:43 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 25 November 2013 21:51, Peter Geoghegan p...@heroku.com wrote:
 On Sun, Nov 24, 2013 at 9:06 AM, Simon Riggs si...@2ndquadrant.com wrote:
 VACUUM uses 6 bytes per dead tuple. And autovacuum regularly removes
 dead tuples, limiting their numbers.

 In what circumstances will the memory usage from multiple concurrent
 VACUUMs become a problem? In those circumstances, reducing
 autovacuum_work_mem will cause more passes through indexes, dirtying
 more pages and elongating the problem workload.

 Yes, of course, but if we presume that the memory for autovacuum
 workers to do everything in one pass simply isn't there, it's still
 better to do multiple passes.

 That isn't clear to me. It seems better to wait until we have the memory.

 My feeling is this parameter is a fairly blunt approach to the
 problems of memory pressure on autovacuum and other maint tasks. I am
 worried that it will not effectively solve the problem. I don't wish
 to block the patch; I wish to get to an effective solution to the
 problem.

 A better approach to handling memory pressure would be to globally
 coordinate workers so that we don't oversubscribe memory, allocating
 memory from a global pool.

This is doubtless true, but that project is at least two if not three
orders of magnitude more complex than what's being proposed here, and
I don't think we should make the perfect the enemy of the good.

Right now, maintenance_work_mem controls the amount of memory that
we're willing to use for either a vacuum operation or an index build.
Those things don't have much to do with each other, so it's not hard
for me to imagine that someone might want to configure different
memory usage for one than the other.  This patch would allow that, and
I think that's good.
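
For example (a postgresql.conf sketch under the patch's proposed
semantics; the values are invented, and autovacuum_work_mem = -1 would
mean fall back to maintenance_work_mem):

    maintenance_work_mem = 1GB      # manual VACUUM and index builds
    autovacuum_work_mem = 256MB     # cap for each autovacuum worker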

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -d option for pg_isready is broken

2013-12-11 Thread Robert Haas
On Wed, Dec 11, 2013 at 9:35 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-12-11 08:56:43 -0500, Robert Haas wrote:
  $ psql -d "hostaddr=127.0.0.1"
  =# \conninfo
  You are connected to database "postgres" as user "postgres" via socket
  in "/tmp" at port "5432".

 Yeah, that's true.  But the whole point of having both host and
 hostaddr seems to be that you can lie about where you're connecting.
 If you set host=some.pretty.domain.name hostaddr=1.2.3.4, the point is
 to say that you're connecting to the first while, under the covers,
 actually connecting to the second.  Now, I am unclear what value this
 has, but someone at some point evidently thought it was a good idea,
 so we need to be careful about changing it.

 One use case is accessing a particular host when using DNS round robin
 to standbys in combination with SSL.

Ah, interesting point.  And it's not inconceivable that some
application out there could be using PQhost() to retrieve the host
from an existing connection object and reusing that value for a new
connection, in which case redefining it to sometimes return hostaddr
would break things.  So I think we shouldn't do that.
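
(To make that use case concrete: a connection string along the lines of

    psql "host=standby.example.com hostaddr=10.0.0.7 sslmode=verify-full"

pins the TCP connection to one round-robin member while certificate
verification still runs against the host name; the name and address
here are invented for the example.)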

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why the buildfarm is all pink

2013-12-11 Thread Kevin Grittner
Kevin Grittner kgri...@ymail.com wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:

 I haven't touched matview.sql here; that seems like a distinct
 issue.

 I'll fix that.

Done.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] pg_ctl always uses the same event source

2013-12-11 Thread MauMau

From: Amit Kapila amit.kapil...@gmail.com

 I think it is better to keep it like what I suggested above, because in
 that case it will assign default name even if "postgres -C" fails due to
 some reason.

 2. What will happen if user doesn't change the name in event_source or
 kept the same name, won't it hit the same problem again? So shouldn't it
 try to generate different name by appending version string to it?

I re-considered that.  As you suggested, I think I'll do as follows.  Would 
this be OK?


[pg_ctl.c]
evtHandle = RegisterEventSource(NULL, *event_source ? event_source :
   "PostgreSQL " PG_MAJORVERSION);


[guc.c]
{"event_source", PGC_POSTMASTER, LOGGING_WHERE,
...
"PostgreSQL " PG_MAJORVERSION,
NULL, NULL, NULL



[elog.c]
Writing the default value in this file was redundant, because event_source 
cannot be NULL.  So change


evtHandle = RegisterEventSource(NULL, event_source ? event_source :
"PostgreSQL");


to

evtHandle = RegisterEventSource(NULL, event_source);


Regards
MauMau




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why the buildfarm is all pink

2013-12-11 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-12-10 19:55:12 -0500, Tom Lane wrote:
 We need a more consistent strategy for this :-(

 Agreed, although I have no clue what it should look like. As a further
 datapoint I'll add that installcheck already regularly fails in HEAD if
 you have a HS standby connected via SR and hot_standby_feedback=on on
 the standby. Some plans just change from index(only) scans to sequential
 scans, presumably because of the lower xmin horizon changed the
 stats. Since there's nothing running on the standby in those cases,
 there has to be a pretty damn tiny window here somewhere.

The case in create_index does a "vacuum analyze tenk1" and expects
to get an index-only scan in the very next SQL command.  So any delay
in considering the table all-visible could break that test.  I'm not
sure if that's what you're talking about though.  We could easily
create some more delay for that case, for instance by moving the
vacuum step to copy.sql as I was idly speculating about upthread.
Do you remember offhand where the failures are?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] In-Memory Columnar Store

2013-12-11 Thread k...@rice.edu
On Mon, Dec 09, 2013 at 11:40:41PM +0400, knizhnik wrote:
 Hello!
 
 I want to annouce my implementation of In-Memory Columnar Store
 extension for PostgreSQL:
 
  Documentation: http://www.garret.ru/imcs/user_guide.html
  Sources: http://www.garret.ru/imcs-1.01.tar.gz
 
 Any feedbacks, bug reports and suggestions are welcome.
 
 Vertical representation of data is stored in PostgreSQL shared memory.
 This is why it is important to be able to utilize all available
 physical memory.

Hi,

This is very neat! The question I have, which applies to the matview
support as well, is "How can we transparently substitute usage of the
in-memory columnar store/matview in a SQL query?".

Regards,
Ken


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Simon Riggs
On 11 December 2013 12:08, Greg Stark st...@mit.edu wrote:

 So there is something clearly wonky in the histogram stats that's
 affected by the distribution of the sample.

...in the case where the avg width changes in a consistent manner
across the table.

Well spotted.

ISTM we can have a specific cross check for bias in the sample of that
nature. We just calculate the avg width per block and then check for
correlation of the avg width against block number. If we find bias we
can calculate how many extra blocks to sample and from where.
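
For instance, something along these lines (illustrative only; "t"
stands for whatever table is being analyzed, and pg_column_size() is
only an approximation of on-disk width):

    SELECT corr(blkno::float8, avg_width::float8) FROM
      (SELECT (ctid::text::point)[0]::int AS blkno,
              avg(pg_column_size(t.*)) AS avg_width
       FROM t GROUP BY 1) s;

A correlation well away from zero would be the signal to sample extra
blocks.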

There may be other biases also, so we can check for them and respond
accordingly.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-11 Thread MauMau

From: Kevin Grittner kgri...@ymail.com
It seems to be a fairly common term of art for a problem which
requires a restart or reconnection. FATAL is used when the problem
is severe enough that the process or connection must end. It seems
to me to be what should consistently be used when a client
connection or its process must be terminated for a reason other
than a client-side request to terminate.


What do you think of #5 and #6 when matching the above criteria?

5. FATAL:  terminating walreceiver process due to administrator command
6. FATAL:  terminating background worker \%s\ due to administrator command

These are output when the DBA shuts down the database server and there's no 
client connection.  That is, these don't meet the criteria.  I believe these 
should be suppressed, or use LOG instead of FATAL.


Regards
MauMau




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Question about sorting internals

2013-12-11 Thread Tom Lane
hubert depesz lubaczewski dep...@depesz.com writes:
 There are two simple queries: ...
 They differ only in order of queries in union all part.
 The thing is that they return the same result. Why isn't one of them returning
 2005 for 6th miesiac?

With such a small amount of data, you're getting an in-memory quicksort,
and a well-known property of quicksort is that it isn't stable --- that
is, there are no guarantees about the order in which it will return items
that have equal keys.  In this case it's evidently making different
partitioning choices, as a consequence of the different arrival order of
the rows, that just by chance end up with the 6/2004/6 row being returned
before the 6/2005/6 row in both cases.  You could trace through the logic
and see exactly how that's happening, but I doubt it'd be a very edifying
exercise.

If you want to get well-defined results with DISTINCT ON, you should
make the ORDER BY sort by a candidate key.  Anything less opens you to
uncertainty about which rows the DISTINCT will select.
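
Schematically, with invented table and columns echoing the original
queries:

    -- unstable: ties on miesiac are resolved however quicksort leaves them
    SELECT DISTINCT ON (miesiac) miesiac, rok FROM t ORDER BY miesiac;

    -- deterministic: extend the sort key until it identifies a unique row
    SELECT DISTINCT ON (miesiac) miesiac, rok FROM t
    ORDER BY miesiac, rok DESC, id;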

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-11 Thread Andres Freund
On 2013-12-12 00:31:25 +0900, MauMau wrote:
 What do you think of #5 and #6 when matching the above criteria?
 
 5. FATAL:  terminating walreceiver process due to administrator command
 6. FATAL:  terminating background worker \%s\ due to administrator command

Those are important if they happen outside a shutdown. So, if you really
want to remove them from there, you'd need to change the signalling to
handle the cases differently.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] autovacuum_work_mem

2013-12-11 Thread Simon Riggs
On 11 December 2013 14:50, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Dec 11, 2013 at 9:43 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 25 November 2013 21:51, Peter Geoghegan p...@heroku.com wrote:
 On Sun, Nov 24, 2013 at 9:06 AM, Simon Riggs si...@2ndquadrant.com wrote:
 VACUUM uses 6 bytes per dead tuple. And autovacuum regularly removes
 dead tuples, limiting their numbers.

 In what circumstances will the memory usage from multiple concurrent
 VACUUMs become a problem? In those circumstances, reducing
 autovacuum_work_mem will cause more passes through indexes, dirtying
 more pages and elongating the problem workload.

 Yes, of course, but if we presume that the memory for autovacuum
 workers to do everything in one pass simply isn't there, it's still
 better to do multiple passes.

 That isn't clear to me. It seems better to wait until we have the memory.

 My feeling is this parameter is a fairly blunt approach to the
 problems of memory pressure on autovacuum and other maint tasks. I am
 worried that it will not effectively solve the problem. I don't wish
 to block the patch; I wish to get to an effective solution to the
 problem.

 A better approach to handling memory pressure would be to globally
 coordinate workers so that we don't oversubscribe memory, allocating
 memory from a global pool.

 This is doubtless true, but that project is at least two if not three
 orders of magnitude more complex than what's being proposed here, and
 I don't think we should make the perfect the enemy of the good.

It looks fairly easy to estimate the memory needed for an auto vacuum,
since we know the scale factor and the tuple estimate. We can then use
the memory estimate to alter the scheduling of work. And/or we can use
actual memory usage and block auto vac workers if they need more
memory than is currently available because of other workers.
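
As a first cut, the ingredients for such an estimate are already in the
stats, using the 6 bytes per dead-tuple TID mentioned upthread (purely
illustrative query, not part of any patch):

    SELECT relname, pg_size_pretty(n_dead_tup * 6) AS est_vacuum_mem
    FROM pg_stat_user_tables
    ORDER BY n_dead_tup DESC LIMIT 10;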

We would still benefit from a new parameter in the above sketch, but
it would achieve something useful in practice.

That's about 2-3 days work and I know Peter can hack it. So the
situation is not perfection-sought-blocking-good, this is more like
fairly poor solution being driven through when a better solution is
available within the time and skills available.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extension Templates S03E11

2013-12-11 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Stephen Frost sfr...@snowman.net writes:
 The extra catalog tables which store SQL scripts in text columns is one
 of my main objections to the as-proposed Extension Templates.  I view
 those scripts as a poor man's definition of database objects which are
 defined properly in the catalog already.

 I have a very hard time understanding this objection.

Why?  I think it has considerable force.

 PL/SQL functions are just a SQL script stored as-is in the catalogs.

Those are the exception not the rule.

 Even views are stored in a textual way in the catalogs, albeit in a
 specific pre-processed format,

This is utter nonsense.  That representation has nothing to do with
the original text of the CREATE VIEW command, and the fact that we
store it as an ASCII string rather than some binary blob has more to do
with debuggability than anything else.  The important point is that we
can (and sometimes do) transform the view to something else based
on semantic understanding of what's in it.  And we also have the ability
to figure out what the view depends on, something that is mighty hard
to get out of a text blob.  (The fact that we don't have that for SQL
functions is a serious minus of our approach to functions.)

Stephen is concerned that a pure textual representation lacks any deep
semantic understanding of what's in the extension, and I think that's
indeed something to be concerned about.  It's perhaps not a 100% show
stopper, but it's something to be avoided unless the benefits of
storing pure text are overwhelming.  Which you don't seem to have
convinced people of.

 So while I hear your objection to the script in catalog idea Stephen,
 I think we should move forward. We don't have the luxury of only
 applying patches where no compromise has to be made, where everyone is
 fully happy with the solution we find as a community.

You've got that backwards.  We do have the luxury of rejecting new
features until people are generally satisfied that the basic design is
right.  There's no overlord decreeing that this must be in 9.4.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-11 Thread Kevin Grittner
MauMau maumau...@gmail.com wrote:
 From: Kevin Grittner kgri...@ymail.com

 FATAL is used when the problem is severe enough that the process
 or connection must end. It seems to me to be what should
 consistently be used when a client connection or its process must
 be terminated for a reason other than a client-side request to
 terminate.

 What do you think of #5 and #6 when matching the above criteria?

 5. FATAL:  terminating walreceiver process due to administrator
 command
 6. FATAL:  terminating background worker \%s\ due to
 administrator command

Those are client connections and their backends terminated for a
reason other than the client side of the connection requesting it. 
If we don't classify those as FATAL then the definition of FATAL
becomes much more fuzzy.  What would you define it to mean?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why the buildfarm is all pink

2013-12-11 Thread Andres Freund
On 2013-12-11 10:07:19 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-12-10 19:55:12 -0500, Tom Lane wrote:
  We need a more consistent strategy for this :-(

  Agreed, although I have no clue how it should look like. As a further
  datapoint I'll add that installcheck already regularly fails in HEAD if
  you have a HS standby connected via SR and hot_standby_feedback=on on
  the standby. Some plans just change from index(only) scans to sequential
  scans, presumably because of the lower xmin horizon changed the
  stats. Since there's nothing running on the standby in those cases,
  there has to be a pretty damn tiny window here somewhere.
 
 The case in create_index does a "vacuum analyze tenk1" and expects
 to get an index-only scan in the very next SQL command.  So any delay
 in considering the table all-visible could break that test.  I'm not
 sure if that's what you're talking about though.  We could easily
 create some more delay for that case, for instance by moving the
 vacuum step to copy.sql as I was idly speculating about upthread.
 Do you remember offhand where the failures are?

No, but they are easy enough to reproduce. Out of 10 runs, I've attached
the one with the most failures and checked that it seems to contain all
the failures from other runs. All of them probably could be fixed by
moving things around, but I am not sure how maintainable that approach
is :/

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
--- /home/andres/src/postgresql/src/test/regress/expected/create_index.out	2013-12-08 19:57:01.646559353 +0100
+++ /home/andres/build/postgres/dev-assert/vpath/src/test/regress/results/create_index.out	2013-12-11 16:54:00.043641015 +0100
@@ -2720,11 +2720,15 @@
 SELECT unique1 FROM tenk1
 WHERE unique1 IN (1,42,7)
 ORDER BY unique1;
-                      QUERY PLAN
--------------------------------------------------------
- Index Only Scan using tenk1_unique1 on tenk1
-   Index Cond: (unique1 = ANY ('{1,42,7}'::integer[]))
-(2 rows)
+                       QUERY PLAN
+---------------------------------------------------------
+ Sort
+   Sort Key: unique1
+   ->  Bitmap Heap Scan on tenk1
+         Recheck Cond: (unique1 = ANY ('{1,42,7}'::integer[]))
+         ->  Bitmap Index Scan on tenk1_unique1
+               Index Cond: (unique1 = ANY ('{1,42,7}'::integer[]))
+(6 rows)
 
 SELECT unique1 FROM tenk1
 WHERE unique1 IN (1,42,7)

==

--- /home/andres/src/postgresql/src/test/regress/expected/subselect.out	2013-12-11 16:42:18.791039738 +0100
+++ /home/andres/build/postgres/dev-assert/vpath/src/test/regress/results/subselect.out	2013-12-11 16:54:06.823376218 +0100
@@ -727,7 +727,7 @@
         ->  Seq Scan on public.tenk1 b
               Output: b.unique1, b.unique2, b.two, b.four, b.ten, b.twenty, b.hundred, b.thousand, b.twothousand, b.fivethous, b.tenthous, b.odd, b.even, b.stringu1, b.stringu2, b.string4
               SubPlan 1
-                ->  Index Only Scan using tenk1_unique1 on public.tenk1 a
+                ->  Seq Scan on public.tenk1 a
                       Output: a.unique1
 (10 rows)
 

==

--- /home/andres/src/postgresql/src/test/regress/expected/join.out	2013-12-08 19:57:01.649892559 +0100
+++ /home/andres/build/postgres/dev-assert/vpath/src/test/regress/results/join.out	2013-12-11 16:54:08.439979745 +0100
@@ -3354,16 +3354,16 @@
 explain (costs off)
   select count(*) from tenk1 a,
     tenk1 b join lateral (values(a.unique1)) ss(x) on b.unique2 = ss.x;
-                        QUERY PLAN
------------------------------------------------------------
+              QUERY PLAN
+---------------------------------------
  Aggregate
    ->  Hash Join
          Hash Cond: ("*VALUES*".column1 = b.unique2)
          ->  Nested Loop
-               ->  Index Only Scan using tenk1_unique1 on tenk1 a
+               ->  Seq Scan on tenk1 a
                ->  Values Scan on "*VALUES*"
          ->  Hash
-               ->  Index Only Scan using tenk1_unique2 on tenk1 b
+               ->  Seq Scan on tenk1 b
 (8 rows)
 
 select count(*) from tenk1 a,

==

--- /home/andres/src/postgresql/src/test/regress/expected/rowtypes.out	2013-12-08 19:14:10.614689754 +0100
+++ /home/andres/build/postgres/dev-assert/vpath/src/test/regress/results/rowtypes.out	2013-12-11 16:54:31.072429133 +0100
@@ -240,11 +240,15 @@
 select thousand, tenthous from tenk1
 where (thousand, tenthous) >= (997, 5000)
 order by thousand, tenthous;
-                 QUERY PLAN
-

Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 So I've done some empirical tests for a table generated by:
 create table sizeskew as (select i,j,repeat('i',i) from
 generate_series(1,1000) as i, generate_series(1,1000) as j);

 I find that using the whole block doesn't cause any problem with the
 avg_width field for the repeat column.That does reinforce my belief
 that we might not need any particularly black magic here.

 It does however cause a systemic error in the histogram bounds. It
 seems the median is systematically overestimated by more and more the
 larger the number of rows per block are used:

Hm.  You can only take N rows from a block if there actually are at least
N rows in the block.  So the sampling rule I suppose you are using is
"select up to N rows from each sampled block" --- and that is going to
favor the contents of blocks containing narrower-than-average rows.

Now in this case, it looks like that ought to favor rows with *smaller* i
values, but you say the median goes up not down.  So I'm not sure what's
going on.  I thought at first that TOAST compression might be part of the
explanation, but TOAST shouldn't kick in on rows with raw representation
narrower than 2KB.

Did you do a run with no upper limit on the number of rows per block?
Because I'm not sure that tests with a limit in place are a good guide
to what happens without it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add transforms feature

2013-12-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Dec 10, 2013 at 10:35 PM, Peter Eisentraut pete...@gmx.net wrote:
 Here is an updated patch that implements this, makes some of the
 documentation improvements that you suggested, and rebases everything.

 I'm still kinda unimpressed by this.  Behavior-changing GUC, uggh.

We should have learned by now that those are usually a bad idea.
In this case, we've got changes in the behavior of function calling,
which seems like not only a nightmare for debugging but a fertile
source of security issues.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] In-Memory Columnar Store

2013-12-11 Thread knizhnik

Hello!

Implementation of IMCS itself took me about two months (including 
testing and writing documentation).
But a huge part of the code had previously been written by me for other 
projects, so I reused it.
Most of the time was spent on integrating this code with PostgreSQL 
(I was not so familiar with it before).


Certainly implementations of columnar stores for Oracle (Oracle Database 
In-Memory Option), DB2 (BLU Acceleration), ... are more convenient for 
users: they can execute normal SQL queries and do not require users to 
learn new functions and approaches. But that requires a complete 
redesign of the query engine (or providing an alternative 
implementation). I was not able to do that.


This is why I try to provide the advantages of vertical data 
representation (vector operations, parallel execution, data skipping) 
as well as the advantages of fast access to in-memory data, as a 
standard PostgreSQL extension. There are obviously some limitations, 
and queries look more complicated than standard SQL...


But on the other side, it is possible to write queries which are hard 
to express in standard SQL.
For example, calculating split-adjusted prices cannot be done in SQL 
without using stored procedures.
To make usage of IMCS functions as simple as possible, I defined a 
large number of operators for the most popular operations.

For example Volume-Weighted-Average-Price can be calculated just as:

select Volume//Close as VWAP from Quote_get();

It is even shorter than the analogous SQL statement:

select sum(Close*Volume)/sum(Volume) as VWAP from Quote;


Concerning integration with PostgreSQL, there were several problems. 
Some of them seem to have no easy solution, but others are IMHO 
imperfections in PostgreSQL which I hope will be fixed sometime:


1. Calls in PL/pgSQL are very slow - about 1-2 microseconds on my 
computer. Just defining a per-row insertion trigger with an empty 
procedure doubles the insertion time for 6 million records - from 7 to 
15 seconds. If the trigger procedure is not empty, the time increases 
proportionally to the number of calls performed.
In my case, inserting the data and propagating it to the columnar store 
using a trigger takes about 80 seconds. But if I first load the data 
without triggers into a PostgreSQL table and then insert it into the 
columnar store using a load function (implemented in C), the total time 
is 7+9=16 seconds.
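
Schematically, with hypothetical names for the trigger function and the 
C-level loader:

    -- slow: per-row propagation into the columnar store via trigger
    CREATE TRIGGER quote_imcs AFTER INSERT ON Quote
        FOR EACH ROW EXECUTE PROCEDURE quote_insert_trigger();

    -- fast: bulk-load the already-inserted rows in one call
    SELECT Quote_load();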


Certainly I realize that PL/pgSQL is an interpreted language. But, for 
example, the equally interpreted Python is able to do 100 times more 
calls per second.
Unfortunately the profiler doesn't show a single bottleneck - it looks 
like the long call time is caused by the large overhead of initializing 
and resetting memory contexts and copying argument data.


2. Inefficient implementation of expanding composite-type columns using 
the (foo()).* clause. In this case the function foo() will be invoked 
as many times as there are fields in the returned composite type. Even 
with the call placed in the FROM list (thanks to lateral joins in 9.3), 
PostgreSQL still sometimes performs redundant calls, which can be 
avoided using a hack with an added OFFSET clause, as sketched below.
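
The usual form of that workaround looks like this (sketch; foo is a 
placeholder function returning a composite type):

    -- (foo(x)).* re-evaluates foo() once per result column;
    -- this form evaluates it once per row:
    SELECT (f).* FROM (SELECT foo(x) AS f FROM tab OFFSET 0) s;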


3. The 256GB limit on the size of a mapped shared memory segment on Linux.

Concerning the last problem: I have included in the IMCS distribution a 
much simpler patch which just sets the MAP_HUGETLB flag when

a) it is defined in the system headers, and
b) the requested memory size is larger than 256GB.

Right now, without this flag, PostgreSQL will simply fail to start in 
that case. But certainly it would be more correct to control this flag 
through a configuration parameter, because large pages can minimize MMU 
overhead and so increase speed even when the amount of memory used is 
less than 256GB (this is why Oracle uses them so widely).




(A function call takes about 2 microseconds. That is, if I write a 
trigger with an empty procedure, inserting 6 million objects takes 15 
seconds, whereas without the trigger the insertion takes only 7 
seconds...)



On 12/11/2013 06:33 PM, Merlin Moncure wrote:

On Mon, Dec 9, 2013 at 1:40 PM, knizhnik knizh...@garret.ru wrote:

Hello!

I want to annouce my implementation of In-Memory Columnar Store extension
for PostgreSQL:

  Documentation: http://www.garret.ru/imcs/user_guide.html
  Sources: http://www.garret.ru/imcs-1.01.tar.gz

Any feedbacks, bug reports and suggestions are welcome.

Vertical representation of data is stored in PostgreSQL shared memory.
This is why it is important to be able to utilize all available physical
memory.
Now servers with Tb or more RAM are not something exotic, especially in
financial world.
But there is limitation in Linux with standard 4kb pages  for maximal size
of mapped memory segment: 256Gb.
It is possible to overcome this limitation either by creating multiple
segments - but it requires too much changes in PostgreSQL memory manager.
Or just set MAP_HUGETLB flag (assuming that huge pages were allocated in the
system).

I found several messages related with MAP_HUGETLB flag, the most recent one
was from 21 of November:

Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-11 Thread MauMau

From: Tom Lane t...@sss.pgh.pa.us

Jim Nasby j...@nasby.net writes:

On 12/9/13 5:56 PM, Tom Lane wrote:

How so?  FATAL means an error that terminates your session, which
is exactly what these are.


Except in these cases the user never actually got a working session; 
their request was denied.


To be clear, from the client standpoint it's certainly fatal, but not 
from the server's point of view. This is fully expected behavior as far 
as the server is concerned. (Obviously it might be an error that caused 
the shutdown/recovery, but that's something different.)


Right, but as already pointed out in this thread, these messages are
worded from the client's point of view.  The client never got a working
connection seems to me to be an empty distinction.  If you got SIGTERM'd
before you could issue your first query, should that not be FATAL because
you'd not gotten any work done?

More generally, we also say FATAL for all sorts of entirely routine
connection failures, like wrong password or mistyped user name.  People
don't seem to have a problem with those.  Even if some do complain,
the costs of changing that behavior after fifteen-years-and-counting
would certainly exceed any benefit.


I agree that #1-#3 are of course reasonable when there's any client the user 
runs.  The problem is that #1 ("The database system is starting up") is output 
in the server log by pg_ctl.  In that case, there's no client the user is 
responsible for.  Why does a new DBA have to be worried about that FATAL 
message?  He didn't do anything wrong.


I thought adding options='-c log_min_messages=PANIC' to the connection 
string for PQping() in pg_ctl.c would make the message vanish, but it didn't. 
The reason is that connection options take effect in PostgresMain(), which 
is after checking the FATAL condition in ProcessStartupPacket().  Do you 
think there is any good solution?


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -d option for pg_isready is broken

2013-12-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Dec 11, 2013 at 9:35 AM, Andres Freund and...@2ndquadrant.com wrote:
 One use case is accessing a particular host when using DNS round robin
 to standbys in combination with SSL.

 Ah, interesting point.  And it's not inconceivable that some
 application out there could be using PQhost() to retrieve the host
 from an existing connection object and reusing that value for a new
 connection, in which case redefining it to sometimes return hostaddr
 would break things.  So I think we shouldn't do that.

I think the only reasonable way to fix this is to improve the logic
in psql, not turn PQhost() into a mess with no understandable definition.
If that means we need to add a separate PQhostaddr() query function,
so be it.  We won't be able to fix the complained-of bug in back branches,
but I'd rather live with that (it's basically just cosmetic anyway)
than risk introducing perhaps-not-so-cosmetic bugs into other existing
applications.

In general, I think the definition of these query functions ought to
be "what was the value of this parameter when the connection was made".
As such, I'm not even sure that the pgunixsocket behavior that's in
PQhost now is a good idea, much less that we should extend that hack
to cover DefaultHost.

There is room also for a function defined as "give me a textual
description of what I'm connected to", which is not meant to reflect any
single connection parameter but rather the total behavior.  Right now
I think PQhost is on the borderline of doing this instead of just
reporting the host parameter, but I think rather than pushing it
across that border we'd be better off to invent a function explicitly
charged with doing that.  That would give us room to do something
actually meaningful with host+hostaddr cases, for instance.  I think
really what ought to be printed in such cases is something like
"host-name (address IP-address)"; leaving out the former would be
unhelpful but leaving out the latter is outright misleading.

On the other hand, I'm not sure how much of a translatability problem
it'd be to wedge such a description into a larger message.  Might be
better to just put the logic into psql.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.8

2013-12-11 Thread Andres Freund
On 2013-12-10 19:11:03 -0500, Robert Haas wrote:
 Committed #1 (again).  Regarding this:
 
 +   /* XXX: we could also do this unconditionally, the space is used anyway */
 +   if (copy_oid)
 +       HeapTupleSetOid(key_tuple, HeapTupleGetOid(tp));
 
 I would like to put in a big +1 for doing that unconditionally.  I
 didn't make that change before committing, but I think it'd be a very
 good idea.

Patch attached.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
From 47c93d8e7afbcaef268de66246571cdc2f134c97 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Wed, 11 Dec 2013 17:20:27 +0100
Subject: [PATCH] Dep't of second thoughts: Always include oids in WAL logged
 replica identities.

Since replica identities are logged using the normal format for heap
tuples, the space for oids in WITH OIDS tables was already used, so
there's little point in only including the oid if it is included in
the configured IDENTITY.

Per comment from Robert Haas.
---
 src/backend/access/heap/heapam.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 249fffe..e647453 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -6638,7 +6638,6 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed, bool *
 	TupleDesc	idx_desc;
 	char		replident = relation->rd_rel->relreplident;
 	HeapTuple	key_tuple = NULL;
-	bool		copy_oid = false;
 	bool		nulls[MaxHeapAttributeNumber];
 	Datum		values[MaxHeapAttributeNumber];
 	int			natt;
@@ -6698,7 +6697,8 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed, bool *
 		int attno = idx_rel->rd_index->indkey.values[natt];
 
 		if (attno == ObjectIdAttributeNumber)
-			copy_oid = true;
+			/* copied below */
+			;
 		else if (attno < 0)
 			elog(ERROR, "system column in index");
 		else
@@ -6709,8 +6709,12 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed, bool *
 	*copy = true;
 	RelationClose(idx_rel);
 
-	/* XXX: we could also do this unconditionally, the space is used anyway */
-	if (copy_oid)
+	/*
+	 * Always copy oids if the table has them, even if not included in the
+	 * index. The space in the logged tuple is used anyway, so there's little
+	 * point in not including the information.
+	 */
+	if (relation->rd_rel->relhasoids)
 		HeapTupleSetOid(key_tuple, HeapTupleGetOid(tp));
 
 	/*
-- 
1.8.5.rc2.dirty


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] In-Memory Columnar Store

2013-12-11 Thread knizhnik

Hi,

It depends on what you mean by "transparently substitute".
If you want to be able to execute standard SQL queries against the 
columnar store, then it seems to be impossible without rewriting the 
executor. I provide another approach, based on calling standard 
functions which perform manipulations not on scalar types but on 
timeseries.


For example instead of standard SQL

select sum(ClosePrice) from Quote;

I will have to write:

select cs_sum(ClosePrice) from Quote_get();

It looks similar but not quite the same.
And for more complex queries the difference is larger.
For example the query

select sum(score*volenquired)/sum(volenquired) from DbItem group by 
(trader,desk,office);


can be written as

select agg_val, cs_cut(group_by, 'c22c30c10') from
  (select (cs_project_agg(ss1.*)).* from
    (select (s1).sum/(s2).sum, (s1).groups from DbItem_get() q,
            cs_hash_sum(q.score*q.volenquired, q.trader||q.desk||q.office) s1,
            cs_hash_sum(q.volenquired, q.trader||q.desk||q.office) s2) ss1) ss2;


Looks too complex, doesn't it?
But the first two lines are responsible for performing the reverse 
mapping: from the vertical data representation back to normal 
horizontal tuples.
The good thing is that this query executes more than 1000 times faster 
(with default PostgreSQL configuration parameters, except 
shared_buffers, which was set large enough to fit all data in memory).

On 12/11/2013 07:14 PM, k...@rice.edu wrote:

On Mon, Dec 09, 2013 at 11:40:41PM +0400, knizhnik wrote:

Hello!

I want to annouce my implementation of In-Memory Columnar Store
extension for PostgreSQL:

  Documentation: http://www.garret.ru/imcs/user_guide.html
  Sources: http://www.garret.ru/imcs-1.01.tar.gz

Any feedbacks, bug reports and suggestions are welcome.

Vertical representation of data is stored in PostgreSQL shared memory.
This is why it is important to be able to utilize all available
physical memory.

Hi,

This is very neat! The question I have, which applies to the matview
support as well, is "How can we transparently substitute usage of the
in-memory columnar store/matview in a SQL query?".

Regards,
Ken




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why the buildfarm is all pink

2013-12-11 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 Kevin Grittner kgri...@ymail.com wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
 I haven't touched matview.sql here; that seems like a distinct issue.

 I'll fix that.

 Done.

Thanks.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-11 Thread Tom Lane
MauMau maumau...@gmail.com writes:
 I agree that #1-#3 are of course reasonable when there's any client the user 
 runs.  The problem is that #1 ("The database system is starting up") is output 
 in the server log by pg_ctl.  In that case, there's no client the user is 
 responsible for.  Why does a new DBA have to be worried about that FATAL 
 message?  He didn't do anything wrong.

FATAL doesn't mean the DBA did something wrong.  It means we terminated
a client session.

The fundamental problem IMO is that you want to complicate the definition
of what these things mean as a substitute for DBAs learning something
about Postgres.  That seems like a fool's errand from here.  They're going
to have to learn what FATAL means sooner or later, and making it more
complicated just raises the height of that barrier.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-11 Thread MauMau

From: Andres Freund and...@2ndquadrant.com

On 2013-12-12 00:31:25 +0900, MauMau wrote:

5. FATAL:  terminating walreceiver process due to administrator command
6. FATAL:  terminating background worker \%s\ due to administrator 
command


Those are important if they happen outside a shutdown. So, if you really
want to remove them from there, you'd need to change the signalling to
handle the cases differently.


How are they important?  If someone mistakenly sends SIGTERM to walreceiver 
or background workers, they are automatically relaunched later by the 
postmaster or startup process, like other background processes.  But other 
background processes such as walsender, bgwriter, etc. don't emit FATAL 
messages.


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why the buildfarm is all pink

2013-12-11 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-12-11 10:07:19 -0500, Tom Lane wrote:
 Do you remember offhand where the failures are?

 No, but they are easy enough to reproduce. Out of 10 runs, I've attached
 the one with the most failures and checked that it seems to contain all
 the failures from other runs. All of them probably could be fixed by
 moving things around, but I am not sure how maintainable that approach
 is :/

Thanks for doing the legwork.  These all seem to be cases where the
planner decided against doing an index-only scan on tenk1, which is
presumably because its relallvisible fraction is too low.  But these are
later in the test series than the vacuum analyze tenk1 that's currently
present in create_index, and most of them are even later than the
database-wide VACUUM in sanity_check.  So those vacuums are failing to
mark the table as all-visible, even though it's not changed since the COPY
test.  This seems odd.  Do you know why your slave server is holding back
the xmin horizon so much?

After looking at this, I conclude that moving the vacuums earlier would
probably make things worse not better, because the critical interval seems
to be from the COPY TO tenk1 command to the vacuum command.  So the idea
of putting vacuums into the COPY test is a bad one, and I'll proceed with
the patch I posted yesterday for moving the ANALYZE steps around.  I think
fixing what you're seeing is going to be a different issue.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Tom Lane
I wrote:
 Hm.  You can only take N rows from a block if there actually are at least
 N rows in the block.  So the sampling rule I suppose you are using is
 "select up to N rows from each sampled block" --- and that is going to
 favor the contents of blocks containing narrower-than-average rows.

Oh, no, wait: that's backwards.  (I plead insufficient caffeine.)
Actually, this sampling rule discriminates *against* blocks with
narrower rows.  You previously argued, correctly I think, that
sampling all rows on each page introduces no new bias because row
width cancels out across all sampled pages.  However, if you just
include up to N rows from each page, then rows on pages with more
than N rows have a lower probability of being selected, but there's
no such bias against wider rows.  This explains why you saw smaller
values of i being undersampled.

Had you run the test series all the way up to the max number of
tuples per block, which is probably a couple hundred in this test,
I think you'd have seen the bias go away again.  But the takeaway
point is that we have to sample all tuples per page, not just a
limited number of them, if we want to change it like this.
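
To put a number on it: if a sampled page holds n live rows and we keep
at most N of them, a row on that page is included with probability

    P(row kept | page sampled) = min(N, n) / n

which equals 1 for every page only once N reaches the largest per-page
row count; on denser pages it drops to N/n, which is exactly the
undersampling observed above.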

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-11 Thread MauMau

From: Kevin Grittner kgri...@ymail.com

5. FATAL: terminating walreceiver process due to administrator
command
6. FATAL: terminating background worker \%s\ due to
administrator command


Those are client connections and their backends terminated for a
reason other than the client side of the connection requesting it.
If we don't classify those as FATAL then the definition of FATAL
becomes much more fuzzy. What would you define it to mean?


I'm sorry to cause you trouble, but my understanding is that those are not 
client connections.  They are just background server processes; walreceiver 
is started by startup process, and background workers are started by 
extension modules.  Am I misunderstanding something?


According to Table 18-1 in the manual:

http://www.postgresql.org/docs/current/static/runtime-config-logging.html

the definition of FATAL is:

FATAL
Reports an error that caused the current session to abort.

This does not apply to the above messages, because there is no error.  The 
DBA just shut down the database server, and the background processes 
terminated successfully.


If some message output is desired, LOG's definition seems the nearest:

LOG
Reports information of interest to administrators, e.g., checkpoint 
activity.


So, I thought ereport(LOG, ...); proc_exit(0) is more appropriate than 
ereport(FATAL, ...).  Is this so strange?


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-11 Thread Josh Berkus
On 12/11/2013 08:48 AM, Tom Lane wrote:
 The fundamental problem IMO is that you want to complicate the definition
 of what these things mean as a substitute for DBAs learning something
 about Postgres.  That seems like a fool's errand from here.  They're going
 to have to learn what FATAL means sooner or later, and making it more
 complicated just raises the height of that barrier.

I don't think it works to change the NOTICE/ERROR/FATAL tags; for one
thing, I can hear the screaming about people's log scripts from here.

However, it would really be useful to have an extra tag (in addition to
the ERROR or FATAL) for If you're seeing this message, something has
gone seriously wrong on the server.  Just stuff like corruption
messages, backend crashes, etc.

Otherwise we're requiring users to come up with an alphabet soup of
regexes to filter out the noise error messages from the really, really
important ones.  Speaking as someone who does trainings for new DBAs,
the part where I cover "what to look for in the logs" takes over an hour
and still doesn't cover everything.  And it doesn't internationalize.
That's nasty.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stats for network traffic WIP

2013-12-11 Thread Peter Eisentraut
On 12/10/13, 5:08 PM, Tom Lane wrote:
 Having said that, I can't get very excited about this feature anyway,
 so I'm fine with rejecting the patch.  I'm not sure that enough people
 care to justify any added overhead at all.  The long and the short of
 it is that network traffic generally is what it is, for any given query
 workload, and so it's not clear what's the point of counting it.

Also, if we add this, the next guy is going to want to add CPU
statistics, memory statistics, etc.

Is there a reason why you can't get this directly from the OS?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Problem with displaying wide tables in psql

2013-12-11 Thread Sergey Muraviov
Hi.

I've improved the patch.
It works in expanded mode when either format option is set to wrapped
(\pset format wrapped), or we have no pager, or pager doesn't chop long
lines (so you can still use the trick).
Target output width is taken from either columns option (\pset columns 70),
or environment variable $COLUMNS, or terminal size.
And it's also compatible with any border style (\pset border 0|1|2).

Here are some examples:

postgres=# \x 1
postgres=# \pset format wrapped
postgres=# \pset border 0
postgres=# select * from wide_table;
* Record 1

value afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf  sadf sa df
sadfsadfa
  sd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f sadf sad fadsf
* Record 2

value afadsafasd fasdf asdfasd

postgres=# \pset border 1
postgres=# \pset columns 70
postgres=# select * from wide_table;
-[ RECORD 1 ]-
value | afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf  sadf sa
  | df sadfsadfasd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f
  |  sadf sad fadsf
-[ RECORD 2 ]-
value | afadsafasd fasdf asdfasd

postgres=# \pset border 2
postgres=# \pset columns 60
postgres=# select * from wide_table;
+-[ RECORD 1 ]-+
| value | afadsafasd fasdf asdfasd fsad fas df sadf sad f  |
|   | sadf  sadf sa df sadfsadfasd fsad fsa df sadf as |
|   | d fa sfd sadfsadf asdf sad f sadf sad fadsf  |
+-[ RECORD 2 ]-+
| value | afadsafasd fasdf asdfasd |
+---+--+

Regards,
Sergey


2013/12/10 Jeff Janes jeff.ja...@gmail.com

 On Mon, Dec 2, 2013 at 10:45 PM, Sergey Muraviov 
 sergey.k.murav...@gmail.com wrote:

 Hi.

 Psql definitely have a problem with displaying wide tables.
 Even in expanded mode, they look horrible.
 So I tried to solve this problem.


 I get compiler warnings:

 print.c: In function 'print_aligned_vertical':
 print.c:1238: warning: ISO C90 forbids mixed declarations and code
 print.c: In function 'print_aligned_vertical':
 print.c:1238: warning: ISO C90 forbids mixed declarations and code

 But I really like this and am already benefiting from it.  No point in
 having the string of hyphens between every record wrap to be 30 lines long
 just because one field somewhere down the list does so.  And configuring
 the pager isn't much of a solution because the pager doesn't know that the
 hyphens are semantically different than the other stuff getting thrown at
 it.

 Cheers,

 Jeff






-- 
Best regards,
Sergey Muraviov
From be9f01777599dc5e84c417e5cae56459677a88d4 Mon Sep 17 00:00:00 2001
From: Sergey Muraviov sergey.k.murav...@gmail.com
Date: Wed, 11 Dec 2013 20:17:26 +0400
Subject: [PATCH] wrapped tables in expanded mode

---
 src/bin/psql/print.c | 123 ---
 1 file changed, 118 insertions(+), 5 deletions(-)

diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index 736225c..4c37f7d 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -124,6 +124,7 @@ const printTextFormat pg_utf8format =
 
 /* Local functions */
 static int	strlen_max_width(unsigned char *str, int *target_width, int encoding);
+static bool IsWrappingNeeded(const printTableContent *cont, bool is_pager);
 static void IsPagerNeeded(const printTableContent *cont, const int extra_lines, bool expanded,
 			  FILE **fout, bool *is_pager);
 
@@ -1234,6 +1235,45 @@ print_aligned_vertical(const printTableContent *cont, FILE *fout)
 			fprintf(fout, "%s\n", cont->title);
 	}
 
+	if (IsWrappingNeeded(cont, is_pager))
+	{
+		int output_columns = 0;
+		/*
+		 * Choose target output width: \pset columns, or $COLUMNS, or ioctl
+		 */
+		if (cont->opt->columns > 0)
+			output_columns = cont->opt->columns;
+		else
+		{
+			if (cont->opt->env_columns > 0)
+				output_columns = cont->opt->env_columns;
+#ifdef TIOCGWINSZ
+			else
+			{
+				struct winsize screen_size;
+
+				if (ioctl(fileno(stdout), TIOCGWINSZ, &screen_size) != -1)
+					output_columns = screen_size.ws_col;
+			}
+#endif
+		}
+
+		output_columns -= hwidth;
+
+		if (opt_border == 0)
+			output_columns -= 1;
+		else
+		{
+			output_columns -= 3; /* -+- */
+
+			if (opt_border > 1)
+				output_columns -= 4; /* +--+ */
+		}
+
+		if ((output_columns > 0) && (dwidth > output_columns))
+			dwidth = output_columns;
+	}
+
 	/* print records */
 	for (i = 0, ptr = cont->cells; *ptr; i++, ptr++)
 	{
@@ -1294,12 +1334,49 @@ print_aligned_vertical(const printTableContent *cont, FILE *fout)
 
 			if (!dcomplete)
 			{
-				if (opt_border < 2)
-					fprintf(fout, "%s\n", dlineptr[line_count].ptr);
+				if (dlineptr[line_count].width > dwidth)
+				{
+					int offset = 0;
+					int chars_to_output = dlineptr[line_count].width;
+					while (chars_to_output > 0)
+					{
+						int target_width, bytes_to_output;
+
+						if (offset > 0)

Re: [HACKERS] autovacuum_work_mem

2013-12-11 Thread Robert Haas
On Wed, Dec 11, 2013 at 10:41 AM, Simon Riggs si...@2ndquadrant.com wrote:
 It looks fairly easy to estimate the memory needed for an auto vacuum,
 since we know the scale factor and the tuple estimate. We can then use
 the memory estimate to alter the scheduling of work. And/or we can use
 actual memory usage and block auto vac workers if they need more
 memory than is currently available because of other workers.

 We would still benefit from a new parameter in the above sketch, but
 it would achieve something useful in practice.

 That's about 2-3 days work and I know Peter can hack it. So the
 situation is not perfection-sought-blocking-good, this is more like
 fairly poor solution being driven through when a better solution is
 available within the time and skills available.

I don't agree with that assessment.  Anything that involves changing
the scheduling of autovacuum is a major project that will legitimately
provoke much controversy.  Extensive testing will be needed to prove
that the new algorithm doesn't perform worse than the current
algorithm in any important cases.  I have my doubts about whether that
can be accomplished in an entire release cycle, let alone 2-3 days.
In contrast, the patch proposed does something that is easy to
understand, clearly safe, and an improvement over what we have now.

Quite apart from the amount of development time required, I think that
the idea that we would use the availability of memory to schedule work
is highly suspect.  You haven't given any details on what you think
that algorithm might look like, and I doubt that any simple solution
will do.  If running more autovacuum workers drives the machine into
swap, then we shouldn't, but we have no way of calculating what size
memory allocation will cause that to happen.  But NOT running
autovacuum workers isn't safe either, because it could cause table
bloat that then drives the machine into swap.  We have no way of
knowing whether that will happen either.  So I think your contention
that we have the necessary information available to make an
intelligent decision is incorrect.

Regardless, whether or not a more complex change is within Peter's
technical capabilities is utterly irrelevant to whether we should
adopt the proposed patch.  Your phrasing seems to imply that you would
not ask such a thing of a less-talented individual, and I think that
is utterly wrong.  Peter's technical acumen does not give us the right
to ask him to write a more complex patch as a condition of getting a
simpler one accepted.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stats for network traffic WIP

2013-12-11 Thread Atri Sharma
On Wed, Dec 11, 2013 at 11:12 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 12/10/13, 5:08 PM, Tom Lane wrote:
 Having said that, I can't get very excited about this feature anyway,
 so I'm fine with rejecting the patch.  I'm not sure that enough people
 care to justify any added overhead at all.  The long and the short of
 it is that network traffic generally is what it is, for any given query
 workload, and so it's not clear what's the point of counting it.

 Also, if we add this, the next guy is going to want to add CPU
 statistics, memory statistics, etc.

 Is there a reason why you can't get this directly from the OS?

I would say that it's more of a convenience to track the usage directly
from the database instead of setting up OS infrastructure to store it.

That said, it should be possible to do it directly at the OS level. Can
we think of adding this to pgtop, though?

I am just musing here.

Regards,

Atri


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Gavin Flower

On 12/12/13 06:22, Tom Lane wrote:

I wrote:

Hm.  You can only take N rows from a block if there actually are at least
N rows in the block.  So the sampling rule I suppose you are using is
"select up to N rows from each sampled block" --- and that is going to
favor the contents of blocks containing narrower-than-average rows.

Oh, no, wait: that's backwards.  (I plead insufficient caffeine.)
Actually, this sampling rule discriminates *against* blocks with
narrower rows.  You previously argued, correctly I think, that
sampling all rows on each page introduces no new bias because row
width cancels out across all sampled pages.  However, if you just
include up to N rows from each page, then rows on pages with more
than N rows have a lower probability of being selected, but there's
no such bias against wider rows.  This explains why you saw smaller
values of i being undersampled.

Had you run the test series all the way up to the max number of
tuples per block, which is probably a couple hundred in this test,
I think you'd have seen the bias go away again.  But the takeaway
point is that we have to sample all tuples per page, not just a
limited number of them, if we want to change it like this.

regards, tom lane

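To put Tom's argument in symbols (notation mine, not from the thread): if a
sampled page holds m rows and the rule keeps at most N of them, then each row
on that page is kept with probability

    P(\mathrm{row\ kept} \mid \mathrm{page\ sampled}) = \min(1, N/m)

so rows on pages with m > N -- i.e. pages packed with narrow rows -- are
penalized, while sampling all rows on a page makes this factor 1 for every
row and lets the page-level inclusion probability cancel across row widths.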

Surely we want to sample a 'constant fraction' (obviously, in practice 
you have to sample an integral number of rows in a page!) of rows per 
page? The simplest way, as Tom suggests, is to use all the rows in a page.


However, if you wanted the same number of rows from a greater number of 
pages, you could (for example) select a quarter of the rows from each 
page.  In which case, when this is a fractional number: take the 
integral number of rows, plus one extra row with a probability equal to 
the fraction (here 0.25).
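
A minimal C sketch of that fractional rule (my illustration, not code from
the thread; assumes the drand48() generator has been seeded):

#include <math.h>
#include <stdlib.h>

/*
 * Rows to sample from a page holding rows_on_page rows when targeting
 * the fraction f (e.g. 0.25): take the integral part of the target,
 * plus one extra row with probability equal to the leftover fraction.
 */
static int
rows_to_sample(int rows_on_page, double f)
{
	double	target = rows_on_page * f;
	int		n = (int) floor(target);

	if (drand48() < target - n)
		n++;					/* the Bernoulli "extra row" */
	return n;
}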


Either way, if it is determined that you need N rows, then keep 
selecting pages at random (but never use the same page more than once) 
until you have at least N rows.



Cheers,
Gavin



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Gavin Flower

On 12/12/13 06:22, Tom Lane wrote:

I wrote:

Hm.  You can only take N rows from a block if there actually are at least
N rows in the block.  So the sampling rule I suppose you are using is
"select up to N rows from each sampled block" --- and that is going to
favor the contents of blocks containing narrower-than-average rows.

Oh, no, wait: that's backwards.  (I plead insufficient caffeine.)
Actually, this sampling rule discriminates *against* blocks with
narrower rows.  You previously argued, correctly I think, that
sampling all rows on each page introduces no new bias because row
width cancels out across all sampled pages.  However, if you just
include up to N rows from each page, then rows on pages with more
than N rows have a lower probability of being selected, but there's
no such bias against wider rows.  This explains why you saw smaller
values of i being undersampled.

Had you run the test series all the way up to the max number of
tuples per block, which is probably a couple hundred in this test,
I think you'd have seen the bias go away again.  But the takeaway
point is that we have to sample all tuples per page, not just a
limited number of them, if we want to change it like this.

regards, tom lane



Hmm...

In my previous reply, which hasn't shown up yet!

I realized I made a mistake!

The fraction/probability could be any of 0.25, 0.50, and 0.75.


Cheers,
Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] In-Memory Columnar Store

2013-12-11 Thread Merlin Moncure
On Wed, Dec 11, 2013 at 10:08 AM, knizhnik knizh...@garret.ru wrote:
 1. Calls in PL/pgSQL are very slow - about 1-2 microseconds on my computer.
 Just defining a per-row insertion trigger with an empty procedure doubles the
 time to insert 6 million records - from 7 to 15 seconds. If the trigger
 procedure is not empty, the time increases in proportion to the number of
 calls performed.
 In my case, inserting data and propagating it to the columnar store using a
 trigger takes about 80 seconds. But if I first load the data into a
 PostgreSQL table without triggers and then insert it into the columnar store
 using a load function (implemented in C), the time is 7+9=16 seconds.

Yeah. For this problem, we unfortunately either have to try to use
standard SQL functions in such a way that supports inlining (this
is a black art mostly, and fragile), or move logic out of the function
and into the query via things like window functions, or just deal with
the performance hit.  Postgres-flavored SQL is pretty much the most
productive language on the planet AFAIC, but the challenge is always
performance, performance.

Down the line, I am optimistic that per-call function overhead can be
optimized, probably by expanding what can be inlined somehow.  The
problem is that this requires cooperation from the language executors;
this is not currently possible through the SPI interface, so I really
don't know.

 Certainly I realize that plpgsql is an interpreted language. But, for
 example, interpreted Python is able to do 100 times more calls per second.
 Unfortunately the profiler doesn't show any single bottleneck - it looks like
 the long call time is caused by the large overhead of initializing and
 resetting the memory context and copying argument data.

 2. Inefficient implementation of expanding composite type columns using the
 (foo()).* clause. In this case function foo() will be invoked as many times
 as there are fields in the returned composite type. Even when the call is
 placed in the FROM list (thanks to lateral joins in 9.3), PostgreSQL still
 sometimes performs redundant calls which can be avoided using a hack with
 an added OFFSET 1 clause.

Yeah, this is a long-standing headache.  LATERAL mostly deals with this,
but most cases (even pre-9.3) can be worked around one way or
another.

 3. The 256GB limit on shared memory segment size on Linux.

I figure this will be solved fairly soon. It's a nice problem to have.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] autovacuum_work_mem

2013-12-11 Thread Josh Berkus
On 12/11/2013 09:57 AM, Robert Haas wrote:
 I don't agree with that assessment.  Anything that involves changing
 the scheduling of autovacuum is a major project that will legitimately
 provoke much controversy.  Extensive testing will be needed to prove
 that the new algorithm doesn't perform worse than the current
 algorithm in any important cases.  I have my doubts about whether that
 can be accomplished in an entire release cycle, let alone 2-3 days.
 In contrast, the patch proposed does something that is easy to
 understand, clearly safe, and an improvement over what we have now.

+1

There is an inherent tuning and troubleshooting challenge in anything
involving a feedback loop.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Time-Delayed Standbys

2013-12-11 Thread Fabrízio de Royes Mello
On Wed, Dec 11, 2013 at 6:27 AM, Simon Riggs si...@2ndquadrant.com wrote:

 On 11 December 2013 06:36, KONDO Mitsumasa
 kondo.mitsum...@lab.ntt.co.jp wrote:

  I think this feature will be used in a lot of scenarios in
  which PITR is currently used.
 
  We have to judge which is better: gaining something potentially
  useful, or protecting against foolish use.
  And we had better wait for the author's comment...

 I'd say just document that it wouldn't make sense to use it for PITR.

 There may be some use case we can't see yet, so specifically
 prohibiting a use case that is not dangerous seems too much at this
 point. I will no doubt be reminded of these words in the future...


Hi all,

I tend to agree with Simon, but I confess that I don't like delaying a
server with standby_mode = 'off'.

The main goal of this patch is to delay Streaming Replication, so if the
slave server isn't a hot standby I think it makes no sense to delay it.

Mitsumasa suggested adding StandbyModeRequested to the conditional branch to
skip this situation. I agree with him!

And I'll change 'recoveryDelay' (functions, variables) to 'standbyDelay'.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] logical changeset generation v6.8

2013-12-11 Thread Robert Haas
On Wed, Dec 11, 2013 at 11:25 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-12-10 19:11:03 -0500, Robert Haas wrote:
 Committed #1 (again).  Regarding this:

 +   /* XXX: we could also do this unconditionally, the space is used anyway */
 +   if (copy_oid)
 +       HeapTupleSetOid(key_tuple, HeapTupleGetOid(tp));

 I would like to put in a big +1 for doing that unconditionally.  I
 didn't make that change before committing, but I think it'd be a very
 good idea.

 Patch attached.

Committed with kibitzing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -d option for pg_isready is broken

2013-12-11 Thread Robert Haas
On Wed, Dec 11, 2013 at 11:24 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Wed, Dec 11, 2013 at 9:35 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
 One use case is accessing a particular host when using DNS round robin
 to standbys in combination with SSL.

 Ah, interesting point.  And it's not inconceivable that some
 application out there could be using PQhost() to retrieve the host
 from an existing connection object and reusing that value for a new
 connection, in which case redefining it to sometimes return hostaddr
 would break things.  So I think we shouldn't do that.

 I think the only reasonable way to fix this is to improve the logic
 in psql, not turn PQhost() into a mess with no understandable definition.
 If that means we need to add a separate PQhostaddr() query function,
 so be it.  We won't be able to fix the complained-of bug in back branches,
 but I'd rather live with that (it's basically just cosmetic anyway)
 than risk introducing perhaps-not-so-cosmetic bugs into other existing
 applications.

I can't argue with that.

 In general, I think the definition of these query functions ought to
 be what was the value of this parameter when the connection was made.
 As such, I'm not even sure that the pgunixsocket behavior that's in
 PQhost now is a good idea, much less that we should extend that hack
 to cover DefaultHost.

Well, returning /tmp on Windows is just stupid.  I don't see why we
should feel bad about changing that.  A bug is a bug.

 There is room also for a function defined as give me a textual
 description of what I'm connected to, which is not meant to reflect any
 single connection parameter but rather the total behavior.  Right now
 I think PQhost is on the borderline of doing this instead of just
 reporting the host parameter, but I think rather than pushing it
 across that border we'd be better off to invent a function explicitly
 charged with doing that.  That would give us room to do something
 actually meaningful with host+hostaddr cases, for instance.  I think
 really what ought to be printed in such cases is something like
 "host-name (address IP-address)"; leaving out the former would be
 unhelpful but leaving out the latter is outright misleading.

 On the other hand, I'm not sure how much of a translatability problem
 it'd be to wedge such a description into a larger message.  Might be
 better to just put the logic into psql.

libpq needs to expose enough functionality to make this simple for
psql, but psql should be the final arbiter of the output format.
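
A sketch of what that could look like (PQhost() exists in libpq today;
PQhostaddr() is only the accessor proposed upthread, so it is hypothetical
here):

#include <stdio.h>
#include <libpq-fe.h>

/*
 * Render Tom's suggested "host-name (address IP-address)" form.
 * PQhostaddr() stands in for whatever accessor libpq would expose
 * for the hostaddr parameter; it does not exist yet.
 */
static void
describe_connection(PGconn *conn, char *buf, size_t buflen)
{
	const char *host = PQhost(conn);
	const char *addr = PQhostaddr(conn);	/* hypothetical */

	if (addr && *addr)
		snprintf(buf, buflen, "%s (address %s)", host ? host : "?", addr);
	else
		snprintf(buf, buflen, "%s", host ? host : "?");
}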

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-11 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 However, it would really be useful to have an extra tag (in addition to
 the ERROR or FATAL) for If you're seeing this message, something has
 gone seriously wrong on the server.  Just stuff like corruption
 messages, backend crashes, etc.

Right, we've discussed that idea elsewhere; there's a basically orthogonal
classification that needs to happen.  Pretty much all PANICs are high
priority from a DBA's perspective, but only a subset of either FATAL or
ERROR are.  Somebody needs to do the legwork to determine just what kind
of classification scheme we want and propose at least an initial set of
ereports to be so marked.

One thought I had was that we could probably consider the default behavior
(in the absence of any call of an explicit criticality-marking function)
to be like this:
for ereport(), it's critical if a PANIC and otherwise not
for elog(), it's critical if >= ERROR level, otherwise not.
The rationale for this is that we generally use elog for
not-supposed-to-happen cases, so those are probably interesting.
If we start getting complaints about some elog not being so interesting,
we can convert it to an ereport so it can include an explicit marking
call.
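
As a sketch only (illustrative names; PANIC and ERROR are the elevels from
utils/elog.h, the function itself is hypothetical):

#include "postgres.h"		/* for the ERROR and PANIC elevel macros */

/* ereports are critical only at PANIC; elogs at ERROR and above */
static bool
default_criticality(bool from_ereport, int elevel)
{
	if (from_ereport)
		return elevel == PANIC;
	return elevel >= ERROR;
}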

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Completing PL support for Event Triggers

2013-12-11 Thread Peter Eisentraut
On 12/11/13, 5:06 AM, Dimitri Fontaine wrote:
 Peter Eisentraut pete...@gmx.net writes:
 I think you are mistaken.  My patch includes all changes between your v1
 and v2 patch.
 
 I mistakenly remembered that we did remove all the is_event_trigger
 business from the plperl patch too, when it's not the case. Sorry about
 this confusion.
 
 My vote is for “ready for commit” then.

PL/Perl was committed.

Please update the commit fest entry with your plans about PL/Python.
(Returned with Feedback or move to next CF or close and create a
separate entry?)



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] stats for network traffic WIP

2013-12-11 Thread Tom Lane
Atri Sharma atri.j...@gmail.com writes:
 On Wed, Dec 11, 2013 at 11:12 PM, Peter Eisentraut pete...@gmx.net wrote:
 Is there a reason why you can't get this directly from the OS?

 I would say that its more of a convenience to track the usage directly
 from the database instead of setting up OS infrastructure to store it.

The thing that I'm wondering is why the database would be the right place
to be measuring it at all.  If you've got a network usage problem,
aggregate usage across everything on the server is probably what you
need to be worried about, and PG can't tell you that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] In-Memory Columnar Store

2013-12-11 Thread Kevin Grittner
k...@rice.edu k...@rice.edu wrote:

 The question I have, which applies to the matview support as
  well, is "How can we transparently substitute usage of the
  in-memory columnar store/matview in a SQL query?".

My take on that regarding matviews is:

(1)  It makes no sense to start work on this without a far more
sophisticated concept of matview freshness (or staleness, as
some products prefer to call it).

(2)  Work on query rewrite to use sufficiently fresh matviews to
optimize the execution of a query and work on freshness tracking
are orthogonal to work on incremental maintenance.

I have no plans to work on either matview freshness or rewrite, as
there seems to be several years worth of work to get incremental
maintenance up to a level matching other products.  I welcome
anyone else to take on those other projects.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extension Templates S03E11

2013-12-11 Thread Robert Haas
On Wed, Dec 11, 2013 at 10:43 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  So while I hear your objection to the "script in catalog" idea, Stephen,
 I think we should move forward. We don't have the luxury of only
 applying patches where no compromise has to be made, where everyone is
 fully happy with the solution we find as a community.

 You've got that backwards.  We do have the luxury of rejecting new
 features until people are generally satisfied that the basic design is
 right.  There's no overlord decreeing that this must be in 9.4.

I strongly agree.  PostgreSQL has succeeded because we try not to do
things at all until we're sure we know how to do them right.
Sometimes we lag behind in features or performance as a result of that
- but the upside is that when we say something now works, it does.
Moreover, it means that the number of bad design decisions we're left
to support off into eternity is comparatively small.  Those things are
of great benefit to our community.

I can certainly understand Dimitri's frustration, in that he's written
several versions of this patch and none have been accepted.  But what
that means is that none of those approaches have consensus behind
them, which is another way of saying that, as a community, we really
*don't* know the best way to solve this problem, and our community
policy in that situation is to take no action until we do.  I've
certainly had my own share of disappointments about patches I've
written which I believed, and in some cases still believe, to be
really good work, and I'd really like to be able to force them
through.  But that's not how it works.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -d option for pg_isready is broken

2013-12-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Dec 11, 2013 at 11:24 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 In general, I think the definition of these query functions ought to
 be what was the value of this parameter when the connection was made.
 As such, I'm not even sure that the pgunixsocket behavior that's in
 PQhost now is a good idea, much less that we should extend that hack
 to cover DefaultHost.

 Well, returning /tmp on Windows is just stupid.  I don't see why we
 should feel bad about changing that.  A bug is a bug.

What I was suggesting was we should take out the pgunixsocket fallback,
not make it even more complicated.  That probably implies that we need
still another accessor function to get the socket path.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] preserving forensic information when we freeze

2013-12-11 Thread Robert Haas
On Thu, Nov 21, 2013 at 4:51 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-11-21 15:59:35 -0500, Robert Haas wrote:
  * Should HeapTupleHeaderXminFrozen also check for FrozenTransactionId?
   It seems quite possible that people think they've dealt with frozen
xmin entirely after checking, but they still might get
FrozenTransactionId back in a pg_upgraded cluster.

 The reason I originally wrote the patch the way I did, rather than the
 way that you prefer, is that it minimizes the number of places where
 we might perform extra tests that are known not to be needed in
 context.  These code paths are hot.

 The patch as sent shouldn't really do that in any of paths I know to be
 hot - it uses *RawXmin() there.

 If you do this sort of thing, then after macro expansion we may end up with
 a lot of things like:
 (flags & FROZEN) || (rawxid == 2) ? 2 : rawxid.  I want to avoid that.

 But in which cases would that actually be slower? There'll be no
 additional code executed if the hint bits for frozen are set, and in
 case not it will usually safe us an external function call to
 TransactionIdPrecedes().

Dunno.  It's at least more code generation.
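
For readers following along, the two shapes under discussion look roughly
like this (a sketch only; HEAP_XMIN_FROZEN and HeapTupleHeaderGetRawXmin are
my assumptions about the patch's naming, not the committed macros):

/* flags-only test: just inspects the infomask bits */
#define XminFrozenByFlags(tup) \
	(((tup)->t_infomask & HEAP_XMIN_FROZEN) == HEAP_XMIN_FROZEN)

/* Andres's variant: also treat a raw xmin of FrozenTransactionId
 * (i.e. 2, as still seen in pg_upgraded clusters) as frozen */
#define XminFrozenByFlagsOrXid(tup) \
	(XminFrozenByFlags(tup) || \
	 HeapTupleHeaderGetRawXmin(tup) == FrozenTransactionId)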

  That macro is intended, specifically, to be a test for flag bits,
 and I think it should do precisely that.  If that's not what you want,
 then don't use that macro.

 That's a fair argument. Although there's several HeapTupleHeader* macros
 that muck with stuff besides infomask.

Sure, but that doesn't mean they ALL have to.

  * Existing htup_details boolean checks contain an 'Is', but
HeapTupleHeaderXminCommitted, HeapTupleHeaderXminInvalid,
HeapTupleHeaderXminFrozen don't contain any verb. Not sure.

 We could say XminIsCommitted, XminIsInvalid, XminIsFrozen, etc.  I
 don't particularly care for it, but I can see the argument for it.

 I don't have a clear preference either, I just noticed the inconsistency
 and wasn't sure whether it was intentional.

It was intentional enough.  :-)

  I think once we have this we should start opportunistically try to
  freeze tuples during vacuum using OldestXmin instead of FreezeLimit if
  the page is already dirty.

 Separate patch, but yeah, something like that.  If we have to mark the
 page all-visible, we might as well freeze it while we're there.  We
 should think about how it interacts with Heikki's freeze-without-write
 patch though.

 Definitely separate yes. And I agree, it's partially moot if Heikki's
 patch gets in, but I am not sure it will make it into 9.4. There seems
 to be quite some work left.

I haven't heard anything further from Heikki, so I'm thinking we
should proceed with this approach.  It seems to be the path of least
resistance, if not essential, for making CLUSTER freeze everything
automatically, a change almost everyone seems to really want.  Even if
we did have Heikki's stuff, making cluster freeze more aggressively is
still a good argument for doing this.  The pages can then be marked
all-visible (something Bruce is working on) and never need to be
revisited.  Without this, I don't think we can get there.  If we also
handle the vacuum-dirtied-it-already case as you propose here, I think
we'd have quite a respectable improvement in vacuum behavior for 9.4,
even without Heikki's stuff.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Gavin Flower

On 12/12/13 07:22, Gavin Flower wrote:

On 12/12/13 06:22, Tom Lane wrote:

I wrote:
Hm.  You can only take N rows from a block if there actually are at least
N rows in the block.  So the sampling rule I suppose you are using is
"select up to N rows from each sampled block" --- and that is going to
favor the contents of blocks containing narrower-than-average rows.

Oh, no, wait: that's backwards.  (I plead insufficient caffeine.)
Actually, this sampling rule discriminates *against* blocks with
narrower rows.  You previously argued, correctly I think, that
sampling all rows on each page introduces no new bias because row
width cancels out across all sampled pages.  However, if you just
include up to N rows from each page, then rows on pages with more
than N rows have a lower probability of being selected, but there's
no such bias against wider rows.  This explains why you saw smaller
values of i being undersampled.

Had you run the test series all the way up to the max number of
tuples per block, which is probably a couple hundred in this test,
I think you'd have seen the bias go away again.  But the takeaway
point is that we have to sample all tuples per page, not just a
limited number of them, if we want to change it like this.

regards, tom lane


Surely we want to sample a 'constant fraction' (obviously, in practice 
you have to sample an integral number of rows in a page!) of rows per 
page? The simplest way, as Tom suggests, is to use all the rows in a 
page.


However, if you wanted the same number of rows from a greater number 
of pages, you could (for example) select a quarter of the rows from 
each page.  In which case, when this is a fractional number: take the 
integral number of rows, plus one extra row with a probability equal to 
the fraction (here 0.25).


Either way, if it is determined that you need N rows, then keep 
selecting pages at random (but never use the same page more than once) 
until you have at least N rows.



Cheers,
Gavin




Yes, the fraction/probability could actually be one of: 0.25, 0.50, 0.75.

But there is a bias introduced by the arithmetic average size of the 
rows in a page. This results in block sampling favouring large rows, as 
they are in a larger proportion of pages.


For example, assume 1000 rows of 200 bytes and 1000 rows of 20 bytes, 
using 400 byte pages.  In the pathologically worst case, assuming 
maximum packing density and no page has both types: the large rows would 
occupy 500 pages and the smaller rows 50 pages. So if one selected 11 
pages at random, you get about 10 pages of large rows and about one for 
small rows!  In practice, it would be much less extreme - for a start, 
not all blocks will be fully packed, most blocks would have both types 
of rows, and there is usually greater variation in row size - but still 
a bias towards sampling larger rows.  So somehow, this bias needs to be 
counteracted.



Cheers,
Gavin




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -d option for pg_isready is broken

2013-12-11 Thread Robert Haas
On Wed, Dec 11, 2013 at 2:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Wed, Dec 11, 2013 at 11:24 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 In general, I think the definition of these query functions ought to
 be what was the value of this parameter when the connection was made.
 As such, I'm not even sure that the pgunixsocket behavior that's in
 PQhost now is a good idea, much less that we should extend that hack
 to cover DefaultHost.

 Well, returning /tmp on Windows is just stupid.  I don't see why we
 should feel bad about changing that.  A bug is a bug.

 What I was suggesting was we should take out the pgunixsocket fallback,
 not make it even more complicated.  That probably implies that we need
 still another accessor function to get the socket path.

Well, I guess.  I have a hard time seeing whatever rejiggering we want
to do in master as a reason not to back-patch that fix, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] preserving forensic information when we freeze

2013-12-11 Thread Heikki Linnakangas

On 12/11/2013 09:17 PM, Robert Haas wrote:

On Thu, Nov 21, 2013 at 4:51 PM, Andres Freund and...@2ndquadrant.com wrote:

On 2013-11-21 15:59:35 -0500, Robert Haas wrote:

Separate patch, but yeah, something like that.  If we have to mark the
page all-visible, we might as well freeze it while we're there.  We
should think about how it interacts with Heikki's freeze-without-write
patch though.


Definitely separate yes. And I agree, it's partially moot if Heikki's
patch gets in, but I am not sure it will make it into 9.4. There seems
to be quite some work left.


I haven't heard anything further from Heikki, so I'm thinking we
should proceed with this approach.


+1. It seems unlikely that my patch is going to make it into 9.4.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -d option for pg_isready is broken

2013-12-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Dec 11, 2013 at 2:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Well, returning /tmp on Windows is just stupid.  I don't see why we
 should feel bad about changing that.  A bug is a bug.

 What I was suggesting was we should take out the pgunixsocket fallback,
 not make it even more complicated.  That probably implies that we need
 still another accessor function to get the socket path.

 Well, I guess.  I have a hard time seeing whatever rejiggering we want
 to do in master as a reason not to back-patch that fix, though.

I guess as long as the pgunixsocket thing is in there, it makes sense
to substitute DefaultHost for it on Windows, but are we sure that's
something to back-patch?

Right now, as I was saying, PQhost is in some gray area where it's not too
clear what its charter is.  It's not what was the host parameter, for
sure, but we haven't tried to make it an accurate description of the
connection either.  It's a bit less accurate on Windows than elsewhere,
but do we want to risk breaking anything to only partially resolve that?

More generally, if we do go over in 9.4 to the position that PQhost
reports the host parameter and nothing but, I'm not sure that introducing
a third behavior into the back branches is something that anybody will
thank us for.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Kevin Grittner
Gavin Flower gavinflo...@archidevsys.co.nz wrote:

 For example, assume 1000 rows of 200 bytes and 1000 rows of 20 bytes,
 using 400 byte pages.  In the pathologically worst case, assuming
 maximum packing density and no page has both types: the large rows would
 occupy  500 pages and the smaller rows 50 pages. So if one selected 11
 pages at random, you get about 10 pages of large rows and about one for
 small rows!

With 10 * 2 = 20 large rows, and 1 * 20 = 20 small rows.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.3 reference constraint regression

2013-12-11 Thread Daniel Wood
In 9.3 I can delete the parent of a parent-child relation if the child row
is an uncommitted insert and I first update the parent.

USER1:
drop table child;
drop table parent;
create table parent (i int, c char(3));
create unique index parent_idx on parent (i);
insert into parent values (1, 'AAA');
create table child (i int references parent(i));

USER2:
BEGIN;
insert into child values (1);

USER1:
BEGIN;
update parent set c=lower(c);
delete from parent;
COMMIT;

USER2:
COMMIT;

Note that the problem also happens if the update is "set i=i".  I was
expecting this update to block, as the UPDATE is on a unique index that
can be used in a foreign key.  The i=i update should get an UPDATE lock
and not a NO KEY UPDATE lock, as I believe the c=... update does.


Re: [HACKERS] autovacuum_work_mem

2013-12-11 Thread Peter Geoghegan
On Wed, Dec 11, 2013 at 7:41 AM, Simon Riggs si...@2ndquadrant.com wrote:
 That's about 2-3 days work and I know Peter can hack it. So the
 situation is not perfection-sought-blocking-good, this is more like a
 fairly poor solution being driven through when a better solution is
 available within the time and skills available.

I think that that's a very optimistic assessment of the amount of work
required. Even by the rose-tinted standards of software project time
estimation. A ton of data is required to justify fundamental
infrastructural changes like that.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -d option for pg_isready is broken

2013-12-11 Thread Robert Haas
On Wed, Dec 11, 2013 at 2:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Wed, Dec 11, 2013 at 2:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Well, returning /tmp on Windows is just stupid.  I don't see why we
 should feel bad about changing that.  A bug is a bug.

 What I was suggesting was we should take out the pgunixsocket fallback,
 not make it even more complicated.  That probably implies that we need
 still another accessor function to get the socket path.

 Well, I guess.  I have a hard time seeing whatever rejiggering we want
 to do in master as a reason not to back-patch that fix, though.

 I guess as long as the pgunixsocket thing is in there, it makes sense
 to substitute DefaultHost for it on Windows, but are we sure that's
 something to back-patch?

Well, it seems like a clear case of returning a ridiculous value, but
I'm willing to be talked out of it if someone can explain how it would
break things.  I guess it's possible someone could have code out that
that tests for the exact value /tmp and does something based on that,
but that seems a stretch - and if they did have such code, it would
probably just handle it by substituting localhost anyway.

 Right now, as I was saying, PQhost is in some gray area where it's not too
 clear what its charter is.  It's not what was the host parameter, for
 sure, but we haven't tried to make it an accurate description of the
 connection either.  It's a bit less accurate on Windows than elsewhere,
 but do we want to risk breaking anything to only partially resolve that?

I guess it depends on how risky we think it is.

 More generally, if we do go over in 9.4 to the position that PQhost
 reports the host parameter and nothing but, I'm not sure that introducing
 a third behavior into the back branches is something that anybody will
 thank us for.

It doesn't seem very plausible to say that we're just going to
redefine it that way, unless we're planning to bump the soversion.
But maybe we should decide what we *are* going to do in master first,
before deciding what to back-patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] autovacuum_work_mem

2013-12-11 Thread Simon Riggs
On 11 December 2013 17:57, Robert Haas robertmh...@gmail.com wrote:

 Extensive testing will be needed to prove
 that the new algorithm doesn't perform worse than the current
 algorithm in any important cases.

Agreed, but the amount of testing seems equivalent in both cases,
assuming we weren't going to skip it for this patch.

Let me repeat the question, so we are clear...
In what circumstances will the memory usage from multiple concurrent
VACUUMs become a problem? In those circumstances, reducing
autovacuum_work_mem will cause more passes through indexes, dirtying
more pages and elongating the problem workload. I agree that multiple
concurrent VACUUMs could be a problem but this
doesn't solve that, it just makes things worse.

The *only* time this parameter would have any effect looks like when
it will make matters worse.

With considerable regret, I don't see how this solves the problem at
hand. We can and should do better.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Gavin Flower

On 12/12/13 08:14, Gavin Flower wrote:

On 12/12/13 07:22, Gavin Flower wrote:

On 12/12/13 06:22, Tom Lane wrote:

I wrote:
Hm.  You can only take N rows from a block if there actually are at least
N rows in the block.  So the sampling rule I suppose you are using is
"select up to N rows from each sampled block" --- and that is going to
favor the contents of blocks containing narrower-than-average rows.

Oh, no, wait: that's backwards.  (I plead insufficient caffeine.)
Actually, this sampling rule discriminates *against* blocks with
narrower rows.  You previously argued, correctly I think, that
sampling all rows on each page introduces no new bias because row
width cancels out across all sampled pages.  However, if you just
include up to N rows from each page, then rows on pages with more
than N rows have a lower probability of being selected, but there's
no such bias against wider rows.  This explains why you saw smaller
values of i being undersampled.

Had you run the test series all the way up to the max number of
tuples per block, which is probably a couple hundred in this test,
I think you'd have seen the bias go away again.  But the takeaway
point is that we have to sample all tuples per page, not just a
limited number of them, if we want to change it like this.

regards, tom lane


Surely we want to sample a 'constant fraction' (obviously, in 
practice you have to sample an integral number of rows in a page!) of 
rows per page? The simplest way, as Tom suggests, is to use all the 
rows in a page.


However, if you wanted the same number of rows from a greater number 
of pages, you could (for example) select a quarter of the rows from 
each page.  In which case, when this is a fractional number: take the 
integral number of rows, plus one extra row with a probability equal 
to the fraction (here 0.25).


Either way, if it is determined that you need N rows, then keep 
selecting pages at random (but never use the same page more than 
once) until you have at least N rows.



Cheers,
Gavin




Yes, the fraction/probability could actually be one of: 0.25, 0.50, 0.75.

But there is a bias introduced by the arithmetic average size of the 
rows in a page. This results in block sampling favouring large rows, 
as they are in a larger proportion of pages.


For example, assume 1000 rows of 200 bytes and 1000 rows of 20 bytes, 
using 400 byte pages.  In the pathologically worst case, assuming 
maximum packing density and no page has both types: the large rows 
would occupy 500 pages and the smaller rows 50 pages. So if one 
selected 11 pages at random, you get about 10 pages of large rows and 
about one for small rows!  In practice, it would be much less extreme 
- for a start, not all blocks will be fully packed, most blocks would 
have both types of rows, and there is usually greater variation in row 
size - but still a bias towards sampling larger rows.  So somehow, 
this bias needs to be counteracted.



Cheers,
Gavin

Actually, I just thought of a possible way to overcome the bias towards 
large rows.


1. Calculate (a rough estimate may be sufficient, if not too 'rough')
   the size of the smallest row.

2. Select a page at random (never selecting the same page twice)

3. Then select rows at random within the page (never selecting the same
   row twice).  For each row selected, accept it with the probability
   equal to (size of smallest row)/(size of selected row).  I think you'll
   find that will almost completely offset the bias towards larger rows!

4. If you do not have sufficient rows, and you still have pages not yet
   selected, goto 2

Note that it will be normal for some pages not to have any rows 
selected, especially for large tables!
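
A tiny C sketch of the acceptance test in step 3 (illustration only; assumes
drand48() is seeded and the row sizes are known):

#include <stdlib.h>

/*
 * Accept a visited row with probability min_row_size / this_row_size:
 * a row twice the minimum size is kept half as often, roughly cancelling
 * the extra pages that wide rows occupy.
 */
static int
accept_row(double min_row_size, double this_row_size)
{
	return drand48() < min_row_size / this_row_size;
}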



Cheers,
Gavin

 P.S.
I really need to stop thinking about this problem, and get on with my 
assigned project!!!





Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Peter Geoghegan
On Tue, Dec 10, 2013 at 4:48 PM, Peter Geoghegan p...@heroku.com wrote:
 Why would I even mention that to a statistician? We want guidance. But
 yes, I bet I could give a statistician an explanation of statistics
 target that they'd understand without too much trouble.

Actually, I think that if we told a statistician about the statistics
target, his or her response would be: why would you presume to know
ahead of time what statistics target is going to be effective? I
suspect that the basic problem is that it isn't adaptive. I think that
if we could somehow characterize the quality of our sample as we took
it, and then cease sampling when we reached a certain degree of
confidence in its quality, that would be helpful. It might not even
matter that the sample was clustered from various blocks.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE sampling is too good

2013-12-11 Thread Gavin Flower

On 12/12/13 08:31, Kevin Grittner wrote:

Gavin Flower gavinflo...@archidevsys.co.nz wrote:


For example, assume 1000 rows of 200 bytes and 1000 rows of 20 bytes,
using 400 byte pages.  In the pathologically worst case, assuming
maximum packing density and no page has both types: the large rows would
occupy  500 pages and the smaller rows 50 pages. So if one selected 11
pages at random, you get about 10 pages of large rows and about one for
small rows!

With 10 * 2 = 20 large rows, and 1 * 20 = 20 small rows.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Sorry, I've simply come up with well-argued nonsense!

Kevin, you're dead right.


Cheers,
Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] -d option for pg_isready is broken

2013-12-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Dec 11, 2013 at 2:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 More generally, if we do go over in 9.4 to the position that PQhost
 reports the host parameter and nothing but, I'm not sure that introducing
 a third behavior into the back branches is something that anybody will
 thank us for.

 It doesn't seem very plausible to say that we're just going to
 redefine it that way, unless we're planning to bump the soversion.

Well, we didn't bump the soversion (nor touch the documentation)
in commit f6a756e4, which is basically what I'm suggesting we ought
to revert.  It was nothing but a quick hack at the time, and hindsight
is saying it was a bad idea.  Admittedly, it was long enough ago that
there might be some grandfather status attached to the current behavior;
but that argument can't be made for changing its behavior still further.

 But maybe we should decide what we *are* going to do in master first,
 before deciding what to back-patch.

Right.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extension Templates S03E11

2013-12-11 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 You've got that backwards.  We do have the luxury of rejecting new
 features until people are generally satisfied that the basic design is
 right.  There's no overlord decreeing that this must be in 9.4.

 I strongly agree.  PostgreSQL has succeeded because we try not to do
 things at all until we're sure we know how to do them right.

I still agree with the principle, or I wouldn't even try. Not in detail,
because the current design passed all the usual criteria a year ago.

  http://www.postgresql.org/message-id/6466.1354817...@sss.pgh.pa.us

 I can certainly understand Dimitri's frustration, in that he's written
 several versions of this patch and none have been accepted.  But what

The design was accepted, last year. It took a year to review it, which
is fair enough, only to find new problems again. Circles at their best.
You just said on another thread that perfect is the enemy of good. What
about applying the same line of thought to this patch?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] autovacuum_work_mem

2013-12-11 Thread Robert Haas
On Wed, Dec 11, 2013 at 2:37 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 11 December 2013 17:57, Robert Haas robertmh...@gmail.com wrote:
 Extensive testing will be needed to prove
 that the new algorithm doesn't perform worse than the current
 algorithm in any important cases.

 Agreed, but the amount of testing seems equivalent in both cases,
 assuming we weren't going to skip it for this patch.

 Let me repeat the question, so we are clear...
 In what circumstances will the memory usage from multiple concurrent
 VACUUMs become a problem? In those circumstances, reducing
 autovacuum_work_mem will cause more passes through indexes, dirtying
 more pages and elongating the problem workload. I agree that multiple
 concurrent VACUUMs could be a problem but this
 doesn't solve that, it just makes things worse.

That's not the problem the patch is designed to solve.  It's intended
for the case where you want to allow more or less memory to autovacuum
than you do for index builds.  There's no principled reason that
anyone should want those things to be the same.  It is not difficult
to imagine situations in which you would want one set to a very
different value than the other.  In particular it seems quite likely
to me that the amount of memory appropriate for index builds might be
vastly more than is needed by autovacuum.  For example, in a
data-warehousing environment where updates are rare but large index
builds by the system's sole user are frequent, someone might want to
default index builds to 64GB of RAM (especially after Noah's patch to
allow huge allocations for the tuple array while sorting) but only
need 256MB for autovacuum.
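
For that data-warehousing case, the two limits would then be decoupled along
these lines (a hypothetical postgresql.conf excerpt, assuming the proposed
GUC keeps the name autovacuum_work_mem; the values just mirror the example):

maintenance_work_mem = 64GB	# large index builds by the system's sole user
autovacuum_work_mem = 256MB	# separate, smaller cap for autovacuum workers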

In general, I'm reluctant to believe that Peter proposed this patch
just for fun.  I assume this is a real-world problem that Heroku
encounters in their environment.  If not, well then that's different.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] autovacuum_work_mem

2013-12-11 Thread Josh Berkus
On 12/11/2013 11:37 AM, Simon Riggs wrote:
 On 11 December 2013 17:57, Robert Haas robertmh...@gmail.com wrote:

 Extensive testing will be needed to prove
 that the new algorithm doesn't perform worse than the current
 algorithm in any important cases.

 Agreed, but the amount of testing seems equivalent in both cases,
 assuming we weren't going to skip it for this patch.

No performance testing is required for this patch.  The effect of memory
limits on vacuum are already well-known and well-understood.

 With considerable regret, I don't see how this solves the problem at
 hand. We can and should do better.

I strongly disagree.  The problem we are dealing with currently is that
two resource limits which should have *always* been independent of each
other are currently conflated into a single GUC variable.  This forces
users to remember to set maintenance_work_mem interactively every time
they want to run a manual VACUUM, because the setting in postgresql.conf
is needed to tune autovacuum.

In other words, we are having an issue with *non-atomic data*, and this
patch partially fixes that.

Would it be better to have an admissions-control policy engine for
launching autovacuum which takes into account available RAM, estimated
costs of concurrent vacuums, current CPU activity, and which tables are
in cache?  Yes.  And if you started on that now, you might have it ready
for 9.5.

And, for that matter, accepting this patch by no means blocks doing
something more sophisticated in the future.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extension Templates S03E11

2013-12-11 Thread Stephen Frost
Dimitri,

* Dimitri Fontaine (dimi...@2ndquadrant.fr) wrote:
 Stephen Frost sfr...@snowman.net writes:
  The extra catalog tables which store SQL scripts in text columns are one
  of my main objections to the as-proposed Extension Templates.  I view
  those scripts as a poor man's definition of database objects which are
  defined properly in the catalog already.
 
 I have a very hard time to understand this objection.
 
 PL/SQL functions are just a SQL script stored as-is in the catalogs.
 That applies the same way to any other PL language too, with scripts
 stored as-is in the catalogs in different languages.

Sure- but in those cases only the actual function (which is, by
definition, for an *interpreted* language..) is stored as text, not
the definition of the function (eg: the CREATE FUNCTION statement), nor
all of the metadata, dependency information, etc.  Also, what you're
proposing would result in having *both* in the same catalog- the
canonical form defined in pg_proc and friends, and the SQL text blob in
the extension template catalog and I simply do not see value in that.

 So while I hear your objection to the "script in catalog" idea, Stephen,
 I think we should move forward. We don't have the luxury of only
 applying patches where no compromise has to be made, where everyone is
 fully happy with the solution we find as a community.

I understand that you wish to push this forward regardless of anyone's
concerns.  While I appreciate your frustration and the time you've spent
on this, that isn't going to change my opinion of this approach.

   The other big issue is that
  there isn't an easy way to see how we could open up the ability to
  create extensions to non-superusers with this approach.
 
 The main proposal here is to only allow the owner of a template to
 install it as an extension. For superusers, we can implement the needed
 SET ROLE command automatically in the CREATE EXTENSION command.
 
 Is there another security issue that this “same role” approach is not
 solving? I don't think so.

This isn't kind, and for that I'm sorry, but this feels, to me, like a
very hand-wavey "well, I think this would solve all the problems" answer
to the concerns raised.  I can't answer offhand if this would really
solve all of the issues because I've not tried to implement it or test
it out, but I tend to doubt that it would.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Reference to parent query from ANY sublink

2013-12-11 Thread Kevin Grittner
Kevin Grittner kgri...@ymail.com wrote:

 I applied it to master and ran the regression tests, and one of
 the subselect tests failed.

 This query:

 SELECT '' AS six, f1 AS "Correlated Field", f3 AS "Second Field"
   FROM SUBSELECT_TBL upper
   WHERE f1 IN
     (SELECT f2 FROM SUBSELECT_TBL WHERE CAST(upper.f2 AS float) = f3);

 [ ... ] during the `make check` or `make install-check` [ ... ]
 is missing the last two rows.  Oddly, if I go into the database
 later and try it, the rows show up.  It's not immediately
 apparent to me what's wrong.

Using the v2 patch, with the default statistics from table
creation, the query modified with an alias of lower for the
second reference, just for clarity, yields a plan which generates
incorrect results:

 Hash Join  (cost=37.12..80.40 rows=442 width=12) (actual time=0.059..0.064 rows=3 loops=1)
   Hash Cond: (((upper.f2)::double precision = lower.f3) AND (upper.f1 = lower.f2))
   ->  Seq Scan on subselect_tbl upper  (cost=0.00..27.70 rows=1770 width=16) (actual time=0.006..0.007 rows=8 loops=1)
   ->  Hash  (cost=34.12..34.12 rows=200 width=12) (actual time=0.020..0.020 rows=5 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 1kB
         ->  HashAggregate  (cost=32.12..34.12 rows=200 width=12) (actual time=0.014..0.018 rows=6 loops=1)
               ->  Seq Scan on subselect_tbl lower  (cost=0.00..27.70 rows=1770 width=12) (actual time=0.002..0.004 rows=8 loops=1)
 Total runtime: 0.111 ms

As soon as there is a VACUUM and/or ANALYZE it generates a plan
more like what the OP was hoping for:

 Hash Semi Join  (cost=1.20..2.43 rows=6 width=12) (actual time=0.031..0.036 rows=5 loops=1)
   Hash Cond: (((upper.f2)::double precision = lower.f3) AND (upper.f1 = lower.f2))
   ->  Seq Scan on subselect_tbl upper  (cost=0.00..1.08 rows=8 width=16) (actual time=0.004..0.007 rows=8 loops=1)
   ->  Hash  (cost=1.08..1.08 rows=8 width=12) (actual time=0.012..0.012 rows=7 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 1kB
         ->  Seq Scan on subselect_tbl lower  (cost=0.00..1.08 rows=8 width=12) (actual time=0.003..0.005 rows=8 loops=1)
 Total runtime: 0.074 ms

By comparison, without the patch this is the plan:

 Seq Scan on subselect_tbl upper  (cost=0.00..5.59 rows=4 width=12) (actual time=0.022..0.037 rows=5 loops=1)
   Filter: (SubPlan 1)
   Rows Removed by Filter: 3
   SubPlan 1
     ->  Seq Scan on subselect_tbl lower  (cost=0.00..1.12 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=8)
           Filter: ((upper.f2)::double precision = f3)
           Rows Removed by Filter: 4
 Total runtime: 0.066 ms

When I run the query with fresh statistics and without EXPLAIN both
ways, the unpatched is consistently about 10% faster.

So pulling up the subquery can yield an incorrect plan, and even
when it yields the desired plan with the semi-join it is
marginally slower than using the subplan, at least for this small
data set.  I think it's an interesting idea, but it still needs
work.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

