Re: [HACKERS] RFC: Making TRUNCATE more MVCC-safe

2012-03-04 Thread Simon Riggs
On Sun, Mar 4, 2012 at 2:28 AM, Marti Raudsepp ma...@juffo.org wrote:
 On Sat, Mar 3, 2012 at 14:53, Simon Riggs si...@2ndquadrant.com wrote:
 Thanks Noah for drawing attention to this thread. I hadn't been
 watching. As you say, this work would allow me to freeze rows at load
 time and avoid the overhead of hint bit setting, which avoids
 performance issues from hint bit setting in checksum patch.

 I've reviewed this patch and it seems OK to me. Good work Marti.

 Thanks! This approach wasn't universally liked, but if it gets us
 tangible benefits (COPY with frozenxid) then I guess it's a reason to
 reconsider.

Comments I see support this idea. If we did this purely for truncate
correctness I probably wouldn't care either, which is why I didn't
read this thread in the first place.

 v2 patch attached, updated to latest HEAD. Patch adds
 * relvalidxid is reset by VACUUM/ANALYZE to avoid wraparound failure

 Personally I'd rather keep this out of ANALYZE -- since its purpose is
 to collect stats; VACUUM is responsible for correctness (xid
 wraparound etc). But I don't feel strongly about this.

If there were a reason to do it, then I would agree. Later you point
out a reason, so I will make this change.

 A more important consideration is how this interacts with hot standby.
 Currently you compare OldestXmin to relvalidxmin to decide when to
 reset it. But the standby's OldestXmin may be older than the master's.
 (If VACUUM removes rows then this causes a recovery conflict, but
 AFAICT that won't happen if only relvalidxmin changes)

As of 9.1, the standby's oldestxmin is incorporated into the master's
via hot_standby_feedback, so it wouldn't typically be a problem these
days. It's possible that the standby has this set off by choice, in
which case anomalies could exist in the case that a VACUUM doesn't
clean any rows, as you say.

So we'll use vacrelstats->latestRemovedXid instead of OldestXmin when
we call vac_update_relstats(), which means ANALYZE should always pass
InvalidTransactionId.

 It might be more robust to wait until relfrozenxid exceeds
 relvalidxmin -- by then, recovery conflict mechanisms will have taken
 care of killing all older snapshots, or am I mistaken?

It would be better to set it as early as possible to reduce the cost
of the test in heap_beginscan().

 And a few typos in the code...

 + gettext_noop("When enabled viewing a truncated or newly created table "
 + "will throw a serialization error to prevent MVCC "
 + "discrepancies that were possible priot to 9.2.")

 "prior" not "priot"

Yep

 + * Reset relvalidxmin if its old enough

 Should be "it's" in this context.

Cool, thanks for the review.

v3 attached.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index c910863..4387f49 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -73,7 +73,7 @@
 
 /* GUC variable */
 bool		synchronize_seqscans = true;
-
+bool		StrictMVCC = true;
 
 static HeapScanDesc heap_beginscan_internal(Relation relation,
 		Snapshot snapshot,
@@ -1175,6 +1175,24 @@ heap_beginscan_internal(Relation relation, Snapshot snapshot,
 	HeapScanDesc scan;
 
 	/*
+	 * If the snapshot is older than relvalidxmin then that means either the
+	 * table has only recently been created or a TRUNCATE has removed data
+	 * that we should still be able to see. Either way, we aren't allowed
+	 * to view the rows in StrictMVCC mode.
+	 */
+	if (StrictMVCC &&
+		TransactionIdIsValid(relation->rd_rel->relvalidxmin) &&
+		TransactionIdIsValid(snapshot->xmax) &&
+		NormalTransactionIdPrecedes(snapshot->xmax, relation->rd_rel->relvalidxmin))
+	{
+		ereport(ERROR,
+(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+ errmsg("canceling statement due to conflict with TRUNCATE TABLE on %s",
+		 NameStr(relation->rd_rel->relname)),
+ errdetail("User query attempting to see row versions that have been removed.")));
+	}
+
+	/*
 	 * increment relation ref count while scanning relation
 	 *
 	 * This is just to make really sure the relcache entry won't go away while
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index aef410a..3f9bd5d 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -787,6 +787,7 @@ InsertPgClassTuple(Relation pg_class_desc,
 	values[Anum_pg_class_relhastriggers - 1] = BoolGetDatum(rd_rel->relhastriggers);
 	values[Anum_pg_class_relhassubclass - 1] = BoolGetDatum(rd_rel->relhassubclass);
 	values[Anum_pg_class_relfrozenxid - 1] = TransactionIdGetDatum(rd_rel->relfrozenxid);
+	values[Anum_pg_class_relvalidxmin - 1] = TransactionIdGetDatum(rd_rel->relvalidxmin);
 	if (relacl != (Datum) 0)
 		values[Anum_pg_class_relacl - 1] = relacl;
 	else
@@ -884,6 +885,7 @@ AddNewRelationTuple(Relation pg_class_desc,
 		new_rel_reltup->relfrozenxid = InvalidTransactionId;
 	}
 
+	new_rel_reltup->relvalidxmin = InvalidTransactionId;
 	

Re: [HACKERS] xlog location arithmetic

2012-03-04 Thread Magnus Hagander
On Sun, Feb 26, 2012 at 00:53, Euler Taveira de Oliveira
eu...@timbira.com wrote:
 On 25-02-2012 09:23, Magnus Hagander wrote:
 Do we even *need* the validate_xlog_location() function? If we just
 remove those calls, won't we still catch all the incorrectly formatted
 ones in the errors of the sscanf() calls? Or am I too deep into
 weekend-mode and missing something obvious?

 sscanf() is too fragile for input sanity checks. Try
 pg_xlog_location_diff('12/3', '-10/0'), for example. I won't object to removing
 that function if you protect xlog location input from silly users.

Ah, good point. No, that's the reason I was missing :-)

Patch applied, thanks!
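
For context, a minimal standalone C sketch (assumed here for illustration
only) of the fragility Euler points out: %X parses like strtoul() with base
16, so a leading minus sign is accepted and silently wrapped rather than
rejected:

#include <stdio.h>

int
main(void)
{
	unsigned int hi, lo;

	/* %X accepts an optional sign, per strtoul() semantics */
	if (sscanf("-10/0", "%X/%X", &hi, &lo) == 2)
		printf("parsed as %X/%X\n", hi, lo);	/* prints FFFFFFF0/0 */
	return 0;
}

So the sscanf() calls alone would accept '-10/0' as a WAL location, which is
why a separate validation step is still needed.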

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] xlog location arithmetic

2012-03-04 Thread Magnus Hagander
On Tue, Feb 28, 2012 at 07:21, Fujii Masao masao.fu...@gmail.com wrote:
 On Sun, Feb 26, 2012 at 8:53 AM, Euler Taveira de Oliveira
 eu...@timbira.com wrote:
 On 25-02-2012 09:23, Magnus Hagander wrote:
 Do we even *need* the validate_xlog_location() function? If we just
 remove those calls, won't we still catch all the incorrectly formatted
 ones in the errors of the sscanf() calls? Or am I too deep into
 weekend-mode and missing something obvious?

 sscanf() is too fragile for input sanity check. Try
 pg_xlog_location_diff('12/3', '-10/0'), for example. I won't object removing
 that function if you protect xlog location input from silly users.

 After this patch has been committed, it would be better to change
 pg_xlogfile_name() and pg_xlogfile_name_offset() so that they use
 the validate_xlog_location() function to validate the input.

And I've done this part as well.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Our regex vs. POSIX on longest match

2012-03-04 Thread Brendan Jurd
On 4 March 2012 17:53, Tom Lane t...@sss.pgh.pa.us wrote:
 Brendan Jurd dire...@gmail.com writes:
 I'll admit that this is a pretty obscure point, but we do appear to be
 in direct violation of POSIX here.

 How so?  POSIX doesn't contain any non-greedy constructs.  If you use
 only the POSIX-compatible greedy constructs, the behavior is compliant.

While it's true that POSIX doesn't contemplate non-greed, after
reading the spec I would have expected an expression *as a whole* to
still prefer the longest match, insofar as that is possible while
respecting non-greedy particles.  I just ran some quick experiments in
Perl, Python and PHP, using 'xy1234' ~ 'y*?\d+'.  All returned
'y1234', which to my mind is the most obvious answer, and one which
still makes sense in light of what POSIX has to say.  Whereas Postgres
(and presumably Tcl) return 'y1'.
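
For reference, POSIX's leftmost-longest rule is easy to see with the greedy
form of the same pattern via <regex.h>; POSIX ERE has no non-greedy syntax,
so this sketch can only show the greedy side:

#include <regex.h>
#include <stdio.h>

int
main(void)
{
	regex_t		re;
	regmatch_t	m;
	const char *s = "xy1234";

	/* greedy POSIX ERE: the leftmost-longest match is "y1234" */
	regcomp(&re, "y*[0-9]+", REG_EXTENDED);
	if (regexec(&re, s, 1, &m, 0) == 0)
		printf("matched \"%.*s\"\n", (int) (m.rm_eo - m.rm_so), s + m.rm_so);
	regfree(&re);
	return 0;
}

The question in this thread is how far that longest-match preference should
survive once non-greedy quantifiers enter the picture.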

What I found surprising here is that our implementation allows an
earlier quantifier to clobber the greediness of a later quantifier.
There's a certain disregard for the intentions of the author of the
regex.  If I had wanted the whole expression to be non-greedy, I could
have written 'y*?\d+?'.  But since I wrote 'y*?\d+', it is clear that
I meant for the first atom to be non-greedy, and for the second to be
greedy.

 The issue that is obscure is, once you define some non-greedy
 constructs, how to define how they should act in combination with greedy
 ones.  I'm not sure to what extent the engine's behavior is driven by
 implementation restrictions and to what extent it's really the sanest
 behavior Henry could think of.  I found a comment from him about it:
 http://groups.google.com/group/comp.lang.tcl/msg/c493317cc0d10d50
 but it's short on details as to what alternatives he considered.

Thanks for the link; it is always good to get more insight into
Henry's approach.  I'm beginning to think I should just start reading
everything he ever posted to comp.lang.tcl ...

Cheers,
BJ



Re: [HACKERS] RFC: Making TRUNCATE more MVCC-safe

2012-03-04 Thread Simon Riggs
On Sun, Mar 4, 2012 at 9:59 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Sun, Mar 4, 2012 at 2:28 AM, Marti Raudsepp ma...@juffo.org wrote:
 On Sat, Mar 3, 2012 at 14:53, Simon Riggs si...@2ndquadrant.com wrote:
 Thanks Noah for drawing attention to this thread. I hadn't been
 watching. As you say, this work would allow me to freeze rows at load
 time and avoid the overhead of hint bit setting, which avoids
 performance issues from hint bit setting in checksum patch.

 I've reviewed this patch and it seems OK to me. Good work Marti.

...

 v3 attached.

More detailed thoughts show that the test in heap_beginscan_internal()
is not right enough, i.e. wrong.

We need a specific XidInMVCCSnapshot test on the relvalidxid, so it
needs to be a specific xid, not an xmin, because otherwise we can get
concurrent transactions failing, not just older transactions.
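
To see why, here is a rough sketch (not the patch's code, and ignoring
subtransactions; field names follow PostgreSQL's SnapshotData) of how
visibility of an xid is decided under an MVCC snapshot. The xip[] lookup in
the middle is exactly what a bare xmin/xmax comparison misses: a concurrent
TRUNCATE's xid falls between xmin and xmax and is only caught there.

/* Sketch: is the work of "xid" visible to "snap"? (inverse of XidInMVCCSnapshot) */
static bool
xid_visible_in_snapshot(TransactionId xid, Snapshot snap)
{
	int			i;

	if (TransactionIdPrecedes(xid, snap->xmin))
		return true;			/* completed before the snapshot was taken */
	if (TransactionIdFollowsOrEquals(xid, snap->xmax))
		return false;			/* started after the snapshot was taken */
	for (i = 0; i < snap->xcnt; i++)
	{
		if (TransactionIdEquals(xid, snap->xip[i]))
			return false;		/* was in progress when the snapshot was taken */
	}
	return true;
}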

If we're going to freeze tuples on load this needs to be watertight, so
some minor rework is needed.

Of course, if we only have a valid xid on the class we might get new
columns added when we do repeated SELECT * statements using the same
snapshot while concurrent DDL occurs. That is impractical, so if we
define this as applying to rows it can work; if we want it to apply to
everything it's getting more difficult.

Longer term we might fix this by making all catalog access use MVCC,
but that suffers the same problems with ALTER TABLEs that rewrite rows
to add columns. I don't see a neat future solution that is worth
waiting for.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Collect frequency statistics for arrays

2012-03-04 Thread Alexander Korotkov
On Sun, Mar 4, 2012 at 5:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 1. I'm still unhappy about the loop that fills the count histogram,
 as I noted earlier today.  It at least needs a decent comment and some
 overflow protection, and I'm not entirely convinced that it doesn't have
 more bugs than the overflow issue.


The attached patch is focused on fixing this. The frac variable overflow is
avoided by making it int64. I hope the comments clarify things. In
general this loop copies the behaviour of the histogram-constructing loop
of the compute_scalar_stats function, but instead of a values array we
have an array of unique DECs and their frequencies.
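
As a self-contained illustration of what the loop computes (the numbers here
are toy values assumed for this sketch only): with DECs 1, 2 and 5 occurring
in 6, 3 and 1 arrays respectively (10 arrays total), a 5-point histogram
should pick DECs[i * 9 / 4] from the imaginary expanded array
{1,1,1,1,1,1,2,2,2,5}, i.e. {1,1,1,2,5}, and the frac loop reproduces that
without materializing the expanded array:

#include <stdio.h>
#include <stdint.h>

int
main(void)
{
	/* unique DEC values with their frequencies, sorted by value */
	int			counts[] = {1, 2, 5};
	int			freqs[] = {6, 3, 1};
	int			analyzed_rows = 10;
	int			num_hist = 5;
	int			delta = analyzed_rows - 1;
	int			idx = 0;
	int			i;
	int64_t		frac = (int64_t) freqs[0] * (num_hist - 1);

	for (i = 0; i < num_hist; i++)
	{
		/* a non-positive fraction means we have passed the current unique DEC */
		while (frac <= 0)
		{
			idx++;
			frac += (int64_t) freqs[idx] * (num_hist - 1);
		}
		printf("hist[%d] = %d\n", i, counts[idx]);	/* prints 1, 1, 1, 2, 5 */
		frac -= delta;
	}
	return 0;
}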

--
With best regards,
Alexander Korotkov.
*** a/src/backend/utils/adt/array_typanalyze.c
--- b/src/backend/utils/adt/array_typanalyze.c
***
*** 581,587  compute_array_stats(VacAttrStats *stats, AnalyzeAttrFetchFunc fetchfunc,
  			DECountItem **sorted_count_items;
  			int			count_item_index;
  			int			delta;
! 			int			frac;
  			float4	   *hist;
  
  			/* num_hist must be at least 2 for the loop below to work */
--- 581,587 
  			DECountItem **sorted_count_items;
  			int			count_item_index;
  			int			delta;
! 			int64		frac;
  			float4	   *hist;
  
  			/* num_hist must be at least 2 for the loop below to work */
***
*** 612,633  compute_array_stats(VacAttrStats *stats, AnalyzeAttrFetchFunc fetchfunc,
  			hist[num_hist] = (double) element_no / (double) nonnull_cnt;
  
  			/*
! 			 * Construct the histogram.
! 			 *
! 			 * XXX this needs work: frac could overflow, and it's not clear
! 			 * how or why the code works.  Even if it does work, it needs
! 			 * documented.
  			 */
  			delta = analyzed_rows - 1;
  			count_item_index = 0;
! 			frac = sorted_count_items[0]->frequency * (num_hist - 1);
  			for (i = 0; i < num_hist; i++)
  			{
  while (frac <= 0)
  {
  	count_item_index++;
  	Assert(count_item_index < count_items_count);
! 	frac += sorted_count_items[count_item_index]->frequency * (num_hist - 1);
  }
  hist[i] = sorted_count_items[count_item_index]->count;
  frac -= delta;
--- 612,642 
  			hist[num_hist] = (double) element_no / (double) nonnull_cnt;
  
  			/*
! 			 * Construct the histogram of DECs. The object of this loop is to
! 			 * copy the max and min DECs and evenly-spaced DECs in between
! 			 * (space here is the number of arrays corresponding to a DEC). If
! 			 * we imagine an ordered array of DECs where each input array has
! 			 * a corresponding DEC item, the i'th value of the histogram will
! 			 * be DECs[i * (analyzed_rows - 1) / (num_hist - 1)]. But instead
! 			 * of such an array we have sorted_count_items, which holds unique
! 			 * DEC values with their frequencies. We can imagine the frac
! 			 * variable as (index in DECs corresponding to the next
! 			 * sorted_count_items element - index in DECs corresponding to the
! 			 * last histogram value) * (num_hist - 1). In this case a negative
! 			 * fraction leads us to iterate over sorted_count_items.
  			 */
  			delta = analyzed_rows - 1;
  			count_item_index = 0;
! 			frac = (int64)sorted_count_items[0]->frequency * 
!    (int64)(num_hist - 1);
  			for (i = 0; i < num_hist; i++)
  			{
  while (frac <= 0)
  {
  	count_item_index++;
  	Assert(count_item_index < count_items_count);
! 	frac += (int64)sorted_count_items[count_item_index]->frequency * 
! 		(int64)(num_hist - 1);
  }
  hist[i] = sorted_count_items[count_item_index]->count;
  frac -= delta;



Re: [HACKERS] Patch: improve selectivity estimation for IN/NOT IN

2012-03-04 Thread Euler Taveira de Oliveira
On 04-03-2012 00:20, Daniele Varrazzo wrote:
 It looks like you have grand plans for array estimation. My patch has
 a much more modest scope, and I'm hoping it could be applied to
 currently maintained PG versions, as I consider the currently produced
 estimations a bug.
 
We don't normally add new features to stable branches unless it is a bug fix. In
the optimizer case, a planner regression is a bug (which this case is not).


-- 
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: [HACKERS] Collect frequency statistics for arrays

2012-03-04 Thread Alexander Korotkov
On Sun, Mar 4, 2012 at 5:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 2. The tests in the above-mentioned message show that in most cases
 where mcelem_array_contained_selec falls through to the rough
 estimate, the resulting rowcount estimate is just 1, ie we are coming
 out with very small selectivities.  Although that path will now only be
 taken when there are no stats, it seems like we'd be better off to
 return DEFAULT_CONTAIN_SEL instead of what it's doing.  I think there
 must be something wrong with the rough estimate logic.  Could you
 recheck that?


I think the problem with the rough estimate is that the assumption of
independent occurrences of items is very unsuitable even for a rough
estimate. The following example shows that the rough estimate really works
in the case of independent occurrences of items.

Generate a test table where item occurrences are really independent.

test=# create table test as select ('{'||(select string_agg(s,',') from
(select case when (t*0 + random()) < 0.1 then i::text else null end from
generate_series(1,100) i) as x(s))||'}')::int[] AS val  from
generate_series(1,10000) t;

SELECT 10000

test=# analyze test;
ANALYZE

Do some test.

test=# explain analyze select * from test where val <@
array[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60];

  QUERY PLAN


--
 Seq Scan on test  (cost=0.00..239.00 rows=151 width=61) (actual
time=0.325..32.556 rows=163 loops=1
)
   Filter: (val <@
'{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60}'::integer[])
   Rows Removed by Filter: 9837
 Total runtime: 32.806 ms
(4 rows)

Delete DECHIST statistics.

test=# update pg_statistic set stakind1 = 0, staop1 = 0, stanumbers1 =
null, stavalues1 = null where starelid = (select oid from pg_class where
relname = 'test') and stakind1 = 5;
UPDATE 0
test=# update pg_statistic set stakind2 = 0, staop2 = 0, stanumbers2 =
null, stavalues2 = null where starelid = (select oid from pg_class where
relname = 'test') and stakind2 = 5;
UPDATE 0
test=# update pg_statistic set stakind3 = 0, staop3 = 0, stanumbers3 =
null, stavalues3 = null where starelid = (select oid from pg_class where
relname = 'test') and stakind3 = 5;
UPDATE 0
test=# update pg_statistic set stakind4 = 0, staop4 = 0, stanumbers4 =
null, stavalues4 = null where starelid = (select oid from pg_class where
relname = 'test') and stakind4 = 5;
UPDATE 1
test=# update pg_statistic set stakind5 = 0, staop5 = 0, stanumbers5 =
null, stavalues5 = null where starelid = (select oid from pg_class where
relname = 'test') and stakind5 = 5;
UPDATE 0

Do another test.

test=# explain analyze select * from test where val <@
array[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60];

  QUERY PLAN


--
 Seq Scan on test  (cost=0.00..239.00 rows=148 width=61) (actual
time=0.332..32.952 rows=163 loops=1)
   Filter: (val <@
'{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60}'::integer[])
   Rows Removed by Filter: 9837
 Total runtime: 33.225 ms
(4 rows)

In this particular case the rough estimate is quite accurate. But in most
cases it behaves really badly. That is why I started to invent
calc_distr etc. So, I think returning DEFAULT_CONTAIN_SEL is OK unless
we have some better ideas.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Patch: improve selectivity estimation for IN/NOT IN

2012-03-04 Thread Daniele Varrazzo
On Sun, Mar 4, 2012 at 1:44 PM, Euler Taveira de Oliveira
eu...@timbira.com wrote:
 On 04-03-2012 00:20, Daniele Varrazzo wrote:
 It looks like you have grand plans for array estimation. My patch has
 a much more modest scope, and I'm hoping it could be applied to
 currently maintained PG versions, as I consider the currently produced
 estimations a bug.

 We don't normally add new features to stable branches unless it is a bug fix. In
 the optimizer case, a planner regression is a bug (which this case is not).

Please note that we are talking about planning errors leading to
estimates of records in the millions instead of in the units, in
queries where all the elements are known (most common elements, with
the right stats, included in the query), even with a partial index
whose size could never physically contain all the estimated rows
screaming "something broken here!". So, it's not a regression, as the
computation has always been broken, but I think it can hardly be
considered not a bug.

OTOH, I expect the decision of leaving things as they are could be
taken on the basis of the possibility that some program may stop
working in reaction to an altered query plan: I'm not going to argue
about this, although I feel it unlikely.

-- Daniele



Re: [HACKERS] ECPG FETCH readahead

2012-03-04 Thread Boszormenyi Zoltan
Hi,

first, thank you for answering and for the review.

On 2012-03-02 17:41, Noah Misch wrote:
 On Thu, Dec 29, 2011 at 10:46:23AM +0100, Boszormenyi Zoltan wrote:
 On 2011-11-16 20:51, Boszormenyi Zoltan wrote:
 On 2010-10-14 11:56, Boszormenyi Zoltan wrote:
 On Thu, Jun 24, 2010 at 8:19 AM, Michael Meskes mes...@postgresql.org 
 wrote:
 On Thu, Jun 24, 2010 at 12:04:30PM +0300, Heikki Linnakangas wrote:
 Is there a reason not to enable it by default? I'm a bit worried
 that it will receive no testing if it's not always on.
   
 Yes, there is a reason, namely that you don't need it in native mode at 
 all.
 ECPG can read as many records as you want in one go, something ESQL/C
 apparently cannot do. This patch is a workaround for that restriction. I 
 still
 do not really see that this feature gives us an advantage in native mode
 though.
 We yet lack a consensus on whether native ECPG apps should have access to the
 feature.

I don't even remember any opinion on this matter.
So, at this point I don't know whether it's lack of interest.
We also have a saying: "silence means agreement". :-)

   My 2c is to make it available, because it's useful syntactic sugar.

Thanks, we thought the same.

 If your program independently processes each row of an arbitrary-length result
 set, current facilities force you to add an extra outer loop to batch the
 FETCHes for every such code site.  Applications could define macros to
 abstract that pattern, but this seems common-enough to justify bespoke
 handling.

We have similar opinions.

   Besides, minimalists already use libpq directly.

Indeed. On the other hand, ECPG provides a safety net with syntax checking,
so it's useful for non-minimalist types. :-)

 I suggest enabling the feature by default but drastically reducing the default
 readahead chunk size from 256 to, say, 5. 


   That still reduces the FETCH round
 trip overhead by 80%, but it's small enough not to attract pathological
 behavior on a workload where each row is a 10 MiB document.

I see. How about 8? Nice round power of 2 value, still small and avoids
87.5% of overhead.

BTW, the default disabled behaviour was to avoid make check breakage,
see below.

   I would not offer
 an ecpg-time option to disable the feature per se.  Instead, let the user set
 the default chunk size at ecpg time.  A setting of 1 effectively disables the
 feature, though one could later re-enable it with ECPGFETCHSZ.

This means all code previously going through ECPGdo() would go through
ECPGopen()/ECPGfetch()/ECPGclose(). This is more intrusive and all
regression tests that were only testing certain features would also
test the readahead feature, too.

Also, the test for WHERE CURRENT OF at ecpg time would have to be done
at runtime, possibly making previously working code fail if ECPGFETCHSZ is 
enabled.

How about still allowing NO READAHEAD cursors that compile into plain 
ECPGdo()?
This way, ECPGFETCHSZ doesn't interfere with WHERE CURRENT OF. But this would
mean code changes everywhere where WHERE CURRENT OF is used.

Or how about a new feature in the backend, so ECPG can do
UPDATE/DELETE ... WHERE OFFSET N OF cursor
and the offset is computed from the actual cursor position and the position 
known
by the application? This way an app can do readahead and do work on rows 
collected
by the cursor with WHERE CURRENT OF which gets converted to WHERE OFFSET OF
behind the scenes.


 The ASAP took a little long. The attached patch is in git diff format,
 because (1) the regression test intentionally doesn't do ECPGdebug()
 so the patch isn't dominated by a 2MB stderr file, so this file is empty
 and (2) regular diff cannot cope with empty new files.
 Avoid the empty file with fprintf(stderr, "Dummy non-empty error output\n");

Fixed.


 - *NEW FEATURE* Readahead can be individually enabled or disabled
   by ECPG-side grammar:
 DECLARE curname [ [ NO ] READAHEAD ] CURSOR FOR ...
   Without [ NO ] READAHEAD, the default behaviour is used for cursors.
 Likewise, this may as well take a chunk size rather than a yes/no.

Done.

 The patch adds warnings:
 error.c: In function `ecpg_raise':
 error.c:281: warning: format not a string literal and no format arguments
 error.c:281: warning: format not a string literal and no format arguments

Fixed.

 The patch adds few comments and no larger comments explaining its higher-level
 ideas.  That makes it much harder to review.  In this regard it follows the
 tradition of the ECPG code, but let's depart from that tradition for new work.
 I mention a few cases below where the need for commentary is acute.

Understood. Adding comments as I go over that code again.

 I tested full reads of various SELECT * FROM generate_series(1, $1) commands
 over a 50ms link, and the patch gives a sound performance improvement.  With
 no readahead, a mere N=100 takes 5s to drain.  With readahead enabled, N=1
 takes only 3s.  Performance was quite similar to that of 

Re: [HACKERS] Patch review for logging hooks (CF 2012-01)

2012-03-04 Thread Andrew Dunstan



On 01/20/2012 10:01 AM, Robert Haas wrote:

On Fri, Jan 20, 2012 at 2:47 AM, Greg Smith g...@2ndquadrant.com wrote:

The updated patch looks good, marking as 'Ready for Committer'

Patches without documentation are never ready for commit.  For this one, I'm
not sure if that should be in the form of a reference example in contrib, or
just something that documents that the hook exists and what the ground rules
are for grabbing it.

Hooks are frequently not documented, and we only sometimes even bother
to include an example in contrib.  We should probably at least have a
working example for testing purposes, though, whether or not we end up
committing it.



I'm just looking at this patch, and I agree, it should be testable. I'm 
wondering if it wouldn't be a good idea to have a module or set of 
modules for demonstrating and testing bits of the API that we expose. 
src/test/api or something similar? I'm not sure how we'd automate a test 
for this case, though. I guess we could use something like pg_logforward 
and have a UDP receiver catch the messages and write them to a file. 
Something like that should be possible to rig up in Perl. But all that 
seems a lot of work at this stage of the game. So the question is do we 
want to commit this patch without it?


cheers

andrew



Re: [HACKERS] ECPG FETCH readahead

2012-03-04 Thread Michael Meskes
On Fri, Mar 02, 2012 at 11:41:05AM -0500, Noah Misch wrote:
 We yet lack a consensus on whether native ECPG apps should have access to the
 feature.  My 2c is to make it available, because it's useful syntactic sugar.
 If your program independently processes each row of an arbitrary-length result
 set, current facilities force you to add an extra outer loop to batch the
 FETCHes for every such code site.  Applications could define macros to
 abstract that pattern, but this seems common-enough to justify bespoke
 handling.  Besides, minimalists already use libpq directly.

Sorry, I don't really understand what you're saying here. The program logic
won't change at all when using this feature or what do I misunderstand?

 I suggest enabling the feature by default but drastically reducing the default
 readahead chunk size from 256 to, say, 5.  That still reduces the FETCH round
 trip overhead by 80%, but it's small enough not to attract pathological
 behavior on a workload where each row is a 10 MiB document.  I would not offer
 an ecpg-time option to disable the feature per se.  Instead, let the user set
 the default chunk size at ecpg time.  A setting of 1 effectively disables the
 feature, though one could later re-enable it with ECPGFETCHSZ.

Using 1 to effectively disable the feature is fine with me, but I strongly
object to enabling this feature by default. It's fairly easy to create cases with
pathological behaviour and this feature is not standard by any means. I figure
a normal programmer would expect only one row being transferred when fetching
one.

Other than that, thanks for the great review.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Michael at BorussiaFan dot De, Meskes at (Debian|Postgresql) dot Org
Jabber: michael.meskes at googlemail dot com
VfL Borussia! Força Barça! Go SF 49ers! Use Debian GNU/Linux, PostgreSQL



Re: [HACKERS] ECPG FETCH readahead

2012-03-04 Thread Boszormenyi Zoltan
On 2012-03-04 17:16, Michael Meskes wrote:
 On Fri, Mar 02, 2012 at 11:41:05AM -0500, Noah Misch wrote:
 We yet lack a consensus on whether native ECPG apps should have access to the
 feature.  My 2c is to make it available, because it's useful syntactic sugar.
 If your program independently processes each row of an arbitrary-length 
 result
 set, current facilities force you to add an extra outer loop to batch the
 FETCHes for every such code site.  Applications could define macros to
 abstract that pattern, but this seems common-enough to justify bespoke
 handling.  Besides, minimalists already use libpq directly.
 Sorry, I don't really understand what you're saying here. The program logic
 won't change at all when using this feature or what do I misunderstand?

The program logic shouldn't change at all. He meant that extra coding effort
is needed if you want manual caching. It requires 2 loops instead of 1 if you
use FETCH N (N>1).
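
For illustration, a hedged sketch of that two-loop pattern in ECPG (the
cursor, the table and the process() helper are hypothetical;
sqlca.sqlerrd[2] holds the number of rows the last FETCH returned):

EXEC SQL INCLUDE sqlca;

EXEC SQL BEGIN DECLARE SECTION;
int val[8];
EXEC SQL END DECLARE SECTION;
int i, got;

EXEC SQL DECLARE cur CURSOR FOR SELECT id FROM t;
EXEC SQL OPEN cur;
for (;;)
{
	/* outer loop: fetch a batch of rows to save round trips */
	EXEC SQL FETCH 8 FROM cur INTO :val;
	got = sqlca.sqlerrd[2];
	/* inner loop: the per-row work the program actually cares about */
	for (i = 0; i < got; i++)
		process(val[i]);
	if (got < 8)
		break;
}
EXEC SQL CLOSE cur;

With library-side readahead, the outer batching loop disappears from user
code: a single loop doing plain FETCH ... INTO gets the same round-trip
savings.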


 I suggest enabling the feature by default but drastically reducing the 
 default
 readahead chunk size from 256 to, say, 5.  That still reduces the FETCH round
 trip overhead by 80%, but it's small enough not to attract pathological
 behavior on a workload where each row is a 10 MiB document.  I would not 
 offer
 an ecpg-time option to disable the feature per se.  Instead, let the user set
 the default chunk size at ecpg time.  A setting of 1 effectively disables the
 feature, though one could later re-enable it with ECPGFETCHSZ.
 Using 1 to effectively disable the feature is fine with me,

Something else would be needed. For DML with WHERE CURRENT OF cursor,
the feature should stay disabled even when the environment variable is set,
without adding any decoration to the DECLARE statement. The safe thing
would be to distinguish between uncached (but cachable) and uncachable
cases. A single value cannot work.

  but I strongly
 object to enabling this feature by default. It's fairly easy to create cases
 with
 pathological behaviour and this feature is not standard by any means. I
 figure
 a normal programmer would expect only one row being transferred when fetching
 one.

 Other than that, thanks for the great review.

 Michael

Best regards,
Zoltán Böszörményi

-- 
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/




Re: [HACKERS] RFC: Making TRUNCATE more MVCC-safe

2012-03-04 Thread Simon Riggs
On Sun, Mar 4, 2012 at 1:02 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Sun, Mar 4, 2012 at 9:59 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Sun, Mar 4, 2012 at 2:28 AM, Marti Raudsepp ma...@juffo.org wrote:
 On Sat, Mar 3, 2012 at 14:53, Simon Riggs si...@2ndquadrant.com wrote:
 Thanks Noah for drawing attention to this thread. I hadn't been
 watching. As you say, this work would allow me to freeze rows at load
 time and avoid the overhead of hint bit setting, which avoids
 performance issues from hint bit setting in checksum patch.

 I've reviewed this patch and it seems OK to me. Good work Marti.

 ...

 v3 attached.

 More detailed thoughts show that the test in heap_beginscan_internal()
 is not right enough, i.e. wrong.

 We need a specific XidInMVCCSnapshot test on the relvalidxid, so it
 needs to be a specific xid, not an xmin because otherwise we can get
 concurrent transactions failing, not just older transactions.

Marti, please review this latest version which has new isolation tests added.

This does both TRUNCATE and CREATE TABLE.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index c910863..3d90125 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -73,7 +73,7 @@
 
 /* GUC variable */
 bool		synchronize_seqscans = true;
-
+bool		StrictMVCC = true;
 
 static HeapScanDesc heap_beginscan_internal(Relation relation,
 		Snapshot snapshot,
@@ -1175,6 +1175,24 @@ heap_beginscan_internal(Relation relation, Snapshot snapshot,
 	HeapScanDesc scan;
 
 	/*
+	 * If the snapshot can't see relvalidxid then that means either the table
+	 * is newly created or has recently been truncated. Either way, we aren't
+	 * allowed to view the rows in StrictMVCC mode.
+	 */
+	if (IsMVCCSnapshot(snapshot) &&
+		StrictMVCC &&
+		XidInMVCCSnapshot(relation->rd_rel->relvalidxid, snapshot))
+	{
+		/* Unless we created or truncated the table recently ourselves */
+		if (relation->rd_createSubid == InvalidSubTransactionId &&
+			relation->rd_newRelfilenodeSubid == InvalidSubTransactionId)
+			ereport(ERROR,
+	(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+	 errmsg("cannot access recently added or recently removed data in %s",
+			 NameStr(relation->rd_rel->relname))));
+	}
+
+	/*
 	 * increment relation ref count while scanning relation
 	 *
 	 * This is just to make really sure the relcache entry won't go away while
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index aef410a..eb93a7c 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -787,6 +787,7 @@ InsertPgClassTuple(Relation pg_class_desc,
 	values[Anum_pg_class_relhastriggers - 1] = BoolGetDatum(rd_rel->relhastriggers);
 	values[Anum_pg_class_relhassubclass - 1] = BoolGetDatum(rd_rel->relhassubclass);
 	values[Anum_pg_class_relfrozenxid - 1] = TransactionIdGetDatum(rd_rel->relfrozenxid);
+	values[Anum_pg_class_relvalidxid - 1] = TransactionIdGetDatum(rd_rel->relvalidxid);
 	if (relacl != (Datum) 0)
 		values[Anum_pg_class_relacl - 1] = relacl;
 	else
@@ -872,6 +873,7 @@ AddNewRelationTuple(Relation pg_class_desc,
 		 * that will do.
 		 */
 		new_rel_reltup->relfrozenxid = RecentXmin;
+		new_rel_reltup->relvalidxid = GetCurrentTransactionId();
 	}
 	else
 	{
@@ -882,6 +884,7 @@ AddNewRelationTuple(Relation pg_class_desc,
 		 * commands/sequence.c.)
 		 */
 		new_rel_reltup->relfrozenxid = InvalidTransactionId;
+		new_rel_reltup->relvalidxid = InvalidTransactionId;
 	}
 
 	new_rel_reltup->relowner = relowner;
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index bfbe642..0d96a6a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2854,7 +2854,7 @@ reindex_index(Oid indexId, bool skip_constraint_checks)
 		}
 
 		/* We'll build a new physical relation for the index */
-		RelationSetNewRelfilenode(iRel, InvalidTransactionId);
+		RelationSetNewRelfilenode(iRel, InvalidTransactionId, InvalidTransactionId);
 
 		/* Initialize the index and rebuild */
 		/* Note: we do not need to re-establish pkey setting */
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index b40e57b..44b891c 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -538,6 +538,7 @@ do_analyze_rel(Relation onerel, VacuumStmt *vacstmt, bool inh)
 			totalrows,
 			visibilitymap_count(onerel),
 			hasindex,
+			InvalidTransactionId,
 			InvalidTransactionId);
 
 	/*
@@ -558,6 +559,7 @@ do_analyze_rel(Relation onerel, VacuumStmt *vacstmt, bool inh)
 totalindexrows,
 0,
 false,
+InvalidTransactionId,
 InvalidTransactionId);
 		}
 	}
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 1f95abc..ab4aec2 100644
--- a/src/backend/commands/sequence.c
+++ 

Re: [HACKERS] Our regex vs. POSIX on longest match

2012-03-04 Thread Tom Lane
Brendan Jurd dire...@gmail.com writes:
 On 4 March 2012 17:53, Tom Lane t...@sss.pgh.pa.us wrote:
 Brendan Jurd dire...@gmail.com writes:
 I'll admit that this is a pretty obscure point, but we do appear to be
 in direct violation of POSIX here.

 How so?  POSIX doesn't contain any non-greedy constructs.  If you use
 only the POSIX-compatible greedy constructs, the behavior is compliant.

 While it's true that POSIX doesn't contemplate non-greed, after
 reading the spec I would have expected an expression *as a whole* to
 still prefer the longest match, insofar as that is possible while
 respecting non-greedy particles.

It's not apparent to me that a construct that is outside the spec
shouldn't be expected to change the overall behavior.  In particular,
what if the RE contains only non-greedy quantifiers --- would you still
expect longest match overall?  Henry certainly didn't.

 I just ran some quick experiments in
 Perl, Python and PHP, using 'xy1234' ~ 'y*?\d+'.  All returned
 'y1234', which to my mind is the most obvious answer, and one which
 still makes sense in light of what POSIX has to say.  Whereas Postgres
 (and presumably Tcl) return 'y1'.

Well, that's just an arbitrary example.  The cases I remember people
complaining about in practice were the other way round: greedy
quantifier followed by non-greedy, and they were unhappy that the
non-greediness was effectively not respected (because the overall RE was
taken as greedy).  So you can't fix the issue by pointing to POSIX and
saying "overall greedy is always the right thing".

Another thing I've seen people complain about is that an alternation
(anything with top-level |) is always taken as greedy overall.
This seems a bit surprising to me --- I'd have expected a rule like
"inherits its greediness from the leftmost branch that has one",
though I'm not sure whether that would really be much of an improvement
in practice.  (It would at least fix the problem that a leading (a|b)
determines greediness of the containing RE whereas the logically
equivalent [ab] does not.)  Again, pointing to POSIX and saying
"overall greedy is the right thing" doesn't help.

regards, tom lane



Re: [HACKERS] Patch: improve selectivity estimation for IN/NOT IN

2012-03-04 Thread Tom Lane
Daniele Varrazzo daniele.varra...@gmail.com writes:
 On Sun, Mar 4, 2012 at 1:44 PM, Euler Taveira de Oliveira
 eu...@timbira.com wrote:
 On 04-03-2012 00:20, Daniele Varrazzo wrote:
 It looks like you have grand plans for array estimation. My patch has
 a much more modest scope, and I'm hoping it could be applied to
 currently maintained PG versions, as I consider the currently produced
 estimations a bug.

 We don't normally add new features to stable branches unless it is a bug fix. In
 the optimizer case, a planner regression is a bug (which this case is not).

 Please note that we are talking about planning errors leading to
 estimates of records in the millions instead of in the units,

We're also talking about a proposed patch that is not clearly a bug fix,
but is a change in a heuristic, meaning it has the potential to make
things worse in some cases.  (Notably, for arrays that *don't* contain
all-distinct values, the estimates are likely to get worse.)  So I
wasn't thinking of it as being back-patchable anyway.  It's generally
better not to destabilize planner behavior in minor releases, even if
it's a 90%-of-the-time improvement.

regards, tom lane



Re: [HACKERS] Command Triggers, patch v11

2012-03-04 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 FWIW, I agree with Thom on this.  If we do it as you suggest, I
 confidently predict that it will be less than a year before we seriously
 regret it.  Given all the discussion around this, it's borderline insane
 to believe that the set of parameters to be passed to command triggers
 is nailed down and won't need to change in the future.

 As for special coding of support, it hardly seems onerous when every
 language that has triggers at all has got some provision for the
 existing trigger parameters.  A bit of copying and pasting should get
 the job done.

So, for that to happen I need to add a new macro and use it in most call
sites of CALLED_AS_TRIGGER(fcinfo). That includes the setup switch in
src/pl/plpgsql/src/pl_comp.c doCompile() and plpgsql_call_handler() for
starters.

Let's call the macro CALLED_AS_COMMAND_TRIGGER(fcinfo), and I would
extend CommandContextData to be a Node of type CommandTriggerData, that
needs to be added to the NodeTag enum as T_CommandTriggerData.
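
For reference, the existing CALLED_AS_TRIGGER macro just tests the node type
of fcinfo->context, so the proposed macro would presumably be a sketch along
the same lines (an assumption here, not a committed definition):

#define CALLED_AS_COMMAND_TRIGGER(fcinfo) \
	((fcinfo)->context != NULL && IsA((fcinfo)->context, CommandTriggerData))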

With that in place I can stuff the data into the function's call context
(via InitFunctionCallInfoData) then edit the call handlers of included
procedure languages until they are able to init their language variables
with the info.

Then, do we want the command trigger functions to return type trigger or
another specific type?  I guess we want to forbid registering any
function as a command trigger?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Collect frequency statistics for arrays

2012-03-04 Thread Tom Lane
Alexander Korotkov aekorot...@gmail.com writes:
 On Sun, Mar 4, 2012 at 5:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 2. The tests in the above-mentioned message show that in most cases
 where mcelem_array_contained_selec falls through to the rough
 estimate, the resulting rowcount estimate is just 1, ie we are coming
 out with very small selectivities.  Although that path will now only be
 taken when there are no stats, it seems like we'd be better off to
 return DEFAULT_CONTAIN_SEL instead of what it's doing.  I think there
 must be something wrong with the rough estimate logic.  Could you
 recheck that?

 I think the problem with the rough estimate is that the assumption of
 independent occurrences of items is very unsuitable even for a rough
 estimate. The following example shows that the rough estimate really works
 in the case of independent occurrences of items. ...
 In this particular case the rough estimate is quite accurate. But in most
 cases it behaves really badly. That is why I started to invent
 calc_distr etc. So, I think returning DEFAULT_CONTAIN_SEL is OK unless
 we have some better ideas.

OK.  Looking again at that code, I notice that it also punts and returns
DEFAULT_CONTAIN_SEL if it's not given MCELEM stats, which it more or
less has to because without even a minfreq the whole calculation is just
hot air.  And there are no plausible scenarios where compute_array_stats
would produce an MCELEM slot but no count histogram.  So that says there
is no point in sweating over this case, unless you have an idea how to
produce useful results without MCELEM.

So I think it's sufficient to punt at the top of the function if no
histogram, and take out the various attempts to cope with the case.

regards, tom lane



Re: [HACKERS] Our regex vs. POSIX on longest match

2012-03-04 Thread hubert depesz lubaczewski
On Sun, Mar 04, 2012 at 12:34:22PM -0500, Tom Lane wrote:
 Well, that's just an arbitrary example.  The cases I remember people
 complaining about in practice were the other way round: greedy
 quantifier followed by non-greedy, and they were unhappy that the
 non-greediness was effectively not respected (because the overall RE was
 taken as greedy).  So you can't fix the issue by pointing to POSIX and
 saying overall greedy is always the right thing.

I was one of those complaining, and my point was that deciding for the
whole regexp whether it's greedy or non-greedy is a bug (well, it might be
documented, but it's still *very* unexpected).

I stand by the position that mixing greedy and non-greedy operators should
be possible, and that it should work according to POLA - i.e. the
greediness of a given atom shouldn't be influenced by other atoms.

Best regards,

depesz


-- 
The best thing about modern society is how easy it is to avoid contact with it.
 http://depesz.com/



Re: [HACKERS] autovacuum locks

2012-03-04 Thread Greg Jaskiewicz

On 2 Mar 2012, at 15:28, Robert Haas wrote:

 On Fri, Mar 2, 2012 at 6:22 AM, Gregg Jaskiewicz gryz...@gmail.com wrote:
 Looking at the system bit more now, it look like 'waiting' states are
 changing for both the query and autovacuum in pg_stat_activity.
 But very slowly. It looks like they both got into that sort of state
 that keeps them on the edge of starvation.
 
 So this isn't really a deadlock, but rather very poor performance in
 this corner case.
 
 Are you making any use of cursors?

nope. 




Re: [HACKERS] review: CHECK FUNCTION statement

2012-03-04 Thread Pavel Stehule
Hello

2012/3/4 Alvaro Herrera alvhe...@commandprompt.com:

 Excerpts from Alvaro Herrera's message of sáb mar 03 17:56:23 -0300 2012:
 Excerpts from Alvaro Herrera's message of sáb mar 03 16:54:19 -0300 2012:
  Excerpts from Pavel Stehule's message of sáb mar 03 02:25:52 -0300 2012:

   3. THERE IS NO CARET - this is really important

  I am not sure about the caret thingy -- mainly because I don't think it
  works all that well.

 It doesn't work correctly with your patch; see sample below.  Note the
 caret is pointing to an entirely nonsensical position.  I'm not sure
 about duplicating the libpq line-counting logic in the backend.

 Also note i18n seems to be working well, except for the In function
 header, query, and the error level.  That seems easily fixable.

 I remain unconvinced that this is the best possible output.

 alvherre=# create function f() returns int language plpgsql as $$
 begin select
 var
 from
 foo; end; $$;
 CREATE FUNCTION
 alvherre=# check function f();
                     CHECK FUNCTION
 -
  In function: 'f()'
  error:42P01:2:sentencia SQL:no existe la relación «foo»
  query:select                                           +
  var                                                    +
  from                                                   +
  foo
                       ^
 (4 filas)


This should be fixed. I checked expressions, and those work (I expect)
correctly. The caret helps (really). Sometimes a man is blind :).

I'll look at this topic tomorrow and I hope it will be simple to solve.

Regards

Pavel

 --
 Álvaro Herrera alvhe...@commandprompt.com
 The PostgreSQL Company - Command Prompt, Inc.
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] xlog min recovery request ... is past current point ...

2012-03-04 Thread Heikki Linnakangas

On 03.02.2012 18:32, Christophe Pettus wrote:

PostgreSQL 9.0.4:

While bringing up a streaming replica, and while it is working its way through 
the WAL segments before connecting to the primary, I see a lot of messages of 
the form:

2012-02-01 21:26:13.978 PST,,,24448,,4f2a1e61.5f80,54,,2012-02-01 21:25:53 PST,1/0,0,LOG,0,restored log file 
00010DB40065 from archive,
2012-02-01 21:26:14.032 PST,,,24448,,4f2a1e61.5f80,55,,2012-02-01 21:25:53 
PST,1/0,0,WARNING,01000,xlog min recovery request DB5/42E15098 is past current point 
DB4/657FA490,writing block 5 of relation base/155650/156470_vm
xlog redo insert: rel 1663/155650/1658867; tid 9640/53
2012-02-01 21:26:14.526 PST,,,24448,,4f2a1e61.5f80,56,,2012-02-01 21:25:53 PST,1/0,0,LOG,0,restored log file 
00010DB40066 from archive,

All of these are on _vm relations.  The recovery completed successfully and the 
secondary connected to the primary without issue, so: Are these messages 
something to be concerned over?


Hmm, I think I see how that can happen:

0. A heap page has its bit set in visibility map to begin with

1. A heap tuple is inserted/updated/deleted. This clears the VM bit.
2. time passes, and more WAL is generated
3. The page is vacuumed, and the visibility map bit is set again.

In the standby, this can happen while replaying the WAL, if you restart 
the standby so that some WAL is re-replayed:


1. The update of the heap tuple is replayed. This clears the VM bit.
2. The VACUUM is replayed, setting the VM bit again, and updating the VM 
page's LSN.

3. Shutdown and restart standby
4. The heap update is replayed again. This again clears the VM bit, but 
does not set the LSN


If the VM page is now evicted from the buffer cache, you get the WARNING 
you saw, because the page is dirty, yet its LSN is beyond the current 
point in recovery.


AFAICS that's totally harmless, but the warning is quite alarming, so 
we'll have to figure out a way to fix that. Not sure how; perhaps we 
need to set the LSN on the VM page when the VM bit is cleared, but I 
don't remember off the top of my head if there was some important reason 
why we don't do that currently.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Our regex vs. POSIX on longest match

2012-03-04 Thread Tom Lane
hubert depesz lubaczewski dep...@depesz.com writes:
 I stand on position that mixing greedy and non-greedy operators should
 be possible, and that it should work according to POLA - i.e. greedines
 of given atom shouldn't be influenced by other atoms.

[ shrug... ]  That sounds good, but it's pretty much vacuous as far as
defining a principled alternative behavior goes.  It's easy to
demonstrate cases where atoms *must* be influenced by other ones.
A trivial example is
(.*)(.*)
It doesn't matter whether the second atom is greedy or not: it's not
going to get to eat anything because the first one does instead.
IOW this is just the same as
(.*)(.*?)
--- they are both overall-greedy.

regards, tom lane



Re: [HACKERS] Collect frequency statistics for arrays

2012-03-04 Thread Tom Lane
Alexander Korotkov aekorot...@gmail.com writes:
 On Sun, Mar 4, 2012 at 5:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 1. I'm still unhappy about the loop that fills the count histogram,
 as I noted earlier today.  It at least needs a decent comment and some
 overflow protection, and I'm not entirely convinced that it doesn't have
 more bugs than the overflow issue.

 The attached patch is focused on fixing this. The frac variable overflow is
 avoided by making it int64. I hope the comments clarify things. In
 general this loop copies the behaviour of the histogram-constructing loop of
 the compute_scalar_stats function, but instead of a values array we have an
 array of unique DECs and their frequencies.

OK, I reworked this a bit and committed it.  Thanks.

regards, tom lane



Re: [HACKERS] Collect frequency statistics for arrays

2012-03-04 Thread Tom Lane
BTW, one other thing about the count histogram: seems like we are
frequently generating uselessly large ones.  For instance, do ANALYZE
in the regression database and then run

select tablename,attname,elem_count_histogram from pg_stats
  where elem_count_histogram is not null;

You get lots of entries that look like this:

 pg_proc | proallargtypes | 
{1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,6,6,6,2.80556}
 pg_proc | proargmodes| 
{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1.6}
 pg_proc | proargnames| 
{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,6,6,6,7,7,7,7,8,8,8,14,14,15,16,3.8806}
 pg_proc | proconfig  | 
{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}
 pg_class| reloptions | 
{1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}

which seems to me to be a rather useless expenditure of space.
Couldn't we reduce the histogram size when there aren't many
different counts?

It seems fairly obvious to me that we could bound the histogram
size with (max count - min count + 1), but maybe something even
tighter would work; or maybe I'm missing something and this would
sacrifice accuracy.

regards, tom lane



Re: [HACKERS] Parameterized-path cost comparisons need some work

2012-03-04 Thread Robert Haas
On Sun, Mar 4, 2012 at 12:20 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 After looking at the results, I think that the fallacy in what we've
 been discussing is this: a parameterized path may well have some extra
 selectivity over a less-parameterized one, but perhaps *not enough to be
 meaningful*.  The cases I was getting hits on were where the rowcount
 estimate got rounded off to be the same as for the less-parameterized
 path.  (In this connection it's worth noting that most of the hits were
 for rowcount estimates of only 1 or 2 rows.)  So basically, the scenario
 is where you have restriction clauses that are already enough to get
 down to a small number of rows retrieved, and then you have some join
 clauses that are not very selective and don't reduce the rowcount any
 further.  Or maybe you have some nicely selective join clauses, and then
 adding more joins to some other relations doesn't help any further.

OK, makes sense.

 One annoying thing about that is that it will reduce the usefulness of
 add_path_precheck, because that's called before we compute the rowcount
 estimates (and indeed not having to make the rowcount estimates is one
 of the major savings from the precheck).  I think what we'll have to do
 is assume that a difference in parameterization could result in a
 difference in rowcount, and hence only a dominant path with exactly the
 same parameterization can result in failing the precheck.

I wish we had some way of figuring out how much this - and maybe some
of the other new planning possibilities like index-only scans - were
going to cost us on typical medium-to-large join problems.  In the
absence of real-world data it's hard to know how worried we should be.
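
To make the proposed rule concrete, a hedged sketch follows; the
PathSketch type and precheck_rejects() are invented stand-ins for the
planner's Path/Relids machinery, not the real add_path_precheck()
change:

#include <stdbool.h>

typedef struct PathSketch
{
    double   total_cost;
    unsigned param_relids;   /* bitmask of required outer rels */
} PathSketch;

/* Before rowcount estimates exist, an old path may reject a new one
 * only if it costs no more AND is parameterized on exactly the same
 * rels, since any difference in parameterization could still mean a
 * difference in rowcount. */
static bool
precheck_rejects(const PathSketch *old_path, const PathSketch *new_path)
{
    return old_path->total_cost <= new_path->total_cost &&
           old_path->param_relids == new_path->param_relids;
}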

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] review: CHECK FUNCTION statement

2012-03-04 Thread Alvaro Herrera

Excerpts from Pavel Stehule's message of Sun Mar 04 16:33:08 -0300 2012:
 
 Hello
 
 2012/3/4 Alvaro Herrera alvhe...@commandprompt.com:

                      CHECK FUNCTION
  -
   In function: 'f()'
  error:42P01:2:SQL statement:relation «foo» does not exist
   query:select                                           +
   var                                                    +
   from                                                   +
   foo
                        ^
  (4 rows)
 
 
 this should be fixed. I checked expressions, and they work correctly
 (I expect). The caret really helps. Sometimes one is blind :).

Agreed.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: GiST for range types (was Re: [HACKERS] Range Types - typo + NULL string constructor)

2012-03-04 Thread Tom Lane
Alexander Korotkov aekorot...@gmail.com writes:
 On Mon, Jan 30, 2012 at 1:39 AM, Jeff Davis pg...@j-davis.com wrote:
 Marking ready for committer, but please apply my comment fixes at your
 discretion.

 Patch with your comment fixes is attached.

Applied with revisions, some cosmetic, some not so much.

regards, tom lane



Re: [HACKERS] Our regex vs. POSIX on longest match

2012-03-04 Thread Brendan Jurd
On 5 March 2012 04:34, Tom Lane t...@sss.pgh.pa.us wrote:
 Brendan Jurd dire...@gmail.com writes:
 On 4 March 2012 17:53, Tom Lane t...@sss.pgh.pa.us wrote:
 Brendan Jurd dire...@gmail.com writes:
 While it's true that POSIX doesn't contemplate non-greed, after
 reading the spec I would have expected an expression *as a whole* to
 still prefer the longest match, insofar as that is possible while
 respecting non-greedy particles.

 It's not apparent to me that a construct that is outside the spec
 shouldn't be expected to change the overall behavior.  In particular,
 what if the RE contains only non-greedy quantifiers --- would you still
 expect longest match overall?  Henry certainly didn't.

Well, no, but then that wouldn't be "prefer the longest match,
insofar as that is possible while respecting non-greedy particles".
If all the quantifiers in an expression are non-greedy, then it is
trivially true that the only way to respect the author's intention is
to return the shortest match.

 Well, that's just an arbitrary example.  The cases I remember people
 complaining about in practice were the other way round: greedy
 quantifier followed by non-greedy, and they were unhappy that the
 non-greediness was effectively not respected (because the overall RE was
 taken as greedy).

I am unhappy about the reverse example too, and for the same reasons.

If we look at 'xy1234' ~ 'y*\d+?', Postgres gives us 'y1234'
(disregarding the non-greediness of the latter quantifier), and Python
gives us 'y1' (respecting both quantifiers).

So in Postgres, no matter how you mix the greediness up, some of your
quantifiers will not be respected.
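
For reference, POSIX EREs have no non-greedy quantifiers at all, so a
stock regex(3) matcher can only demonstrate the longest-match baseline
being debated here.  A small C demo (with [0-9] standing in for \d,
which is not POSIX):

#include <stdio.h>
#include <regex.h>

int
main(void)
{
    regex_t    re;
    regmatch_t m;
    const char *text = "xy1234";

    if (regcomp(&re, "y*[0-9]+", REG_EXTENDED) != 0)
        return 1;
    if (regexec(&re, text, 1, &m, 0) == 0)
        printf("matched: %.*s\n",     /* prints "y1234" */
               (int) (m.rm_eo - m.rm_so), text + m.rm_so);
    regfree(&re);
    return 0;
}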

 So you can't fix the issue by pointing to POSIX and
 saying overall greedy is always the right thing.

... insofar as that is possible while respecting non-greedy particles.

I will take Henry's word for it that this problem is harder than it
looks, but in any case, I think we may not presume to know better than
the author of the regex about the greediness of his quantifiers.

 Another thing I've seen people complain about is that an alternation
 (anything with top-level |) is always taken as greedy overall.

Yeah, that is quirky.

Cheers,
BJ



Re: [HACKERS] Our regex vs. POSIX on longest match

2012-03-04 Thread Robert Haas
On Sun, Mar 4, 2012 at 3:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 hubert depesz lubaczewski dep...@depesz.com writes:
 I stand on the position that mixing greedy and non-greedy operators
 should be possible, and that it should work according to POLA - i.e.
 the greediness of a given atom shouldn't be influenced by other atoms.

 [ shrug... ]  That sounds good, but it's pretty much vacuous as far as
 defining a principled alternative behavior goes.  It's easy to
 demonstrate cases where atoms *must* be influenced by other ones.
 A trivial example is
        (.*)(.*)
 It doesn't matter whether the second atom is greedy or not: it's not
 going to get to eat anything because the first one does instead.
 IOW this is just the same as
        (.*)(.*?)
 --- they are both overall-greedy.

But I don't think this makes Brendan's example not a bug.  In this
case, we can't eat anything because there's nothing to eat.  In his
example, there's something to eat, and it's sitting in a place where
it logically seems to line up with a greedy quantifier, and yet the
quantifier doesn't eat it.  Now in fairness, I've seen my fair share
of apparently-buggy behavior from Perl's regex engine over the years
as well.

I think the right way to imagine this is as though the regular
expression were being matched to the source text in left-to-right
fashion.  For example, suppose the RE is [ab]+?[bc]+ and the source
text is aabbcc.  Clearly there's a match at position 1, but the match
could be anywhere between 3 and 6 characters in length, depending on
how greedy we think the RE should be, and the exact source text that
gets matched to each atom is up for grabs.  Enumerating all the
possibilities where each atom matches a string that is consistent with
its definition, leaving greediness aside, we get this:

[aa,b]
[aa,bb]
[aa,bbc]
[aa,bbcc]
[aab,b]
[aab,bc]
[aab,bcc]
[aabb,c]
[aabb,cc]

To distinguish among these, we look first at [ab]+? and decide that,
since it is non-greedy, the right overall answer must be one of the
ones that assigns to [ab]+? a match of minimal length within the space
of possible matches.  That narrows the field to just the first four
choices listed above.  Then we look at [bc]+ and, since it is greedy,
give it a match of maximal length within the space of possible
matches, leading to [ab]+? = aa and [bc]+ = bbcc.  Had the RE been
[ab]+?[bc]+?, the same algorithm would assign [ab]+? = aa and [bc]+? =
b; had it been [ab]+[bc]+? the same algorithm would assign [ab]+ =
aabb and [bc]+? = c.  Testing, I find that this matches what Perl does
in all of those cases.
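
A throwaway C sketch of that selection procedure for this particular
two-atom case (an illustration of the enumeration above, not of how
any real regex engine is implemented):

#include <stdio.h>
#include <string.h>
#include <stdbool.h>

static bool
in_class(char c, const char *cls)
{
    return strchr(cls, c) != NULL;
}

int
main(void)
{
    const char *s = "aabbcc";   /* known to match at offset 0 */
    int         n = (int) strlen(s);
    int         best_i = -1;    /* chars taken by [ab]+? */
    int         best_j = -1;    /* end of chars taken by [bc]+ */

    /* Enumerate every split s[0..i) / s[i..j) consistent with the
     * atoms, then prefer minimal i (non-greedy first atom) and, among
     * those, maximal j (greedy second atom). */
    for (int i = 1; i < n; i++)
    {
        bool ok = true;

        for (int k = 0; k < i; k++)
            ok = ok && in_class(s[k], "ab");
        if (!ok)
            continue;
        for (int j = i + 1; j <= n; j++)
        {
            bool ok2 = true;

            for (int k = i; k < j; k++)
                ok2 = ok2 && in_class(s[k], "bc");
            if (ok2 && (best_i < 0 || i < best_i ||
                        (i == best_i && j > best_j)))
            {
                best_i = i;
                best_j = j;
            }
        }
    }
    /* prints "[ab]+? = aa, [bc]+ = bbcc", as derived above */
    printf("[ab]+? = %.*s, [bc]+ = %.*s\n",
           best_i, s, best_j - best_i, s + best_i);
    return 0;
}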

Things get a bit more complicated when you throw alternation into the
picture, but I think it's basically right to view it as a greedy
operation.  Anything else seems rather unprincipled.  It may seem
surprising that a+?|b+? matched against aaa will match the first
branch to aaa rather than just a, but there's no real reason to
suppose that the ? which applies to the plus should somehow bleed up
and affect the interpretation of the alternation.  The RE might be
something like a+b+?|b+?a+ where the sub-REs have different greediness
interpretations and there's no obvious principled way to decide which
one should apply to the parent; or even something like cat|catfish
where the sub-REs don't contain any greedy/non-greedy operators at all
and yet we still have to assign some greediness to the alternation.

I think it's right to view every RE construct as greedy unless it's
got an explicit not-greedy flag attached to it; after all, that's the
traditional behavior of REs from time immemorial.  Someone could
invent a non-greedy form of alternation if they were so inclined.

This is different from what Perl does, but I think Perl's behavior
here is batty: given a+|a+b+ and the string aaabbb, it picks the first
branch and matches only aaa.  My whole being recoils: that surely
looks like a non-greedy interpretation of a regex with only greedy
quantifiers.  It turns out that if you write a+b+|a+ then it matches
the whole string; apparently, it selects the syntactically first
branch that can match, regardless of the length of the match, which
strikes me as nearly pure evil.  PostgreSQL - correctly, IMHO -
matches the longest substring available regardless of the syntactic
ordering; AIUI, the original charter for REs was to always match the
longest possible substring; non-greedy quantifiers were added later,
and should not be taken as a license to change the semantics of REs
that don't use them.
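
Again for reference: POSIX requires the longest leftmost match no
matter which alternation branch supplies it, which is the behavior
defended above.  Under a conformant regex(3) implementation this
hedged demo should print "aaabbb" for both orderings, unlike Perl:

#include <stdio.h>
#include <regex.h>

static void
show(const char *pattern, const char *text)
{
    regex_t    re;
    regmatch_t m;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return;
    if (regexec(&re, text, 1, &m, 0) == 0)
        printf("%s -> %.*s\n", pattern,
               (int) (m.rm_eo - m.rm_so), text + m.rm_so);
    regfree(&re);
}

int
main(void)
{
    show("a+|a+b+", "aaabbb");
    show("a+b+|a+", "aaabbb");
    return 0;
}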

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
