[HACKERS] Re: Using quicksort and a merge step to significantly improve on tuplesort's single run external sort
On 30 July 2015 at 12:26, Greg Stark st...@mit.edu wrote: On Thu, Jul 30, 2015 at 12:09 PM, Heikki Linnakangas hlinn...@iki.fi wrote: True, you need a heap to hold the next tuple from each tape in the merge step. To avoid exceeding work_mem, you'd need to push some tuples from the in-memory array to the tape to make room for that. In practice, though, the memory needed for the merge step's heap is tiny. Even if you merge 1000 tapes, you only need memory for 1000 tuples in the heap. But yeah, you'll need some logic to share/divide the in-memory array between the heap and the in-memory tail of the last tape. It's a bit worse than that because we buffer up a significant chunk of the tape to avoid randomly seeking between tapes after every tuple read. But I think in today's era of large memory we don't need anywhere near the entire work_mem just to buffer to avoid random access. Something simple like a fixed buffer size per tape, probably much less than 1MB/tape, would do. MERGE_BUFFER_SIZE is currently 0.25 MB, but there was benefit seen above that. I'd say we should scale that up to 1 MB if memory allows. Yes, that could be tiny for a small number of runs. I mention it because Heikki's comment that we could use this in the general case would not be true for larger numbers of runs. The number of runs decreases quickly with more memory anyway, so the technique is mostly good for larger memory, but certainly not for small memory allocations. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
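The heap-memory point above can be made concrete with a small, self-contained sketch (plain C, not PostgreSQL code; all names here are invented for illustration): a k-way merge keeps exactly one entry per tape in its heap, so heap memory is O(ntapes) regardless of total data volume, and the per-tape read buffers (MERGE_BUFFER_SIZE) are what actually dominate the memory budget.

```c
/* Simplified k-way merge sketch: the binary heap holds exactly one
 * entry per input run ("tape"), illustrating why the merge heap itself
 * only needs memory for ntapes tuples. Not PostgreSQL code. */
#include <assert.h>
#include <stdlib.h>

typedef struct {
    const int *run;   /* sorted input run ("tape") */
    int len;          /* number of tuples in the run */
    int pos;          /* next tuple to read */
} Tape;

typedef struct { int value; int tape; } HeapEntry;

static void sift_down(HeapEntry *h, int n, int i)
{
    for (;;) {
        int l = 2 * i + 1, r = l + 1, m = i;
        if (l < n && h[l].value < h[m].value) m = l;
        if (r < n && h[r].value < h[m].value) m = r;
        if (m == i) break;
        HeapEntry t = h[i]; h[i] = h[m]; h[m] = t;
        i = m;
    }
}

/* Merge ntapes sorted runs into out[]; returns tuples written. */
int merge_runs(Tape *tapes, int ntapes, int *out)
{
    /* Heap memory: one entry per tape, i.e. O(ntapes), independent of
     * the total number of tuples being merged. */
    HeapEntry *heap = malloc(ntapes * sizeof(HeapEntry));
    int n = 0, written = 0;

    for (int i = 0; i < ntapes; i++)
        if (tapes[i].len > 0)
            heap[n++] = (HeapEntry){ tapes[i].run[tapes[i].pos++], i };
    for (int i = n / 2 - 1; i >= 0; i--)
        sift_down(heap, n, i);

    while (n > 0) {
        out[written++] = heap[0].value;
        Tape *t = &tapes[heap[0].tape];
        if (t->pos < t->len)
            heap[0].value = t->run[t->pos++];   /* refill from same tape */
        else
            heap[0] = heap[--n];                /* tape exhausted */
        sift_down(heap, n, 0);
    }
    free(heap);
    return written;
}
```

The per-tape buffering the thread discusses would sit in front of `tapes[i].run`, reading MERGE_BUFFER_SIZE-sized chunks; the heap's footprint stays one tuple per tape either way.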
Re: [HACKERS] [PATCH] Microvacuum for gist.
Hi! On Thu, Jul 30, 2015 at 2:51 PM, Anastasia Lubennikova lubennikov...@gmail.com wrote: I have written microvacuum support for the gist access method. Briefly, microvacuum includes two steps: 1. When a search tells us that a tuple is invisible to all transactions, it is marked LP_DEAD and the page is marked as having dead tuples. 2. Then, when an insert touches a full page which has dead tuples, it calls microvacuum instead of splitting the page. You can find a kind of review here [1]. [1] http://www.google-melange.com/gsoc/proposal/public/google/gsoc2015/ivanitskiy_ilya/5629499534213120 Patch is in attachments. Please review it. Nice! Some notes about this patch. 1) Could you give some test case demonstrating that microvacuum really works with the patch? Ultimately, we should see the index growing less with microvacuum. 2) Generating notices for every dead tuple would be too noisy. I suggest replacing the notice with one of the debug levels. + elog(NOTICE, "gistkillitems. Mark Item Dead offnum %hd, blkno %d", offnum, BufferGetBlockNumber(buffer)); 3) Please, recheck the coding style. For instance, this line needs more spaces, and the open brace should be on the next line. + if ((scan->kill_prior_tuple) && (so->curPageData > 0) && (so->curPageData == so->nPageData)) { -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] multivariate statistics / patch v7
Hi, On 07/30/2015 10:21 AM, Heikki Linnakangas wrote: On 05/25/2015 11:43 PM, Tomas Vondra wrote: There are 6 files attached, but only 0002-0006 are actually part of the multivariate statistics patch itself. All of these patches are huge. In order to review this in a reasonable amount of time, we need to do this in several steps. So let's see what would be the minimal set of these patches that could be reviewed and committed, while still being useful. The main patches are: 1. shared infrastructure and functional dependencies 2. clause reduction using functional dependencies 3. multivariate MCV lists 4. multivariate histograms 5. multi-statistics estimation Would it make sense to commit only patches 1 and 2 first? Would that be enough to get a benefit from this? I agree that the patch can't be reviewed as a single chunk - that was the idea when I split the original (single chunk) patch into multiple smaller pieces. And yes, I believe committing pieces 1 and 2 might be enough to get something useful, which can then be improved by adding the usual MCV and histogram stats on top of that. I have some doubts about the clause reduction and functional dependencies part of this. It seems to treat functional dependency as a boolean property, but even with the classic zipcode and city case, it's not always an all-or-nothing thing. At least in some countries, there can be zipcodes that span multiple cities. So zipcode=X does not completely imply city=Y, although there is a strong correlation (if that's the right term). How strong does the correlation need to be for this patch to decide that zipcode implies city? I couldn't actually see a clear threshold stated anywhere. So rather than treating functional dependence as a boolean, I think it would make more sense to put a 0.0-1.0 number to it. That means that you can't do clause reduction like it's done in this patch, where you actually remove clauses from the query for cost estimation purposes. 
Instead, you need to calculate the selectivity for each clause independently, but instead of just multiplying the selectivities together, apply the dependence factor to it. Does that make sense? I haven't really looked at the MCV, histogram and multi-statistics estimation patches yet. Do those patches make the clause reduction patch obsolete? Should we forget about the clause reduction and functional dependency patch, and focus on those later patches instead? Perhaps. It's true that most real-world data sets are not 100% valid with respect to functional dependencies - either because of natural imperfections (multiple cities with the same ZIP code) or just noise in the data (incorrect entries ...). And it's even mentioned in the code comments somewhere, I guess. But there are two main reasons why I chose not to extend the functional dependencies with the [0.0-1.0] value you propose. Firstly, functional dependencies were meant to be the simplest possible implementation, illustrating how the infrastructure is supposed to work (which is the main topic of the first patch). Secondly, all kinds of statistics are simplifications of the actual data. So I think it's not incorrect to ignore the exceptions up to some threshold. I also don't think this will make the estimates globally better. Let's say you have 1% of rows that contradict the functional dependency - you may either ignore them and have good estimates for 99% of the values and incorrect estimates for 1%, or tweak the rule a bit and make the estimates worse for 99% (and possibly better for 1%). That being said, I'm not against improving the functional dependencies. I already do have some improvements on my TODO - like for example dependencies on more columns (not just A=B but [A,B]=C and such), but I think we should not squash this into those two patches. 
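The 0.0-1.0 dependency degree being discussed could combine per-clause selectivities roughly like this (a hedged sketch: the blending formula and names below are illustrative assumptions, not taken from the patch):

```c
/* Illustrative sketch of a [0.0, 1.0] functional-dependency degree.
 * f = 1.0: B is fully implied by A, so the implied clause adds no extra
 * filtering and the combined selectivity is min(sA, sB).
 * f = 0.0: independent clauses, plain multiplication as today.
 * The linear blend in between is an assumption for illustration. */
#include <assert.h>

double combined_selectivity(double sA, double sB, double f)
{
    double dependent = (sA < sB) ? sA : sB;   /* implied clause adds nothing */
    double independent = sA * sB;             /* classic independence model */
    return f * dependent + (1.0 - f) * independent;
}
```

With f near 1.0 this avoids the underestimate that multiplying the two selectivities would produce for zipcode-and-city style clauses, without removing either clause from the query.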
And yet another point - ISTM these cases might easily be handled better by the statistics based on ndistinct coefficients, as proposed by Kyotaro-san some time ago. That is, compute and track ndistinct(A) * ndistinct(B) / ndistinct(A,B) for all pairs of columns (or possibly larger groups). That seems to be similar to the coefficient you propose. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
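The ndistinct coefficient mentioned above is easy to compute naively for illustration (self-contained C with invented names; a real implementation would estimate the ndistinct values from a sample rather than scanning exact data):

```c
/* Sketch of the ndistinct-coefficient idea:
 *   ndistinct(A) * ndistinct(B) / ndistinct(A,B)
 * For fully dependent columns (B determined by A) the coefficient equals
 * ndistinct(B); for independent columns it approaches 1.0. */
#include <assert.h>

/* Naive distinct count over n (a,b) pairs; use_a/use_b select which
 * columns participate (O(n^2), fine for illustration). */
static int ndistinct(const int *a, const int *b, int n, int use_a, int use_b)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        int dup = 0;
        for (int j = 0; j < i; j++) {
            if ((!use_a || a[i] == a[j]) && (!use_b || b[i] == b[j])) {
                dup = 1;
                break;
            }
        }
        if (!dup)
            count++;
    }
    return count;
}

double ndistinct_coefficient(const int *a, const int *b, int n)
{
    double da  = ndistinct(a, b, n, 1, 0);
    double db  = ndistinct(a, b, n, 0, 1);
    double dab = ndistinct(a, b, n, 1, 1);
    return da * db / dab;
}
```

For a zipcode/city-style pair where each zipcode maps to one city, ndistinct(A,B) equals ndistinct(A), so the coefficient collapses to ndistinct(B); for independent columns it tends toward 1.0, which is what makes it usable as a continuous dependence measure.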
Re: [HACKERS] 9.5a1 BUG FIX: pgbench negative latencies
On 07/27/2015 03:43 PM, Fabien COELHO wrote: Under 9.5a1 pgbench -r negative latencies are reported on meta commands, probably as an oversight of 84f0ea3f. This patch ensures that now is reset on each loop inside doCustom. Applied, thanks! - Heikki
Re: [HACKERS] [COMMITTERS] pgsql: Row-Level Security Policies (RLS)
On 30 July 2015 at 01:35, Joe Conway m...@joeconway.com wrote: On 06/01/2015 02:21 AM, Dean Rasheed wrote: While going through this, I spotted another issue --- in a DML query with additional non-target relations, such as UPDATE t1 .. FROM t2 .., the old code was checking the UPDATE policies of both t1 and t2, but really I think it ought to be checking the SELECT policies of t2 (in the same way as this query requires SELECT table permissions on t2, not UPDATE permissions). I've changed that and added new regression tests to test that change. I assume the entire refactoring patch needs a fair bit of work to rebase against current HEAD, Actually, there haven't been any conflicting changes so far, so a git rebase was able to automatically merge correctly -- new patch attached, with some minor comment rewording (not affecting the bug-fix part). Even so, I agree that it makes sense to apply the bug-fix separately, since it's not really anything to do with the refactoring. but I picked out the attached to address just the above issue. Does this look correct, and if so does it make sense to apply at least this part right now? Looks correct to me. Thanks. Regards, Dean diff --git a/src/backend/commands/policy.c b/src/backend/commands/policy.c new file mode 100644 index bcf4a8f..cb689ec --- a/src/backend/commands/policy.c +++ b/src/backend/commands/policy.c @@ -186,9 +186,6 @@ policy_role_list_to_array(List *roles, i /* * Load row security policy from the catalog, and store it in * the relation's relcache entry. - * - * We will always set up some kind of policy here. If no explicit policies - * are found then an implicit default-deny policy is created. 
*/ void RelationBuildRowSecurity(Relation relation) @@ -246,7 +243,6 @@ RelationBuildRowSecurity(Relati char *with_check_value; Expr *with_check_qual; char *policy_name_value; - Oid policy_id; bool isnull; RowSecurityPolicy *policy; @@ -298,14 +294,11 @@ RelationBuildRowSecurity(Relati else with_check_qual = NULL; - policy_id = HeapTupleGetOid(tuple); - /* Now copy everything into the cache context */ MemoryContextSwitchTo(rscxt); policy = palloc0(sizeof(RowSecurityPolicy)); policy->policy_name = pstrdup(policy_name_value); - policy->policy_id = policy_id; policy->polcmd = cmd_value; policy->roles = DatumGetArrayTypePCopy(roles_datum); policy->qual = copyObject(qual_expr); @@ -326,40 +319,6 @@ RelationBuildRowSecurity(Relati systable_endscan(sscan); heap_close(catalog, AccessShareLock); - - /* - * Check if no policies were added - * - * If no policies exist in pg_policy for this relation, then we need - * to create a single default-deny policy. We use InvalidOid for the - * Oid to indicate that this is the default-deny policy (we may decide - * to ignore the default policy if an extension adds policies). 
- */ - if (rsdesc->policies == NIL) - { - RowSecurityPolicy *policy; - Datum role; - - MemoryContextSwitchTo(rscxt); - - role = ObjectIdGetDatum(ACL_ID_PUBLIC); - - policy = palloc0(sizeof(RowSecurityPolicy)); - policy->policy_name = pstrdup("default-deny policy"); - policy->policy_id = InvalidOid; - policy->polcmd = '*'; - policy->roles = construct_array(role, 1, OIDOID, sizeof(Oid), true, - 'i'); - policy->qual = (Expr *) makeConst(BOOLOID, -1, InvalidOid, - sizeof(bool), BoolGetDatum(false), - false, true); - policy->with_check_qual = copyObject(policy->qual); - policy->hassublinks = false; - - rsdesc->policies = lcons(policy, rsdesc->policies); - - MemoryContextSwitchTo(oldcxt); - } } PG_CATCH(); { diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c new file mode 100644 index 2c65a90..c28eb2b --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -1815,14 +1815,26 @@ ExecWithCheckOptions(WCOKind kind, Resul break; case WCO_RLS_INSERT_CHECK: case WCO_RLS_UPDATE_CHECK: - ereport(ERROR, - (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + if (wco->polname != NULL) + ereport(ERROR, +(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("new row violates row level security policy \"%s\" for \"%s\"", + wco->polname, wco->relname))); + else + ereport(ERROR, +(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), errmsg("new row violates row level security policy for \"%s\"", wco->relname))); break; case WCO_RLS_CONFLICT_CHECK: - ereport(ERROR, - (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + if (wco->polname != NULL) + ereport(ERROR, +(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("new row violates row level security policy \"%s\" (USING expression) for \"%s\"", + wco->polname, wco->relname))); + else + ereport(ERROR, +(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), +
[HACKERS] [PATCH] Microvacuum for gist.
Hi, I have written microvacuum support for the gist access method. Briefly, microvacuum includes two steps: 1. When a search tells us that a tuple is invisible to all transactions, it is marked LP_DEAD and the page is marked as having dead tuples. 2. Then, when an insert touches a full page which has dead tuples, it calls microvacuum instead of splitting the page. You can find a kind of review here [1]. [1] http://www.google-melange.com/gsoc/proposal/public/google/gsoc2015/ivanitskiy_ilya/5629499534213120 Patch is in attachments. Please review it. -- Best regards, Lubennikova Anastasia microvacuum_for_gist.patch Description: Binary data
[HACKERS] 64-bit XIDs again
Hackers, I know there were already a couple of threads about 64-bit XIDs. http://www.postgresql.org/message-id/42cccfe9.9040...@intellilink.co.jp http://www.postgresql.org/message-id/4f6c0e13.3080...@wiesinger.com I read them carefully, but I didn't find all the arguments for 64-bit XIDs mentioned. That's why I'd like to raise this subject again. Hardware capabilities are now much higher than when Postgres was designed. In modern PostgreSQL scalability tests it's typical to achieve 400,000-500,000 tps with pgbench. At such tps it takes only a few minutes to reach the default autovacuum_freeze_max_age = 200 million. The notion of wraparound has been evolving over time. Initially it was something that almost never happens. Then it became something that could happen rarely, and that we should be ready for (freeze tuples in advance). Now, it has become a quite frequent periodic event for high-load databases, and DB admins have to take its performance impact into account. A typical scenario that I've faced in real life is this. A database is divided into an operative part and an archive part. The operative part is small (dozens of gigabytes) and serves most of the transactions. The archive part is relatively large (some terabytes) and serves rare selects and bulk inserts. Autovacuum works very actively on the operative part and very lazily on the archive part (as expected). The system works well until one day the age of the archive tables exceeds autovacuum_freeze_max_age. Then all autovacuum workers start doing anti-wraparound autovacuum on the archive tables. Even if the system I/O survives this, the operative tables get bloated because all autovacuum workers are busy with the archive tables. In such situations I typically advise increasing autovacuum_freeze_max_age and running vacuum freeze manually when the system has enough free resources. As I mentioned in the CSN thread, it would be nice to replace XID with CSN when setting hint bits for a tuple. 
In this case, when hint bits are set we don't need any additional lookups to check visibility. http://www.postgresql.org/message-id/CAPpHfdv7BMwGv=ofug3s-jgvfkqhi79pr_zk1wsk-13oz+c...@mail.gmail.com Introducing a 32-bit CSN doesn't seem reasonable to me, because it would double our troubles with wraparound. Also, I think it's possible to migrate to 64-bit XIDs without breaking pg_upgrade. Old tuples can be left with 32-bit XIDs while new tuples would be created with 64-bit XIDs. We can use free bits in t_infomask2 to distinguish the old and new formats. Any thoughts? Do you think 64-bit XIDs are worth it? -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
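The t_infomask2 idea can be sketched as follows (all names, bit values, and field layout here are invented for illustration; these are not actual PostgreSQL structures):

```c
/* Hedged sketch of the migration idea: reserve an assumed-free bit in
 * t_infomask2 to mean "this tuple carries 64-bit XIDs", so pages written
 * by an old cluster keep their 32-bit xmin/xmax readable after
 * pg_upgrade, while new tuples use the wide format. */
#include <assert.h>
#include <stdint.h>

#define SKETCH_XID_IS_64BIT 0x2000   /* hypothetical free t_infomask2 bit */

typedef struct {
    uint16_t t_infomask2;
    uint32_t xmin32;     /* old-format field */
    uint64_t xmin64;     /* new-format field (would live elsewhere on disk) */
} TupleSketch;

/* Read xmin as a 64-bit value regardless of on-disk format. */
uint64_t sketch_get_xmin(const TupleSketch *tup, uint64_t old_epoch)
{
    if (tup->t_infomask2 & SKETCH_XID_IS_64BIT)
        return tup->xmin64;
    /* Old-format tuples are interpreted relative to the epoch they were
     * written in (epoch 0 right after pg_upgrade). */
    return (old_epoch << 32) | tup->xmin32;
}
```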
[HACKERS] [BUG] CustomScan-custom_plans are not copied
Hello, The attached patch fixes my careless omissions from when the custom_plans field was added. Please apply this fix. Thanks, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei kai...@ak.jp.nec.com pgsql-v9.5-fix-copy-custom-scan.patch Description: pgsql-v9.5-fix-copy-custom-scan.patch
Re: [HACKERS] creating extension including dependencies
On 2015-07-27 15:18, Michael Paquier wrote: On Sun, Jul 26, 2015 at 1:01 AM, Petr Jelinek wrote: Yes that's what I meant by the change of checking order in the explanation above. I did that because I thought the code would be more complicated otherwise, but apparently I was stupid... +In case the extension specifies schema in its control file, the schema s/schema/<literal>schema</literal>/ +++ b/src/test/modules/test_extensions/.gitignore @@ -0,0 +1,3 @@ +# Generated subdirectories +/results/ +/tmp_check/ test_extensions/.gitignore is missing /log/. Something also has not been discussed yet: what to do with new_version and old_version (the options of CreateExtensionStmt)? As of now, if those options are defined they are not passed down to the parent extensions, but shouldn't we raise an error if they are used in combination with CASCADE? In any case, I think that the behavior chosen should be documented. I don't see why we should raise an error; they are used just for the top-level extension and I think it makes sense that way. The CASCADE documentation says the SCHEMA option is cascaded to required extensions; do we need to say something more than that (i.e. explicitly saying that old_version and new_version aren't)? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] 64-bit XIDs again
On 07/30/2015 05:57 PM, Alexander Korotkov wrote: On Thu, Jul 30, 2015 at 5:24 PM, Heikki Linnakangas hlinn...@iki.fi wrote: I think we should move to 64-bit XIDs in in-memory structs snapshots, proc array etc. And expand clog to handle 64-bit XIDs. But keep the xmin/xmax fields on heap pages at 32-bits, and add an epoch-like field to the page header so that logically the xmin/xmax fields on the page are 64 bits wide, but physically stored in 32 bits. That's possible as long as no two XIDs on the same page are more than 2^31 XIDs apart. So you still need to freeze old tuples on the page when that's about to happen, but it would make it possible to have more than 2^32 XID transactions in the clog. You'd never be forced to do anti-wraparound vacuums, you could just let the clog grow arbitrarily large. Nice idea. Storing extra epoch would be extra 4 bytes per heap tuple instead of extra 8 bytes per tuple if storing 64 bits xmin/xmax. No, I was thinking that the epoch would be stored *per page*, in the page header. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] LWLock deadlock and gdb advice
On 07/29/2015 09:35 PM, Andres Freund wrote: On 2015-07-29 20:23:24 +0300, Heikki Linnakangas wrote: Backend A has called LWLockWaitForVar(X) on a lock, and is now waiting on it. The lock holder releases the lock, and wakes up A. But before A wakes up and sees that the lock is free, another backend acquires the lock again. It runs LWLockAcquireWithVar to the point just before setting the variable's value. Now A wakes up, sees that the lock is still (or again) held, and that the variable's value still matches the old one, and goes back to sleep. The new lock holder won't wake it up until it updates the value again, or to releases the lock. I'm not sure whether this observation is about my patch or the general lwlock variable mechanism. In my opinion that behaviour exists today both in 9.4 and 9.5. In 9.4, LWLockAcquire holds the spinlock when it marks the lock as held, until it has updated the variable. And LWLockWaitForVar() holds the spinlock when it checks that the lock is held and that the variable's value matches. So it cannot happen on 9.4. To reiterate, with 9.5, it's possible that a backend is sleeping in LWLockWaitForVar(oldvar=123), even though the lock is currently held by another backend with value 124. That seems wrong, or surprising at the very least. But I think that's fine because that race seems pretty fundamental. After all, you could have called LWLockWaitForVar() just after the second locker had set the variable to the same value. I'm not talking about setting it to the same value. I'm talking about setting it to a different value. (I talked about setting it to the same value later in the email, and I agree that's a pretty fundamental problem and exists with 9.4 as well). - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 64-bit XIDs again
On 07/30/2015 08:04 AM, Simon Riggs wrote: There is a big downside to expanding xmin/xmax to 64 bits: it takes space. More space means more memory needed for caching, more memory bandwidth, more I/O, etc. My feeling is that the overhead will recede in time. Having a nice, simple change to remove old bugs and new would help us be more robust. But let's measure the overhead before we try to optimize it away. Field experience would agree with you. The amount of memory people are arbitrarily throwing at databases now is pretty significant. It is common to have 64GB of memory. Heck, I run into 128GB all the time, and seeing 192GB is no longer a "Wow". JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564 PostgreSQL Centered full stack support, consulting and development.
Re: [HACKERS] 64-bit XIDs again
On 07/30/2015 07:14 AM, Simon Riggs wrote: On 30 July 2015 at 14:26, Alexander Korotkov a.korot...@postgrespro.ru Any thoughts? Do you think 64-bit XIDs are worth it? The problem of freezing is painful, but not impossible, which is why we have held out so long. The problem of very long lived snapshots is coming closer at the same speed as freezing; there is no solution to that without 64-bit xids throughout the whole infrastructure, or CSNs. The opportunity for us to have SQL Standard historical databases becomes possible with 64-bit xids, or CSNs. That is a high value goal. I personally now think we should thoroughly investigate 64-bit xids. I don't see this as mere debate, I see this as something that we can make a patch for and scientifically analyze the pros and cons through measurement. +1 I've been thinking along similar lines to both of you for quite some time now. I think at the least we should explore an initdb-time option -- we can and should measure the pros and cons. 
-- Joe Conway
Re: [HACKERS] 64-bit XIDs again
On Thu, Jul 30, 2015 at 5:24 PM, Heikki Linnakangas hlinn...@iki.fi wrote: On 07/30/2015 04:26 PM, Alexander Korotkov wrote: Also, I think it's possible to migrate to 64-bit XIDs without breaking pg_upgrade. Old tuples can be left with 32-bit XIDs while new tuples would be created with 64-bit XIDs. We can use free bits in t_infomask2 to distinguish old and new formats. I think we should move to 64-bit XIDs in in-memory structs snapshots, proc array etc. And expand clog to handle 64-bit XIDs. But keep the xmin/xmax fields on heap pages at 32-bits, and add an epoch-like field to the page header so that logically the xmin/xmax fields on the page are 64 bits wide, but physically stored in 32 bits. That's possible as long as no two XIDs on the same page are more than 2^31 XIDs apart. So you still need to freeze old tuples on the page when that's about to happen, but it would make it possible to have more than 2^32 XID transactions in the clog. You'd never be forced to do anti-wraparound vacuums, you could just let the clog grow arbitrarily large. Nice idea. Storing an extra epoch would be an extra 4 bytes per heap tuple instead of an extra 8 bytes per tuple for storing 64-bit xmin/xmax. But if the first column is aligned to 8 bytes (e.g. bigserial), would we lose this 4-byte win to alignment? There is a big downside to expanding xmin/xmax to 64 bits: it takes space. More space means more memory needed for caching, more memory bandwidth, more I/O, etc. -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] 64-bit XIDs again
On 30 July 2015 at 14:26, Alexander Korotkov a.korot...@postgrespro.ru wrote: Any thoughts? Do you think 64-bit XIDs are worth it? The problem of freezing is painful, but not impossible, which is why we have held out so long. The problem of very long lived snapshots is coming closer at the same speed as freezing; there is no solution to that without 64-bit xids throughout the whole infrastructure, or CSNs. The opportunity for us to have SQL Standard historical databases becomes possible with 64-bit xids, or CSNs. That is a high value goal. I personally now think we should thoroughly investigate 64-bit xids. I don't see this as mere debate, I see this as something that we can make a patch for and scientifically analyze the pros and cons through measurement. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] 64-bit XIDs again
On 07/30/2015 04:26 PM, Alexander Korotkov wrote: Also, I think it's possible to migrate to 64-bit XIDs without breaking pg_upgrade. Old tuples can be leaved with 32-bit XIDs while new tuples would be created with 64-bit XIDs. We can use free bits in t_infomask2 to distinguish old and new formats. I think we should move to 64-bit XIDs in in-memory structs snapshots, proc array etc. And expand clog to handle 64-bit XIDs. But keep the xmin/xmax fields on heap pages at 32-bits, and add an epoch-like field to the page header so that logically the xmin/xmax fields on the page are 64 bits wide, but physically stored in 32 bits. That's possible as long as no two XIDs on the same page are more than 2^31 XIDs apart. So you still need to freeze old tuples on the page when that's about to happen, but it would make it possible to have more than 2^32 XID transactions in the clog. You'd never be forced to do anti-wraparound vacuums, you could just let the clog grow arbitrarily large. There is a big downside to expanding xmin/xmax to 64 bits: it takes space. More space means more memory needed for caching, more memory bandwidth, more I/O, etc. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
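Heikki's page-epoch scheme can be sketched like this (invented names; a hedged illustration of the arithmetic, not actual PostgreSQL code):

```c
/* Sketch: each page header stores a 64-bit "base" XID, tuples keep
 * 32-bit xmin/xmax, and the logical 64-bit XID is reconstructed by
 * addition. The invariant is that all XIDs on one page fit within a
 * 2^31 window, so a 32-bit offset is always unambiguous; tuples must
 * be frozen before the window would be exceeded. */
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t base_xid;   /* per-page epoch base */
} PageHeaderSketch;

/* Reconstruct the logical 64-bit XID from the stored 32-bit value. */
uint64_t logical_xid(const PageHeaderSketch *page, uint32_t stored)
{
    return page->base_xid + stored;
}

/* Can xid be stored on this page without freezing old tuples first?
 * Only if its offset from the page's base fits in the 2^31 window. */
int xid_fits_on_page(const PageHeaderSketch *page, uint64_t xid)
{
    return xid >= page->base_xid &&
           xid - page->base_xid < ((uint64_t) 1 << 31);
}
```

When `xid_fits_on_page` returns false, the page's remaining live tuples would be frozen and `base_xid` advanced, which is the "freeze when about to happen" step in the description above; the clog, keyed by full 64-bit XIDs, can then grow without forcing anti-wraparound vacuums.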
Re: [HACKERS] 64-bit XIDs again
On 30 July 2015 at 15:24, Heikki Linnakangas hlinn...@iki.fi wrote: On 07/30/2015 04:26 PM, Alexander Korotkov wrote: Also, I think it's possible to migrate to 64-bit XIDs without breaking pg_upgrade. Old tuples can be left with 32-bit XIDs while new tuples would be created with 64-bit XIDs. We can use free bits in t_infomask2 to distinguish old and new formats. I think we should move to 64-bit XIDs in in-memory structs snapshots, proc array etc. And expand clog to handle 64-bit XIDs. But keep the xmin/xmax fields on heap pages at 32-bits, and add an epoch-like field to the page header so that logically the xmin/xmax fields on the page are 64 bits wide, but physically stored in 32 bits. That's possible as long as no two XIDs on the same page are more than 2^31 XIDs apart. So you still need to freeze old tuples on the page when that's about to happen, but it would make it possible to have more than 2^32 XID transactions in the clog. You'd never be forced to do anti-wraparound vacuums, you could just let the clog grow arbitrarily large. This is a good scheme, but it assumes, as you say, that you can freeze tuples before they are more than 2^31 xids apart. That is no longer a safe assumption on high transaction rate systems with longer lived snapshots. There is a big downside to expanding xmin/xmax to 64 bits: it takes space. More space means more memory needed for caching, more memory bandwidth, more I/O, etc. My feeling is that the overhead will recede in time. Having a nice, simple change to remove old bugs and new would help us be more robust. But let's measure the overhead before we try to optimize it away. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] brin index vacuum versus transaction snapshots
Alvaro Herrera alvhe...@2ndquadrant.com writes: Kevin Grittner wrote: If you run `make installcheck` against a cluster with default_transaction_isolation = 'repeatable read' you get one failure: I am tempted to say that we should just disallow running vacuum on a table containing a brin index in a transaction-snapshot transaction. Huh? We don't allow vacuum inside a (user started) transaction now. What new restriction are you proposing? regards, tom lane
Re: [HACKERS] 64-bit XIDs again
Gavin Flower gavinflo...@archidevsys.co.nz writes: On 31/07/15 02:24, Heikki Linnakangas wrote: There is a big downside to expanding xmin/xmax to 64 bits: it takes space. More space means more memory needed for caching, more memory bandwidth, more I/O, etc. I think having a special case to save 32 bits per tuple would cause unnecessary complications, and the savings are minimal compared to the size of current modern storage devices and the typical memory used in serious database servers. I think the argument that the savings are minimal is pretty thin. It all depends on how wide your tables are --- but on a narrow table, say half a dozen ints, the current tuple size is 24 bytes header plus the same number of bytes of data. We'd be going up to 32 bytes header which makes for a 16% increase in physical table size. If your table is large, claiming that 16% doesn't hurt is just silly. But the elephant in the room is on-disk compatibility. There is absolutely no way that we can just change xmin/xmax to 64 bits without a disk format break. However, if we do something like what Heikki is suggesting, it's at least conceivable that we could convert incrementally (ie, if you find a page with the old header format, assume all tuples in it are part of epoch 0; and do not insert new tuples into it unless there is room to convert the header to new format ... but I'm not sure what we do about tuple deletion if the old page is totally full and we need to write an xmax that's past 4G). Only if you are willing to kiss off on-disk compatibility is it even worth having a discussion about whether we can afford more bloat in HeapTupleHeader. And that would be a pretty big pain point for a lot of users. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
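Tom's size arithmetic checks out: for the narrow-table example, growing the header from 24 to 32 bytes while the data stays at 24 bytes gives (32+24)/(24+24) ≈ 1.167, roughly a 16% increase in physical table size. A trivial check:

```c
/* Relative tuple growth when only the header widens: the narrower the
 * data portion, the larger the percentage hit from a bigger header. */
#include <assert.h>

double tuple_growth(int old_header, int new_header, int data)
{
    return (double)(new_header + data) / (old_header + data) - 1.0;
}
```

For wider rows the hit shrinks (e.g. 200 bytes of data gives under 4%), which is why the cost depends so strongly on how narrow the tables are.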
Re: [HACKERS] 64-bit XIDs again
On 07/30/2015 07:24 AM, Heikki Linnakangas wrote: I think we should move to 64-bit XIDs in in-memory structs (snapshots, proc array, etc.), and expand clog to handle 64-bit XIDs. But keep the xmin/xmax fields on heap pages at 32-bits, and add an epoch-like field to the page header so that logically the xmin/xmax fields on the page are 64 bits wide, but physically stored in 32 bits. That's possible as long as no two XIDs on the same page are more than 2^31 XIDs apart. So you still need to freeze old tuples on the page when that's about to happen, but it would make it possible to have more than 2^32 XID transactions in the clog. You'd never be forced to do anti-wraparound vacuums; you could just let the clog grow arbitrarily large. When I introduced the same idea a few years back, having the clog get arbitrarily large was cited as a major issue. I was under the impression that clog size had some major performance impacts. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 64-bit XIDs again
On 31/07/15 02:24, Heikki Linnakangas wrote: On 07/30/2015 04:26 PM, Alexander Korotkov wrote: Also, I think it's possible to migrate to 64-bit XIDs without breaking pg_upgrade. Old tuples can be left with 32-bit XIDs while new tuples would be created with 64-bit XIDs. We can use free bits in t_infomask2 to distinguish old and new formats. I think we should move to 64-bit XIDs in in-memory structs (snapshots, proc array, etc.), and expand clog to handle 64-bit XIDs. But keep the xmin/xmax fields on heap pages at 32-bits, and add an epoch-like field to the page header so that logically the xmin/xmax fields on the page are 64 bits wide, but physically stored in 32 bits. That's possible as long as no two XIDs on the same page are more than 2^31 XIDs apart. So you still need to freeze old tuples on the page when that's about to happen, but it would make it possible to have more than 2^32 XID transactions in the clog. You'd never be forced to do anti-wraparound vacuums, you could just let the clog grow arbitrarily large. There is a big downside to expanding xmin/xmax to 64 bits: it takes space. More space means more memory needed for caching, more memory bandwidth, more I/O, etc. - Heikki I think having a special case to save 32 bits per tuple would cause unnecessary complications, and the savings are minimal compared to the size of current modern storage devices and the typical memory used in serious database servers. I think it is too much pain for very little gain, especially when looking into the future growth in storage capacity and bandwidth. The early mainframes used a base displacement technique to keep the size of addresses down in instructions: 16 bit addresses, comprising 4 bits for a base register and 12 bits for the displacement (hence the use of 4 KB page sizes now!). Necessary at the time when mainframes were often less than 128 KB! Now it would be ludicrous to do that for modern servers! 
Cheers, Gavin (Who is ancient enough, to have programmed such MainFrames!) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
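Heikki's page-epoch scheme can be sketched abstractly. This is an illustrative rendering, not PostgreSQL code; all names are mine. The key constraint is the one he states: no two XIDs on the same page may be more than 2^31 apart, or the page must be frozen first.

```python
# Hypothetical sketch: heap tuples keep 32-bit xmin/xmax, while the page
# header carries an epoch, so the logical XID is 64 bits wide.

def logical_xid(page_epoch: int, stored_xid: int) -> int:
    """Reconstruct the full 64-bit XID from the page epoch and the
    32-bit value physically stored in the tuple header."""
    return (page_epoch << 32) | stored_xid

def needs_freeze(xids_on_page: list[int], new_xid: int) -> bool:
    """The trick only works while no two logical XIDs on the same page
    are more than 2^31 apart; beyond that, old tuples must be frozen
    before the new XID can be stored."""
    xids = xids_on_page + [new_xid]
    return max(xids) - min(xids) >= (1 << 31)

xid = logical_xid(page_epoch=1, stored_xid=100)
print(xid)                                   # 4294967396, i.e. past the 4G mark
print(needs_freeze([xid], xid + (1 << 31)))  # True: this insert forces a freeze
```

Nothing wraps around at 4G here: the clog simply keeps growing, which is the trade-off Josh raises above.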
Re: [HACKERS] multivariate statistics / patch v7
Hi, On 07/30/2015 06:58 PM, Heikki Linnakangas wrote: The problem with a threshold is that around that threshold, even a small change in the data set can drastically change the produced estimates. For example, imagine that we know from the stats that zip code implies city. But then someone adds a single row to the table with an odd zip code city combination, which pushes the estimator over the threshold, and the columns are no longer considered dependent, and the estimates are now completely different. We should avoid steep cliffs like that. BTW, what is the threshold in the current patch? There's not a simple threshold - the algorithm mining the functional dependencies is a bit more complicated. I tried to explain it in the comment before build_mv_dependencies (in dependencies.c), but let me briefly summarize it here. To mine dependency [A => B], build_mv_dependencies does this:

(1) sort the sample by {A,B}
(2) split the sample into groups with the same value of A
(3) for each group, decide if it's consistent with the dependency
    (a) if the group is too small (less than 3 rows), ignore it
    (b) if the group is consistent, update n_supporting / n_supporting_rows
    (c) if the group is inconsistent, update n_contradicting / n_contradicting_rows
(4) decide whether the dependency is valid by checking

    n_supporting_rows >= n_contradicting_rows * 10

The limit is rather arbitrary and yes - I can imagine a more complex condition (e.g. looking at average number of tuples per group etc.), but I haven't looked into that - the point was to use something very simple, only to illustrate the infrastructure. I think we might come up with some elaborate way of associating a degree with the functional dependency, but at that point we really lose the simplicity, and also make it indistinguishable from the remaining statistics (because it won't be possible to reduce the clauses like this, before performing the regular estimation). 
Which is exactly what makes the functional dependencies so neat and efficient, so I'm not overly enthusiastic about doing that. What seems more interesting is implementing the ndistinct coefficient instead, as proposed by Kyotaro-san - that seems to have the nice smooth behavior you desire, while keeping the simplicity. Both statistics types (functional dependencies and ndistinct coeff) have one weak point, though - they somehow assume the queries use compatible values. For example if you use a query with WHERE city = 'New York' AND zip = 'zip for Detroit' they can't detect cases like this, because those statistics types are oblivious to individual values. I don't see this as a fatal flaw, though - it's rather a consequence of the nature of the stats. And I tend to look at the functional dependencies the same way. If you need stats without these issues you'll have to use MCV list or a histogram. Trying to fix the simple statistics types is futile, IMHO. regards Tomas -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
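The mining rule Tomas describes can be sketched compactly. This is an illustrative Python rendering of the steps, not the patch's actual C code; the 3-row minimum and the 10:1 support ratio are the thresholds he quotes, and the function name is mine.

```python
# Toy sketch of mining the functional dependency "A implies B" from a
# sample of (A, B) pairs, following the four steps described above.
from itertools import groupby

def implies(sample, min_group=3, ratio=10):
    """Return True if the sample supports the dependency A => B."""
    supporting = contradicting = 0
    rows = sorted(sample)                       # (1) sort by {A, B}
    for _, group in groupby(rows, key=lambda r: r[0]):  # (2) group by A
        group = list(group)
        if len(group) < min_group:              # (3a) ignore tiny groups
            continue
        if len({b for _, b in group}) == 1:     # (3b) one B value: consistent
            supporting += len(group)
        else:                                   # (3c) mixed B values
            contradicting += len(group)
    # (4) valid if support dominates contradiction 10:1
    return supporting >= contradicting * ratio

zip_city = [("10001", "NYC")] * 5 + [("48201", "Detroit")] * 5
print(implies(zip_city))  # True: each zip maps to exactly one city
```

Note how a handful of rows with an odd zip/city combination can flip the result, which is exactly the cliff Heikki worries about.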
Re: [HACKERS] brin index vacuum versus transaction snapshots
Kevin Grittner wrote: If you run `make installcheck` against a cluster with default_transaction_isolation = 'repeatable read' you get one failure: I am tempted to say that we should just disallow running vacuum on a table containing a brin index in a transaction-snapshot transaction. It is possible to silence the problem by checking for vacuum flags, but (without testing) I think there will be a problem because the snapshot is acquired too early and it is possible for concurrent transactions to insert tuples in the table that the summarizing scan will not see, which will cause the index to become effectively corrupt. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for N synchronous standby servers - take 2
On Sun, Jul 19, 2015 at 4:16 PM, Tom Lane t...@sss.pgh.pa.us wrote: Josh Berkus j...@agliodbs.com writes: On 07/17/2015 04:36 PM, Jim Nasby wrote: I'm guessing it'd be really ugly/hard to support at least this GUC being multi-line? Mind you, multi-line GUCs would be useful otherwise, but we don't want to hinge this feature on making that work. I'm pretty sure that changing the GUC parser to allow quoted strings to continue across lines would be trivial. The problem with it is not that it's hard, it's that omitting a closing quote mark would then result in the entire file being syntactically broken, with the error message(s) almost certainly pointing somewhere else than where the actual mistake is. Do we really want such a global reduction in friendliness to make this feature easier? Maybe shoehorning this into the GUC mechanism is the wrong thing, and what we really need is a new config file for this. The information we're proposing to store seems complex enough to justify that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 64-bit XIDs again
On Thu, Jul 30, 2015 at 5:31 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: On 31/07/15 02:24, Heikki Linnakangas wrote: On 07/30/2015 04:26 PM, Alexander Korotkov wrote: Also, I think it's possible to migrate to 64-bit XIDs without breaking pg_upgrade. Old tuples can be left with 32-bit XIDs while new tuples would be created with 64-bit XIDs. We can use free bits in t_infomask2 to distinguish old and new formats. I think we should move to 64-bit XIDs in in-memory structs (snapshots, proc array, etc.), and expand clog to handle 64-bit XIDs. But keep the xmin/xmax fields on heap pages at 32-bits, and add an epoch-like field to the page header so that logically the xmin/xmax fields on the page are 64 bits wide, but physically stored in 32 bits. That's possible as long as no two XIDs on the same page are more than 2^31 XIDs apart. So you still need to freeze old tuples on the page when that's about to happen, but it would make it possible to have more than 2^32 XID transactions in the clog. You'd never be forced to do anti-wraparound vacuums, you could just let the clog grow arbitrarily large. There is a big downside to expanding xmin/xmax to 64 bits: it takes space. More space means more memory needed for caching, more memory bandwidth, more I/O, etc. - Heikki I think having a special case to save 32 bits per tuple would cause unnecessary complications, and the savings are minimal compared to the size of current modern storage devices and the typical memory used in serious database servers. I think it is too much pain for very little gain, especially when looking into the future growth in storage capacity and bandwidth. The early mainframes used a base displacement technique to keep the size of addresses down in instructions: 16 bit addresses, comprising 4 bits for a base register and 12 bits for the displacement (hence the use of 4 KB page sizes now!). Necessary at the time when mainframes were often less than 128 KB! 
Now it would be ludicrous to do that for modern servers! Cheers, Gavin (Who is ancient enough to have programmed such MainFrames!) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers On the other hand, PG's tuple overhead is already the largest among the alternatives. Even if storage keeps getting faster and cheaper, you can't ignore the overhead of adding yet another 8 bytes to each tuple.
Re: [HACKERS] Improving test coverage of extensions with pg_dump
On Fri, Jul 31, 2015 at 6:41 AM, Andreas Karlsson andr...@proxel.se wrote: The comment is good, but what I personally do not like is that the path to the test suite is non-obvious and not self-explanatory; I would not expect src/test/tables_fk/t/ to test pg_dump and extensions. I find it too confusing to regard the test suite as testing an extension called tables_fk since in my mind that is just a necessary tool built to test something else. This is obviously a subjective thing, and I see that for example Heikki likes it the way it is. I am fine with setting this as ready for committer and letting a committer weigh in on the naming. Well, honestly, I would just have it in src/test/modules and call it a deal. Because it is now the only extension that has such interactions with perl tests. And because if modules/ gets bloated later on, we could extend prove_check to install modules from extra paths, just that it does not seem worth it to do it now for one module, and there is no guarantee that we'll have that many around. Of course there is no guarantee either that modules/ is not going to get bloated. Note as well that I will be fine with any decision taken by the committer who picks up this, this test case is useful by itself in any case. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Microvacuum for gist.
On 7/30/15 7:33 AM, Alexander Korotkov wrote: 2) Generating notices for every dead tuple would be too noisy. I suggest to replace notice with one of debug levels. + elog(NOTICE, "gistkillitems. Mark Item Dead offnum %hd, blkno %d", offnum, BufferGetBlockNumber(buffer)); Even that seems like pretty serious overkill. vacuumlazy.c doesn't have anything like that, and I don't think the BTree code does either. If you were debugging something and actually needed it I'd say drop in a temporary printf(). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Using quicksort and a merge step to significantly improve on tuplesort's single run external sort
On Thu, Jul 30, 2015 at 12:00 AM, Heikki Linnakangas hlinn...@iki.fi wrote: Hmm. You don't really need to merge the in-memory array into the tape, as you know that all the tuples in the in-memory array must come after the tuples already on the tape. You can just return all the tuples from the tape first, and then all the tuples from the array. It's more complicated than it appears, I think. Tuples may be variable sized. WRITETUP() performs a pfree(), and gives us back a variable amount of availMem. What if we dumped a single, massive, outlier tuple out when a caller passes it and it goes to the root of the heap? We'd dump that massive tuple in one go (this would be an incremental dumptuples() call, which we still do in the patch), making things !LACKMEM() again, but by a usually comfortable margin. We read in a few more regular tuples, but we're done consuming tuples before things ever get LACKMEM() again (no more dumping needed, at least with this patch applied). What prevents the tuple at the top of the in-memory heap at the point of tuplesort_performsort() (say, one of the ones added to the heap as our glut of memory was *partially* consumed) being less than the last/greatest tuple on tape? If the answer is nothing, a merge step is clearly required. This is not a problem when every single tuple is dumped, but that doesn't happen anymore. I probably should have shown more tests that tested HeapTuple sorts (not just datum tuple sorts). I agree that things at least usually happen as you describe, FWIW. I think it'd make sense to structure the code differently, to match the way I described this optimization above. Instead of adding a new tuplesort state for this, abstract this in the logical tape code. Add a function to attach an in-memory tail to a tape, and have LogicalTapeRead() read from the tail after reading the on-disk tape. The rest of the code wouldn't need to care that sometimes part of the tape is actually in memory. I'll need to think about all of that. 
Certainly, quicksorting runs in a more general way seems like a very promising idea, and I know that this patch does not go far enough. But it also adds this idea of not dumping out most tuples, which seems impossible to generalize further, so maybe it's a useful special case to start from for that reason (and also because it's where the pain is currently felt most often).

+* Note that there might actually be 2 runs, but only the
+* contents of one of them went to tape, and so we can
+* safely pretend that there is only 1 run (since we're
+* about to give up on the idea of the memtuples array being
+* a heap). This means that if our sort happened to require
+* random access, the similar single run optimization
+* below (which sets TSS_SORTEDONTAPE) might not be used at
+* all. This is because dumping all tuples out might have
+* forced an otherwise equivalent randomAccess case to
+* acknowledge a second run, which we can avoid.

Is that really true? We don't start a second run until we have to, i.e. when it's time to dump the first tuple of the second run to tape. So I don't think the case you describe above, where you have two runs but only one of them has tuples on disk, can actually happen. I think we're talking about two slightly different things. I agree that I am avoiding starting a second run because I am avoiding dumping tuples, just as you say (I describe this as avoiding acknowledging a second run). But there could still be SortTuples that have a tupindex that isn't 0 (they could be 1, to be specific). It's pretty clear from looking at the TSS_BUILDRUNS case within puttuple_common() that this is true. So, if instead you define starting a run as adding a SortTuple with a tupindex that isn't 0, then yes, this comment is true. 
The important thing is that since we're not dumping every tuple, it doesn't matter whether or not that TSS_BUILDRUNS case within puttuple_common() ever took the currentRun + 1 insertion path (which can easily happen), provided things aren't so skewed that it ends up on tape even without dumping all tuples (which seems much less likely). As I've said, this optimization will occur a lot more often than the existing one run optimization (assuming !randomAccess), as a nice side benefit of not dumping every tuple. Quicksort does not use HEAPCOMPARE(), so clearly having multiple runs in that subrun is a non-issue. Whether or not we say that a second run started, or that there was merely the unfulfilled intent to start a new, second run is just semantics. While I certainly care about semantics, my point is that we agree that this useful "pretend there is only one run" thing happens (I think). The existing one run optimization only really occurs when the range of values in the set of tuples is well characterized by the tuple values observed during initial heapification, which is bad. Or would be bad, if the existing
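Peter's point about a residual merge step can be illustrated abstractly. This is a toy sketch (keys stand in for tuples, lists for tapes; the function name is mine): Heikki's cheap "just concatenate" path only applies when the last tuple on tape sorts at or below the first in-memory tuple, and otherwise a real merge is needed.

```python
# Sketch of finishing a run whose tail is still in memory: quicksort the
# in-memory array, then either append it to the on-tape prefix (when safe)
# or merge the two sorted sequences.
import heapq

def finish_run(on_tape, in_memory):
    """Combine an already-sorted on-tape prefix with an unsorted
    in-memory tail into one fully sorted run."""
    in_memory = sorted(in_memory)          # the "quicksort" step
    if not on_tape or on_tape[-1] <= in_memory[0]:
        return on_tape + in_memory         # cheap case: plain concatenation
    return list(heapq.merge(on_tape, in_memory))  # otherwise, merge

# A dumped outlier (9) can exceed later in-memory values (3, 7),
# so concatenation alone would produce an unsorted run here.
print(finish_run([1, 4, 9], [3, 7, 12]))  # [1, 3, 4, 7, 9, 12]
```

`heapq.merge` consumes both sequences lazily, which mirrors how a real merge step would stream from the tape rather than rereading it wholesale.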
Re: [HACKERS] TAP tests are badly named
On 07/30/2015 12:40 PM, Andrew Dunstan wrote: We should describe test sets by what they test, not by how they test. TAP is a testing tool/protocol. The current set of tests we have test the programs in src/bin, and we should really name the test set by a name that reflects that, rather than the fact that we are using TAP tools to run the tests. What if we decide to test something else using TAP? Would we call that set of tests TAP tests too? --enable-tap-tests is a reasonable configuration setting, because it's about whether or not we have a TAP testing framework available, but I think we should stop calling the bin tests TAP tests and we should change the test name in vcregress.pl to a more appropriate name. In the buildfarm I'm calling the step bin-check: http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=crake&dt=2015-07-30%2012%3A25%3A58&stg=bin-check Thoughts? In fact, looking more closely at the changes that have been made to vcregress.pl, I don't really like the way this has been done. I'm putting together some changes to bring things more into line with how the Makefiles work. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] patch: prevent user from setting wal_buffers over 2GB bytes
Hackers, In guc.c, the maximum for wal_buffers is INT_MAX. However, wal_buffers is actually measured in 8KB buffers, not in bytes. This means that users are able to set wal_buffers > 2GB. When the database is started, this can cause a core dump if the WAL offset is > 2GB. Attached patch fixes this issue by lowering the maximum for wal_buffers:

josh@Radegast:~/pg94a$ bin/pg_ctl -D data start
server starting
josh@Radegast:~/pg94a$ LOG: 393216 is outside the valid range for parameter "wal_buffers" (-1 .. 262143)
FATAL: configuration file "/home/josh/pg94a/data/postgresql.conf" contains errors

Thanks to Andrew Gierth for diagnosing this issue. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
new file mode 100644
index 1b7b914..b3dac51
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
*** static struct config_int ConfigureNamesInt[] =
*** 2215,2221 ****
  		GUC_UNIT_XBLOCKS
  	},
  	&XLOGbuffers,
! 	-1, -1, INT_MAX,
  	check_wal_buffers, NULL, NULL
  },
--- 2215,2221 ----
  		GUC_UNIT_XBLOCKS
  	},
  	&XLOGbuffers,
! 	-1, -1, (INT_MAX / XLOG_BLCKSZ),
  	check_wal_buffers, NULL, NULL
  },

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
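The arithmetic behind the new limit is easy to verify; a small sketch (assuming the default XLOG_BLCKSZ of 8192 bytes) reproduces the 262143 ceiling shown in the error message:

```python
# wal_buffers is counted in XLOG_BLCKSZ units, so the byte total must
# stay representable in a signed 32-bit offset, i.e. strictly below 2 GB.
INT_MAX = 2**31 - 1
XLOG_BLCKSZ = 8192  # default WAL block size

max_wal_buffers = INT_MAX // XLOG_BLCKSZ
print(max_wal_buffers)                        # 262143, matching the error message
print(max_wal_buffers * XLOG_BLCKSZ < 2**31)  # True: byte offset stays under 2 GB
```

One more buffer (262144) would be exactly 2 GB of WAL buffers, which overflows a signed 32-bit byte offset; hence the patch caps the GUC at INT_MAX / XLOG_BLCKSZ rather than INT_MAX.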
Re: [HACKERS] 64-bit XIDs again
On 2015-07-30 23:23, Tom Lane wrote: Gavin Flower gavinflo...@archidevsys.co.nz writes: On 31/07/15 02:24, Heikki Linnakangas wrote: There is a big downside to expanding xmin/xmax to 64 bits: it takes space. More space means more memory needed for caching, more memory bandwidth, more I/O, etc. I think having a special case to save 32 bits per tuple would cause unnecessary complications, and the savings are minimal compared to the size of current modern storage devices and the typical memory used in serious database servers. I think the argument that the savings are minimal is pretty thin. It all depends on how wide your tables are --- but on a narrow table, say half a dozen ints, the current tuple size is 24 bytes header plus the same number of bytes of data. We'd be going up to 32 bytes header which makes for a 16% increase in physical table size. If your table is large, claiming that 16% doesn't hurt is just silly. But the elephant in the room is on-disk compatibility. There is absolutely no way that we can just change xmin/xmax to 64 bits without a disk format break. However, if we do something like what Heikki is suggesting, it's at least conceivable that we could convert incrementally (ie, if you find a page with the old header format, assume all tuples in it are part of epoch 0; and do not insert new tuples into it unless there is room to convert the header to new format ... We could theoretically do a similar thing with 64-bit xmin/xmax though - detect that a page is in the old format and convert all tuples there to 64-bit xmin/xmax. But I agree that we don't want to increase bloat per tuple as it's already too big. but I'm not sure what we do about tuple deletion if the old page is totally full and we need to write an xmax that's past 4G). If the page is too full we could move some data to a different (or new) page. 
For me the bigger issue is that we'll still have to refreeze pages: if tuples are updated or deleted in a different epoch than the one they were inserted in, the new version of the tuple has to go to a different page, and the old page will have free space that can't be used by new tuples since the system is now in a different epoch. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Reload SSL certificates on SIGHUP
On 07/29/2015 03:24 AM, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: I don't have a problem with rebuilding the SSL context on every reload cycle. We already do a lot of extra reloading every time, so a bit more shouldn't hurt. But I'm not so sure whether we should do that in the SIGHUP handler. I don't know how we got into the situation of doing all the file reloads directly in the handler, but at least we can control that code. Making a bunch of calls into an external library is a different thing, though. Can we find a way to do this differently? Do we have an idea how expensive it is to load that data? A brute-force answer is to not have the postmaster load it at all, but to have new backends do so (if needed) during their connection acceptance/authentication phase. I'm not sure how much that would add to the SSL connection startup time though. It would also mean that problems with the SSL config files would only be reported during subsequent connection starts, not at SIGHUP time, and indeed that SIGHUP is more or less meaningless for SSL file changes: the instant you change a file, it's live for later connections. On the plus side, it would make Windows and Unix behavior closer, since (I suppose) we're reloading that stuff anyway in EXEC_BACKEND builds. I measured it taking ~0.3ms to build the new SSL context in a simple benchmark (certificate + CA + small crl). Personally I do not think moving this to connection start would be worth it since reporting errors that late is not nice for people who have misconfigured their database, and even if my benchmarks indicates it is relatively cheap to reload SSL adding more work to connection establishing is something I would want to avoid unless it gives us a clear benefit. I'm not entirely sure your concern is valid, though. We have always had the principle that almost everything of interest in the postmaster happens in signal handler functions. 
We could possibly change things so that reloading config files is done in the main loop of ServerLoop, but if we did, it would have to execute with all signals blocked, which seems like just about as much of a risk for third-party code as executing that code in a signal handler is. Agreed, I am not sure what moving it to the main loop would gain us. -- Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Using quicksort and a merge step to significantly improve on tuplesort's single run external sort
On Thu, Jul 30, 2015 at 11:32 AM, Robert Haas robertmh...@gmail.com wrote: Very interesting. And great performance numbers. Thanks for taking the time to investigate this - really cool. Thanks. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Using quicksort and a merge step to significantly improve on tuplesort's single run external sort
On Thu, Jul 30, 2015 at 3:47 AM, Simon Riggs si...@2ndquadrant.com wrote: This is a good optimization for the common case where tuples are mostly already in order. We could increase the usefulness of this by making UPDATE pick blocks that are close to a tuple's original block, rather than putting them near the end of a relation. Not sure what you mean here. So here's a shorter/different explanation of this optimization: When it's time to perform the sort, instead of draining the in-memory heap one tuple at a time to the last tape, you sort the heap with quicksort, and pretend that the sorted heap belongs to the last tape, after all the other tuples in the tape. Some questions/thoughts on that: Isn't that optimization applicable even when you have multiple runs? Quicksorting the heap and keeping it as an array in memory is surely always faster than heapsorting and pushing it to the tape. It's about use of memory. If you have multiple runs on tape, then they will need to be merged and you need memory to do that efficiently. If there are tuples in the last batch still in memory then it can work, but it depends upon how full memory is from the last batch and how many batches there are. I agree that this optimization has a lot to do with exploiting the fact that you don't need to free the memtuples array for future runs because you've already received all tuples (or keep space free for previous runs). I think that we should still use quicksort on runs where this optimization doesn't work out, but I also still think that that's a different idea. Doing what I've proposed here when there are multiple runs seems less valuable, if only because you're not going to avoid that much writing. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Updatable view?
On 7/30/15 1:03 AM, Tatsuo Ishii wrote: Is anyone working on implementing or interested in implementing automatic updatable view which uses two or more tables involved (join)? SQL1999 allows it in certain conditions. I think it would be nice to have... but not to the point of working on it myself. Might be worth an email to -general to see how many people have immediate use for it. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: backend niceness / session_priority
On 7/30/15 10:54 AM, Tom Lane wrote: =?ISO-8859-15?Q?Jos=E9_Luis_Tall=F3n?= jltal...@adv-solutions.net writes: Since PostgreSQL lacks the resource management capabilities of the Big Ones ( Resource Groups - Red, WorkLoad Manager - Blue ) or the Resource Governor in MS SQL Server, we can try and approximate the requested behaviour by reducing the CPU priority (nice) of the backend in question. Please note that we would be using scheduler priority to try and modulate I/O, though I'm aware of the limitations of this mechanism. This has been proposed before, and rejected before, and I'm not seeing anything particularly new here. Without a credible mechanism for throttling I/O, nice alone does not seem very promising. Some OSes respect nice when it comes to IO scheduling, so it might still be useful. What I'm worried about is priority inversion[1]. What might be useful would be to add a set of GUCs similar to vacuum_cost_* that operated at the shared buffer level. Dunno where you'd put the sleep though (presumably all the functions where you'd put the accounting are too low-level to sleep in). [1] https://en.wikipedia.org/wiki/Priority_inversion -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Using quicksort and a merge step to significantly improve on tuplesort's single run external sort
On Thu, Jul 30, 2015 at 4:26 AM, Greg Stark st...@mit.edu wrote: I'm a bit confused where the big win comes from though. Is what's going on that the external sort only exceeded memory by a small amount so nearly all the tuples are still in memory? Yes, that's why this can be much faster just as the work_mem threshold is crossed. You get an almost internal sort, which means you can mostly quicksort, and you can avoid dumping most tuples. It's still a pretty nice win when less than half of tuples fit in memory, though -- just not as nice. Below that, the optimization isn't used. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improving test coverage of extensions with pg_dump
On 07/30/2015 04:48 AM, Michael Paquier wrote: On Thu, Jul 30, 2015 at 5:54 AM, Andreas Karlsson andr...@proxel.se wrote: What I do not like though is how the path src/test/tables_fk/t/ tells us nothing about what features of PostgreSQL are tested here. For this I personally prefer the earlier versions where I think that was clear. +Simple module used to check data dump ordering of pg_dump with tables +linked with foreign keys. The README mentions that this is to test pg_dump, perhaps referring to the TAP tests makes sense as well? The comment is good, but what I personally do not like is that the path to the test suite is non-obvious and not self-explanatory; I would not expect src/test/tables_fk/t/ to test pg_dump and extensions. I find it too confusing to regard the test suite as testing an extension called tables_fk since in my mind that is just a necessary tool built to test something else. This is obviously a subjective thing, and I see that for example Heikki likes it the way it is. I am fine with setting this as ready for committer and letting a committer weigh in on the naming. Another thought: would it be worthwhile to also add an assertion to check if the data really was restored properly or would that just be redundant code? That seems worth doing, hence added something for it. Thanks for the suggestion. At the same time I have added an entry in .gitignore for the generated tmp_check/. Nice changes. -- Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] creating extension including dependencies
On Thu, Jul 30, 2015 at 10:58 PM, Petr Jelinek wrote: On 2015-07-27 15:18, Michael Paquier wrote: Something also has not been discussed yet: what to do with new_version and old_version (the options of CreateExtensionStmt)? As of now if those options are defined they are not passed down to the parent extensions, but shouldn't we raise an error if they are used in combination with CASCADE? In any case, I think that the behavior chosen should be documented. I don't see why we should raise an error, they are used just for the top-level extension and I think it makes sense that way. CASCADE documentation says the SCHEMA option is cascaded to required extensions, do we need to say something more than that (ie explicitly saying that the old_version and new_version aren't)? OK, let's do so then. I think that we should still document the fact that the old and new version strings are not passed to the parent extensions when cascade is used, for clarity. Something like: Other options are not recursively applied when the CASCADE clause is used. I have been through this patch one last time and changed the following: - Improved documentation: missing markups with literal, SCHEMA is a clause and not a parameter, added explanation that options other than SCHEMA are not applied recursively with CASCADE - Corrected .gitignore in test_extensions, log/ was missing. Those are minor things though, hence I just switched the patch to Ready for committer.
-- Michael diff --git a/contrib/hstore_plperl/expected/hstore_plperl.out b/contrib/hstore_plperl/expected/hstore_plperl.out index cf384eb..25fc506 100644 --- a/contrib/hstore_plperl/expected/hstore_plperl.out +++ b/contrib/hstore_plperl/expected/hstore_plperl.out @@ -1,6 +1,6 @@ -CREATE EXTENSION hstore; -CREATE EXTENSION plperl; -CREATE EXTENSION hstore_plperl; +CREATE EXTENSION hstore_plperl CASCADE; +NOTICE: installing required extension hstore +NOTICE: installing required extension plperl SELECT transforms.udt_schema, transforms.udt_name, routine_schema, routine_name, group_name, transform_type diff --git a/contrib/hstore_plperl/expected/hstore_plperlu.out b/contrib/hstore_plperl/expected/hstore_plperlu.out index 8c689ad..3fc3201 100644 --- a/contrib/hstore_plperl/expected/hstore_plperlu.out +++ b/contrib/hstore_plperl/expected/hstore_plperlu.out @@ -1,6 +1,6 @@ -CREATE EXTENSION hstore; -CREATE EXTENSION plperlu; -CREATE EXTENSION hstore_plperlu; +CREATE EXTENSION hstore_plperlu CASCADE; +NOTICE: installing required extension hstore +NOTICE: installing required extension plperlu SELECT transforms.udt_schema, transforms.udt_name, routine_schema, routine_name, group_name, transform_type diff --git a/contrib/hstore_plperl/sql/hstore_plperl.sql b/contrib/hstore_plperl/sql/hstore_plperl.sql index 0f70f14..9398aed 100644 --- a/contrib/hstore_plperl/sql/hstore_plperl.sql +++ b/contrib/hstore_plperl/sql/hstore_plperl.sql @@ -1,6 +1,4 @@ -CREATE EXTENSION hstore; -CREATE EXTENSION plperl; -CREATE EXTENSION hstore_plperl; +CREATE EXTENSION hstore_plperl CASCADE; SELECT transforms.udt_schema, transforms.udt_name, routine_schema, routine_name, diff --git a/contrib/hstore_plperl/sql/hstore_plperlu.sql b/contrib/hstore_plperl/sql/hstore_plperlu.sql index 3cfb2fd..8d8508c 100644 --- a/contrib/hstore_plperl/sql/hstore_plperlu.sql +++ b/contrib/hstore_plperl/sql/hstore_plperlu.sql @@ -1,6 +1,4 @@ -CREATE EXTENSION hstore; -CREATE EXTENSION plperlu; -CREATE EXTENSION 
hstore_plperlu; +CREATE EXTENSION hstore_plperlu CASCADE; SELECT transforms.udt_schema, transforms.udt_name, routine_schema, routine_name, diff --git a/contrib/hstore_plpython/expected/hstore_plpython.out b/contrib/hstore_plpython/expected/hstore_plpython.out index b7a6a92..55f4efe 100644 --- a/contrib/hstore_plpython/expected/hstore_plpython.out +++ b/contrib/hstore_plpython/expected/hstore_plpython.out @@ -1,5 +1,5 @@ -CREATE EXTENSION plpython2u; -CREATE EXTENSION hstore_plpython2u; +CREATE EXTENSION hstore_plpython2u CASCADE; +NOTICE: installing required extension plpython2u -- test hstore - python CREATE FUNCTION test1(val hstore) RETURNS int LANGUAGE plpythonu diff --git a/contrib/hstore_plpython/sql/hstore_plpython.sql b/contrib/hstore_plpython/sql/hstore_plpython.sql index 9ff2ebc..d55beda 100644 --- a/contrib/hstore_plpython/sql/hstore_plpython.sql +++ b/contrib/hstore_plpython/sql/hstore_plpython.sql @@ -1,5 +1,4 @@ -CREATE EXTENSION plpython2u; -CREATE EXTENSION hstore_plpython2u; +CREATE EXTENSION hstore_plpython2u CASCADE; -- test hstore - python diff --git a/contrib/ltree_plpython/expected/ltree_plpython.out b/contrib/ltree_plpython/expected/ltree_plpython.out index 934529e..9bee6be 100644 --- a/contrib/ltree_plpython/expected/ltree_plpython.out +++ b/contrib/ltree_plpython/expected/ltree_plpython.out @@ -1,5 +1,5 @@ -CREATE EXTENSION plpython2u; -CREATE EXTENSION ltree_plpython2u; +CREATE EXTENSION ltree_plpython2u CASCADE; +NOTICE:
Re: [HACKERS] Proposal: backend niceness / session_priority
On 07/31/2015 12:18 AM, Jim Nasby wrote: This has been proposed before, and rejected before, and I'm not seeing anything particularly new here. Without a credible mechanism for throttling I/O, nice alone does not seem very promising. Some OSes respect nice when it comes to IO scheduling, so it might still be useful. Wouldn't the bgwriter remove a lot of the usefulness? Andreas
Re: [HACKERS] Reduce ProcArrayLock contention
On Wed, Jul 29, 2015 at 11:48 PM, Andres Freund and...@anarazel.de wrote: On 2015-07-29 12:54:59 -0400, Robert Haas wrote: I would try to avoid changing lwlock.c. It's pretty easy when so doing to create mechanisms that work now but make further upgrades to the general lwlock mechanism difficult. I'd like to avoid that. I'm massively doubtful that re-implementing parts of lwlock.c is the better outcome. Then you have two different infrastructures you need to improve over time. I agree and modified the patch to use 32-bit atomics based on idea suggested by Robert and didn't modify lwlock.c. I understand. IMHO it will be a good idea though to ensure that the patch does not cause regression for other setups such as a less powerful machine or while running with lower number of clients. Okay, I have tried it on CentOS VM, but the data is totally screwed (at same client count across 2 runs the variation is quite high ranging from 300 to 3000 tps) to make any meaning out of it. I think if you want to collect data on less powerful m/c, then also atleast we should ensure that it is taken in a way that we are sure that it is not due to noise and there is actual regression, then only we can decide whether we should investigate that or not. Can you please try taking data with attached script (perf_pgbench_tpcb_write.sh), few things you need to change in script based on your setup (environment variables defined in beginning of script like PGDATA), other thing is that you need to ensure that name of binaries for HEAD and Patch should be changed in script if you are naming them differently. It will generate the performance data in test*.txt files. Also try to ensure that checkpoint should be configured such that it should not occur in-between tests, else it will be difficult to conclude the results. Some parameters you might want to consider for the same are checkpoint_timeout, checkpoint_completion_target, min_wal_size, max_wal_size. 
I was telling that fact even without my patch. Basically I have tried by commenting ProcArrayLock in ProcArrayEndTransaction. I did not get that. You mean the TPS is same if you run with commenting out ProcArrayLock in ProcArrayEndTransaction? Yes, TPS is almost same as with Patch. Is that safe to do? No, that is not safe. I have done that just to see what is the maximum gain we can get by reducing the contention around ProcArrayLock. No, autovacuum generates I/O due to which sometimes there is more variation in Write tests. Sure, but on an average does it still show similar improvements? Yes, number with and without autovacuum show similar trend, it's just that you can see somewhat better results (more performance improvement) with autovacuum=off and do manual vacuum at end of each test. Or does the test becomes IO bound and hence the bottleneck shifts somewhere else? Can you please post those numbers as well when you get chance? The numbers posted in initial mail or in this mail are with autovacuum =on. So among these only step 2 can be common among different algorithms, other's need some work specific to each optimization. Right, but if we could encapsulate that work in a function that just needs to work on some shared memory, I think we can build such infrastructure. For now, I have encapsulated the code into 2 separate functions, rather than extending LWLock.c as that could easily lead to code which might not be used in future even though currently it sounds like that and in-future if we need to use same technique elsewhere then we can look into it. Regarding the patch, the compare-and-exchange function calls that you've used would work only for 64-bit machines, right? You would need to use equivalent 32-bit calls on a 32-bit machine. 
I thought that internal API will automatically take care of it, example for msvc it uses _InterlockedCompareExchange64 which if doesn't work on 32-bit systems or is not defined, then we have to use 32-bit version, but I am not certain about that fact. Hmm. The pointer will be a 32-bit field on a 32-bit machine. I don't know how exchanging that with 64-bit integer be safe. True, but that has to be in-general taken care by 64bit atomic API's, like in this case it should fallback to either 32-bit version of API or something that can work on 32-bit m/c. I think fallback support is missing as of now in 64-bit API's which we should have if we want to use those API's, but anyway for now I have modified the patch to use 32-bit atomics. Performance Data with modified patch. pgbench setup scale factor - 300 Data is on magnetic disk and WAL on ssd. pgbench -M prepared tpc-b Data is median of 3 - 30 min runs, the detailed data for all the 3 runs is in attached document (perf_write_procarraylock_data.ods) Head : commit c53f7387 Patch : group_xid_clearing_at_trans_end_rel_v2 Client_Count/Patch_ver (TPS) 1 8 16 32 64 128 HEAD 972 6004 11060 20074 23839
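The group-clearing approach with 32-bit atomics can be illustrated with a single-threaded toy model (the class and names below are illustrative, not the patch's actual API): backends CAS themselves onto a pending list identified by proc array index rather than by pointer, which is why a 32-bit word suffices, and whoever pushed onto an empty list becomes the leader that takes ProcArrayLock once and clears XIDs for the whole group.

```python
INVALID_PROC = 0xFFFFFFFF  # sentinel that fits in a 32-bit word

class AtomicU32:
    # Single-threaded stand-in for a 32-bit atomic word; a real
    # implementation would use pg_atomic_compare_exchange_u32().
    def __init__(self, value):
        self.value = value
    def compare_exchange(self, expected, new):
        if self.value == expected:
            self.value = new
            return True
        return False

def join_pending_group(head, next_links, my_proc_index):
    # CAS-push this backend onto the pending list, retrying on conflict.
    # Returns True if we pushed onto an empty list, i.e. we are the
    # leader who will acquire ProcArrayLock and clear XIDs for everyone.
    while True:
        old_head = head.value
        next_links[my_proc_index] = old_head
        if head.compare_exchange(old_head, my_proc_index):
            return old_head == INVALID_PROC
```

Because the links are array indexes, the same scheme works identically on 32-bit and 64-bit machines, sidestepping the pointer-width question raised above.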
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
I think it's related to the problem of figuring out how many dead tuples you expect to find in the overall heap, which you need to do to have any hope of this being a comprehensive estimate. An estimate of the number of index scans while vacuuming can be made from the estimated total dead tuples in the relation and maintenance_work_mem. n_dead_tuples in pg_stat_all_tables can be used as an estimate of dead tuples. The following can be a way to estimate:

if nindexes == 0
    index_scans = 0
else if pages_all_visible
    index_scans = 0
else
    index_scans = Max((n_dead_tuples * space occupied by single dead tuple) / m_w_m, 1)

This estimates index_scans = 1 if n_dead_tuples = 0, assuming lazy scan heap is likely to find some dead tuples. If n_dead_tuples is nonzero the above estimate gives a lower bound on the number of index scans possible. Thank you, Rahila Syed
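Rahila's estimate, written out as code (the parameter names, and in particular dead_tuple_size as the per-dead-tuple space accounting, are placeholders for illustration):

```python
def estimate_index_scans(nindexes, pages_all_visible,
                         n_dead_tuples, dead_tuple_size, m_w_m):
    # No indexes, or an all-visible heap, means no index scan passes.
    if nindexes == 0 or pages_all_visible:
        return 0
    # Otherwise at least one pass; more if the dead-tuple array
    # cannot fit in maintenance_work_mem all at once.
    return max(n_dead_tuples * dead_tuple_size // m_w_m, 1)
```

The max(..., 1) is what makes this a lower bound: a stats-reported zero can still turn up dead tuples during the actual heap scan.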
Re: [HACKERS] Updatable view?
I think it would be nice to have... but not to the point of working on it myself. Might be worth an email to -general to see how many people have immediate use for it. What I am thinking about is, 1) Implement certain class of updatable views allowed in SQL:1999 (UNION ALL, natural joins) 2) Anything beyond #1 (I have no idea for now) Let me see how people are interested in... Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: [HACKERS] Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
On Thu, Jul 30, 2015 at 11:28 PM, Andres Freund and...@anarazel.de wrote: @@ -57,7 +57,8 @@ static relopt_bool boolRelOpts[] = If we go through this list, I'd rather make informed decisions about each reloption. Otherwise we're going to get patches for each of them separately over the next versions. I have no problem to do this change now instead of wait for next versions... @@ -73,7 +75,8 @@ static relopt_bool boolRelOpts[] = { fastupdate, Enables \fast update\ feature for this GIN index, - RELOPT_KIND_GIN + RELOPT_KIND_GIN, + AccessExclusiveLock }, true }, @@ -95,7 +99,8 @@ static relopt_int intRelOpts[] = { fillfactor, Packs table pages only to this percentage, - RELOPT_KIND_HEAP + RELOPT_KIND_HEAP, + AccessExclusiveLock }, HEAP_DEFAULT_FILLFACTOR, HEAP_MIN_FILLFACTOR, 100 [some other fillfactor settings] { { gin_pending_list_limit, Maximum size of the pending list for this GIN index, in kilobytes., - RELOPT_KIND_GIN + RELOPT_KIND_GIN, + AccessExclusiveLock }, -1, 64, MAX_KILOBYTES }, @@ -297,7 +325,8 @@ static relopt_string stringRelOpts[] = { buffering, Enables buffering build for this GiST index, - RELOPT_KIND_GIST + RELOPT_KIND_GIST, + AccessExclusiveLock }, 4, false, Why? These options just change things for the future and don't influence past decisions. It seems unproblematic to change them. 
+1 @@ -259,7 +283,8 @@ static relopt_real realRelOpts[] = { seq_page_cost, Sets the planner's estimate of the cost of a sequentially fetched disk page., - RELOPT_KIND_TABLESPACE + RELOPT_KIND_TABLESPACE, + AccessExclusiveLock }, -1, 0.0, DBL_MAX }, @@ -267,7 +292,8 @@ static relopt_real realRelOpts[] = { random_page_cost, Sets the planner's estimate of the cost of a nonsequentially fetched disk page., - RELOPT_KIND_TABLESPACE + RELOPT_KIND_TABLESPACE, + AccessExclusiveLock }, -1, 0.0, DBL_MAX }, @@ -275,7 +301,8 @@ static relopt_real realRelOpts[] = { n_distinct, Sets the planner's estimate of the number of distinct values appearing in a column (excluding child relations)., - RELOPT_KIND_ATTRIBUTE + RELOPT_KIND_ATTRIBUTE, + AccessExclusiveLock }, 0, -1.0, DBL_MAX }, @@ -283,7 +310,8 @@ static relopt_real realRelOpts[] = { n_distinct_inherited, Sets the planner's estimate of the number of distinct values appearing in a column (including child relations)., - RELOPT_KIND_ATTRIBUTE + RELOPT_KIND_ATTRIBUTE, + AccessExclusiveLock }, 0, -1.0, DBL_MAX }, These probably are the settings that are most interesting to change without access exlusive locks. +1 j = 0; for (i = 0; boolRelOpts[i].gen.name; i++) + { + Assert(DoLockModesConflict(boolRelOpts[i].gen.lockmode, + boolRelOpts[i].gen.lockmode)); j++; + } for (i = 0; intRelOpts[i].gen.name; i++) + { + Assert(DoLockModesConflict(intRelOpts[i].gen.lockmode, + intRelOpts[i].gen.lockmode)); j++; + } for (i = 0; realRelOpts[i].gen.name; i++) + { + Assert(DoLockModesConflict(realRelOpts[i].gen.lockmode, + realRelOpts[i].gen.lockmode)); j++; + } for (i = 0; stringRelOpts[i].gen.name; i++) + { + Assert(DoLockModesConflict(stringRelOpts[i].gen.lockmode, + stringRelOpts[i].gen.lockmode)); j++; + } j += num_custom_options; Doesn't really seem worth it to assert individually in each case here to me. What do you suggest then? 
+/* + * Determine the required LOCKMODE from an option list + */ +LOCKMODE +GetRelOptionsLockLevel(List *defList) +{ +
Re: [HACKERS] security labels on databases are bad for dump restore
1. pg_dumpall -g 2. pg_dump --create per database Gah, OK, I see your point. But we better document this, because if you need a PhD in PostgreSQL-ology to take a backup, we're not in a good place. Agreed. Though, honestly, I find this to be a cumbersome approach. I think it just makes things more confusing, even if it is well documented. Perhaps it might be necessary as a bridge to get to a better place. But my first question as an end user would be, 'why can't one tool do this?'. Also, by using 'pg_dumpall -g' aren't you potentially getting things that you don't want/need/care about? For instance, if database 'foo' is owned by 'user1' and database 'bar' is owned by 'user2' and neither have any knowledge/relation of/to the other, then when I dump 'foo', in this manner, wouldn't I also be including 'user2'? Said differently, a restore of a 'foo'-only dump would also include a 'bar' related role. That seems like a bad idea, IMHO. Maybe it can't be avoided, but I'd expect that only relevant information for the database being dumped would be included. -Adam -- Adam Brightwell - adam.brightw...@crunchydatasolutions.com Database Engineer - www.crunchydatasolutions.com
Re: [HACKERS] Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
On Fri, Jul 31, 2015 at 11:28 AM, Andres Freund and...@anarazel.de wrote: @@ -57,7 +57,8 @@ static relopt_bool boolRelOpts[] = If we go through this list, I'd rather make informed decisions about each reloption. Otherwise we're going to get patches for each of them separately over the next versions. Just dropping quickly a reply: I meant table relopts only, excluding the index stuff for now regarding the isolation tests. + AccessExclusiveLock + foreach(cell, defList) + { + DefElem *def = (DefElem *) lfirst(cell); + int i; + + for (i = 0; relOpts[i]; i++) + { + if (pg_strncasecmp(relOpts[i]-name, def-defname, relOpts[i]-namelen + 1) == 0) + { + if (lockmode relOpts[i]-lockmode) + lockmode = relOpts[i]-lockmode; + } + } + } + + return lockmode; +} We usually don't compare lock values that way, i.e. there's not guaranteed to be a strict monotonicity between lock levels. I don't really agree with that policy, but it's nonetheless there. Yeah, there are some in lock.c but that's rather localized. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
On Thu, Jul 30, 2015 at 10:46 PM, Michael Paquier michael.paqu...@gmail.com wrote: This patch size has increased from 16k to 157k because of the output of the isolation tests you just added. This is definitely too large and actually the test coverage is rather limited. Hence I think that they should be changed as follows: - Use only one table, the locks of tables a and b do not conflict, and that's what we want to look at here. - Check the locks of all the relation parameters, by that I mean as well fillfactor and user_catalog_table which still take AccessExclusiveLock on the relation - Use custom permutations. I doubt that there is much sense to test all the permutations in this context, and this will reduce the expected output size drastically. Ok. On further notice, I would recommend not to use the same string name for the session and the query identifiers. And I think that inserting only one tuple at initialization is just but fine. Here is an example of such a spec: setup { CREATE TABLE a (id int PRIMARY KEY); INSERT INTO a SELECT generate_series(1,100); } teardown { DROP TABLE a; } session s1 step b1 { BEGIN; } # TODO add one query per parameter step at11 { ALTER TABLE a SET (fillfactor=10); } step at12 { ALTER TABLE a SET (autovacuum_vacuum_scale_factor=0.001); } step c1 { COMMIT; } session s2 step b2 { BEGIN; } step wx1 { UPDATE a SET id = id + 1; } step c2 { COMMIT; } And by testing permutations like for example that it is possible to see which session is waiting for what. Input: permutation b1 b2 at11 wx1 c1 c2 Output where session 2 waits for lock taken after fillfactor update: step b1: BEGIN; step b2: BEGIN; step at11: ALTER TABLE a SET (fillfactor=10); step wx1: UPDATE a SET id = id + 1; waiting ... step c1: COMMIT; step wx1: ... completed step c2: COMMIT; Be careful as well to not include incorrect permutations in the expected output file, the isolation tester is smart enough to ping you about that. 
Changed the isolation tests according your suggestions. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml index 1c1c181..ad985cd 100644 --- a/doc/src/sgml/ref/alter_table.sgml +++ b/doc/src/sgml/ref/alter_table.sgml @@ -543,6 +543,10 @@ ALTER TABLE ALL IN TABLESPACE replaceable class=PARAMETERname/replaceable of commandALTER TABLE/ that forces a table rewrite. /para + para + Changing autovacuum storage parameters acquires a literalSHARE UPDATE EXCLUSIVE/literal lock. + /para + note para While commandCREATE TABLE/ allows literalOIDS/ to be specified diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c index 8176b6a..b8d2a92 100644 --- a/src/backend/access/common/reloptions.c +++ b/src/backend/access/common/reloptions.c @@ -57,7 +57,8 @@ static relopt_bool boolRelOpts[] = { autovacuum_enabled, Enables autovacuum in this relation, - RELOPT_KIND_HEAP | RELOPT_KIND_TOAST + RELOPT_KIND_HEAP | RELOPT_KIND_TOAST, + ShareUpdateExclusiveLock }, true }, @@ -65,7 +66,8 @@ static relopt_bool boolRelOpts[] = { user_catalog_table, Declare a table as an additional catalog table, e.g. 
for the purpose of logical replication, - RELOPT_KIND_HEAP + RELOPT_KIND_HEAP, + AccessExclusiveLock }, false }, @@ -73,7 +75,8 @@ static relopt_bool boolRelOpts[] = { fastupdate, Enables \fast update\ feature for this GIN index, - RELOPT_KIND_GIN + RELOPT_KIND_GIN, + AccessExclusiveLock }, true }, @@ -81,7 +84,8 @@ static relopt_bool boolRelOpts[] = { security_barrier, View acts as a row security barrier, - RELOPT_KIND_VIEW + RELOPT_KIND_VIEW, + AccessExclusiveLock }, false }, @@ -95,7 +99,8 @@ static relopt_int intRelOpts[] = { fillfactor, Packs table pages only to this percentage, - RELOPT_KIND_HEAP + RELOPT_KIND_HEAP, + AccessExclusiveLock }, HEAP_DEFAULT_FILLFACTOR, HEAP_MIN_FILLFACTOR, 100 }, @@ -103,7 +108,8 @@ static relopt_int intRelOpts[] = { fillfactor, Packs btree index pages only to this percentage, - RELOPT_KIND_BTREE + RELOPT_KIND_BTREE, + AccessExclusiveLock }, BTREE_DEFAULT_FILLFACTOR, BTREE_MIN_FILLFACTOR, 100 }, @@ -111,7 +117,8 @@ static relopt_int intRelOpts[] = { fillfactor, Packs hash index pages only to this percentage, - RELOPT_KIND_HASH + RELOPT_KIND_HASH, + AccessExclusiveLock }, HASH_DEFAULT_FILLFACTOR, HASH_MIN_FILLFACTOR, 100 }, @@ -119,7 +126,8 @@ static relopt_int intRelOpts[] = { fillfactor, Packs gist index pages only
Re: [HACKERS] Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
@@ -57,7 +57,8 @@ static relopt_bool boolRelOpts[] = If we go through this list, I'd rather make informed decisions about each reloption. Otherwise we're going to get patches for each of them separately over the next versions. @@ -73,7 +75,8 @@ static relopt_bool boolRelOpts[] = { fastupdate, Enables \fast update\ feature for this GIN index, - RELOPT_KIND_GIN + RELOPT_KIND_GIN, + AccessExclusiveLock }, true }, @@ -95,7 +99,8 @@ static relopt_int intRelOpts[] = { fillfactor, Packs table pages only to this percentage, - RELOPT_KIND_HEAP + RELOPT_KIND_HEAP, + AccessExclusiveLock }, HEAP_DEFAULT_FILLFACTOR, HEAP_MIN_FILLFACTOR, 100 [some other fillfactor settings] { { gin_pending_list_limit, Maximum size of the pending list for this GIN index, in kilobytes., - RELOPT_KIND_GIN + RELOPT_KIND_GIN, + AccessExclusiveLock }, -1, 64, MAX_KILOBYTES }, @@ -297,7 +325,8 @@ static relopt_string stringRelOpts[] = { buffering, Enables buffering build for this GiST index, - RELOPT_KIND_GIST + RELOPT_KIND_GIST, + AccessExclusiveLock }, 4, false, Why? These options just change things for the future and don't influence past decisions. It seems unproblematic to change them. 
@@ -259,7 +283,8 @@ static relopt_real realRelOpts[] = { seq_page_cost, Sets the planner's estimate of the cost of a sequentially fetched disk page., - RELOPT_KIND_TABLESPACE + RELOPT_KIND_TABLESPACE, + AccessExclusiveLock }, -1, 0.0, DBL_MAX }, @@ -267,7 +292,8 @@ static relopt_real realRelOpts[] = { random_page_cost, Sets the planner's estimate of the cost of a nonsequentially fetched disk page., - RELOPT_KIND_TABLESPACE + RELOPT_KIND_TABLESPACE, + AccessExclusiveLock }, -1, 0.0, DBL_MAX }, @@ -275,7 +301,8 @@ static relopt_real realRelOpts[] = { n_distinct, Sets the planner's estimate of the number of distinct values appearing in a column (excluding child relations)., - RELOPT_KIND_ATTRIBUTE + RELOPT_KIND_ATTRIBUTE, + AccessExclusiveLock }, 0, -1.0, DBL_MAX }, @@ -283,7 +310,8 @@ static relopt_real realRelOpts[] = { n_distinct_inherited, Sets the planner's estimate of the number of distinct values appearing in a column (including child relations)., - RELOPT_KIND_ATTRIBUTE + RELOPT_KIND_ATTRIBUTE, + AccessExclusiveLock }, 0, -1.0, DBL_MAX }, These probably are the settings that are most interesting to change without access exlusive locks. j = 0; for (i = 0; boolRelOpts[i].gen.name; i++) + { + Assert(DoLockModesConflict(boolRelOpts[i].gen.lockmode, + boolRelOpts[i].gen.lockmode)); j++; + } for (i = 0; intRelOpts[i].gen.name; i++) + { + Assert(DoLockModesConflict(intRelOpts[i].gen.lockmode, + intRelOpts[i].gen.lockmode)); j++; + } for (i = 0; realRelOpts[i].gen.name; i++) + { + Assert(DoLockModesConflict(realRelOpts[i].gen.lockmode, + realRelOpts[i].gen.lockmode)); j++; + } for (i = 0; stringRelOpts[i].gen.name; i++) + { + Assert(DoLockModesConflict(stringRelOpts[i].gen.lockmode, + stringRelOpts[i].gen.lockmode)); j++; + } j += num_custom_options; Doesn't really seem worth it to assert individually in each case here to me. 
+/* + * Determine the required LOCKMODE from an option list + */ +LOCKMODE +GetRelOptionsLockLevel(List *defList) +{ + LOCKMODElockmode = NoLock; + ListCell*cell;
Re: [HACKERS] LWLock deadlock and gdb advice
On 2015-07-30 09:03:01 -0700, Jeff Janes wrote: On Wed, Jul 29, 2015 at 6:10 AM, Andres Freund and...@anarazel.de wrote: What do you think about something roughly like the attached? I've not evaluated the code, but applying it does solve the problem I was seeing. Cool, thanks for testing! How long did you run the test and how long did it, on average, previously take for the issue to occur? Regards, Andres
[HACKERS] Updatable view?
Hi, Is anyone working on implementing, or interested in implementing, automatically updatable views that involve two or more tables (joins)? SQL:1999 allows it under certain conditions. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: [HACKERS] security labels on databases are bad for dump restore
On Wed, Jul 29, 2015 at 10:50:53AM -0400, Robert Haas wrote: On Wed, Jul 29, 2015 at 12:39 AM, Noah Misch n...@leadboat.com wrote: On Tue, Jul 28, 2015 at 03:36:13PM -0400, Robert Haas wrote: On Tue, Jul 28, 2015 at 3:33 PM, Andres Freund and...@anarazel.de wrote: Hm? Let me try again: If the admin does a ALTER DATABASE ... SET guc = ... *before* restoring a backup and the backup does contain a setting for the same guc, but with a different value it'll overwrite the previous explicit action by the DBA without any warning. If the backup does *not* contain that guc the previous action survives. That's confusing, because you're more likely to be in the 'the backup does not contain the guc' situation when testing where it thus will work. True. But I don't think modifying a database before restoring into it is terribly supported. Even pg_dump --clean, which is supposed to do this sort of thing, doesn't seem to work terribly reliably. We could try to fix this by having a command like ALTER DATABASE ... RESET ALL that we issue before restoring the settings, but I'm afraid that will take us into all sorts of unreasonable scenarios that are better just labeled as don't do that. Andres's example is a harbinger of the semantic morass ahead. Excepting database objects and the public schema object, pg_dump and pg_dumpall mutate only the objects they CREATE. They consistently restore object properties (owner, ACLs, security label, etc.) if and only if issuing a CREATE statement for the object. For example, restoring objects contained in a schema without restoring the schema itself changes none of those schema properties. pg_dump and pg_dumpall have mostly followed that rule for databases, too, but they depart from it for comment and security label. That was a mistake. We can't in general mutate an existing database to match, because we can't mutate the encoding, datcollate or datctype. Even discounting that problem, I value consistency with the rest of the dumpable object types. 
What we've proven so far (if Craig's comments are to be believed) is that the oft-recommended formula of pg_dumpall -g plus pg_dump of each database doesn't completely work. That's absolutely gotta be fixed. What exact formula did you have in mind? It must not be merely 1. pg_dumpall -g 2. pg_dump (without --create) per database which _never_ works: it emits no CREATE DATABASE statements. Perhaps this? 1. pg_dumpall -g 2. Issue a handwritten CREATE DATABASE statement per database with correct encoding, lc_ctype and lc_collate parameters. All other database properties can be wrong; the dump will fix them. 3. pg_dump (without --create) per database That neglects numerous database properties today, but we could make it work. Given the problems I described upthread, it's an inferior formula that I recommend against propping up. I much prefer making this work completely: 1. pg_dumpall -g 2. pg_dump --create per database Another formula I wouldn't mind offering: 1. pg_dumpall -g 2. pg_dumpall --empty-databases 3. pg_dump (without --create) per database Code for an --empty-databases option already exists for pg_dumpall -g --binary-upgrade. A patch turning that into a user-facing feature might be quite compact. I don't see much point given a complete pg_dump --create, but I wouldn't object. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Don'st start streaming after creating a slot in pg_receivexlog
On Wed, Jul 29, 2015 at 10:20 PM, Andres Freund and...@anarazel.de wrote: On 2015-07-29 22:17:27 +0900, Michael Paquier wrote: Here is a patch implementing those things. IMO if-not-exists does not make much sense anymore What? It's rather useful to be able to discern between 'slot was already there' and 'oops, some error occurred'. -1 OK, fine. To me the pg_recvlogical changes are pretty pointless. OK, you wrote it after all. I won't insist on it. So, perhaps the attached is more convincing then? It just changes --create-slot to leave immediately after creation to address the complaint of this thread. -- Michael
Re: [HACKERS] multivariate statistics / patch v7
On 05/25/2015 11:43 PM, Tomas Vondra wrote: There are 6 files attached, but only 0002-0006 are actually part of the multivariate statistics patch itself. All of these patches are huge. In order to review this in a reasonable amount of time, we need to do this in several steps. So let's see what would be the minimal set of these patches that could be reviewed and committed, while still being useful. The main patches are:

1. shared infrastructure and functional dependencies
2. clause reduction using functional dependencies
3. multivariate MCV lists
4. multivariate histograms
5. multi-statistics estimation

Would it make sense to commit only patches 1 and 2 first? Would that be enough to get a benefit from this? I have some doubts about the clause reduction and functional dependencies part of this. It seems to treat functional dependency as a boolean property, but even with the classic zipcode and city case, it's not always an all-or-nothing thing. At least in some countries, there can be zipcodes that span multiple cities. So zipcode=X does not completely imply city=Y, although there is a strong correlation (if that's the right term). How strong does the correlation need to be for this patch to decide that zipcode implies city? I couldn't actually see a clear threshold stated anywhere. So rather than treating functional dependence as a boolean, I think it would make more sense to put a 0.0-1.0 number to it. That means that you can't do clause reduction like it's done in this patch, where you actually remove clauses from the query for cost estimation purposes. Instead, you need to calculate the selectivity for each clause independently, but instead of just multiplying the selectivities together, apply the dependence factor to it. Does that make sense? I haven't really looked at the MCV, histogram and multi-statistics estimation patches yet. Do those patches make the clause reduction patch obsolete?
Should we forget about the clause reduction and functional dependency patch, and focus on those later patches instead? - Heikki
Re: [HACKERS] dblink: add polymorphic functions.
Joe Conway m...@joeconway.com writes: What about just TYPE then? SELECT x::TYPE(some_expression) FROM ... SELECT CAST (x AS TYPE(some_expression)) FROM ... Yeah, that would work. Quick-hack proof-of-concept patch attached. Some usage examples in the regression database:

regression=# select pg_typeof(43::type(q1)) from int8_tbl;
 pg_typeof
-----------
 bigint
 bigint
 bigint
 bigint
 bigint
(5 rows)

regression=# select pg_typeof(43::type(q1/0.0)) from int8_tbl;
 pg_typeof
-----------
 numeric
 numeric
 numeric
 numeric
 numeric
(5 rows)

regression=# select pg_typeof(43::type(f1)) from point_tbl;
ERROR:  cannot cast type integer to point
LINE 1: select pg_typeof(43::type(f1)) from point_tbl;
                             ^

The main limitation of this patch is that it won't work for call sites that pass pstate == NULL to LookupTypeName. There are a fair number of them, some of which wouldn't care because they could never invoke this notation anyway, but for others we'd need to do some work to cons up a suitable pstate. regards, tom lane

diff --git a/src/backend/parser/parse_type.c b/src/backend/parser/parse_type.c
index 6616639..d5d0f73 100644
*** a/src/backend/parser/parse_type.c
--- b/src/backend/parser/parse_type.c
***************
*** 19,25 ****
--- 19,27 ----
  #include "catalog/pg_type.h"
  #include "lib/stringinfo.h"
  #include "nodes/makefuncs.h"
+ #include "nodes/nodeFuncs.h"
  #include "parser/parser.h"
+ #include "parser/parse_expr.h"
  #include "parser/parse_type.h"
  #include "utils/array.h"
  #include "utils/builtins.h"
*************** static int32 typenameTypeMod(ParseState
*** 52,58 ****
   * found but is a shell, and there is typmod decoration, an error will be
   * thrown --- this is intentional.
   *
!  * pstate is only used for error location info, and may be NULL.
   */
  Type
  LookupTypeName(ParseState *pstate, const TypeName *typeName,
--- 54,61 ----
   * found but is a shell, and there is typmod decoration, an error will be
   * thrown --- this is intentional.
   *
!  * In most cases pstate is only used for error location info, and may be NULL.
!  * However, the TYPE(expression) syntax is not accepted when pstate is NULL.
   */
  Type
  LookupTypeName(ParseState *pstate, const TypeName *typeName,
*************** LookupTypeName(ParseState *pstate, const
*** 143,148 ****
--- 146,188 ----
  							format_type_be(typoid))));
  		}
  	}
+ 	else if (pstate != NULL &&
+ 			 list_length(typeName->typmods) == 1 &&
+ 			 list_length(typeName->names) == 1 &&
+ 			 strcmp(strVal(linitial(typeName->names)), "type") == 0)
+ 	{
+ 		/* TYPE(expression) notation */
+ 		Node	   *typexpr = (Node *) linitial(typeName->typmods);
+ 
+ 		/* XXX should invent a new EXPR_KIND for this, likely */
+ 		typexpr = transformExpr(pstate, typexpr, EXPR_KIND_SELECT_TARGET);
+ 
+ 		/* We needn't bother assigning collations to the expr */
+ 
+ 		/* We use the expression's type/typmod and then throw the expr away */
+ 		typoid = exprType(typexpr);
+ 
+ 		/* If an array reference, return the array type instead */
+ 		if (typeName->arrayBounds != NIL)
+ 			typoid = get_array_type(typoid);
+ 
+ 		if (!OidIsValid(typoid))
+ 		{
+ 			if (typmod_p)
+ 				*typmod_p = -1;
+ 			return NULL;
+ 		}
+ 
+ 		if (typmod_p)
+ 			*typmod_p = exprTypmod(typexpr);
+ 
+ 		/* Duplicative, but I'm too lazy to refactor this function right now */
+ 		tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typoid));
+ 		if (!HeapTupleIsValid(tup))		/* should not happen */
+ 			elog(ERROR, "cache lookup failed for type %u", typoid);
+ 
+ 		return (Type) tup;
+ 	}
  	else
  	{
  		/* Normal reference to a type name */
Re: [HACKERS] CREATE FUNCTION .. LEAKPROOF docs
On 07/29/2015 04:46 PM, Joe Conway wrote: On 06/14/2015 03:46 AM, Dean Rasheed wrote: I think the docs for the LEAKPROOF option in create_function.sgml ought to mention RLS as well as security barrier views. Also the current text is no longer strictly correct in light of commit dcbf5948e12aa60b4d6ab65b6445897dfc971e01. Suggested rewording attached. Any objections out there to this doc patch? It rings true and reads well to me. I'd like to check another item off the 9.5 Open Items list.. Pushed to HEAD and 9.5 Joe
Re: [HACKERS] LWLock deadlock and gdb advice
On Wed, Jul 29, 2015 at 6:10 AM, Andres Freund and...@anarazel.de wrote: On 2015-07-29 14:22:23 +0200, Andres Freund wrote: On 2015-07-29 15:14:23 +0300, Heikki Linnakangas wrote: Ah, ok, that should work, as long as you also re-check the variable's value after queueing. Want to write the patch, or should I? I'll try. Shouldn't be too hard. What do you think about something roughly like the attached? I've not evaluated the code, but applying it does solve the problem I was seeing. Cheers, Jeff
[HACKERS] TAP tests are badly named
We should describe test sets by what they test, not by how they test. TAP is a testing tool/protocol. The current set of tests we have test the programs in src/bin, and we should really name the test set by a name that reflects that, rather than the fact that we are using TAP tools to run the tests. What if we decide to test something else using TAP? Would we call that set of tests TAP tests too? --enable-tap-tests is a reasonable configuration setting, because it's about whether or not we have a TAP testing framework available, but I think we should stop calling the bin tests TAP tests and we should change the test name in vcregress.pl to a more appropriate name. In the buildfarm I'm calling the step bin-check: http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=crake&dt=2015-07-30%2012%3A25%3A58&stg=bin-check Thoughts? cheers andrew
Re: [HACKERS] LWLock deadlock and gdb advice
On 2015-07-30 17:36:52 +0300, Heikki Linnakangas wrote: In 9.4, LWLockAcquire holds the spinlock when it marks the lock as held, until it has updated the variable. And LWLockWaitForVar() holds the spinlock when it checks that the lock is held and that the variable's value matches. So it cannot happen on 9.4. The first paragraph talked about the same value, but that was just referring to it not yet having been cleared, I think... To reiterate, with 9.5, it's possible that a backend is sleeping in LWLockWaitForVar(oldvar=123), even though the lock is currently held by another backend with value 124. That seems wrong, or surprising at the very least. With my patch that can't really happen that way though? The value is re-checked after queuing. If it has changed by then we're done. And if it hasn't yet changed we're guaranteed to be woken up once it's being changed? I generally don't mind adding some sort of flag clearing or such, but I'd rather not have it in the retry loop in the general LWLockAttemptLock path - I found that very small amounts of instructions in there have a measurable impact. Greetings, Andres Freund
Re: [HACKERS] Volatility of pg_xact_commit_timestamp() and pg_last_committed_xact()
Alvaro Herrera wrote: Robert Haas wrote: On Thu, Jul 16, 2015 at 9:49 AM, Fujii Masao masao.fu...@gmail.com wrote: Volatilities of pg_xact_commit_timestamp() and pg_last_committed_xact() are now STABLE. But ISTM that those functions can return different results even within a single statement. So we should change the volatilities of them to VOLATILE? Sounds right to me. Will fix. Fix pushed. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] security labels on databases are bad for dump restore
On Thu, Jul 30, 2015 at 2:49 AM, Noah Misch n...@leadboat.com wrote: What exact formula did you have in mind? It must not be merely
1. pg_dumpall -g
2. pg_dump (without --create) per database
which _never_ works: it emits no CREATE DATABASE statements. Perhaps this?
1. pg_dumpall -g
2. Issue a handwritten CREATE DATABASE statement per database with correct encoding, lc_ctype and lc_collate parameters. All other database properties can be wrong; the dump will fix them.
3. pg_dump (without --create) per database
That neglects numerous database properties today, but we could make it work. Given the problems I described upthread, it's an inferior formula that I recommend against propping up. I much prefer making this work completely:
1. pg_dumpall -g
2. pg_dump --create per database
Gah, OK, I see your point. But we better document this, because if you need a PhD in PostgreSQL-ology to take a backup, we're not in a good place. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Using quicksort and a merge step to significantly improve on tuplesort's single run external sort
On Wed, Jul 29, 2015 at 9:05 PM, Peter Geoghegan p...@heroku.com wrote: The behavior of external sorts that do not require any merge step due to only having one run (what EXPLAIN ANALYZE output shows as an external sort, and not a merge sort) seems like an area that can be significantly improved upon. As noted in code comments, this optimization did not appear in The Art of Computer Programming, Volume III. It's not an unreasonable idea, but it doesn't work well on modern machines due to using heapsort, which is known to use the cache ineffectively. It also often implies significant additional I/O for little or no benefit. I suspect that all the reports we've heard of smaller work_mem sizes improving sort performance are actually down to this one-run optimization *hurting* performance. Very interesting. And great performance numbers. Thanks for taking the time to investigate this - really cool. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Reduce ProcArrayLock contention
On Wed, Jul 29, 2015 at 2:18 PM, Andres Freund and...@anarazel.de wrote: On 2015-07-29 12:54:59 -0400, Robert Haas wrote: I would try to avoid changing lwlock.c. It's pretty easy when so doing to create mechanisms that work now but make further upgrades to the general lwlock mechanism difficult. I'd like to avoid that. I'm massively doubtful that re-implementing parts of lwlock.c is the better outcome. Then you have two different infrastructures you need to improve over time. That is also true, but I don't think we're going to be duplicating anything from lwlock.c in this case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Using quicksort and a merge step to significantly improve on tuplesort's single run external sort
On 07/30/2015 04:05 AM, Peter Geoghegan wrote: Patch -- quicksort with spillover = With the attached patch, I propose to add an additional, better one run special case optimization. This new special case splits the single run into 2 subruns. One of the runs is comprised of whatever tuples were in memory when the caller finished passing tuples to tuplesort. To sort that, we use quicksort, which in general has various properties that make it much faster than heapsort -- it's a cache oblivious algorithm, which is important these days. The other subrun is whatever tuples were on-tape when tuplesort_performsort() was called. This will often be a minority of the total, but certainly never much more than half. This is already sorted when tuplesort_performsort() is reached. This spillover is already inserted at the front of the sorted-on-tape tuples, and so already has reasonably good cache characteristics. With the patch, we perform an on-the-fly merge that is somewhat similar to the existing (unaffected) merge sort TSS_FINALMERGE case, except that one of the runs is in memory, and is potentially much larger than the on-tape/disk run (but never much smaller), and is quicksorted. The existing merge sort case similarly is only used when the caller specified !randomAccess. Hmm. You don't really need to merge the in-memory array into the tape, as you know that all the tuples in the in-memory must come after the tuples already on the tape. You can just return all the tuples from the tape first, and then all the tuples from the array. So here's a shorter/different explanation of this optimization: When it's time to perform the sort, instead of draining the in-memory heap one tuple at a time to the last tape, you sort the heap with quicksort, and pretend that the sorted heap belongs to the last tape, after all the other tuples in the tape. Some questions/thoughts on that: Isn't that optimization applicable even when you have multiple runs? 
Quicksorting the heap and keeping it as an array in memory is surely always faster than heapsorting and pushing it to the tape. I think it'd make sense to structure the code differently, to match the way I described this optimization above. Instead of adding a new tuplesort state for this, abstract this in the logical tape code. Add a function to attach an in-memory tail to a tape, and have LogicalTapeRead() read from the tail after reading the on-disk tape. The rest of the code wouldn't need to care that sometimes part of the tape is actually in memory. It should be pretty easy to support randomAccess too. If you think of the in-memory heap as a tail of the last tape, you can easily move backwards from the in-memory heap back to the on-disk tape, too.

+		 * Note that there might actually be 2 runs, but only the
+		 * contents of one of them went to tape, and so we can
+		 * safely pretend that there is only 1 run (since we're
+		 * about to give up on the idea of the memtuples array being
+		 * a heap). This means that if our sort happened to require
+		 * random access, the similar single run optimization
+		 * below (which sets TSS_SORTEDONTAPE) might not be used at
+		 * all. This is because dumping all tuples out might have
+		 * forced an otherwise equivalent randomAccess case to
+		 * acknowledge a second run, which we can avoid.

Is that really true? We don't start a second run until we have to, i.e. when it's time to dump the first tuple of the second run to tape. So I don't think the case you describe above, where you have two runs but only one of them has tuples on disk, can actually happen.

Performance
===========
Impressive!

Predictability
==============
Even more impressive!

Future work
===========
As an extra optimization, you could delay quicksorting the in-memory array until it's time to read the first tuple from it. If the caller reads only the top-N tuples from the sort for some reason (other than LIMIT, which we already optimize for), that could avoid a lot of work. 
- Heikki
Re: [HACKERS] LWLock deadlock and gdb advice
On 2015-07-29 20:23:24 +0300, Heikki Linnakangas wrote: Backend A has called LWLockWaitForVar(X) on a lock, and is now waiting on it. The lock holder releases the lock, and wakes up A. But before A wakes up and sees that the lock is free, another backend acquires the lock again. It runs LWLockAcquireWithVar to the point just before setting the variable's value. Now A wakes up, sees that the lock is still (or again) held, and that the variable's value still matches the old one, and goes back to sleep. The new lock holder won't wake it up until it updates the value again, or releases the lock. I'm not sure whether this observation is about my patch or the general lwlock variable mechanism. In my opinion that behaviour exists today both in 9.4 and 9.5. But I think that's fine because that race seems pretty fundamental. After all, you could have called LWLockWaitForVar() just after the second locker had set the variable to the same value. You didn't like the new LW_FLAG_VAR_SET flag and the API changes I proposed? I do slightly dislike the additional bit of math in AttemptLock. But what I really disliked about the patch is the reliance on holding the spinlock over longer times and conditionally acquiring spinlocks. I didn't see a need for the flags change to fix the issue at hand, that's why I didn't incorporate it. I'm not fundamentally against it. Either way, there is a race condition that if the new lock holder sets the variable to the *same* value as before, the old waiter won't necessarily wake up even though the lock was free or had a different value in between. But that's easier to explain and understand than the fact that the value set by LWLockAcquireWithVar() might not be seen by a waiter, until you release the lock or update it again. I can't really see a use for the API that'd allow that and care about the waits. Because quite fundamentally you could just have started waiting after the variable was set to the value at the same time. 
Another idea is to have LWLockAcquire check the wait queue after setting the variable, and wake up anyone who's already queued up. Ugh, not a fan of that. The changes in LWLockWaitForVar() and LWLockConflictsWithVar() seem OK in principle. You'll want to change LWLockConflictsWithVar() so that the spinlock is held only over the value = *valptr line, though; the other stuff just modifies local variables and doesn't need to be protected by the spinlock. Good point. Also, when you enter LWLockWaitForVar(), you're checking if the lock is held twice in a row; first at the top of the function, and again inside LWLockConflictsWithVar. You could just remove the quick test at the top. Yea, I was thinking about removing it. The first check was there previously, so I left it in place. We do execute a bit more code once we've disabled interrupts (extraWaits, re-enabling interrupts). I don't think it'll matter much, but it seemed like an independent thing. Greetings, Andres Freund
Re: [HACKERS] Reduce ProcArrayLock contention
On 2015-07-29 12:54:59 -0400, Robert Haas wrote: I would try to avoid changing lwlock.c. It's pretty easy when so doing to create mechanisms that work now but make further upgrades to the general lwlock mechanism difficult. I'd like to avoid that. I'm massively doubtful that re-implementing parts of lwlock.c is the better outcome. Then you have two different infrastructures you need to improve over time.
Re: [HACKERS] security labels on databases are bad for dump restore
Noah Misch wrote: What exact formula did you have in mind? It must not be merely
1. pg_dumpall -g
2. pg_dump (without --create) per database
which _never_ works: it emits no CREATE DATABASE statements. Perhaps this?
1. pg_dumpall -g
2. Issue a handwritten CREATE DATABASE statement per database with correct encoding, lc_ctype and lc_collate parameters. All other database properties can be wrong; the dump will fix them.
3. pg_dump (without --create) per database
That neglects numerous database properties today, but we could make it work. Given the problems I described upthread, it's an inferior formula that I recommend against propping up. Agreed, and IMO it's embarrassing that it's so complicated to get a fully working backup. I much prefer making this work completely:
1. pg_dumpall -g
2. pg_dump --create per database
My full support for this proposal. Another formula I wouldn't mind offering:
1. pg_dumpall -g
2. pg_dumpall --empty-databases
3. pg_dump (without --create) per database
Code for an --empty-databases option already exists for pg_dumpall -g --binary-upgrade. A patch turning that into a user-facing feature might be quite compact. I don't mind if this one is also made to work, but I don't care about this case all that much. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Proposal: backend niceness / session_priority
José Luis Tallón jltal...@adv-solutions.net writes: Since PostgreSQL lacks the resource management capabilities of the Big Ones ( Resource Groups - Red, WorkLoad Manager - Blue ) or the Resource Governor in MS SQL Server, we can try and approximate the requested behaviour by reducing the CPU priority (nice) of the backend in question. Please note that we would be using scheduler priority to try and modulate I/O, though I'm aware of the limitations of this mechanism. This has been proposed before, and rejected before, and I'm not seeing anything particularly new here. Without a credible mechanism for throttling I/O, nice alone does not seem very promising. regards, tom lane
Re: [HACKERS] pgbench stats per script other stuff
v6 is just a rebase after a bug fix by Andres Freund. Also a small question: The patch currently displays pgbench scripts starting numbering at 0. Probably a little too geek... should start at 1? v7 is a rebase after another small bug fix in pgbench. -- Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml index 2517a3a..99670d4 100644 --- a/doc/src/sgml/ref/pgbench.sgml +++ b/doc/src/sgml/ref/pgbench.sgml @@ -261,6 +261,23 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ benchmarking arguments: variablelist + varlistentry + termoption-b/ replaceablescriptname[@weight]//term + termoption--builtin/ replaceablescriptname[@weight]//term + listitem + para +Add the specified builtin script to the list of executed scripts. +An optional integer weight after literal@/ allows to adjust the +probability of drawing the test. +Available builtin scripts are: literaltpcb-like/, +literalsimple-update/ and literalselect-only/. +The provided repleacablescriptname/ needs only to be a prefix +of the builtin name, hence literalsimp/ would be enough to select +literalsimple-update/. + /para + /listitem + /varlistentry + varlistentry termoption-c/option replaceableclients//term @@ -307,14 +324,15 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ /varlistentry varlistentry - termoption-f/option replaceablefilename//term - termoption--file=/optionreplaceablefilename//term + termoption-f/ replaceablefilename[@weight]//term + termoption--file=/replaceablefilename[@weight]//term listitem para -Read transaction script from replaceablefilename/. +Add a transaction script read from replaceablefilename/ to +the list of executed scripts. +An optional integer weight after literal@/ allows to adjust the +probability of drawing the test. See below for details. -option-N/option, option-S/option, and option-f/option -are mutually exclusive. 
/para /listitem /varlistentry @@ -404,10 +422,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--skip-some-updates/option/term listitem para -Do not update structnamepgbench_tellers/ and -structnamepgbench_branches/. -This will avoid update contention on these tables, but -it makes the test case even less like TPC-B. +Shorthand for option-b simple-update@1/. /para /listitem /varlistentry @@ -499,9 +514,9 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ Report the specified scale factor in applicationpgbench/'s output. With the built-in tests, this is not necessary; the correct scale factor will be detected by counting the number of -rows in the structnamepgbench_branches/ table. However, when testing -custom benchmarks (option-f/ option), the scale factor -will be reported as 1 unless this option is used. +rows in the structnamepgbench_branches/ table. +However, when testing only custom benchmarks (option-f/ option), +the scale factor will be reported as 1 unless this option is used. /para /listitem /varlistentry @@ -511,7 +526,7 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ termoption--select-only/option/term listitem para -Perform select-only transactions instead of TPC-B-like test. +Shorthand for option-b select-only@1/. /para /listitem /varlistentry @@ -567,6 +582,16 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ /varlistentry varlistentry + termoption--per-script-stats/option/term + listitem + para +Report some statistics per script run by pgbench. 
+ /para + /listitem + /varlistentry + + + varlistentry termoption--sampling-rate=replaceablerate//option/term listitem para @@ -661,7 +686,20 @@ pgbench optional replaceableoptions/ /optional replaceabledbname/ titleWhat is the quoteTransaction/ Actually Performed in pgbench?/title para - The default transaction script issues seven commands per transaction: + Pgbench executes test scripts chosen randomly from a specified list. + They include built-in scripts with option-b/ and + user-provided custom scripts with option-f/. + Each script may be given a relative weight specified after a + literal@/ so as to change its drawing probability. + The default weight is literal1/. + /para + + para + The default builtin transaction script (also invoked with option-b tpcb-like/) + issues seven commands per transaction over randomly chosen literalaid/, + literaltid/, literalbid/ and
Re: [HACKERS] multivariate statistics / patch v7
On 07/30/2015 03:55 PM, Tomas Vondra wrote: On 07/30/2015 10:21 AM, Heikki Linnakangas wrote: I have some doubts about the clause reduction and functional dependencies part of this. It seems to treat functional dependency as a boolean property, but even with the classic zipcode and city case, it's not always an all or nothing thing. At least in some countries, there can be zipcodes that span multiple cities. So zipcode=X does not completely imply city=Y, although there is a strong correlation (if that's the right term). How strong does the correlation need to be for this patch to decide that zipcode implies city? I couldn't actually see a clear threshold stated anywhere. So rather than treating functional dependence as a boolean, I think it would make more sense to put a 0.0-1.0 number to it. That means that you can't do clause reduction like it's done in this patch, where you actually remove clauses from the query for cost esimation purposes. Instead, you need to calculate the selectivity for each clause independently, but instead of just multiplying the selectivities together, apply the dependence factor to it. Does that make sense? I haven't really looked at the MCV, histogram and multi-statistics estimation patches yet. Do those patches make the clause reduction patch obsolete? Should we forget about the clause reduction and functional dependency patch, and focus on those later patches instead? Perhaps. It's true that most real-world data sets are not 100% valid with respect to functional dependencies - either because of natural imperfections (multiple cities with the same ZIP code) or just noise in the data (incorrect entries ...). And it's even mentioned in the code comments somewhere, I guess. But there are two main reasons why I chose not to extend the functional dependencies with the [0.0-1.0] value you propose. 
Firstly, functional dependencies were meant to be the simplest possible implementation, illustrating how the infrastructure is supposed to work (which is the main topic of the first patch). Secondly, all kinds of statistics are simplifications of the actual data. So I think it's not incorrect to ignore the exceptions up to some threshold. The problem with a threshold is that around that threshold, even a small change in the data set can drastically change the produced estimates. For example, imagine that we know from the stats that zip code implies city. But then someone adds a single row to the table with an odd zip code / city combination, which pushes the estimator over the threshold, and the columns are no longer considered dependent, and the estimates are now completely different. We should avoid steep cliffs like that. BTW, what is the threshold in the current patch? - Heikki
[HACKERS] Proposal: backend niceness / session_priority
Hackers, I have found myself needing to run some maintenance routines (VACUUM, REINDEX, REFRESH MATERIALIZED VIEW mostly) at a lower priority so as not to disturb concurrent *highly transactional* connections. This issue is also noted in the TODO[0] list in the Wiki.

* There was some discussion in 2007 [1] regarding Priorities for users or queries? http://archives.postgresql.org/pgsql-general/2007-02/msg00493.php

Since PostgreSQL lacks the resource management capabilities of the Big Ones (Resource Groups - Red, WorkLoad Manager - Blue) or the Resource Governor in MS SQL Server, we can try and approximate the requested behaviour by reducing the CPU priority (nice) of the backend in question. Please note that we would be using scheduler priority to try and modulate I/O, though I'm aware of the limitations of this mechanism. Using renice(1) from outside is not only cumbersome and error-prone but very much unusable for the use cases I am contemplating.

* Moreover, as seen in the Priorities wiki page [2], there exists an extension providing a set_backend_priority() function, to be called as set_backend_priority(pg_backend_pid(), 20). This approach is, sadly, not portable to non-POSIX operating systems (e.g. Windows), and IMO quite convoluted to use and tied to actual implementation details.

* I have been playing with some code which uses a GUC for this purpose, though only defining/supporting three different priorities would make sense for the final implementation IMO: NORMAL, LOW_PRIORITY, IDLE. Checked platform compatibility too: this behaviour can be implemented on Windows, too. For everything else, there's nice(2).

However, there is a relatively minor catch here which is the reason behind this e-mail: user interface - Inventing a new command seems overkill to me. 
Plus, I don't know what we could model it on --- given that the real solution for this problem would be a fully featured priority manager.

- I have been playing with a GUC that ignores being reset --- so as to comply with nice's specification when not running as a privileged user --- but I reckon that this behaviour might be surprising at best:

SET session_priority TO 'low';   -- Ok, low priority
VACUUM FREEZE my_test_table;
RESET session_priority;          -- Nope, still low prio. Emit notice?

The only way to reset the priority would be to reconnect, and this is my main pain point. However, this approach does fulfill my needs and ---it seems--- the OP's needs: being able to run a maintenance task at a low priority (i.e. disturbing other concurrent queries as little as possible). Expected use case: a cronjob running psql -c 'SET session_priority TO low; REINDEX blabla CONCURRENTLY; VACUUM foobar;' All suggestions welcome. I'll be wrapping up a more-or-less-done patch on Monday if somebody wants to take a look and criticize the actual code (I won't be working on this tomorrow), unless somebody points me at a better solution. Thanks, / J.L. [0] https://wiki.postgresql.org/wiki/Todo - Miscellaneous performance [1] http://archives.postgresql.org/pgsql-general/2007-02/msg00493.php [2] https://wiki.postgresql.org/wiki/Priorities [3] http://docs.oracle.com/cd/E11882_01/server.112/e25494/dbrm.htm [4] http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.1.0/com.ibm.db2.luw.doc/com.ibm.db2.luw.doc-gentopic6.html [5] https://msdn.microsoft.com/en-us/library/bb933866.aspx
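The proposed GUC boils down to mapping three priority levels onto nice increments. The sketch below is illustrative only: the level names follow the proposal, but the specific increments (10 and 19) are assumptions, not values from any patch. It also shows why RESET cannot work for unprivileged users: os.nice() (like setpriority(2)) only allows them to raise the nice value, never lower it back.

```python
import os

# Hypothetical mapping from the three proposed priorities to nice increments;
# the numbers are illustrative assumptions.
PRIORITY_NICE = {"normal": 0, "low_priority": 10, "idle": 19}

def nice_increment(session_priority):
    """Translate a session_priority setting into a nice increment."""
    try:
        return PRIORITY_NICE[session_priority.lower()]
    except KeyError:
        raise ValueError("invalid session_priority: %r" % session_priority)

def apply_session_priority(session_priority):
    # An unprivileged process can only increase its nice value, which is
    # exactly why a later RESET cannot restore the old priority without
    # reconnecting (a fresh backend process).
    increment = nice_increment(session_priority)
    if increment > 0:
        os.nice(increment)
```

The test exercises only the mapping, since actually renicing the test process would be a side effect.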
Re: [HACKERS] Division by zero in selfuncs.c:estimate_hash_bucketsize()
Piotr Stefaniak postg...@piotr-stefaniak.me writes: the two asserts below will fail with this query (run against the regression db): I've applied fixes for this and the other thing. Thanks for the report! regards, tom lane
Re: [HACKERS] [COMMITTERS] pgsql: Row-Level Security Policies (RLS)
On 07/30/2015 06:52 AM, Dean Rasheed wrote: On 30 July 2015 at 01:35, Joe Conway m...@joeconway.com wrote: On 06/01/2015 02:21 AM, Dean Rasheed wrote: While going through this, I spotted another issue --- in a DML query with additional non-target relations, such as UPDATE t1 .. FROM t2 .., the old code was checking the UPDATE policies of both t1 and t2, but really I think it ought to be checking the SELECT policies of t2 (in the same way as this query requires SELECT table permissions on t2, not UPDATE permissions). I've changed that and added new regression tests to test that change. I assume the entire refactoring patch needs a fair bit of work to rebase against current HEAD, Actually, there haven't been any conflicting changes so far, so a git rebase was able to automatically merge correctly -- new patch attached, with some minor comment rewording (not affecting the bug-fix part). Good to hear. Even so, I agree that it makes sense to apply the bug-fix separately, since it's not really anything to do with the refactoring. but I picked out the attached to address just the above issue. Does this look correct, and if so does it make sense to apply at least this part right now? Looks correct to me. 
Thanks -- committed and pushed to HEAD and 9.5 -- Joe Conway
Re: [HACKERS] dblink: add polymorphic functions.
Tom Lane wrote: Joe Conway m...@joeconway.com writes: What about just TYPE then? SELECT x::TYPE(some_expression) FROM ... SELECT CAST (x AS TYPE(some_expression)) FROM ... Yeah, that would work. Quick-hack proof-of-concept patch attached. I'm amazed that this works without hacking the grammar itself, but in retrospect that's expected.

+ else if (pstate != NULL &&
+          list_length(typeName->typmods) == 1 &&
+          list_length(typeName->names) == 1 &&
+          strcmp(strVal(linitial(typeName->names)), "type") == 0)
+ {
+     /* TYPE(expression) notation */
+     Node *typexpr = (Node *) linitial(typeName->typmods);

This is rather ugly, but I guess not untenable. I suppose we don't care about any actual typmod, do we? Will this be of any use with the PostGIS types and such, for which the typmod is not merely a size limit? Also INTERVAL has some funny typmod rules, not sure if that affects usage of this construct. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Planner debug views
On 7/29/15 2:40 PM, Alvaro Herrera wrote: Qingqing Zhou wrote: Can we simplify above with foreign table methods? There are two major concerns about this method per previous discussions: security and usability. I think the main cause is the sharing foreign table design. I think foreign data wrappers are great. I do not think that we should try to shape every problem to look like foreign data so that we can solve it with a foreign data wrapper. I am a bit nervous that this keeps being brought up. Agreed. I think a better option would be shoving it into a backend tuplestore and just leaving it there (maybe with a command to clear it for the paranoid). That gives a relation you can query against, insert into another table, etc. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] dblink: add polymorphic functions.
Alvaro Herrera alvhe...@2ndquadrant.com writes: Tom Lane wrote: Yeah, that would work. Quick-hack proof-of-concept patch attached. This is rather ugly, but I guess not untenable. I suppose we don't care about any actual typmod, do we? We get the type and typmod both from the expression. Example:

regression=# create table ref (f1 varchar, f2 varchar(3));
CREATE TABLE
regression=# insert into ref values('1','2');
INSERT 0 1
regression=# select '1234567890'::type(f1) from ref;
    type
------------
 1234567890
(1 row)

regression=# select '1234567890'::type(f2) from ref;
 type
------
 123
(1 row)

Will this be of any use with the PostGIS types and such, for which the typmod is not merely a size limit? Don't see why not. They still have to follow the rule that typmod is a static property of an expression. Also INTERVAL has some funny typmod rules, not sure if that affects usage of this construct. I would not think so. The weirdness about INTERVAL mainly has to do with SQL's, ahem, inventive collection of ways to write an interval constant, and that wouldn't really be an issue for any practical use of this construct AFAICS. Whatever you write as the expression, we're going to be able to reduce to a type OID and typmod. regards, tom lane
Re: [HACKERS] Failing assertions in indxpath.c, placeholder.c and brin_minmax.c
Tom Lane wrote: Alvaro Herrera alvhe...@2ndquadrant.com writes: Tom Lane wrote: Bottom line is that somebody failed to consider the possibility of a null comparison value reaching the BRIN index lookup machinery. The code stanza that's failing supposes that only IS NULL or IS NOT NULL tests could have SK_ISNULL set, but that's just wrong. I think the easiest way to solve this is to consider that all indexable operators are strict, and have the function return false in that case. The attached patch implements that. This looks fine to me as a localized fix. I was wondering whether we could short-circuit the index lookup further upstream, but I take it from your comment about _bt_preprocess_keys that BRIN has no convenient place for that today. (Even if it did, I'd still vote for making this change, for safety's sake.) Yeah, it doesn't currently. Hopefully we will improve that in the future. Particularly with regards to array keys I think there's a lot to be done there. I pushed this as is. I hesitated about adding a regression test, but it didn't seem worthwhile in the end, because if in the future we improve scan key analysis, we will need much more extensive testing, and this doesn't look like the type of bug that we could reintroduce on a whim. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
On Fri, Jul 31, 2015 at 8:30 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Fri, Jul 24, 2015 at 4:05 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Fri, Jul 24, 2015 at 7:11 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Thu, Jul 2, 2015 at 2:03 PM, Simon Riggs si...@2ndquadrant.com wrote: Looks functionally complete. Need a test to show that ALTER TABLE works on views, as discussed on this thread. And confirmation that pg_dump is not broken by this. Message-ID: 20140321205828.gb3969...@tornado.leadboat.com Added more test cases to cover ALTER TABLE on views. I'm thinking about the isolation tests; what about adding another 'alter-table' spec for isolation tests enabling and disabling 'autovacuum' options? Yes, please. Added. I really don't know if my isolation tests are completely correct; it's my first time writing this kind of test. This patch size has increased from 16k to 157k because of the output of the isolation tests you just added. This is definitely too large and actually the test coverage is rather limited. Hence I think that they should be changed as follows:

- Use only one table; the locks of tables a and b do not conflict, and that's what we want to look at here.
- Check the locks of all the relation parameters, by that I mean as well fillfactor and user_catalog_table which still take AccessExclusiveLock on the relation.
- Use custom permutations. I doubt that there is much sense to test all the permutations in this context, and this will reduce the expected output size drastically.

On further notice, I would recommend not to use the same string name for the session and the query identifiers. And I think that inserting only one tuple at initialization is just fine. 
Here is an example of such a spec:

setup
{
  CREATE TABLE a (id int PRIMARY KEY);
  INSERT INTO a SELECT generate_series(1,100);
}

teardown
{
  DROP TABLE a;
}

session "s1"
step "b1"   { BEGIN; }
# TODO add one query per parameter
step "at11" { ALTER TABLE a SET (fillfactor=10); }
step "at12" { ALTER TABLE a SET (autovacuum_vacuum_scale_factor=0.001); }
step "c1"   { COMMIT; }

session "s2"
step "b2"   { BEGIN; }
step "wx1"  { UPDATE a SET id = id + 1; }
step "c2"   { COMMIT; }

And by testing permutations like this, for example, it is possible to see which session is waiting for what. Input:

permutation "b1" "b2" "at11" "wx1" "c1" "c2"

Output where session 2 waits for the lock taken after the fillfactor update:

step b1: BEGIN;
step b2: BEGIN;
step at11: ALTER TABLE a SET (fillfactor=10);
step wx1: UPDATE a SET id = id + 1; <waiting ...>
step c1: COMMIT;
step wx1: <... completed>
step c2: COMMIT;

Be careful as well not to include incorrect permutations in the expected output file; the isolation tester is smart enough to ping you about that.

+GetRelOptionsLockLevel(List *defList)
+{
+    LOCKMODE    lockmode = NoLock;

Shouldn't this default to AccessExclusiveLock instead of NoLock? No, because it will break the logic below (get the highest lock level according to the defList), but I force it to return AccessExclusiveLock if defList == NIL. Yep, OK with this change. The rest of the patch looks good to me, so once the isolation test coverage is done I think that it could be put in the hands of a committer. Regards -- Michael
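The GetRelOptionsLockLevel logic under discussion can be sketched in a few lines: take the strongest lock required by any option in the list, and fall back to the strongest lock when the list is empty. This is an illustrative model, not the patch itself; the option-to-lock table and the numeric lock ordering are simplifying assumptions.

```python
# Illustrative lock levels, ordered weakest to strongest (a simplification
# of PostgreSQL's real lock modes).
NO_LOCK, SHARE_UPDATE_EXCLUSIVE, ACCESS_EXCLUSIVE = 0, 4, 8

# Hypothetical per-option lock requirements, mirroring the discussion:
# autovacuum options can use a weaker lock, while fillfactor and
# user_catalog_table still take AccessExclusiveLock.
OPTION_LOCK_LEVEL = {
    "autovacuum_enabled": SHARE_UPDATE_EXCLUSIVE,
    "autovacuum_vacuum_scale_factor": SHARE_UPDATE_EXCLUSIVE,
    "fillfactor": ACCESS_EXCLUSIVE,
    "user_catalog_table": ACCESS_EXCLUSIVE,
}

def get_reloptions_lock_level(def_list):
    # Empty list: force the strongest lock, matching the defList == NIL case.
    if not def_list:
        return ACCESS_EXCLUSIVE
    lockmode = NO_LOCK
    for option in def_list:
        # Unknown options conservatively require the strongest lock.
        lockmode = max(lockmode, OPTION_LOCK_LEVEL.get(option, ACCESS_EXCLUSIVE))
    return lockmode
```

Starting from NoLock and taking the maximum is why defaulting the variable itself to AccessExclusiveLock would break the computation.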
Re: [HACKERS] Remaining 'needs review' patchs in July commitfest
On Wed, Jul 29, 2015 at 5:08 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Tue, Jul 28, 2015 at 3:51 PM, Heikki Linnakangas hlinn...@iki.fi wrote: plpgsql raise statement with context Impasse. Everyone wants this feature in some form, but no consensus on whether to do this client-side or server-side. +1 for server-side. Does anyone other than you even think that the client side is a reasonable way to go? Yes. This is presupposing on the server side what the client will want to display. Fair enough. I'm still not convinced we're doing anything other than complicating what ought to be a simple matter. It is just a fact that logging tracing messages in PL/pgsql functions is a pain in the butt right now in some situations because you get a huge number of CONTEXT lines that you don't want. Can we agree on some solution to that problem without over-engineering this to infinity? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pg_dump quietly ignore missing tables - is it bug?
On 07/25/2015 07:08 PM, Pavel Stehule wrote: I am sending a new patch - without checking wildcard chars. The documentation says the option is called --strict-names, while the code has --strict-mode. I like --strict-names more, mode seems redundant, and it's not clear what it's strict about. For symmetry, it would be good to also support this option in pg_restore. It seems even more useful there. Can we do better than issuing a separate query for each table/schema name? The performance of this isn't very important, but still it seems like you could fairly easily refactor the code to avoid that. Perhaps return an extra constant for part of the UNION to distinguish which result row came from which pattern, and check that at least one row is returned for each. - Heikki
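Heikki's suggestion, in other words: run one UNION ALL query where each branch selects its pattern's index as a constant column, then check client-side that every index appears in the result. The sketch below models only that client-side check; the function name and the error type are illustrative, not from the patch.

```python
def check_strict_names(patterns, matched_rows):
    """Verify every name pattern matched at least one object.

    matched_rows holds (pattern_index, object_name) pairs, as would come back
    from a single UNION ALL query in which each branch tags its rows with the
    index of the pattern it was built from. Raises LookupError naming the
    patterns that matched nothing.
    """
    seen = {idx for idx, _name in matched_rows}
    missing = [p for i, p in enumerate(patterns) if i not in seen]
    if missing:
        raise LookupError("no matching tables were found for: "
                          + ", ".join(missing))
```

One round trip replaces a separate existence query per pattern, and the tag column is what makes it possible to tell which pattern an unmatched error should name.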
Re: [HACKERS] Using quicksort and a merge step to significantly improve on tuplesort's single run external sort
On 30 July 2015 at 08:00, Heikki Linnakangas hlinn...@iki.fi wrote: Hmm. You don't really need to merge the in-memory array into the tape, as you know that all the tuples in the in-memory array must come after the tuples already on the tape. You can just return all the tuples from the tape first, and then all the tuples from the array. Agreed. This is a good optimization for the common case where tuples are mostly already in order. We could increase the usefulness of this by making UPDATE pick blocks that are close to a tuple's original block, rather than putting them near the end of a relation. So here's a shorter/different explanation of this optimization: When it's time to perform the sort, instead of draining the in-memory heap one tuple at a time to the last tape, you sort the heap with quicksort, and pretend that the sorted heap belongs to the last tape, after all the other tuples in the tape. Some questions/thoughts on that: Isn't that optimization applicable even when you have multiple runs? Quicksorting the heap and keeping it as an array in memory is surely always faster than heapsorting and pushing it to the tape. It's about use of memory. If you have multiple runs on tape, then they will need to be merged and you need memory to do that efficiently. If there are tuples in the last batch still in memory then it can work, but it depends upon how full memory is from the last batch and how many batches there are. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Autonomous Transaction is back
On 28 July 2015 03:21, Josh Berkus wrote: On 07/27/2015 02:47 AM, Rajeev rastogi wrote: Why have any fixed maximum? Since we are planning to have nested autonomous transactions, it is required to have a limit on this so that resources can be controlled. Is there a particular reason why this limit wouldn't just be max_stack_depth? We will need to allocate some initial resources in order to handle all nested autonomous transactions. So I think it is better to have a separate configuration parameter. Thanks and Regards, Kumar Rajeev Rastogi
Re: [HACKERS] Configurable location for extension .control files
On 07/02/2015 11:37 PM, Oskari Saarenmaa wrote: I'm somewhat interested in both #1 and #2 for other projects, but I wrote this patch to address #3, i.e. to simplify the test setup we have in place for pgmemcache (https://github.com/ohmu/pgmemcache/blob/master/localtests.sh) and other extensions. I'd like to be able to set up a local PG cluster in /tmp or some other location and load the extensions I just built in there. Now that's a laudable goal. It indeed would be nice to be able to do make check to test an extension, using pgxs. The way make check within the PostgreSQL source tree works is that it performs make install to install PostgreSQL to a temporary location, and installs the extension to that. We can't use make install to create a new PostgreSQL installation in an extension, but perhaps you could have a substitute of that that copies an existing installation to a temporary location. I hacked that pgmemcache localtests.sh script to do just that, see attached. That's usable for pgmemcache as it is, but for a general facility, you'd need to put that logic into pgxs instead, and it should also take care of running initdb and starting and stopping the cluster. But pg_regress can already do those things, so that should be easy. So, I think that's the direction this should be going. In summary, the goal is to make make check work for extensions, and the way to do that is to teach pgxs to create a temporary installation. 
- Heikki diff --git a/.travis.yml b/.travis.yml index b30af7e..9197690 100644 --- a/.travis.yml +++ b/.travis.yml @@ -11,7 +11,7 @@ script: - PATH=/usr/lib/postgresql/$PGVER/bin/:$PATH ./localtests.sh after_failure: - - cat regressiondata/regression.diffs + - cat regression.diffs env: - PGVER=9.1 diff --git a/localtests.sh b/localtests.sh index d43e15f..956e300 100755 --- a/localtests.sh +++ b/localtests.sh @@ -2,17 +2,55 @@ # Create a clean PostgreSQL cluster for our testing -TESTDIR=$(pwd)/regressiondata +TESTDIR=$(pwd)/tmp_check export PGPORT=$((10240 + RANDOM / 2)) -export PGDATA=$TESTDIR/pg +export PGDATA=$TESTDIR/data rm -rf $TESTDIR + +# Create a minimal PostgreSQL installation, by copying an existing +# installation. This is a replacement for doing make install in the +# PostgreSQL source tree, for when you don't have the source tree available. +# +# The minimum we need to copy from the existing installation are server 'lib' +# and 'share' directories, and a few binaries. +# +# Note that 'pg_config --libdir' is the path to client-side libraries, i.e. +# libpq, and we don't want to copy that. (on many distributions, libdir points +# to just /usr/lib, and we certainly don't want to copy that in whole). +# Also note that we cannot use symlinks for the binaries, because the binaries +# look at the absolute path they are run from to find the rest of the files +# they need. 
+ +TMP_INSTALL=$TESTDIR/install + +bindir=`pg_config --bindir` +pkglibdir=`pg_config --pkglibdir` +sharedir=`pg_config --sharedir` + +mkdir -p $TMP_INSTALL +mkdir -p $TMP_INSTALL$bindir +mkdir -p $TMP_INSTALL$pkglibdir +mkdir -p $TMP_INSTALL$sharedir + +cp -a $bindir/postgres $TMP_INSTALL$bindir +cp -a $bindir/initdb $TMP_INSTALL$bindir +cp -a $bindir/pg_ctl $TMP_INSTALL$bindir +cp -a $pkglibdir/* $TMP_INSTALL$pkglibdir +cp -a $sharedir/* $TMP_INSTALL$sharedir + +export PATH=$TMP_INSTALL$bindir:$PATH + +# Install pgmemcache to the temporary installation +make install DESTDIR=$TMP_INSTALL + +# Set up a temporary cluster mkdir -p $PGDATA initdb -E UTF-8 --no-locale +# XXX: Should use pg_config --config-auth to lock down the cluster sed -e s%^#port =.*%port = $PGPORT% \ -e s%^#\(unix_socket_director[a-z]*\) =.*%\1 = '$PGDATA'% \ --e s%^#dynamic_library_path = .*%dynamic_library_path = '$(pwd):\$libdir'% \ -e s%^#fsync = .*%fsync = off% \ -e s%^#synchronous_commit = .*%synchronous_commit = off% \ -i $PGDATA/postgresql.conf @@ -20,15 +58,5 @@ pg_ctl -l $PGDATA/log start while [ ! -S $PGDATA/.s.PGSQL.$PGPORT ]; do sleep 2; done trap pg_ctl stop -m immediate EXIT -# It's not possible to override the extension path, so we'll just execute -# the extension SQL directly after mangling it a bit with sed - -cp -a Makefile test.sql sql/ expected/ $TESTDIR -sed -e s%MODULE_PATHNAME%pgmemcache% \ --e /CREATE EXTENSION/d -e /^--/d -e /^$/d \ -ext/pgmemcache.sql $TESTDIR/sql/init.sql -cp $TESTDIR/sql/init.sql $TESTDIR/expected/init.out - -# Run the actual tests - -make -C $TESTDIR installcheck REGRESS_OPTS=--host=$PGDATA --port=$PGPORT +# Run the actual tests in the temporary cluster. +make installcheck REGRESS_OPTS=--host=$PGDATA --port=$PGPORT -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
On 2015/07/07 19:15, Etsuro Fujita wrote: On 2015/07/06 9:42, Kouhei Kaigai wrote: However, we need to pay attention on advantages towards the alternatives. Hooks on add_paths_to_joinrel() enables to implement identical stuff, with less complicated logic to reproduce left / right relations from RelOptInfo of the joinrel. (Note that RelOptInfo-fdw_private enables to avoid path- construction multiple times.) I'm uncertain why this API change is necessary to fix up the problem around EvalPlanQual. Yeah, maybe we wouldn't need any API change. I think we would be able to fix this by complicating add_path as you pointed out upthread. I'm not sure that complicating it is a good idea, though. I think that it might be possible that the callback in standard_join_search would allow us to fix this without complicating the core path-cost-comparison stuff such as add_path. I noticed that what I proposed upthread doesn't work properly, though. To resolve this issue, I tried to make the core create an alternative plan that will be used in an EvalPlanQual recheck, instead of a foreign scan that performs a foreign join remotely (ie, scanrelid = 0). But I changed that idea. Instead, I'd like to propose that it's the FDW's responsibility to provide such a plan. Specifically, I'd propose that (1) we add a new Path field, say subpath, to the ForeignPath data structure and that (2) when generating a ForeignPath node for a foreign join, an FDW must provide the subpath Path node by itself. As before, it'd be recommended to use ForeignPath * create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel, double rows, Cost startup_cost, Cost total_cost, List *pathkeys, Relids required_outer, Path *subpath, List *fdw_private) where subpath is the subpath Path node that has the pathkeys and the required_outer rels. (subpath is NULL if scanning a base relation.) 
Also, it'd be recommended that an FDW generate such ForeignPath nodes by considering, for each of the paths in the rel's pathlist, whether to push down that path (to generate a ForeignPath node for a foreign join that has the same pathkeys and parameterization as that path). So, if generating the ForeignPath node, that path could be used as the subpath Path node. (I think the current postgres_fdw patch only considers an unsorted, unparameterized path for performing a foreign join remotely, but I think we should also consider presorted and/or parameterized paths.) I think this idea would apply to the API location that you proposed. However, ISTM that this idea would work better for the API call from standard_join_search because the rel's pathlist at that point has more paths worthy of consideration, in view of not only costs and sizes but pathkeys and parameterization. I think the subplan created from the subpath Path node could be used in an EvalPlanQual recheck, instead of a foreign scan that performs a foreign join remotely, as discussed previously. Comments welcome! Best regards, Etsuro Fujita
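The shape of the proposal can be modeled abstractly: each pushed-down foreign-join path carries a local subpath with the same pathkeys, kept around as the recheck plan. The classes, field names, and the flat cost discount below are illustrative assumptions, not the actual planner data structures.

```python
from dataclasses import dataclass, field

@dataclass
class Path:
    pathkeys: tuple          # sort order this path delivers
    total_cost: float

@dataclass
class ForeignPath(Path):
    subpath: Path = None     # local join plan retained for EPQ rechecks
    fdw_private: dict = field(default_factory=dict)

def foreign_join_paths(local_join_paths, remote_cost_factor=0.5):
    # For each local join path in the rel's pathlist, consider a pushed-down
    # foreign join delivering the same pathkeys, and keep the local path as
    # the subpath to fall back on during an EvalPlanQual recheck. The cost
    # factor is an arbitrary stand-in for real remote-cost estimation.
    return [ForeignPath(pathkeys=p.pathkeys,
                        total_cost=p.total_cost * remote_cost_factor,
                        subpath=p)
            for p in local_join_paths]
```

Building one candidate per local path is what lets presorted and parameterized variants be pushed down too, rather than only the unsorted, unparameterized join.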
Re: [HACKERS] We need to support ForeignRecheck for late row locking, don't we?
On 2015/07/27 18:16, Kouhei Kaigai wrote: On 2015/07/24 23:51, Kouhei Kaigai wrote: On 2015/07/22 19:10, Etsuro Fujita wrote: While working on the issue Foreign join pushdown vs EvalPlanQual, I happened to notice odd behaviors of late row locking in FDWs. I think the reason for that is that we don't check pushed-down quals inside an EPQ testing even if what was fetched by RefetchForeignRow was an updated version of the tuple rather than the same version previously obtained. So, to fix this, I'd like to propose that pushed-down quals be checked in ForeignRecheck. * I've modified ForeignRecheck so as to check pushed-down quals whether doing late locking or early locking. Isn't it an option to put a new callback in ForeignRecheck? The FDW driver knows that its private data structure includes the expression node that was pushed down to the remote side. So, it seems to me the best way is to consult the FDW driver on whether the supplied tuple should be visible according to the pushed-down qualifier. More or less, this fix needs a new interface contract around the EvalPlanQual logic. It is better to give the FDW driver more flexibility in its private data structure and in the way it processes the recheck logic, rather than a special-purpose variable. If an FDW driver managed the pushed-down expression in its own format, the pushedDownQual requirement would make it hold the qualifier redundantly. The callback approach does not have that kind of concern. That might be an idea, but is there any performance disadvantage, as discussed in [1]? It looks like that needs to perform another remote query to see if the supplied tuple satisfies the pushed-down quals during EPQ testing. I expect the callback of ForeignRecheck to run ExecQual() over the qualifier expression pushed down but saved in the private data of ForeignScanState. It does not need to issue another remote query (unless the FDW driver is designed to do so), so the performance disadvantage is none or quite limited. The advantages are not clear to me. 
I think the callback approach would be a good idea if FDWs were able to do the recheck more efficiently in their own ways than the core, for example. Best regards, Etsuro Fujita
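The recheck being argued for reduces to a purely local evaluation: take the quals that were shipped to the remote server, and re-apply them to the refetched row without another round trip. In this hedged sketch, plain Python predicates stand in for the saved expression trees that the backend would run through ExecQual().

```python
def foreign_recheck(pushed_down_quals, refetched_tuple):
    # Each qual stands in for a pushed-down expression tree saved in the
    # ForeignScanState's private data. Evaluating them locally against the
    # refetched tuple avoids issuing a second remote query during EPQ testing.
    return all(qual(refetched_tuple) for qual in pushed_down_quals)
```

If the refetched row is a newer version that no longer satisfies the shipped quals, the recheck returns False and the EPQ machinery discards it, which is exactly the behavior the unpatched code misses.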
Re: [HACKERS] Using quicksort and a merge step to significantly improve on tuplesort's single run external sort
On 07/30/2015 01:47 PM, Simon Riggs wrote: On 30 July 2015 at 08:00, Heikki Linnakangas hlinn...@iki.fi wrote: So here's a shorter/different explanation of this optimization: When it's time to perform the sort, instead of draining the in-memory heap one tuple at a time to the last tape, you sort the heap with quicksort, and pretend that the sorted heap belongs to the last tape, after all the other tuples in the tape. Some questions/thoughts on that: Isn't that optimization applicable even when you have multiple runs? Quicksorting the heap and keeping it as an array in memory is surely always faster than heapsorting and pushing it to the tape. It's about use of memory. If you have multiple runs on tape, then they will need to be merged and you need memory to do that efficiently. If there are tuples in the last batch still in memory then it can work, but it depends upon how full memory is from the last batch and how many batches there are. True, you need a heap to hold the next tuple from each tape in the merge step. To avoid exceeding work_mem, you'd need to push some tuples from the in-memory array to the tape to make room for that. In practice, though, the memory needed for the merge step's heap is tiny. Even if you merge 1000 tapes, you only need memory for 1000 tuples in the heap. But yeah, you'll need some logic to share/divide the in-memory array between the heap and the in-memory tail of the last tape. - Heikki
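The structure being discussed can be sketched in miniature: quicksort the in-memory tail and present it to the merge as if it were one more run on tape. This is an illustrative sketch, not tuplesort code; Python lists stand in for tapes, list.sort() for the quicksort, and heapq.merge hides the small one-entry-per-run heap that the merge step maintains.

```python
import heapq

def final_merge(tape_runs, in_memory_tail):
    # tape_runs: sorted runs already spilled to tape.
    # in_memory_tail: the remainder of the final run still held in work_mem;
    # sort it and treat it as one more run, instead of heapsorting it out to
    # the last tape and reading it back.
    in_memory_tail.sort()
    runs = list(tape_runs) + [in_memory_tail]
    # heapq.merge keeps a tiny heap holding the next tuple from each run,
    # mirroring the merge step's per-tape heap Heikki describes.
    return list(heapq.merge(*runs))
```

In the single-run case the tape list is empty and this degenerates into a plain quicksort of the in-memory array, which is the original optimization.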
[HACKERS] Re: Using quicksort and a merge step to significantly improve on tuplesort's single run external sort
On Thu, Jul 30, 2015 at 12:09 PM, Heikki Linnakangas hlinn...@iki.fi wrote: True, you need a heap to hold the next tuple from each tape in the merge step. To avoid exceeding work_mem, you'd need to push some tuples from the in-memory array to the tape to make room for that. In practice, though, the memory needed for the merge step's heap is tiny. Even if you merge 1000 tapes, you only need memory for 1000 tuples in the heap. But yeah, you'll need some logic to share/divide the in-memory array between the heap and the in-memory tail of the last tape. It's a bit worse than that because we buffer up a significant chunk of the tape to avoid randomly seeking between tapes after every tuple read. But I think in today's era of large memory we don't need anywhere near the entire work_mem just to buffer to avoid random access. Something simple like a fixed buffer size per tape, probably much less than 1MB/tape, would do. I'm a bit confused where the big win comes from though. Is what's going on that the external sort only exceeded memory by a small amount, so nearly all the tuples are still in memory? -- greg
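The fixed per-tape buffer Greg describes can be sketched as follows. This is a toy model, not PostgreSQL's logtape code: an iterator stands in for a tape, the tiny buffer_size stands in for the fixed per-tape buffer, and each refill models one seek-and-read against the tape.

```python
class BufferedTape:
    """Read tuples from a tape in fixed-size chunks, so the merge does not
    seek to a different tape after every single tuple it consumes."""

    def __init__(self, tape, buffer_size=4):
        self._tape = iter(tape)
        self._buffer = []
        self._buffer_size = buffer_size
        self.refills = 0   # each refill models one random seek on the tape

    def next_tuple(self):
        if not self._buffer:
            # Pull up to buffer_size tuples from the tape in one go.
            self._buffer = [t for _, t in
                            zip(range(self._buffer_size), self._tape)]
            if not self._buffer:
                return None   # run exhausted
            self.refills += 1
        return self._buffer.pop(0)
```

With a buffer of B tuples, a run of N tuples costs roughly N/B seeks instead of N, which is why a modest fixed buffer per tape, rather than a large share of work_mem, is enough to keep the merge mostly sequential.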