Re: [HACKERS] avoiding tuple copying in btree index builds

2014-06-01 Thread Amit Kapila
On Tue, May 6, 2014 at 12:04 AM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, May 5, 2014 at 2:13 PM, Andres Freund and...@2ndquadrant.com
wrote:
  On 2014-05-05 13:52:39 -0400, Robert Haas wrote:
  Today, I discovered that when building a btree index, the btree code
  uses index_form_tuple() to create an index tuple from the heap tuple,
  calls tuplesort_putindextuple() to copy that tuple into the sort's
  memory context, and then frees the original one it built.  This seemed
  inefficient, so I wrote a patch to eliminate the tuple copying.  It
  works by adding a function tuplesort_putindextuplevalues(), which
  builds the tuple in the sort's memory context and thus avoids the need
  for a separate copy.  I'm not sure if that's the best approach, but
  the optimization seems worthwhile.
 
  Hm. It looks like we could quite easily just get rid of
  tuplesort_putindextuple(). The hash usage doesn't look hard to convert.

 I glanced at that, but it wasn't obvious to me how to convert the hash
 usage.  If you have an idea, I'm all ears.
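
For reference, what Robert describes amounts to replacing something like the
first form below with the second; the tuplesort_putindextuplevalues()
signature shown here is an assumption based on the description above, not the
posted patch:

/* today: form the tuple, copy it into the sort, then free the original */
itup = index_form_tuple(RelationGetDescr(index), values, isnull);
itup->t_tid = htup->t_self;
tuplesort_putindextuple(buildstate->spool->sortstate, itup);
pfree(itup);

/* proposed: build the tuple directly in the sort's memory context */
tuplesort_putindextuplevalues(buildstate->spool->sortstate, index,
                              &htup->t_self, values, isnull);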

I also think it's possible to have a similar optimization for the hash index
in case it has to spool the tuple for sorting.

In hashbuildCallback(), when buildstate->spool is true, we can avoid
forming the index tuple; to check for nulls before calling _h_spool(),
we can traverse the isnull array. Converting the hash index usage is not
as straightforward as for the btree index, but it doesn't look too complex
either.
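
For illustration only, here is a rough sketch of the shape such a change could
take; the _h_spool_values() helper and the exact callback details are
assumptions, not an actual patch:

/* Hypothetical sketch: spool datums/nulls directly instead of building
 * an IndexTuple first and copying it into the sort. */
static void
hashbuildCallback(Relation index, HeapTuple htup, Datum *values,
                  bool *isnull, bool tupleIsAlive, void *state)
{
    HashBuildState *buildstate = (HashBuildState *) state;

    /* hash indexes index a single column; skip NULL keys */
    if (isnull[0])
        return;

    if (buildstate->spool)
    {
        /* assumed helper that forms the tuple inside the sort's memory
         * context, analogous to tuplesort_putindextuplevalues() for btree */
        _h_spool_values(buildstate->spool, &htup->t_self, values, isnull);
    }
    else
    {
        IndexTuple  itup = index_form_tuple(RelationGetDescr(index),
                                            values, isnull);

        itup->t_tid = htup->t_self;
        _hash_doinsert(index, itup);
        pfree(itup);
    }

    buildstate->indtuples += 1;
}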


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] pg_sleep() doesn't work well with recovery conflict interrupts.

2014-06-01 Thread Andres Freund
On 2014-05-30 10:30:42 +0530, Amit Kapila wrote:
 On Wed, May 28, 2014 at 8:53 PM, Andres Freund and...@2ndquadrant.com
 wrote:
  Hi,
 
  Since a64ca63e59c11d8fe6db24eee3d82b61db7c2c83 pg_sleep() uses
  WaitLatch() to wait. That's fine in itself. But
  procsignal_sigusr1_handler, which is used e.g. when resolving recovery
  conflicts, doesn't unconditionally do a SetLatch().
  That means that we'll currently not be able to cancel conflicting
  backends during recovery for 10min. Now, I don't think that'll happen
  too often in practice, but it's still annoying.
 
 How will such a situation occur, aren't we using pg_usleep during
 RecoveryConflict functions
 (ex. in ResolveRecoveryConflictWithVirtualXIDs)?

I am not sure what you mean. pg_sleep() is the SQL callable function, a
different thing to pg_usleep(). The latter isn't interruptible on all
platforms, but the sleep times should be short enough for that not to
matter.
I am pretty sure by now that the sane fix for this is to add a
SetLatch() call to RecoveryConflictInterrupt(). All the signal handlers
that deal with query cancelation et al. do so, so it seems right that
RecoveryConflictInterrupt() does so as well.
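
Concretely, the fix would be roughly the following one-liner (a sketch, not
the actual change):

/* In RecoveryConflictInterrupt(), after the conflict/cancel flags have been
 * set: poke the process latch so that a WaitLatch()-based sleep such as
 * pg_sleep() returns promptly instead of sleeping out its full timeout. */
SetLatch(&MyProc->procLatch);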

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pivot aggregation with a patched intarray

2014-06-01 Thread Michael Paquier
On Sat, May 31, 2014 at 12:31 AM, Marc Mamin m.ma...@intershop.de wrote:
 I have patched intarray with 3 additional functions in order to
 count [distinct] event IDs into arrays, where the array position
 corresponds to the integer values (mimicking column-oriented storage).

I didn't look at the feature itself, but here are some comments about
the format of the patch:
- Be careful: the newlines in the file you posted use \r\n, which is
purely Windows stuff... This will generate unnecessary diffs with the
source code.
- Here are some guidelines about the code convention:
http://www.postgresql.org/docs/devel/static/source.html
- And here are a couple of rules to respect when submitting a patch:
https://wiki.postgresql.org/wiki/Submitting_a_Patch
Following those rules will make patch review as well as the life of
reviewers easier.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pivot aggregation with a patched intarray

2014-06-01 Thread Marc Mamin

On Sat, May 31, 2014 at 12:31 AM, Marc Mamin m.ma...@intershop.de wrote:
 I have patched intarray with 3 additional functions in order to
 count [distinct] event IDs into arrays, where the array position
 corresponds to the integer values (mimicking column-oriented storage).

I didn't look at the feature itself, but here are some comments about
the format of the patch:
- Be careful: the newlines in the file you posted use \r\n, which is
purely Windows stuff... This will generate unnecessary diffs with the
source code.

oops, will fix

- Here are some guidelines about the code convention:
http://www.postgresql.org/docs/devel/static/source.html
- And here are a couple of rules to respect when submitting a patch:
https://wiki.postgresql.org/wiki/Submitting_a_Patch
Following those rules will make patch review as well as the life of
reviewers easier.

thanks for your comments
I don't mean to suggest this directly as a patch;
I'm first interested to see whether there is any interest
in such an aggregation type.

regards,

Marc Mamin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] teaching Var about NOT NULL

2014-06-01 Thread David Rowley
Hi,

While hacking away at implementing join removal support for ANTI-JOINs I
realised that I couldn't just replace the join with a WHERE false
condition... Let me explain...

With a query such as:

SELECT * FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = b.id);

Where a.b_id has a foreign key on b(id)

I'm working on a join removal patch which will turn this into:

SELECT * FROM a WHERE b_id IS NULL;

This seemed like a bit of a shame since with my test tables b_id is defined
NOT NULL, but there seemed to be no way to tell if I needed to add a WHERE
FALSE or an IS NULL check to the WHERE clause.

I quickly put together the attached patch which adds a knownnotnull bool
field to Var which we can set to true when we're completely sure that the
Var cannot contain any NULL values. I'm populating this from
pg_attribute.attnotnull where I can and setting it to false where I can't
see another way to tell for sure that nulls cannot exist.

The only use of knownnotnull that I've added to the patch is to make a
query such as:

SELECT * FROM a WHERE b_id IS NULL;

skip scanning the table entirely, since b_id is defined as NOT NULL:

postgres=# alter table a alter column b_id drop not null;
ALTER TABLE
postgres=# explain select id from a where b_id is null;
                     QUERY PLAN
----------------------------------------------------
 Seq Scan on a  (cost=0.00..31.40 rows=11 width=4)
   Filter: (b_id IS NULL)
 Planning time: 0.340 ms
(3 rows)


postgres=# alter table a alter column b_id set not null;
ALTER TABLE
postgres=# explain select id from a where b_id is null;
                      QUERY PLAN
-----------------------------------------------------
 Result  (cost=0.00..31.40 rows=1 width=4)
   One-Time Filter: false
   ->  Seq Scan on a  (cost=0.00..31.40 rows=1 width=4)
 Planning time: 0.402 ms
(4 rows)

Having this extra flag could likely help optimise NOT IN(SELECT notnullcol
FROM table) to allow it to become an ANTI-JOIN. It will also help
optimise join removals a little more.

The patch is just a few minutes old and there are no regression tests yet.
I'd rather have some feedback before I proceed with it.

Regards

David Rowley


var_not_null_v0.1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changeset Extraction v7.6.1

2014-06-01 Thread Euler Taveira
On 01-06-2014 02:57, Andres Freund wrote:
 On 2014-06-01 00:50:58 -0500, Jim Nasby wrote:
 On 5/31/14, 9:11 AM, Andres Freund wrote:
 On 2014-02-21 15:14:15 -0600, Jim Nasby wrote:
 On 2/17/14, 7:31 PM, Robert Haas wrote:
 But do you really want to keep that snapshot around long enough to
 copy the entire database?  I bet you don't: if the database is big,
 holding back xmin for long enough to copy the whole thing isn't likely
 to be fun.

 I can confirm that this would be epic fail, at least for londiste. It 
 takes about 3 weeks for a new copy of a ~2TB database. There's no way 
 that'd work with one snapshot. (Granted, copy performance in londiste is 
 rather lackluster, but still...)

 I'd marked this email as todo:
 If you have such a huge database you can, with logical decoding at
 least, use a basebackup using pg_basebackup or pg_start/stop_backup()
  and roll forwards from that... That'll hopefully make such huge copies
 much faster.
 
 Just keep in mind that one of the use cases for logical replication is 
 upgrades.
 
 Should still be fine. Make a physical copy; pg_upgrade; catchup via
 logical rep.
 
Keep in mind that it is not an option if you want to copy *part* of the
database(s) (unless you have space available and want to do the cleanup
after the upgrade). In the near future, a (new) tool could (a) copy the
schema, (b) accumulate modifications while copying data, (c) copy whole
tables and (d) apply the modifications for the selected table(s)/schema(s).
Such a tool could even be an alternative to pg_upgrade.


-- 
   Euler Taveira   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consulting, Development, 24x7 Support and Training


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] teaching Var about NOT NULL

2014-06-01 Thread Tom Lane
David Rowley dgrowle...@gmail.com writes:
 I quickly put together the attached patch which adds a knownnotnull bool
 field to Var which we can set to true when we're completely sure that the
 Var cannot contain any NULL values.

This is utterly the wrong way to go about it.  How will you update views
containing such Vars, when someone does an ALTER TABLE SET/DROP NOT NULL?

The right thing is for the planner to look *at plan time* to see if the
column is marked NOT NULL.  (This is safe against the ALTER TABLE problem
because an ALTER will result in a relcache invalidation signal, forcing
any plans referencing the table to be regenerated before next use.)

One way to go about it would be to add a bool array to RelOptInfo and
teach plancat.c to populate the array.  However, that would only be a win
if we could expect the information to get used (preferably multiple times)
in most planning attempts.  That does not seem to me to be likely for this
use case, so I'd go with just doing a pg_attribute catcache lookup
on-the-fly when necessary.  I'd suggest
(1) new lsyscache.c utility function taking a relation OID and an attnum;
(2) new function somewhere in the planner that decides whether an
expression is known not-null.  For a Var, it'd fetch the matching RTE,
see if it's RTE_RELATION, and if so call the lsyscache.c function.
There are a lot of other potential cases that such a function could be
taught about later, if it proves useful.

BTW, you'd need to be pretty careful about semantics here.  Even if
the Var is known not-null at the point of scanning the relation,
what if that relation is nullable by some upper outer join?  Perhaps
the function (2) would need to take an argument describing the
join level at which we're making the test.
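
As a very rough sketch of (1) and (2), with the function names and placement
being assumptions only (and the outer-join caveat merely noted, not solved):

/* (1) lsyscache.c-style helper: is the column marked NOT NULL? */
bool
get_attnotnull(Oid relid, AttrNumber attnum)
{
    HeapTuple   tp;
    bool        result = false;

    tp = SearchSysCache2(ATTNUM,
                         ObjectIdGetDatum(relid),
                         Int16GetDatum(attnum));
    if (HeapTupleIsValid(tp))
    {
        Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);

        result = att_tup->attnotnull;
        ReleaseSysCache(tp);
    }
    return result;
}

/* (2) planner-side check; this version deliberately ignores nulling by
 * upper outer joins, which a real implementation would have to handle. */
static bool
expr_is_known_notnull(PlannerInfo *root, Node *node)
{
    if (IsA(node, Var))
    {
        Var        *var = (Var *) node;
        RangeTblEntry *rte = planner_rt_fetch(var->varno, root);

        if (rte->rtekind == RTE_RELATION)
            return get_attnotnull(rte->relid, var->varattno);
    }
    return false;       /* other node types could be taught later */
}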

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_sleep() doesn't work well with recovery conflict interrupts.

2014-06-01 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I am pretty sure by now that the sane fix for this is to add a
 SetLatch() call to RecoveryConflictInterrupt(). All the signal handlers
 that deal with query cancelation et al. do so, so it seems right that
 RecoveryConflictInterrupt() does so as well.

+1

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changeset Extraction v7.6.1

2014-06-01 Thread Jim Nasby

On 6/1/14, 10:49 AM, Euler Taveira wrote:

On 01-06-2014 02:57, Andres Freund wrote:

On 2014-06-01 00:50:58 -0500, Jim Nasby wrote:

On 5/31/14, 9:11 AM, Andres Freund wrote:

On 2014-02-21 15:14:15 -0600, Jim Nasby wrote:

On 2/17/14, 7:31 PM, Robert Haas wrote:

But do you really want to keep that snapshot around long enough to
copy the entire database?  I bet you don't: if the database is big,
holding back xmin for long enough to copy the whole thing isn't likely
to be fun.


I can confirm that this would be epic fail, at least for londiste. It takes 
about 3 weeks for a new copy of a ~2TB database. There's no way that'd work 
with one snapshot. (Granted, copy performance in londiste is rather lackluster, 
but still...)


I'd marked this email as todo:
If you have such a huge database you can, with logical decoding at
least, use a basebackup using pg_basebackup or pg_start/stop_backup()
and roll forwards from that... That'll hopefully make such huge copies
much faster.



Just keep in mind that one of the use cases for logical replication is upgrades.


Should still be fine. Make a physical copy; pg_upgrade; catchup via
logical rep.


Keep in mind that it is not an option if you want to copy *part* of the
database(s) (unless you have space available and want to do the cleanup
after the upgrade). In the near future, a (new) tool could (a) copy the
schema, (b) accumulate modifications while copying data, (c) copy whole
tables and (d) apply the modifications for the selected table(s)/schema(s).
Such a tool could even be an alternative to pg_upgrade.


There are also things that pg_upgrade doesn't handle, so it's not always an
option.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSoC] Clustering in MADlib - status update

2014-06-01 Thread Maxence Ahlouche
Hi all!

I've pushed my report for this week on my repo [0]. Here is a copy!
Attached is the patch containing my work for this week.
Week 2 - 2014/01/01

This week, I have worked on the beginning of the kmedoids module.
Unfortunately, I was supposed to have something working by last Wednesday,
and it is still not ready, mostly because I lost time this week being sick
and packing all my stuff in preparation for relocation.

The good news now: this week is my last school (exam) week, and that means
full-time GSoC starting next Monday! Also, I've studied the kmeans module
quite thoroughly, and I can finally understand how it all works, with the
exception of one bit: the enormous SQL query used to update the
IterationController.

For kmedoids, I've abandoned the idea of writing the loop myself and have
decided instead to stick to copying kmeans as much as possible, as it seems
easier than doing it all from scratch. The only part that remains to be
adapted is that big SQL query I haven't totally understood yet. I've asked
Atri for help, but surely the help of an experienced MADlib hacker would
speed things up :) Atri and I would also like to deal with this through a
VoIP meeting, to ease communication. If anyone wants to join, you're
welcome!

As for the technology we'll use, I have a Mumble server running somewhere,
if that suits everyone. Otherwise, suggest something!

I am available from Monday 2 at 3 p.m. (UTC) to Wednesday 4 at 10 a.m.
(exam weeks are quite light).

This week, I have also faced the first design decisions. For kmedoids, the
centroids are points of the dataset. So, if I wanted to identify them
precisely, I'd need to use their ids, but that would mean having a prototype
different from the kmeans one. For now, I've decided to use the points'
coordinates only, hoping I will not run into trouble; if I ever do,
switching to ids shouldn't be too hard. Also, if the user wants to input
initial medoids, they can input whatever points they want, whether part of
the dataset or not. After the first iteration, the centroids will be points
of the dataset anyway (maybe I could just select the points nearest to the
coordinates they input as initial centroids).

Second, I'll need to refactor the code in kmeans and kmedoids, as these two
modules are very similar. There are several options for this:

   1. One big clustering module containing everything clustering-related
   (ugly but easy option);
   2. A clustering module and kmeans, kmedoids, optics, utils
   submodules (the best imo, but I'm not sure it's doable);
   3. A clustering_utils module at the same level as the others (less
   ugly than the first one, but easy too).

Any opinions?

Next week, I'll get a working kmedoids module, do some refactoring, and
then add the extra methods, similar to what's done in kmeans, for the
different seedings. Once that's done, I'll make it compatible with all
three ports (I'm currently producing Postgres-only code, as it's the
easiest for me to test), and write the tests and doc. The deadline for this
last step is in two weeks; I don't know yet if I'll be on time by then or
not. It will depend on how fast I can get kmedoids working, and how fast
I'll go once I'm full time GSoC.

Finally, don't hesitate to tell me if you think my decisions are wrong; I'm
glad to learn :)
[0] http://git.viod.eu/viod/gsoc_2014/blob/master/reports.rst

-- 
Maxence Ahlouche
06 06 66 97 00
diff --git a/src/config/Modules.yml b/src/config/Modules.yml
index bf48d82..8f3431f 100644
--- a/src/config/Modules.yml
+++ b/src/config/Modules.yml
@@ -20,6 +20,7 @@ modules:
   depends: ['svec_util']
 - name: kmeans
   depends: ['array_ops', 'svec_util', 'sample']
+- name: kmedoids
 - name: lda
   depends: ['array_ops']
 - name: linalg
diff --git a/src/ports/postgres/modules/kmedoids/__init__.py_in b/src/ports/postgres/modules/kmedoids/__init__.py_in
new file mode 100644
index 000..e69de29
diff --git a/src/ports/postgres/modules/kmedoids/kmedoids.py_in b/src/ports/postgres/modules/kmedoids/kmedoids.py_in
new file mode 100644
index 000..e6e6167
--- /dev/null
+++ b/src/ports/postgres/modules/kmedoids/kmedoids.py_in
@@ -0,0 +1,38 @@
+import plpy
+
+from utilities.validate_args import table_exists
+from utilities.validate_args import table_is_empty
+
+# ----------------------------------------------------------------------
+
+
+# TODO: refactor (directly copied from kmeans module)
+def kmedoids_validate_src(schema_madlib, rel_source, **kwargs):
+    if rel_source is None or rel_source.strip().lower() in ('null', ''):
+        plpy.error("kmeans error: Invalid data table name!")
+    if not table_exists(rel_source):
+        plpy.error("kmeans error: Data table does not exist!")
+    if table_is_empty(rel_source):
+        plpy.error("kmeans error: Data table is empty!")
+
+# ----------------------------------------------------------------------
+
+
+def compute_kmedoids(schema_madlib, rel_args, 

plpython_unicode test (was Re: [HACKERS] buildfarm / handling (undefined) locales)

2014-06-01 Thread Tom Lane
Tomas Vondra t...@fuzzy.cz writes:
 On 13.5.2014 20:58, Tom Lane wrote:
 Tomas Vondra t...@fuzzy.cz writes:
 Yeah, not really what we were shooting for. I've fixed this by
 defining the missing locales, and indeed - magpie now fails in
 plpython tests.

 I saw that earlier today (tho right now the buildfarm server seems
 to not be responding :-().  Probably we should use some
 more-widely-used character code in that specific test?

 Any idea what other character could be used in those tests? ISTM fixing
 this universally would mean using ASCII characters - the subset of UTF-8
 common to all the encodings. But I'm afraid that'd contradict the very
 purpose of those tests ...

We really ought to resolve this issue so that we can get rid of some of
the red in the buildfarm.  ISTM there are three possible approaches:

1. Decide that we're not going to support running the plpython regression
tests under weird server encodings, in which case Tomas should just
remove cs_CZ.WIN-1250 from the set of encodings his buildfarm animals
test.  Don't much care for this, but it has the attraction of being
minimal work.

2. Change the plpython_unicode test to use some ASCII character in place
of \u0080.  We could keep on using the \u syntax to create the character,
but as stated above, this still seems like it's losing a significant
amount of test coverage.

3. Try to select some more portable non-ASCII character, perhaps U+00A0
(non breaking space) or U+00E1 (a-acute).  I think this would probably
work for most encodings but it might still fail in the Far East.  Another
objection is that the expected/plpython_unicode.out file would contain
that character in UTF8 form.  In principle that would work, since the test
sets client_encoding = utf8 explicitly, but I'm worried about accidental
corruption of the expected file by text editors, file transfers, etc.
(The current usage of U+0080 doesn't suffer from this risk because psql
special-cases printing of multibyte UTF8 control characters, so that we
get exactly \u0080.)

Thoughts?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: plpython_unicode test (was Re: [HACKERS] buildfarm / handling (undefined) locales)

2014-06-01 Thread Tom Lane
I wrote:
 3. Try to select some more portable non-ASCII character, perhaps U+00A0
 (non breaking space) or U+00E1 (a-acute).  I think this would probably
 work for most encodings but it might still fail in the Far East.  Another
 objection is that the expected/plpython_unicode.out file would contain
 that character in UTF8 form.  In principle that would work, since the test
 sets client_encoding = utf8 explicitly, but I'm worried about accidental
 corruption of the expected file by text editors, file transfers, etc.
 (The current usage of U+0080 doesn't suffer from this risk because psql
 special-cases printing of multibyte UTF8 control characters, so that we
 get exactly \u0080.)

I did a little bit of experimentation and determined that none of the
LATIN1 characters are significantly more portable than what we've got:
for instance a-acute fails to convert into 16 of the 33 supported
server-side encodings (versus 17 failures for U+0080).  However,
non-breaking space is significantly better: it converts into all our
supported server encodings except EUC_CN, EUC_JP, EUC_KR, EUC_TW.
It seems likely that we won't do better than that except with a basic
ASCII character.

In principle we could make the test pass even in these encodings
by adding variant expected files, but I doubt it's worth it.  I'd
be inclined to just add a comment to the regression test file indicating
that that's a known failure case, and move on.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: plpython_unicode test (was Re: [HACKERS] buildfarm / handling (undefined) locales)

2014-06-01 Thread Andrew Dunstan


On 06/01/2014 05:35 PM, Tom Lane wrote:

I wrote:

3. Try to select some more portable non-ASCII character, perhaps U+00A0
(non breaking space) or U+00E1 (a-acute).  I think this would probably
work for most encodings but it might still fail in the Far East.  Another
objection is that the expected/plpython_unicode.out file would contain
that character in UTF8 form.  In principle that would work, since the test
sets client_encoding = utf8 explicitly, but I'm worried about accidental
corruption of the expected file by text editors, file transfers, etc.
(The current usage of U+0080 doesn't suffer from this risk because psql
special-cases printing of multibyte UTF8 control characters, so that we
get exactly \u0080.)

I did a little bit of experimentation and determined that none of the
LATIN1 characters are significantly more portable than what we've got:
for instance a-acute fails to convert into 16 of the 33 supported
server-side encodings (versus 17 failures for U+0080).  However,
non-breaking space is significantly better: it converts into all our
supported server encodings except EUC_CN, EUC_JP, EUC_KR, EUC_TW.
It seems likely that we won't do better than that except with a basic
ASCII character.



Yeah, I just looked at the copyright symbol, with similar results.

Let's just stick to ASCII.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Documenting the Frontend/Backend Protocol update criteria

2014-06-01 Thread Mikko Tiihonen
Hi,

Currently the criteria for updating the F/B protocol are undefined. This makes
it hard to update the protocol going forward. It also makes it hard to write
library/driver/application implementations that will be more future-proof
against future server versions.

Ideally the documentation for 9.4 (backport?) would say what kinds of things
are allowed to change within the v3 protocol, and thus imply what kinds of
changes need a new v4 protocol. Is there already a wishlist page of items to
do differently for v4?

I did find the following text in the documentation: "binary representations
for complex data types might change across server versions". But having more
specific rules would help, especially since that text seems to be there just
to scare: so far changes have been strongly discouraged.

An example to consider: some binary formats have a flags (arrays) or version
(jsonb) field. We should explicitly say that clients must reject any unknown
bits/versions that they do not know about. This guarantees they detect small
format updates instead of silently accepting them and possibly returning
corrupt data.
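
For instance, a client receiving jsonb in binary format (which, as noted
below, is currently just a one-byte version '1' followed by text json) might
guard itself with a check like this sketch:

#include <stdio.h>

/* Sketch: validate the version byte of a jsonb binary value before using
 * it; reject any version we do not understand rather than silently
 * returning possibly corrupt data. */
static int
check_jsonb_binary(const unsigned char *data, int datalen)
{
    if (datalen < 1 || data[0] != 1)
    {
        fprintf(stderr, "unsupported jsonb binary format version\n");
        return -1;
    }
    /* data + 1 .. data + datalen - 1 is the UTF-8 json text */
    return 0;
}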

My motivation:

Two years ago I accidentally opened a discussion on how to do updates to the
binary encoding of data in the protocol [1]. I would like to reopen the
discussion now, since the jsonb 'binary' encoding is just a version '1' + text
json. The result of the last discussion was that having a version or flags as
part of the binary format is not enough, since drivers/libraries (fixable) and
applications (unfixable) depend on the current encoding. And if we add a new
bit to the flags or bump the version number, we will break backward
compatibility.

To summarize the previous discussion:
- there are currently no written rules for modifying the binary encoding formats
- the bytea modification was done with a GUC, but a GUC was seen as a bad
solution in general
- my proposal to add a minor format version number was not good either, since
any per-session state would be problematic for connection poolers

[1]: 
http://grokbase.com/t/postgresql/pgsql-hackers/11bwhv1esa/add-minor-version-to-v3-protocol-to-allow-changes-without-breaking-backwards-compatibility


Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch

2014-06-01 Thread Heikki Linnakangas

On 06/01/2014 03:44 AM, johnlumby wrote:

If you look at the new patch, you'll see that for the different-pid case
I still call aio_suspend with a timeout. As you or Claudio pointed out
earlier, it could just as well sleep for the same timeout, but the small
advantage of calling aio_suspend is that it catches the case where the io
completed just between aio_error returning EINPROGRESS and the aio_suspend
call. Also it makes the code simpler. In fact this change is quite small,
just a few lines in backend/storage/buffer/buf_async.c and
backend/storage/file/fd.c.

Based on this, I think it is not necessary to get rid of the polling
altogether (and in any case, as far as I can see, it would be very difficult).
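
(For context, the polling pattern being described is roughly the sketch
below; whether aio_error()/aio_suspend() may legitimately be called from a
process other than the one that issued the aio_read() is exactly the point
in dispute.)

#include <aio.h>
#include <errno.h>

/* Illustrative sketch only: wait for completion of an aio_read() by polling
 * aio_error() and napping in aio_suspend() with a timeout.  POSIX does not
 * promise this works from a process other than the submitter. */
static void
wait_for_aio(struct aiocb *aiocbp)
{
    const struct aiocb *list[1] = { aiocbp };
    struct timespec     timeout = { 0, 10 * 1000 * 1000 };  /* 10 ms */

    while (aio_error(aiocbp) == EINPROGRESS)
        (void) aio_suspend(list, 1, &timeout);  /* wakes on completion or timeout */
}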


That's still just as wrong as it always has been. Just get rid of it. 
Don't put aiocb structs in shared memory at all. They don't belong there.



Well, as mentioned earlier, it is not broken. Whether it is efficient I am
not sure. I have looked at the mutex in aio_suspend that you mentioned, and
I am not quite convinced that, if the caller is not the original aio_read
process, it renders the suspend() into an instant timeout. I will see if I
can verify that.


I don't see the point of pursuing this design further. Surely we don't want
to use polling here, and you're relying on undefined behavior anyway. I'm
pretty sure aio_return/aio_error won't work from a different process on all
platforms, even if it happens to work on Linux. Even on Linux, it might stop
working if the underlying implementation changes from the glibc pthread
emulation to something kernel-based.


Good point. I have included the guts of your little test program
(modified to do polling) into the existing autoconf test program that
decides on the #define USE_AIO_ATOMIC_BUILTIN_COMP_SWAP; see
config/c-library.m4. I hope this goes some way toward answering your
concern about robustness, as at least now, if the implementation changes
in some way that renders the polling ineffective, it will be caught at
configure time.


No, that does not make it robust enough.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb access operators inefficiency

2014-06-01 Thread Andrew Dunstan


On 05/30/2014 01:41 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

If we're going to construct varlena objects inside a StringInfo, maybe
we need a proper API for it. Isn't there a danger that data member of
the StringInfo won't be properly aligned to allow us to do this? In any
case, we should get most of the benefit of your patch without this
optimization.

As noted, the data buffer is maxaligned; but nonetheless I agree that
this is a serious stylistic abuse, and it's not buying much compared
to the rest of the patch.  I'd stick with the cstring_to_text_with_len
coding.

At the other end of the process, why are we using PG_GETARG_TEXT_P
and not PG_GETARG_TEXT_PP to avoid a detoast cycle on short-header
inputs?  The function body is using VARDATA_ANY/VARSIZE_ANY_EXHDR,
so it's already prepared for unaligned input.
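
As a side note, the pattern being recommended looks roughly like this sketch
(a made-up example function, not from the patch): PG_GETARG_TEXT_PP() leaves
short-header inputs packed, and the _ANY macros then read them without a
detoast copy.

#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(example_text_octet_len);

/* Hypothetical example function using the packed-varlena macros. */
Datum
example_text_octet_len(PG_FUNCTION_ARGS)
{
    text   *t = PG_GETARG_TEXT_PP(0);   /* no detoast cycle for short headers */

    /* VARSIZE_ANY_EXHDR copes with both packed and 4-byte headers */
    PG_RETURN_INT32(VARSIZE_ANY_EXHDR(t));
}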





Committed with these changes.

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_sleep() doesn't work well with recovery conflict interrupts.

2014-06-01 Thread Amit Kapila
On Sun, Jun 1, 2014 at 1:05 PM, Andres Freund and...@2ndquadrant.com
wrote:
 On 2014-05-30 10:30:42 +0530, Amit Kapila wrote:
  On Wed, May 28, 2014 at 8:53 PM, Andres Freund and...@2ndquadrant.com
   Since a64ca63e59c11d8fe6db24eee3d82b61db7c2c83 pg_sleep() uses
   WaitLatch() to wait. That's fine in itself. But
   procsignal_sigusr1_handler, which is used e.g. when resolving recovery
   conflicts, doesn't unconditionally do a SetLatch().
   That means that we'll currently not be able to cancel
conflicting
   backends during recovery for 10min. Now, I don't think that'll happen
   too often in practice, but it's still annoying.
 
  How will such a situation occur, aren't we using pg_usleep during
  RecoveryConflict functions
  (ex. in ResolveRecoveryConflictWithVirtualXIDs)?

 I am not sure what you mean. pg_sleep() is the SQL callable function, a
 different thing to pg_usleep().

I was not clear on how such a situation could occur, but now, looking at
it a bit more carefully, I think I understand: any backend calling
pg_sleep() during recovery conflict resolution can face this situation.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com