Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Hannu Krosing
On 02/05/2014 06:48 PM, Tom Lane wrote:
 Andrew Dunstan and...@dunslane.net writes:
 On 02/05/2014 11:40 AM, Tom Lane wrote:
 switching to binary is the same as text may well be the most prudent
 path here.
 If we do that we're going to have to live with that forever, aren't we? 
 Yeah, but the other side of that coin is that we'll have to live forever
 with whatever binary format we pick, too.  If it turns out to be badly
 designed, that could be much worse than eating some parsing costs during
 dump/restore.
The fastest and lowest parsing cost format for JSON is tnetstrings
http://tnetstrings.org/ why not use it as the binary wire format ?

It would be as binary as it gets and still be generally parse-able by
lots of different platforms, at leas by all of these  we care about.

Cheers
Hannu


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Andres Freund
Hi,

On 2014-02-06 18:47:31 -0500, Andrew Dunstan wrote:
  * switching to using text representation in jsonb send/recv

 +/*
 + * jsonb type recv function
 + *
 + * the type is sent as text in binary mode, so this is almost the same
 + * as the input function.
 + */
 +Datum
 +jsonb_recv(PG_FUNCTION_ARGS)
 +{
 + StringInfo  buf = (StringInfo) PG_GETARG_POINTER(0);
 + text   *result = cstring_to_text_with_len(buf-data, buf-len);
 +
 + return deserialize_json_text(result);
 +}

 +/*
 + * jsonb type send function
 + *
 + * Just send jsonb as a string of text
 + */
 +Datum
 +jsonb_send(PG_FUNCTION_ARGS)
 +{
 + Jsonb  *jb = PG_GETARG_JSONB(0);
 + StringInfoData buf;
 + char   *out;
 +
 + out = JsonbToCString(NULL, (JB_ISEMPTY(jb)) ? NULL : VARDATA(jb), 
 VARSIZE(jb));
 +
 + pq_begintypsend(buf);
 + pq_sendtext(buf, out, strlen(out));
 + PG_RETURN_BYTEA_P(pq_endtypsend(buf));
 +}

I'd suggest making the format discernible from possible different future
formats, to allow introducing a proper binary at some later time. Maybe
just send a int8 first, containing the format.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Hannu Krosing
On 02/10/2014 11:05 AM, Andres Freund wrote:
 Hi,

 On 2014-02-06 18:47:31 -0500, Andrew Dunstan wrote:
  * switching to using text representation in jsonb send/recv
 +/*
 + * jsonb type recv function
 + *
 + * the type is sent as text in binary mode, so this is almost the same
 + * as the input function.
 + */
 +Datum
 +jsonb_recv(PG_FUNCTION_ARGS)
 +{
 +StringInfo  buf = (StringInfo) PG_GETARG_POINTER(0);
 +text   *result = cstring_to_text_with_len(buf-data, buf-len);
 +
 +return deserialize_json_text(result);
 +}
 +/*
 + * jsonb type send function
 + *
 + * Just send jsonb as a string of text
 + */
 +Datum
 +jsonb_send(PG_FUNCTION_ARGS)
 +{
 +Jsonb  *jb = PG_GETARG_JSONB(0);
 +StringInfoData buf;
 +char   *out;
 +
 +out = JsonbToCString(NULL, (JB_ISEMPTY(jb)) ? NULL : VARDATA(jb), 
 VARSIZE(jb));
 +
 +pq_begintypsend(buf);
 +pq_sendtext(buf, out, strlen(out));
 +PG_RETURN_BYTEA_P(pq_endtypsend(buf));
 +}
 I'd suggest making the format discernible from possible different future
 formats, to allow introducing a proper binary at some later time. Maybe
 just send a int8 first, containing the format.
+10

Especially as this is one type where we may want add type-specific
compression options at some point


Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Partial sort

2014-02-10 Thread Marti Raudsepp
On Sun, Feb 9, 2014 at 7:37 PM, Alexander Korotkov aekorot...@gmail.com wrote:
 This is not only place that worry me about planning overhead. See
 get_cheapest_fractional_path_for_pathkeys. I had to estimate number of
 groups for each sorting column in order to get right fractional path.

AFAICT this only happens once per plan and the overhead is O(n) to the
number of pathkeys? I can't get worried about that, but I guess it's
better to test anyway.

PS: You didn't answer my questions about splitting the patch. I guess
I'll have to do that anyway to run the tests.

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Breaking compile-time dependency cycles of Postgres subdirs?

2014-02-10 Thread Christian Convey
On Sun, Feb 9, 2014 at 8:06 PM, Robert Haas robertmh...@gmail.com wrote:

 On Fri, Feb 7, 2014 at 7:39 AM, Christian Convey
 christian.con...@gmail.com wrote:
  This question is mostly just curiosity...

 As someone very new to this code base, I think these cycles make it a
 little
  harder to figure out the runtime and compile-time dependencies between
 the
  subsystems these directories seem to represent.  I wonder if that's a
  problem others face as well?

 There are probably some cases that could be improved, but I have my
 doubts about whether eliminating cycles is a reasonable goal.
 Sometimes, two modules really do depend on each other.  And, you're
 talking about this not just on the level of individual files but
 entire subtrees.  There are 90,000 lines of code in src/backend/access
 (whose headers are in src/include/access) and more than 38,000 in
 src/backend/storage (whose headers are in src/include/storage);
 expecting all dependencies between those modules to go in one
 direction doesn't feel terribly reasonable.  If it could be done at
 all, you'd probably end up separating code into lots of little tiny
 directories, splitting apart modules with logically related
 functionality into chunks living in entirely different parts of the
 source tree - and I don't think that would be an improvement.


Thanks Robert.  IMHO, whether or not it would be beneficial depends on
which files (or definitions within files) had to be broken out into
additional subdirectories in order to break the cycles.   If it could be
accomplished with at most a few additional subdirectories that were also
intuitively meaningful groupings of files/definitions, it could be a win.
 But if not, I agree it would be a step backwards.

Still, I'm thinking this might be a problem we need to partially solve if
we're going to support a pluggable storage manager, particularly if we
allow a pluggable storage manager to use the system's buffer system and/or
block I/O system.  I guess it depends on exactly what we want from a
pluggable storage manager.

- Christian


Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2014-02-10 Thread Heikki Linnakangas

On 02/07/2014 01:27 PM, Peter Geoghegan wrote:

On Thu, Jan 16, 2014 at 12:35 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

I think you should consider breaking off the relcache parts of my
patch and committing them, because they're independently useful.


Makes sense. Can you extract that into a separate patch, please?


Perhaps you can take a look at this again, when you get a chance.


The relcache parts? I don't think a separate patch ever appeared that 
could be reviewed.


Looking again at the last emails in this whole thread, I don't have 
anything to add. At this point, I think it's pretty clear this won't 
make it into 9.4, so I'm going to mark this as returned with feedback. 
If someone else thinks this still has a chance and is willing to review 
this and beat it into shape, please resurrect it quickly.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] inherit support for foreign tables

2014-02-10 Thread Etsuro Fujita
(2014/02/07 21:31), Etsuro Fujita wrote:
 So, I've modified the patch so 
 that we continue to disallow SET STORAGE on a foreign table *in the same 
 manner as before*, but, as your patch does, allow it on an inheritance 
 hierarchy that contains foreign tables, with the semantics that we 
 quietly ignore the foreign tables and apply the operation to the plain 
 tables, by modifying the ALTER TABLE simple recursion mechanism. 
 Attached is the updated version of the patch.
 
 I'll continue the patch review a bit more, though the patch looks good 
 generally to me except for the abobe issue and the way that the ANALYZE 
 command works.

While reviwing the patch, I've found some issues on interactions with
other commands, other than the SET STORAGE command.

* ADD table_constraint NOT VALID: the patch allows ADD table_constraint
*NOT VALID* to be set on a foreign table as well as an inheritance
hierarchy that contains foreign tables.  But I think it would be better
to disallow ADD table_constraint *NOT VALID* on a foreign table, but
allow it on an inheritance hierarchy that contains foreign tables, with
the semantics that we apply the operation to the plain tables and apply
the transformed operation *ADD table_constraint* to the foreign tables.

* VALIDATE CONSTRAINT constraint_name: though the patch disallow the
direct operation on foreign tables, it allows the operation on an
inheritance hierarchy that contains foreign tables, and the operation
fails if there is at least one foreign table that has at least one NOT
VALID constraint.  I think it would be better to modify the patch so
that we allow the operation on an inheritance hierarchy that contains
foreign tables, with the semantics that we quietly ignore the foreign
tables and apply the operation on the plain tables just like the
semantics of SET STORAGE.  (Note that the foreign tables wouldn't have
NOT VALID constraints by diallowing ADD table_constraint *NOT VALID* on
foreign tables as mentioned above.)

* SET WITH OIDS: though the patch disallow the direct operation on
foreign tables, it allows the operation on an inheritance hierarchy that
contains foreign tables, and that operation is successfully done on
foreign tables.  I think it would be better to modify the patch so that
we allow it on an inheritance hierarchy that contains foreign tables,
with the semantics that we quietly ignore the foreign tables and apply
the operation to the plain tables just like the semantics of SET STORAGE.

Comments are wellcome!

Thanks,

Best regards,
Etsuro Fujita


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Andrew Dunstan


On 02/10/2014 05:05 AM, Andres Freund wrote:

Hi,

On 2014-02-06 18:47:31 -0500, Andrew Dunstan wrote:

  * switching to using text representation in jsonb send/recv
+/*
+ * jsonb type recv function
+ *
+ * the type is sent as text in binary mode, so this is almost the same
+ * as the input function.
+ */
+Datum
+jsonb_recv(PG_FUNCTION_ARGS)
+{
+   StringInfo  buf = (StringInfo) PG_GETARG_POINTER(0);
+   text   *result = cstring_to_text_with_len(buf-data, buf-len);
+
+   return deserialize_json_text(result);
+}
+/*
+ * jsonb type send function
+ *
+ * Just send jsonb as a string of text
+ */
+Datum
+jsonb_send(PG_FUNCTION_ARGS)
+{
+   Jsonb  *jb = PG_GETARG_JSONB(0);
+   StringInfoData buf;
+   char   *out;
+
+   out = JsonbToCString(NULL, (JB_ISEMPTY(jb)) ? NULL : VARDATA(jb), 
VARSIZE(jb));
+
+   pq_begintypsend(buf);
+   pq_sendtext(buf, out, strlen(out));
+   PG_RETURN_BYTEA_P(pq_endtypsend(buf));
+}

I'd suggest making the format discernible from possible different future
formats, to allow introducing a proper binary at some later time. Maybe
just send a int8 first, containing the format.



Teodor privately suggested something similar.  I was thinking of just 
sending a version byte, which for now would be '\x01'. An int8 seems 
like more future-proofing provision than we really need.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Andres Freund
On 2014-02-10 07:27:59 -0500, Andrew Dunstan wrote:
 On 02/10/2014 05:05 AM, Andres Freund wrote:
 I'd suggest making the format discernible from possible different future
 formats, to allow introducing a proper binary at some later time. Maybe
 just send a int8 first, containing the format.
 
 
 Teodor privately suggested something similar.  I was thinking of just
 sending a version byte, which for now would be '\x01'. An int8 seems like
 more future-proofing provision than we really need.

Hm. Isn't that just about the same? I was thinking of the c type int8,
not the 64bit type. It seems cleaner to do a pg_sendint(..., 1, 1) than
to do it manually inside the string.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Andrew Dunstan


On 02/10/2014 07:39 AM, Andres Freund wrote:

On 2014-02-10 07:27:59 -0500, Andrew Dunstan wrote:

On 02/10/2014 05:05 AM, Andres Freund wrote:

I'd suggest making the format discernible from possible different future
formats, to allow introducing a proper binary at some later time. Maybe
just send a int8 first, containing the format.


Teodor privately suggested something similar.  I was thinking of just
sending a version byte, which for now would be '\x01'. An int8 seems like
more future-proofing provision than we really need.

Hm. Isn't that just about the same? I was thinking of the c type int8,
not the 64bit type. It seems cleaner to do a pg_sendint(..., 1, 1) than
to do it manually inside the string.



OK, works for me. I'm tied up for a couple of days, will do this when 
I'm back on deck.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-02-10 Thread Heikki Linnakangas

On 01/31/2014 11:54 AM, Andres Freund wrote:

Hi,

On 2014-01-28 21:27:29 -0800, Peter Geoghegan wrote:

On Fri, Nov 15, 2013 at 11:47 AM, Andres Freund and...@2ndquadrant.com wrote:

1) I've added an abstracted atomic ops implementation. Needs a fair
amount of work, also submitted as a separate CF entry. (Patch 1  2)


Commit 220b34331f77effdb46798ddd7cca0cffc1b2858 caused bitrot when
applying 0002-Very-basic-atomic-ops-implementation.patch. Please
rebase.


I've pushed a rebased version of the patchset to
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git
branch rwlock contention.
220b34331f77effdb46798ddd7cca0cffc1b2858 actually was the small problem,
ea9df812d8502fff74e7bc37d61bdc7d66d77a7f was the major PITA.

I plan to split the atomics patch into smaller chunks before
reposting. Imo the Convert the PGPROC-lwWaitLink list into a dlist
instead of open coding it. is worth being applied independently from
the rest of the series, it simplies code and it fixes a bug...


I committed a fix for the WakeupWaiters() bug now, without the rest of 
the open coding patch. Converting lwWaitLInk into a dlist is probably 
a good idea, but seems better to fix the bug separately, for the sake of 
git history if nothing else.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-10 Thread Hiroshi Inoue

(2014/02/09 8:06), Andrew Dunstan wrote:


On 02/08/2014 05:34 PM, Tom Lane wrote:

Hiroshi Inoue in...@tpf.co.jp writes:

Though I'm not a MINGW expert at all, I know dllwrap is a deprecated
tool and dlltool is almost a deprecated tool. Cygwin port is removing
the use of dllwrap and dlltool now. Isn't it better for MINGW port to
follow it?

Only way to make that happen is to prepare and test a patch ...


Yeah. Incidentally, we didn't quite get rid of dlltool for Cygwin. We
did get rid of dllwrap. But I agree this is worth trying for Mingw.


I tried MINGW port with the attached change and successfully built
src and contrib and all pararell regression tests were OK.

regards,
Hiroshi Inoue

diff --git a/src/Makefile.shlib b/src/Makefile.shlib
index a95e4d6..367f585 100644
--- a/src/Makefile.shlib
+++ b/src/Makefile.shlib
@@ -371,7 +371,20 @@ $(stlib): $(OBJS) | $(SHLIB_PREREQS)
 	$(RANLIB) $@
 
 
+else #cygwin
+ifeq ($(PORTNAME), win32)
+ifeq (,$(SHLIB_EXPORTS))
+$(shlib) $(stlib): $(OBJS) | $(SHLIB_PREREQS)
+	$(CC) $(CFLAGS)  -shared -o $@  $(OBJS) $(LDFLAGS) $(LDFLAGS_SL) $(SHLIB_LINK) $(LIBS) -Wl,--out-implib=$(stlib) -Wl,--export-all-symbols
+
 else
+DLL_DEFFILE = lib$(NAME)dll.def
+$(shlib) $(stlib): $(OBJS) $(DLL_DEFFILE) | $(SHLIB_PREREQS)
+	$(CC) $(CFLAGS)  -shared -o $@  $(OBJS) $(LDFLAGS) $(LDFLAGS_SL) $(SHLIB_LINK) $(LIBS) $(DLL_DEFFILE) -Wl,--out-implib=$(stlib) 
+endif
+
+
+else #win32
 ifeq (,$(SHLIB_EXPORTS))
 DLL_DEFFILE = lib$(NAME)dll.def
 exports_file = $(DLL_DEFFILE)
@@ -388,6 +401,7 @@ $(shlib): $(OBJS) $(DLL_DEFFILE) | $(SHLIB_PREREQS)
 $(stlib): $(shlib) $(DLL_DEFFILE) | $(SHLIB_PREREQS)
 	$(DLLTOOL) --dllname $(shlib) $(DLLTOOL_LIBFLAGS) --def $(DLL_DEFFILE) --output-lib $@
 
+endif # PORTNAME == win32
 endif # PORTNAME == cgywin
 endif # PORTNAME == cygwin || PORTNAME == win32
 
diff --git a/src/backend/Makefile b/src/backend/Makefile
index 356890d..41a4e41 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -80,18 +80,8 @@ endif # cygwin
 ifeq ($(PORTNAME), win32)
 LIBS += -lsecur32
 
-postgres: $(OBJS) postgres.def libpostgres.a $(WIN32RES)
-	$(DLLTOOL) --dllname $@$(X) --output-exp $@.exp --def postgres.def
-	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) -o $@$(X) -Wl,--base-file,$@.base $@.exp $(call expand_subsys,$(OBJS)) $(WIN32RES) $(LIBS)
-	$(DLLTOOL) --dllname $@$(X) --base-file $@.base --output-exp $@.exp --def postgres.def
-	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) -Wl,--stack=$(WIN32_STACK_RLIMIT) -o $@$(X) $@.exp $(call expand_subsys,$(OBJS)) $(WIN32RES) $(LIBS)
-	rm -f $@.exp $@.base
-
-postgres.def: $(OBJS)
-	$(DLLTOOL) --export-all --output-def $@ $(call expand_subsys,$^)
-
-libpostgres.a: postgres.def
-	$(DLLTOOL) --dllname postgres.exe --def postgres.def --output-lib $@
+postgres: $(OBJS)  $(WIN32RES)
+	$(CC) $(CFLAGS) $(LDFLAGS) $(LDFLAGS_EX) -Wl,--stack=$(WIN32_STACK_RLIMIT) -o $@$(X) -Wl,--out-implib=libpostgres.a -Wl,--export-all-symbols $(call expand_subsys,$(OBJS)) $(WIN32RES) $(LIBS)
 
 endif # win32
 
diff --git a/src/makefiles/Makefile.win32 b/src/makefiles/Makefile.win32
index 1aae9e9..b18621b 100644
--- a/src/makefiles/Makefile.win32
+++ b/src/makefiles/Makefile.win32
@@ -72,6 +72,4 @@ win32ver.o: win32ver.rc
 
 # Rule for building a shared library from a single .o file
 %.dll: %.o
-	$(DLLTOOL) --export-all --output-def $*.def $
-	$(DLLWRAP) -o $@ --def $*.def $ $(LDFLAGS) $(LDFLAGS_SL) $(BE_DLLLIBS)
-	rm -f $*.def
+	$(CC) $(CFLAGS) -shared -o $@ $ -Wl,--export-all-symbols $(LDFLAGS) $(LDFLAGS_SL) $(BE_DLLLIBS)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

2014-02-10 Thread Andres Freund
Hi,

During the lwlock scalability work I noticed a longstanding issue with
the lwlock code. LWLockRelease() and the other mentioned locations do
the following to wake up any waiters, without holding the lock's
spinlock:
/*
 * Awaken any waiters I removed from the queue.
 */
while (head != NULL)
{
LOG_LWDEBUG(LWLockRelease, T_NAME(l), T_ID(l), release waiter);
proc = head;
head = proc-lwWaitLink;
proc-lwWaitLink = NULL;
proc-lwWaiting = false;
PGSemaphoreUnlock(proc-sem);
}

which means they manipulate the lwWaitLink queue without
protection. That's done intentionally. The code tries to protect against
corruption of the list to do a woken up backend acquiring a lock (this
or an independent one) by only continuing when the lwWaiting flag is set
to false. Unfortunately there's absolutely no guarantee that a) the
assignment to lwWaitLink and lwWaiting are done in that order b) that
the stores are done in-order from the POV of other backends.
So what we need to do is to acquire a write barrier between the
assignments to lwWaitLink and lwWaiting, i.e.
proc-lwWaitLink = NULL;
pg_write_barrier();
proc-lwWaiting = false;
the reader side already uses an implicit barrier by using spinlocks.

I've fixed this as part 1 of the lwlock scalability work in [1], but
Heikki rightfully suggested that a) this should be backpatched b) done
in a separate commit.

There is the question what to do about the branches without barriers? I
guess a SpinLockAcquire()/Release() would do? Or do we decide it's not
important enough to matter, since it's not an issue on x86?

[1] 
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=commitdiff;h=2de11eb11bb3e3777f6d384de0af9c2f77960637

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] GiST, getting the record in table that the leaf entry points to

2014-02-10 Thread Marios Vodas
Hello,
 
What I would like to do is to get the record in the table that a leaf GISTENTRY 
points to, if that is possible.
I notice that GISTENTRY contains these members: Relation rel, Page page, and 
OffsetNumber offset, but are these referring to the table or the index?
 
Thank you,
Marios Vodas
  

Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-10 Thread Amit Kapila
On Thu, Feb 6, 2014 at 5:57 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Thu, Feb 6, 2014 at 9:13 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 Considering above change as correct, I have tried to see the worst
 case overhead for this patch by having new tuple such that after
 25% or so of suffix/prefix match, there is a small change in tuple
 and kept rest of tuple same as old tuple and it shows overhead
 for this patch as well.

 Updated test script is attached.

 Unpatched
  testname | wal_generated | duration
 --+---+--
  ten long fields, 8 bytes changed | 348843824 | 5.56866788864136
  ten long fields, 8 bytes changed | 348844800 | 5.84434294700623
  ten long fields, 8 bytes changed | 35050 | 5.92329406738281
 (3 rows)



 wal-update-prefix-suffix-encode-1.patch

  testname | wal_generated | duration
 --+---+--
  ten long fields, 8 bytes changed | 348845624 | 6.92243480682373
  ten long fields, 8 bytes changed | 348847000 | 8.35828399658203
  ten long fields, 8 bytes changed | 350204752 | 7.61826491355896
 (3 rows)

 One minor point, can we avoid having prefix tag if prefixlen is 0.

 + /* output prefix as a tag */
 + pgrb_out_tag(ctrlp, ctrlb, ctrl, bp, prefixlen, hlen);

I think generating out tag for suffix/prefix has one bug i.e it doesn't
consider the max length of 273 bytes (PGLZ_MAX_MATCH ) which
is mandatory for LZ format.

One more point about this patch is that in function pgrb_delta_encode(),
is it mandatory to return at end in below check:

if (result_size  result_max)
return false;

I mean to say as before starting for copying literal bytes we have
already ensured that the compressed data is greater than 25%, so
may be we can avoid this check. I have tried to take the data by
removing this check and found that it reduces overhead and improves
WAL reduction as well. The data is as below (compare this with data
in above mail for unpatched version data):


 testname | wal_generated | duration
--+---+--
 ten long fields, 8 bytes changed | 300705552 | 6.51416897773743
 ten long fields, 8 bytes changed | 300703816 | 6.85267090797424
 ten long fields, 8 bytes changed | 300701840 | 7.15832996368408
(3 rows)

If we want to go with this approach, then I think apart from above
points there is no major change required (may be some comments,
function names etc. can be improved).

 But if we really just want to do prefix/suffix compression, this is a
 crappy and expensive way to do it.  We needn't force everything
 through the pglz tag format just because we elide a common prefix or
 suffix.

Here are you bothered about below code where the patch is
doing byte-by-byte copy after prefix/suffix match?

/* output bytes between prefix and suffix as literals */
dp = source[prefixlen];
dend = source[slen - suffixlen];
while (dp  dend)
{
pgrb_out_literal(ctrlp, ctrlb, ctrl, bp, *dp);
dp++; /* Do not do this ++ in the line above! */
}

I think if we want to change LZ format, it will be bit more work and
verification for decoding has to be done much more strenuously.

Note - During performance test, I have focussed mainly on worst case,
because we already know that this idea is good for best and average cases.
However if we decide that this is better and good to proceed, I can take
the data for other cases as well.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Tom Lane
Craig Ringer cr...@2ndquadrant.com writes:
 On 02/06/2014 01:48 AM, Tom Lane wrote:
 switching to binary is the same as text may well be the most prudent
 path here.

 Can't we just reject attempts to transfer these via binary copy,
 allowing only a text format? So rather than sending text when the binary
 is requested, we just require clients to use text for this type.

That used to be the case, back when we didn't have send/recv functions for
all built-in types; and client-code authors complained bitterly about it.
It's pretty much unworkable if the text/binary choice is being made by
a code level that doesn't have complete understanding of the queries it's
transmitting.  Consider SELECT * FROM ...; how are you going to know
which columns to request in binary and which in text?  Even if you're
willing to do trial and error (ie, it's okay to cause transaction
rollbacks), the backend isn't very helpful about telling you exactly
which column(s) would need to be requested as text.

I think the downthread solution of prepending a type-specific format ID
byte is a better answer for giving us flexibility down the road.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Breaking compile-time dependency cycles of Postgres subdirs?

2014-02-10 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Feb 7, 2014 at 7:39 AM, Christian Convey
 christian.con...@gmail.com wrote:
 As someone very new to this code base, I think these cycles make it a little
 harder to figure out the runtime and compile-time dependencies between the
 subsystems these directories seem to represent.  I wonder if that's a
 problem others face as well?

 There are probably some cases that could be improved, but I have my
 doubts about whether eliminating cycles is a reasonable goal.

Aside from Robert's points, I have a couple of thoughts:

I think if it had been a clear, enforced goal all along, it might've been
possible to build the system with such a restriction (for the most part at
least).  At this point though, the amount of work and code churn involved
seems like it'd far exceed the benefits.

It's also fair to question how much improvement in comprehensibility
we'd really get.  It's not like code's been dropped into completely
random places where it doesn't belong.  In the end, Postgres is a pretty
big system and it's necessarily going to take time for newbies to learn
their way around it.

I believe there are some cases where circularity is just about
unavoidable.  As an example, the error reporting code in elog.c depends
on memory management in mcxt.c, which itself uses elog.c's reporting
facilities.  There's another mutual dependency between error reporting
and GUC (server configuration control).  And on and on.  I think the
coding rule you're suggesting would require that each such dependency
loop be confined to one major backend subsystem, which seems rather
arbitrary.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Breaking compile-time dependency cycles of Postgres subdirs?

2014-02-10 Thread Christian Convey
On Mon, Feb 10, 2014 at 10:28 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 I think if it had been a clear, enforced goal all along, it might've been
 possible to build the system with such a restriction (for the most part at
 least).  At this point though, the amount of work and code churn involved
 seems like it'd far exceed the benefits.


That makes sense to me.  I certainly didn't think it was a slam-dunk that
what I was proposing would be an improvement.  It just seemed like a
question worth asking.  Thanks for your thoughts.

- Christian


Re: [HACKERS] Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

2014-02-10 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 So what we need to do is to acquire a write barrier between the
 assignments to lwWaitLink and lwWaiting, i.e.
 proc-lwWaitLink = NULL;
 pg_write_barrier();
 proc-lwWaiting = false;

You didn't really explain why you think that ordering is necessary?
Each proc being awoken will surely see both fields updated, and other
procs won't be examining these fields at all, since we already delinked
all these procs from the LWLock's queue.

 There is the question what to do about the branches without barriers? I
 guess a SpinLockAcquire()/Release() would do? Or do we decide it's not
 important enough to matter, since it's not an issue on x86?

Given the lack of trouble reports that could be traced to this,
I don't feel a need to worry about it in branches that don't
have any barrier support.  But in any case, I'm not convinced
there's a bug here at all.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

2014-02-10 Thread Andres Freund
On 2014-02-10 11:11:28 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  So what we need to do is to acquire a write barrier between the
  assignments to lwWaitLink and lwWaiting, i.e.
  proc-lwWaitLink = NULL;
  pg_write_barrier();
  proc-lwWaiting = false;
 
 You didn't really explain why you think that ordering is necessary?
 Each proc being awoken will surely see both fields updated, and other
 procs won't be examining these fields at all, since we already delinked
 all these procs from the LWLock's queue.

The problem is that one the released backends could wake up concurrently
because of a unrelated, or previous PGSemaphoreUnlock(). It could see
lwWaiting = false, and thus wakeup and acquire the lock, even if the
store for lwWaitLink hasn't arrived (or performed, there's no guaranteed
ordering here) yet.
Now, it may well be that there's no practical consequence of that, but I
am not prepared to bet on it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

2014-02-10 Thread Andres Freund
On 2014-02-10 11:20:30 -0500, Tom Lane wrote:
 I wrote:
  You didn't really explain why you think that ordering is necessary?
 
 Actually, after grepping to check my memory of what those fields are
 being used for, I have a bigger question: WTF is xlog.c doing being
 so friendly with the innards of LWLocks?  Surely this needs to get
 refactored so that most of WakeupWaiters() and friends is in lwlock.c.
 Or has all notion of modularity gone out the window while I wasn't
 looking?

Well, it's not actually using any lwlock.c code, it's a special case
locking logic, just reusing the datastructures. That said, I am not
particularly happy about the amount of code it's duplicating from
lwlock.c. Pretty much all of WALInsertSlotReleaseOne and most of
WALInsertSlotAcquireOne() is a copied.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal, patch: allow multiple plpgsql plugins

2014-02-10 Thread Pavel Stehule
Hello Marko


2014-01-16 23:54 GMT+01:00 Marko Tiikkaja ma...@joh.to:

 Hi Pavel,

 First of all, thanks for working on this!


 On 1/12/14, 8:58 PM, Pavel Stehule wrote:

 I still not happy with plugin_info - it is only per plugin now and should
 be per plugin and per function.


 I'm not sure I understand the point of plugin_info in the first place, but
 what would having a separate info per (plugin, function) achieve that can't
 be done otherwise?


 As for the current patch, I'd like to see improvements on a few things:

   1) It doesn't currently compile because of extra semicolons in the
  PLpgSQL_plugin struct.


fixed



   2) The previous comment above the same struct still talk about the
  rendezvous variable which is now gone.  The comment should be
  updated to reflect the new API.


removed



   3) The same comment talks about how important it is to unregister a
  plugin if its _PG_fini() is ever called, but the current API does
  not support unregistering.  That should probably be added?  I'm not
  sure when _PG_fini() would be called.


removed

These plugins should not be removed - there is no any mechanism how to
remove active plugin without close session

Regards

Pavel



   4) The comment  /* reserved for use by optional plugin */  seems a bit
  weird in its new context.


 Regards,
 Marko Tiikkaja

commit bf0820786f08b08812bba3d86233cbac30922054
Author: Pavel Stehule pavel.steh...@gooddata.com
Date:   Mon Feb 10 17:50:31 2014 +0100

initial

diff --git a/src/pl/plpgsql/src/Makefile b/src/pl/plpgsql/src/Makefile
index 852b0c7..37d17a8 100644
--- a/src/pl/plpgsql/src/Makefile
+++ b/src/pl/plpgsql/src/Makefile
@@ -19,7 +19,7 @@ rpath =
 
 OBJS = pl_gram.o pl_handler.o pl_comp.o pl_exec.o pl_funcs.o pl_scanner.o
 
-DATA = plpgsql.control plpgsql--1.0.sql plpgsql--unpackaged--1.0.sql
+DATA = plpgsql.control plpgsql--1.1.sql plpgsql--1.0--1.1.sql plpgsql--unpackaged--1.0.sql
 
 all: all-lib
 
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 3749fac..e6510f1 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -86,6 +86,22 @@ typedef struct SimpleEcontextStackEntry
 static EState *shared_simple_eval_estate = NULL;
 static SimpleEcontextStackEntry *simple_econtext_stack = NULL;
 
+/*
+ * List of pointers and info of registered plugins.
+ */
+typedef struct PluginPtrEntry
+{
+	PLpgSQL_plugin *plugin_ptr;
+	struct PluginPtrEntry *next;
+} PluginPtrEntry;
+
+/*
+ * Allocated in TopMemoryContext
+ */
+static PluginPtrEntry *plugins = NULL;
+static int nplugins = 0;
+static int used_plugin_hook_types = 0;
+
 /
  * Local function forward declarations
  /
@@ -235,6 +251,18 @@ static char *format_expr_params(PLpgSQL_execstate *estate,
 const PLpgSQL_expr *expr);
 static char *format_preparedparamsdata(PLpgSQL_execstate *estate,
 	   const PreparedParamsData *ppd);
+static void **get_plugin_info(PLpgSQL_execstate *estate, int plugin_number);
+
+
+/* Bits for used plugin callback types */
+#define PLUGIN_HOOK_TYPE_FUNC_SETUP	(1  0)
+#define PLUGIN_HOOK_TYPE_FUNC_BEG	(1  2)
+#define PLUGIN_HOOK_TYPE_FUNC_END	(1  3)
+#define PLUGIN_HOOK_TYPE_STMT_BEG	(1  4)
+#define PLUGIN_HOOK_TYPE_STMT_END	(1  5)
+
+#define EXEC_PLUGIN_HOOK_TYPE(type) \
+	((used_plugin_hook_types  (PLUGIN_HOOK_TYPE_ ## type)) == (PLUGIN_HOOK_TYPE_ ## type))
 
 
 /* --
@@ -331,10 +359,28 @@ plpgsql_exec_function(PLpgSQL_function *func, FunctionCallInfo fcinfo,
 	exec_set_found(estate, false);
 
 	/*
-	 * Let the instrumentation plugin peek at this function
+	 * Let the instrumentation plugins peek at this function
 	 */
-	if (*plugin_ptr  (*plugin_ptr)-func_beg)
-		((*plugin_ptr)-func_beg) (estate, func);
+	if (EXEC_PLUGIN_HOOK_TYPE(FUNC_BEG))
+	{
+		PluginPtrEntry *pe;
+		int i;
+
+		Assert(plugins != NULL);
+
+		for (i = 0, pe = plugins; pe != NULL; pe = pe-next, i++)
+		{
+			PLpgSQL_plugin *pl_ptr = pe-plugin_ptr;
+
+			if (pl_ptr-func_beg)
+			{
+void **plugin_info;
+
+plugin_info = get_plugin_info(estate, i);
+(pl_ptr-func_beg) (estate, func, plugin_info);
+			}
+		}
+	}
 
 	/*
 	 * Now call the toplevel block of statements
@@ -479,10 +525,28 @@ plpgsql_exec_function(PLpgSQL_function *func, FunctionCallInfo fcinfo,
 	estate.err_text = gettext_noop(during function exit);
 
 	/*
-	 * Let the instrumentation plugin peek at this function
+	 * Let the instrumentation plugins peek at this function
 	 */
-	if (*plugin_ptr  (*plugin_ptr)-func_end)
-		((*plugin_ptr)-func_end) (estate, func);
+	if (EXEC_PLUGIN_HOOK_TYPE(FUNC_END))
+	{
+		PluginPtrEntry *pe;
+		int i;
+
+		Assert(plugins != NULL);
+
+		for (i = 0, pe = plugins; pe != NULL; i++, pe = pe-next)
+		{
+			PLpgSQL_plugin *pl_ptr = pe-plugin_ptr;
+
+			if (pl_ptr-func_end)
+			{
+void **plugin_info;
+
+

Re: [HACKERS] Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

2014-02-10 Thread Tom Lane
I wrote:
 You didn't really explain why you think that ordering is necessary?

Actually, after grepping to check my memory of what those fields are
being used for, I have a bigger question: WTF is xlog.c doing being
so friendly with the innards of LWLocks?  Surely this needs to get
refactored so that most of WakeupWaiters() and friends is in lwlock.c.
Or has all notion of modularity gone out the window while I wasn't
looking?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for Todo Item : Provide fallback_application_name in contrib/pgbench, oid2name, and dblink

2014-02-10 Thread Jeff Janes
On Sun, Feb 9, 2014 at 4:56 PM, Robert Haas robertmh...@gmail.com wrote:

 On Sun, Feb 9, 2014 at 6:33 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 
  Since this commit (17676c785a95b2598c573), pgbench no longer uses
 .pgpass to
  obtain passwords, but instead prompts for a password
 
  This problem is in 9.3 and 9.4dev
 
  According to strace, it is reading the .pgpass file, it just seem like
 it is
  not using it.

 Hmm.  I tried pgbench with the following .pgpass file and it worked
 OK.  Removing the file caused pgbench to prompt for a password.

 *:*:*:*:foo


OK, that works for me.  I had it completely specified.  Playing with
variations on this, I see that the key is pgport.  Set to * it works, set
to 5432 it prompts for the password.  (If I specify -p 5432 to pgbench,
that would work with the original file)



 Presumably whatever behavior difference you are seeing is somehow
 related to the use of PQconnectdbParams() rather than PQsetdbLogin(),
 but the fine manual does not appear to document a different between
 those functions as regards password-prompting behavior or .pgpass
 usage.


It looks like PQsetdbLogin() has either NULL or empty string passed to it
match 5432 in pgpass, while PQconnectdbParams() only has NULL match 5432
and empty string does not.  pgbench uses empty string if no -p is specified.


This make pgbench behave the way I think is correct, but it hardly seems
like the right way to fix it.


*** a/contrib/pgbench/pgbench.c
--- b/contrib/pgbench/pgbench.c
*** doConnect(void)
*** 528,533 
--- 528,535 

new_pass = false;

+   if (values[1][0] == 0) values[1]=NULL;
+
conn = PQconnectdbParams(keywords, values, true);

if (!conn)


Cheers,

Jeff


Re: [HACKERS] Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

2014-02-10 Thread Heikki Linnakangas

On 02/10/2014 06:41 PM, Andres Freund wrote:

On 2014-02-10 11:20:30 -0500, Tom Lane wrote:

I wrote:

You didn't really explain why you think that ordering is necessary?


Actually, after grepping to check my memory of what those fields are
being used for, I have a bigger question: WTF is xlog.c doing being
so friendly with the innards of LWLocks?  Surely this needs to get
refactored so that most of WakeupWaiters() and friends is in lwlock.c.
Or has all notion of modularity gone out the window while I wasn't
looking?


Well, it's not actually using any lwlock.c code, it's a special case
locking logic, just reusing the datastructures. That said, I am not
particularly happy about the amount of code it's duplicating from
lwlock.c. Pretty much all of WALInsertSlotReleaseOne and most of
WALInsertSlotAcquireOne() is a copied.


I'm not too happy with the amount of copy-paste myself, but there was 
enough difference to regular lwlocks that I didn't want to bother all 
lwlocks with the xlog-specific stuff either. The WAL insert slots do 
share the LWLock-related PGPROC fields though, and semaphore. I'm all 
ears if you have ideas on that..


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Merlin Moncure
On Mon, Feb 10, 2014 at 6:39 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-10 07:27:59 -0500, Andrew Dunstan wrote:
 On 02/10/2014 05:05 AM, Andres Freund wrote:
 I'd suggest making the format discernible from possible different future
 formats, to allow introducing a proper binary at some later time. Maybe
 just send a int8 first, containing the format.
 

 Teodor privately suggested something similar.  I was thinking of just
 sending a version byte, which for now would be '\x01'. An int8 seems like
 more future-proofing provision than we really need.

 Hm. Isn't that just about the same? I was thinking of the c type int8,
 not the 64bit type. It seems cleaner to do a pg_sendint(..., 1, 1) than
 to do it manually inside the string.

-1.   Currently no other wire format types send version and it's not
clear why this one is special.  We've changed the wire format versions
before and it's upon the client to deal with those changes.  The
server version *is* the version basically.  If a broader solution
exists I think it should be addressed broadly.  Versioning one type
only IMNSHO is a complete hack.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

2014-02-10 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 On 02/10/2014 06:41 PM, Andres Freund wrote:
 Well, it's not actually using any lwlock.c code, it's a special case
 locking logic, just reusing the datastructures. That said, I am not
 particularly happy about the amount of code it's duplicating from
 lwlock.c. Pretty much all of WALInsertSlotReleaseOne and most of
 WALInsertSlotAcquireOne() is a copied.

 I'm not too happy with the amount of copy-paste myself, but there was 
 enough difference to regular lwlocks that I didn't want to bother all 
 lwlocks with the xlog-specific stuff either. The WAL insert slots do 
 share the LWLock-related PGPROC fields though, and semaphore. I'm all 
 ears if you have ideas on that..

I agree that if the behavior is considerably different, we don't really
want to try to make LWLockAcquire/Release cater to both this and their
standard behavior.  But why not add some additional functions in lwlock.c
that do what xlog wants?  If we're going to have mostly-copy-pasted logic,
it'd at least be better if it was in the same file, and not somewhere
that's not even in the same major subtree.

Also, the reason that LWLock isn't an abstract struct is because we wanted
to be able to embed it in other structs; *not* as license for other
modules to fool with its contents.  If we were working in C++ we'd
certainly have made all its fields private.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2014-02-10 Thread Peter Geoghegan
On Mon, Feb 10, 2014 at 11:57 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 The relcache parts? I don't think a separate patch ever appeared that could
 be reviewed.


I posted the patch on January 18th:
http://www.postgresql.org/message-id/cam3swzth4vkesot7dcrwbprn7zzhnz-wa6zmvo1ff7gbnoj...@mail.gmail.com

I was under the impression that you agreed that this was independently
valuable, regardless of the outcome here.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

2014-02-10 Thread Heikki Linnakangas

On 02/10/2014 03:46 PM, Andres Freund wrote:

Hi,

During the lwlock scalability work I noticed a longstanding issue with
the lwlock code. LWLockRelease() and the other mentioned locations do
the following to wake up any waiters, without holding the lock's
spinlock:
 /*
  * Awaken any waiters I removed from the queue.
  */
 while (head != NULL)
 {
 LOG_LWDEBUG(LWLockRelease, T_NAME(l), T_ID(l), release waiter);
 proc = head;
 head = proc-lwWaitLink;
 proc-lwWaitLink = NULL;
 proc-lwWaiting = false;
 PGSemaphoreUnlock(proc-sem);
 }

which means they manipulate the lwWaitLink queue without
protection. That's done intentionally. The code tries to protect against
corruption of the list to do a woken up backend acquiring a lock (this
or an independent one) by only continuing when the lwWaiting flag is set
to false. Unfortunately there's absolutely no guarantee that a) the
assignment to lwWaitLink and lwWaiting are done in that order b) that
the stores are done in-order from the POV of other backends.
So what we need to do is to acquire a write barrier between the
assignments to lwWaitLink and lwWaiting, i.e.
 proc-lwWaitLink = NULL;
 pg_write_barrier();
 proc-lwWaiting = false;
the reader side already uses an implicit barrier by using spinlocks.

I've fixed this as part 1 of the lwlock scalability work in [1], but
Heikki rightfully suggested that a) this should be backpatched b) done
in a separate commit.

There is the question what to do about the branches without barriers? I
guess a SpinLockAcquire()/Release() would do?


The other alternative we discussed on IM is to unlink the waiters from 
the linked list, while still holding the spinlock. We can't do the 
PGSemaphoreUnlock() call to actually wake up the waiters while holding 
the spinlock, but all the shared memory manipulations we could. It would 
move all the modifications of the shared structures under the spinlock, 
which feels comforting.


It would require using some-sort of a backend-private data structure to 
hold the list of processes to wake up. We don't want to palloc() in 
LWLockRelease(), but we could malloc() a large-enough array once at 
process initialization, and use that on all LWLockRelease() calls.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Mon, Feb 10, 2014 at 6:39 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-10 07:27:59 -0500, Andrew Dunstan wrote:
 Teodor privately suggested something similar.  I was thinking of just
 sending a version byte, which for now would be '\x01'. An int8 seems like
 more future-proofing provision than we really need.

 -1.   Currently no other wire format types send version and it's not
 clear why this one is special.  We've changed the wire format versions
 before and it's upon the client to deal with those changes.

Really?  How would you expect to do that, exactly?  In particular,
how would you propose that a binary pg_dump file be reloadable if
we redefine the binary format down the road without having made
provision like this?

 Versioning one type only IMNSHO is a complete hack.

I don't feel a need for versioning int, or float8, or most other types;
and that includes the ones for which we've previously defined binary
format as equivalent to text (enums).  In this case we know that we're not
totally satisfied with the binary format we're defining today, so I think
a type-specific escape hatch is a reasonable solution.

Moreover, I don't especially buy tying it to server version, even if we
had an information pathway that would provide that reliably in all
contexts.  Granting the presumption that more than one data type would
want such versioning, it's still possible that different data types would
have different ideas about what they needed to do and where the cutover
points were.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Terminating pg_basebackup background streamer

2014-02-10 Thread Heikki Linnakangas

On 02/09/2014 02:17 PM, Magnus Hagander wrote:

If an error occurs in the foreground (backup) process of pg_basebackup, and
we exit in a controlled way, the background process (streaming xlog
process) would stay around and keep streaming.

This can happen for example if disk space runs out and there is very low
activity on the server. (If there is activity on the server, the background
streamer will also run out of disk space and exit)

Attached patch kills it off in disconnect_and_exit(), which seems like the
right thing to do to me.

Any objections to applying and backpatching that for the upcoming minor
releases?


Do you get a different error message with this patch than before? Is the 
new one better than the old one?


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

2014-02-10 Thread Heikki Linnakangas

On 02/10/2014 08:03 PM, Tom Lane wrote:

Heikki Linnakangas hlinnakan...@vmware.com writes:

On 02/10/2014 06:41 PM, Andres Freund wrote:

Well, it's not actually using any lwlock.c code, it's a special case
locking logic, just reusing the datastructures. That said, I am not
particularly happy about the amount of code it's duplicating from
lwlock.c. Pretty much all of WALInsertSlotReleaseOne and most of
WALInsertSlotAcquireOne() is a copied.



I'm not too happy with the amount of copy-paste myself, but there was
enough difference to regular lwlocks that I didn't want to bother all
lwlocks with the xlog-specific stuff either. The WAL insert slots do
share the LWLock-related PGPROC fields though, and semaphore. I'm all
ears if you have ideas on that..


I agree that if the behavior is considerably different, we don't really
want to try to make LWLockAcquire/Release cater to both this and their
standard behavior.  But why not add some additional functions in lwlock.c
that do what xlog wants?  If we're going to have mostly-copy-pasted logic,
it'd at least be better if it was in the same file, and not somewhere
that's not even in the same major subtree.


Ok, I'll try to refactor it that way, so that we can see if it looks better.


Also, the reason that LWLock isn't an abstract struct is because we wanted
to be able to embed it in other structs; *not* as license for other
modules to fool with its contents.  If we were working in C++ we'd
certainly have made all its fields private.


Um, xlog.c is doing no such thing. The insertion slots use a struct of 
their own, which is also copy-pasted from LWLock (with one additional 
field).


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE

2014-02-10 Thread Peter Geoghegan
On Sun, Jan 19, 2014 at 2:17 AM, Peter Geoghegan p...@heroku.com wrote:
 I'm just throwing an error when locking the tuple returns
 HeapTupleInvisible, and the xmin of the tuple is our xid.

I would like some feedback on this point. We need to consider how
exactly to avoid updating the same tuple inserted by our command.
Updating a tuple we inserted cannot be allowed to happen, not least
because to do so causes livelock.

A related consideration that I raised in mid to late January that
hasn't been commented on is avoiding updating the same tuple twice,
and where we come down on that with respect to where our
responsibility to the user starts and ends. For example, SQL MERGE
officially forbids this, but MySQL's INSERT...ON DUPLICATE KEY UPDATE
seems not to, probably due to implementation considerations.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Terminating pg_basebackup background streamer

2014-02-10 Thread Magnus Hagander
On Mon, Feb 10, 2014 at 7:29 PM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 02/09/2014 02:17 PM, Magnus Hagander wrote:

 If an error occurs in the foreground (backup) process of pg_basebackup,
 and
 we exit in a controlled way, the background process (streaming xlog
 process) would stay around and keep streaming.

 This can happen for example if disk space runs out and there is very low
 activity on the server. (If there is activity on the server, the
 background
 streamer will also run out of disk space and exit)

 Attached patch kills it off in disconnect_and_exit(), which seems like the
 right thing to do to me.

 Any objections to applying and backpatching that for the upcoming minor
 releases?


 Do you get a different error message with this patch than before? Is the
 new one better than the old one?


Previously you got double error messages - one from the foreground, and a
second one from the background sometime in the future (whenever it
eventually failed, and for whatever reason - so if it was out of disk
space, it would complain about that once it got enough xlog for it to
happen).

With the patch you just get the error message from the first process. The
background process doesn't give an error on SIGTERM, it just exists.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] PoC: Partial sort

2014-02-10 Thread Alexander Korotkov
On Mon, Feb 10, 2014 at 2:33 PM, Marti Raudsepp ma...@juffo.org wrote:

 On Sun, Feb 9, 2014 at 7:37 PM, Alexander Korotkov aekorot...@gmail.com
 wrote:
  This is not only place that worry me about planning overhead. See
  get_cheapest_fractional_path_for_pathkeys. I had to estimate number of
  groups for each sorting column in order to get right fractional path.

 AFAICT this only happens once per plan and the overhead is O(n) to the
 number of pathkeys? I can't get worried about that, but I guess it's
 better to test anyway.

 PS: You didn't answer my questions about splitting the patch. I guess
 I'll have to do that anyway to run the tests.


Done. Patch is splitted.

--
With best regards,
Alexander Korotkov.


partial-sort-basic-1.patch.gz
Description: GNU Zip compressed data


partial-sort-merge-1.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Merlin Moncure
On Mon, Feb 10, 2014 at 12:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 On Mon, Feb 10, 2014 at 6:39 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
 On 2014-02-10 07:27:59 -0500, Andrew Dunstan wrote:
 Teodor privately suggested something similar.  I was thinking of just
 sending a version byte, which for now would be '\x01'. An int8 seems like
 more future-proofing provision than we really need.

 -1.   Currently no other wire format types send version and it's not
 clear why this one is special.  We've changed the wire format versions
 before and it's upon the client to deal with those changes.

 Really?  How would you expect to do that, exactly?  In particular,
 how would you propose that a binary pg_dump file be reloadable if
 we redefine the binary format down the road without having made
 provision like this?

 Versioning one type only IMNSHO is a complete hack.

 I don't feel a need for versioning int, or float8, or most other types;
 and that includes the ones for which we've previously defined binary
 format as equivalent to text (enums).  In this case we know that we're not
 totally satisfied with the binary format we're defining today, so I think
 a type-specific escape hatch is a reasonable solution.

 Moreover, I don't especially buy tying it to server version, even if we
 had an information pathway that would provide that reliably in all
 contexts.

Why not?  Furthermore what are we doing now?  If we need a binary
format contract that needs to be separated from this discussion.

I've written (along with Andrew C) the only serious attempt to deal
with client side binary format handling (http://libpqtypes.esilo.com/)
and in all interesting cases it depends on the server version to
define binary parsing behaviors.   I agree WRT float8, etc but other
types have changed in a couple of cases and it's always been with the
version.   I find it highly unlikely that any compatibility behaviors
are going to be defined *for each and every returned datum* now, or
ever...so even if it's a few bytes lost, why do it?  Intra-version
compatibility issues should they ever have to be handled would be more
likely handled at connection- or query- time.

Point being, if an escape hatch is needed, I'm near 100% certain this
is not the right place to do it.  Binary wire format compatibility is
a complex topic and proposed solution ISTM is not at all fleshed out.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2014 - mentors, students and admins

2014-02-10 Thread Alexander Korotkov
Hi!

On Tue, Jan 28, 2014 at 9:34 PM, Thom Brown t...@linux.com wrote:

 And I'd be fine with being admin again this year, unless there's
 anyone else who would like to take up the mantle?


Thanks for your work. I would like to see you as admin this year again.

Who would be up for mentoring this year?  And are there any project
 ideas folk would like to suggest?


I would like to be mentor.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease

2014-02-10 Thread Andres Freund
On 2014-02-10 19:48:47 +0200, Heikki Linnakangas wrote:
 On 02/10/2014 06:41 PM, Andres Freund wrote:
 On 2014-02-10 11:20:30 -0500, Tom Lane wrote:
 I wrote:
 You didn't really explain why you think that ordering is necessary?
 
 Actually, after grepping to check my memory of what those fields are
 being used for, I have a bigger question: WTF is xlog.c doing being
 so friendly with the innards of LWLocks?  Surely this needs to get
 refactored so that most of WakeupWaiters() and friends is in lwlock.c.
 Or has all notion of modularity gone out the window while I wasn't
 looking?
 
 Well, it's not actually using any lwlock.c code, it's a special case
 locking logic, just reusing the datastructures. That said, I am not
 particularly happy about the amount of code it's duplicating from
 lwlock.c. Pretty much all of WALInsertSlotReleaseOne and most of
 WALInsertSlotAcquireOne() is a copied.
 
 I'm not too happy with the amount of copy-paste myself, but there was enough
 difference to regular lwlocks that I didn't want to bother all lwlocks with
 the xlog-specific stuff either. The WAL insert slots do share the
 LWLock-related PGPROC fields though, and semaphore. I'm all ears if you have
 ideas on that..

The lwlock scalability stuff has separated out the enqueue/wakeup code,
that probably should work here as well? And that's a fair portion of the
code. I think it should be doable to make that generic enough that the
actual difference of the struct doesn't matter. It'd also reduce
duplication of LWLockAcquire, ConditionalAcquire, OrWait.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Andres Freund
On 2014-02-10 11:59:53 -0600, Merlin Moncure wrote:
 On Mon, Feb 10, 2014 at 6:39 AM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-02-10 07:27:59 -0500, Andrew Dunstan wrote:
  On 02/10/2014 05:05 AM, Andres Freund wrote:
  I'd suggest making the format discernible from possible different future
  formats, to allow introducing a proper binary at some later time. Maybe
  just send a int8 first, containing the format.
  
 
  Teodor privately suggested something similar.  I was thinking of just
  sending a version byte, which for now would be '\x01'. An int8 seems like
  more future-proofing provision than we really need.
 
  Hm. Isn't that just about the same? I was thinking of the c type int8,
  not the 64bit type. It seems cleaner to do a pg_sendint(..., 1, 1) than
  to do it manually inside the string.
 
 -1.   Currently no other wire format types send version and it's not
 clear why this one is special.  We've changed the wire format versions
 before and it's upon the client to deal with those changes.  The
 server version *is* the version basically.  If a broader solution
 exists I think it should be addressed broadly.  Versioning one type
 only IMNSHO is a complete hack.

I don't find that very convincing. The entire reason jsonb exists is
because the parsing overhead of text json is significant, so it stands
to reason that soon somebody will try to work on a better wire protocol,
even if the current code cannot be made ready for 9.4. And I don't think
past instability of binary type's formats is a good reason for
*needlessly* breaking stuff like binary COPYs.
And it's not like one prefixed byte has any real-world relevant cost.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Merlin Moncure
On Mon, Feb 10, 2014 at 5:02 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-10 11:59:53 -0600, Merlin Moncure wrote:
 On Mon, Feb 10, 2014 at 6:39 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  On 2014-02-10 07:27:59 -0500, Andrew Dunstan wrote:
  On 02/10/2014 05:05 AM, Andres Freund wrote:
  I'd suggest making the format discernible from possible different future
  formats, to allow introducing a proper binary at some later time. Maybe
  just send a int8 first, containing the format.
  
 
  Teodor privately suggested something similar.  I was thinking of just
  sending a version byte, which for now would be '\x01'. An int8 seems like
  more future-proofing provision than we really need.
 
  Hm. Isn't that just about the same? I was thinking of the c type int8,
  not the 64bit type. It seems cleaner to do a pg_sendint(..., 1, 1) than
  to do it manually inside the string.

 -1.   Currently no other wire format types send version and it's not
 clear why this one is special.  We've changed the wire format versions
 before and it's upon the client to deal with those changes.  The
 server version *is* the version basically.  If a broader solution
 exists I think it should be addressed broadly.  Versioning one type
 only IMNSHO is a complete hack.

 I don't find that very convincing. The entire reason jsonb exists is
 because the parsing overhead of text json is significant, so it stands
 to reason that soon somebody will try to work on a better wire protocol,
 even if the current code cannot be made ready for 9.4. And I don't think
 past instability of binary type's formats is a good reason for
 *needlessly* breaking stuff like binary COPYs.
 And it's not like one prefixed byte has any real-world relevant cost.

The point is, why does this one type get a version id?  Imagine a
hypothetical program that sent/received the binary format for jsonb.
All you have to to is manage the version flag appropriately, right?

Wrong.  You still need to have code that checks the server version and
see if it's supported (particularly for sending) and as there is *no
protocol negotiation of the formats at present it's all going to boil
down to if version = X do Y*.   How does the server know which
'versions' are ok to send? It doesn't.  Follow along with me here:
Suppose we don't introduce a version flag today and change the format
to some more exotic structure for 9.5.  How has the version flag made
things easier for the client?  It hasn't. The client goes if version
= X do Y.

I guess you could argue that having a version flag could, say, allow
libpq clients to gracefully error out if, say, a old non-exotic-format
speaking libpq happens to connect to a newer sever -- assuming the
client actually bothered to check the flag.  That's zero help to the
client though -- regardless the compatibility isn't established and
that's zero help to other binary formats that we have=, and probably
will continue to-, change.  What about them?  Are we now, at the
upteenth hour of the final commit fest, suddenly deciding that binary
wire formats going to be compatible across versions?

The kinda low effort way to deal with binary format compatibility is
to simply document the existing formats and document format changes in
some convenient place.  The 'real' long term path to doing it IMO is
to abstract out a shared/client server type library with some protocol
negotiation features.  Then, at connection time, the client/server
agree on what's the optimal way to send things -- perhaps the client
can signal things like 'want compression for long datums'.

The only case for a version flag at the data point level is if the
server is sending version X at this tuple and version Y at that tuple.
 I don't think that's a makable case.  Some might say, what about a
compression bit based on compressibility/length? and to that I'd
answer: why is that handling specific to the json type...are
text/bytea/arrays not worth that feature too?

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Andres Freund
On 2014-02-10 17:35:12 -0600, Merlin Moncure wrote:
 Wrong.  You still need to have code that checks the server version and
 see if it's supported (particularly for sending) and as there is *no
 protocol negotiation of the formats at present it's all going to boil
 down to if version = X do Y*.   How does the server know which
 'versions' are ok to send? It doesn't.  Follow along with me here:
 Suppose we don't introduce a version flag today and change the format
 to some more exotic structure for 9.5.  How has the version flag made
 things easier for the client?  It hasn't. The client goes if version
 = X do Y.

think of binary COPY outputting data in 9.4 and then trying to import
that data into 9.5. That's the interesting case here.

 What about them?  Are we now, at the
 upteenth hour of the final commit fest, suddenly deciding that binary
 wire formats going to be compatible across versions?

It has been a concern before.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Merlin Moncure
On Mon, Feb 10, 2014 at 5:38 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-10 17:35:12 -0600, Merlin Moncure wrote:
 Wrong.  You still need to have code that checks the server version and
 see if it's supported (particularly for sending) and as there is *no
 protocol negotiation of the formats at present it's all going to boil
 down to if version = X do Y*.   How does the server know which
 'versions' are ok to send? It doesn't.  Follow along with me here:
 Suppose we don't introduce a version flag today and change the format
 to some more exotic structure for 9.5.  How has the version flag made
 things easier for the client?  It hasn't. The client goes if version
 = X do Y.

 think of binary COPY outputting data in 9.4 and then trying to import
 that data into 9.5. That's the interesting case here.

right, json could be made work, but any other format change introduced
to any other already existing type will break.  That's not a real
solution unless we decree henceforth that no formats will change from
here on in, in which case I withdraw my objection.

I think COPY binary has exactly the same set of considerations as the
client side.  If you want to operate cleanly between versions (which
has never been promised in the past), you have to encode in a header
the kinds of things the server would need to parse it properly.
Starting with, but not necessarily limited to, the encoding server's
version.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Andres Freund
On 2014-02-10 17:48:32 -0600, Merlin Moncure wrote:
 On Mon, Feb 10, 2014 at 5:38 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-02-10 17:35:12 -0600, Merlin Moncure wrote:
  Wrong.  You still need to have code that checks the server version and
  see if it's supported (particularly for sending) and as there is *no
  protocol negotiation of the formats at present it's all going to boil
  down to if version = X do Y*.   How does the server know which
  'versions' are ok to send? It doesn't.  Follow along with me here:
  Suppose we don't introduce a version flag today and change the format
  to some more exotic structure for 9.5.  How has the version flag made
  things easier for the client?  It hasn't. The client goes if version
  = X do Y.
 
  think of binary COPY outputting data in 9.4 and then trying to import
  that data into 9.5. That's the interesting case here.
 
 right, json could be made work, but any other format change introduced
 to any other already existing type will break.  That's not a real
 solution unless we decree henceforth that no formats will change from
 here on in, in which case I withdraw my objection.

Sure, it's not a full solution. But it's better than nothing, and it's
likely that we'll see breakage soonish. I don't think there's been much
recent mucking around with incompatible binary formats?

 I think COPY binary has exactly the same set of considerations as the
 client side.  If you want to operate cleanly between versions (which
 has never been promised in the past), you have to encode in a header
 the kinds of things the server would need to parse it properly.
 Starting with, but not necessarily limited to, the encoding server's
 version.

It works in enough cases atm that it's worthwile trying to keep it
working. Sure, it could be better, but it's what we have right now. Atm
it's e.g. the only realistic way to copy larger amounts of bytea between
servers without copying the entire cluster.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Merlin Moncure
On Mon, Feb 10, 2014 at 5:52 PM, Andres Freund and...@2ndquadrant.com wrote:
 It works in enough cases atm that it's worthwile trying to keep it
 working. Sure, it could be better, but it's what we have right now. Atm
 it's e.g. the only realistic way to copy larger amounts of bytea between
 servers without copying the entire cluster.

That's the thing -- it might work today, but what about tomorrow?
We'd be sending the wrong signals.  People start building processes
around all of this and now we've painted ourselves into a box.  Better
in my mind to simply educate users that this practice is dangerous and
unsupported, as we used to do. I guess until now.  It seems completely
odd to me that we're attaching a case to the jsonb type, in the wrong
way -- something that we've never attached to any other type before.
For example, why didn't we attach a version code to the json type send
function?  Wasn't the whole point of this is that jsonb send/recv be
more spiritually closer to json?  If we want to introduce alternative
type formats in the 9.5 cycle, why can't we attach version based
encoding handling to *that* problem?

The more angles I look at this from the more it looks messy and rushed.

Notwithstanding all the above, I figure here enough smart people
disagree (once again, heh) to call it consensus.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamic shared memory and locks

2014-02-10 Thread Kohei KaiGai
2014-02-08 4:52 GMT+09:00 Robert Haas robertmh...@gmail.com:
 On Tue, Jan 21, 2014 at 11:37 AM, Robert Haas robertmh...@gmail.com wrote:
 One idea I just had is to improve the dsm_toc module so that it can
 optionally set up a tranche of lwlocks for you, and provide some
 analogues of RequestAddinLWLocks and LWLockAssign for that case.  That
 would probably make this quite a bit simpler to use, at least for
 people using it with dynamic shared memory.  But I think that's a
 separate patch.

 I played with this a bit today and it doesn't actually seem to
 simplify things very much.  The backend that creates the DSM needs to
 do this:

 lwlocks = shm_toc_allocate(toc, sizeof(LWLockPadded) * nlwlocks);
 for (i = 0; i  nlwlocks; ++i)
 LWLockInitialize(lwlocks[i].lock, tranche_id);

 Since that's all of three lines, encapsulating it doesn't look all
 that helpful.  Then each backend needs to do something like this:

 static LWLockTranche mytranche;
 mytranche.name = some descriptive module name;
 mytranche.array_base = lwlocks;
 mytranche.array_stride = sizeof(LWLockPadded);
 LWLockRegisterTranche(tranche_id, mytranche);

 That's not a lot of code either, and there's no obvious way to reduce
 it much by hooking into shm_toc.

 I thought maybe we needed some cleanup when the dynamic shared memory
 segment is unmapped, but looks like we really don't.  One can do
 something like LWLockRegisterTranche(tranche_id, NULL) for debugging
 clarity, but it isn't strictly needed; one can release all lwlocks
 from the tranche or assert that none are held, but that should really
 only be a problem if the user does something like
 LWLockAcquire(lock_in_the_segment); dsm_detach(seg), because error
 recovery starts by releasing *all* lwlocks.  And if the user does it
 explicitly, I'm kinda OK with that just seg faulting.  After all, the
 user could equally well have done dsm_detach(seg);
 LWLockAcquire(lock_in_the_segment) and there's no way at all to
 cross-check for that sort of mistake.

Does it make another problem if dsm_detach() also releases lwlocks
being allocated on the dsm segment to be released?
Lwlocks being held is tracked in the held_lwlocks[] array; its length is
usually 100. In case when dsm_detach() is called towards the segment
between addr ~ (addr + length), it seems to me an easy job to walk on
this array to find out lwlocks on the range.


 I do see one thing about the status quo that does look reasonably
 annoying: the code expects that tranche IDs are going to stay
 relatively small.  For example, consider a module foobar that uses DSM
 + LWLocks.  It won't do at all for each backend, on first use of the
 foobar module, to do LWLockNewTrancheId() and then reuse that
 tranche_id repeatedly for each new DSM - because over a system
 lifetime of months, tranche IDs could grow into the millions, causing
 LWLockTrancheArray to get really big (and eventually repalloc will
 fail).  Rather, the programmer needs to ensure that
 LWLockNewTrancheId() gets called *once per postmaster lifetime*,
 perhaps by allocating a chunk of permanent shared memory and using
 that to store the tranche_id that should be used each time an
 individual backend fires up a DSM.  Considering that one of the goals
 of dynamic shared memory is to allow modules to be loaded after
 postmaster startup and still be able to do useful stuff, that's kind
 of obnoxious.  I don't have a good idea what to do about it, though;
 making LWLockTrancheArray anything other than a flat array looks
 likely to slow down --enable-dtrace builds unacceptably.

Just my rough idea. Doesn't it make sense to have an offset from the
head of DSM segment that contains the lwlock, instead of the identifier
form, to indicate a cstring datum? It does not prevent to allocate it
later; after the postmaster starting up, and here is no problem if number
of dynamic lwlocks grows millions or more.
If lwlock has a tranche_offset, not tranche_id, all it needs to do is:
1. find out either of DSM segment or regular SHM segment that contains
the supplied lwlock.
2. Calculate mapped_address + tranche_offset; that is the local location
where cstring form is put.

Probably, it needs a common utility routine on dsm.c to find out
a particular DSM segment that contains the supplied address.
(Inverse function towards dsm_segment_address)

How about my ideas?

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Andres Freund
On 2014-02-10 18:16:15 -0600, Merlin Moncure wrote:
 On Mon, Feb 10, 2014 at 5:52 PM, Andres Freund and...@2ndquadrant.com wrote:
  It works in enough cases atm that it's worthwile trying to keep it
  working. Sure, it could be better, but it's what we have right now. Atm
  it's e.g. the only realistic way to copy larger amounts of bytea between
  servers without copying the entire cluster.
 
 That's the thing -- it might work today, but what about tomorrow?
 We'd be sending the wrong signals.  People start building processes
 around all of this and now we've painted ourselves into a box.

That ship has sailed.

 Better in my mind to simply educate users that this practice is dangerous and
 unsupported, as we used to do.

But we don't have any alternatives for such scenarios, so that just
amounts to screw you. If there are good reason for just breaking
binary protocol compatibility, I can live with that, but that's really
not the case here. The additional amount of code is *miniscule*, even
after adding a real binary protocol format since all the code has to be
there for the plain send/recv functions anyway.

The amount of interesting and acceptable binary protocol changes has
gotten lower in step with the acceptance of on-disk compatibility
changes, which isn't particularly surprising.

 I guess until now.  It seems completely
 odd to me that we're attaching a case to the jsonb type, in the wrong
 way -- something that we've never attached to any other type before.
 For example, why didn't we attach a version code to the json type send
 function?  Wasn't the whole point of this is that jsonb send/recv be
 more spiritually closer to json?  If we want to introduce alternative
 type formats in the 9.5 cycle, why can't we attach version based
 encoding handling to *that* problem?

That doesn't make any sense to me. jsonb is a separate type because it
behaves differently than json. So I don't see how that plays a role
here.

And if we add a new format version in 9.5 we need to make it discernible
from the 9.4 format. Without space for a format indicator we'd have to
resort to ugly tricks like defining the high bit in the first byte set
indicates the new version. I don't see the improvement here.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 right, json could be made work, but any other format change introduced
 to any other already existing type will break.  That's not a real
 solution unless we decree henceforth that no formats will change from
 here on in, in which case I withdraw my objection.

Well, I don't recall that we've made a practice of changing binary formats
a lot.  Doing so would break existing dumps, which is something we
strenuously avoid.

Even granting that sometime in the future we invent infrastructure to do
the kind of protocol negotiation you're talking about, one byte per JSON
value seems like a cheap and worthwhile cross-check that both ends came
to the same conclusion about what to send.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Merlin Moncure
On Mon, Feb 10, 2014 at 6:24 PM, Andres Freund and...@2ndquadrant.com wrote:
 And if we add a new format version in 9.5 we need to make it discernible
 from the 9.4 format. Without space for a format indicator we'd have to
 resort to ugly tricks like defining the high bit in the first byte set
 indicates the new version. I don't see the improvement here.

Point being: a 9.5 binary format reading server could look for a magic
token in the beginning of the file which would indicate the presence
of a header.  The server could then make intelligent decisions about
reading data inside the file which would be follow exactly the same
kinds of decisions binary format consuming client code would make.
Perhaps it would be a simple check on version, or something more
complex that would involve a negotiation.  The 'format' indicator,
should version not be precise enough, needs to be in the header, not
passed with every instance of the data type, and certainly not for one
type in the absence of others.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Mon, Feb 10, 2014 at 6:24 PM, Andres Freund and...@2ndquadrant.com wrote:
 And if we add a new format version in 9.5 we need to make it discernible
 from the 9.4 format. Without space for a format indicator we'd have to
 resort to ugly tricks like defining the high bit in the first byte set
 indicates the new version. I don't see the improvement here.

 Point being: a 9.5 binary format reading server could look for a magic
 token in the beginning of the file which would indicate the presence
 of a header.  The server could then make intelligent decisions about
 reading data inside the file which would be follow exactly the same
 kinds of decisions binary format consuming client code would make.
 Perhaps it would be a simple check on version, or something more
 complex that would involve a negotiation.  The 'format' indicator,
 should version not be precise enough, needs to be in the header, not
 passed with every instance of the data type, and certainly not for one
 type in the absence of others.

Basically, you want to move the goalposts to somewhere that's not only
out of reach today, but probably a few counties away from the stadium.
I don't see this happening at all frankly, because nobody has been
interested enough to work on something like it up to now.  And I
definitely don't see it as appropriate to block improvement of jsonb
until this happens.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Merlin Moncure
On Mon, Feb 10, 2014 at 6:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 On Mon, Feb 10, 2014 at 6:24 PM, Andres Freund and...@2ndquadrant.com 
 wrote:
 And if we add a new format version in 9.5 we need to make it discernible
 from the 9.4 format. Without space for a format indicator we'd have to
 resort to ugly tricks like defining the high bit in the first byte set
 indicates the new version. I don't see the improvement here.

 Point being: a 9.5 binary format reading server could look for a magic
 token in the beginning of the file which would indicate the presence
 of a header.  The server could then make intelligent decisions about
 reading data inside the file which would be follow exactly the same
 kinds of decisions binary format consuming client code would make.
 Perhaps it would be a simple check on version, or something more
 complex that would involve a negotiation.  The 'format' indicator,
 should version not be precise enough, needs to be in the header, not
 passed with every instance of the data type, and certainly not for one
 type in the absence of others.

 Basically, you want to move the goalposts to somewhere that's not only
 out of reach today, but probably a few counties away from the stadium.
 I don't see this happening at all frankly, because nobody has been
 interested enough to work on something like it up to now.  And I
 definitely don't see it as appropriate to block improvement of jsonb
 until this happens.

That's completely unfair.  I'm arguing *not* to attach version
dependency expectations to the jsonb type, at all, not the other way
around.  If you want to do that, fine, but do it *later* as in, 9.5,
or beyond.  I just gave an example of how binary format changes could
be worked in later.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Andres Freund
On 2014-02-10 19:01:48 -0600, Merlin Moncure wrote:
 On Mon, Feb 10, 2014 at 6:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Merlin Moncure mmonc...@gmail.com writes:
  On Mon, Feb 10, 2014 at 6:24 PM, Andres Freund and...@2ndquadrant.com 
  wrote:
  And if we add a new format version in 9.5 we need to make it discernible
  from the 9.4 format. Without space for a format indicator we'd have to
  resort to ugly tricks like defining the high bit in the first byte set
  indicates the new version. I don't see the improvement here.
 
  Point being: a 9.5 binary format reading server could look for a magic
  token in the beginning of the file which would indicate the presence
  of a header.  The server could then make intelligent decisions about
  reading data inside the file which would be follow exactly the same
  kinds of decisions binary format consuming client code would make.
  Perhaps it would be a simple check on version, or something more
  complex that would involve a negotiation.  The 'format' indicator,
  should version not be precise enough, needs to be in the header, not
  passed with every instance of the data type, and certainly not for one
  type in the absence of others.
 
  Basically, you want to move the goalposts to somewhere that's not only
  out of reach today, but probably a few counties away from the stadium.
  I don't see this happening at all frankly, because nobody has been
  interested enough to work on something like it up to now.  And I
  definitely don't see it as appropriate to block improvement of jsonb
  until this happens.
 
 That's completely unfair.  I'm arguing *not* to attach version
 dependency expectations to the jsonb type, at all, not the other way
 around.  If you want to do that, fine, but do it *later* as in, 9.5,
 or beyond.  I just gave an example of how binary format changes could
 be worked in later.

Comeon. Your way requires building HEAPS of new and generic
infrastructure in 9.5 and would only work for binary copy. The proposed
way requires about two lines of code. Without the generic infrastructure
we'd end up relying on some intracacies like the meaning of high bit in
the first byte or such.

Anyway, that's it on this subthread from me,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Tom Dunstan
On 10 February 2014 20:11, Hannu Krosing ha...@krosing.net wrote:
 The fastest and lowest parsing cost format for JSON is tnetstrings
 http://tnetstrings.org/ why not use it as the binary wire format ?

 It would be as binary as it gets and still be generally parse-able by
 lots of different platforms, at leas by all of these  we care about.

If we do go down the binary encoding path in a future release, can I
please suggest *not* using something like tnetstrings, which suffers
the same problem that a few binary transport formats suffer,
particularly when they're developed by people whose native language
doesn't distinguish between byte arrays and strings - all strings are
considered byte arrays and it's up to an application to decide on
character encoding and which things are data vs strings in the
application.

This makes writing a parser in a language which does treat byte arrays
and strings differently very difficult, see e.g. the java tnetstrings
API [1] which is forced into treating strings as byte arrays until the
programmer then asks it to parse the thing again, but please treat
everything as a string this time. The msgpack people after much
wrangling have ended up issuing a new version of the protocol which
avoids this issue and which they are strongly encouraging users to
switch to, see [2] for the gory details.

While we may not ever store types in our jsonb format other than the
standard json data types (I can foresee people wanting to do it,
though), I would strongly recommend picking a format which at least is
clear that a value is a string (text, whatever), and preferably makes
it clear what the character encoding is. Or maybe it should just
follow whatever the client encoding is at the time - as long as that
is completely unambiguous to a client.

Cheers

Tom

[1] https://github.com/asinger/tnetstringsj
[2] https://github.com/msgpack/msgpack/issues/128


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Merlin Moncure
On Monday, February 10, 2014, Andres Freund and...@2ndquadrant.com wrote:

 On 2014-02-10 19:01:48 -0600, Merlin Moncure wrote:
  On Mon, Feb 10, 2014 at 6:39 PM, Tom Lane t...@sss.pgh.pa.usjavascript:;
 wrote:
   Merlin Moncure mmonc...@gmail.com javascript:; writes:
   On Mon, Feb 10, 2014 at 6:24 PM, Andres Freund 
 and...@2ndquadrant.com javascript:; wrote:
   And if we add a new format version in 9.5 we need to make it
 discernible
   from the 9.4 format. Without space for a format indicator we'd have
 to
   resort to ugly tricks like defining the high bit in the first byte
 set
   indicates the new version. I don't see the improvement here.
  
   Point being: a 9.5 binary format reading server could look for a magic
   token in the beginning of the file which would indicate the presence
   of a header.  The server could then make intelligent decisions about
   reading data inside the file which would be follow exactly the same
   kinds of decisions binary format consuming client code would make.
   Perhaps it would be a simple check on version, or something more
   complex that would involve a negotiation.  The 'format' indicator,
   should version not be precise enough, needs to be in the header, not
   passed with every instance of the data type, and certainly not for one
   type in the absence of others.
  
   Basically, you want to move the goalposts to somewhere that's not only
   out of reach today, but probably a few counties away from the stadium.
   I don't see this happening at all frankly, because nobody has been
   interested enough to work on something like it up to now.  And I
   definitely don't see it as appropriate to block improvement of jsonb
   until this happens.
 
  That's completely unfair.  I'm arguing *not* to attach version
  dependency expectations to the jsonb type, at all, not the other way
  around.  If you want to do that, fine, but do it *later* as in, 9.5,
  or beyond.  I just gave an example of how binary format changes could
  be worked in later.

 Comeon. Your way requires building HEAPS of new and generic
 infrastructure in 9.5 and would only work for binary copy. The proposed
 way requires about two lines of code. Without the generic infrastructure
 we'd end up relying on some intracacies like the meaning of high bit in
 the first byte or such.

 Anyway, that's it on this subthread from me


Fair enough.  I'll concede the point.

merlin


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Andres Freund
Hi,

Is it just me or is jsonapi.h not very well documented?

On 2014-02-06 18:47:31 -0500, Andrew Dunstan wrote:
 +/*
 + * for jsonb we always want the de-escaped value - that's what's in token
 + */
 +static void
 +jsonb_in_scalar(void *state, char *token, JsonTokenType tokentype)
 +{
 + JsonbInState *_state = (JsonbInState *) state;
 + JsonbValue  v;
 +
 + v.size = sizeof(JEntry);
 +
 + switch (tokentype)
 + {
 +
 + case JSON_TOKEN_STRING:
 + v.type = jbvString;
 + v.string.len = token ? checkStringLen(strlen(token)) : 
 0;
 + v.string.val = token ? pnstrdup(token, v.string.len) : 
 NULL;
 + v.size += v.string.len;
 + break;
 + case JSON_TOKEN_NUMBER:
 + v.type = jbvNumeric;
 + v.numeric = 
 DatumGetNumeric(DirectFunctionCall3(numeric_in, CStringGetDatum(token), 0, 
 -1));
 +
 + v.size += VARSIZE_ANY(v.numeric) +sizeof(JEntry) /* 
 alignment */ ;
missing space.

Why does + sizeof(JEntry) change anything about alignment? If it was
aligned before, adding a statically sized value doesn't give any new
guarantees about alignment?


 +/*
 + * jsonb type recv function
 + *
 + * the type is sent as text in binary mode, so this is almost the same
 + * as the input function.
 + */
 +Datum
 +jsonb_recv(PG_FUNCTION_ARGS)
 +{
 + StringInfo  buf = (StringInfo) PG_GETARG_POINTER(0);
 + text   *result = cstring_to_text_with_len(buf-data, buf-len);
 +
 + return deserialize_json_text(result);
 +}

This is a bit absurd, we're receiving a string in a StringInfo buffer,
just to copy it into text, and then in makeJsonLexContext() access the
raw chars again.

 +static void
 +putEscapedValue(StringInfo out, JsonbValue *v)
 +{
 + switch (v-type)
 + {
 + case jbvNull:
 + appendBinaryStringInfo(out, null, 4);
 + break;
 + case jbvString:
 + escape_json(out, pnstrdup(v-string.val, 
 v-string.len));
 + break;
 + case jbvBool:
 + if (v-boolean)
 + appendBinaryStringInfo(out, true, 4);
 + else
 + appendBinaryStringInfo(out, false, 5);
 + break;
 + case jbvNumeric:
 + appendStringInfoString(out, 
 DatumGetCString(DirectFunctionCall1(numeric_out, 
 PointerGetDatum(v-numeric;
 + break;
 + default:
 + elog(ERROR, unknown jsonb scalar type);
 + }
 +}

Hm, will the jbvNumeric always result in correct correct quoting?
datum_to_json() does extra hangups for that case, any reason we don't
need that here?

 +char *
 +JsonbToCString(StringInfo out, char *in, int estimated_len)
 +{
...
 + while (redo_switch || ((type = JsonbIteratorGet(it, v, false)) != 0))
 + {
 + redo_switch = false;

Not sure if I see the advantage over the goto here. A comment explaining
what the reason for the goto is wouldhave sufficed.

 + case WJB_KEY:
 + if (first == false)
 + appendBinaryStringInfo(out, , , 2);
 + first = true;
 +
 + putEscapedValue(out, v);
 + appendBinaryStringInfo(out, : , 2);

putEscapedValue doesn't gurantee only strings are output, but
datum_to_json does extra hangups for that case.

 + type = JsonbIteratorGet(it, v, false);
 + if (type == WJB_VALUE)
 + {
 + first = false;
 + putEscapedValue(out, v);
 + }
 + else
 + {
 + Assert(type == WJB_BEGIN_OBJECT || type 
 == WJB_BEGIN_ARRAY);
 + /*
 +  * We need to rerun current switch() 
 due to put
 +  * in current place object which we 
 just got
 +  * from iterator.
 +  */

due to put?

 +/*
 + * jsonb type send function
 + *
 + * Just send jsonb as a string of text
 + */
 +Datum
 +jsonb_send(PG_FUNCTION_ARGS)
 +{
 + Jsonb  *jb = PG_GETARG_JSONB(0);
 + StringInfoData buf;
 + char   *out;
 +
 + out = JsonbToCString(NULL, (JB_ISEMPTY(jb)) ? NULL : VARDATA(jb), 
 VARSIZE(jb));
 +
 + pq_begintypsend(buf);
 + pq_sendtext(buf, out, strlen(out));
 + PG_RETURN_BYTEA_P(pq_endtypsend(buf));
 +}

Why aren't you using using the stringbuf passing JsonbToCString
convention here to avoid the 

Re: [HACKERS] newlines at end of generated SQL

2014-02-10 Thread Peter Eisentraut
committed



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Andrew Dunstan


On 02/10/2014 08:50 PM, Tom Dunstan wrote:

On 10 February 2014 20:11, Hannu Krosing ha...@krosing.net wrote:

The fastest and lowest parsing cost format for JSON is tnetstrings
http://tnetstrings.org/ why not use it as the binary wire format ?

It would be as binary as it gets and still be generally parse-able by
lots of different platforms, at leas by all of these  we care about.

If we do go down the binary encoding path in a future release, can I
please suggest *not* using something like tnetstrings, which suffers
the same problem that a few binary transport formats suffer,
particularly when they're developed by people whose native language
doesn't distinguish between byte arrays and strings - all strings are
considered byte arrays and it's up to an application to decide on
character encoding and which things are data vs strings in the
application.

This makes writing a parser in a language which does treat byte arrays
and strings differently very difficult, see e.g. the java tnetstrings
API [1] which is forced into treating strings as byte arrays until the
programmer then asks it to parse the thing again, but please treat
everything as a string this time. The msgpack people after much
wrangling have ended up issuing a new version of the protocol which
avoids this issue and which they are strongly encouraging users to
switch to, see [2] for the gory details.

While we may not ever store types in our jsonb format other than the
standard json data types (I can foresee people wanting to do it,
though), I would strongly recommend picking a format which at least is
clear that a value is a string (text, whatever), and preferably makes
it clear what the character encoding is. Or maybe it should just
follow whatever the client encoding is at the time - as long as that
is completely unambiguous to a client.



Its treatment of numbers is also broken from my POV (numbers are not 
just integers or floats), so no, we're not going to use tnetstrings. 
Plus, the whole idea of us moving to text for send/recv was to save 
code, not to have to write new code, so to suggest using it now is to 
ignore the discussion that went on before.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Andrew Dunstan


On 02/10/2014 09:11 PM, Andres Freund wrote:

diff --git a/src/backend/utils/adt/jsonfuncs.c 
b/src/backend/utils/adt/jsonfuncs.c
index e1d8aae..50ddf50 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c

there's lots of whitespace/tab damage in this file. Check git log/diff
--check or such.



I don't know exactly what you're looking at. Here's what I get:

   [andrew@emma pg_jsonb]$ git diff --check master
   contrib/hstore/hstore--1.3.sql:465: trailing whitespace.
   +  WITHOUT FUNCTION AS IMPLICIT;
   contrib/hstore/hstore--1.3.sql:468: trailing whitespace.
   +  WITHOUT FUNCTION AS IMPLICIT;
   [andrew@emma pg_jsonb]$


I'll have a look at some of your other complaints when I get back home 
in a two or three of days, weather permitting.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-10 Thread Andres Freund
On 2014-02-10 22:15:21 -0500, Andrew Dunstan wrote:
 
 On 02/10/2014 09:11 PM, Andres Freund wrote:
 diff --git a/src/backend/utils/adt/jsonfuncs.c 
 b/src/backend/utils/adt/jsonfuncs.c
 index e1d8aae..50ddf50 100644
 --- a/src/backend/utils/adt/jsonfuncs.c
 +++ b/src/backend/utils/adt/jsonfuncs.c
 there's lots of whitespace/tab damage in this file. Check git log/diff
 --check or such.
 
 
 I don't know exactly what you're looking at. Here's what I get:

Sorry, forget that bit.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-10 Thread Inoue, Hiroshi

(2014/02/10 22:42), Hiroshi Inoue wrote:

(2014/02/09 8:06), Andrew Dunstan wrote:


On 02/08/2014 05:34 PM, Tom Lane wrote:

Hiroshi Inoue in...@tpf.co.jp writes:

Though I'm not a MINGW expert at all, I know dllwrap is a deprecated
tool and dlltool is almost a deprecated tool. Cygwin port is removing
the use of dllwrap and dlltool now. Isn't it better for MINGW port to
follow it?

Only way to make that happen is to prepare and test a patch ...


Yeah. Incidentally, we didn't quite get rid of dlltool for Cygwin. We
did get rid of dllwrap. But I agree this is worth trying for Mingw.


I tried MINGW port with the attached change and successfully built
src and contrib and all pararell regression tests were OK.


I forgot to mention the environment. I tried the change in 2 machines
and both worked.

1. 32bit Windows7   GCC 4.6.1
2. 32bit Windows Vista  GCC 3.4.5

I didn't test 64bit platform.

regards,
Hiroshi Inoue


--
I am using the free version of SPAMfighter.
SPAMfighter has removed 3567 of my spam emails to date.
Get the free SPAMfighter here: http://www.spamfighter.com/len

Do you have a slow PC? Try a Free scan 
http://www.spamfighter.com/SLOW-PCfighter?cid=sigen




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-10 Thread Tom Lane
Inoue, Hiroshi in...@tpf.co.jp writes:
 (2014/02/10 22:42), Hiroshi Inoue wrote:
 I tried MINGW port with the attached change and successfully built
 src and contrib and all pararell regression tests were OK.

 I forgot to mention the environment. I tried the change in 2 machines
 and both worked.

If there are no objections, I'll push this patch into HEAD tomorrow,
along with the upthread patches from Craig Ringer and Marco Atzeri.
We might as well see if this stuff is going to work ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-10 Thread Craig Ringer
On 02/11/2014 01:28 PM, Tom Lane wrote:
 If there are no objections, I'll push this patch into HEAD tomorrow,
 along with the upthread patches from Craig Ringer and Marco Atzeri.
 We might as well see if this stuff is going to work ...

I'd love to test my patch properly before pushing it, but my dev machine
is going to need a total teardown and rebuild, and I'm currently focused
on polishing off urgent open work.

So let's see what the buildfarm makes of it, I guess.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers