Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-05 Thread Marko Kreen
On Wed, Apr 04, 2012 at 06:41:00PM -0400, Tom Lane wrote:
 Marko Kreen mark...@gmail.com writes:
  On Wed, Apr 4, 2012 at 10:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Given the lack of consensus around the suspension API, maybe the best
  way to get the underlying libpq patch to a committable state is to take
  it out --- that is, remove the return zero option for row processors.
 
  Agreed.
 
 Done that way.

Minor cleanups:

* Change callback return value to be 'bool': 0 is error.
  Currently the accepted return codes are 1 and -1,
  which is weird.

  If we happen to have the 'early-exit' logic in the future,
  it should not work via callback return code.  So keeping the 0
  in reserve is unnecessary.

* Support exceptions in multi-statement PQexec() by storing
  finished result under PGconn temporarily.  Without doing it,
  the result can leak if callback longjmps while processing
  next result.

* Add a caution to the docs about permanently keeping a custom callback.

  This API fragility is also the reason why early exit (if it appears)
  should not work via the callback - instead it should be offered through a safer API.
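
For reference, a minimal sketch of what a custom row processor could look
like under the convention proposed above (return 1 on success, 0 on error,
with *errmsgp for the message).  The callback signature and the PGdataValue
struct come from the patch being discussed, not from released libpq, and
store_value() is a hypothetical application helper:

static int
my_row_processor(PGresult *res, const PGdataValue *columns,
                 const char **errmsgp, void *param)
{
    int     nfields = PQnfields(res);
    int     i;

    for (i = 0; i < nfields; i++)
    {
        /* this sketch assumes a negative len marks a NULL field and that
         * columns[i].value is not zero-terminated */
        if (columns[i].len >= 0 &&
            !store_value(param, i, columns[i].value, columns[i].len))
        {
            *errmsgp = "my_row_processor: could not store value";
            return 0;       /* error: libpq discards the rest of the result */
        }
    }
    return 1;               /* success: keep receiving rows */
}

It would be installed with PQsetRowProcessor(conn, my_row_processor, &state)
before sending the query and, per the doc caution below, restored to the
default afterwards.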

-- 
marko

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 0ec501e..1e7678c 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -5608,6 +5608,16 @@ defaultNoticeProcessor(void *arg, const char *message)
 
   <caution>
    <para>
+    It's dangerous to leave custom row processor attached to connection
+    permanently as there might be occasional queries that expect
+    regular PGresult behaviour.  So unless it is certain all code
+    is familiar with custom callback, it is better to restore default
+    row processor after query is finished.
+   </para>
+  </caution>
+
+  <caution>
+   <para>
     The row processor function sees the rows before it is known whether the
     query will succeed overall, since the server might return some rows before
     encountering an error.  For proper transactional behavior, it must be
@@ -5674,8 +5684,8 @@ typedef struct pgDataValue
    <parameter>errmsgp</parameter> is an output parameter used only for error
    reporting.  If the row processor needs to report an error, it can set
    <literal>*</literal><parameter>errmsgp</parameter> to point to a suitable message
-   string (and then return <literal>-1</literal>).  As a special case, returning
-   <literal>-1</literal> without changing <literal>*</literal><parameter>errmsgp</parameter>
+   string (and then return <literal>0</literal>).  As a special case, returning
+   <literal>0</literal> without changing <literal>*</literal><parameter>errmsgp</parameter>
    from its initial value of NULL is taken to mean <quote>out of memory</quote>.
   </para>

@@ -5702,10 +5712,10 @@ typedef struct pgDataValue

   <para>
    The row processor function must return either <literal>1</literal> or
-   <literal>-1</literal>.
+   <literal>0</literal>.
    <literal>1</literal> is the normal, successful result value; <application>libpq</application>
    will continue with receiving row values from the server and passing them to
-   the row processor.  <literal>-1</literal> indicates that the row processor has
+   the row processor.  <literal>0</literal> indicates that the row processor has
    encountered an error.  In that case,
    <application>libpq</application> will discard all remaining rows in the result set
    and then return a <literal>PGRES_FATAL_ERROR</literal> <type>PGresult</type> to
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 03fd6e4..90f6d6a 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2897,6 +2897,11 @@ closePGconn(PGconn *conn)
 		 * absent */
 	conn->asyncStatus = PGASYNC_IDLE;
 	pqClearAsyncResult(conn);	/* deallocate result */
+	if (conn->tempResult)
+	{
+		PQclear(conn->tempResult);
+		conn->tempResult = NULL;
+	}
 	pg_freeaddrinfo_all(conn->addrlist_family, conn->addrlist);
 	conn->addrlist = NULL;
 	conn->addr_cur = NULL;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 86f157c..554df94 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1028,7 +1028,7 @@ PQgetRowProcessor(const PGconn *conn, void **param)
 /*
  * pqStdRowProcessor
  *	  Add the received row to the PGresult structure
- *	  Returns 1 if OK, -1 if error occurred.
+ *	  Returns 1 if OK, 0 if error occurred.
  *
  * Note: param should point to the PGconn, but we don't actually need that
  * as of the current coding.
@@ -1059,7 +1059,7 @@ pqStdRowProcessor(PGresult *res, const PGdataValue *columns,
 	tup = (PGresAttValue *)
 		pqResultAlloc(res, nfields * sizeof(PGresAttValue), TRUE);
 	if (tup == NULL)
-		return -1;
+		return 0;
 
	for (i = 0; i < nfields; i++)
 	{
@@ -1078,7 +1078,7 @@ pqStdRowProcessor(PGresult *res, const PGdataValue *columns,
 
 			val = (char *) pqResultAlloc(res, clen + 1, isbinary);
 			if (val == NULL)
-				return -1;
+				return 0;
 
 			/* copy and zero-terminate the data (even if it's binary) */
 			memcpy(val, columns[i].value, clen);
@@ -1091,7 +1091,7 @@ pqStdRowProcessor(PGresult *res, const PGdataValue *columns,
 
 	/* And add the tuple to the 

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-05 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 Minor cleanups:

 * Change callback return value to be 'bool': 0 is error.
   Currently the accepted return codes are 1 and -1,
   which is weird.

No, I did it that way intentionally, with the thought that we might add
back return code zero (or other return codes) in the future.  If we go
with bool then sensible expansion is impossible, or at least ugly.
(I think it was you that objected to 0/1/2 in the first place, no?)

   If we happen to have the 'early-exit' logic in the future,
   it should not work via callback return code.

This assumption seems unproven to me, and even if it were,
it doesn't mean we will never have any other exit codes.

 * Support exceptions in multi-statement PQexec() by storing
   finished result under PGconn temporarily.  Without doing it,
   the result can leak if callback longjmps while processing
   next result.

I'm unconvinced this is a good thing either.

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-05 Thread Marko Kreen
On Thu, Apr 05, 2012 at 12:49:37PM -0400, Tom Lane wrote:
 Marko Kreen mark...@gmail.com writes:
  Minor cleanups:
 
  * Change callback return value to be 'bool': 0 is error.
Currently the accepted return codes are 1 and -1,
which is weird.
 
 No, I did it that way intentionally, with the thought that we might add
 back return code zero (or other return codes) in the future.  If we go
 with bool then sensible expansion is impossible, or at least ugly.
 (I think it was you that objected to 0/1/2 in the first place, no?)

Well, I was the one who added the 0/1/2 in the first place;
then I noticed that -1/0/1 works better as a triple.

But the early exit from the callback creates unnecessary fragility,
so now I'm convinced we don't want to do it that way.

If we happen to have the 'early-exit' logic in the future,
it should not work via callback return code.
 
 This assumption seems unproven to me, and even if it were,
 it doesn't mean we will never have any other exit codes.

I cannot even imagine why we would want new codes, so it seems
like an unnecessary mess in the API.

But it's a minor issue, so if you intend to keep it, I won't
push it further.

  * Support exceptions in multi-statement PQexec() by storing
finished result under PGconn temporarily.  Without doing it,
the result can leak if callback longjmps while processing
next result.
 
 I'm unconvinced this is a good thing either.

This is a less minor issue.  Do we support longjmp() or not?

Why are you convinced that it's not needed?

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-04 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 On Tue, Apr 03, 2012 at 05:32:25PM -0400, Tom Lane wrote:
 Well, there are really four levels to the API design:
 * Plain old PQexec.
 * Break down PQexec into PQsendQuery and PQgetResult.
 * Avoid waiting in PQgetResult by testing PQisBusy.
 * Avoid waiting in PQsendQuery (ie, avoid the risk of blocking
   on socket writes) by using PQisnonblocking.

 That's actually a nice overview.  I think our basic disagreement comes
 from how we map the early exit onto those modes.
 I want to think of early-exit row processing as 5th and 6th modes:

 * Row-by-row processing on a sync connection (PQsendQuery() + ???)
 * Row-by-row processing on an async connection (PQsendQuery() + ???)

 But instead you want it to work with almost no changes to the existing modes.

Well, the trouble with the proposed PQgetRow/PQrecvRow is that they only
work safely at the second API level.  They're completely unsafe to use
with PQisBusy, and I think that is a show-stopper.  In your own terms,
the 6th mode doesn't work.

More generally, it's not very safe to change the row processor while a
query is in progress.  PQskipResult can get away with doing so, but only
because the entire point of that function is to lose data, and we don't
much care whether some rows already got handled differently.  For every
other use-case, you have to set up the row processor in advance and
leave it in place, which is a guideline that PQgetRow/PQrecvRow violate.

So I think the only way to use row-by-row processing is to permanently
install a row processor that normally returns zero.  It's possible that
we could provide a predefined row processor that acts that way and
invite people to install it.  However, I think it's premature to suppose
that we know all the details of how somebody might want to use this.
In particular the notion of cloning the PGresult for each row seems
expensive and not obviously more useful than direct access to the
network buffer.  So I'd rather leave it as-is and see if any common
usage patterns arise, then add support for those patterns.

 In particular, I flat out will not accept a design in which that option
 doesn't work unless the current call came via PQisBusy, much less some
 entirely new call like PQhasRowOrResult.  It's unusably fragile (ie,
 timing sensitive) if that has to be true.

 Agreed for PQisBusy, but why is PQhasRowOrResult() fragile?

Because it breaks if you use PQisBusy *anywhere* in the application.
That's not just a bug hazard but a loss of functionality.  I think it's
important to have a pure is data available state test function that
doesn't cause data to be consumed from the connection, and there's no
way to have that if there are API functions that change the row
processor setting mid-query.  (Another way to say this is that PQisBusy
ought to be idempotent from the standpoint of the application --- we
know that it does perform work inside libpq, but it doesn't change the
state of the connection so far as the app can tell, and so it doesn't
matter if you call it zero, one, or N times between other calls.)

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-04 Thread Kyotaro HORIGUCHI
Hello, This is the new version of dblink patch.

- Calling dblink_is_busy prevents the row processor from being used.

- Some PGresult leaks fixed.

- Rebased onto current head.

 A hack on top of that hack would be to collect the data into a
 tuplestore that contains all text columns, and then convert to the
 correct rowtype during dblink_get_result, but that seems rather ugly
 and not terribly high-performance.

 What I'm currently thinking we should do is just use the old method
 for async queries, and only optimize the synchronous case.

OK, I agree with you, except for the performance issue.  I give up on using
the row processor for async queries when dblink_is_busy is called.

 I thought for awhile that this might represent a generic deficiency
 in the whole concept of a row processor, but probably it's mostly
 down to dblink's rather bizarre API.  It would be unusual I think for
 people to want a row processor that couldn't know what to do until
 after the entire query result is received.

I hope so. Thank you.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


dblink_rowproc_20120405.patch
Description: Binary data



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-04 Thread Tom Lane
Kyotaro HORIGUCHI horiguchi.kyot...@oss.ntt.co.jp writes:
 What I'm currently thinking we should do is just use the old method
 for async queries, and only optimize the synchronous case.

 OK, I agree with you, except for the performance issue.  I give up on using
 the row processor for async queries when dblink_is_busy is called.

Yeah, that seems like a reasonable idea.


Given the lack of consensus around the suspension API, maybe the best
way to get the underlying libpq patch to a committable state is to take
it out --- that is, remove the return zero option for row processors.
Since we don't have a test case for it in dblink, it's hard to escape
the feeling that we may be expending a lot of effort for something that
nobody really wants, and/or misdesigning it for lack of a concrete use
case.  Is anybody going to be really unhappy if that part of the patch
gets left behind?

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-04 Thread Marko Kreen
On Wed, Apr 4, 2012 at 10:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Given the lack of consensus around the suspension API, maybe the best
 way to get the underlying libpq patch to a committable state is to take
 it out --- that is, remove the return zero option for row processors.
 Since we don't have a test case for it in dblink, it's hard to escape
 the feeling that we may be expending a lot of effort for something that
 nobody really wants, and/or misdesigning it for lack of a concrete use
 case.  Is anybody going to be really unhappy if that part of the patch
 gets left behind?

Agreed.

-- 
marko



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-04 Thread Kyotaro HORIGUCHI
I'm afraid the latest dblink patch does not re-initialize materialize_needed
for the next query.
I will confirm that and send another one if needed in a few hours.

# I need to catch the train I usually get on..

 Hello, This is the new version of dblink patch.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-04 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 On Wed, Apr 4, 2012 at 10:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Given the lack of consensus around the suspension API, maybe the best
 way to get the underlying libpq patch to a committable state is to take
 it out --- that is, remove the return zero option for row processors.

 Agreed.

Done that way.

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-04 Thread Tom Lane
Kyotaro HORIGUCHI horiguchi.kyot...@oss.ntt.co.jp writes:
 I'm afraid the latest dblink patch does not re-initialize materialize_needed
 for the next query.
 I will confirm that and send another one if needed in a few hours.

I've committed a revised version of the previous patch.  I'm not sure
that the case of dblink_is_busy not being used is interesting enough
to justify contorting the logic, and I'm worried about introducing
corner case bugs for that.

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-04 Thread Kyotaro HORIGUCHI
  I'm afraid the latest dblink patch does not re-initialize materialize_needed
  for the next query.

I've found there is no need to worry about the re-initialization issue.

 I've committed a revised version of the previous patch.

Thank you for that. 

 I'm not sure that the case of dblink_is_busy not being used is
 interesting enough to justify contorting the logic, and I'm
 worried about introducing corner case bugs for that.

I'm afraid of ending up in an indeterminate state when sync and async
queries are mixed, or when the API call sequence for an async query
falls outside my expectations (which are rather narrow).

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-03 Thread Marko Kreen
On Sun, Apr 01, 2012 at 07:23:06PM -0400, Tom Lane wrote:
 Marko Kreen mark...@gmail.com writes:
  So the proper approach would be to have new API call, designed to
  handle it, and allow early-exit only from there.
 
  That would also avoid any breakage of old APIs.  Also it would avoid
  any accidental data loss, if the user code does not have exactly
  right sequence of calls.
 
  How about PQisBusy2(), which returns '2' when early-exit is requested?
  Please suggest something better...
 
 My proposal is way better than that.  You apparently aren't absorbing my
 point, which is that making this behavior unusable with every existing
 API (whether intentionally or by oversight) isn't an improvement.
 The row processor needs to be able to do this *without* assuming a
 particular usage style,

I don't see what kind of usage scenario you have in mind where you
early-exit without assuming anything about upper-level usage.

Could you show example code where it is useful?

The fact remains that upper-level code must cooperate with callback.
Why is it useful to hijack PQgetResult() to do so?  Especially as the
PGresult it returns is not useful in any way, and the callback still needs
to use a side channel to pass actual values to the upper level.

IMHO it's much better to remove the concept of early exit from the public
API completely and instead provide a "get"-style API that does the early exit
internally.  See below for an example.

 and most particularly it should not force people
 to use async mode.

Seems our concept of async mode is different.  I associate
PQisnonblocking() with it.  And old code, eg. PQrecvRow()
works fine in both modes.

 An alternative that I'd prefer to that one is to get rid of the
 suspension return mode altogether.

That might be a good idea.  I would prefer to postpone controversial
features instead of rushing their design.

Also - is there really a need to make the callback API ready for *all*
potential usage scenarios?  IMHO it would be better to make sure
it's good enough that higher-level and easier-to-use APIs can
be built on top of it.  And that's it.

 However, that leaves us needing
 to document what it means to longjmp out of a row processor without
 having any comparable API concept, so I don't really find it better.

Why do you think any new API concept is needed?  Why is the following rule
not enough (for a sync connection):

  After an exception you need to call PQgetResult() / PQfinish().

The only thing that needs fixing is storing lastResult under PGconn
to support exceptions for PQexec().  (If we need to support
exceptions everywhere.)

---

Again, note that if we would provide PQrecvRow()-style API, then we *don't*
need early-exit in callback API.  Nor exceptions...

If the current PQrecvRow() / PQgetRow() are too bloated for you, then how about
a thin wrapper around PQisBusy():


/* 0 - need more data, 1 - have result, 2 - have row */
int PQhasRowOrResult(PGconn *conn, PGresult **hdr, PGrowValue **cols)
{
    int gotrow = 0;
    PQrowProcessor oldproc;
    void *oldarg;
    int busy;

    /* call PQisBusy with our callback */
    oldproc = PQgetRowProcessor(conn, &oldarg);
    PQsetRowProcessor(conn, hasrow_cb, &gotrow);
    busy = PQisBusy(conn);
    PQsetRowProcessor(conn, oldproc, oldarg);

    if (gotrow)
    {
        *hdr = conn->result;
        *cols = conn->rowBuf;
        return 2;
    }
    return busy ? 0 : 1;
}

static int hasrow_cb(PGresult *res, PGrowValue *columns, void *param)
{
    int *gotrow = param;
    *gotrow = 1;
    return 0;
}
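
For illustration, a caller of this hypothetical PQhasRowOrResult() could look
roughly like the sketch below.  It assumes the wrapper above, the patch's
PGrowValue type, and hypothetical handle_row() / wait_for_socket() helpers;
PQsendQuery(), PQsocket(), PQconsumeInput() and PQgetResult() are standard
libpq:

static int
run_query_rowwise(PGconn *conn, const char *sql)
{
    PGresult   *hdr;
    PGrowValue *cols;
    PGresult   *res;

    if (!PQsendQuery(conn, sql))
        return -1;

    for (;;)
    {
        int st = PQhasRowOrResult(conn, &hdr, &cols);

        if (st == 2)
        {
            handle_row(hdr, cols);      /* hypothetical per-row handler */
            continue;
        }
        if (st == 1)
            break;                      /* a finished PGresult is pending */

        /* st == 0: wait for and read more data from the server */
        if (!wait_for_socket(PQsocket(conn)) || !PQconsumeInput(conn))
            return -1;
    }

    while ((res = PQgetResult(conn)) != NULL)
    {
        /* check PQresultStatus(res) for errors here */
        PQclear(res);
    }
    return 0;
}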



Also, instead of hasrow_cb(), we could have an integer flag under PGconn that
getAnotherTuple() checks, thus getting rid of it even in the internal
callback API.

Yes, it requires an async-style usage pattern, but it works for both
sync and async connections.  And it can be used to build an even
simpler API on top of it.

Summary: it would be better to keep early exit an internal detail,
as the actual feature needed is processing rows outside of the callback.
So why not provide an API for that?

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-03 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 The fact remains that upper-level code must cooperate with callback.
 Why is it useful to hijack PQgetResult() to do so?

Because that's the defined communications channel.  We're not
hijacking it.  If we're going to start using pejorative words here,
I will assert that your proposal hijacks PQisBusy to make it do
something substantially different than the traditional understanding
of it (more about that below).

 Seems our concept of async mode is different.  I associate
 PQisnonblocking() with it.

Well, there are really four levels to the API design:

* Plain old PQexec.
* Break down PQexec into PQsendQuery and PQgetResult.
* Avoid waiting in PQgetResult by testing PQisBusy.
* Avoid waiting in PQsendQuery (ie, avoid the risk of blocking
  on socket writes) by using PQisnonblocking.

Any given app might choose to run at any one of those four levels,
although the first one probably isn't interesting for an app that would
care about using a suspending row processor.  But I see no reason that
we should design the suspension feature such that it doesn't work at the
second level.  PQisBusy is, and should be, an optional state-testing
function, not something that changes the set of things you can do with
the connection.

 IMHO it's much better to remove the concept of early exit from the public
 API completely and instead provide a "get"-style API that does the early exit
 internally.

If the row processor has an early-exit option, it hardly seems
reasonable to me to claim that that's not part of the public API.

In particular, I flat out will not accept a design in which that option
doesn't work unless the current call came via PQisBusy, much less some
entirely new call like PQhasRowOrResult.  It's unusably fragile (ie,
timing sensitive) if that has to be true.

There's another way in which I think your proposal breaks the existing
API, which is that without an internal PQASYNC_SUSPENDED state that is
cleared by PQgetResult, it's unsafe to probe PQisBusy multiple times
before doing something useful.  That shouldn't be the case.  PQisBusy
is supposed to test whether data is ready for the application to do
something with --- it should not throw away data until a separate call
has been made to consume the data.  Changing its behavior like that
would risk creating subtle bugs in existing event-loop logic.  A
possibly useful analogy is that select() and poll() don't clear a
socket's read-ready state, you have to do a separate read() to do that.
There are good reasons for that separation of actions.

 Again, note that if we would provide PQrecvRow()-style API, then we *don't*
 need early-exit in callback API.  Nor exceptions...

AFAICS, we *do* need exceptions for dblink's usage.  So most of what's
at stake here is not avoidable, unless you're proposing we put this
whole set of patches off till 9.3 so we can think it over some more.

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-03 Thread Marko Kreen
On Tue, Apr 03, 2012 at 05:32:25PM -0400, Tom Lane wrote:
 Marko Kreen mark...@gmail.com writes:
  The fact remains that upper-level code must cooperate with callback.
  Why is it useful to hijack PQgetResult() to do so?
 
 Because that's the defined communications channel.  We're not
 hijacking it.  If we're going to start using pejorative words here,
 I will assert that your proposal hijacks PQisBusy to make it do
 something substantially different than the traditional understanding
 of it (more about that below).

And I would agree with that - indeed, I already did:

Seems we both lost sight of actual usage scenario for the early-exit
logic - that both callback and upper-level code *must* cooperate
for it to be useful.  Instead, we designed API for non-cooperating case,
which is wrong.

  Seems our concept of async mode is different.  I associate
  PQisnonblocking() with it.
 
 Well, there are really four levels to the API design:
 
   * Plain old PQexec.
   * Break down PQexec into PQsendQuery and PQgetResult.
   * Avoid waiting in PQgetResult by testing PQisBusy.
   * Avoid waiting in PQsendQuery (ie, avoid the risk of blocking
 on socket writes) by using PQisnonblocking.
 
 Any given app might choose to run at any one of those four levels,
 although the first one probably isn't interesting for an app that would
 care about using a suspending row processor.  But I see no reason that
 we should design the suspension feature such that it doesn't work at the
 second level.  PQisBusy is, and should be, an optional state-testing
 function, not something that changes the set of things you can do with
 the connection.

That's actually a nice overview.  I think our basic disagreement comes
from how we map the early exit onto those modes.

I want to think of early-exit row processing as 5th and 6th modes:

* Row-by-row processing on a sync connection (PQsendQuery() + ???)
* Row-by-row processing on an async connection (PQsendQuery() + ???)

But instead you want it to work with almost no changes to the existing modes.

And I don't like it because, as I've said previously, the upper
level must know about the callback and handle it properly, so it does
not make sense to set the existing loop structure in stone.

Instead we should design a new mode that is logical for the user
and also logical inside libpq.

  IMHO it's much better to remove the concept of early exit from the public
  API completely and instead provide a "get"-style API that does the early exit
  internally.
 
 If the row processor has an early-exit option, it hardly seems
 reasonable to me to claim that that's not part of the public API.

Please keep in mind the final goal - nobody is interested in
early exit on its own; the interesting goal is processing rows
outside of libpq.

And the early-exit return code is a clumsy hack to make that possible
with callbacks.

But why should we provide a complex way of achieving something
when we can provide an easy and direct way?

 In particular, I flat out will not accept a design in which that option
 doesn't work unless the current call came via PQisBusy, much less some
 entirely new call like PQhasRowOrResult.  It's unusably fragile (ie,
 timing sensitive) if that has to be true.

Agreed for PQisBusy, but why is PQhasRowOrResult() fragile?

It's easy to make PQhasRowOrResult more robust - let it set a flag under
PGconn and let getAnotherTuple() do the early exit on its own, thus keeping
the callback completely out of the loop and avoiding any chance that a user
callback can accidentally trigger the behaviour.

 There's another way in which I think your proposal breaks the existing
 API, which is that without an internal PQASYNC_SUSPENDED state that is
 cleared by PQgetResult, it's unsafe to probe PQisBusy multiple times
 before doing something useful.  That shouldn't be the case.  PQisBusy
 is supposed to test whether data is ready for the application to do
 something with --- it should not throw away data until a separate call
 has been made to consume the data.  Changing its behavior like that
 would risk creating subtle bugs in existing event-loop logic.  A
 possibly useful analogy is that select() and poll() don't clear a
 socket's read-ready state, you have to do a separate read() to do that.
 There are good reasons for that separation of actions.

That does look like an argument for new modes for row-by-row processing,
preferably with new API calls.

Because adding magic to existing APIs seems to only cause confusion.

  Again, note that if we would provide PQrecvRow()-style API, then we *don't*
  need early-exit in callback API.  Nor exceptions...
 
 AFAICS, we *do* need exceptions for dblink's usage.  So most of what's
 at stake here is not avoidable, unless you're proposing we put this
 whole set of patches off till 9.3 so we can think it over some more.

My point was that if we have an API for processing rows outside libpq,
the need for exceptions for callbacks is mostly gone.  Converting dblink
to PQhasRowOrResult() 

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-01 Thread Tom Lane
I've been thinking some more about the early-termination cases (where
the row processor returns zero or longjmps), and I believe I have got
proposals that smooth off most of the rough edges there.

First off, returning zero is really pretty unsafe as it stands, because
it only works more-or-less-sanely if the connection is being used in
async style.  If the row processor returns zero within a regular
PQgetResult call, that will cause PQgetResult to block waiting for more
input.  Which might not be forthcoming, if we're in the last bufferload
of a query response from the server.  Even in the async case, I think
it's a bad design to have PQisBusy return true when the row processor
requested stoppage.  In that situation, there is work available for the
outer application code to do, whereas normally PQisBusy == true means
we're still waiting for the server.

I think we can fix this by introducing a new PQresultStatus, called say
PGRES_SUSPENDED, and having PQgetResult return an empty PGresult with
status PGRES_SUSPENDED after the row processor has returned zero.
Internally, there'd also be a new asyncStatus PGASYNC_SUSPENDED,
which we'd set before exiting from the getAnotherTuple call.  This would
cause PQisBusy and PQgetResult to do the right things.  In PQgetResult,
we'd switch back to PGASYNC_BUSY state after producing a PGRES_SUSPENDED
result, so that subsequent calls would resume parsing input.

With this design, a suspending row processor can be used safely in
either async or non-async mode.  It does cost an extra PGresult creation
and deletion per cycle, but that's not much more than a malloc and free.

Also, we can document that a longjmp out of the row processor leaves the
library in the same state as if the row processor had returned zero and
a PGRES_SUSPENDED result had been returned to the application; which
will be a true statement in all cases, sync or async.
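
To make the proposal concrete, the application loop it implies would look
roughly like the sketch below.  PGRES_SUSPENDED is only the status value
proposed in this mail (it was never committed), and consume_stashed_row()
stands in for however the suspending row processor hands its row to the
application:

static void
drain_suspending_query(PGconn *conn)
{
    PGresult *res;

    PQsendQuery(conn, "SELECT ...");

    while ((res = PQgetResult(conn)) != NULL)
    {
        if (PQresultStatus(res) == PGRES_SUSPENDED)
        {
            /* the row processor returned zero: one row is waiting for us */
            consume_stashed_row();      /* hypothetical application hook */
            PQclear(res);
            continue;                   /* the next PQgetResult resumes parsing */
        }

        /* normal end-of-query result: check status, then clean up */
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
            report_error(PQresultErrorMessage(res));    /* hypothetical */
        PQclear(res);
    }
}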

I also mentioned earlier that I wasn't too thrilled with the design of
PQskipResult; in particular that it would encourage application writers
to miss server-sent error results, which would inevitably be a bad idea.
I think what we ought to do is define (and implement) it as being
exactly equivalent to PQgetResult, except that it temporarily installs
a dummy row processor so that data rows are discarded rather than
accumulated.  Then, the documented way to clean up after deciding to
abandon a suspended query will be to do PQskipResult until it returns
null, paying normal attention to any result statuses other than
PGRES_TUPLES_OK.  This is still not terribly helpful for async-mode
applications, but what they'd probably end up doing is installing their
own dummy row processors and then flushing results as part of their
normal outer loop.  The only thing we could do for them is to expose
a dummy row processor, which seems barely worthwhile given that it's
a one-line function.
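
Under that definition, abandoning a suspended query would reduce to a loop
like the sketch below (PQskipResult as described here is proposed API, not
something in released libpq; handle_error_status() is hypothetical):

    PGresult *r;

    while ((r = PQskipResult(conn)) != NULL)
    {
        /* data rows are discarded by the dummy row processor, but any
         * status other than PGRES_TUPLES_OK still deserves attention */
        if (PQresultStatus(r) != PGRES_TUPLES_OK)
            handle_error_status(r);
        PQclear(r);
    }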

I remain of the opinion that PQgetRow/PQrecvRow aren't adding much
usability-wise.

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-01 Thread Marko Kreen
On Sun, Apr 01, 2012 at 05:51:19PM -0400, Tom Lane wrote:
 I've been thinking some more about the early-termination cases (where
 the row processor returns zero or longjmps), and I believe I have got
 proposals that smooth off most of the rough edges there.
 
 First off, returning zero is really pretty unsafe as it stands, because
 it only works more-or-less-sanely if the connection is being used in
 async style.  If the row processor returns zero within a regular
 PQgetResult call, that will cause PQgetResult to block waiting for more
 input.  Which might not be forthcoming, if we're in the last bufferload
 of a query response from the server.  Even in the async case, I think
 it's a bad design to have PQisBusy return true when the row processor
 requested stoppage.  In that situation, there is work available for the
 outer application code to do, whereas normally PQisBusy == true means
 we're still waiting for the server.
 
 I think we can fix this by introducing a new PQresultStatus, called say
 PGRES_SUSPENDED, and having PQgetResult return an empty PGresult with
 status PGRES_SUSPENDED after the row processor has returned zero.
 Internally, there'd also be a new asyncStatus PGASYNC_SUSPENDED,
 which we'd set before exiting from the getAnotherTuple call.  This would
 cause PQisBusy and PQgetResult to do the right things.  In PQgetResult,
 we'd switch back to PGASYNC_BUSY state after producing a PGRES_SUSPENDED
 result, so that subsequent calls would resume parsing input.
 
 With this design, a suspending row processor can be used safely in
 either async or non-async mode.  It does cost an extra PGresult creation
 and deletion per cycle, but that's not much more than a malloc and free.

I added extra magic to PQisBusy(); you are adding extra magic to
PQgetResult().  Not much difference.

Seems we both lost sight of actual usage scenario for the early-exit
logic - that both callback and upper-level code *must* cooperate
for it to be useful.  Instead, we designed API for non-cooperating case,
which is wrong.

So the proper approach would be to have new API call, designed to
handle it, and allow early-exit only from there.

That would also avoid any breakage of old APIs.  Also it would avoid
any accidental data loss, if the user code does not have exactly
right sequence of calls.

How about PQisBusy2(), which returns '2' when early-exit is requested?
Please suggest something better...


-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-04-01 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 Seems we both lost sight of actual usage scenario for the early-exit
 logic - that both callback and upper-level code *must* cooperate
 for it to be useful.  Instead, we designed API for non-cooperating case,
 which is wrong.

Exactly.  So you need an extra result state, or something isomorphic.

 So the proper approach would be to have new API call, designed to
 handle it, and allow early-exit only from there.

 That would also avoid any breakage of old APIs.  Also it would avoid
 any accidental data loss, if the user code does not have exactly
 right sequence of calls.

 How about PQisBusy2(), which returns '2' when early-exit is requested?
 Please suggest something better...

My proposal is way better than that.  You apparently aren't absorbing my
point, which is that making this behavior unusable with every existing
API (whether intentionally or by oversight) isn't an improvement.
The row processor needs to be able to do this *without* assuming a
particular usage style, and most particularly it should not force people
to use async mode.

An alternative that I'd prefer to that one is to get rid of the
suspension return mode altogether.  However, that leaves us needing
to document what it means to longjmp out of a row processor without
having any comparable API concept, so I don't really find it better.

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-31 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 On Thu, Mar 29, 2012 at 06:56:30PM -0400, Tom Lane wrote:
 Yeah.  Perhaps we should tweak the row-processor callback API so that
 it gets an explicit notification that this is a new resultset.
 Duplicating PQexec's behavior would then involve having the dblink row
 processor throw away any existing tuplestore and start over when it
 gets such a call.
 
 There's multiple ways to express that but the most convenient thing
 from libpq's viewpoint, I think, is to have a callback that occurs
 immediately after collecting a RowDescription message, before any
 rows have arrived.  So maybe we could express that as a callback
 with valid res but columns set to NULL?
 
 A different approach would be to add a row counter to the arguments
 provided to the row processor; then you'd know a new resultset had
 started if you saw rowcounter == 0.  This might have another advantage
 of not requiring the row processor to count the rows for itself, which
 I think many row processors would otherwise have to do.

I had been leaning towards the second approach with a row counter,
because it seemed cleaner, but I thought of another consideration that
makes the first way seem better.  Suppose that your row processor has to
do some setup work at the start of a result set, and that work needs to
see the resultset properties (eg number of columns) so it can't be done
before starting PQgetResult.  In the patch as submitted, the
only way to manage that is to keep enough state to recognize that the
current row processor call is the first one, which we realized is
inadequate for multiple-result-set cases.  With a row counter argument
you can do the setup whenever rowcount == 0, which fixes that.  But
neither of these methods deals correctly with an empty result set!
To make that work, you need to add extra logic after the PQgetResult
call to do the setup work the row processor should have done but never
got a chance to.  So that's ugly, and it makes for an easy-to-miss bug.
A call that occurs when we receive RowDescription, independently of
whether the result set contains any rows, makes this a lot cleaner.
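
In code, the convention argued for here would let a row processor do its
per-result-set setup as in the sketch below.  This is hypothetical: the
"call with valid res but columns == NULL" notification is only a proposal
at this point, the signature is abbreviated, and reset_state() / store_row()
are stand-ins for application code:

static int
my_row_processor(PGresult *res, const PGdataValue *columns, void *param)
{
    if (columns == NULL)
    {
        /* Start of a new result set: RowDescription has just arrived, so
         * PQnfields(res) etc. are valid even if no data rows ever follow. */
        reset_state(param, PQnfields(res));
        return 1;
    }

    /* ordinary per-row work */
    return store_row(param, res, columns);
}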

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-30 Thread Marko Kreen
On Thu, Mar 29, 2012 at 06:56:30PM -0400, Tom Lane wrote:
 Marko Kreen mark...@gmail.com writes:
  My conclusion is that row-processor API is low-level expert API and
  quite easy to misuse.  It would be preferable to have something more
  robust as end-user API, the PQgetRow() is my suggestion for that.
  Thus I see 3 choices:
 
  1) Push row-processor as main API anyway and describe all dangerous
 scenarios in documentation.
  2) Have both PQgetRow() and row-processor available in libpq-fe.h,
 PQgetRow() as preferred API and row-processor for expert usage,
 with proper documentation what works and what does not.
  3) Have PQgetRow() in libpq-fe.h, move row-processor to libpq-int.h.
 
 I still am failing to see the use-case for PQgetRow.  ISTM the entire
 point of a special row processor is to reduce the per-row processing
 overhead, but PQgetRow greatly increases that overhead.

No, decreasing CPU overhead is a minor win.  I guess in a realistic
application, like dblink, you can't even measure the difference.

The *major* win comes from avoiding buffering all rows in a PGresult.
Of course, this is noticeable only with bigger result sets.
Such buffering pessimizes memory usage: the code always
works on a cold cache.  And simply keeping RSS low is good for the
long-term health of a process.

The second major win is avoiding the need to use a cursor with small chunks
to access a result set of unknown size, which stalls the application
until the next block arrives from the network.
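
For contrast, the workaround referred to here is the usual chunked-cursor
pattern, sketched below with plain libpq and SQL (bigtable is a placeholder
and error handling is trimmed); every FETCH is a round trip that stalls the
application until the next chunk arrives:

static void
fetch_in_chunks(PGconn *conn)
{
    PGresult *res;

    PQclear(PQexec(conn, "BEGIN"));
    PQclear(PQexec(conn, "DECLARE c NO SCROLL CURSOR FOR SELECT * FROM bigtable"));

    for (;;)
    {
        res = PQexec(conn, "FETCH 1000 FROM c");
        if (PQresultStatus(res) != PGRES_TUPLES_OK || PQntuples(res) == 0)
        {
            PQclear(res);
            break;
        }
        /* process PQntuples(res) rows here ... */
        PQclear(res);
    }

    PQclear(PQexec(conn, "CLOSE c"));
    PQclear(PQexec(conn, "COMMIT"));
}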


The PGresult *PQgetRow() is for applications that do not convert
rows immediately to some internal format, but keep using PGresult.
They can then be converted to row-by-row processing with minimal
changes to the actual code.

Note that PGrowValue is a temporary struct that the application *must*
move data away from.  If the app internally uses PGresult, then it's
pretty annoying to have to invent a new internal format for long-term
storage.

But maybe I'm overestimating the number of such applications.

 And it doesn't
 reduce complexity much either IMO: you still have all the primary risk
 factors arising from processing rows in advance of being sure that the
 whole query completed successfully.

It avoids the complexity of:

* How to pass an error from the callback to upper-level code

* Needing to know how exceptions behave

* How to use early exit to pass rows to upper-level code one by one
  (by storing the PGresult and PGrowValue in a temporary place
   and checking their values later)

* How to detect that a new result set has started (keeping track
  of the previous PGresult, or noting some quirky API behaviour
  we may invent for such a case)

* Needing to make sure the callback does not leak to call sites
  that expect regular libpq behaviour.
  (Always call PQregisterRowProcessor(db, NULL, NULL) after the query finishes.)
  [But now I'm in an exception handler - how do I find the connection?]


I've now reviewed the callback code and even done some coding with it,
and IMHO it's too low-level to be the end-user API.

Yes, the query-may-still-fail complexity remains, but that's not unlike
the usual multi-statement-transaction-is-not-guaranteed-to-succeed
complexity.

Another complexity that remains is how-to-skip-the-current-resultset,
but that is a problem only on sync connections and the answer is
simple - call PQgetResult().  Or call PQgetRow/PQrecvRow if the
user wants to avoid buffering.

 Plus it conflates no more data
 with there was an error receiving the data or there was an error on
 the server side.

Well, the current PQgetRow() is written in the style of: return only a single-row
PGresult; to see errors the user must call PQgetResult().  Basically,
the user is forced to fall back to the familiar libpq usage pattern.

It can be changed so that PQgetRow() also returns errors.

Or we can drop it and just keep PQrecvRow().

 PQrecvRow alleviates the per-row-overhead aspect of
 that but doesn't really do a thing from the complexity standpoint;
 it doesn't look to me to be noticeably easier to use than a row
 processor callback.

 I think PQgetRow and PQrecvRow just add more API calls without making
 any fundamental improvements, and so we could do without them.  There's
 more than one way to do it is not necessarily a virtue.

Please re-read the above list of problematic situations that this API
fixes.  Then, if you still think that PQrecvRow() is pointless, sure,
let's drop it.

We can also postpone it to 9.3 and poll users on whether they want
an easier API or whether maximum performance is important.  (PQrecvRow()
*does* have a few cycles of overhead compared to callbacks.)

The only option that we have on the table for 9.2 but not later
is moving the callback API to libpq-int.h.

  Second conclusion is that current dblink row-processor usage is broken
  when user uses multiple SELECTs in SQL as dblink uses plain PQexec().
 
 Yeah.  Perhaps we should tweak the row-processor callback API so that
 it gets an explicit notification that this is a new resultset.
 Duplicating PQexec's behavior would then involve having the dblink row
 

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-30 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 On Thu, Mar 29, 2012 at 06:56:30PM -0400, Tom Lane wrote:
 Marko Kreen mark...@gmail.com writes:
 Second conclusion is that current dblink row-processor usage is broken
 when user uses multiple SELECTs in SQL as dblink uses plain PQexec().

 Yeah.  Perhaps we should tweak the row-processor callback API so that
 it gets an explicit notification that this is a new resultset.
 Duplicating PQexec's behavior would then involve having the dblink row
 processor throw away any existing tuplestore and start over when it
 gets such a call.
 
 There's multiple ways to express that but the most convenient thing
 from libpq's viewpoint, I think, is to have a callback that occurs
 immediately after collecting a RowDescription message, before any
 rows have arrived.  So maybe we could express that as a callback
 with valid res but columns set to NULL?
 
 A different approach would be to add a row counter to the arguments
 provided to the row processor; then you'd know a new resultset had
 started if you saw rowcounter == 0.  This might have another advantage
 of not requiring the row processor to count the rows for itself, which
 I think many row processors would otherwise have to do.

 Try to imagine how final documentation will look like.

 Then imagine documentation for PGrecvRow() / PQgetRow().

What's your point, exactly?  PGrecvRow() / PQgetRow() aren't going to
make that any better as currently defined, because there's noplace to
indicate this is a new resultset in those APIs either.

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-30 Thread Marko Kreen
On Fri, Mar 30, 2012 at 11:59:12AM -0400, Tom Lane wrote:
 Marko Kreen mark...@gmail.com writes:
  On Thu, Mar 29, 2012 at 06:56:30PM -0400, Tom Lane wrote:
  Marko Kreen mark...@gmail.com writes:
  Second conclusion is that current dblink row-processor usage is broken
  when user uses multiple SELECTs in SQL as dblink uses plain PQexec().
 
  Yeah.  Perhaps we should tweak the row-processor callback API so that
  it gets an explicit notification that this is a new resultset.
  Duplicating PQexec's behavior would then involve having the dblink row
  processor throw away any existing tuplestore and start over when it
  gets such a call.
  
  There's multiple ways to express that but the most convenient thing
  from libpq's viewpoint, I think, is to have a callback that occurs
  immediately after collecting a RowDescription message, before any
  rows have arrived.  So maybe we could express that as a callback
  with valid res but columns set to NULL?
  
  A different approach would be to add a row counter to the arguments
  provided to the row processor; then you'd know a new resultset had
  started if you saw rowcounter == 0.  This might have another advantage
  of not requiring the row processor to count the rows for itself, which
  I think many row processors would otherwise have to do.
 
  Try to imagine how final documentation will look like.
 
  Then imagine documentation for PGrecvRow() / PQgetRow().
 
 What's your point, exactly?  PGrecvRow() / PQgetRow() aren't going to
 make that any better as currently defined, because there's noplace to
 indicate this is a new resultset in those APIs either.

Have you looked at the examples?  PQgetResult() is pretty good hint
that one resultset finished...

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-30 Thread Marko Kreen
On Fri, Mar 30, 2012 at 7:04 PM, Marko Kreen mark...@gmail.com wrote:
 Have you looked at the examples?  PQgetResult() is pretty good hint
 that one resultset finished...

OK, the demos are scattered around this long thread and hard to find,
so here is a summary of links:

Original design mail:

  http://archives.postgresql.org/message-id/20120224154616.ga16...@gmail.com

First patch with quick demos:

  http://archives.postgresql.org/message-id/20120226221922.ga6...@gmail.com

Demos as diff:

  http://archives.postgresql.org/message-id/20120324002224.ga19...@gmail.com

Demos/experiments/tests (bit messier than the demos-as-diffs):

  https://github.com/markokr/libpq-rowproc-demos

Note - the point is that the user *must* call PQgetResult() when a result set ends.
Hence also the decision that PQgetRow() does not return errors.

I'll put this mail on the commitfest page too; it seems I've forgotten to
put some mails there.

-- 
marko



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-30 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 On Wed, Mar 07, 2012 at 03:14:57PM +0900, Kyotaro HORIGUCHI wrote:
 My suggestion - check in getAnotherTuple whether resultStatus is
 already error and do nothing then.  This allows internal pqAddRow
 to set regular out of memory error.  Otherwise give generic
 row processor error.

 Current implement seems already doing this in
 parseInput3(). Could you give me further explanation?

 The suggestion was about getAnotherTuple() - currently it sets
 always error in row processor.  With such check, the callback
 can set the error result itself.  Currently only callbacks that
 live inside libpq can set errors, but if we happen to expose
 error-setting function in outside API, then the getAnotherTuple()
 would already be ready for it.

I'm pretty dissatisfied with the error reporting situation for row
processors.  You can't just decide not to solve it, which seems to be
the current state of affairs.  What I'm inclined to do is to add a
char ** parameter to the row processor, and say that when the
processor returns -1 it can store an error message string there.
If it does so, that's what we report.  If it doesn't (which we'd detect
by presetting the value to NULL), then use a generic error in row
processor message.  This is cheap and doesn't prevent the row processor
from using some application-specific error reporting method if it wants;
but it does allow the processor to make use of libpq's error mechanism
when that's preferable.
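
A row processor using that mechanism would then look roughly like this
(sketch only: the signature and the -1 convention follow the patch as
described in this mail, copy_row() is a hypothetical helper, and the message
is a constant string that libpq never has to free):

static int
reporting_row_processor(PGresult *res, const PGdataValue *columns,
                        const char **errmsgp, void *param)
{
    if (!copy_row(param, res, columns))
    {
        *errmsgp = "row processor: could not convert row";
        return -1;      /* libpq reports this message in the PGresult */
    }
    return 1;
}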

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-30 Thread Marko Kreen
On Fri, Mar 30, 2012 at 05:18:42PM -0400, Tom Lane wrote:
 Marko Kreen mark...@gmail.com writes:
  On Wed, Mar 07, 2012 at 03:14:57PM +0900, Kyotaro HORIGUCHI wrote:
  My suggestion - check in getAnotherTuple whether resultStatus is
  already error and do nothing then.  This allows internal pqAddRow
  to set regular out of memory error.  Otherwise give generic
  row processor error.
 
  Current implement seems already doing this in
  parseInput3(). Could you give me further explanation?
 
  The suggestion was about getAnotherTuple() - currently it sets
  always error in row processor.  With such check, the callback
  can set the error result itself.  Currently only callbacks that
  live inside libpq can set errors, but if we happen to expose
  error-setting function in outside API, then the getAnotherTuple()
  would already be ready for it.
 
 I'm pretty dissatisfied with the error reporting situation for row
 processors.  You can't just decide not to solve it, which seems to be
 the current state of affairs.  What I'm inclined to do is to add a
 char ** parameter to the row processor, and say that when the
 processor returns -1 it can store an error message string there.
 If it does so, that's what we report.  If it doesn't (which we'd detect
 by presetting the value to NULL), then use a generic error in row
 processor message.  This is cheap and doesn't prevent the row processor
 from using some application-specific error reporting method if it wants;
 but it does allow the processor to make use of libpq's error mechanism
 when that's preferable.

Yeah.

But such API seems to require specifying allocator, which seems ugly.
I think it would be better to just use Kyotaro's original idea
of PQsetRowProcessorError(), which is nicer to use.

A few thoughts on the issue:

--

As libpq already provides quite good coverage of PGresult
manipulation APIs, how about:

  void PQsetResultError(PGresult *res, const char *msg);

that does:

  res->errMsg = pqResultStrdup(msg);
  res->resultStatus = PGRES_FATAL_ERROR;

That would also cause minimal fuss in getAnotherTuple().



I would actually like even more:

  void PQsetConnectionError(PGconn *conn, const char *msg);

that does the full-blown libpq error logic.  Thus it would be
useful everywhere in libpq.  But it seems a bit too disruptive,
so I would like an ACK from somebody who knows libpq better.
(Well, from you...)

---

Another thought - if we had an API to set the error from *outside*
the row-processor callback, that would immediately solve the
"how to skip an incoming result set without buffering it" problem.

And it would be usable for PQgetRow()/PQrecvRow() too.

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-30 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 On Fri, Mar 30, 2012 at 05:18:42PM -0400, Tom Lane wrote:
 I'm pretty dissatisfied with the error reporting situation for row
 processors.  You can't just decide not to solve it, which seems to be
 the current state of affairs.  What I'm inclined to do is to add a
 char ** parameter to the row processor, and say that when the
 processor returns -1 it can store an error message string there.

 But such API seems to require specifying allocator, which seems ugly.

Not if the message is a constant string, which seems like the typical
situation (think out of memory).  If the row processor does need a
buffer for a constructed string, it could make use of some space in its
void *param area, for instance.

 I think it would be better to just use Kyotaro's original idea
 of PQsetRowProcessorError(), which is nicer to use.

I don't particularly care for that idea because it opens up all sorts of
potential issues when such a function is called at the wrong time.
Moreover, you have to remember that the typical situation here is that
we're going to be out of memory or otherwise in trouble, which means
you've got to be really circumspect about what you assume will work.
Row processors that think they can do a lot of fancy message
construction should be discouraged, and an API that requires
construction of a new PGresult in order to return an error is right out.
(This is why getAnotherTuple is careful to clear the failed result
before it tries to build a new one.  But that trick isn't going to be
available to an external row processor.)

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-30 Thread Marko Kreen
On Sat, Mar 31, 2012 at 1:13 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Marko Kreen mark...@gmail.com writes:
 On Fri, Mar 30, 2012 at 05:18:42PM -0400, Tom Lane wrote:
 I'm pretty dissatisfied with the error reporting situation for row
 processors.  You can't just decide not to solve it, which seems to be
 the current state of affairs.  What I'm inclined to do is to add a
 char ** parameter to the row processor, and say that when the
 processor returns -1 it can store an error message string there.

 But such API seems to require specifying allocator, which seems ugly.

 Not if the message is a constant string, which seems like the typical
 situation (think out of memory).  If the row processor does need a
 buffer for a constructed string, it could make use of some space in its
 void *param area, for instance.

If it's specified as a string that libpq does not own, then I'm fine with it.

 I think it would be better to just use Kyotaro's original idea
 of PQsetRowProcessorError() which nicer to use.

 I don't particularly care for that idea because it opens up all sorts of
 potential issues when such a function is called at the wrong time.
 Moreover, you have to remember that the typical situation here is that
 we're going to be out of memory or otherwise in trouble, which means
 you've got to be really circumspect about what you assume will work.
 Row processors that think they can do a lot of fancy message
 construction should be discouraged, and an API that requires
 construction of a new PGresult in order to return an error is right out.
 (This is why getAnotherTuple is careful to clear the failed result
 before it tries to build a new one.  But that trick isn't going to be
 available to an external row processor.)

Kyotaro's original idea was to assume out-of-memory if error
string was not set, thus the callback needed to set the string
only when it really had something to say.

-- 
marko



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-30 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 On Sat, Mar 31, 2012 at 1:13 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Not if the message is a constant string, which seems like the typical
 situation (think out of memory).  If the row processor does need a
 buffer for a constructed string, it could make use of some space in its
 void *param area, for instance.

 If it's specified as string that libpq does not own, then I'm fine with it.

Check.  Let's make it const char ** in fact, just to be clear on that.

 (This is why getAnotherTuple is careful to clear the failed result
 before it tries to build a new one.  But that trick isn't going to be
 available to an external row processor.)

 Kyotaro's original idea was to assume out-of-memory if error
 string was not set, thus the callback needed to set the string
 only when it really had something to say.

Hmm.  We could still do that in conjunction with the idea of returning
the string from the row processor, but I'm not sure if it's useful or
just overly cute.

[ thinks... ]  A small advantage of assuming NULL means that is that
we could postpone the libpq_gettext("out of memory") call until after
clearing the overflowed PGresult, which would greatly improve the odds
of getting a nicely translated result and not just ASCII.  Might be
worth it just for that.

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-29 Thread Tom Lane
Kyotaro HORIGUCHI horiguchi.kyot...@oss.ntt.co.jp writes:
 I'm sorry to have coded a silly bug.
 The previous patch has a bug in realloc size calculation.
 And separation of the 'connname patch' was incomplete in regtest.
 It is fixed in this patch.

I've applied a modified form of the conname update patch.  It seemed to
me that the fault is really in the DBLINK_GET_CONN and
DBLINK_GET_NAMED_CONN macros, which ought to be responsible for setting
the surrounding function's conname variable along with conn, rconn, etc.
There was actually a second error of the same type visible in the dblink
regression test, which is also fixed by this more general method.

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-29 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 My conclusion is that row-processor API is low-level expert API and
 quite easy to misuse.  It would be preferable to have something more
 robust as end-user API, the PQgetRow() is my suggestion for that.
 Thus I see 3 choices:

 1) Push row-processor as main API anyway and describe all dangerous
scenarios in documentation.
 2) Have both PQgetRow() and row-processor available in libpq-fe.h,
PQgetRow() as preferred API and row-processor for expert usage,
with proper documentation what works and what does not.
 3) Have PQgetRow() in libpq-fe.h, move row-processor to libpq-int.h.

I still am failing to see the use-case for PQgetRow.  ISTM the entire
point of a special row processor is to reduce the per-row processing
overhead, but PQgetRow greatly increases that overhead.  And it doesn't
reduce complexity much either IMO: you still have all the primary risk
factors arising from processing rows in advance of being sure that the
whole query completed successfully.  Plus it conflates "no more data"
with "there was an error receiving the data" or "there was an error on
the server side".  PQrecvRow alleviates the per-row-overhead aspect of
that but doesn't really do a thing from the complexity standpoint;
it doesn't look to me to be noticeably easier to use than a row
processor callback.

I think PQgetRow and PQrecvRow just add more API calls without making
any fundamental improvements, and so we could do without them.  "There's
more than one way to do it" is not necessarily a virtue.

 Second conclusion is that current dblink row-processor usage is broken
 when user uses multiple SELECTs in SQL as dblink uses plain PQexec().

Yeah.  Perhaps we should tweak the row-processor callback API so that
it gets an explicit notification that this is a new resultset.
Duplicating PQexec's behavior would then involve having the dblink row
processor throw away any existing tuplestore and start over when it
gets such a call.

There are multiple ways to express that, but the most convenient thing
from libpq's viewpoint, I think, is to have a callback that occurs
immediately after collecting a RowDescription message, before any
rows have arrived.  So maybe we could express that as a callback
with valid res but columns set to NULL?

A different approach would be to add a row counter to the arguments
provided to the row processor; then you'd know a new resultset had
started if you saw rowcounter == 0.  This might have another advantage
of not requiring the row processor to count the rows for itself, which
I think many row processors would otherwise have to do.
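
A sketch of how a row processor might use such a counter; the extra
rowcount argument is purely the idea floated above (no posted patch has
it), and resultset_acc is a made-up accumulator standing in for something
like dblink's tuplestore.

    /* Hypothetical: the rowcount argument is only the idea described above;
     * the rest of the signature follows the patched libpq. */
    struct resultset_acc
    {
        long    rows_in_current_set;
        int     resultsets_seen;
    };

    static int
    counting_row_processor(PGresult *res, PGrowValue *columns,
                           void *param, long rowcount)
    {
        struct resultset_acc *acc = param;

        (void) res;
        (void) columns;

        if (rowcount == 0)
        {
            /* New RowDescription: emulate PQexec by starting over, e.g.
             * throwing away whatever was collected for the previous
             * statement. */
            acc->rows_in_current_set = 0;
            acc->resultsets_seen++;
        }

        acc->rows_in_current_set++;
        return 1;
    }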

regards, tom lane



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-29 Thread Kyotaro HORIGUCHI
 I've applied a modified form of the conname update patch.  It seemed to
 me that the fault is really in the DBLINK_GET_CONN and
 DBLINK_GET_NAMED_CONN macros, which ought to be responsible for setting
 the surrounding function's conname variable along with conn, rconn, etc.
 There was actually a second error of the same type visible in the dblink
 regression test, which is also fixed by this more general method.

Come to think of it, the patch is merely a verifying touch for the
bug, mingled with the other part of the dblink patch to all
appearances. I totally agree with you. It should be dropped for
now and done another time.

I'm sorry for bothering you with such a damn thing.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-27 Thread Marko Kreen
On Sat, Mar 24, 2012 at 02:22:24AM +0200, Marko Kreen wrote:
 Main advantage of including PQgetRow() together with low-level
 rowproc API is that it allows de-emphasizing more complex parts of
 rowproc API (exceptions, early exit, skipresult, custom error msg).
 And drop/undocument them or simply call them postgres-internal-only.

I thought more about exceptions and PQgetRow and found an
interesting pattern:

- Exceptions are problematic if always allowed.  Although PQexec() is
  easy to fix in the current code, trying to keep the promise that
  exceptions are allowed from everywhere adds non-trivial maintainability
  overhead to future libpq changes, so instead we should simply fix the
  documentation.  Especially as I cannot see any usage scenario that
  would benefit from such a promise.

- Multiple SELECTs from PQexec() are problematic even without
  exceptions: additional documentation is needed on how to detect
  that rows are coming from a new statement.

Now the interesting one:

- The PQregisterProcessor() API allows changing the callback permanently,
  thus breaking any user code which simply calls PQexec()
  and expects a regular PGresult back.  Again, there is nothing to fix
  code-wise; we need to document that the callback should be set
  only for the current query and changed back afterwards.

My conclusion is that row-processor API is low-level expert API and
quite easy to misuse.  It would be preferable to have something more
robust as end-user API, the PQgetRow() is my suggestion for that.
Thus I see 3 choices:

1) Push row-processor as main API anyway and describe all dangerous
   scenarios in documentation.
2) Have both PQgetRow() and row-processor available in libpq-fe.h,
   PQgetRow() as preferred API and row-processor for expert usage,
   with proper documentation what works and what does not.
3) Have PQgetRow() in libpq-fe.h, move row-processor to libpq-int.h.

I guess this needs a committer decision on which way to go?


The second conclusion is that the current dblink row-processor usage is broken
when the user puts multiple SELECTs in the SQL, as dblink uses plain PQexec().
The simplest fix would be to use PQexecParams() instead, but if keeping the old
behaviour is important, then dblink needs to emulate PQexec() resultset
behaviour with the row processor or PQgetRow().
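
For reference, a sketch of the "emulate PQexec() resultset behaviour" option
using only stock libpq calls: send the whole string with PQsendQuery() and
keep only the last PGresult, treating each new PGresult as the start of a
new statement's output.  (Error concatenation and COPY handling, which the
real PQexec() also does, are ignored here; exec_keep_last is just an
illustrative name.)

    /* Sketch, stock libpq only: emulate PQexec()'s "last resultset wins"
     * behaviour for a multi-statement SQL string. */
    #include "libpq-fe.h"

    static PGresult *
    exec_keep_last(PGconn *conn, const char *sql)
    {
        PGresult   *last = NULL;
        PGresult   *res;

        if (!PQsendQuery(conn, sql))
            return NULL;                /* see PQerrorMessage(conn) */

        while ((res = PQgetResult(conn)) != NULL)
        {
            /* Each PGresult marks a new statement's resultset; a dblink-style
             * caller would reset its tuplestore at this point. */
            if (last)
                PQclear(last);
            last = res;
        }
        return last;
    }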

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-26 Thread Kyotaro HORIGUCHI
Hello, this is a new version of the patch for dblink using the row processor.

 - Use palloc to allocate temporary memory blocks.

 - Memory allocation is now done in one go: preallocate a block of
   initial size; palloc simplified the reallocation code.

 - Resurrected the route for command invocation, with a small
   adjustment of the behavior on error.

 - The modification to fix the missing connection name bug is moved out
   to another patch.

 - Commenting on the functions skipped over by longjmp is
   withheld according to Marko's discussion.

 - rebased to e8476f46fc847060250c92ec9b310559293087fc

dblink_use_rowproc_20120326.patch - dblink row processor patch.
dblink_connname_20120326.patch    - dblink connname fix patch.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 36a8e3e..dd73aa5 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -63,11 +63,23 @@ typedef struct remoteConn
 	bool		newXactForCursor;		/* Opened a transaction for a cursor */
 } remoteConn;
 
+typedef struct storeInfo
+{
+	Tuplestorestate *tuplestore;
+	int nattrs;
+	MemoryContext oldcontext;
+	AttInMetadata *attinmeta;
+	char* valbuf;
+	int valbuflen;
+	char **cstrs;
+	bool error_occurred;
+	bool nummismatch;
+} storeInfo;
+
 /*
  * Internal declarations
  */
 static Datum dblink_record_internal(FunctionCallInfo fcinfo, bool is_async);
-static void materializeResult(FunctionCallInfo fcinfo, PGresult *res);
 static remoteConn *getConnectionByName(const char *name);
 static HTAB *createConnHash(void);
 static void createNewConnection(const char *name, remoteConn *rconn);
@@ -90,6 +102,10 @@ static char *escape_param_str(const char *from);
 static void validate_pkattnums(Relation rel,
    int2vector *pkattnums_arg, int32 pknumatts_arg,
    int **pkattnums, int *pknumatts);
+static void initStoreInfo(storeInfo *sinfo, FunctionCallInfo fcinfo);
+static void finishStoreInfo(storeInfo *sinfo);
+static int storeHandler(PGresult *res, PGrowValue *columns, void *param);
+
 
 /* Global */
 static remoteConn *pconn = NULL;
@@ -111,6 +127,9 @@ typedef struct remoteConnHashEnt
 /* initial number of connection hashes */
 #define NUMCONN 16
 
+/* Initial block size for value buffer in storeHandler */
+#define INITBUFLEN 64
+
 /* general utility */
 #define xpfree(var_) \
 	do { \
@@ -503,6 +522,7 @@ dblink_fetch(PG_FUNCTION_ARGS)
 	char	   *curname = NULL;
 	int			howmany = 0;
 	bool		fail = true;	/* default to backward compatible */
+	storeInfo   storeinfo;
 
 	DBLINK_INIT;
 
@@ -559,15 +579,51 @@ dblink_fetch(PG_FUNCTION_ARGS)
 	appendStringInfo(buf, FETCH %d FROM %s, howmany, curname);
 
 	/*
+	 * Result is stored into storeinfo.tuplestore instead of
+	 * res-result retuned by PQexec below
+	 */
+	initStoreInfo(storeinfo, fcinfo);
+	PQsetRowProcessor(conn, storeHandler, storeinfo);
+
+	/*
 	 * Try to execute the query.  Note that since libpq uses malloc, the
 	 * PGresult will be long-lived even though we are still in a short-lived
 	 * memory context.
 	 */
-	res = PQexec(conn, buf.data);
+	PG_TRY();
+	{
+		res = PQexec(conn, buf.data);
+	}
+	PG_CATCH();
+	{
+		ErrorData *edata;
+
+		finishStoreInfo(storeinfo);
+		edata = CopyErrorData();
+		FlushErrorState();
+
+		/* Skip remaining results when storeHandler raises exception. */
+		PQskipResult(conn, TRUE);
+		ReThrowError(edata);
+	}
+	PG_END_TRY();
+
+	finishStoreInfo(storeinfo);
+
 	if (!res ||
 		(PQresultStatus(res) != PGRES_COMMAND_OK 
 		 PQresultStatus(res) != PGRES_TUPLES_OK))
 	{
+		/* finishStoreInfo saves the fields referred to below. */
+		if (storeinfo.nummismatch)
+		{
+			/* This is only for backward compatibility */
+			ereport(ERROR,
+	(errcode(ERRCODE_DATATYPE_MISMATCH),
+	 errmsg(remote query result rowtype does not match 
+			the specified FROM clause rowtype)));
+		}
+
 		dblink_res_error(conname, res, could not fetch from cursor, fail);
 		return (Datum) 0;
 	}
@@ -579,8 +635,8 @@ dblink_fetch(PG_FUNCTION_ARGS)
 (errcode(ERRCODE_INVALID_CURSOR_NAME),
  errmsg(cursor \%s\ does not exist, curname)));
 	}
+	PQclear(res);
 
-	materializeResult(fcinfo, res);
 	return (Datum) 0;
 }
 
@@ -640,6 +696,7 @@ dblink_record_internal(FunctionCallInfo fcinfo, bool is_async)
 	remoteConn *rconn = NULL;
 	bool		fail = true;	/* default to backward compatible */
 	bool		freeconn = false;
+	storeInfo   storeinfo;
 
 	/* check to see if caller supports us returning a tuplestore */
 	if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
@@ -660,6 +717,7 @@ dblink_record_internal(FunctionCallInfo fcinfo, bool is_async)
 		{
 			/* text,text,bool */
 			DBLINK_GET_CONN;
+			conname = text_to_cstring(PG_GETARG_TEXT_PP(0));
 			sql = text_to_cstring(PG_GETARG_TEXT_PP(1));
 			fail = PG_GETARG_BOOL(2);
 		}
@@ -715,164 +773,229 @@ dblink_record_internal(FunctionCallInfo fcinfo, bool is_async)
 	rsinfo-setResult = NULL;
 	rsinfo-setDesc = NULL;
 
-	/* synchronous query, or async result 

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-26 Thread Kyotaro HORIGUCHI
I'm sorry to have coded a silly bug.

The previous patch has a bug in the realloc size calculation.
And the separation of the 'connname patch' was incomplete in the regression test.
Both are fixed in this patch.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 36a8e3e..4de28ef 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -63,11 +63,23 @@ typedef struct remoteConn
 	bool		newXactForCursor;		/* Opened a transaction for a cursor */
 } remoteConn;
 
+typedef struct storeInfo
+{
+	Tuplestorestate *tuplestore;
+	int nattrs;
+	MemoryContext oldcontext;
+	AttInMetadata *attinmeta;
+	char* valbuf;
+	int valbuflen;
+	char **cstrs;
+	bool error_occurred;
+	bool nummismatch;
+} storeInfo;
+
 /*
  * Internal declarations
  */
 static Datum dblink_record_internal(FunctionCallInfo fcinfo, bool is_async);
-static void materializeResult(FunctionCallInfo fcinfo, PGresult *res);
 static remoteConn *getConnectionByName(const char *name);
 static HTAB *createConnHash(void);
 static void createNewConnection(const char *name, remoteConn *rconn);
@@ -90,6 +102,10 @@ static char *escape_param_str(const char *from);
 static void validate_pkattnums(Relation rel,
    int2vector *pkattnums_arg, int32 pknumatts_arg,
    int **pkattnums, int *pknumatts);
+static void initStoreInfo(storeInfo *sinfo, FunctionCallInfo fcinfo);
+static void finishStoreInfo(storeInfo *sinfo);
+static int storeHandler(PGresult *res, PGrowValue *columns, void *param);
+
 
 /* Global */
 static remoteConn *pconn = NULL;
@@ -111,6 +127,9 @@ typedef struct remoteConnHashEnt
 /* initial number of connection hashes */
 #define NUMCONN 16
 
+/* Initial block size for value buffer in storeHandler */
+#define INITBUFLEN 64
+
 /* general utility */
 #define xpfree(var_) \
 	do { \
@@ -503,6 +522,7 @@ dblink_fetch(PG_FUNCTION_ARGS)
 	char	   *curname = NULL;
 	int			howmany = 0;
 	bool		fail = true;	/* default to backward compatible */
+	storeInfo   storeinfo;
 
 	DBLINK_INIT;
 
@@ -559,15 +579,51 @@ dblink_fetch(PG_FUNCTION_ARGS)
 	appendStringInfo(buf, FETCH %d FROM %s, howmany, curname);
 
 	/*
+	 * Result is stored into storeinfo.tuplestore instead of
+	 * res-result retuned by PQexec below
+	 */
+	initStoreInfo(storeinfo, fcinfo);
+	PQsetRowProcessor(conn, storeHandler, storeinfo);
+
+	/*
 	 * Try to execute the query.  Note that since libpq uses malloc, the
 	 * PGresult will be long-lived even though we are still in a short-lived
 	 * memory context.
 	 */
-	res = PQexec(conn, buf.data);
+	PG_TRY();
+	{
+		res = PQexec(conn, buf.data);
+	}
+	PG_CATCH();
+	{
+		ErrorData *edata;
+
+		finishStoreInfo(storeinfo);
+		edata = CopyErrorData();
+		FlushErrorState();
+
+		/* Skip remaining results when storeHandler raises exception. */
+		PQskipResult(conn, TRUE);
+		ReThrowError(edata);
+	}
+	PG_END_TRY();
+
+	finishStoreInfo(storeinfo);
+
 	if (!res ||
 		(PQresultStatus(res) != PGRES_COMMAND_OK 
 		 PQresultStatus(res) != PGRES_TUPLES_OK))
 	{
+		/* finishStoreInfo saves the fields referred to below. */
+		if (storeinfo.nummismatch)
+		{
+			/* This is only for backward compatibility */
+			ereport(ERROR,
+	(errcode(ERRCODE_DATATYPE_MISMATCH),
+	 errmsg(remote query result rowtype does not match 
+			the specified FROM clause rowtype)));
+		}
+
 		dblink_res_error(conname, res, could not fetch from cursor, fail);
 		return (Datum) 0;
 	}
@@ -579,8 +635,8 @@ dblink_fetch(PG_FUNCTION_ARGS)
 (errcode(ERRCODE_INVALID_CURSOR_NAME),
  errmsg(cursor \%s\ does not exist, curname)));
 	}
+	PQclear(res);
 
-	materializeResult(fcinfo, res);
 	return (Datum) 0;
 }
 
@@ -640,6 +696,7 @@ dblink_record_internal(FunctionCallInfo fcinfo, bool is_async)
 	remoteConn *rconn = NULL;
 	bool		fail = true;	/* default to backward compatible */
 	bool		freeconn = false;
+	storeInfo   storeinfo;
 
 	/* check to see if caller supports us returning a tuplestore */
 	if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
@@ -660,6 +717,7 @@ dblink_record_internal(FunctionCallInfo fcinfo, bool is_async)
 		{
 			/* text,text,bool */
 			DBLINK_GET_CONN;
+			conname = text_to_cstring(PG_GETARG_TEXT_PP(0));
 			sql = text_to_cstring(PG_GETARG_TEXT_PP(1));
 			fail = PG_GETARG_BOOL(2);
 		}
@@ -715,164 +773,234 @@ dblink_record_internal(FunctionCallInfo fcinfo, bool is_async)
 	rsinfo-setResult = NULL;
 	rsinfo-setDesc = NULL;
 
-	/* synchronous query, or async result retrieval */
-	if (!is_async)
-		res = PQexec(conn, sql);
-	else
+
+	/*
+	 * Result is stored into storeinfo.tuplestore instead of
+	 * res-result retuned by PQexec/PQgetResult below
+	 */
+	initStoreInfo(storeinfo, fcinfo);
+	PQsetRowProcessor(conn, storeHandler, storeinfo);
+
+	PG_TRY();
 	{
-		res = PQgetResult(conn);
-		/* NULL means we're all done with the async results */
-		if (!res)
-			return (Datum) 0;
+		/* synchronous query, or async result retrieval */
+		if (!is_async)
+			res = PQexec(conn, sql);
+		else
+			res 

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-23 Thread Kyotaro HORIGUCHI
Thank you for picking this up.

 is considering three cases: it got a 2-byte integer (and can continue on),
 or there aren't yet 2 more bytes available in the buffer, in which case it
 should return EOF without doing anything, or pqGetInt detected a hard
 error and updated the connection error state accordingly, in which case
 again there is nothing to do except return EOF.  In the patched code we
 have:
...
 which handles neither the second nor third case correctly: it thinks that
 data not here yet is a hard error, and then makes sure it is an error by
 destroying the parsing state :-(.

Marko and I think that, in protocol 3, all bytes of the incoming
message are guaranteed to be loaded when entering
getAnotherTuple().  The following part in pqParseInput3() does
this.

| if (avail < msgLength)
| {
| /*
|  * Before returning, enlarge the input buffer if needed to hold
|  * the whole message.
|  (snipped)..
|  */
| if (pqCheckInBufferSpace(conn->inCursor + (size_t) msgLength,
|  conn))
| {
| /*
|  * XXX add some better recovery code...
| (snipped)..
|  */
| handleSyncLoss(conn, id, msgLength);
| }
| return;


So, if the cursor state is broken just after exiting
getAnotherTuple(), it had already been broken BEFORE entering
getAnotherTuple(), according to the current design. That is what the
'protocol error' means. pqGetInt there should not detect any
errors except for a broken message.


 error, that possibly-useful error message is overwritten with an entirely
 useless protocol error text.

Plus, the current pqGetInt seems to set its own error message only
when the parameter 'bytes' is wrong.

On the other hand, in protocol 2 (to be removed?) the error
handling mechanism gets touched, because a full load of the message
is not guaranteed.

 I don't think the error return cases for the row processor have been
 thought out too well either.  The row processor is not in charge of what
 happens to the PGresult,

The default row processor stuffs the PGresult with tuples; another (say,
that of dblink) leaves it empty. Each row processor manages the PGresult
to its own extent.

  and it certainly has no business telling libpq to just exit
 immediately from the topmost libpq function. If we do that
 we'll probably lose sync with the data stream and be unable to
 recover use of the connection at all.

I don't think the PGresult has any charge of the error handling system in
the current implementation. The behavior 'exit immediately from the topmost
libpq function' should not be seen anywhere in the patch.

The exit routes from the row processor are the following:

 - Do longjmp (or use the PG_TRY-CATCH mechanism) out of the row
   processor.

 - The row processor returns 0 when entered from PQisBusy(), and we
   immediately exit from PQisBusy().

Cursor consistency will be kept in both cases. The cursor will already
be on the byte next to the last byte of the current message.

 Also, do we need to consider any error cases for the row
 processor other than out-of-memory?  If so it might be a good
 idea for it to have some ability to store a custom error
 message into the PGconn, which it cannot do given the current
 function API.

It does not seem to be a strong necessity for dblink or
PQgetRow, compared to the additional complexity expected around it. So
this patch does not include it.

 In the same vein, I am fairly uncomfortable with the blithe assertion that
 a row processor can safely longjmp out of libpq.  This was never foreseen
 in the original library coding and there are any number of places that
 that might break, now or in the future.  Do we really need to allow it?

To keep the row processor from longjmp'ing out, I enclosed the
functions that potentially throw exceptions in a PG_TRY-CATCH clause in
an early version. This was totally safe but the penalty was not
negligible because the TRY-CATCH was entered for every row.


 If we do, it would be a good idea to decorate the libpq
 functions that are now expected to possibly longjmp with
 comments saying so.  Tracing all the potential call paths that
 might be aborted by a longjmp is an essential activity anyway.

Concerning now, if not the future, I can show you the trail of the
confirmation process.

- There is no difference between with and without the patch at
  the level of getAnotherTuple() from the view of consistency.

- Assume pqParseInput3 detects that the next message has not yet come
  after getAnotherTuple returned. It exits immediately on reading
  the length of the next message. This is the same condition as
  longjmping.

   if (pqGetInt(&msgLength, 4, conn))
       return;

- parseInput passes it through and immediately exits in
  consistent state.

- The callers of PQgetResult, PQisBusy, PQskipResult, PQnotifies,
  PQputCopyData, and pqHandleSendFailure finally regain control. I
  am convinced that the async status at that time must be
  PGASYNC_BUSY and the conn cursor in a consistent state.

  So the ancestor of row processor is encouraged to call
 

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-23 Thread Marko Kreen
I saw Kyotaro already answered, but I'll give my view as well.

On Thu, Mar 22, 2012 at 06:07:16PM -0400, Tom Lane wrote:
 AFAICT it breaks async processing entirely, because many changes have been
 made that fail to distinguish insufficient data available as yet from
 hard error.  As an example, this code at the beginning of
 getAnotherTuple:
   
   /* Get the field count and make sure it's what we expect */
   if (pqGetInt(tupnfields, 2, conn))
 ! return EOF;
 
 is considering three cases: it got a 2-byte integer (and can continue on),
 or there aren't yet 2 more bytes available in the buffer, in which case it
 should return EOF without doing anything, or pqGetInt detected a hard
 error and updated the connection error state accordingly, in which case
 again there is nothing to do except return EOF.  In the patched code we
 have:
 
   /* Get the field count and make sure it's what we expect */
   if (pqGetInt(tupnfields, 2, conn))
 ! {
 ! /* Whole the message must be loaded on the buffer here */
 ! errmsg = libpq_gettext(protocol error\n);
 ! goto error_and_forward;
 ! }
 
 which handles neither the second nor third case correctly: it thinks that
 data not here yet is a hard error, and then makes sure it is an error by
 destroying the parsing state :-(.  And if in fact pqGetInt did log an
 error, that possibly-useful error message is overwritten with an entirely
 useless protocol error text.

No, a protocol error really is the only error case here.

- pqGetInt() does not set errors.

- V3 getAnotherTuple() is called only if the packet is fully in the buffer.

 I don't think the error return cases for the row processor have been
 thought out too well either.  The row processor is not in charge of what
 happens to the PGresult, and it certainly has no business telling libpq to
 just exit immediately from the topmost libpq function.  If we do that
 we'll probably lose sync with the data stream and be unable to recover use
 of the connection at all.  Also, do we need to consider any error cases
 for the row processor other than out-of-memory?

No, the rule is *not* to exit to the topmost function, but to exit PQisBusy().

This is exactly so that any code that does not expect row-processor
behaviour continues to work.

Also, from the programmer's POV, this means the row-processor callback
causes minimal changes to existing APIs.

 If so it might be a good
 idea for it to have some ability to store a custom error message into the
 PGconn, which it cannot do given the current function API.

There already was such a function, but it was a row-processor-specific hack
that could leak out and create confusion.  I rejected it.  Instead there
should be a generic error-setting function, equivalent to the current
libpq-internal error setting.

But such a generic error-setting function would need a review of all libpq
error states, as it allows the error state to appear in new situations.  Also
we need to have well-defined behaviour of client-side errors vs. incoming
server errors.

Considering that even the current cut-down patch is troubling committers,
I would definitely suggest postponing such a generic error setter to 9.3.

Especially as it does not change anything coding-style-wise.

 In the same vein, I am fairly uncomfortable with the blithe assertion that
 a row processor can safely longjmp out of libpq.  This was never foreseen
 in the original library coding and there are any number of places that
 that might break, now or in the future.  Do we really need to allow it?
 If we do, it would be a good idea to decorate the libpq functions that are
 now expected to possibly longjmp with comments saying so.  Tracing all the
 potential call paths that might be aborted by a longjmp is an essential
 activity anyway.

I think we *should* allow exceptions, but in a limited number of APIs.

Basically, the usefulness for users vs. the effort from our side
is clearly on the side of providing it.

But it's up to us to define what *limited* means (what needs the
least effort from us), so that later, when users want to use exceptions
in a callback, they need to pick the right API.

Currently it seems only PQexec() + multiple SELECTs can give trouble,
as the previous PGresult is kept on the stack.  Should we unsupport
PQexec or multiple SELECTs?

But such a case is broken even without exceptions - or at least
very confusing.  Not sure what to do with it.


In any case, decorating libpq functions is the wrong approach.  It gives
the suggestion that the caller of e.g. PQexec() needs to take care of any
possible behaviour of an unknown callback.  This will not work.  Instead,
the allowed functions should simply be listed in the row-processor
documentation.

Basically, a custom callback should always be matched by a caller that
knows about it and knows how to handle it.  Not sure how to put
such a suggestion into the documentation, though.


 Another design deficiency is PQskipResult().  This is badly designed for
 async operation because once you call it, it will absolutely not give 

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-22 Thread Tom Lane
Marko Kreen mark...@gmail.com writes:
 [ patches against libpq and dblink ]

I think this patch needs a lot of work.

AFAICT it breaks async processing entirely, because many changes have been
made that fail to distinguish insufficient data available as yet from
hard error.  As an example, this code at the beginning of
getAnotherTuple:
  
/* Get the field count and make sure it's what we expect */
if (pqGetInt(&tupnfields, 2, conn))
!	return EOF;

is considering three cases: it got a 2-byte integer (and can continue on),
or there aren't yet 2 more bytes available in the buffer, in which case it
should return EOF without doing anything, or pqGetInt detected a hard
error and updated the connection error state accordingly, in which case
again there is nothing to do except return EOF.  In the patched code we
have:

/* Get the field count and make sure it's what we expect */
if (pqGetInt(&tupnfields, 2, conn))
!	{
!		/* Whole the message must be loaded on the buffer here */
!		errmsg = libpq_gettext("protocol error\n");
!		goto error_and_forward;
!	}

which handles neither the second nor third case correctly: it thinks that
data not here yet is a hard error, and then makes sure it is an error by
destroying the parsing state :-(.  And if in fact pqGetInt did log an
error, that possibly-useful error message is overwritten with an entirely
useless protocol error text.

I don't think the error return cases for the row processor have been
thought out too well either.  The row processor is not in charge of what
happens to the PGresult, and it certainly has no business telling libpq to
just exit immediately from the topmost libpq function.  If we do that
we'll probably lose sync with the data stream and be unable to recover use
of the connection at all.  Also, do we need to consider any error cases
for the row processor other than out-of-memory?  If so it might be a good
idea for it to have some ability to store a custom error message into the
PGconn, which it cannot do given the current function API.

In the same vein, I am fairly uncomfortable with the blithe assertion that
a row processor can safely longjmp out of libpq.  This was never foreseen
in the original library coding and there are any number of places that
that might break, now or in the future.  Do we really need to allow it?
If we do, it would be a good idea to decorate the libpq functions that are
now expected to possibly longjmp with comments saying so.  Tracing all the
potential call paths that might be aborted by a longjmp is an essential
activity anyway.

Another design deficiency is PQskipResult().  This is badly designed for
async operation because once you call it, it will absolutely not give back
control until it's read the whole query result.  (It'd be better to have
it set some kind of flag that makes future library calls discard data
until the query result end is reached.)  Something else that's not very
well-designed about it is that it might throw away more than just incoming
tuples.  As an example, suppose that the incoming data at the time you
call it consists of half a dozen rows followed by an ErrorResponse.  The
ErrorResponse will be eaten and discarded, leaving the application no clue
why its transaction has been aborted, or perhaps even the entire session
cancelled.  What we probably want here is just to transiently install a
row processor that discards all incoming data, but the application is
still expected to eventually fetch a PGresult that will tell it whether
the server side of the query completed or not.
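
A sketch of that alternative against the API proposed in this patch series
(PQsetRowProcessor() and PGrowValue exist only in the patch; passing a NULL
callback to restore the default follows the patch's convention): rows are
thrown away, but the PGresults, including any error, are still fetched and
reported.

    /* Sketch against the patched libpq: discard rows but keep fetching
     * PGresults so an ErrorResponse is still seen by the application. */
    #include <stdio.h>
    #include "libpq-fe.h"

    static int
    discard_rows(PGresult *res, PGrowValue *columns, void *param)
    {
        (void) res;
        (void) columns;
        (void) param;
        return 1;                       /* accept and forget the row */
    }

    static void
    drain_connection(PGconn *conn)
    {
        PGresult   *res;

        PQsetRowProcessor(conn, discard_rows, NULL);

        while ((res = PQgetResult(conn)) != NULL)
        {
            if (PQresultStatus(res) == PGRES_FATAL_ERROR)
                fprintf(stderr, "%s", PQresultErrorMessage(res));
            PQclear(res);
        }

        /* restore the built-in row processor (NULL callback = default) */
        PQsetRowProcessor(conn, NULL, NULL);
    }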

In the dblink patch, given that you have to worry about elogs coming out
of BuildTupleFromCStrings and tuplestore_puttuple anyway, it's not clear
what is the benefit of using malloc rather than palloc to manage the
intermediate buffers in storeHandler --- what that seems to mostly
accomplish is increase the risk of permanent memory leaks.  I don't see
much value in per-column buffers either; it'd likely be cheaper to just
palloc one workspace that's big enough for all the column strings
together.  And speaking of leaks, doesn't storeHandler leak the
constructed tuple on each call, not to mention whatever might be leaked by
the called datatype input functions?

It also seems to me that the dblink patch breaks the case formerly handled
in materializeResult() of a PGRES_COMMAND_OK rather than PGRES_TUPLES_OK
result.  The COMMAND case is supposed to be converted to a single-column
text result, and I sure don't see where that's happening now.

BTW, am I right in thinking that some of the hunks in this patch are meant
to fix a pre-existing bug of failing to report the correct connection
name?  If so, it'd likely be better to split those out and submit as a
separate patch, instead of conflating them with a feature addition.

Lastly, as to the pqgetrow patch, the lack of any demonstrated test case
for these functions makes me uncomfortable 

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-07 Thread Kyotaro HORIGUCHI
Hello,

  #- We expect PQisBusy(), PQconsumeInput()(?) and 
  #- PQgetResult() to exit immediately and we can
  #- call PQgetResult(), PQskipResult() or
  #- PQisBusy() after.
  | 1 - OK (I'm done with the row)
  |- save result and getAnotherTuple returns 0.
  
  The lines prefixed with '#' is the desirable behavior I have
  understood from the discussion so far. And I doubt that it works
  as we expected for PQgetResult().
 
 No, the desirable behavior is already implemented and documented -
 the stop parsing return code affects only PQisBusy().  As that
 is the function that does the actual parsing.

I am satisfied with your answer. Thank you.

 The main plus if such scheme is that we do not change the behaviour
 of any existing APIs.

I agree with the policy.

 You optimized libpq_gettext() calls, but it's wrong - they must
 wrap the actual strings so that the strings can be extracted
 for translating.

Ouch. I remember having had a build error about that before...

 Fixed in attached patch.

I'm sorry to annoy you.

   My suggestion - check in getAnotherTuple whether resultStatus is
   already error and do nothing then.  This allows internal pqAddRow
   to set regular out of memory error.  Otherwise give generic
   row processor error.
..
 The suggestion was about getAnotherTuple() - currently it sets
 always error in row processor.  With such check, the callback
 can set the error result itself.  Currently only callbacks that
 live inside libpq can set errors, but if we happen to expose
 error-setting function in outside API, then the getAnotherTuple()
 would already be ready for it.

I see. And I found it implemented in your patch.

 See attached patch, I did some generic comment/docs cleanups
 but also minor code cleanups:
 
 - PQsetRowProcessor(NULL,NULL) sets Param to PGconn, instead NULL,
 - pqAddRow sets out of memory error itself on PGconn.
 - getAnotherTuple(): when callback returns -1, it checks if error
 - dropped the error_saveresult label, it was unnecessary branch.

Ok, I've confirmed them.

 - put libpq_gettext() back around strings.
 - made functions survive conn==NULL.
 - dblink: refreshed regtest result, as now we get actual

Thank you for fixing my bugs.

 - dblink: changed skipAll parameter for PQskipResult() to TRUE,
   as dblink uses PQexec which can send several queries.

I agree with the change. I intended to allow receiving the remaining results
after skipping the current result on failure. But that seems not
only unlikely to be needed, but also somewhat dangerous.

 - Synced PQgetRow patch with return value changes.
 
 - Synced demos at https://github.com/markokr/libpq-rowproc-demos
   with return value changes.


 I'm pretty happy with current state.  So tagging it
 ReadyForCommitter.

Thank you very much for all your help.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-06 Thread Marko Kreen
On Tue, Mar 06, 2012 at 11:13:45AM +0900, Kyotaro HORIGUCHI wrote:
  But it's broken in V3 protocol - getAnotherTuple() will be called
  only if the packet is fully read.  If the packet contents do not
  agree with packet header, it's protocol error.  Only valid EOF
  return in V3 getAnotherTuple() is when row processor asks
  for early exit.
 
  Original code of getAnotherTuple returns EOF when the bytes to
 be read is not fully loaded. I understand that this was
 inappropriately (redundant checks?) written at least for the
 pqGetInt() for the field length in getAnotherTuple.  But I don't
 understand how to secure the rows (or table data) fully loaded at
 the point of getAnotherTuple called...

Look into pqParseInput3():

if (avail < msgLength)
{
...
return;
}


  * remove pqIsnonblocking(conn) check when row processor returned 2.
I missed that it's valid to call PQisBusy/PQconsumeInput/PQgetResult
on sync connection.
 
 mmm. EOF from getAnotherTuple makes PQgetResult try furthur
 reading until asyncStatus != PGASYNC_BUSY as far as I saw. And It
 seemed to do so when I tried to remove 'return 2'. I think that
 it is needed at least one additional state for asyncStatus to
 work EOF as desied here.

No.  It's valid to do a PQisBusy() + PQconsumeInput() loop until
PQisBusy() returns 0.  Otherwise, yes, PQgetResult() will
block until the final result is available.  But that's OK.
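
A minimal version of that loop, using only stock libpq calls (select() on
PQsocket() while PQisBusy() says more input is needed, feeding data in with
PQconsumeInput()):

    /* Minimal non-blocking wait; stock libpq calls only. */
    #include <sys/select.h>
    #include "libpq-fe.h"

    static int
    wait_for_result(PGconn *conn)
    {
        while (PQisBusy(conn))
        {
            fd_set      rfds;
            int         sock = PQsocket(conn);

            if (sock < 0)
                return -1;

            FD_ZERO(&rfds);
            FD_SET(sock, &rfds);
            if (select(sock + 1, &rfds, NULL, NULL, NULL) < 0)
                return -1;
            if (!PQconsumeInput(conn))
                return -1;              /* see PQerrorMessage(conn) */
        }
        return 0;       /* PQgetResult() will now return without blocking */
    }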

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-06 Thread Kyotaro HORIGUCHI
Hello, 

 But I don't understand how to secure the rows (or table data)
 fully loaded at the point of getAnotherTuple called...

I found how pqParseInput ensures the entire message is loaded
before getAnotherTuple is called.

fe-protocol3.c:107
|   avail = conn->inEnd - conn->inCursor;
|   if (avail < msgLength)
|   {
| if (pqCheckInBufferSpace(conn->inCursor + (size_t)msgLength, conn))

So now I convinced that the whole row data is loaded at the point
that getAnotherTuple is called. I agree that getAnotherTuple
should not return EOF to request for unloaded part of the
message.

Please wait for a while for the new patch.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-06 Thread Kyotaro HORIGUCHI
Hello,

Nevertheless, the problem is still there that an exit from parseInput()
caused by a non-zero return from getAnotherTuple() results in an immediate
re-entry into getAnotherTuple() via parseInput(), with no other
side effect. But I will deal with that in the next patch.

 So now I convinced that the whole row data is loaded at the point
  ^am
 that getAnotherTuple is called. I agree that getAnotherTuple
 should not return EOF to request for unloaded part of the
 message.
 
 Please wait for a while for the new patch.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-06 Thread Kyotaro HORIGUCHI
Hello, this is a new version of the patch.

# early_exit.diff is not included this time and maybe not
# later either.  The set of return values of PQrowProcessor looks
# unnatural if the 0 is removed.

 * Convert old EOFs to protocol errors in V3 getAnotherTuple()

done.

 * V2 getAnotherTuple() can leak PGresult when handling custom
   error from row processor.

The custom error message is removed from both V2 and V3.

 * remove pqIsnonblocking(conn) check when row processor returned 2.
   I missed that it's valid to call PQisBusy/PQconsumeInput/PQgetResult
   on sync connection.

done. This affects PQisBusy, but PQgetResult won't be affected as
far as I can see. I have no idea about PQconsumeInput().

 * It seems the return codes from callback should be remapped,
   (0, 1, 2) is unusual pattern.  Better would be:
 
-1 - error
 0 - stop parsing / early exit (I'm not done yet)
 1 - OK (I'm done with the row)

done.

This might be described more precisely as follows:

|-1 -> error -> erase the result and change the result status to
|   -> FATAL_ERROR.  All the remaining rows in the current result
|   -> will be skipped (consumed).
| 0 -> stop parsing / early exit (I'm not done yet)
|-> getAnotherTuple returns EOF without dropping the PGresult.
#-> We expect PQisBusy(), PQconsumeInput()(?) and
#-> PQgetResult() to exit immediately and we can
#-> call PQgetResult(), PQskipResult() or
#-> PQisBusy() after.
| 1 -> OK (I'm done with the row)
|-> save the result and getAnotherTuple returns 0.

The lines prefixed with '#' are the desirable behavior I have
understood from the discussion so far. And I doubt that it works
as we expect for PQgetResult().
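
For illustration, a skeleton callback following this (1 / 0 / -1) mapping.
The signature and the "return 0 pauses PQisBusy()" behaviour are specific to
this patch series (the 0 return was dropped later in the thread);
pausable_state is a made-up application structure.

    /* Skeleton per the (1 / 0 / -1) mapping above; patched libpq assumed. */
    struct pausable_state
    {
        int     paused;                 /* set by the application to suspend */
        long    rows;
    };

    static int
    pausable_row_processor(PGresult *res, PGrowValue *columns, void *param)
    {
        struct pausable_state *st = param;

        (void) res;
        (void) columns;

        if (st->paused)
            return 0;   /* not done with this row yet: PQisBusy() returns
                         * immediately and the row is offered again later */

        st->rows++;
        /* ... store the row here; returning -1 instead would erase the
         * PGresult and cause the remaining rows to be skipped, as in the
         * mapping above ... */
        return 1;       /* done with this row */
    }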

 * Please drop PQsetRowProcessorErrMsg() / PQresultSetErrMsg().

done.

 My suggestion - check in getAnotherTuple whether resultStatus is
 already error and do nothing then.  This allows internal pqAddRow
 to set regular out of memory error.  Otherwise give generic
 row processor error.

The current implementation seems to already do this in
parseInput3(). Could you give me a further explanation?

regards, 

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 1af8df6..a6418ec 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -160,3 +160,6 @@ PQconnectStartParams  157
 PQping158
 PQpingParams  159
 PQlibVersion  160
+PQsetRowProcessor	  	  161
+PQgetRowProcessor	  	  162
+PQskipResult		  	  163
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 27a9805..4605e49 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2693,6 +2693,9 @@ makeEmptyPGconn(void)
 	conn-wait_ssl_try = false;
 #endif
 
+	/* set default row processor */
+	PQsetRowProcessor(conn, NULL, NULL);
+
 	/*
 	 * We try to send at least 8K at a time, which is the usual size of pipe
 	 * buffers on Unix systems.  That way, when we are sending a large amount
@@ -2711,8 +2714,13 @@ makeEmptyPGconn(void)
 	initPQExpBuffer(conn-errorMessage);
 	initPQExpBuffer(conn-workBuffer);
 
+	/* set up initial row buffer */
+	conn-rowBufLen = 32;
+	conn-rowBuf = (PGrowValue *)malloc(conn-rowBufLen * sizeof(PGrowValue));
+
 	if (conn-inBuffer == NULL ||
 		conn-outBuffer == NULL ||
+		conn-rowBuf == NULL ||
 		PQExpBufferBroken(conn-errorMessage) ||
 		PQExpBufferBroken(conn-workBuffer))
 	{
@@ -2814,6 +2822,8 @@ freePGconn(PGconn *conn)
 		free(conn-inBuffer);
 	if (conn-outBuffer)
 		free(conn-outBuffer);
+	if (conn-rowBuf)
+		free(conn-rowBuf);
 	termPQExpBuffer(conn-errorMessage);
 	termPQExpBuffer(conn-workBuffer);
 
@@ -5078,3 +5088,4 @@ PQregisterThreadLock(pgthreadlock_t newhandler)
 
 	return prev;
 }
+
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index b743566..161d210 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -66,6 +66,7 @@ static PGresult *PQexecFinish(PGconn *conn);
 static int PQsendDescribe(PGconn *conn, char desc_type,
 			   const char *desc_target);
 static int	check_field_number(const PGresult *res, int field_num);
+static int	pqAddRow(PGresult *res, PGrowValue *columns, void *param);
 
 
 /* 
@@ -701,7 +702,6 @@ pqClearAsyncResult(PGconn *conn)
 	if (conn-result)
 		PQclear(conn-result);
 	conn-result = NULL;
-	conn-curTuple = NULL;
 }
 
 /*
@@ -756,7 +756,6 @@ pqPrepareAsyncResult(PGconn *conn)
 	 */
 	res = conn-result;
 	conn-result = NULL;		/* handing over ownership to caller */
-	conn-curTuple = NULL;		/* just in case */
 	if (!res)
 		res = PQmakeEmptyPGresult(conn, PGRES_FATAL_ERROR);
 	else
@@ -828,6 +827,71 @@ pqInternalNotice(const PGNoticeHooks *hooks, const char *fmt,...)
 }
 
 /*
+ * PQsetRowProcessor
+ *   Set function that copies column data out from network buffer.
+ */

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-05 Thread Kyotaro HORIGUCHI
Hello, I'm sorry for the absence.

 But it's broken in V3 protocol - getAnotherTuple() will be called
 only if the packet is fully read.  If the packet contents do not
 agree with packet header, it's protocol error.  Only valid EOF
 return in V3 getAnotherTuple() is when row processor asks
 for early exit.

 The original code of getAnotherTuple returns EOF when the bytes to
be read are not fully loaded. I understand that this was
inappropriately (redundant checks?) written, at least for the
pqGetInt() for the field length in getAnotherTuple.  But I don't
understand how to ensure the rows (or table data) are fully loaded at
the point getAnotherTuple is called...

Nevertheless the first pqGetInt() can return EOF when the
previous row is fully loaded but the next row is not loaded, so
the EOF return seems necessary even if each row will be passed
only after being fully loaded.

 * Convert old EOFs to protocol errors in V3 getAnotherTuple()

Ok. I will do that.

 * V2 getAnotherTuple() can leak PGresult when handling custom
   error from row processor.

mmm. I will confirm it.

 * remove pqIsnonblocking(conn) check when row processor returned 2.
   I missed that it's valid to call PQisBusy/PQconsumeInput/PQgetResult
   on sync connection.

mmm. EOF from getAnotherTuple makes PQgetResult try further
reading until asyncStatus != PGASYNC_BUSY, as far as I saw. And it
seemed to do so when I tried to remove 'return 2'. I think that
at least one additional state for asyncStatus is needed to make
EOF work as desired here.


 * It seems the return codes from callback should be remapped,
   (0, 1, 2) is unusual pattern.  Better would be:
 
-1 - error
 0 - stop parsing / early exit (I'm not done yet)
 1 - OK (I'm done with the row)

I almost agree with it. I will consider the suggestion related to
pqAddRow together.


 * Please drop PQsetRowProcessorErrMsg() / PQresultSetErrMsg().
   Main problem is that it needs to be synced with error handling
   in rest of libpq, which is unlike the rest of row processor patch,
   which consists only of local changes.  All solutions here
   are either ugly hacks or too complex to be part of this patch.

Ok, I will take your advice. 

 Also considering that we have working exceptions and PQgetRow,
 I don't see much need for custom error messages.  If really needed,
 it should be introduced as separate patch, as the area of code it
 affects is completely different.

I agree with it.

 Currently the custom error messaging seems to be the blocker for
 this patch, because of raised complexity when implementing it and
 when reviewing it.  Considering how unimportant the provided
 functionality is, compared to rest of the patch, I think we should
 simply drop it.

Ok.

 My suggestion - check in getAnotherTuple whether resultStatus is
 already error and do nothing then.  This allows internal pqAddRow
 to set regular out of memory error.  Otherwise give generic
 row processor error.

regards, 

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-03-01 Thread Marko Kreen
On Tue, Feb 28, 2012 at 05:04:44PM +0900, Kyotaro HORIGUCHI wrote:
  There is still one EOF in v3 getAnotherTuple() -
  pqGetInt(tupnfields), please turn that one also to
  protocolerror.
 
 pqGetInt() returns EOF only when it wants additional reading from
 network if the parameter `bytes' is appropreate. Non-zero return
 from it seems should be handled as EOF, not a protocol error. The
 one point I had modified bugilly is also restored. The so-called
 'protocol error' has been vanished eventually.

But it's broken in the V3 protocol - getAnotherTuple() will be called
only if the packet is fully read.  If the packet contents do not
agree with the packet header, it's a protocol error.  The only valid EOF
return in V3 getAnotherTuple() is when the row processor asks
for early exit.

 Is there someting left undone?

* Convert old EOFs to protocol errors in V3 getAnotherTuple()

* V2 getAnotherTuple() can leak PGresult when handling custom
  error from row processor.

* remove pqIsnonblocking(conn) check when row processor returned 2.
  I missed that it's valid to call PQisBusy/PQconsumeInput/PQgetResult
  on sync connection.

* It seems the return codes from callback should be remapped,
  (0, 1, 2) is unusual pattern.  Better would be:

   -1 - error
0 - stop parsing / early exit (I'm not done yet)
1 - OK (I'm done with the row)

* Please drop PQsetRowProcessorErrMsg() / PQresultSetErrMsg().
  The main problem is that it needs to be synced with error handling
  in the rest of libpq, unlike the rest of the row processor patch,
  which consists only of local changes.  All solutions here
  are either ugly hacks or too complex to be part of this patch.

Also considering that we have working exceptions and PQgetRow,
I don't see much need for custom error messages.  If really needed,
it should be introduced as separate patch, as the area of code it
affects is completely different.

Currently the custom error messaging seems to be the blocker for
this patch, because of raised complexity when implementing it and
when reviewing it.  Considering how unimportant the provided
functionality is, compared to rest of the patch, I think we should
simply drop it.

My suggestion - check in getAnotherTuple whether resultStatus is
already error and do nothing then.  This allows internal pqAddRow
to set regular out of memory error.  Otherwise give generic
row processor error.

 By the way, I noticed that dblink always says that the current
 connection is 'unnamed' in messages the errors in
 dblink_record_internal@dblink.  I could see that
 dblink_record_internal defines the local variable conname = NULL
 and pass it to dblink_res_error to display the error message. But
 no assignment on it in the function.
 
 It seemed properly shown when I added the code to set conname
 from PG_GETARG_TEXT_PP(0) if available, in other words do that
 just after DBLINK_GET_CONN/DBLINK_GET_NAMED_CONN's. It seems the
 dblink's manner...  This is not included in this patch.
 
 Furthurmore dblink_res_error looks only into returned PGresult to
 display the error and always says only `Error occurred on dblink
 connection..: could not execute query'..
 
 Is it right to consider this as follows?
 
  - dblink is wrong in error handling. A client of libpq should
see PGconn by PQerrorMessage() if (or regardless of whether?)
PGresult says nothing about error.

Yes, it seems like a bug.
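
A sketch of that fallback with stock libpq calls (the function name is
illustrative, not dblink's actual code): prefer the message attached to the
PGresult, but fall back to the connection-level message when the result
carries none.

    /* Sketch, stock libpq only: report the PGresult's message if it has one,
     * otherwise fall back to PQerrorMessage() on the connection. */
    #include "libpq-fe.h"

    static const char *
    best_error_message(PGconn *conn, const PGresult *res)
    {
        const char *msg = res ? PQresultErrorMessage(res) : NULL;

        if (msg == NULL || msg[0] == '\0')
            msg = PQerrorMessage(conn);
        return msg;
    }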

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-28 Thread Kyotaro HORIGUCHI
This is the new version of the patch.
It is not rebased to the HEAD because of a build error.

 It's better to restore old two-path error handling.

I restored the OOM and save-result routes. But it seems necessary to
get back some amount of memory on a REAL OOM, as the comment in the
original code says.

So I restored the meaning of rp == 0 && errMsg == NULL as REAL
OOM, which is to throw the async result away; the result will
be preserved if errMsg is not NULL. `unknown error' has been
removed.

As a result, if the row processor returns 0, the parser skips to the
end of the rows and returns the working result or an error result
according to whether errMsg was set in the row processor.


 I don't think that should be required.  Just use a dummy msg.

Considering the above, pqAddRow is also restored to leave errMsg
NULL on OOM.

 There is still one EOF in v3 getAnotherTuple() -
 pqGetInt(tupnfields), please turn that one also to
 protocolerror.

pqGetInt() returns EOF only when it wants additional reading from the
network, provided the parameter `bytes' is appropriate. A non-zero return
from it seems like it should be handled as EOF, not a protocol error. The
one point I had modified badly is also restored. The so-called
'protocol error' has eventually vanished.

 Instead use (%s, errmsg) as argument there.  libpq code
 is noisy enough, no need to add more.

done

Is there something left undone?


By the way, I noticed that dblink always says that the current
connection is 'unnamed' in the error messages from
dblink_record_internal@dblink.  I could see that
dblink_record_internal defines the local variable conname = NULL
and passes it to dblink_res_error to display the error message, but
there is no assignment to it in the function.

It was shown properly when I added code to set conname
from PG_GETARG_TEXT_PP(0) if available, in other words doing that
just after the DBLINK_GET_CONN/DBLINK_GET_NAMED_CONN macros. That seems
to be dblink's manner...  This is not included in this patch.

Furthermore, dblink_res_error looks only into the returned PGresult to
display the error and always says only `Error occurred on dblink
connection..: could not execute query'.

Is it right to consider this as follows?

 - dblink is wrong in its error handling. A client of libpq should
   look at the PGconn via PQerrorMessage() if (or regardless of whether?)
   the PGresult says nothing about the error.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 1af8df6..239edb8 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -160,3 +160,7 @@ PQconnectStartParams  157
 PQping  158
 PQpingParams  159
 PQlibVersion  160
+PQsetRowProcessor	  161
+PQgetRowProcessor	  162
+PQresultSetErrMsg	  163
+PQskipResult		  164
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 27a9805..4605e49 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2693,6 +2693,9 @@ makeEmptyPGconn(void)
 	conn->wait_ssl_try = false;
 #endif
 
+	/* set default row processor */
+	PQsetRowProcessor(conn, NULL, NULL);
+
 	/*
 	 * We try to send at least 8K at a time, which is the usual size of pipe
 	 * buffers on Unix systems.  That way, when we are sending a large amount
@@ -2711,8 +2714,13 @@ makeEmptyPGconn(void)
 	initPQExpBuffer(&conn->errorMessage);
 	initPQExpBuffer(&conn->workBuffer);
 
+	/* set up initial row buffer */
+	conn->rowBufLen = 32;
+	conn->rowBuf = (PGrowValue *)malloc(conn->rowBufLen * sizeof(PGrowValue));
+
 	if (conn->inBuffer == NULL ||
 		conn->outBuffer == NULL ||
+		conn->rowBuf == NULL ||
 		PQExpBufferBroken(&conn->errorMessage) ||
 		PQExpBufferBroken(&conn->workBuffer))
 	{
@@ -2814,6 +2822,8 @@ freePGconn(PGconn *conn)
 		free(conn->inBuffer);
 	if (conn->outBuffer)
 		free(conn->outBuffer);
+	if (conn->rowBuf)
+		free(conn->rowBuf);
 	termPQExpBuffer(&conn->errorMessage);
 	termPQExpBuffer(&conn->workBuffer);
 
@@ -5078,3 +5088,4 @@ PQregisterThreadLock(pgthreadlock_t newhandler)
 
 	return prev;
 }
+
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index b743566..ce58778 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -66,6 +66,7 @@ static PGresult *PQexecFinish(PGconn *conn);
 static int PQsendDescribe(PGconn *conn, char desc_type,
 			   const char *desc_target);
 static int	check_field_number(const PGresult *res, int field_num);
+static int	pqAddRow(PGresult *res, PGrowValue *columns, void *param);
 
 
 /* 
@@ -701,7 +702,6 @@ pqClearAsyncResult(PGconn *conn)
 	if (conn->result)
 		PQclear(conn->result);
 	conn->result = NULL;
-	conn->curTuple = NULL;
 }
 
 /*
@@ -756,7 +756,6 @@ pqPrepareAsyncResult(PGconn *conn)
 	 */
 	res = conn->result;
 	conn->result = NULL;		/* handing over ownership to caller */
-	conn->curTuple = NULL;		/* just in case */
 	if (!res)
 		res = PQmakeEmptyPGresult(conn, 

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-27 Thread Kyotaro HORIGUCHI
Hello, 

 On OOM, the old result is freed to have higher chance that
 constructing new result succeeds.  But if we want to transport
 error message, we need to keep old PGresult around.  Thus
 two separate paths.

Ok, I understood the aim.  But now we can use a non-local exit to do
that not only for asynchronous reading (PQgetResult()) but also for
synchronous reading (PQexec()).  If we provide a means other than
exceptions to do that, I think it should be usable for both
synchronous and asynchronous reading.  conn->asyncStatus seems
usable for this purpose.

How about the modification below?

- getAnotherTuple() now returns 0 to continue as before, and 1
  instead of EOF to signal EOF state, and 2 to instruct to exit
  immediately.

- pqParseInput3 set conn-asyncStatus to PGASYNC_BREAK for the
  last case,

- then PQgetResult() returns immediately when
  asyncStatus == PGASYNC_TUPLES_BREAK after parseInput() returns.

- and PQexecFinish() returns immediately if PQgetResult() returns
  with asyncStatus == PGASYNC_TUPLES_BREAK.

- PGgetResult() sets asyncStatus = PGRES_TUPLES_OK if called with
  asyncStatus == PGRES_TUPLES_BREAK

- A new libpq API, PQisBreaked(PGconn*), which returns whether
  asyncStatus == PGRES_TUPLES_BREAK, can be used to check whether the
  transfer was broken off.

 Instead use (%s, errmsg) as argument there.  libpq code
 is noisy enough, no need to add more.

Ok. I will do so.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-27 Thread Marko Kreen
On Mon, Feb 27, 2012 at 05:20:30PM +0900, Kyotaro HORIGUCHI wrote:
 Hello, 
 
  On OOM, the old result is freed to have higher chance that
  constructing new result succeeds.  But if we want to transport
  error message, we need to keep old PGresult around.  Thus
  two separate paths.
 
 Ok, I understood the aim. But now we can use non-local exit to do
 that for not only asynchronous reading (PQgetResult()) but
 synchronous (PQexec()). If we should provide a means other than
 exceptions to do that, I think it should be usable for both
 syncronous and asynchronous reading. conn-asyncStatus seems to
 be used for the case.
 
 Wow is the modification below?
 
 - getAnotherTuple() now returns 0 to continue as before, and 1
   instead of EOF to signal EOF state, and 2 to instruct to exit
   immediately.
 
 - pqParseInput3 set conn-asyncStatus to PGASYNC_BREAK for the
   last case,
 
 - then PQgetResult() returns immediately when
   asyncStatus == PGASYNC_TUPLES_BREAK after parseInput() retunes.
 
 - and PQexecFinish() returns immediately if PQgetResult() returns
   with aysncStatys == PGASYNC_TUPLES_BREAK.
 
 - PGgetResult() sets asyncStatus = PGRES_TUPLES_OK if called with
   asyncStatus == PGRES_TUPLES_BREAK
 
 - New libpq API PQisBreaked(PGconn*) returns if asyncStatus ==
   PGRES_TUPLES_BREAK can be used to check if the transfer is breaked.

I don't get where you are going with such changes.  Are you trying to
make V2 code survive row-processor errors?  (Only V2 has concept of
EOF state.)  Then forget about it.  It's more important to not
destabilize V3 code.

And error from row processor is not something special from other errors.
Why does it need special state?  And note that adding new PGASYNC or
PGRES state needs review of all if() and switch() statements in the
code that uses those fields...  So there must be a serious need for it.
At the moment I don't see such need.

I just asked you to replace ->rowProcessorErrMsg with ->errMsg
to get rid of the unnecessary field.

But if you want to do bigger cleanup, then you can instead make
PQsetRowProcessorErrMsg() more generic:

  PQsetErrorMessage(PGconn *conn, const char *msg)

Current code has the tiny problem that it adds new special state between
PQsetRowProcessorErrMsg() and return from callback to getAnotherTuple()
where getAnotherTuple() sets actual error state.  When the callback
exits via exception, the state can leak to other code.  It should not
break anything, but it's still annoying.

Also, with the PQgetRow() patch, the need for doing complex processing
under callback has decreased and the need to set error outside callback
has increased.

As a bonus, such generic error-setting function would lose any extra
special state introduced by row-processor patch.

Previously I mentioned that callback would need to have additional
PGconn* argument to make connection available to callback to use such
generic error setting function, but now I think it does not need it -
simple callbacks don't need to set errors and complex callback can make
the PGconn available via Param.  Eg. the internal callback should set
Param to PGconn, instead of keeping NULL there.
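
A rough sketch of that idea, using the PQsetRowProcessor() signature
from the patch; PQsetErrorMessage() here is only the generic setter
proposed above (it does not exist), and handle_row() is made-up user
code:

static int
my_row_proc(PGresult *res, PGrowValue *columns, void *param)
{
    PGconn *conn = param;           /* Param was set to the connection */

    if (!handle_row(res, columns))
    {
        /* proposed generic error setter, not an existing libpq call */
        PQsetErrorMessage(conn, "row handling failed");
        return 0;                   /* abort this resultset */
    }
    return 1;
}

/* register with the connection itself as the callback param */
PQsetRowProcessor(conn, my_row_proc, conn);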

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-27 Thread Kyotaro HORIGUCHI
Hello.

I will show you fixed version patch later, please wait for a
while.

==
 It's more important to not destabilize V3 code.

Ok, I withdraw it, agreeing with that point.  And I noticed that
my earlier proposal is total nonsense because I mixed up
asyncStatus with resultStatus in it.

 And error from row processor is not something special from
 other errors.  Why does it need special state?

I'm sorry to have confused the discussion.  What I wanted there was a
means other than exceptions to exit from PQexec(), triggered by the
row processor, without discarding the result built halfway, as in the
async case.

 I just asked you to replace ->rowProcessorErrMsg with ->errMsg
 to get rid of unnecessary field.

Ok, I will remove extra code.

 Also, with the PQgetRow() patch, the need for doing complex processing
 under callback has decreased and the need to set error outside callback
 has increased.
 
 As a bonus, such generic error-setting function would lose any extra
 special state introduced by row-processor patch.

That sounds nice. I will show you the patch without it for the
present, then try to include.

 Previously I mentioned that callback would need to have additional
 PGconn* argument to make connection available to callback to use such
 generic error setting function, but now I think it does not need it -
 simple callbacks don't need to set errors and complex callback can make
 the PGconn available via Param.  Eg. the internal callback should set
 Param to PGconn, instead keeping NULL there.

I agree with it.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-26 Thread Marko Kreen
On Fri, Feb 24, 2012 at 05:46:16PM +0200, Marko Kreen wrote:
 - rename to PQrecvRow() and additionally provide PQgetRow()

I tried it and it seems to work as API - there is valid behaviour
for both sync and async connections.

Sync connection - PQgetRow() waits for data from network:

    if (!PQsendQuery(db, q))
        die(db, PQsendQuery);
    while (1) {
        r = PQgetRow(db);
        if (!r)
            break;
        handle(r);
        PQclear(r);
    }
    r = PQgetResult(db);

Async connection - PQgetRow() does PQisBusy() loop internally,
but does not read from network:

    on read event:
        PQconsumeInput(db);
        while (1) {
            r = PQgetRow(db);
            if (!r)
                break;
            handle(r);
            PQclear(r);
        }
        if (!PQisBusy(db))
            r = PQgetResult(db);
        else
            waitForMoredata();


As it seems to simplify life for quite many potential users,
it seems worth including in libpq properly.

Attached patch is on top of v20120223 of row-processor
patch.  Only change in general code is allowing
early exit for synchronous connections, as we now have
valid use-case for it.

-- 
marko

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 0087b43..b2779a8 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -4115,6 +4115,111 @@ int PQflush(PGconn *conn);
read-ready and then read the response as described above.
   /para
 
+  para
+   Above-mentioned functions always wait until the full resultset has arrived
+   before making row data available as PGresult.  Sometimes it's
+   more useful to process rows as soon as they arrive from the network.
+   For that, the following functions can be used:
+   variablelist
+varlistentry id=libpq-pqgetrow
+ term
+  functionPQgetRow/function
+  indexterm
+   primaryPQgetRow/primary
+  /indexterm
+ /term
+
+ listitem
+  para
+   Waits for the next row from a prior
+   functionPQsendQuery/function,
+   functionPQsendQueryParams/function,
+   functionPQsendQueryPrepared/function call, and returns it.
+   A null pointer is returned when no more rows are available or
+   some error happened.
+synopsis
+PGresult *PQgetRow(PGconn *conn);
+/synopsis
+  /para
+
+  para
+   If this function returns non-NULL result, it is a
+   structnamePGresult/structname that contains exactly 1 row.
+   It needs to be freed later with functionPQclear/function.
+  /para
+  para
+   On synchronous connection, the function will wait for more
+   data from network until the whole resultset is done.  So it returns
+   NULL only if the resultset has been completely received or some error
+   happened.  In both cases, call functionPQgetResult/function
+   next to get final status.
+  /para
+
+  para
+   On asynchronous connection the function does not read more data
+   from network.   So after NULL call functionPQisBusy/function
+   to see whether the final structnamePGresult/structname is available
+   or more data needs to be read from network via
+   functionPQconsumeInput/function.  Do not call
+   functionPQisBusy/function before functionPQgetRow/function
+   has returned NULL, as functionPQisBusy/function will parse
+   any available rows and add them to main functionPGresult/function
+   that will be returned later by functionPQgetResult/function.
+  /para
+
+ /listitem
+/varlistentry
+
+varlistentry id=libpq-pqrecvrow
+ term
+  functionPQrecvRow/function
+  indexterm
+   primaryPQrecvRow/primary
+  /indexterm
+ /term
+
+ listitem
+  para
+   Get row data without constructing PGresult for it.  This is the
+   underlying function for functionPQgetRow/function.
+synopsis
+int PQrecvRow(PGconn *conn, PGresult **hdr_p, PGrowValue **row_p);
+/synopsis
+  /para
+
+  para
+   It returns row data as pointers to network buffer.
+   All structures are owned by applicationlibpq/application's
+   structnamePGconn/structname and must not be freed or stored
+   by user.  Instead row data should be copied to user structures, before
+   any applicationlibpq/application result-processing function
+   is called.
+  /para
+  para
+   It returns 1 when row data is available.
+   Argument parameterhdr_p/parameter will contain pointer
+   to empty structnamePGresult/structname that describes
+   row contents.  Actual data is in parameterrow_p/parameter.
+   For the description of structure structnamePGrowValue/structname
+   see xref linkend=libpq-altrowprocessor.
+  /para
+  paraIt returns 0 when no more rows are available.  On synchronous
+   connection, it means resultset is fully arrived.  Call
+   functionPQgetResult/function to get final 

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-24 Thread Kyotaro HORIGUCHI
Hello, this is new version of the patch.

  By the way, I would like to ask you one question. What is the
  reason why void* should be head or tail of the parameter list?
 
 Aesthetical reasons:

 I got it. Thank you.

 Last comment - if we drop the plan to make PQsetRowProcessorErrMsg()
 usable outside of handler, we can simplify internal usage as well:
 the PGresult->rowProcessorErrMsg can be dropped and let's use
 ->errMsg to transport the error message.
 
 The PGresult is long-lived structure and adding fields for such
 temporary usage feels wrong.  There is no other libpq code between
 row processor and getAnotherTuple, so the use is safe.

I almost agree with it.  Plus, I think there is no reason to treat
out of memory as a special case, because the row processor is now
generic.  But the previous patch handles OOM differently from other
errors, and I couldn't see an obvious reason to do so.

- PGresult.rowProcessorErrMsg is removed; PGconn.errMsg is used
  instead, with the new interface function PQresultSetErrMsg().

- Now row processors should set a message for any error that occurs
  within them.  pqAddRow and dblink.c are modified to do so.

- getAnotherTuple() sets the error message `unknown error' for
  the condition rp == 0 && ->errMsg == NULL.

- Always forward the input cursor and do pqClearAsyncResult() and
  pqSaveErrorResult() when rp == 0 in getAnotherTuple(), regardless
  of whether ->errMsg is NULL or not, in fe-protocol3.c.

- conn->inStart is already moved to the start of the next
  message when the row processor is called.  So forwarding inStart in
  outOfMemory1 seems redundant.  I removed it.

- printfPQExpBuffer() complains about a variable message.  So use
  resetPQExpBuffer() and appendPQExpBufferStr() instead.
  
=
- dblink did not use conn->errorMessage before, and still does not;
  all errors are displayed as `Error occurred on dblink connection...'.

- TODO: No NLS messages for error messages.

- Somehow make check yields error for base revision. So I have
  not done that.

- I have no idea how to do test for protocol 2...

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 1af8df6..239edb8 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -160,3 +160,7 @@ PQconnectStartParams  157
 PQping  158
 PQpingParams  159
 PQlibVersion  160
+PQsetRowProcessor	  161
+PQgetRowProcessor	  162
+PQresultSetErrMsg	  163
+PQskipResult		  164
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 27a9805..4605e49 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2693,6 +2693,9 @@ makeEmptyPGconn(void)
 	conn->wait_ssl_try = false;
 #endif
 
+	/* set default row processor */
+	PQsetRowProcessor(conn, NULL, NULL);
+
 	/*
 	 * We try to send at least 8K at a time, which is the usual size of pipe
 	 * buffers on Unix systems.  That way, when we are sending a large amount
@@ -2711,8 +2714,13 @@ makeEmptyPGconn(void)
 	initPQExpBuffer(&conn->errorMessage);
 	initPQExpBuffer(&conn->workBuffer);
 
+	/* set up initial row buffer */
+	conn->rowBufLen = 32;
+	conn->rowBuf = (PGrowValue *)malloc(conn->rowBufLen * sizeof(PGrowValue));
+
 	if (conn->inBuffer == NULL ||
 		conn->outBuffer == NULL ||
+		conn->rowBuf == NULL ||
 		PQExpBufferBroken(&conn->errorMessage) ||
 		PQExpBufferBroken(&conn->workBuffer))
 	{
@@ -2814,6 +2822,8 @@ freePGconn(PGconn *conn)
 		free(conn->inBuffer);
 	if (conn->outBuffer)
 		free(conn->outBuffer);
+	if (conn->rowBuf)
+		free(conn->rowBuf);
 	termPQExpBuffer(&conn->errorMessage);
 	termPQExpBuffer(&conn->workBuffer);
 
@@ -5078,3 +5088,4 @@ PQregisterThreadLock(pgthreadlock_t newhandler)
 
 	return prev;
 }
+
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index b743566..7fd3c9c 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -66,6 +66,7 @@ static PGresult *PQexecFinish(PGconn *conn);
 static int PQsendDescribe(PGconn *conn, char desc_type,
 			   const char *desc_target);
 static int	check_field_number(const PGresult *res, int field_num);
+static int	pqAddRow(PGresult *res, PGrowValue *columns, void *param);
 
 
 /* 
@@ -701,7 +702,6 @@ pqClearAsyncResult(PGconn *conn)
 	if (conn->result)
 		PQclear(conn->result);
 	conn->result = NULL;
-	conn->curTuple = NULL;
 }
 
 /*
@@ -756,7 +756,6 @@ pqPrepareAsyncResult(PGconn *conn)
 	 */
 	res = conn->result;
 	conn->result = NULL;		/* handing over ownership to caller */
-	conn->curTuple = NULL;		/* just in case */
 	if (!res)
 		res = PQmakeEmptyPGresult(conn, PGRES_FATAL_ERROR);
 	else
@@ -828,6 +827,93 @@ pqInternalNotice(const PGNoticeHooks *hooks, const char *fmt,...)
 }
 
 /*
+ * PQsetRowProcessor
+ *   Set function that copies column data out from network buffer.
+ */
+void
+PQsetRowProcessor(PGconn *conn, PQrowProcessor func, void *param)
+{
+	

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-24 Thread Marko Kreen
On Fri, Feb 24, 2012 at 07:53:14PM +0900, Kyotaro HORIGUCHI wrote:
 Hello, this is new version of the patch.
 
   By the way, I would like to ask you one question. What is the
   reason why void* should be head or tail of the parameter list?
  
  Aesthetical reasons:
 
  I got it. Thank you.
 
  Last comment - if we drop the plan to make PQsetRowProcessorErrMsg()
  usable outside of handler, we can simplify internal usage as well:
  the PGresult->rowProcessorErrMsg can be dropped and let's use
  ->errMsg to transport the error message.
  
  The PGresult is long-lived structure and adding fields for such
  temporary usage feels wrong.  There is no other libpq code between
  row processor and getAnotherTuple, so the use is safe.
 
 I almost agree with it. Plus, I think it is no reason to consider
 out of memory as particular because now row processor becomes
 generic. But the previous patch does different process for OOM
 and others, but I couldn't see obvious reason to do so.

On OOM, the old result is freed to have higher chance that
constructing new result succeeds.  But if we want to transport
error message, we need to keep old PGresult around.  Thus
two separate paths.

This also means your new code is broken, the errmsg becomes
invalid after pqClearAsyncResult().

It's better to restore old two-path error handling.

 - Now row processors should set message for any error status
   occurred within. pqAddRow and dblink.c is modified to do so.

I don't think that should be required.  Just use a dummy msg.

 - getAnotherTuple() sets the error message `unknown error' for
   the condition rp == 0  -errMsg == NULL.

Ok.  I think most clients will want to drop the connection
on an error from the rowproc, so the exact message does not matter.

 - Always forward input cursor and do pqClearAsyncResult() and
   pqSaveErrorResult() when rp == 0 in getAnotherTuple()
   regardless whether -errMsg is NULL or not in fe-protocol3.c.

Ok.  Although skipping a packet on OOM is dubious,
we will skip all further packets anyway, so let's be
consistent about problems.

There is still one EOF in v3 getAnotherTuple() - pqGetInt(tupnfields),
please turn that one also to protocolerror.

 - conn-inStart is already moved to the start point of the next
   message when row processor is called. So forwarding inStart in
   outOfMemory1 seems redundant. I removed it.
 
 - printfPQExpBuffer() compains for variable message. So use
   resetPQExpBuffer() and appendPQExpBufferStr() instead.

Instead use (%s, errmsg) as argument there.  libpq code
is noisy enough, no need to add more.

 - I have no idea how to do test for protocol 2...

I have an urge to test with rm fe-protocol2.c...

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-24 Thread Marko Kreen
On Tue, Feb 14, 2012 at 01:39:06AM +0200, Marko Kreen wrote:
 I tried imaging some sort of getFoo() style API for fetching in-flight
 row data, but I always ended up with rewrite libpq step, so I feel
 it's not productive to go there.
 
 Instead I added simple feature: rowProcessor can return '2',
 in which case getAnotherTuple() does early exit without setting
 any error state.  In user side it appears as PQisBusy() returned
 with TRUE result.  All pointers stay valid, so callback can just
 stuff them into some temp area.  ATM there is not indication though
 whether the exit was due to callback or other reasons, so user
 must detect it based on whether new temp pointers appeares,
 which means those must be cleaned before calling PQisBusy() again.
 This actually feels like feature, those must not stay around
 after single call.

To see how iterating a resultset would look like I implemented PQgetRow()
function using the currently available public API:

/*
 * Wait and return next row in resultset.
 *
 * returns:
 *   1 - row data available, the pointers are owned by PGconn
 *   0 - result done, use PQgetResult() to get final result
 *  -1 - some problem, check connection error
 */
int PQgetRow(PGconn *db, PGresult **hdr_p, PGrowValue **row_p);

code at:

  https://github.com/markokr/libpq-rowproc-demos/blob/master/getrow.c

usage:

/* send query */
if (!PQsendQuery(db, q))
    die(db, PQsendQuery);

/* fetch rows one-by-one */
while (1) {
    rc = PQgetRow(db, &hdr, &row);
    if (rc > 0)
        proc_row(hdr, row);
    else if (rc == 0)
        break;
    else
        die(db, streamResult);
}
/* final PGresult, either PGRES_TUPLES_OK or error */
final = PQgetResult(db);


It does not look like it can replace the public callback API,
because it does not work with fully-async connections well.
But it *does* continue the line of synchronous APIs:

- PQexec(): last result only
- PQsendQuery() + PQgetResult(): each result separately
- PQsendQuery() + PQgetRow() + PQgetResult(): each row separately

Also the generic implementation is slightly messy, because
it cannot assume anything about surrounding usage patterns,
while same code living in some user framework can.  But
for simple users who just want to synchronously iterate
over resultset, it might be good enough API?


It does have an inconsistency problem - the row data does
not live in a PGresult but in a custom container.  The proper
API pattern would be to have PQgetRow() give back a
functional PGresult, but that is not interesting for
high-performance users.  Solutions:

- rename to PQrecvRow()
- rename to PQrecvRow() and additionally provide PQgetRow()
- Don't bother, let users implement it themselves via callback API.

Comments?

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-23 Thread Kyotaro HORIGUCHI
Hello, this is new version of the patch.

# This patch is based on the commit
# 2bbd88f8f841b01efb073972b60d4dc1ff1f6fd0 @ Feb 13 to avoid the
# compile error caused by undeclared LEAKPROOF in kwlist.h.

 It must free the PGresults that PQgetResult() returns.

 I'm sorry. It slipped out of my mind. Add PQclear() for the
 return value.

 Also, please fix 2 issues mentined here:

- PQsetRowProcessorErrMsg() now handles msg as const string.

- Changed the order of the parameters of the type PQrowProcessor.
  New order is (PGresult *res, PGrowValue *columns, void *params).

# PQsetRowProcessorErrMsg outside of callback is not implemented.

- Documentation and dblink are modified according to the changes
  above.


By the way, I would like to ask you one question. What is the
reason why void* should be head or tail of the parameter list?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 1af8df6..7e02497 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -160,3 +160,7 @@ PQconnectStartParams  157
 PQping  158
 PQpingParams  159
 PQlibVersion  160
+PQsetRowProcessor	  161
+PQgetRowProcessor	  162
+PQsetRowProcessorErrMsg	  163
+PQskipResult		  164
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 27a9805..4605e49 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2693,6 +2693,9 @@ makeEmptyPGconn(void)
 	conn->wait_ssl_try = false;
 #endif
 
+	/* set default row processor */
+	PQsetRowProcessor(conn, NULL, NULL);
+
 	/*
 	 * We try to send at least 8K at a time, which is the usual size of pipe
 	 * buffers on Unix systems.  That way, when we are sending a large amount
@@ -2711,8 +2714,13 @@ makeEmptyPGconn(void)
 	initPQExpBuffer(&conn->errorMessage);
 	initPQExpBuffer(&conn->workBuffer);
 
+	/* set up initial row buffer */
+	conn->rowBufLen = 32;
+	conn->rowBuf = (PGrowValue *)malloc(conn->rowBufLen * sizeof(PGrowValue));
+
 	if (conn->inBuffer == NULL ||
 		conn->outBuffer == NULL ||
+		conn->rowBuf == NULL ||
 		PQExpBufferBroken(&conn->errorMessage) ||
 		PQExpBufferBroken(&conn->workBuffer))
 	{
@@ -2814,6 +2822,8 @@ freePGconn(PGconn *conn)
 		free(conn->inBuffer);
 	if (conn->outBuffer)
 		free(conn->outBuffer);
+	if (conn->rowBuf)
+		free(conn->rowBuf);
 	termPQExpBuffer(&conn->errorMessage);
 	termPQExpBuffer(&conn->workBuffer);
 
@@ -5078,3 +5088,4 @@ PQregisterThreadLock(pgthreadlock_t newhandler)
 
 	return prev;
 }
+
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index b743566..cd287cd 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -66,6 +66,7 @@ static PGresult *PQexecFinish(PGconn *conn);
 static int PQsendDescribe(PGconn *conn, char desc_type,
 			   const char *desc_target);
 static int	check_field_number(const PGresult *res, int field_num);
+static int	pqAddRow(PGresult *res, PGrowValue *columns, void *param);
 
 
 /* 
@@ -160,6 +161,7 @@ PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
 	result->curBlock = NULL;
 	result->curOffset = 0;
 	result->spaceLeft = 0;
+	result->rowProcessorErrMsg = NULL;
 
 	if (conn)
 	{
@@ -701,7 +703,6 @@ pqClearAsyncResult(PGconn *conn)
 	if (conn->result)
 		PQclear(conn->result);
 	conn->result = NULL;
-	conn->curTuple = NULL;
 }
 
 /*
@@ -756,7 +757,6 @@ pqPrepareAsyncResult(PGconn *conn)
 	 */
 	res = conn->result;
 	conn->result = NULL;		/* handing over ownership to caller */
-	conn->curTuple = NULL;		/* just in case */
 	if (!res)
 		res = PQmakeEmptyPGresult(conn, PGRES_FATAL_ERROR);
 	else
@@ -828,6 +828,87 @@ pqInternalNotice(const PGNoticeHooks *hooks, const char *fmt,...)
 }
 
 /*
+ * PQsetRowProcessor
+ *   Set function that copies column data out from network buffer.
+ */
+void
+PQsetRowProcessor(PGconn *conn, PQrowProcessor func, void *param)
+{
+	conn->rowProcessor = (func ? func : pqAddRow);
+	conn->rowProcessorParam = param;
+}
+
+/*
+ * PQgetRowProcessor
+ *   Get current row processor of conn. set pointer to current parameter for
+ *   row processor to param if not NULL.
+ */
+PQrowProcessor
+PQgetRowProcessor(PGconn *conn, void **param)
+{
+	if (param)
+		*param = conn->rowProcessorParam;
+
+	return conn->rowProcessor;
+}
+
+/*
+ * PQsetRowProcessorErrMsg
+ *Set the error message pass back to the caller of RowProcessor.
+ *
+ *You can replace the previous message by alternative mes, or clear
+ *it with NULL.
+ */
+void
+PQsetRowProcessorErrMsg(PGresult *res, const char *msg)
+{
+	if (msg)
+		res->rowProcessorErrMsg = pqResultStrdup(res, msg);
+	else
+		res->rowProcessorErrMsg = NULL;
+}
+
+/*
+ * pqAddRow
+ *	  add a row to the PGresult structure, growing it if necessary
+ *	  Returns TRUE if OK, FALSE if not enough memory to add the row.
+ */
+static int
+pqAddRow(PGresult *res, PGrowValue *columns, void *param)
+{
+	

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-23 Thread Marko Kreen
On Thu, Feb 23, 2012 at 07:14:03PM +0900, Kyotaro HORIGUCHI wrote:
 Hello, this is new version of the patch.

Looks good.

 By the way, I would like to ask you one question. What is the
 reason why void* should be head or tail of the parameter list?

Aesthetical reasons:

1) PGresult and PGrowValue belong together.

2) void* is probably the context object for handler.  When doing
   object-oriented programming in C the main object is usually first.
   Like libpq does - PGconn is always first argument.

But as libpq does not know the actual meaning of void* for the handler,
it can be the last param as well.

When I wrote the demo code, I noticed that it is unnatural to have
void* in the middle.


Last comment - if we drop the plan to make PQsetRowProcessorErrMsg()
usable outside of handler, we can simplify internal usage as well:
the PGresult->rowProcessorErrMsg can be dropped and let's use
->errMsg to transport the error message.

The PGresult is long-lived structure and adding fields for such
temporary usage feels wrong.  There is no other libpq code between
row processor and getAnotherTuple, so the use is safe.

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-21 Thread Kyotaro HORIGUCHI
Hello,

  I don't have any attachment to PQskipRemainingResults(), but I
  think that some formal method to skip until Command-Complete('C')
  without breaking session is necessary because current libpq does
  so.
 
 We have such function: PQgetResult().  Only question is how
 to flag that the rows should be dropped.

I agree with it.  I did this via conn->result->resultStatus ==
PGRES_FATAL_ERROR, which instructs pqParseInput[23]() to skip
calling getAnotherTuple(), but another means to do the same thing
without marking an error is needed.

 Also, as row handler is on connection, then changing it's
 behavior is connection context, not result.

OK, the current implementation copies the pointer to the row processor
from PGconn to PGresult, and getAnotherTuple() uses the one on
PGresult to avoid unintended replacement of the row processor by
PQsetRowProcessor().  I understand that the row processor setting
should be in the PGconn context and that a change by PQsetRowProcessor()
should become effective immediately.  Is that right?

 Ok, lets see how it looks.  But please do it like this:
 
 - PQgetRowProcessor() that returns existing row processor.

This could also be done via the return value of
PQsetRowProcessor() (currently void).  Anyway, I will provide this
function and also change the return value of PQsetRowProcessor().

 - PQskipResult() that just replaces row processor, then calls
   PQgetResult() to eat the result.  It's behaviour fully
   matches PQgetResult() then.

There seem to be two choices: one is that PQskipResult()
replaces the row processor with a NULL pointer and
getAnotherTuple() calls the row processor only if it is not NULL; the
other is that PQskipResult() sets an empty function as the row
processor.  I do the latter for the present.

This approach makes needless calls of getAnotherTuple().  Checking whether
the pointer to the row processor is NULL in pqParseInput[23]() could
avoid these extra calls and may reduce the time for skipping, but I
think there is no need to do so for such rare cases.

 I guess the main thing that annoys me with skipping is that
 it would require additional magic flag inside libpq.
 With PQgetRowProcessor() it does not need anymore, it's
 just a helper function that user can implement as well.

Ok.

 Only question here is what should happen if there are
 several incoming resultsets (several queries in PQexec).
 Should we leave to user to loop over them?
 
 Seems there is 2 approaches here:
 
 1) int PQskipResult(PGconn)
 2) int PQskipResult(PGconn, int skipAll)
 
 Both cases returning:
 1 - got resultset, there might be more
 0 - PQgetResult() returned NULL, connection is empty
-1 - error
 
 Although 1) mirrors regular PGgetResult() better, most users
 might not know that function as they are using sync API.
 They have simpler time with 2).  So 2) then?

Let me confirm the effects of this function. Is the below
description right?

- PQskipResult(conn, false) makes just following PQgetResult() to
  skip current bunch of rows and the consequent PQgetResult()'s
  gathers rows as usual.

- PQskipResult(conn, true) makes all consequent PQgetResult()'s
  to skip all the rows.

If this is right, row processor should stay also in PGresult
context. PQskipResult() replaces the row processor in PGconn when
the second parameter is true, and in PGresult for false.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-21 Thread Kyotaro HORIGUCHI
Sorry, I should fix a wrong word selection..

 Let me confirm the effects of this function. Is the below
 description right?
 
 - PQskipResult(conn, false) makes just following PQgetResult() to
   skip current bunch of rows and the consequent PQgetResult()'s
   gathers rows as usual.

- PQskipResult(conn, false) makes just following PQgetResult() to
  skip current bunch of rows and the succeeding PQgetResult()'s
  gathers rows as usual.

 - PQskipResult(conn, true) makes all consequent PQgetResult()'s
   to skip all the rows.

- PQskipResult(conn, true) makes all succeeding PQgetResult()'s
  to skip all the rows.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-21 Thread Marko Kreen
On Tue, Feb 21, 2012 at 11:42 AM, Kyotaro HORIGUCHI
horiguchi.kyot...@oss.ntt.co.jp wrote:
 Also, as row handler is on connection, then changing it's
 behavior is connection context, not result.

 OK, current implement copying the pointer to the row processor
 from PGconn to PGresult and getAnotherTuple() using it on
 PGresult to avoid unintended replacement of row processor by
 PQsetRowProcessor(), and I understand that row processor setting
 should be in PGconn context and the change by PGsetRowProcessor()
 should immediately become effective. That's right?

Note I dropped the row processor from under PGresult.
Please don't put it back there.

 Ok, lets see how it looks.  But please do it like this:

 - PQgetRowProcessor() that returns existing row processor.

 This seems also can be done by the return value of
 PQsetRowProcessor() (currently void). Anyway, I provide this
 function and also change the return value of PQsetRowProcessor().

Note you need processorParam as well.
I think it's simpler to rely on PQgetRowProcessor().

 - PQskipResult() that just replaces row processor, then calls
   PQgetResult() to eat the result.  It's behaviour fully
   matches PQgetResult() then.

 There seems to be two choices, one is that PQskipResult()
 replaces the row processor with NULL pointer and
 getAnotherTuple() calls row processor if not NULL. And the
 another is PQskipResult() sets the empty function as row
 processor. I do the latter for the present.

Yes, let's avoid NULLs.

 This approach does needless call of getAnotherTuple(). Seeing if
 the pointer to row processor is NULL in pqParseInput[23]() could
 avoid this extra calls and may reduce the time for skip, but I
 think it is no need to do so for such rare cases.

We should optimize for common case, which is non-skipping
row processor.

 Only question here is what should happen if there are
 several incoming resultsets (several queries in PQexec).
 Should we leave to user to loop over them?

 Seems there is 2 approaches here:

 1) int PQskipResult(PGconn)
 2) int PQskipResult(PGconn, int skipAll)

 Both cases returning:
     1 - got resultset, there might be more
     0 - PQgetResult() returned NULL, connection is empty
    -1 - error

 Although 1) mirrors regular PGgetResult() better, most users
 might not know that function as they are using sync API.
 They have simpler time with 2).  So 2) then?

 Let me confirm the effects of this function. Is the below
 description right?

 - PQskipResult(conn, false) makes just following PQgetResult() to
  skip current bunch of rows and the consequent PQgetResult()'s
  gathers rows as usual.

Yes.

 - PQskipResult(conn, true) makes all consequent PQgetResult()'s
  to skip all the rows.

 If this is right, row processor should stay also in PGresult
 context. PQskipResult() replaces the row processor in PGconn when
 the second parameter is true, and in PGresult for false.

No, let's keep row processor only under PGconn.

-- 
marko



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-21 Thread Kyotaro HORIGUCHI
Hello,

 Note I dropped the row processor from under PGresult.
 Please don't put it back there.

I overlooked that. I understand it.

  This seems also can be done by the return value of
  PQsetRowProcessor() (currently void). Anyway, I provide this
  function and also change the return value of PQsetRowProcessor().
 
 Note you need processorParam as well.
 I think it's simpler to rely on PQgetProcessor()

Hmm. Ok.

  Let me confirm the effects of this function. Is the below
  description right?
 
  - PQskipResult(conn, false) makes just following PQgetResult() to
   skip current bunch of rows and the consequent PQgetResult()'s
   gathers rows as usual.
 
 Yes.
 
  - PQskipResult(conn, true) makes all consequent PQgetResult()'s
   to skip all the rows.

Well, Is this right?

  If this is right, row processor should stay also in PGresult
  context. PQskipResult() replaces the row processor in PGconn when
  the second parameter is true, and in PGresult for false.
 
 No, let's keep row processor only under PGconn.

Then, should I add a stash in PGconn for the row processor (and
for the param, if needed) so it can be restored afterwards?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-21 Thread Marko Kreen
On Tue, Feb 21, 2012 at 12:13 PM, Kyotaro HORIGUCHI
horiguchi.kyot...@oss.ntt.co.jp wrote:
  - PQskipResult(conn, true) makes all consequent PQgetResult()'s
   to skip all the rows.

 Well, Is this right?

Yes, call getResult() until it returns NULL.

  If this is right, row processor should stay also in PGresult
  context. PQskipResult() replaces the row processor in PGconn when
  the second parameter is true, and in PGresult for false.

 No, let's keep row processor only under PGconn.

 Then, Should I add the stash for the row processor (and needless
 for param) to recall after in PGconn?

PQskipResult:
- store old callback and param in local vars
- set do-nothing row callback
- call PQgetResult() once, or until it returns NULL
- restore old callback
- return 1 if last result was non-NULL, 0 otherwise
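
A rough sketch of PQskipResult() along those lines, using the
PQgetRowProcessor()/PQsetRowProcessor() calls from this patch (not
committed libpq API); error handling is omitted:

static int
discard_row(PGresult *res, PGrowValue *columns, void *param)
{
    return 1;                     /* accept and forget the row */
}

int
PQskipResult(PGconn *conn, int skipAll)
{
    PQrowProcessor oldproc;
    void       *oldparam;
    PGresult   *res = NULL;

    oldproc = PQgetRowProcessor(conn, &oldparam);   /* store old callback */
    PQsetRowProcessor(conn, discard_row, NULL);     /* do-nothing callback */

    do
    {
        if (res)
            PQclear(res);
        res = PQgetResult(conn);                    /* eat one resultset */
    } while (skipAll && res != NULL);

    PQsetRowProcessor(conn, oldproc, oldparam);     /* restore old callback */

    if (res)
    {
        PQclear(res);
        return 1;       /* got a resultset, there might be more */
    }
    return 0;           /* PQgetResult() returned NULL, connection is empty */
}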

-- 
marko



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-21 Thread Kyotaro HORIGUCHI
Thank you. Everything seems clear.
Please wait for a while.

 PQskipResult:
 - store old callback and param in local vars
 - set do-nothing row callback
 - call PQgetresu

 On Tue, Feb 21, 2012 at 12:13 PM, Kyotaro HORIGUCHI
 horiguchi.kyot...@oss.ntt.co.jp wrote:
   - PQskipResult(conn, true) makes all consequent PQgetResult()'s
to skip all the rows.
 
  Well, Is this right?

 Yes, call getResult() until it returns NULL.

   If this is right, row processor should stay also in PGresult
   context. PQskipResult() replaces the row processor in PGconn when
   the second parameter is true, and in PGresult for false.
 
  No, let's keep row processor only under PGconn.
 
  Then, Should I add the stash for the row processor (and needless
  for param) to recall after in PGconn?

 PQskipResult:
 - store old callback and param in local vars
 - set do-nothing row callback
 - call PQgetresult() once, or until it returns NULL
 - restore old callback
 - return 1 if last result was non-NULL, 0 otherwise

 --
 marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-21 Thread Marko Kreen
On Wed, Feb 22, 2012 at 12:27:57AM +0900, Kyotaro HORIGUCHI wrote:
 fe-exec.c
 - PQskipResult() is added instead of PGskipRemainigResults().

It must free the PGresults that PQgetResult() returns.

Also, please fix 2 issues mentined here:

  
http://archives.postgresql.org/message-id/CACMqXCLvpkjb9+c6sqJXitMHvrRCo+yu4q4bQ--0d7L=vw6...@mail.gmail.com

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-20 Thread Kyotaro HORIGUCHI
Hello, 

I don't have any attachment to PQskipRemainingResults(), but I
think that some formal method to skip until Command-Complete('C')
without breaking session is necessary because current libpq does
so.


 On Thu, Feb 16, 2012 at 02:24:19PM +0900, Kyotaro HORIGUCHI wrote:
  The choices of the libpq user on that point are,
  
  - Continue to read succeeding tuples.
  - Throw away the remaining results.
 
 There is also third choice, which may be even more popular than
 those ones - PQfinish().

That's it. I've implicitly assumed not to tear off the current
session.

 I think we already have such function - PQsetRowProcessor().
 Considering the user can use that to install skipping callback
 or simply set some flag in it's own per-connection state,

PQsetRowProcessor() sets the row processor not on the PGresult but on
the PGconn.  So using PQsetRowProcessor() makes no sense for the
PGresult on which the user is currently working.  Another interface
function is needed to do that on the PGresult.

But of course the users can do that by signalling using flags
within their code without PQsetRowProcessor or
PQskipRemainingResults.

Or, going back to the initial implementation, using PG_TRY() to inhibit
the longjmp out of the row processor itself makes that possible too.

Although it is possible in those ways, it needs (currently)
undocumented (in human-readable languages :-) knowledge about the
client protocol and how libpq handles it, which might be
changed with no description in the release notes.

 I suspect the need is not that big.

I think so, too.  But the current implementation of libpq does so on
out-of-memory while receiving result rows.  So I think some formal
(in other words, documented) way to do that is indispensable.


 But if you want to set error state for skipping, I would instead
 generalize PQsetRowProcessorErrMsg() to support setting error state
 outside of callback.  That would also help the external processing with
 'return 2'.  But I would keep the requirement that row processing must
 be ongoing, standalone error setting does not make sense.  Thus the name
 can stay.

Hmm.. I consider the cause of the problem raised here to be the
exceptions thrown by certain server-side functions called in the
row processor, especially in C/C++ code.  And we shouldn't use
PG_TRY() to catch them there, where it is executed too frequently.  I
think 'return 2' is not applicable to this case.  Some aid for a
non-local exit from row processors (from PQexec and friends, from the
user's point of view) is needed.  And I think it should be formal.


 There seems to be 2 ways to do it:
 
 1) Replace the PGresult under PGconn.  This seems ot require that
PQsetRowProcessorErrMsg takes PGconn as argument instead of
PGresult.  This also means callback should get PGconn as
argument.  Kind of makes sense even.
 
 2) Convert current partial PGresult to error state.  That also
makes sense, current use ->rowProcessorErrMsg to transport
the message to later set the error in caller feels weird.
 
 I guess I slightly prefer 2) to 1).

The former might be inappropriate from the point of view of the
`undocumented knowledge' above.  The latter seems to miss the
point about exceptions.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-20 Thread Marko Kreen
On Tue, Feb 21, 2012 at 02:11:35PM +0900, Kyotaro HORIGUCHI wrote:
 I don't have any attachment to PQskipRemainingResults(), but I
 think that some formal method to skip until Command-Complete('C')
 without breaking session is necessary because current libpq does
 so.

We have such function: PQgetResult().  Only question is how
to flag that the rows should be dropped.

  I think we already have such function - PQsetRowProcessor().
  Considering the user can use that to install skipping callback
  or simply set some flag in it's own per-connection state,
 
 PQsetRowProcessor() sets row processor not to PGresult but
 PGconn. So using PGsetRowProcessor() makes no sense for the
 PGresult on which the user currently working. Another interface
 function is needed to do that on PGresult.

If we are talking about skipping incoming result rows,
it's PGconn feature, not PGresult.  Because you want to do
network traffic for that, yes?

Also, as the row handler is on the connection, changing its
behavior is connection context, not result.

 But of course the users can do that by signalling using flags
 within their code without PQsetRowProcessor or
 PQskipRemainingResults.
 
 Or returning to the beginning implement using PG_TRY() to inhibit
 longjmp out of the row processor itself makes that possible too.
 
 Altough it is possible in that ways, it needs (currently)
 undocumented (in human-readable langs :-) knowledge about the
 client protocol and the handling manner of that in libpq which
 might be changed with no description in the release notes.

You might be right that how to do it may not be obvious.

Ok, lets see how it looks.  But please do it like this:

- PQgetRowProcessor() that returns existing row processor.

- PQskipResult() that just replaces row processor, then calls
  PQgetResult() to eat the result.  It's behaviour fully
  matches PQgetResult() then.

I guess the main thing that annoys me with skipping is that
it would require additional magic flag inside libpq.
With PQgetRowProcessor() it does not need anymore, it's
just a helper function that user can implement as well.

The only question here is what should happen if there are
several incoming resultsets (several queries in PQexec).
Should we leave it to the user to loop over them?

Seems there is 2 approaches here:

1) int PQskipResult(PGconn)
2) int PQskipResult(PGconn, int skipAll)

Both cases returning:
1 - got resultset, there might be more
0 - PQgetResult() returned NULL, connection is empty
   -1 - error

Although 1) mirrors regular PGgetResult() better, most users
might not know that function as they are using sync API.
They have simpler time with 2).  So 2) then?

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-18 Thread Marko Kreen
Demos/tests of the new API:

  https://github.com/markokr/libpq-rowproc-demos

Comments resulting from that:

- PQsetRowProcessorErrMsg() should take const char*

- callback API should be (void *, PGresult *, PQrowValue*)
  or void* at the end, but not in the middle

I have not looked yet what needs to be done to get
ErrMsg callable outside of callback, if it requires PGconn,
then we should add PGconn also to callback args.

 On Thu, Feb 16, 2012 at 05:49:34PM +0900, Kyotaro HORIGUCHI wrote:
  I added the function PQskipRemainingResult() and use it in
 dblink. This reduces the number of executing try-catch block from
 the number of rows to one per query in dblink.

I still think we don't need an extra skipping function.

Yes, the callback function needs to have a flag to know that
rows need to be skipped, but for such a low-level API that does
not seem to be a hard requirement.

If this really needs to be made easier, then getRowProcessor
might be the better approach, allowing easy implementation
of a generic skipping function by the user.

-- 
marko



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-16 Thread Marko Kreen
On Thu, Feb 16, 2012 at 02:24:19PM +0900, Kyotaro HORIGUCHI wrote:
 As far as I see, on an out-of-memory in getAnotherTuple() makes
 conn->result->resultStatus = PGRES_FATAL_ERROR and
 qpParseInputp[23]() skips succeeding 'D' messages consequently.
 
 When exception raised within row processor, pg_conn->inCursor
 always positioned in consistent and result->resultStatus ==
 PGRES_TUPLES_OK.
 
 The choices of the libpq user on that point are,
 
 - Continue to read succeeding tuples.
 
   Call PQgetResult() to read 'D' messages and hand it to row
   processor succeedingly.
 
 - Throw away the remaining results.
 
   Call pqClearAsyncResult() and pqSaveErrorResult(), then call
   PQgetResult() to skip over the succeeding 'D' messages. (Of
   course the user can't do that on current implement.)

There is also third choice, which may be even more popular than
those ones - PQfinish().
 
 To make the users able to select the second choice (I think this
 is rather major), we should only provide and export the new PQ*
 function to do that, I think.

I think we already have such a function - PQsetRowProcessor().
Considering the user can use that to install a skipping callback
or simply set some flag in its own per-connection state,
I suspect the need is not that big.

But if you want to set error state for skipping, I would instead
generalize PQsetRowProcessorErrMsg() to support setting error state
outside of callback.  That would also help the external processing with
'return 2'.  But I would keep the requirement that row processing must
be ongoing, standalone error setting does not make sense.  Thus the name
can stay.

There seems to be 2 ways to do it:

1) Replace the PGresult under PGconn.  This seems to require that
   PQsetRowProcessorErrMsg takes PGconn as argument instead of
   PGresult.  This also means the callback should get PGconn as
   argument.  Kind of makes sense even.

2) Convert the current partial PGresult to error state.  That also
   makes sense; the current use of ->rowProcessorErrMsg to transport
   the message and later set the error in the caller feels weird.

I guess I slightly prefer 2) to 1).

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-16 Thread Marko Kreen
On Thu, Feb 16, 2012 at 05:49:34PM +0900, Kyotaro HORIGUCHI wrote:
  I added the function PQskipRemainingResult() and use it in
 dblink. This reduces the number of executing try-catch block from
 the number of rows to one per query in dblink.

This implementation is wrong - you must not simply call PQgetResult()
when connection is in async mode.  And the resulting PGresult must
be freed.

Please just make PQsetRowProcessorErrMsg() callable from outside the
callback.  That also makes it clear to code that sees the final PGresult
what happened.  As a bonus, this will simply make the function
more robust and less special.

Although it's easy to create some PQsetRowSkipFlag() function
that will silently skip remaining rows, I think such logic
is best left to user to handle.  It creates unnecessary special
case when handling final PGresult, so in the big picture
it creates confusion.

 This patch is based on the patch above and composed in the same
 manner - main three patches include all modifications and the '2'
 patch separately.

I think there is no need to carry the early-exit patch separately.
It is visible in archives and nobody screamed about it yet,
so I guess it's acceptable.  Also it's so simple, there is no
point in spending time rebasing it.

 diff --git a/src/interfaces/libpq/#fe-protocol3.c# 
 b/src/interfaces/libpq/#fe-protocol3.c#

There is some backup file in your git repo. 

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-15 Thread Kyotaro HORIGUCHI
Hello, sorry for the long absence.

As far as I can see, an out-of-memory in getAnotherTuple() sets
conn->result->resultStatus = PGRES_FATAL_ERROR, and
pqParseInput[23]() consequently skips the succeeding 'D' messages.

When an exception is raised within the row processor, pg_conn->inCursor
is always positioned consistently and result->resultStatus ==
PGRES_TUPLES_OK.

The choices of the libpq user on that point are,

- Continue to read succeeding tuples.

  Call PQgetResult() to read the 'D' messages and hand them to the row
  processor successively.

- Throw away the remaining results.

  Call pqClearAsyncResult() and pqSaveErrorResult(), then call
  PQgetResult() to skip over the succeeding 'D' messages.  (Of
  course the user can't do that in the current implementation.)

To let users select the second choice (which I think is the major
one), we only have to provide and export a new PQ* function
to do that, I think.

void
PQskipRemainingResult(PGconn *conn)
{
  pqClearAsyncResult(conn);

  /* conn->result is always NULL here */
  pqSaveErrorResult(conn);

  /* Skip over remaining 'D' messages. */
  PQgetResult(conn);
}

User may write code with this function.

...
PG_TRY();
{
  ...
  res = PQexec(conn, );
  ...
}
PG_CATCH();
{
  PQskipRemainingResult(conn);
  goto error;
}
PG_END_TRY();


Of cource, this is applicable to C++ client in the same manner.

try {
  ...
  res = PQexec(conn, );
  ...
} catch (const myexcep &ex) {
  PQskipRemainingResult(conn);
  throw ex;
}
 


By the way, where should I insert this function ?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-13 Thread Marko Kreen
On Tue, Feb 07, 2012 at 04:44:09PM +0200, Marko Kreen wrote:
 Although it seems we could allow exceptions, at least when we are
 speaking of Postgres backend, as the connection and result are
 internally consistent state when the handler is called, and the
 partial PGresult is stored under PGconn-result.  Even the
 connection is in consistent state, as the row packet is
 fully processed.
 
 So we have 3 variants, all work, but which one do we want to support?
 
 1) Exceptions are not allowed.
 2) Exceptions are allowed, but when it happens, user must call
PQfinish() next, to avoid processing incoming data from
potentially invalid state.
 3) Exceptions are allowed, and further row processing can continue
with PQisBusy() / PQgetResult()
 3.1) The problematic row is retried.  (Current behaviour)
 3.2) The problematic row is skipped.

I converted the patch to support 3.2), that is - skip row on exception.
That required some cleanups of getAnotherTuple() API, plus I made some
other cleanups.  Details below.

But during this I also started to think about what happens if the user does
*not* throw exceptions.  E.g. Python (in its C API) does exceptions via regular
return values, which means complications when passing them upwards.

The main case I'm thinking of is actually a resultset iterator in a high-level
language.  The current callback-only style API requires that rows are
stuffed into a temporary buffer until the network blocks and user code
gets control again.  This feels clumsy for a performance-oriented API.
Another case is user code that wants to do complex processing.  Doing a
lot of stuff under a callback seems dubious.  And what if some part of the
code calls PQfinish() during some error recovery?

I tried imagining some sort of getFoo()-style API for fetching in-flight
row data, but I always ended up with a "rewrite libpq" step, so I feel
it's not productive to go there.

Instead I added a simple feature: rowProcessor can return '2',
in which case getAnotherTuple() does an early exit without setting
any error state.  On the user side it appears as if PQisBusy() returned
TRUE.  All pointers stay valid, so the callback can just
stuff them into some temp area.  ATM there is no indication though
whether the exit was due to the callback or other reasons, so the user
must detect it based on whether new temp pointers appeared,
which means those must be cleared before calling PQisBusy() again.
This actually feels like a feature; those must not stay around
after a single call.

It's included in main patch, but I also attached it as separate patch
so that it can be examined separately and reverted if not acceptable.

But as it's really simple, I recommend including it.

Its usage might not be obvious though; should we include
example code in the docs?
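
For what it's worth, a rough sketch of how the return-2 early exit could be
used (an illustration of the proposal above only, not committed API;
PGrowValue comes from this patch, and handle_row()/wait_for_read() are
hypothetical helpers):

extern void handle_row(PGrowValue *columns);    /* hypothetical */
extern void wait_for_read(PGconn *conn);        /* hypothetical */

static PGrowValue *pending_row;     /* temp pointers stashed by the callback */

static int
suspend_processor(PGresult *res, void *param, PGrowValue *columns)
{
    pending_row = columns;          /* pointers stay valid until the next PQisBusy() */
    return 2;                       /* proposed early-exit return code */
}

static void
fetch_rows(PGconn *conn)            /* connection must be in non-blocking mode */
{
    while (PQisBusy(conn))          /* appears busy after an early exit too */
    {
        if (pending_row != NULL)
        {
            handle_row(pending_row);
            pending_row = NULL;     /* must be cleared before calling PQisBusy() again */
        }
        else
        {
            wait_for_read(conn);
            PQconsumeInput(conn);   /* feed more data to libpq */
        }
    }
    PQclear(PQgetResult(conn));     /* final PGresult once no longer busy */
}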



New feature:

* Row processor can return 2, in which case PQisBusy() does an early exit.
  Supported only when the connection is in non-blocking mode.

Cleanups:

* Row data is tagged as processed when rowProcessor is called,
  so exceptions will skip the row.  This simplifies non-exceptional
  handling as well.

* Converted 'return EOF' in V3 getAnotherTuple() to set an error instead.
  AFAICS those EOFs are remnants from V2 getAnotherTuple(),
  not something that was coded deliberately.  Note that when
  V3 getAnotherTuple() is called, the row packet is fully in the buffer.
  Returning EOF on a broken packet to signify 'give me more data' does not
  result in any useful behaviour; instead you can simply get
  into an infinite loop.

Fix bugs in my previous changes:

* Split the OOM error handling from custom error message handling;
  previously the error message in PGresult was lost because the OOM
  logic freed the PGresult early.

* Drop unused goto label from experimental debug code.

-- 
marko

*** a/src/interfaces/libpq/exports.txt
--- b/src/interfaces/libpq/exports.txt
***
*** 160,162  PQconnectStartParams  157
--- 160,164 
  PQping158
  PQpingParams  159
  PQlibVersion  160
+ PQsetRowProcessor	  161
+ PQsetRowProcessorErrMsg	  162
*** a/src/interfaces/libpq/fe-connect.c
--- b/src/interfaces/libpq/fe-connect.c
***
*** 2693,2698  makeEmptyPGconn(void)
--- 2693,2701 
  	conn-wait_ssl_try = false;
  #endif
  
+ 	/* set default row processor */
+ 	PQsetRowProcessor(conn, NULL, NULL);
+ 
  	/*
  	 * We try to send at least 8K at a time, which is the usual size of pipe
  	 * buffers on Unix systems.  That way, when we are sending a large amount
***
*** 2711,2718  makeEmptyPGconn(void)
--- 2714,2726 
  	initPQExpBuffer(conn-errorMessage);
  	initPQExpBuffer(conn-workBuffer);
  
+ 	/* set up initial row buffer */
+ 	conn-rowBufLen = 32;
+ 	conn-rowBuf = (PGrowValue *)malloc(conn-rowBufLen * sizeof(PGrowValue));
+ 
  	if (conn-inBuffer == NULL ||
  		conn-outBuffer == NULL ||
+ 		conn-rowBuf == NULL ||
  		PQExpBufferBroken(conn-errorMessage) ||
  		PQExpBufferBroken(conn-workBuffer))
  	{
***
*** 2814,2819  freePGconn(PGconn *conn)

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-08 Thread Marko Kreen
On Wed, Feb 08, 2012 at 02:44:13PM +0900, Shigeru Hanada wrote:
 (2012/02/07 23:44), Marko Kreen wrote:
  On Tue, Feb 07, 2012 at 07:25:14PM +0900, Shigeru Hanada wrote:
  - What is the right (or recommended) way to prevent from throwing
  exceptoin in row-processor callback function?  When author should use
  PG_TRY block to catch exception in the callback function?
  
  When it calls backend functions that can throw exceptions?
  As all handlers running in backend will want to convert data
  to Datums, that means always wrap handler code in PG_TRY?
 
 ISTM that telling a) what happens to PGresult and PGconn when row
 processor ends without return, and b) how to recover them would be
 necessary.  We can't assume much about caller because libpq is a client
 library.  IMHO, it's most important to tell authors of row processor
 clearly what should be done on error case.

Yes.

 In such case, must we call PQfinish, or is calling PQgetResult until it
 returns NULL enough to reuse the connection?  IIUC calling
 pqClearAsyncResult seems sufficient, but it's not exported for client...

Simply dropping ->result is not useful as there are still rows
coming in.  You cannot process them anymore then.

The rule that after an exception it's valid to close the connection with PQfinish()
or continue processing with PQgetResult()/PQisBusy()/PQconsumeInput() seems
quite simple and clear.  Perhaps only a clarification of what's valid on a sync
connection and what's valid on an async connection would be useful.
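
As a rough sketch of that rule from backend-style code (a fragment only;
conn, sql, res and the give_up decision are assumed to come from the
surrounding code):

PG_TRY();
{
    res = PQexec(conn, sql);
}
PG_CATCH();
{
    if (give_up)
    {
        /* option 1: drop the connection entirely */
        PQfinish(conn);
        conn = NULL;
    }
    else
    {
        /* option 2: keep the connection, drain the remaining results */
        while ((res = PQgetResult(conn)) != NULL)
            PQclear(res);
    }
    PG_RE_THROW();
}
PG_END_TRY();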

-- 
marko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-07 Thread Shigeru Hanada
(2012/02/02 23:30), Marko Kreen wrote:
 On Thu, Feb 02, 2012 at 04:51:37PM +0900, Kyotaro HORIGUCHI wrote:
 Hello, This is new version of dblink.c

 - Memory is properly freed when realloc returns NULL in storeHandler().

 - The bug that free() in finishStoreInfo() will be fed with
garbage pointer when malloc for sinfo-valbuflen fails is
fixed.
 
 Thanks, merged.  I also did some minor coding style cleanups.
 
 Tagging it Ready For Committer.  I don't see any notable
 issues with the patch anymore.
 
 There is some potential for experimenting with more aggressive
 optimizations on dblink side, but I'd like to get a nod from
 a committer for libpq changes first.

I tried to use this feature in pgsql_fdw patch, and found some small issues.

- Typos
- mes -> msg
- funcion -> function
- overritten -> overwritten
- costom -> custom
- What is the right (or recommended) way to prevent throwing an
exception in the row-processor callback function?  When should an author
use a PG_TRY block to catch exceptions in the callback function?
- It would be better to describe clearly how to determine whether a column
result is NULL.  Perhaps the result value is
NULL when PGrowValue.len is less than 0, right?

Regards,
-- 
Shigeru Hanada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-07 Thread Marko Kreen
On Tue, Feb 07, 2012 at 07:25:14PM +0900, Shigeru Hanada wrote:
 (2012/02/02 23:30), Marko Kreen wrote:
  On Thu, Feb 02, 2012 at 04:51:37PM +0900, Kyotaro HORIGUCHI wrote:
  Hello, This is new version of dblink.c
 
  - Memory is properly freed when realloc returns NULL in storeHandler().
 
  - The bug that free() in finishStoreInfo() will be fed with
 garbage pointer when malloc for sinfo-valbuflen fails is
 fixed.
  
  Thanks, merged.  I also did some minor coding style cleanups.
  
  Tagging it Ready For Committer.  I don't see any notable
  issues with the patch anymore.
  
  There is some potential for experimenting with more aggressive
  optimizations on dblink side, but I'd like to get a nod from
  a committer for libpq changes first.
 
 I tried to use this feature in pgsql_fdw patch, and found some small issues.
 
 - Typos
 - mes - msg
 - funcion - function
 - overritten - overwritten
 - costom - custom

Fixed.

 - What is the right (or recommended) way to prevent from throwing
 exceptoin in row-processor callback function?  When author should use
 PG_TRY block to catch exception in the callback function?

When it calls backend functions that can throw exceptions?
As all handlers running in backend will want to convert data
to Datums, that means always wrap handler code in PG_TRY?

Although it seems we could allow exceptions, at least when we are
speaking of the Postgres backend, as the connection and result are in an
internally consistent state when the handler is called, and the
partial PGresult is stored under PGconn->result.  Even the
connection is in a consistent state, as the row packet is
fully processed.

So we have 3 variants, all work, but which one do we want to support?

1) Exceptions are not allowed.
2) Exceptions are allowed, but when it happens, user must call
   PQfinish() next, to avoid processing incoming data from
   potentially invalid state.
3) Exceptions are allowed, and further row processing can continue
   with PQisBusy() / PQgetResult()
3.1) The problematic row is retried.  (Current behaviour)
3.2) The problematic row is skipped.

No clue if that is ok for handler written in C++, I have no idea
whether you can throw C++ exception when part of the stack
contains raw C calls.

 - It would be better to describe how to determine whether a column
 result is NULL should be described clearly.  Perhaps result value is
 NULL when PGrowValue.len is less than 0, right?

Eh, it seems it's documented everywhere except in the sgml docs.  Fixed.
[ Is it better to document that it is -1 or < 0? ]
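
For illustration, a minimal row-processor fragment using that convention
(a negative len, i.e. NULL_LEN / -1, means NULL; handle_null() and
handle_value() are hypothetical):

static int
my_row_processor(PGresult *res, void *param, PGrowValue *columns)
{
    int         nfields = PQnfields(res);
    int         i;

    for (i = 0; i < nfields; i++)
    {
        if (columns[i].len < 0)
            handle_null(i);                                     /* hypothetical */
        else
            handle_value(i, columns[i].value, columns[i].len);  /* hypothetical */
    }
    return 1;                       /* success, per this patch's convention */
}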

Also I removed one remaining dynamic stack array in dblink.c

Current state of patch attached.

-- 
marko

*** a/src/interfaces/libpq/exports.txt
--- b/src/interfaces/libpq/exports.txt
***
*** 160,162  PQconnectStartParams  157
--- 160,164 
  PQping158
  PQpingParams  159
  PQlibVersion  160
+ PQsetRowProcessor	  161
+ PQsetRowProcessorErrMsg	  162
*** a/src/interfaces/libpq/fe-connect.c
--- b/src/interfaces/libpq/fe-connect.c
***
*** 2693,2698  makeEmptyPGconn(void)
--- 2693,2701 
  	conn-wait_ssl_try = false;
  #endif
  
+ 	/* set default row processor */
+ 	PQsetRowProcessor(conn, NULL, NULL);
+ 
  	/*
  	 * We try to send at least 8K at a time, which is the usual size of pipe
  	 * buffers on Unix systems.  That way, when we are sending a large amount
***
*** 2711,2718  makeEmptyPGconn(void)
--- 2714,2726 
  	initPQExpBuffer(conn-errorMessage);
  	initPQExpBuffer(conn-workBuffer);
  
+ 	/* set up initial row buffer */
+ 	conn-rowBufLen = 32;
+ 	conn-rowBuf = (PGrowValue *)malloc(conn-rowBufLen * sizeof(PGrowValue));
+ 
  	if (conn-inBuffer == NULL ||
  		conn-outBuffer == NULL ||
+ 		conn-rowBuf == NULL ||
  		PQExpBufferBroken(conn-errorMessage) ||
  		PQExpBufferBroken(conn-workBuffer))
  	{
***
*** 2814,2819  freePGconn(PGconn *conn)
--- 2822,2829 
  		free(conn-inBuffer);
  	if (conn-outBuffer)
  		free(conn-outBuffer);
+ 	if (conn-rowBuf)
+ 		free(conn-rowBuf);
  	termPQExpBuffer(conn-errorMessage);
  	termPQExpBuffer(conn-workBuffer);
  
***
*** 5078,5080  PQregisterThreadLock(pgthreadlock_t newhandler)
--- 5088,5091 
  
  	return prev;
  }
+ 
*** a/src/interfaces/libpq/fe-exec.c
--- b/src/interfaces/libpq/fe-exec.c
***
*** 66,71  static PGresult *PQexecFinish(PGconn *conn);
--- 66,72 
  static int PQsendDescribe(PGconn *conn, char desc_type,
  			   const char *desc_target);
  static int	check_field_number(const PGresult *res, int field_num);
+ static int	pqAddRow(PGresult *res, void *param, PGrowValue *columns);
  
  
  /* 
***
*** 160,165  PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
--- 161,167 
  	result-curBlock = NULL;
  	result-curOffset = 0;
  	result-spaceLeft = 0;
+ 	result-rowProcessorErrMsg = NULL;
  
  	if (conn)
  	{
***
*** 701,707  

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-07 Thread Robert Haas
On Tue, Feb 7, 2012 at 9:44 AM, Marko Kreen mark...@gmail.com wrote:
 - What is the right (or recommended) way to prevent from throwing
 exceptoin in row-processor callback function?  When author should use
 PG_TRY block to catch exception in the callback function?

 When it calls backend functions that can throw exceptions?
 As all handlers running in backend will want to convert data
 to Datums, that means always wrap handler code in PG_TRY?

I would hate to have such a rule.  PG_TRY isn't free, and it's prone
to subtle bugs, like failing to mark enough stuff in the same function
volatile.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-07 Thread Shigeru Hanada
(2012/02/07 23:44), Marko Kreen wrote:
 On Tue, Feb 07, 2012 at 07:25:14PM +0900, Shigeru Hanada wrote:
 - What is the right (or recommended) way to prevent from throwing
 exceptoin in row-processor callback function?  When author should use
 PG_TRY block to catch exception in the callback function?
 
 When it calls backend functions that can throw exceptions?
 As all handlers running in backend will want to convert data
 to Datums, that means always wrap handler code in PG_TRY?

ISTM that telling a) what happens to PGresult and PGconn when row
processor ends without return, and b) how to recover them would be
necessary.  We can't assume much about caller because libpq is a client
library.  IMHO, it's most important to tell authors of row processor
clearly what should be done on error case.

In such a case, must we call PQfinish, or is calling PQgetResult until it
returns NULL enough to reuse the connection?  IIUC calling
pqClearAsyncResult seems sufficient, but it's not exported for clients...

Regards,
-- 
Shigeru Hanada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-02 Thread Marko Kreen
On Thu, Feb 02, 2012 at 04:51:37PM +0900, Kyotaro HORIGUCHI wrote:
 Hello, This is new version of dblink.c
 
 - Memory is properly freed when realloc returns NULL in storeHandler().
 
 - The bug that free() in finishStoreInfo() will be fed with
   garbage pointer when malloc for sinfo-valbuflen fails is
   fixed.

Thanks, merged.  I also did some minor coding style cleanups.

Tagging it Ready For Committer.  I don't see any notable
issues with the patch anymore.

There is some potential for experimenting with more aggressive
optimizations on dblink side, but I'd like to get a nod from
a committer for libpq changes first.

-- 
marko

*** a/src/interfaces/libpq/exports.txt
--- b/src/interfaces/libpq/exports.txt
***
*** 160,162  PQconnectStartParams  157
--- 160,164 
  PQping158
  PQpingParams  159
  PQlibVersion  160
+ PQsetRowProcessor	  161
+ PQsetRowProcessorErrMsg	  162
*** a/src/interfaces/libpq/fe-connect.c
--- b/src/interfaces/libpq/fe-connect.c
***
*** 2693,2698  makeEmptyPGconn(void)
--- 2693,2701 
  	conn-wait_ssl_try = false;
  #endif
  
+ 	/* set default row processor */
+ 	PQsetRowProcessor(conn, NULL, NULL);
+ 
  	/*
  	 * We try to send at least 8K at a time, which is the usual size of pipe
  	 * buffers on Unix systems.  That way, when we are sending a large amount
***
*** 2711,2718  makeEmptyPGconn(void)
--- 2714,2726 
  	initPQExpBuffer(conn-errorMessage);
  	initPQExpBuffer(conn-workBuffer);
  
+ 	/* set up initial row buffer */
+ 	conn-rowBufLen = 32;
+ 	conn-rowBuf = (PGrowValue *)malloc(conn-rowBufLen * sizeof(PGrowValue));
+ 
  	if (conn-inBuffer == NULL ||
  		conn-outBuffer == NULL ||
+ 		conn-rowBuf == NULL ||
  		PQExpBufferBroken(conn-errorMessage) ||
  		PQExpBufferBroken(conn-workBuffer))
  	{
***
*** 2812,2817  freePGconn(PGconn *conn)
--- 2820,2827 
  		free(conn-inBuffer);
  	if (conn-outBuffer)
  		free(conn-outBuffer);
+ 	if (conn-rowBuf)
+ 		free(conn-rowBuf);
  	termPQExpBuffer(conn-errorMessage);
  	termPQExpBuffer(conn-workBuffer);
  
***
*** 5076,5078  PQregisterThreadLock(pgthreadlock_t newhandler)
--- 5086,5089 
  
  	return prev;
  }
+ 
*** a/src/interfaces/libpq/fe-exec.c
--- b/src/interfaces/libpq/fe-exec.c
***
*** 66,71  static PGresult *PQexecFinish(PGconn *conn);
--- 66,72 
  static int PQsendDescribe(PGconn *conn, char desc_type,
  			   const char *desc_target);
  static int	check_field_number(const PGresult *res, int field_num);
+ static int	pqAddRow(PGresult *res, void *param, PGrowValue *columns);
  
  
  /* 
***
*** 160,165  PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
--- 161,167 
  	result-curBlock = NULL;
  	result-curOffset = 0;
  	result-spaceLeft = 0;
+ 	result-rowProcessorErrMsg = NULL;
  
  	if (conn)
  	{
***
*** 701,707  pqClearAsyncResult(PGconn *conn)
  	if (conn-result)
  		PQclear(conn-result);
  	conn-result = NULL;
- 	conn-curTuple = NULL;
  }
  
  /*
--- 703,708 
***
*** 756,762  pqPrepareAsyncResult(PGconn *conn)
  	 */
  	res = conn-result;
  	conn-result = NULL;		/* handing over ownership to caller */
- 	conn-curTuple = NULL;		/* just in case */
  	if (!res)
  		res = PQmakeEmptyPGresult(conn, PGRES_FATAL_ERROR);
  	else
--- 757,762 
***
*** 828,833  pqInternalNotice(const PGNoticeHooks *hooks, const char *fmt,...)
--- 828,900 
  }
  
  /*
+  * PQsetRowProcessor
+  *   Set function that copies column data out from network buffer.
+  */
+ void
+ PQsetRowProcessor(PGconn *conn, PQrowProcessor func, void *param)
+ {
+ 	conn-rowProcessor = (func ? func : pqAddRow);
+ 	conn-rowProcessorParam = param;
+ }
+ 
+ /*
+  * PQsetRowProcessorErrMsg
+  *Set the error message pass back to the caller of RowProcessor.
+  *
+  *You can replace the previous message by alternative mes, or clear
+  *it with NULL.
+  */
+ void
+ PQsetRowProcessorErrMsg(PGresult *res, char *msg)
+ {
+ 	if (msg)
+ 		res-rowProcessorErrMsg = pqResultStrdup(res, msg);
+ 	else
+ 		res-rowProcessorErrMsg = NULL;
+ }
+ 
+ /*
+  * pqAddRow
+  *	  add a row to the PGresult structure, growing it if necessary
+  *	  Returns TRUE if OK, FALSE if not enough memory to add the row.
+  */
+ static int
+ pqAddRow(PGresult *res, void *param, PGrowValue *columns)
+ {
+ 	PGresAttValue *tup;
+ 	int			nfields = res-numAttributes;
+ 	int			i;
+ 
+ 	tup = (PGresAttValue *)
+ 		pqResultAlloc(res, nfields * sizeof(PGresAttValue), TRUE);
+ 	if (tup == NULL)
+ 		return FALSE;
+ 
+ 	for (i = 0 ; i  nfields ; i++)
+ 	{
+ 		tup[i].len = columns[i].len;
+ 		if (tup[i].len == NULL_LEN)
+ 		{
+ 			tup[i].value = res-null_field;
+ 		}
+ 		else
+ 		{
+ 			bool isbinary = (res-attDescs[i].format != 0);
+ 			tup[i].value = (char *)pqResultAlloc(res, tup[i].len + 1, isbinary);
+ 			if (tup[i].value == NULL)
+ return FALSE;
+ 

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-02 Thread Kyotaro HORIGUCHI
Hello,

 Thanks, merged.  I also did some minor coding style cleanups.

Thank you for editing the many comments and some code I'd left
unchanged out of carelessness and a lack of understanding of your
comments.  I'll be more careful about that...

 There is some potential for experimenting with more aggressive
 optimizations on dblink side, but I'd like to get a nod from
 a committer for libpq changes first.

I'm looking forward to the aggressiveness of that:-)

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-01 Thread Kyotaro HORIGUCHI
Hello,

 Another thing: if realloc() fails, the old pointer stays valid.
 So it needs to be assigned to temp variable first, before
 overwriting old pointer.

 Mmm, I've misunderstood realloc.  I'll fix that in the
next patch.

 And seems malloc() is preferable to palloc() to avoid
 exceptions inside row processor.  Although latter
 one could be made to work, it might be unnecessary
 complexity.  (store current pqresult into remoteConn?)

Hmm... palloc may throw ERRCODE_OUT_OF_MEMORY, so I must catch it
and return NULL.  It seems there is no difference from using
malloc after all.  However, the prohibition on throwing exceptions
in RowProcessor is not based on any hard fact, so palloc here
may make sense if we can do that.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-01 Thread Marko Kreen
On Wed, Feb 1, 2012 at 10:32 AM, Kyotaro HORIGUCHI
horiguchi.kyot...@oss.ntt.co.jp wrote:
 Another thing: if realloc() fails, the old pointer stays valid.
 So it needs to be assigned to temp variable first, before
 overwriting old pointer.

  mmm. I've misunderstood of the realloc.. I'll fix there in the
 next patch.

Please wait a moment, I started doing small cleanups,
and now have some larger ones too.  I'll send it soon.

OTOH, if you have already done something, you can send it,
I have various states in GIT so it should not be hard
to merge things.

 And seems malloc() is preferable to palloc() to avoid
 exceptions inside row processor.  Although latter
 one could be made to work, it might be unnecessary
 complexity.  (store current pqresult into remoteConn?)

 Hmm.. palloc may throw ERRCODE_OUT_OF_MEMORY so I must catch it
 and return NULL. That seems there is no difference to using
 malloc after all.. However, the inhibition of throwing exceptions
 in RowProcessor is not based on any certain fact, so palloc here
 may make sense if we can do that.

No, I was thinking about storing the result in connection
struct and then letting the exception pass, as the
PGresult can be cleaned later.  Thus we could get rid
of TRY/CATCH in per-row handler.  (At that point
the PGresult is already under PGconn, so maybe
it's enough to clean that one later?)

But for now, as the TRY is already there, it should be
also simple to move palloc usage inside it.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-02-01 Thread Kyotaro HORIGUCHI
Hello, This is new version of dblink.c

- Memory is properly freed when realloc returns NULL in storeHandler().

- The bug that free() in finishStoreInfo() would be fed a
  garbage pointer when malloc for sinfo->valbuflen fails is
  fixed.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 36a8e3e..28c967c 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -63,11 +63,23 @@ typedef struct remoteConn
 	bool		newXactForCursor;		/* Opened a transaction for a cursor */
 } remoteConn;
 
+typedef struct storeInfo
+{
+	Tuplestorestate *tuplestore;
+	int nattrs;
+	MemoryContext oldcontext;
+	AttInMetadata *attinmeta;
+	char** valbuf;
+	int *valbuflen;
+	bool error_occurred;
+	bool nummismatch;
+	ErrorData *edata;
+} storeInfo;
+
 /*
  * Internal declarations
  */
 static Datum dblink_record_internal(FunctionCallInfo fcinfo, bool is_async);
-static void materializeResult(FunctionCallInfo fcinfo, PGresult *res);
 static remoteConn *getConnectionByName(const char *name);
 static HTAB *createConnHash(void);
 static void createNewConnection(const char *name, remoteConn *rconn);
@@ -90,6 +102,10 @@ static char *escape_param_str(const char *from);
 static void validate_pkattnums(Relation rel,
    int2vector *pkattnums_arg, int32 pknumatts_arg,
    int **pkattnums, int *pknumatts);
+static void initStoreInfo(storeInfo *sinfo, FunctionCallInfo fcinfo);
+static void finishStoreInfo(storeInfo *sinfo);
+static int storeHandler(PGresult *res, void *param, PGrowValue *columns);
+
 
 /* Global */
 static remoteConn *pconn = NULL;
@@ -503,6 +519,7 @@ dblink_fetch(PG_FUNCTION_ARGS)
 	char	   *curname = NULL;
 	int			howmany = 0;
 	bool		fail = true;	/* default to backward compatible */
+	storeInfo   storeinfo;
 
 	DBLINK_INIT;
 
@@ -559,15 +576,36 @@ dblink_fetch(PG_FUNCTION_ARGS)
 	appendStringInfo(buf, FETCH %d FROM %s, howmany, curname);
 
 	/*
+	 * Result is stored into storeinfo.tuplestore instead of
+	 * res-result retuned by PQexec below
+	 */
+	initStoreInfo(storeinfo, fcinfo);
+	PQregisterRowProcessor(conn, storeHandler, storeinfo);
+
+	/*
 	 * Try to execute the query.  Note that since libpq uses malloc, the
 	 * PGresult will be long-lived even though we are still in a short-lived
 	 * memory context.
 	 */
 	res = PQexec(conn, buf.data);
+	finishStoreInfo(storeinfo);
+
 	if (!res ||
 		(PQresultStatus(res) != PGRES_COMMAND_OK 
 		 PQresultStatus(res) != PGRES_TUPLES_OK))
 	{
+		/* finishStoreInfo saves the fields referred to below. */
+		if (storeinfo.nummismatch)
+		{
+			/* This is only for backward compatibility */
+			ereport(ERROR,
+	(errcode(ERRCODE_DATATYPE_MISMATCH),
+	 errmsg(remote query result rowtype does not match 
+			the specified FROM clause rowtype)));
+		}
+		else if (storeinfo.edata)
+			ReThrowError(storeinfo.edata);
+
 		dblink_res_error(conname, res, could not fetch from cursor, fail);
 		return (Datum) 0;
 	}
@@ -579,8 +617,8 @@ dblink_fetch(PG_FUNCTION_ARGS)
 (errcode(ERRCODE_INVALID_CURSOR_NAME),
  errmsg(cursor \%s\ does not exist, curname)));
 	}
+	PQclear(res);
 
-	materializeResult(fcinfo, res);
 	return (Datum) 0;
 }
 
@@ -640,6 +678,7 @@ dblink_record_internal(FunctionCallInfo fcinfo, bool is_async)
 	remoteConn *rconn = NULL;
 	bool		fail = true;	/* default to backward compatible */
 	bool		freeconn = false;
+	storeInfo   storeinfo;
 
 	/* check to see if caller supports us returning a tuplestore */
 	if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
@@ -715,164 +754,225 @@ dblink_record_internal(FunctionCallInfo fcinfo, bool is_async)
 	rsinfo-setResult = NULL;
 	rsinfo-setDesc = NULL;
 
+
+	/*
+	 * Result is stored into storeinfo.tuplestore instead of
+	 * res-result retuned by PQexec/PQgetResult below
+	 */
+	initStoreInfo(storeinfo, fcinfo);
+	PQregisterRowProcessor(conn, storeHandler, storeinfo);
+
 	/* synchronous query, or async result retrieval */
 	if (!is_async)
 		res = PQexec(conn, sql);
 	else
-	{
 		res = PQgetResult(conn);
-		/* NULL means we're all done with the async results */
-		if (!res)
-			return (Datum) 0;
-	}
 
-	/* if needed, close the connection to the database and cleanup */
-	if (freeconn)
-		PQfinish(conn);
+	finishStoreInfo(storeinfo);
 
-	if (!res ||
-		(PQresultStatus(res) != PGRES_COMMAND_OK 
-		 PQresultStatus(res) != PGRES_TUPLES_OK))
+	/* NULL res from async get means we're all done with the results */
+	if (res || !is_async)
 	{
-		dblink_res_error(conname, res, could not execute query, fail);
-		return (Datum) 0;
+		if (freeconn)
+			PQfinish(conn);
+
+		if (!res ||
+			(PQresultStatus(res) != PGRES_COMMAND_OK 
+			 PQresultStatus(res) != PGRES_TUPLES_OK))
+		{
+			/* finishStoreInfo saves the fields referred to below. */
+			if (storeinfo.nummismatch)
+			{
+/* This is only for backward compatibility */
+ereport(ERROR,
+		(errcode(ERRCODE_DATATYPE_MISMATCH),
+		 errmsg(remote query result rowtype does 

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-01-30 Thread Kyotaro HORIGUCHI
Thank you for the comments; this is a revised version of the patch.

The performance gain is more than expected.  The measurement script now
runs the query via dblink ten times for stability of measurement, so
the figures are about ten times longer than the previous ones.

                         sec    % to Original
  Original             : 31.5   100.0%
  RowProcessor patch   : 31.3    99.4%
  dblink patch         : 24.6    78.1%

The RowProcessor patch alone causes no loss (or a very small gain), and
the full patch gives us a 22% gain on the benchmark (*1).


The modifications are listed below.


- PGresAttValue is no longer used for this mechanism; PGrowValue
  has been added instead.  PGresAttValue has been put back into
  libpq-int.h.

- pqAddTuple() is restored to the original and a new function
  pqAddRow() is added for use as the RowProcessor.  (The previous
  pqAddTuple implementation had buggily mixed the two usages of
  PGresAttValue.)

- PQgetRowProcessorParam has been dropped. Contextual parameter
  is passed as one of the parameters of RowProcessor().

- RowProcessor() returns int (as bool; is that the libpq convention?)
  instead of void *.  (Actually, void * had already become useless
  as of the previous patch.)

- PQsetRowProcessorErrMes() is changed to do strdup internally.

- The callers of RowProcessor() no longer set null_field into
  PGrowValue.value.  Plus, the PGrowValue[] which RowProcessor()
  receives has nfields + 1 elements, to allow making a rough
  estimate with cols->value[nfields].value - cols->value[0].value -
  something.  The something here is 4 * nfields for protocol 3 and
  4 * (non-null fields) for protocol 2.  I fear that this applies
  only to the textual transfer usage...

- PQregisterRowProcessor() sets the default handler when given
  NULL. (pg_conn|pg_result).rowProcessor cannot be NULL for its
  lifetime.

- initStoreInfo() and storeHandler() have been provided with
  malloc error handling.


And more..

- getAnotherTuple()@fe-protocol2.c is not tested at all.

- The uniformity of the column sizes in the test data prevents
  realloc from being exercised in dblink... More tests should be done.


 regards,

=
(*1) The benchmark is done as follows,

==test.sql
select dblink_connect('c', 'host=localhost dbname=test');
select * from dblink('c', 'select a,c from foo limit 200') as (a text, b bytea) limit 1;
...(repeat 9 times more)
select dblink_disconnect('c');
==

$ for i in $(seq 1 10); do time psql test -f t.sql; done

The environment is
  CentOS 6.2 on VirtualBox on Core i7 965 3.2GHz
  # of processor  1
  Allocated mem   2GB
  
Test DB schema is
   Column | Type  | Modifiers
  --------+-------+-----------
   a      | text  |
   b      | text  |
   c      | bytea |
  Indexes:
      "foo_a_bt" btree (a)
      "foo_c_bt" btree (c)

test=# select count(*),
   min(length(a)) as a_min, max(length(a)) as a_max,
   min(length(c)) as c_min, max(length(c)) as c_max from foo;

  count  | a_min | a_max | c_min | c_max
---------+-------+-------+-------+-------
     200 |    29 |    29 |    29 |    29
(1 row)

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 1af8df6..5ed083c 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -160,3 +160,5 @@ PQconnectStartParams  157
 PQping158
 PQpingParams  159
 PQlibVersion  160
+PQregisterRowProcessor	  161
+PQsetRowProcessorErrMes	  162
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index d454538..4fe2f41 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2692,6 +2692,8 @@ makeEmptyPGconn(void)
 	conn-allow_ssl_try = true;
 	conn-wait_ssl_try = false;
 #endif
+	conn-rowProcessor = pqAddRow;
+	conn-rowProcessorParam = NULL;
 
 	/*
 	 * We try to send at least 8K at a time, which is the usual size of pipe
@@ -5076,3 +5078,10 @@ PQregisterThreadLock(pgthreadlock_t newhandler)
 
 	return prev;
 }
+
+void
+PQregisterRowProcessor(PGconn *conn, RowProcessor func, void *param)
+{
+	conn-rowProcessor = (func ? func : pqAddRow);
+	conn-rowProcessorParam = param;
+}
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index b743566..82914fd 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -66,6 +66,7 @@ static PGresult *PQexecFinish(PGconn *conn);
 static int PQsendDescribe(PGconn *conn, char desc_type,
 			   const char *desc_target);
 static int	check_field_number(const PGresult *res, int field_num);
+static int	pqAddTuple(PGresult *res, PGresAttValue *tup);
 
 
 /* 
@@ -160,6 +161,9 @@ PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
 	result-curBlock = NULL;
 	result-curOffset = 0;
 	result-spaceLeft = 0;
+	result-rowProcessor = pqAddRow;
+	result-rowProcessorParam = NULL;
+	result-rowProcessorErrMes = NULL;
 
 	if (conn)
 	{
@@ -194,6 +198,10 @@ PQmakeEmptyPGresult(PGconn 

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-01-30 Thread horiguchi . kyotaro
I'm sorry.

 Thank you for comments, this is revised version of the patch.

The malloc error handling in dblink.c in the patch is broken.  It
is leaking memory and breaking control flow.

I'll re-send a properly fixed patch for dblink.c later.

# This severe back pain must have made me stupid :-p

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-01-30 Thread Marko Kreen
On Mon, Jan 30, 2012 at 06:06:57PM +0900, Kyotaro HORIGUCHI wrote:
 The gain of performance is more than expected. Measure script now
 does query via dblink ten times for stability of measuring, so
 the figures become about ten times longer than the previous ones.
 
sec% to Original
 Original : 31.5 100.0%
 RowProcessor patch   : 31.3  99.4%
 dblink patch : 24.6  78.1%
 
 RowProcessor patch alone makes no loss or very-little gain, and
 full patch gives us 22% gain for the benchmark(*1).

Excellent!

 - The callers of RowProcessor() no more set null_field to
   PGrowValue.value. Plus, the PGrowValue[] which RowProcessor()
   receives has nfields + 1 elements to be able to make rough
   estimate by cols-value[nfields].value - cols-value[0].value -
   something.  The somthing here is 4 * nfields for protocol3 and
   4 * (non-null fields) for protocol2. I fear that this applies
   only for textual transfer usage...

An exact estimate is not important here.  And the (nfields + 1) element
feels like a bit too much magic, considering that most users probably
do not need it.  Without it, the logic would be:

 total = last.value - first.value + ((last.len > 0) ? last.len : 0)

which isn't too complex.  So I think we can remove it.
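
In row-processor terms that formula would look roughly like this (a sketch,
assuming the columns array and nfields as passed to the row processor):

static size_t
estimate_row_size(PGrowValue *columns, int nfields)
{
    PGrowValue *first = &columns[0];
    PGrowValue *last = &columns[nfields - 1];

    /* all value pointers point into the same network buffer */
    return (size_t) (last->value - first->value)
        + ((last->len > 0) ? last->len : 0);
}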


= Problems =

* Remove the dubious memcpy() in pqAddRow()

* I think the dynamic arrays in getAnotherTuple() are not portable enough;
  please do a proper allocation for the array.  I guess in PQsetResultAttrs()?


= Minor notes =

These can be argued either way, if you don't like some
suggestion, you can drop it.

* Move PQregisterRowProcessor() into fe-exec.c, then we can make
  pqAddRow static.

* Should PQclear() free the RowProcessor error msg?  It seems
  it should not escape from getAnotherTuple(), but
  that's not certain.  Perhaps it would be clearer to free
  it here too.

* Remove this part of the comment in getAnotherTuple():
   * Buffer content may be shifted on reloading additional
   * data. So we must set all pointers on every scan.

  It's confusing why it needs to clarify that, as there
  is nobody expecting it.

* PGrowValue documentation should mention that the ->value pointer
  is always valid.

* dblink: Perhaps some of those mallocs() could be replaced
  with pallocs() or even StringInfo, which already does
  the realloc dance?  I'm not familiar with dblink and the
  various struct lifetimes there, so I don't know whether that
  actually makes sense or not.


It seems this patch is getting ReadyForCommitter soon...

-- 
marko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-01-30 Thread Kyotaro HORIGUCHI
This is fixed version of dblink.c for row processor.

 i'll re-send the properly fixed patch for dblink.c later.

- malloc error in initStoreInfo throws ERRCODE_OUT_OF_MEMORY. (new error)

- storeHandler() now returns FALSE on malloc failure.  Garbage
  cleanup is done in dblink_fetch() or dblink_record_internal().
  The behavior that dblink displays this error as 'unknown
  error/could not execute query' in the user session is the same
  as before.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 36a8e3e..7a82ea1 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -63,11 +63,23 @@ typedef struct remoteConn
 	bool		newXactForCursor;		/* Opened a transaction for a cursor */
 } remoteConn;
 
+typedef struct storeInfo
+{
+	Tuplestorestate *tuplestore;
+	int nattrs;
+	MemoryContext oldcontext;
+	AttInMetadata *attinmeta;
+	char** valbuf;
+	int *valbuflen;
+	bool error_occurred;
+	bool nummismatch;
+	ErrorData *edata;
+} storeInfo;
+
 /*
  * Internal declarations
  */
 static Datum dblink_record_internal(FunctionCallInfo fcinfo, bool is_async);
-static void materializeResult(FunctionCallInfo fcinfo, PGresult *res);
 static remoteConn *getConnectionByName(const char *name);
 static HTAB *createConnHash(void);
 static void createNewConnection(const char *name, remoteConn *rconn);
@@ -90,6 +102,10 @@ static char *escape_param_str(const char *from);
 static void validate_pkattnums(Relation rel,
    int2vector *pkattnums_arg, int32 pknumatts_arg,
    int **pkattnums, int *pknumatts);
+static void initStoreInfo(storeInfo *sinfo, FunctionCallInfo fcinfo);
+static void finishStoreInfo(storeInfo *sinfo);
+static int storeHandler(PGresult *res, void *param, PGrowValue *columns);
+
 
 /* Global */
 static remoteConn *pconn = NULL;
@@ -503,6 +519,7 @@ dblink_fetch(PG_FUNCTION_ARGS)
 	char	   *curname = NULL;
 	int			howmany = 0;
 	bool		fail = true;	/* default to backward compatible */
+	storeInfo   storeinfo;
 
 	DBLINK_INIT;
 
@@ -559,15 +576,36 @@ dblink_fetch(PG_FUNCTION_ARGS)
 	appendStringInfo(buf, FETCH %d FROM %s, howmany, curname);
 
 	/*
+	 * Result is stored into storeinfo.tuplestore instead of
+	 * res-result retuned by PQexec below
+	 */
+	initStoreInfo(storeinfo, fcinfo);
+	PQregisterRowProcessor(conn, storeHandler, storeinfo);
+
+	/*
 	 * Try to execute the query.  Note that since libpq uses malloc, the
 	 * PGresult will be long-lived even though we are still in a short-lived
 	 * memory context.
 	 */
 	res = PQexec(conn, buf.data);
+	finishStoreInfo(storeinfo);
+
 	if (!res ||
 		(PQresultStatus(res) != PGRES_COMMAND_OK 
 		 PQresultStatus(res) != PGRES_TUPLES_OK))
 	{
+		/* finishStoreInfo saves the fields referred to below. */
+		if (storeinfo.nummismatch)
+		{
+			/* This is only for backward compatibility */
+			ereport(ERROR,
+	(errcode(ERRCODE_DATATYPE_MISMATCH),
+	 errmsg(remote query result rowtype does not match 
+			the specified FROM clause rowtype)));
+		}
+		else if (storeinfo.edata)
+			ReThrowError(storeinfo.edata);
+
 		dblink_res_error(conname, res, could not fetch from cursor, fail);
 		return (Datum) 0;
 	}
@@ -579,8 +617,8 @@ dblink_fetch(PG_FUNCTION_ARGS)
 (errcode(ERRCODE_INVALID_CURSOR_NAME),
  errmsg(cursor \%s\ does not exist, curname)));
 	}
+	PQclear(res);
 
-	materializeResult(fcinfo, res);
 	return (Datum) 0;
 }
 
@@ -640,6 +678,7 @@ dblink_record_internal(FunctionCallInfo fcinfo, bool is_async)
 	remoteConn *rconn = NULL;
 	bool		fail = true;	/* default to backward compatible */
 	bool		freeconn = false;
+	storeInfo   storeinfo;
 
 	/* check to see if caller supports us returning a tuplestore */
 	if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
@@ -715,164 +754,217 @@ dblink_record_internal(FunctionCallInfo fcinfo, bool is_async)
 	rsinfo-setResult = NULL;
 	rsinfo-setDesc = NULL;
 
+
+	/*
+	 * Result is stored into storeinfo.tuplestore instead of
+	 * res-result retuned by PQexec/PQgetResult below
+	 */
+	initStoreInfo(storeinfo, fcinfo);
+	PQregisterRowProcessor(conn, storeHandler, storeinfo);
+
 	/* synchronous query, or async result retrieval */
 	if (!is_async)
 		res = PQexec(conn, sql);
 	else
-	{
 		res = PQgetResult(conn);
-		/* NULL means we're all done with the async results */
-		if (!res)
-			return (Datum) 0;
-	}
 
-	/* if needed, close the connection to the database and cleanup */
-	if (freeconn)
-		PQfinish(conn);
+	finishStoreInfo(storeinfo);
 
-	if (!res ||
-		(PQresultStatus(res) != PGRES_COMMAND_OK 
-		 PQresultStatus(res) != PGRES_TUPLES_OK))
+	/* NULL res from async get means we're all done with the results */
+	if (res || !is_async)
 	{
-		dblink_res_error(conname, res, could not execute query, fail);
-		return (Datum) 0;
+		if (freeconn)
+			PQfinish(conn);
+
+		if (!res ||
+			(PQresultStatus(res) != PGRES_COMMAND_OK 
+			 PQresultStatus(res) != PGRES_TUPLES_OK))
+		{
+			/* finishStoreInfo saves the fields referred 

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-01-30 Thread Marko Kreen
On Tue, Jan 31, 2012 at 4:59 AM, Kyotaro HORIGUCHI
horiguchi.kyot...@oss.ntt.co.jp wrote:
 This is fixed version of dblink.c for row processor.

 i'll re-send the properly fixed patch for dblink.c later.

 - malloc error in initStoreInfo throws ERRCODE_OUT_OF_MEMORY. (new error)

 - storeHandler() now returns FALSE on malloc failure. Garbage
  cleanup is done in dblink_fetch() or dblink_record_internal().
  The behavior that this dblink displays this error as 'unkown
  error/could not execute query' on the user session is as it did
  before.

Another thing: if realloc() fails, the old pointer stays valid.
So the result needs to be assigned to a temp variable first, before
overwriting the old pointer.
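
A minimal sketch of that pattern, using the valbuf field of the storeInfo
struct from this dblink patch (needs <stdlib.h>):

static int
grow_valbuf(storeInfo *sinfo, int newsize)
{
    /* assign realloc()'s result to a temp first: on failure the old buffer stays valid */
    char      **tmp = realloc(sinfo->valbuf, newsize * sizeof(char *));

    if (tmp == NULL)
        return FALSE;               /* sinfo->valbuf is untouched and still freeable */

    sinfo->valbuf = tmp;
    return TRUE;
}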

And seems malloc() is preferable to palloc() to avoid
exceptions inside row processor.  Although latter
one could be made to work, it might be unnecessary
complexity.  (store current pqresult into remoteConn?)

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-01-27 Thread Kyotaro HORIGUCHI
Hello, This is a new version of the patch formerly known as
'alternative storage for libpq'.

- Changed the concept to 'Alternative Row Processor' from
  'Storage handler'. Symbol names are also changed.

- The callback function is modified following the comments.

- Due to time restrictions, I did only minimal checking of this
  patch.  The purpose of this patch is to show the new implementation.

- Performance is not measured for this patch for the same
  reason.  I will do that next Monday.

- The meaning of PGresAttValue is changed. The field 'value' now
  contains a value withOUT terminating zero. This change seems to
  have no effect on any other portion within the whole source
  tree of postgresql from what I've seen.


  I would like to propose better one-shot API with:
  
  void *(*RowStoreHandler)(PGresult *res, PGresAttValue *columns);
...
  1) Pass-through processing do not need to care about unnecessary
 per-row allocations.
  
  2) Handlers that want to copy of the row (like regular libpq),
 can optimize allocations by having global view of the row.
 (Eg. One allocation for row header + data).

I expect the new implementation to be far better than the
original.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 1af8df6..c47af3a 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -160,3 +160,6 @@ PQconnectStartParams  157
 PQping158
 PQpingParams  159
 PQlibVersion  160
+PQregisterRowProcessor	  161
+PQgetRowProcessorParam	  163
+PQsetRowProcessorErrMes	  164
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index d454538..93803d5 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2692,6 +2692,7 @@ makeEmptyPGconn(void)
 	conn-allow_ssl_try = true;
 	conn-wait_ssl_try = false;
 #endif
+	conn-rowProcessor = NULL;
 
 	/*
 	 * We try to send at least 8K at a time, which is the usual size of pipe
@@ -5076,3 +5077,10 @@ PQregisterThreadLock(pgthreadlock_t newhandler)
 
 	return prev;
 }
+
+void
+PQregisterRowProcessor(PGconn *conn, RowProcessor func, void *param)
+{
+	conn-rowProcessor = func;
+	conn-rowProcessorParam = param;
+}
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index b743566..5d78b39 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -66,7 +66,7 @@ static PGresult *PQexecFinish(PGconn *conn);
 static int PQsendDescribe(PGconn *conn, char desc_type,
 			   const char *desc_target);
 static int	check_field_number(const PGresult *res, int field_num);
-
+static void *pqAddTuple(PGresult *res, PGresAttValue *columns);
 
 /* 
  * Space management for PGresult.
@@ -160,6 +160,9 @@ PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
 	result-curBlock = NULL;
 	result-curOffset = 0;
 	result-spaceLeft = 0;
+	result-rowProcessor = pqAddTuple;
+	result-rowProcessorParam = NULL;
+	result-rowProcessorErrMes = NULL;
 
 	if (conn)
 	{
@@ -194,6 +197,12 @@ PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
 			}
 			result-nEvents = conn-nEvents;
 		}
+
+		if (conn-rowProcessor)
+		{
+			result-rowProcessor = conn-rowProcessor;
+			result-rowProcessorParam = conn-rowProcessorParam;
+		}
 	}
 	else
 	{
@@ -445,7 +454,7 @@ PQsetvalue(PGresult *res, int tup_num, int field_num, char *value, int len)
 		}
 
 		/* add it to the array */
-		if (!pqAddTuple(res, tup))
+		if (pqAddTuple(res, tup) == NULL)
 			return FALSE;
 	}
 
@@ -701,7 +710,6 @@ pqClearAsyncResult(PGconn *conn)
 	if (conn-result)
 		PQclear(conn-result);
 	conn-result = NULL;
-	conn-curTuple = NULL;
 }
 
 /*
@@ -756,7 +764,6 @@ pqPrepareAsyncResult(PGconn *conn)
 	 */
 	res = conn-result;
 	conn-result = NULL;		/* handing over ownership to caller */
-	conn-curTuple = NULL;		/* just in case */
 	if (!res)
 		res = PQmakeEmptyPGresult(conn, PGRES_FATAL_ERROR);
 	else
@@ -829,12 +836,17 @@ pqInternalNotice(const PGNoticeHooks *hooks, const char *fmt,...)
 
 /*
  * pqAddTuple
- *	  add a row pointer to the PGresult structure, growing it if necessary
- *	  Returns TRUE if OK, FALSE if not enough memory to add the row
+ *	  add a row to the PGresult structure, growing it if necessary
+ *	  Returns the pointer to the new tuple if OK, NULL if not enough
+ *	  memory to add the row.
  */
-int
-pqAddTuple(PGresult *res, PGresAttValue *tup)
+void *
+pqAddTuple(PGresult *res, PGresAttValue *columns)
 {
+	PGresAttValue *tup;
+	int nfields = res-numAttributes;
+	int i;
+
 	if (res-ntups = res-tupArrSize)
 	{
 		/*
@@ -858,13 +870,39 @@ pqAddTuple(PGresult *res, PGresAttValue *tup)
 			newTuples = (PGresAttValue **)
 realloc(res-tuples, newSize * sizeof(PGresAttValue *));
 		if (!newTuples)
-			return FALSE;		/* malloc or realloc failed */
+			return NULL;		/* malloc or realloc failed */
 		

Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-01-27 Thread Merlin Moncure
On Fri, Jan 27, 2012 at 2:57 AM, Kyotaro HORIGUCHI
horiguchi.kyot...@oss.ntt.co.jp wrote:
 Hello, This is a new version of the patch formerly known as
 'alternative storage for libpq'.

I took a quick look at the patch and the docs.  Looks good; I agree
with the rationale and implementation.  I see you covered the PQsetvalue
case, which is nice.  I expect libpq C API clients coded for
performance will immediately gravitate to this API.

 - The meaning of PGresAttValue is changed. The field 'value' now
  contains a value withOUT terminating zero. This change seems to
  have no effect on any other portion within the whole source
  tree of postgresql from what I've seen.

This is a minor point of concern.  This function was exposed to
support libpqtypes (which your stuff compliments very nicely by the
way) and I quickly confirmed removal of the null terminator didn't
cause any problems there.  I doubt anyone else is inspecting the
structure directly (also searched the archives and didn't find
anything).

This needs to be advertised very loudly in the docs -- I understand
why this was done but it's a pretty big change in the way the api
works.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-01-27 Thread Marko Kreen
On Fri, Jan 27, 2012 at 05:57:01PM +0900, Kyotaro HORIGUCHI wrote:
 Hello, This is a new version of the patch formerly known as
 'alternative storage for libpq'.
 
 - Changed the concept to 'Alternative Row Processor' from
   'Storage handler'. Symbol names are also changed.
 
 - Callback function is modified following to the comment.
 
 - From the restriction of time, I did minimum check for this
   patch. The purpose of this patch is to show the new implement.
 
 - Proformance is not measured for this patch for the same
   reason. I will do that on next monday.
 
 - The meaning of PGresAttValue is changed. The field 'value' now
   contains a value withOUT terminating zero. This change seems to
   have no effect on any other portion within the whole source
   tree of postgresql from what I've seen.


I think we have general structure in place.  Good.

Minor notes:

= rowhandler api =

* It returns bool, so void* is wrong.  Instead libpq style is to use int,
  with 1=OK, 0=Failure.  Seems that was also old pqAddTuple() convention.

* Drop PQgetRowProcessorParam(), instead give param as argument.

* PQsetRowProcessorErrMes() should strdup() the message.  That gets rid of
  allocator requirements in the API.  This also makes it safe to pass static
  strings there.  If strdup() fails, fall back to a generic no-mem message.

* Create new struct to replace PGresAttValue for rowhandler usage.
  RowHandler API is pretty unique and self-contained.  It should have
  its own struct.  The main reason is that it allows documenting it properly.
  Otherwise the minor details get lost as they are different from
  libpq-internal usage.  Also this allows two structs to be
  improved separately.  (PGresRawValue?)

* Stop storing null_value into ->value.  It's a libpq internal detail.
  Instead, ->value should always point into the buffer where the value
  info is located, even for NULL.  This makes it safe to simply subtract
  pointers to get a row size estimate.  Seems pqAddTuple() already
  does the null_value logic, so no need to do it in the rowhandler API.

= libpq =

Currently it's confusing whether rowProcessor can be NULL, and what
should be done if so.  I think it's better to fix the usage so that
it is always set.

* PQregisterRowProcessor() should use the default func if func==NULL,
  i.e. set the default handler in that case.
* Never set rowProcessor directly, always via PQregisterRowProcessor()
* Drop all if(rowProcessor) checks.

= dblink =

* There are malloc failure checks missing in initStoreInfo() & storeHandler().


-- 
marko


PS.  You did not hear it from me, but most raw values are actually
NUL-terminated in the protocol.  Think big-endian.  And those which
are not, you can make so, as the data is not touched anymore.
You cannot do it for the last value, as the next byte may not be allocated.
But you could memmove() it to a lower address so you can NUL-terminate.

I'm not suggesting it for official patch, but it would be fun to know
if such hack is benchmarkable, and benchmarkable on realistic load.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-01-27 Thread Marko Kreen
On Fri, Jan 27, 2012 at 09:35:04AM -0600, Merlin Moncure wrote:
 On Fri, Jan 27, 2012 at 2:57 AM, Kyotaro HORIGUCHI
  - The meaning of PGresAttValue is changed. The field 'value' now
   contains a value withOUT terminating zero. This change seems to
   have no effect on any other portion within the whole source
   tree of postgresql from what I've seen.
 
 This is a minor point of concern.  This function was exposed to
 support libpqtypes (which your stuff compliments very nicely by the
 way) and I quickly confirmed removal of the null terminator didn't
 cause any problems there.  I doubt anyone else is inspecting the
 structure directly (also searched the archives and didn't find
 anything).
 
 This needs to be advertised very loudly in the docs -- I understand
 why this was done but it's a pretty big change in the way the api
 works.

Note that the non-NUL-terminated PGresAttValue is only used for row
handler.  So no existing usage is affected.

But I agree using same struct in different situations is confusing,
thus the request for separate struct for row handler usage.

-- 
marko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-01-26 Thread Kyotaro HORIGUCHI
Thank you for the comment,

 First, my priority is one-the-fly result processing,
 not the allocation optimizing.  And this patch seems to make
 it possible, I can process results row-by-row, without the
 need to buffer all of them in PQresult.  Which is great!
 
 But the current API seems clumsy, I guess its because the
 patch grew from trying to replace the low-level allocator.

 Exactly.

 I would like to propose better one-shot API with:
 
 void *(*RowStoreHandler)(PGresult *res, PGresAttValue *columns);
 
 where the PGresAttValue * is allocated once, inside PQresult.
 And the pointers inside point directly to network buffer.

 Good catch, thank you.  The patch carries too much over from the
old implementation.  There is no need to copy the data inside
getAnotherTuple to do it, as you say.

 Ofcourse this requires replacing the current per-column malloc+copy
 pattern with per-row parse+handle pattern, but I think resulting
 API will be better:
 
 1) Pass-through processing do not need to care about unnecessary
per-row allocations.
 
 2) Handlers that want to copy of the row (like regular libpq),
can optimize allocations by having global view of the row.
(Eg. One allocation for row header + data).
 
 This also optimizes call patterns - first libpq parses packet,
 then row handler processes row, no unnecessary back-and-forth.
 
 
 Summary - current API has various assumptions how the row is
 processed, let's remove those.

 Thank you, I will rewrite the patch to realize that.

 regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-01-23 Thread Marko Kreen
On Sat, Jan 21, 2012 at 1:52 PM, Marc Mamin m.ma...@intershop.de wrote:
 
  c. Refine error handling of dblink.c. I think it preserves the
     previous behavior for column number mismatch and type
     conversion exception.

 Hello,

 I don't know if this cover following issue.
 I just mention it for the case you didn't notice it and would like to
 handle this rather cosmetic issue as well.

 http://archives.postgresql.org/pgsql-bugs/2011-08/msg00113.php

It is not relevant to this thread, but it does seem a good idea to implement.
It should be a simple matter of creating a handler that uses dblink_res_error()
to report the notice.

Perhaps you could create and submit the patch by yourself?

For reference, here is the full flow in PL/Proxy:

1) PQsetNoticeReceiver:
https://github.com/markokr/plproxy-dev/blob/master/src/execute.c#L422
2) handle_notice:
https://github.com/markokr/plproxy-dev/blob/master/src/execute.c#L370
3) plproxy_remote_error:
https://github.com/markokr/plproxy-dev/blob/master/src/main.c#L82

-- 
marko



Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-01-21 Thread Marc Mamin
 
  c. Refined the error handling of dblink.c. I think it preserves the
     previous behavior for column-number mismatches and type-conversion
     exceptions.

Hello,

I don't know if this covers the following issue.
I just mention it in case you didn't notice it and would like to
handle this rather cosmetic issue as well.

http://archives.postgresql.org/pgsql-bugs/2011-08/msg00113.php

best regards,

Marc Mamin




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-01-20 Thread Marko Kreen
On Tue, Jan 17, 2012 at 05:53:33PM +0900, Kyotaro HORIGUCHI wrote:
 Hello. This is a revised and rebased version of the patch.
 
 a. The old term `Add Tuple Function' is changed to `Store
    Handler'. The reason for not using `Storage' is simply the length
    of the symbols.
 
 b. I couldn't find a good place for PGgetAsCString(), so it is
    removed, and storeHandler() in dblink.c touches PGresAttValue
    directly in this new patch. The definition of PGresAttValue stays
    in libpq-fe.h and is provided with a comment.
 
 c. Refined the error handling of dblink.c. I think it preserves the
    previous behavior for column-number mismatches and type-conversion
    exceptions.
 
 d. The documentation is revised.

First, my priority is on-the-fly result processing,
not allocation optimization.  And this patch seems to make
that possible: I can process results row-by-row, without the
need to buffer all of them in PQresult.  Which is great!

But the current API seems clumsy, I guess it's because the
patch grew from trying to replace the low-level allocator.

I would like to propose a better one-shot API:

void *(*RowStoreHandler)(PGresult *res, PGresAttValue *columns);

where the PGresAttValue * is allocated once, inside PQresult.
And the pointers inside point directly into the network buffer.
Of course this requires replacing the current per-column malloc+copy
pattern with a per-row parse+handle pattern, but I think the resulting
API will be better:

1) Pass-through processing does not need to care about unnecessary
   per-row allocations.

2) Handlers that want a copy of the row (like regular libpq)
   can optimize allocations by having a global view of the row.
   (E.g. one allocation for row header + data.)

This also optimizes the call pattern: first libpq parses the packet,
then the row handler processes the row, no unnecessary back-and-forth.
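
To illustrate (this is only a sketch, not part of any patch; the len/value
field names and the non-NULL-means-success return convention are assumptions
the proposal above does not pin down), a pass-through handler under that
signature could be as small as:

#include <stdio.h>
#include "libpq-fe.h"

/* Print each column of the current row without keeping any copy. */
static void *
print_row(PGresult *res, PGresAttValue *columns)
{
	int			nfields = PQnfields(res);
	int			i;

	for (i = 0; i < nfields; i++)
	{
		if (columns[i].len < 0)
			printf("%s=NULL ", PQfname(res, i));
		else
			printf("%s=%.*s ", PQfname(res, i),
				   columns[i].len, columns[i].value);
	}
	printf("\n");
	return res;					/* any non-NULL pointer: success, nothing stored */
}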


Summary: the current API makes various assumptions about how the row is
processed; let's remove those.

-- 
marko




Re: [HACKERS] Speed dblink using alternate libpq tuple storage

2012-01-17 Thread Kyotaro HORIGUCHI
Hello. This is a revised and rebased version of the patch.

a. The old term `Add Tuple Function' is changed to `Store
   Handler'. The reason for not using `Storage' is simply the length
   of the symbols.

b. I couldn't find a good place for PGgetAsCString(), so it is
   removed, and storeHandler() in dblink.c touches PGresAttValue
   directly in this new patch. The definition of PGresAttValue stays
   in libpq-fe.h and is provided with a comment.

c. Refined the error handling of dblink.c. I think it preserves the
   previous behavior for column-number mismatches and type-conversion
   exceptions.

d. The documentation is revised.

 It jumped from 332K tuples/sec to 450K, a 35% gain, and had a
 lower memory footprint too.  Test methodology and those results
 are at
 http://archives.postgresql.org/pgsql-hackers/2011-12/msg8.php

Disappointingly, re-measurement showed the gain to be lower than that.

On CentOS 6.2, with the other conditions the same as in the previous
testing, the overall performance became higher: the loss from the libpq
patch was 1.8%, and the gain from the full patch fell to 5.6%. But the
reduction in memory usage was unchanged.

Original : 3.96s  100.0%
w/libpq patch: 4.03s  101.8%
w/libpq+dblink patch : 3.74s   94.4%


The attachments are listed below.

libpq_altstore_20120117.patch
  - Allow alternative storage for libpq.

dblink_perf_20120117.patch
  - Modify dblink to use alternative storage mechanism.
 
libpq_altstore_doc_20120117.patch
  - Documentation for libpq_altstore. Appears as section 31.19, Alternative result storage.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 1af8df6..83525e1 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -160,3 +160,6 @@ PQconnectStartParams  157
 PQping158
 PQpingParams  159
 PQlibVersion  160
+PQregisterStoreHandler	  161
+PQgetStoreHandlerParam	  163
+PQsetStoreHandlerErrMes	  164
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index d454538..5559f0b 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2692,6 +2692,7 @@ makeEmptyPGconn(void)
 	conn->allow_ssl_try = true;
 	conn->wait_ssl_try = false;
 #endif
+	conn->storeHandler = NULL;
 
 	/*
 	 * We try to send at least 8K at a time, which is the usual size of pipe
@@ -5076,3 +5077,10 @@ PQregisterThreadLock(pgthreadlock_t newhandler)
 
 	return prev;
 }
+
+void
+PQregisterStoreHandler(PGconn *conn, StoreHandler func, void *param)
+{
+	conn->storeHandler = func;
+	conn->storeHandlerParam = param;
+}
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index b743566..96e5974 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -67,6 +67,10 @@ static int PQsendDescribe(PGconn *conn, char desc_type,
 			   const char *desc_target);
 static int	check_field_number(const PGresult *res, int field_num);
 
+static void *pqDefaultStoreHandler(PGresult *res, PQStoreFunc func,
+   int id, size_t len);
+static void *pqAddTuple(PGresult *res, PGresAttValue *tup);
+
 
 /* 
  * Space management for PGresult.
@@ -160,6 +164,9 @@ PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
 	result->curBlock = NULL;
 	result->curOffset = 0;
 	result->spaceLeft = 0;
+	result->storeHandler = pqDefaultStoreHandler;
+	result->storeHandlerParam = NULL;
+	result->storeHandlerErrMes = NULL;
 
 	if (conn)
 	{
@@ -194,6 +201,12 @@ PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
 			}
 			result->nEvents = conn->nEvents;
 		}
+
+		if (conn->storeHandler)
+		{
+			result->storeHandler = conn->storeHandler;
+			result->storeHandlerParam = conn->storeHandlerParam;
+		}
 	}
 	else
 	{
@@ -487,6 +500,33 @@ PQresultAlloc(PGresult *res, size_t nBytes)
 	return pqResultAlloc(res, nBytes, TRUE);
 }
 
+void *
+pqDefaultStoreHandler(PGresult *res, PQStoreFunc func, int id, size_t len)
+{
+	void *p;
+
+	switch (func)
+	{
+		case PQSF_ALLOC_TEXT:
+			return pqResultAlloc(res, len, TRUE);
+
+		case PQSF_ALLOC_BINARY:
+			p = pqResultAlloc(res, len, FALSE);
+
+			if (id == -1)
+				res->storeHandlerParam = p;
+
+			return p;
+
+		case PQSF_ADD_TUPLE:
+			return pqAddTuple(res, res->storeHandlerParam);
+
+		default:
+			/* Ignore */
+			break;
+	}
+	return NULL;
+}
 /*
  * pqResultAlloc -
  *		Allocate subsidiary storage for a PGresult.
@@ -830,9 +870,9 @@ pqInternalNotice(const PGNoticeHooks *hooks, const char *fmt,...)
 /*
  * pqAddTuple
  *	  add a row pointer to the PGresult structure, growing it if necessary
- *	  Returns TRUE if OK, FALSE if not enough memory to add the row
+ *	  Returns tup if OK, NULL if not enough memory to add the row.
  */
-int
+static void *
 pqAddTuple(PGresult *res, PGresAttValue *tup)
 {
 	if (res->ntups >= res->tupArrSize)
@@ -858,13 +898,13 @@ pqAddTuple(PGresult *res,