Re: The ultimate extension hook.

2020-10-25 Thread Daniel Wood
> On 10/23/2020 9:31 AM Jehan-Guillaume de Rorthais  wrote:
> [...]
> * useless with encrypted traffic
> 
> So, +1 for such hooks.
> 
> Regards,

Ultimately PostgreSQL is supposed to be extensible.
I don't see an API hook as being some crazy idea even if some may not like what 
I might want to use it for.  It can be useful for a number of things.

- Dan




Re: The ultimate extension hook.

2020-10-23 Thread Jehan-Guillaume de Rorthais
On Thu, 24 Sep 2020 17:08:44 +1200
David Rowley  wrote:
[...]
> I wondered if there was much in the way of use-cases like a traffic
> filter, or statement replication. I wasn't sure if it was a solution
> looking for a problem or not, but it seems like it could be productive
> to talk about possibilities here and make a judgement call based on if
> any alternatives exist today that will allow that problem to be solved
> sufficiently in another way.

If I understand the proposal correctly, this might enable traffic capture using
a loadable extension.

This kind of usage would allow replaying and validating any kind of traffic from
one major version to another, e.g. to look for regressions from the application's
point of view before a major upgrade.

I did such regression tests in the past. We captured production traffic
using libpcap and replayed it using pgshark on an upgraded test env. Very handy.
However:

* libpcap can drop network packets under high load. This makes the capture
  painful to recover past the hole.
* useless with encrypted traffic

So, +1 for such hooks.

Regards,




Re: The ultimate extension hook.

2020-09-23 Thread Daniel Wood


> On 09/23/2020 9:26 PM Tom Lane  wrote:
> ...
> > The hook I'd like to see would be in the PostgresMain() loop
> > for the API "firstchar" messages.
> 
> What, to invent your own protocol?  Where will you find client libraries
> buying into that?

No API/client changes are needed for:
1) API tracing/filtering; or
3) custom SQL-like commands through a trivial modification to Simple Query 
'Q'.  Purely optional, as you'll see at the end.

Yes, (2), the API extension "case 'A'", could be used to roll one's own protocol.  
When pondering API hooking in general I thought of this also, but don't let it 
be a distraction.

> I'm not really convinced that any of the specific use-cases you suggest
> are untenable to approach via the existing function fastpath mechanism,
> anyway.

Certainly (3) is just a command-level way to execute a function instead of 
'select myfunc()'.  But the 'select' form does go through the SQL machinery and 
SQL argument type lookup and processing.  I like fast and direct things.  And 
(3) is so trivial to implement.

However, even fastpath doesn't provide a protocol hook function where tracing 
could be done.  If I had that alone I could do my own 'Q' hook and do the 
"!cmd" processing in my extension even if I sold the idea just based on 
tracing/filtering.

We hook all kinds of things in PG.  Think big.  Why should the protocol 
processing not have a hook?  I'll bet some others will think of things I 
haven't even yet thought of that would leverage this.

- Dan Wood




Re: The ultimate extension hook.

2020-09-23 Thread David Rowley
On Thu, 24 Sep 2020 at 16:26, Tom Lane  wrote:
>
> Daniel Wood  writes:
> > Hooks exist all over PG for extensions to cover various specific usages.
> > The hook I'd like to see would be in the PostgresMain() loop
> > for the API "firstchar" messages.
>
> What, to invent your own protocol?  Where will you find client libraries
> buying into that?

Well, Dan did mention other use cases.  It's certainly questionable if
people wanted to use it to invent their own message types as they'd
need client support.  However, when it comes to newly proposed hooks,
I thought we should be asking ourselves questions like, are there
legitimate use cases for this?  Is it safe to expose this?  It seems a
bit backwards to consider illegitimate uses of a hook unless they
relate to security.

> I'm not really convinced that any of the specific use-cases you suggest
> are untenable to approach via the existing function fastpath mechanism,
> anyway.

I wondered if there was much in the way of use-cases like a traffic
filter, or statement replication. I wasn't sure if it was a solution
looking for a problem or not, but it seems like it could be productive
to talk about possibilities here and make a judgement call based on if
any alternatives exist today that will allow that problem to be solved
sufficiently in another way.

David




Re: The ultimate extension hook.

2020-09-23 Thread Tom Lane
Daniel Wood  writes:
> Hooks exist all over PG for extensions to cover various specific usages.
> The hook I'd like to see would be in the PostgresMain() loop
> for the API "firstchar" messages.

What, to invent your own protocol?  Where will you find client libraries
buying into that?

I'm not really convinced that any of the specific use-cases you suggest
are untenable to approach via the existing function fastpath mechanism,
anyway.

regards, tom lane




The ultimate extension hook.

2020-09-23 Thread Daniel Wood
Hooks exist all over PG for extensions to cover various specific usages.

The hook I'd like to see would be in the PostgresMain() loop
for the API "firstchar" messages.

I started out just wanting the hook for the absolute minimum overhead to 
execute a function, even faster than fastpath, but in brainstorming with David 
Rowley other use cases became apparent:

API tracing within the engine.  I've heard of client tools for this.
API filtering.  Block/ignore manual checkpoints for instance.
API message altering.
Anything you want to hook into at the highest level well above ExecutorRun.

Originally I just wanted a lightweight mechanism to capture some system 
counters like /proc/stat without going through the SQL execution machinery.  
I'm picky about implementing stuff in the absolute fastest way.  :-)  But I 
think there are other practical things that I haven't even thought of yet.

There are a few implementation mechanisms which achieve slightly different 
possibilities:

1) The generic mechanism would let one or more API filters be installed to 
directly call functions in an extension.  There would be no SQL arg processing 
overhead based on the specific function.  You'd just pass it the 
StringInfoData 'msg' itself.  Multiple extensions might use the hook, so you'd 
need to rewind the StringInfo buffer between calls.  Maybe the hook returns a 
boolean: true for no further processing of this message, false to fall through 
to the normal "switch (firstchar)" processing.

2) switch (firstchar) { case 'A': // New trivial API message for extensions
which would call a single extension installed function to do whatever I wanted 
based on the message payload.  And, yes, I know this can be done just using 
SQL.  It is simply a variation.  But this would require client support and I 
prefer the below.

3) case 'Q':  /* simple query */
       if (pq_peekbyte() == '!' && APIHook != NULL) {
           (*APIHook)(msg);
           ...
           continue;
       }
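
As an aside on mechanism (1) above, here is a minimal, self-contained sketch of a filter chain with the "rewind between filters" and "boolean means handled" behaviour.  It is only an illustration, not PostgreSQL code: ApiMsg is a hypothetical stand-in for StringInfoData, and register_api_filter(), run_api_filters() and the two example filters are invented names.  The checkpoint filter mirrors the "block manual checkpoints" idea and only does a naive, case-sensitive prefix match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for StringInfoData: payload plus a read cursor. */
typedef struct ApiMsg {
    char        type;    /* protocol "firstchar", e.g. 'Q' */
    const char *data;    /* message payload */
    size_t      len;     /* payload length in bytes */
    size_t      cursor;  /* read position, advanced by whoever parses */
} ApiMsg;

/* A filter returns true to mean "handled, skip normal processing". */
typedef bool (*api_filter_fn)(ApiMsg *msg);

#define MAX_API_FILTERS 8
static api_filter_fn api_filters[MAX_API_FILTERS];
static int n_api_filters = 0;

/* Register a filter; returns false when the table is full. */
bool register_api_filter(api_filter_fn fn)
{
    assert(fn != NULL);
    if (n_api_filters >= MAX_API_FILTERS)
        return false;
    api_filters[n_api_filters++] = fn;
    return true;
}

/*
 * Run the filters in registration order.  The cursor is rewound before
 * each filter so every extension sees the message from the start.
 * Returns true as soon as one filter claims the message.
 */
bool run_api_filters(ApiMsg *msg)
{
    for (int i = 0; i < n_api_filters; i++) {
        msg->cursor = 0;
        if (api_filters[i](msg))
            return true;
    }
    msg->cursor = 0;    /* rewound for the normal switch (firstchar) */
    return false;
}

/* Example filter: tracing.  Counts messages and never consumes them. */
static int traced_messages = 0;

bool trace_filter(ApiMsg *msg)
{
    traced_messages++;
    msg->cursor = msg->len;    /* pretend we read the whole payload */
    return false;
}

/* Example filter: swallow simple queries that start with CHECKPOINT. */
bool block_checkpoint_filter(ApiMsg *msg)
{
    static const char kw[] = "CHECKPOINT";
    size_t kwlen = sizeof(kw) - 1;

    return msg->type == 'Q' && msg->len >= kwlen &&
           strncmp(msg->data, kw, kwlen) == 0;
}
```

With trace_filter and block_checkpoint_filter registered in that order, a 'Q' message carrying "CHECKPOINT" is counted and then swallowed, while "SELECT 1" is counted and falls through with its cursor back at zero.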

I've used technique (3) to do things like:
    if (!strncmp(query_string, "DIEDIEDIE", 9)) {
        char *np = NULL;
        *np = 1;    /* deliberate NULL dereference to crash the backend */
    } else if (!strncmp(query_string, "PING", 4)) {
        static const char *pong = "PONG";
        pq_putmessage('C', pong, strlen(pong) + 1);
        send_ready_for_query = true;
        continue;
    } else if (...)

Then I can simply type PING into psql and get back a PONG.
Or during a stress test on a remote box I can execute the simple query 
"DIEDIEDIE" and crash the server.  I did this inline for experimentation before 
but it would be nice if I had the mechanism to use a "statement" to invoke a 
hook function in an extension.  A single check for "!" in the 'Q' processing 
would allow user defined commands in extensions.  The dispatcher would be in 
the extension.  I just need the "!" check.
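
To make the "dispatcher would be in the extension" part concrete, here is a rough, self-contained sketch of what such a dispatcher might look like.  Everything in it is hypothetical: the BangCommand table, the handler signature and dispatch_bang_command() are made-up names, and the reply is returned as a plain string rather than sent with pq_putmessage().  It assumes the command name is the text between "!" and the first whitespace or semicolon.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler: gets the text after the command name, returns a reply. */
typedef const char *(*bang_handler_fn)(const char *args);

typedef struct BangCommand {
    const char     *name;
    bang_handler_fn handler;
} BangCommand;

static const char *handle_ping(const char *args)
{
    (void) args;    /* PING takes no arguments */
    return "PONG";
}

/* Command table owned by the extension; add entries here. */
static const BangCommand bang_commands[] = {
    { "PING", handle_ping },
};

/*
 * Dispatch a simple-query string.  Returns the handler's reply, or NULL
 * when the string does not start with '!' or names an unknown command;
 * in that case the caller would fall through to normal SQL processing.
 */
const char *dispatch_bang_command(const char *query_string)
{
    assert(query_string != NULL);

    if (query_string[0] != '!')
        return NULL;

    const char *cmd = query_string + 1;
    size_t cmdlen = strcspn(cmd, " \t\n;");    /* length of command name */

    for (size_t i = 0; i < sizeof(bang_commands) / sizeof(bang_commands[0]); i++) {
        const BangCommand *bc = &bang_commands[i];

        if (strlen(bc->name) == cmdlen && strncmp(cmd, bc->name, cmdlen) == 0) {
            const char *args = cmd + cmdlen;

            args += strspn(args, " \t\n");     /* skip leading whitespace */
            return bc->handler(args);
        }
    }
    return NULL;
}
```

The core would only need the single "!" check; which commands exist, and what they do, stays entirely inside the extension's table.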

Another example where ultimate performance might be a goal: if you are 
familiar with why redis/memcached/etc. exist, then imagine loading SQL results 
into a cache in an extension and executing as a 'simple' query something like:  
!LOOKUP 
and getting the value faster than SQL could do.

Before I prototype I want to get some feedback.  Why not have a hook at the API 
level?