Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Alvaro Hernandez



On 26/09/17 17:50, Craig Ringer wrote:
On 26 September 2017 at 22:14, Magnus Hagander <mailto:mag...@hagander.net>> wrote:




On Tue, Sep 26, 2017 at 2:16 PM, Alvaro Hernandez mailto:a...@ongres.com>> wrote:




    But what about earlier versions? Any chance it could be
backported down to 9.4? If that would be acceptable, I could
probably help/do that...


The likelihood is zero if you mean backported into core of earlier
versions.


Right. We don't add features to back branches.


    Yeah, I know the policy. But asking is free ;) and in my opinion 
this would be a very good reason to have an exception, if there would be 
a clear desire to have a single, unified, production quality output 
plugin across all PostgreSQL versions. At least I tried ;)





If you mean backported as a standalone extension that could be
installed on a previous version, probably. I'm not sure if it
relies on any internals not present before that would make it
harder, but it would probably at least be possible.


All the pub/sub stuff is new and hooked into syscache etc. So you'd be 
doing a bunch of emulation/shims using user catalogs. Not impossible, 
but probably irritating and verbose. And you'd have none of the DDL 
required to manage it, so you'd need SQL-function equivalents.


I suspect you'd be better off tweaking pglogical to speak the same 
protocol as pg10, since the pgoutput protocol is an evolution of 
pglogical's protocol. Then using pglogical on older versions.



    Given all this, if I would be doing an app based on logical 
decoding, I think I will stick to test_decoding for <10....



    Thanks,

    Álvaro


--

Alvaro Hernandez


---
OnGres



Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Alvaro Hernandez



On 26/09/17 10:55, Craig Ringer wrote:
On 26 September 2017 at 15:26, Alvaro Hernandez <mailto:a...@ongres.com>> wrote:



    That's better than nothing. But as much as interoperable json
may be, people still need to talk the (binary) replication
protocol to use it.


No, they don't.

They can use the SQL interface to logical decoding.


    AFAIK the SQL interface was also designed for testing, not for 
production use...




We could enhance that with a few small changes to make it a lot more 
useful too. Most importantly, allow a long-polling model, where you 
can wait if there's nothing to consume rather than getting an 
immediate empty result-set.


    Oh, that's a completely different perspective! Do we have any 
support for long-polling style queries (not that I know of). It's indeed 
a cool thing: SQL queries that return "live" results as soon as they 
happen. I don't see this restricted to logical decoding. Actually, this 
is what PipelineDB, among other things, seem to do. +1 for that, but I 
believe it's a different story.




I expect the walsender protocol to be dealt with by client drivers, 
not user applications, much like you never really deal with the 
regular libpq protocol in your app. PgJDBC and psycopg2 already 
support it. It'd be nice if libpq offered some helper interfaces for C 
apps, and I'd help review any such patch.


    Fair enough.


    Álvaro


--

Alvaro Hernandez


---
OnGres



Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Alvaro Hernandez



On 26/09/17 12:57, Petr Jelinek wrote:

On 26/09/17 09:26, Alvaro Hernandez wrote:

On 26/09/17 10:03, Craig Ringer wrote:

On 26 September 2017 at 14:08, Alvaro Hernandez mailto:a...@ongres.com>> wrote:
 - If you stick to in-core plugins, then you need to support at
 least three different output formats if you want to support 9.4+:
 test_decoding (and pray it works!), pgoutput, and the "new"
 in-core plugin that was proposed at the beginning of this thread,
 if that would see the light.


The only practical way will IMO be to have whatever new plugin it also
have an out-of-core version maintained for older Pg versions, where it
can be installed.
   


 But only in-core plugins help for general-purpose solutions.


I still don't agree there. If there's enough need/interest/adoption
you can get cloud vendors on board, they'll feel the customer
pressure. It's not our job to create that pressure and do their work
for them.

     Don't want to get into a loop, but as I said before it's
chicken-and-egg. But nobody is asking core to do their work. As much as
I love it, I think logical decoding is a bit half-baked until there is a
single, quality, in-core plugin, as it discourages its usage, because of
the reasons I stated.


Well, in that case it's all good as PG10 has that.



    Even though it's not fully documented, I agree this could fulfill 
this gap for 10+ (I assume this plugin will be maintained onwards, at 
least to support logical replication).


    But what about earlier versions? Any chance it could be backported 
down to 9.4? If that would be acceptable, I could probably help/do that...



    Álvaro

--

Alvaro Hernandez


---
OnGres



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-26 Thread Alvaro Hernandez



On 26/09/17 10:03, Craig Ringer wrote:
On 26 September 2017 at 14:08, Alvaro Hernandez <mailto:a...@ongres.com>> wrote:




    OK, let me try to do that. I believe data integration is a
priority.


Definitely agree so far.

- If you want to develop your own output plugin, then your market
is reduced as you have to exclude all the managed solutions
(until, and only if, you would convince them to include your
plugin... highly unlikely, very difficult). And probably another %
of enterprise environments which will hesitate to link your own
plugin to the running production database. Last but not least, you
need to compile and test (and testing this is far from easy) on
multiple OSes/architectures.


Right. You probably don't want one output plugin per application.

This doesn't mean it has to be in-core, just flexible and share-able 
by many, and adopted by cloud vendors. Like say PostGIS.


    That would be nice. But this is chicken-and-egg: an out-of-core 
plugin won't probably be used by many if applications like the ones I 
was mentioning if they do not exist. And developing such an application 
is so much less interesting if a significant part of your market is 
excluded from your app.


    However, it could work the other way around: a sufficiently good 
enough in-core base plugin could foster applications/ecosystem, which 
once adopted by users could push much more easily for other more 
advanced out-of-core plugins, that would be more easily accepted by 
pressure as those tools would already be with significant traction. But 
I don't see it the other way around.




- If you stick to in-core plugins, then you need to support at
least three different output formats if you want to support 9.4+:
test_decoding (and pray it works!), pgoutput, and the "new"
in-core plugin that was proposed at the beginning of this thread,
if that would see the light.


The only practical way will IMO be to have whatever new plugin it also 
have an out-of-core version maintained for older Pg versions, where it 
can be installed.


But only in-core plugins help for general-purpose solutions.


I still don't agree there. If there's enough need/interest/adoption 
you can get cloud vendors on board, they'll feel the customer 
pressure. It's not our job to create that pressure and do their work 
for them.


    Don't want to get into a loop, but as I said before it's 
chicken-and-egg. But nobody is asking core to do their work. As much as 
I love it, I think logical decoding is a bit half-baked until there is a 
single, quality, in-core plugin, as it discourages its usage, because of 
the reasons I stated.





I see nothing wrong with a plugin starting off out of core and being 
adopted+adapted later, assuming it's done well.


That said, I'm all in favour of a generic json output plugin that 
shares infrastructure with logical replication, so people who are on 
inflexible environments have a fallback option. I just don't care to 
write it.


    That's better than nothing. But as much as interoperable json may 
be, people still need to talk the (binary) replication protocol to use 
it. So once you talk binary protocol, why not talk binary also for the 
output plugin and have a much more efficient output? Again, nothing 
against json, but if a new plugin would be included in-core, I'd say 
json + binary also. Or just document pgoutput, as it could be good enough.


    Cheers,

    Álvaro


--

Alvaro Hernandez


---
OnGres



Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-25 Thread Alvaro Hernandez



On 25/09/17 21:38, Andres Freund wrote:

On 2017-09-25 21:31:11 +0300, Alvaro Hernandez wrote:

- Distribution and testing are non-trivial: many OS/archs combinations.


Yes, it is. Why would we want to increase that burden to this community?


     That's a different story, and one I cannot argue against. If easying
postgresql integration with other tools is not something of a priority or
something the core group cannot add to all the stuff on their shoulders, all
my due respect. PostgreSQL users will do without, someway, somehow. But IMHO
this should be a really high priority, and saying that this would turn
PostgreSQL into an Oracle code base is going too far ;)

Well, we can certainly use more help doing maintenance-y stuff, which
will in turn allow to get more stuff into core in a good state medium+
term... ;)

Less jokingly: I'm doubtful that the "not a priority" take is really
fair - there's a lot of priorities, and they compete for scant
resources. Which means people have to argue convincingly if they want to
add to that burden - just actually asking the question whether it's a
good use of resources doesn't mean it's not. Just that it's not yet
clear.




    OK, let me try to do that. I believe data integration is a 
priority. World is no longer an isolated database where many apps talk 
to it. Nowdays heterogeneous architectures are almost the norm. CDC is 
often the best solution for many of the data integration tasks like data 
warehousing, data consolidation, stream processing, etc. From this 
perspective, it would be key to have a general tool or good starting 
point for CDC or even higher level functionality tools. Think of 
PostgreSQL's Golden Gate.


    Now, developing any software like this faces two significant 
challenges, which turn into deterrents for developing such software:


- If you want to develop your own output plugin, then your market is 
reduced as you have to exclude all the managed solutions (until, and 
only if, you would convince them to include your plugin... highly 
unlikely, very difficult). And probably another % of enterprise 
environments which will hesitate to link your own plugin to the running 
production database. Last but not least, you need to compile and test 
(and testing this is far from easy) on multiple OSes/architectures.


- If you stick to in-core plugins, then you need to support at least 
three different output formats if you want to support 9.4+: 
test_decoding (and pray it works!), pgoutput, and the "new" in-core 
plugin that was proposed at the beginning of this thread, if that would 
see the light.



    Both are strong enough arguments to make building this kind 
software far less interesting. Actually, I don't know of any software 
like this, and this may already be a consequence of what I'm saying. And 
I truly believe PostgreSQL should offer this, as part of its ecosystem 
(not necessarily in core).


    On the other hand, in-core may help encourage building this 
solutions. If there would be an in-core, uniform, flexible, output 
plugin, and ideally backported to 9.4 (I know, I know), included in 
very PostgreSQL... it would really open the doors to many integration 
applications. Output plugins are great for your own use, for controlled 
environments, for specific set of users. But only in-core plugins help 
for general-purpose solutions.



    Álvaro

--

Alvaro Hernandez


---
OnGres



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-25 Thread Alvaro Hernandez



On 25/09/17 22:08, Jignesh Shah wrote:



On Mon, Sep 25, 2017 at 11:37 AM, Joshua D. Drake 
mailto:j...@commandprompt.com>> wrote:


On 09/25/2017 11:31 AM, Alvaro Hernandez wrote:



Whether or not they are included in a managed environment
is generally based on two things:

1. Safety (why RDS doesn't allow certain C extensions)
2. Community/Popularity (Exactly why RDS has PostGIS)
    A. Demand with a prerequisite of #1


 This is very clear. Now tell me: how many output plugins
do you see included in RDS. And in GCP's PostgreSQL? Azure
Postgres? Heroku?


From RDS:

Logical Replication for PostgreSQL on Amazon RDS

Beginning with PostgreSQL version 9.4, PostgreSQL supports the
streaming of WAL changes using logical replication slots. Amazon
RDS supports logical replication for a PostgreSQL DB instance
version 9.4.9 and higher and 9.5.4 and higher. Using logical
replication, you can set up logical replication slots on your
instance and stream database changes through these slots to a
client like pg_recvlogical. Logical slots are created at the
database level and support replication connections to a single
database.

PostgreSQL logical replication on Amazon RDS is enabled by a new
parameter, a new replication connection type, and a new security
role. The client for the replication can be any client that is
capable of establishing a replication connection to a database on
a PostgreSQL DB instance.

The most common clients for PostgreSQL logical replication are AWS
Database Migration Service or a custom-managed host on an AWS EC2
instance. The logical replication slot knows nothing about the
receiver of the stream; there is no requirement that the target be
a replica database. Note that if you set up a logical replication
slot and do not read from the slot, data can be written to your DB
instance's storage and you can quickly fill up the storage on your
instance.

"""

I don't see why others wouldn't be available either. In fact, I am
not sure why you couldn't use the JSON ones now. (Although I have
not tested it).

JD





Also to add, Amazon RDS for PostgreSQL does supports non-core 
plugins.  Wal2json output plugin for logical decoding is supported for 
versions 9.6.3+ and 9.5.7+  (link 
<http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html#PostgreSQL.Concepts.General.version963>) 
.




    I think that's awesome. Now... back to my original question: what 
is the *list* of output plugins supported by managed PostgreSQL 
solutions? So far it looks like wal2json for 9.5-9.6 on RDS, and nothing 
else (it may just be not complete, but in the best case this list won't 
be unfortunately long...).



    Álvaro


--

Alvaro Hernandez


---
OnGres



Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-25 Thread Alvaro Hernandez



On 25/09/17 22:13, Magnus Hagander wrote:
On Mon, Sep 25, 2017 at 8:20 PM, Alvaro Hernandez <mailto:a...@ongres.com>> wrote:




On 25/09/17 20:18, Andres Freund wrote:

On 2017-09-24 13:36:56 +0300, Alvaro Hernandez wrote:

 However, if DMS uses it for what I'd call production
use, I assume it is
actually production quality. I bet they do enough testing,
and don't ship
software to potentially millions of customers if it
doesn't work well. So...
first, I'd consider this a a sign of robustness.

You've been in software for how long? ... ;)  There's quite mixed
experiences with DMS.


    Actually long enough to understand that if someone "big" calls
it production quality, we should not be pickier and assume it is
--whether it is or not. People will accept it as such, and that's
good enough.


Historically the fact that we have been pickier than many of the 
"someone big":s is exactly how we ended up with the codebase and 
relative stability we have today.


Just because someone is big doesn't mean they know what's right. In 
fact, more often than not the opposite turns out to be true.





    Note that I'm not here supporting test_decoding. I'm just saying is 
all what is available in-core for 9.4-9.6, and it seems someone with 
potentially a lot of users tested it and is using it in its own 
solution. Ask me if I would like an in-core, well tested, performant, 
with an easy to parse format, and efficient, for 9.4-9.6? My answer 
would be an immediate 'yes'. But since this is not going to happen, 
test_decoding is good that is good enough, lucky us, because otherwise 
there would not be a good solution on this front.


    Álvaro

--

Alvaro Hernandez


---
OnGres



Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-25 Thread Alvaro Hernandez



On 25/09/17 20:31, Joshua D. Drake wrote:

On 09/25/2017 10:19 AM, Petr Jelinek wrote:

On 25/09/17 18:48, Alvaro Hernandez wrote:





 In my opinion, logical decoding plugins that don't come with core
are close to worthless (don't get me wrong):



I respectfully disagree.


As do I.


    But Petr, without saying why, it does not help much to the 
discussion ;)







- They very unlikely will be installed in managed environments (an area
growing significantly).


Whether or not they are included in a managed environment is generally 
based on two things:


1. Safety (why RDS doesn't allow certain C extensions)
2. Community/Popularity (Exactly why RDS has PostGIS)
    A. Demand with a prerequisite of #1


    This is very clear. Now tell me: how many output plugins do you see 
included in RDS. And in GCP's PostgreSQL? Azure Postgres? Heroku?


    I'm looking at this from the practical perspective: if you would 
want to write a middleware for PostgreSQL that would rely on logical 
decoding, you definitely want to run on this platforms, or you are out 
of the game. If we want PostgreSQL to integrate more easily in nowadays 
very heterogeneous environments, this is key. And relying on 
non-included-or-acceptable-in-many environments output plugins is not, 
IMHO, a viable nor sensible option. I'd rather stick to test_decoding or 
pgoutput, no question.





- As anything that is not in core, raises concerns by users.


I find this a rather failing argument in today's market. If they are 
willing to migrate to Postgres, they are more than likely willing to 
use other open source software. Especially when combined with an 
expert telling them to.




- Distribution and testing are non-trivial: many OS/archs combinations.



Yes, it is. Why would we want to increase that burden to this community?



    That's a different story, and one I cannot argue against. If 
easying postgresql integration with other tools is not something of a 
priority or something the core group cannot add to all the stuff on 
their shoulders, all my due respect. PostgreSQL users will do without, 
someway, somehow. But IMHO this should be a really high priority, and 
saying that this would turn PostgreSQL into an Oracle code base is going 
too far ;)



    Álvaro

--

Alvaro Hernandez


---
OnGres



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-25 Thread Alvaro Hernandez



On 25/09/17 20:18, Andres Freund wrote:

On 2017-09-24 13:36:56 +0300, Alvaro Hernandez wrote:

     However, if DMS uses it for what I'd call production use, I assume it is
actually production quality. I bet they do enough testing, and don't ship
software to potentially millions of customers if it doesn't work well. So...
first, I'd consider this a a sign of robustness.

You've been in software for how long? ... ;)  There's quite mixed
experiences with DMS.


    Actually long enough to understand that if someone "big" calls it 
production quality, we should not be pickier and assume it is --whether 
it is or not. People will accept it as such, and that's good enough.


;)




FWIW, I don't think there's a huge problem w/ using test_decoding - the
output isn't pretty but it's parseable. It's too verbose due to
repeating column & type names (which also slows down), but...


    Everything is parseable. I don't have a big problem with that. 
Stability is another issue: as long as it supports high volume 
operations and doesn't break, it's acceptable enough.



    Álvaro


--

Alvaro Hernandez


---
OnGres



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-25 Thread Alvaro Hernandez



On 25/09/17 19:56, Andrew Dunstan wrote:


On 09/25/2017 12:48 PM, Alvaro Hernandez wrote:


On 25/09/17 19:39, Petr Jelinek wrote:

Well, test_decoding is not meant for production use anyway, no need for
middleware to support it. The pgoutput is primarily used for internal
replication purposes, which is why we need something with more
interoperability in mind in the first place. The new plugin should still
support publications etc though IMHO.


  However, having said that, and while json is a great output format
for interoperability, if there's a discussion on which plugin to
include
next, I'd also favor one that has some more compact representation
format (or that supports several formats, not only json).


JSON is indeed great for interoperability, if you want more compact
format, use either pgoutput or write something of your own or do
conversion to something else in your consumer. I don't think postgres
needs to provide 100 different formats out of the box when there is an
API. The JSON output does not have to be extremely chatty either btw.


     In my opinion, logical decoding plugins that don't come with core
are close to worthless (don't get me wrong):

- They very unlikely will be installed in managed environments (an
area growing significantly).
- As anything that is not in core, raises concerns by users.
- Distribution and testing are non-trivial: many OS/archs combinations.

     Given the above, I believe having a general-purpose output plugin
in-core is critical to the use of logical decoding. As for 9.4-9.6
there is test_decoding, and given that AWS uses it for production,
that's kind of fine. For 10 there is at least pgoutput, which could be
used (even though it was meant for replication). But if a new plugin
is to be developed for 11+, one really general purpose one, I'd say
json is not a good choice if it is the only output it would support.
json is too verbose, and replication, if anything, needs performance
(it is both network heavy and serialization/deserialization is quite
expensive). Why not, if one and only one plugin would be developed for
11+, general purpose, do something that is, indeed, more general,
i.e., that supports high-performance scenarios too?


   


A general purpose lower bandwidth plugin might one supporting Protocol
Buffers. The downside is that unlike json it's not self-contained, you
need the message definitions to interpret the stream, AIUI.


    Sure. But that's just a matter of documenting them, or even better, 
providing the .proto files, which are language-independent.


    There are also many other efficient serialization formats to 
explore, some self-contained, some not.



    Álvaro

--

Alvaro Hernandez


---
OnGres



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-25 Thread Alvaro Hernandez



On 25/09/17 19:39, Petr Jelinek wrote:


Well, test_decoding is not meant for production use anyway, no need for
middleware to support it. The pgoutput is primarily used for internal
replication purposes, which is why we need something with more
interoperability in mind in the first place. The new plugin should still
support publications etc though IMHO.


     However, having said that, and while json is a great output format
for interoperability, if there's a discussion on which plugin to include
next, I'd also favor one that has some more compact representation
format (or that supports several formats, not only json).


JSON is indeed great for interoperability, if you want more compact
format, use either pgoutput or write something of your own or do
conversion to something else in your consumer. I don't think postgres
needs to provide 100 different formats out of the box when there is an
API. The JSON output does not have to be extremely chatty either btw.



    In my opinion, logical decoding plugins that don't come with core 
are close to worthless (don't get me wrong):


- They very unlikely will be installed in managed environments (an area 
growing significantly).

- As anything that is not in core, raises concerns by users.
- Distribution and testing are non-trivial: many OS/archs combinations.

    Given the above, I believe having a general-purpose output plugin 
in-core is critical to the use of logical decoding. As for 9.4-9.6 there 
is test_decoding, and given that AWS uses it for production, that's kind 
of fine. For 10 there is at least pgoutput, which could be used (even 
though it was meant for replication). But if a new plugin is to be 
developed for 11+, one really general purpose one, I'd say json is not a 
good choice if it is the only output it would support. json is too 
verbose, and replication, if anything, needs performance (it is both 
network heavy and serialization/deserialization is quite expensive). Why 
not, if one and only one plugin would be developed for 11+, general 
purpose, do something that is, indeed, more general, i.e., that supports 
high-performance scenarios too?



    Álvaro

--

Alvaro Hernandez


---
OnGres



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-24 Thread Alvaro Hernandez



On 24/09/17 02:41, Euler Taveira wrote:

2017-09-23 14:01 GMT-03:00 Alvaro Hernandez :

 However, AFAIK, AWS's DMS uses it for production purposes (see
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html).


It seems a bad idea. AFAICS test_decoding was not designed to be a
ready-for-production plugin. It is just a proof of concept for logical
decoding.


    Yes, this is what I heard and read.

    However, if DMS uses it for what I'd call production use, I assume 
it is actually production quality. I bet they do enough testing, and 
don't ship software to potentially millions of customers if it doesn't 
work well. So... first, I'd consider this a a sign of robustness. 
Second. my hats off for the plugin code ;)



 I would be happy to see another logical decoding plugin into core
starting on 11. However, this also poses a bit of a challenge for middleware
implementors: you need to support one for 9.4-9.5 (test_decoding), another
for 10 (pgoutput) and maybe another for 11 onwards. The idea of asking users
to install a binary plugin is very unsexy, so these are the options
available.


wal2json works for 9.4+ (besides the WAL messages I committed a month
ago). Since this boat was already shipped we can arrange some packages
for 9.4-10 (an external project) and ask vendors to support the
backward-compatible plugin. The middleware implementor will have to
support this new plugin format. Being JSON a widespread format, it is
easier to refactor the code to parse JSON.


    I agree its far better to parse JSON than the test_decoding output. 
But asking any potential user to install a dynamic library, from a third 
party website, which will need to be compiled for many potential 
OSes/Archs, or even impossible if running on a managed environment... is 
not a great experience. Unless PostgreSQL would backport a plugin and 
ship it in newer releases, if test_decoding is good enough, I'd rather 
stick to it.





 However, having said that, and while json is a great output format for
interoperability, if there's a discussion on which plugin to include next,
I'd also favor one that has some more compact representation format (or that
supports several formats, not only json).


We could certainly extend pgoutput to support more than one format
(like pglogical did AFAIR), however, we wouldn't reuse code (different
formats) and will have a fat plugin (I don't foresee a plugin using
different formats in the same connection. It is difficult to
coordinate a change like that having only one-way communication).



    I think pgoutput is what it is. Maybe instead than growing it, my 
+1 would be to add a new plugin that rather than being json only, would 
also support other formats, like an efficient binary serialization.



    Álvaro


--

Alvaro Hernandez


---
OnGres



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-23 Thread Alvaro Hernandez



On 23/09/17 18:42, Euler Taveira wrote:

2017-09-22 19:28 GMT-03:00 Gregory Brail :

We have been working on a project that makes extensive use of logical
replication for use inside Apigee (which is a very small part of Google):

https://github.com/apigee-labs/transicator

In order to do this, we had to write our own logical replication plugin
because the supplied "test_decoding" plugin from the "contrib" directory was
hard for us to work with. Primarily:


test_decoding is a proof of concept to illustrate the logical decoding
potential. It is not intended for production.


    However, AFAIK, AWS's DMS uses it for production purposes (see 
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html).



I developed wal2json [1]
for general use. It outputs changes in JSON. It was one of the first
logical decoding plugins.


1) It doesn't include all the fields that we need for Transicator (most
importantly we need the LSN and the 64-bit transaction ID).


wal2json includes both.


2) It outputs a text format that is hard to parse.


There are a lot of JSON parsers.


I imagine that other users of logical decoding are facing similar problems.

Would the community support the development of another plugin that is
distributed as part of "contrib" that addresses these issues? I'd be happy
to submit a patch, or GitHub repo, or whatever works best as an example.
(Also, although Transicator uses protobuf, I'm happy to have it output a
simple binary format as well.)


There was a prior discussion and it was suggestted that we have a
ready-for-production plugin in core (besides pgoutput). It was
suggested [1] that I submit wal2json for 11. I'm in process to clean
up the code and hope to submit it to CF2.


    I would be happy to see another logical decoding plugin into core 
starting on 11. However, this also poses a bit of a challenge for 
middleware implementors: you need to support one for 9.4-9.5 
(test_decoding), another for 10 (pgoutput) and maybe another for 11 
onwards. The idea of asking users to install a binary plugin is very 
unsexy, so these are the options available.


    However, having said that, and while json is a great output format 
for interoperability, if there's a discussion on which plugin to include 
next, I'd also favor one that has some more compact representation 
format (or that supports several formats, not only json).



    Regards,

    Álvaro

--

Alvaro Hernandez


---
OnGres



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-22 Thread Alvaro Hernandez



On 23/09/17 00:28, Gregory Brail wrote:
We have been working on a project that makes extensive use of logical 
replication for use inside Apigee (which is a very small part of Google):


https://github.com/apigee-labs/transicator

In order to do this, we had to write our own logical replication 
plugin because the supplied "test_decoding" plugin from the "contrib" 
directory was hard for us to work with. Primarily:


1) It doesn't include all the fields that we need for Transicator 
(most importantly we need the LSN and the 64-bit transaction ID).

2) It outputs a text format that is hard to parse.

I imagine that other users of logical decoding are facing similar 
problems.


Would the community support the development of another plugin that is 
distributed as part of "contrib" that addresses these issues? I'd be 
happy to submit a patch, or GitHub repo, or whatever works best as an 
example. (Also, although Transicator uses protobuf, I'm happy to have 
it output a simple binary format as well.)


As a side note, doing this would also help making logical decoding a 
useful feature for customers of Amazon and Google's built-in Postgres 
hosting options. In those environments, there's no way to add a custom 
plugin to Postgres, so anything not built in the product tends to be 
harder for someone to consume.


If anyone is interested in looking more:

The plugin code is here:
https://github.com/apigee-labs/transicator/tree/master/pgoutput

and produces output defined by the "ChangePb" structure defined here:
https://github.com/apigee-labs/transicator/blob/master/common/transicator.proto


    How about using pgoutput, which is included by default in 
PostgreSQL 10, as the basis for logical replication?



    Cheers,

    Álvaro


--

Alvaro Hernandez


---
OnGres



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SCRAM in the PG 10 release notes

2017-09-14 Thread Alvaro Hernandez



On 14/09/17 18:06, Dave Cramer wrote:


On 14 September 2017 at 02:21, Alvaro Hernandez <mailto:a...@ongres.com>> wrote:




On 14/09/17 08:57, Heikki Linnakangas wrote:

On 09/12/2017 04:09 AM, Noah Misch wrote:

On Wed, May 10, 2017 at 10:50:51PM -0400, Bruce Momjian wrote:

On Mon, May  1, 2017 at 08:12:51AM -0400, Robert Haas
wrote:

On Tue, Apr 25, 2017 at 10:16 PM, Bruce Momjian
mailto:br...@momjian.us>> wrote:

Well, we could add "MD5 users are encouraged
to switch to
SCRAM-SHA-256".  Now whether we want to list
this as something on the
SCRAM-SHA-256 description, or mention it as an
incompatibility, or
under Migration.  I am not clear that MD5 is
in such terrible shape that
this is warranted.


I think it's warranted.  The continuing use of MD5
has been a headache
for some EnterpriseDB customers who have
compliance requirements which
they must meet.  It's not that they themselves
necessarily know or
care whether MD5 is secure, although in some cases
they do; it's that
if they use it, they will be breaking laws or
regulations to which
their business or agency is subject.  I imagine
customers of other
PostgreSQL companies have similar issues.  But
leaving that aside, the
advantage of SCRAM isn't merely that it uses a
better algorithm to
hash the password.  It has other advantages also,
like not being
vulnerable to replay attacks.  If you're doing
password
authentication, you should really be using SCRAM,
and encouraging
people to move to SCRAM after upgrading is a good
idea.

That having been said, SCRAM is a wire protocol
break.  You will not
be able to upgrade to SCRAM unless and until the
drivers you use to
connect to the database add support for it. The
only such driver
that's part of libpq; other drivers that have
reimplemented the
PostgreSQL wire protocol will have to be updated
with SCRAM support
before it will be possible to use SCRAM with those
drivers. I think
this should be mentioned in the release notes,
too.  I also think it
would be great if somebody would put together a
wiki page listing all
the popular drivers and (1) whether they use libpq
or reimplement the
wire protocol, and (2) if the latter, the status
of any efforts to
implement SCRAM, and (3) if those efforts have
been completed, the
version from which they support SCRAM.  Then, I
think we should reach
out to all of the maintainers of those driver
authors who aren't
moving to support SCRAM and encourage them to do so.


I have added this as an open item because we will have
to wait to see
where we are with driver support as the release gets
closer.


With the release near, I'm promoting this to the regular
open issues section.


Thanks.

I updated the list of drivers on the wiki
(https://wiki.postgresql.org/wiki/List_of_drivers
<https://wiki.postgresql.org/wiki/List_of_drivers>), adding a
column for whether the driver supports SCRAM authentication.
Currently, the only non-libpq driver that has implemented
SCRAM is the JDBC driver. I submitted a patch for the Go
driver, but it hasn't been committed yet.


On the JDBC driver, strictly speaking, code has not been
released yet. It is scheduled for v 42.2.0, and maybe the wiki
should also mention from what version of the driver it is
supported (I guess for all cases, unless their versioning would be
synced with PostgreSQL's).


We won't by syncing our version numbers with Postgres


Of course, I wanted to mean wi

Re: [HACKERS] SCRAM in the PG 10 release notes

2017-09-14 Thread Alvaro Hernandez



On 14/09/17 08:57, Heikki Linnakangas wrote:

On 09/12/2017 04:09 AM, Noah Misch wrote:

On Wed, May 10, 2017 at 10:50:51PM -0400, Bruce Momjian wrote:

On Mon, May  1, 2017 at 08:12:51AM -0400, Robert Haas wrote:
On Tue, Apr 25, 2017 at 10:16 PM, Bruce Momjian  
wrote:

Well, we could add "MD5 users are encouraged to switch to
SCRAM-SHA-256".  Now whether we want to list this as something on the
SCRAM-SHA-256 description, or mention it as an incompatibility, or
under Migration.  I am not clear that MD5 is in such terrible 
shape that

this is warranted.


I think it's warranted.  The continuing use of MD5 has been a headache
for some EnterpriseDB customers who have compliance requirements which
they must meet.  It's not that they themselves necessarily know or
care whether MD5 is secure, although in some cases they do; it's that
if they use it, they will be breaking laws or regulations to which
their business or agency is subject.  I imagine customers of other
PostgreSQL companies have similar issues.  But leaving that aside, the
advantage of SCRAM isn't merely that it uses a better algorithm to
hash the password.  It has other advantages also, like not being
vulnerable to replay attacks.  If you're doing password
authentication, you should really be using SCRAM, and encouraging
people to move to SCRAM after upgrading is a good idea.

That having been said, SCRAM is a wire protocol break.  You will not
be able to upgrade to SCRAM unless and until the drivers you use to
connect to the database add support for it.  The only such driver
that's part of libpq; other drivers that have reimplemented the
PostgreSQL wire protocol will have to be updated with SCRAM support
before it will be possible to use SCRAM with those drivers. I think
this should be mentioned in the release notes, too.  I also think it
would be great if somebody would put together a wiki page listing all
the popular drivers and (1) whether they use libpq or reimplement the
wire protocol, and (2) if the latter, the status of any efforts to
implement SCRAM, and (3) if those efforts have been completed, the
version from which they support SCRAM.  Then, I think we should reach
out to all of the maintainers of those driver authors who aren't
moving to support SCRAM and encourage them to do so.


I have added this as an open item because we will have to wait to see
where we are with driver support as the release gets closer.


With the release near, I'm promoting this to the regular open issues 
section.


Thanks.

I updated the list of drivers on the wiki 
(https://wiki.postgresql.org/wiki/List_of_drivers), adding a column 
for whether the driver supports SCRAM authentication. Currently, the 
only non-libpq driver that has implemented SCRAM is the JDBC driver. I 
submitted a patch for the Go driver, but it hasn't been committed yet.


On the JDBC driver, strictly speaking, code has not been released 
yet. It is scheduled for v 42.2.0, and maybe the wiki should also 
mention from what version of the driver it is supported (I guess for all 
cases, unless their versioning would be synced with PostgreSQL's).



Álvaro


--

Alvaro Hernandez


---
OnGres



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers