Re: [HACKERS] PG-MQ?
Steve Atkins writes:
Is there any existing work out there on this? Or should I maybe be looking at prototyping something?

The skype tools have some sort of decent-looking publish/subscribe thing, PgQ, then they layer their replication on top of. It's multi consumer and producer, with delivered at least once semantics. Looks nice.

I had not really noticed that - I need to take a look at their connection pooler too, so I guess that puts more skype items on my ToDo list ;-). Thanks for pointing it out...

-- let name=cbbrowne and tld=linuxdatabases.info in String.concat @ [name;tld];;
http://cbbrowne.com/info/advocacy.html
Signs of a Klingon Programmer #1: Our users will know fear and cower before our software. Ship it! Ship it and let them flee like the dogs they are!
---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] PG-MQ?
On Wed, June 20, 2007 04:45, Chris Browne wrote:
I'm seeing some applications where it appears that there would be value in introducing asynchronous messaging, ala message queueing. http://en.wikipedia.org/wiki/Message_queue

The granddaddy of message queuing systems is IBM's MQ-Series, and I don't see particular value in replicating its functionality.

I'm quite interested in this. Maybe I'm thinking of something too complex, but I do think there are some "oh, it'll need to do that too" pitfalls that are best considered up front.

The big thing about MQ is that it participates as a resource manager in two-phase commits (and optionally a transaction manager as well). That means that you get atomic processing steps: application takes message off a queue, processes it, commits its changes to the database, replies to message. The queue manager then does a second-phase commit for all of those steps, and that's when the reply really goes out. If the application fails, none of this will have happened so you get ACID over the complete cycle. That's something we should have free software for.

Perhaps the time is right for something new. A lot of the complexity inside MQ comes from data representation issues like encodings and fixed-length strings, as I recall, and things have changed since MQ was designed.

I agree it could be useful (and probably not hard either) to have a transactional messaging system inside the database. It saves you from having to do two-phase commits. But it does tie everything to postgres to some extent, and you lose the interesting features (atomicity and assured, single delivery) as soon as anything in the chain does anything persistent that does not participate in the postgres transaction.

Perhaps what we really need is more mature components, with a unified control layer on top. That's how a lot of successful free software grows. See below.
On the other side, the big names these days are: a) The Java Messaging Service, which seems to implement *way* more options than I'm even vaguely interested in having (notably, lots that involve data stores or lack thereof that I do not care to use);

Far as I know, JMS is an API, not a product. You'd still slot some messaging middleware underneath, such as MQ. That is why MQSeries was renamed: it fits into the WebSphere suite as the implementing engine underneath the JMS API. From what I understand MQ is one of the best-of-breed products that JMS was designed around. (Sun's term, bit hypey for my taste).

In one way, Java is easy: the last thing you want to get into is yet another marshaling standard. There are plenty of standards to choose from already, each married to one particular communications mechanism: RPC, EDI, CORBA, D-Bus, XMLRPC, what have you. Even postgres has its own. I'd say the most successful mechanism is TCP itself, because it isolates itself from content representation so effectively.

It's hard not to get into marshaling: someone has to do it, and it's often a drag to do it in the application, but the way things stand now *any* choice limits the usefulness of what you're building. That's something I'd like to see change.

Personally I'd love to see marshaling or low-level data representation isolated into a mature component that speaks multiple programming languages on the one hand and multiple data representation formats on the other. Something the implementers of some of these messaging standards would want to use to compose their messages, isolating their format definitions into plugins. Something that would make application writers stop composing messages in finicky ad-hoc code that fails with unexpected locales or has trouble with different line breaks.

If we had a component like that, combining it with existing transactional variants of TCP and [S]HTTP might even be enough to build a usable messaging system.
I haven't looked at them enough to know. Of course we'd need implementations of those protocols; see http://ttcplinux.sourceforge.net/ and http://www.csn.ul.ie/~heathclf/fyp/ for example.

Another box of important tools, and I have no idea where we stand with this one, is transaction management. We have 2-phase commit in postgres now. But do we have interoperability with existing transaction managers? Is there a decent free, portable, everything-agnostic transaction manager? With those, the sphere of reliability of a database-driven messaging package could extend much further. A free XA-capable filesystem would be great too, but I guess I'm daydreaming.

There tend to be varying semantics out there:
- Some queues may represent subscriptions where a whole bunch of listeners want to get all the messages;

The two simplest models that offer something more than TCP/UDP are 1:n reliable publish-subscribe without persistence, and 1:1 request-reply with persistent storage. D-Bus does them both; IIRC MQ does 1:1 and has add-ons on top for publish-subscribe. I could imagine
Re: [HACKERS] PG-MQ?
Hi Chris,

Chris Browne wrote:
I'm seeing some applications where it appears that there would be value in introducing asynchronous messaging, ala message queueing. http://en.wikipedia.org/wiki/Message_queue

ISTM that 'message queue' is a way too general term. There are hundreds of different queues at different levels on a standard server. So I'm somewhat unsure about what problem you want to solve.

c) There are lesser names, like isectd http://isectd.sf.net and the (infamous?) Spread Toolkit which both implement memory-based messaging systems.

If a GCS is about what you're looking for, then you also might want to consider these: ensemble, appia or jGroups. There's a Java layer called jGCS, which supports even more, similar systems.

Another commonly used term is 'reliable multicast', which guarantees that messages are delivered to a group of recipients. These algorithms often are the basis for a GCS.

My bias would be to have something that can basically run as a thin set of stored procedures atop PostgreSQL :-). It would be trivial to extend that to support SOAP/XML-RPC, if desired.

Hm.. in Postgres-R I currently have (partial) support for ensemble and spread. Exporting that interface via stored procedures could be done, but you would probably need a manager process, as you certainly want your connections to persist across transactions (or not?). Together with that process, we already have half of what Postgres-R is: an additional process which connects to the GCS. Thus I'm questioning whether there's value in exporting the interface.

Can you think of other use cases than database replication? Why do you want to do that via the database, then, and not directly with the GCS?

It would be nice to achieve 'higher availability' by having queues where you might replicate the contents (probably using the MQ system itself ;-)) to other servers.

Uhm.. sorry, but I fail to see the big news here. Which replication solution does *not* work that way?
Regards Markus
Re: [HACKERS] PG-MQ?
On 6/20/07, Jeroen T. Vermeulen wrote:
On Wed, June 20, 2007 04:45, Chris Browne wrote:
- Sometimes you have the semantics where:
  - messages need to be delivered at least once
  - messages need to be delivered no more than once
  - messages need to be delivered exactly once

IMHO, if you're not doing exactly once, or something very close to it, you might as well stay with ad-hoc code. You can ensure single delivery by having the sender re-send when in doubt, and keeping track of duplications in the recipient.

In the case of PgQ, the at-least-once semantics come from the batch-based processing it does: in case of failure, the full batch is delivered again, so if the consumer had already managed to process some of the items, it gets them twice. As PgQ is responsible only for delivering events from the database, it has no way of guaranteeing exactly-once behaviour; that needs to be built on top of PgQ.

The simplest case is when the events are processed in the same database where the queue resides. Then you can just fetch, process, and close the batch in one transaction, and you immediately get exactly-once behaviour.

To achieve exactly-once behaviour with different databases, look at the pgq_ext module for a sample. Basically it just requires storing the batch_id/event_id on the remote db and committing there first. Later it can be checked whether the batch/event has already been processed.

It's tricky only if you want to achieve full transactionality for event processing. As I understand, JMS does not have a concept of transactions, probably also other solutions mentioned before, so to use PgQ as backend for them should be much simpler...

To Chris: you should like PgQ, it's just stored procs in database, plus it's basically just generalized Slony-I, with some optimizations, so should be familiar territory ;)

-- marko
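A minimal sketch of the cross-database idea described in the message above: the consumer records the batch id on the remote side in the same transaction as its work, so a redelivered batch can be recognised and skipped. This is only an illustration under stated assumptions. It uses Python's sqlite3 as a stand-in for the remote database, and the table names `completed_batch` and `applied_event` and the function `apply_batch` are hypothetical; none of this is the actual PgQ or pgq_ext interface.

```python
import sqlite3


def make_target_db():
    """Create an in-memory stand-in for the remote (target) database."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical bookkeeping table: one row per batch already applied.
    conn.execute("CREATE TABLE completed_batch (batch_id INTEGER PRIMARY KEY)")
    # Hypothetical table holding the effect of processing each event.
    conn.execute("CREATE TABLE applied_event (event_id INTEGER, payload TEXT)")
    conn.commit()
    return conn


def apply_batch(target, batch_id, events):
    """Apply a batch unless it was applied before.

    Returns True if the batch was processed, False if it was a duplicate.
    The event effects and the batch marker commit together, so a crash
    before commit leaves neither behind.
    """
    seen = target.execute(
        "SELECT 1 FROM completed_batch WHERE batch_id = ?", (batch_id,)
    ).fetchone()
    if seen:
        return False  # redelivered batch: skip it
    try:
        for event_id, payload in events:
            target.execute(
                "INSERT INTO applied_event VALUES (?, ?)", (event_id, payload)
            )
        target.execute("INSERT INTO completed_batch VALUES (?)", (batch_id,))
        target.commit()
    except Exception:
        target.rollback()
        raise
    return True
```

With this shape, the queue side can safely redeliver a whole batch after a failure: calling `apply_batch` a second time with the same `batch_id` does nothing. Only after the remote commit succeeds would the consumer close the batch on the queue side.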
Re: [HACKERS] PG-MQ?
Marko Kreen wrote:
As I understand, JMS does not have a concept of transactions, probably also other solutions mentioned before, so to use PgQ as backend for them should be much simpler...

JMS certainly does have the concept of transactions. Both distributed ones through XA and two-phase commit, and local involving just one JMS provider. I don't know about others, but would be surprised if they didn't.

-- Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] PG-MQ?
On 6/20/07, Heikki Linnakangas wrote:
Marko Kreen wrote:
As I understand, JMS does not have a concept of transactions, probably also other solutions mentioned before, so to use PgQ as backend for them should be much simpler...

JMS certainly does have the concept of transactions. Both distributed ones through XA and two-phase commit, and local involving just one JMS provider. I don't know about others, but would be surprised if they didn't.

Ah, sorry, my mistake then. Shouldn't trust hearsay :)

-- marko
Re: [HACKERS] PG-MQ?
On Wed, June 20, 2007 18:18, Heikki Linnakangas wrote:
Marko Kreen wrote:
As I understand, JMS does not have a concept of transactions, probably also other solutions mentioned before, so to use PgQ as backend for them should be much simpler...

JMS certainly does have the concept of transactions. Both distributed ones through XA and two-phase commit, and local involving just one JMS provider. I don't know about others, but would be surprised if they didn't.

Wait... I thought XA did two-phase commit, and then there was XA+ for *distributed* two-phase commit, which is much harder?

Jeroen
Re: [HACKERS] PG-MQ?
Jeroen T. Vermeulen wrote:
On Wed, June 20, 2007 18:18, Heikki Linnakangas wrote:
Marko Kreen wrote:
As I understand, JMS does not have a concept of transactions, probably also other solutions mentioned before, so to use PgQ as backend for them should be much simpler...

JMS certainly does have the concept of transactions. Both distributed ones through XA and two-phase commit, and local involving just one JMS provider. I don't know about others, but would be surprised if they didn't.

Wait... I thought XA did two-phase commit, and then there was XA+ for *distributed* two-phase commit, which is much harder?

Well, I meant distributed as in one transaction manager, multiple resource managers, all participating in a single atomic transaction. I don't know what XA+ adds on top of that.

To be precise, being a Java-thing, JMS actually supports two-phase commit through JTA (Java Transaction API), not XA. It's the same design and interface, just defined as Java interfaces instead of at native library level.

-- Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
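The "one transaction manager, multiple resource managers" arrangement mentioned above can be illustrated with a toy two-phase commit. This is a sketch only: the class `ResourceManager` and the function `two_phase_commit` are hypothetical names invented for the example, they do not correspond to the XA or JTA interfaces, and the sketch ignores durability, timeouts and recovery, which are the hard parts in practice.

```python
class ResourceManager:
    """Toy participant (for example a database or a queue)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates a vote in phase one
        self.state = "active"

    def prepare(self):
        # Phase one: promise to commit if asked, or vote to abort.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        # Phase two: make the prepared work permanent.
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(resources):
    """Toy transaction manager: commit everywhere or nowhere."""
    for rm in resources:
        if not rm.prepare():
            # Any "no" vote aborts the whole transaction.
            for other in resources:
                other.rollback()
            return False
    # Every participant voted yes, so the decision is commit.
    for rm in resources:
        rm.commit()
    return True
```

For example, `two_phase_commit([ResourceManager("db"), ResourceManager("queue")])` leaves both participants committed, while a single participant created with `can_commit=False` leaves all of them aborted.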
Re: [HACKERS] PG-MQ?
Do you guys need something PG specific or built into PG?

ActiveMQ is very nice, speaks multiple languages, protocols and supports a ton of features. Could you simply use that?

http://activemq.apache.org/

Rob
Re: [HACKERS] PG-MQ?
On Wed, June 20, 2007 19:42, Rob Butler wrote:
Do you guys need something PG specific or built into PG? ActiveMQ is very nice, speaks multiple languages, protocols and supports a ton of features. Could you simply use that? http://activemq.apache.org/

Looks very nice indeed!

Jeroen
Re: [HACKERS] PG-MQ?
On 6/20/07, Rob Butler wrote:
Do you guys need something PG specific or built into PG?

Yes, we need it usable from inside the DB, thus the PgQ. That means the events are also transactional with other things happening in the DB.

ActiveMQ is very nice, speaks multiple languages, protocols and supports a ton of features. Could you simply use that?

I guess that if you need a standalone message broker, ActiveMQ may be a good choice. At least, any solution that avoids the database when passing messages should outperform solutions that pipe stuff through a (general-purpose) database.

OTOH, if you _do_ need to transport the events via the database, it should be very hard to outperform PgQ. :) It uses the user-level xid/snapshot trick introduced by rserv/erserver/slony, which is not possible with databases other than PostgreSQL.

-- marko
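The point that events are "transactional with other things happening in the DB" can be sketched as follows: the event is written in the same transaction as the data change, so a rollback discards both. This is an assumption-laden illustration, not PgQ itself. It uses Python's sqlite3 instead of PostgreSQL, and the tables `account` and `event_queue` and the function `withdraw` are hypothetical.

```python
import sqlite3


def open_db():
    """In-memory database holding both the data and a hypothetical queue table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute(
        "CREATE TABLE event_queue ("
        "event_id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)"
    )
    conn.execute("INSERT INTO account VALUES (1, 100)")
    conn.commit()
    return conn


def withdraw(conn, account_id, amount):
    """Change the balance and enqueue an event in one transaction."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )
        conn.execute(
            "INSERT INTO event_queue (payload) VALUES (?)",
            (f"withdraw:{account_id}:{amount}",),
        )
        balance = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (account_id,)
        ).fetchone()[0]
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.commit()
    except Exception:
        # The balance change and the queued event disappear together.
        conn.rollback()
        raise
```

A consumer reading `event_queue` therefore never sees an event for a withdrawal that did not actually happen, which is the property a standalone broker cannot give without two-phase commit.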
Re: [HACKERS] PG-MQ?
Marko Kreen writes:
To Chris: you should like PgQ, it's just stored procs in database, plus it's basically just generalized Slony-I, with some optimizations, so should be familiar territory ;)

Looks interesting...

Random ideas - insert_event in C (way to get rid of plpython)

Yeah, I'm with that... Ever tried building [foo] on AIX, where foo in ('perl', 'python', ...)??? :-( It seems rather excessive to add in a whole stored procedure language simply for one function...

-- (format nil [EMAIL PROTECTED] cbbrowne linuxdatabases.info)
http://www3.sympatico.ca/cbbrowne/sgml.html
I always try to do things in chronological order.
Re: [HACKERS] PG-MQ?
On 6/20/07, Chris Browne wrote:
Marko Kreen writes:
To Chris: you should like PgQ, it's just stored procs in database, plus it's basically just generalized Slony-I, with some optimizations, so should be familiar territory ;)

Looks interesting...

Thanks :)

Random ideas - insert_event in C (way to get rid of plpython)

Yeah, I'm with that... Ever tried building [foo] on AIX, where foo in ('perl', 'python', ...)??? :-( It seems rather excessive to add in a whole stored procedure language simply for one function...

Well, it's standard in our installations as we use it for other stuff too. It's much easier to prototype in PL/Python than in C... As it has not been a performance problem I have not bothered to rewrite it. But now that the interface has been stable for some time, it could be done.

-- marko
Re: [HACKERS] PG-MQ?
On Jun 19, 2007, at 2:45 PM, Chris Browne wrote:
I'm seeing some applications where it appears that there would be value in introducing asynchronous messaging, ala message queueing. http://en.wikipedia.org/wiki/Message_queue

Me too.

My bias would be to have something that can basically run as a thin set of stored procedures atop PostgreSQL :-). It would be trivial to extend that to support SOAP/XML-RPC, if desired.

It would be nice to achieve 'higher availability' by having queues where you might replicate the contents (probably using the MQ system itself ;-)) to other servers.

There tend to be varying semantics out there:
- Some queues may represent subscriptions where a whole bunch of listeners want to get all the messages;
- Sometimes you have the semantics where:
  - messages need to be delivered at least once
  - messages need to be delivered no more than once
  - messages need to be delivered exactly once

Is there any existing work out there on this? Or should I maybe be looking at prototyping something?

The skype tools have some sort of decent-looking publish/subscribe thing, PgQ, then they layer their replication on top of. It's multi consumer and producer, with delivered at least once semantics. Looks nice.

Cheers, Steve