Re: [zeromq-dev] pyzmq issue: _pickle.UnpicklingError: pickle data was truncated
Hi,

On 10/12/23 16:05, CZ wrote: I am running a Python project based on a client/server framework, something like rpcclient/rpcserver. pyzmq is a very important part of this project. Now we are running into trouble because of some errors in pyzmq. Please refer to the following error message:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Users\DELL\AppData\Local\Programs\Python\Python37\lib\threading.py", line 926, in _bootstrap_inner
    self.run()
  File "C:\Users\DELL\AppData\Local\Programs\Python\Python37\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "D:\DellProjs\oly\Oly\olympos\q_core\rpc\__init__.py", line 372, in run
    topic, data = self.__socket_sub.recv_pyobj(flags=NOBLOCK)
  File "C:\Dev\Py37venv\lib\site-packages\zmq\sugar\socket.py", line 976, in recv_pyobj
    return self._deserialize(msg, pickle.loads)
  File "C:\Dev\Py37venv\lib\site-packages\zmq\sugar\socket.py", line 834, in _deserialize
    return load(recvd)
_pickle.UnpicklingError: pickle data was truncated

I am using Python 3.7.9 (actually I tried both Python 3.7.9 and Python 3.11, with the same error message) on Windows 11, with pyzmq 25.1.1. The message I am sending is really small; there is no way it would overflow the buffer. I am using PUB on the server side and SUB on the client side (I always see XPUB and XSUB in zmq.constants; could they be another choice?). I see that flags can take three different values (correct me if I am wrong): NOWAIT, NOBLOCK, SNDMORE. I am not sure if the choice of flags value could be the reason. Any input would be highly appreciated. Thanks! CZ.

You should print the string you are sending and the string you received. You also need to make sure the client and server use the same version of the project.

That said: you should never, ever use pickle with network data. That is an instant remote code execution exploit, since the data you unpickle can contain arbitrary code that you will simply execute.
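As a minimal sketch of the advice above (print what you send and receive, and avoid pickle on network data), here is a PUB/SUB round trip that uses JSON instead of recv_pyobj. The endpoint name, topic, and payload are invented for illustration; this is not the original project's code:

```python
import json
import time
import zmq

ctx = zmq.Context.instance()
pub = ctx.socket(zmq.PUB)
pub.bind("inproc://demo")                  # inproc: bind before connect
sub = ctx.socket(zmq.SUB)
sub.setsockopt(zmq.RCVTIMEO, 2000)         # fail loudly instead of hanging
sub.connect("inproc://demo")
sub.setsockopt(zmq.SUBSCRIBE, b"topic1")
time.sleep(0.1)                            # let the subscription settle

# JSON instead of pickle: a malicious peer can craft pickle bytes that
# run arbitrary code inside pickle.loads(); json.loads() only yields data.
pub.send_multipart([b"topic1", json.dumps({"value": 42}).encode()])
topic, payload = sub.recv_multipart()
data = json.loads(payload)
```

With explicit frames like this you can also print `payload` on both ends to see exactly which bytes went over the wire, which helps diagnose a truncated-message error.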
MfG Goswin ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org https://lists.zeromq.org/mailman/listinfo/zeromq-dev
[zeromq-dev] How to close a pipe's peer?
Hi, I'm trying to fix the FD leaks in ROUTER sockets with HANDOVER. This happens in zmq::router_t::identify (pipe_t *pipe_, bool locally_initiated_), right at the end, with a call to old_pipe->terminate (true); This terminates the obsolete old_pipe, but it does nothing to the peer of that pipe, which actually holds the underlying socket FD. The FD is then ignored and not closed until the router socket itself gets closed. My question is: how do I get the peer of old_pipe to terminate? MfG Goswin
Re: [zeromq-dev] CZMQ community red cards
On Thu, Oct 16, 2014 at 05:37:54PM +0200, Benjamin wrote: For me, misleading and hiding are quite strong words, and griwes rant did not help.

They are just words. Strong, weak? I'm not a walking thesaurus, or whatever that thing is called that lists lots of words meaning the same thing. I write mails like I talk, not like an essay for university admission that you rewrite 20 times until it is perfect. English is not my native language and I've always been bad at languages, so I have even fewer words to choose from than other people. So I didn't pick those exact words because I wanted to put any emphasis on them. And I hope you do see a difference between "the title is misleading" and "you are misleading". Maybe a better wording would have been: "I noticed that XX has code changes while the pull request says documentation changes. Did that get included there by accident?" But that's too late now.

Linus has some strong views about GitHub which are related to your points, especially how pull requests work [1]. But that's more of a GitHub issue in general. In the end you need many people actively engaged in checking code, right? So in that case checking the source code changes (on the level of commits or PRs?) https://github.com/torvalds/linux/pull/17#issuecomment-5654674

Linus is also very good (as in efficient and no-nonsense) at rejecting commits that are bad or questionable. Because of that people get angry. But also all commits are sane and uniformly documented, something that makes maintaining and fixing the code so much easier. That is something the czmq maintainers are exceedingly bad at. The sad thing is that the "we merge everything and then fix it if its behaviour is broken" approach is in direct opposition to the C4.1 rules. C4.1 has some good rules, but they only work if everybody follows them. That includes me, but that also includes him. One problem, one patch, one pull request. Not two unrelated problems in a single pull request.
The smaller and more specific you make each pull request, the easier it gets to read them, even with GitHub's interface. MfG Goswin
Re: [zeromq-dev] CurveZMQ Enhancements
On Wed, Oct 01, 2014 at 08:36:28PM +0200, Pieter Hintjens wrote: On Wed, Oct 1, 2014 at 7:21 PM, Matthew Hawn matth...@donaanacounty.org wrote:

Message Size Limits: It behooves a public server not to accept large messages until authentication happens.
Good idea. Does not affect the protocols afaics, it's an implementation concern. Can be done in the curve client/server.

Memory Usage: There are currently several large buffers that are only used during handshakes but kept for the whole session.
An implementation concern.

Zero State Initial Connections: The protocol enables the server to keep zero state until the client has authenticated. This would allow public servers to mitigate DOS attacks much more effectively. However, implementing this would require DEEP changes in ZMQ: socket, session, stream_engine, mechanism, and so forth.
Ditto, right? Also, does that make sense with TCP/IP? The connection has already passed the TCP handshake and from there on it is stateful. You have kernel and application resources allocated for the socket. Does the ZMQ/CURVE state add a significant amount there?

Invalid Message Handling: After the handshake is completed, it seems to make sense to drop invalid messages rather than killing the entire connection.
Ditto, though I think invalid messages = invalid client = kill it with fire.

Message Command: Currently, after the handshake, CurveZMQ sends messages back and forth prefixed with a \7MESSAGE. Do we really need to have this for EVERY message?
Shrug. We could reduce this to 2 bytes. Is this worth it? Arguable, given the CPU cost of handling CURVE traffic.

Certificate Exchange: It would be nice to have a protocol-level mechanism for exchanging certificates before authentication.
Yes, but it cannot be done in ZMTP. It requires (afaik) an out-of-band bootstrap over some existing secure transport, to avoid MITM attacks. So not a CurveZMQ issue. I realize this is a cheap answer.
Doesn't this fall into the same category as the idea for the client to send metadata to the server on connect? That would happen before the connection is authenticated via ZAP, too. This is not about certificate format or validation mechanisms, just a way to exchange the cert blobs and call on the application to do validation. I am currently working on a proof-of-concept and will share more with the group later.

I'm curious how you avoid a MITM substituting its own permanent keys for the ones you are trying to exchange.

I think he means to have something like ssh does. When you connect to an unknown host, it exchanges keys, prints the host's fingerprint, and asks the user to validate before accepting. If you blindly accept the fingerprint, then a MITM attack is totally possible. But if you are security-aware, then you verify the fingerprint over an alternative channel.

Crypto Library: Agreed, not a protocol concern though. Feel free to push tweetnacl into the limelight. -Pieter

MfG Goswin
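The ssh-style trust-on-first-use check described in this thread can be sketched without any ZeroMQ specifics. This is a hypothetical illustration, not part of any proposed protocol; the store layout and function names are made up:

```python
import hashlib

def fingerprint(public_key: bytes) -> str:
    """Short, human-comparable digest of a peer's permanent public key."""
    return hashlib.sha256(public_key).hexdigest()[:16]

known_hosts = {}  # endpoint -> fingerprint; a real client would persist this

def check_peer(endpoint: str, public_key: bytes) -> str:
    fp = fingerprint(public_key)
    seen = known_hosts.get(endpoint)
    if seen is None:
        # First contact: show fp to the user, who must verify it over an
        # independent channel, or a MITM can substitute its own key here.
        known_hosts[endpoint] = fp
        return "new"
    return "ok" if seen == fp else "MISMATCH"
```

The security of the scheme rests entirely on the out-of-band verification at first contact; after that, a key substitution shows up as a fingerprint mismatch.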
Re: [zeromq-dev] Friendly IDs
Hi,

you have to implement some form of greeting where a connecting client says hello to the server. The server then sees the new client and can remember its zmq identity. In my case I use the identity directly. But if you want to use names (and not use them as zmq identities), then include the name in the greeting and have the server keep a map of name to zmq identity. In pseudocode:

Client -> Queue: Hello, my name is Jerry.
Queue: recv: (from 0x4567357) Hello, my name is Jerry.
Queue: Remember: New client 0x4567357 is Jerry
Queue -> App: (from Jerry) Jerry says hello
App -> Queue: (to Jerry) Nice to meet you Jerry
Queue: Lookup: Jerry is 0x4567357
Queue -> 0x4567357: Nice to meet you Jerry

MfG Goswin

On Mon, Sep 29, 2014 at 09:02:24AM -0700, Roberto Ostinelli wrote: Hi Goswin, thank you for your reply. How can the QUEUE know which one of the sockets is the recipient that corresponds to Jerry, and how can it route to it?

On Mon, Sep 29, 2014 at 2:21 AM, Goswin von Brederlow goswin-...@web.de wrote: On Sun, Sep 28, 2014 at 07:17:39PM -0700, Roberto Ostinelli wrote: Hello 0mq'ers! I'm investigating 0MQ and up until now I'm enjoying what I'm seeing. As an academic exercise, I'm trying to understand how to build a simple server that can route a REQ to a very specific socket based on a friendly name, and receive a RES from it. For instance, let's say I have two clients, Tom and Jerry, that somehow authenticate on a server. I want to be able to have Tom send a REQ specifically to Jerry, and receive a RES, via a server (i.e. not by a direct connection). I've seen the examples on how to create a QUEUE device to route XREQ and XRES, however in these examples there are clients on one side and servers on the other, and when a client sends a request, any server on the other side can provide the response. I'd like to have a way for the client Tom to specify that it wants its response from the server Jerry. Can a kind soul point me in the right direction? Thank you in advance. r.
What I do in my code is to use the same syntax as ROUTER sockets. You send the recipient, Jerry, as the first frame of the message and have the QUEUE device use that frame to direct the rest of the message. In my case the outgoing end is a ROUTER socket, so I just pass the whole message on, including the recipient frame, and the ROUTER socket sends it the right way. MfG Goswin
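The greeting-plus-lookup scheme described above can be sketched in a few lines of pyzmq. This is a toy illustration under my own naming (the inproc endpoint, the HELLO frame, and using the friendly name directly as the zmq identity are all choices made here, not part of any standard):

```python
import zmq

ctx = zmq.Context.instance()
queue = ctx.socket(zmq.ROUTER)
queue.bind("inproc://queue")

def make_client(name: bytes) -> zmq.Socket:
    s = ctx.socket(zmq.DEALER)
    s.setsockopt(zmq.IDENTITY, name)   # here the friendly name IS the identity
    s.connect("inproc://queue")
    return s

tom, jerry = make_client(b"Tom"), make_client(b"Jerry")

# Greeting phase: each client says hello; the queue learns name -> identity.
names = {}
for s in (tom, jerry):
    s.send(b"HELLO")
for _ in range(2):
    ident, _hello = queue.recv_multipart()   # ROUTER prepends the identity
    names[ident] = ident                     # with separate names: names[name] = ident

# Tom addresses Jerry by name; the queue swaps in the looked-up identity.
tom.send_multipart([b"Jerry", b"Nice to meet you"])
src, target, body = queue.recv_multipart()
queue.send_multipart([names[target], src, body])

sender, body = jerry.recv_multipart()        # Jerry sees who it came from
```

If the names were distinct from the identities, only the `names` map changes; the routing logic stays the same.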
Re: [zeromq-dev] Friendly IDs
On Sun, Sep 28, 2014 at 07:17:39PM -0700, Roberto Ostinelli wrote: Hello 0mq'ers! I'm investigating 0MQ and up until now I'm enjoying what I'm seeing. As an academic exercise, I'm trying to understand how to build a simple server that can route a REQ to a very specific socket based on a friendly name, and receive a RES from it. For instance, let's say I have two clients, Tom and Jerry, that somehow authenticate on a server. I want to be able to have Tom send a REQ specifically to Jerry, and receive a RES, via a server (i.e. not by a direct connection). I've seen the examples on how to create a QUEUE device to route XREQ and XRES, however in these examples there are clients on one side and servers on the other, and when a client sends a request, any server on the other side can provide the response. I'd like to have a way for the client Tom to specify that it wants its response from the server Jerry. Can a kind soul point me in the right direction? Thank you in advance. r.

What I do in my code is to use the same syntax as ROUTER sockets. You send the recipient, Jerry, as the first frame of the message and have the QUEUE device use that frame to direct the rest of the message. In my case the outgoing end is a ROUTER socket, so I just pass the whole message on, including the recipient frame, and the ROUTER socket sends it the right way. MfG Goswin
Re: [zeromq-dev] ELI5: Why can't I get the IP address of the machine that sent a message?
On Sat, Sep 27, 2014 at 08:36:47AM -0700, Scott wrote: I understand and agree with all of the above re: IP address. However, some sort of "source transport specific information" (there's always some unique identifier associated with the transport) could be useful; sometimes it would be the IP address. That said, I think the logging/debugging support satisfies our requirements.

What does zmq put into the address part of the ZAP request for non-TCP/IP connections? MfG Goswin
[zeromq-dev] [g...@debian.org: Re: Bug#743508: libzmq3: upgrading from 3.2.3 to 4.0.4 breaks python-pytango]
- Forwarded message from László Böszörményi (GCS) g...@debian.org -

Date: Thu, 25 Sep 2014 09:48:03 +0200
From: László Böszörményi (GCS) g...@debian.org
To: Goswin von Brederlow goswin-...@web.de, 743...@bugs.debian.org
Subject: Re: Bug#743508: libzmq3: upgrading from 3.2.3 to 4.0.4 breaks python-pytango

Hi, On Thu, Sep 25, 2014 at 4:06 AM, Goswin von Brederlow goswin-...@web.de wrote: do we really need a libzmq.so.3 in Jessie? Upstream is preparing a new stable version now with libzmq.so.4. Given that the breakage between 3 and 4 is minimal (it is easy to port your software; most just works), do we need to maintain two versions of zeromq?

Just for the record, we are talking about a v2 (v2.2.0 to be more precise) to v4 upgrade, and not from a v3 one. Yes, only the Python bindings use v2 in the archive, but they support v4 as well (see compatibility [1]). I don't know the compatibility level of the C/C++ API between v2 and v4, but I am open to discussion on whether the former should be removed. At least no other package uses it. Do you know if upstream supports v2 (important and security fixes) or not? Regards, Laszlo/GCS [1] http://zeromq.org/bindings:python

- End forwarded message -
Re: [zeromq-dev] DEALER-ROUTER question
On Fri, Sep 26, 2014 at 09:51:50AM +0200, Pieter Hintjens wrote: On Fri, Sep 26, 2014 at 3:10 AM, Goswin von Brederlow goswin-...@web.de wrote: Pieter: Would it be possible to put all the examples of the guide into a git project and have them included in the auto-compile done for every pull request?

The examples are in a subdirectory of the Guide git project, and we could activate Travis CI on that, though with a lot of languages it's unclear how we'd proceed. For C and C++, sure. I'll get around to updating the Guide for V4 at some stage. I've been collecting a lot of the material on my blog already, so it's not a huge job, just fiddly. I'm kind of waiting to release CZMQ v3 first so that the C examples can use that. I think it's time to start burying the old ZeroMQ API. This has been a long goal of mine, to raise the high-level semantics and get that working among all languages. What is the general opinion about moving classes from CZMQ to libzmq? We're seeing a lot of projects wrapping CZMQ now. If we eventually merge the two projects, we get back to a single target for languages to bind to. This is already where JeroMQ/JZMQ got to as well (a single common high-level API).

I love that things are no longer void * but have proper distinct types. What I don't like is the double or triple indirection the CZMQ classes add. I prefer type safety over polymorphism there. If the two get merged, then why not merge the zsock class with the libzmq socket structure? Use it as is for the zsock interface and cast it to void * for the sake of the old API. A true merge, and not just layering the czmq API over the old one. This would be worth a ZeroMQ v5 version upgrade in my opinion.

At the same time we can kill the deprecated CZMQ classes, review some of the cute details like reference counting in frames, and so on... it also means we can start to rewrite parts of libzmq itself using the CZMQ style and containers. Sorry, getting ahead of myself here.
Problem: we have two separate C APIs, which is confusing and harder to package/ship. Solution: merge the CZMQ v3 classes into libzmq and deliver them as a single package. Discussion: do this after both CZMQ v3 and ZeroMQ v4.1 are released, and stamp it with a V5 label. Leave the old ZeroMQ v2 and v3 APIs in there (forever as far as I'm concerned). -Pieter

I would like to have a stable libzmq v4.1 soon, because we at Q-Leap Networks want to use curve for authentication (which requires zmq_msg_gets from 4.1) soon, and I would rather have a stable version in the distribution than a git snapshot. Similarly, Debian Jessie would benefit from a new stable libzmq. I wouldn't feel comfortable rushing a release with czmq merged into libzmq. This sounds like a fairly large change with lots of breakage until all the details are ironed out. Better to make what we have stable and release it before radically changing the lib. My 2c. MfG Goswin
Re: [zeromq-dev] ZMQ_POLLIN received on WRITE event
On Fri, Sep 26, 2014 at 09:51:40AM +0200, Dorvin wrote: W dniu 2014-09-25 13:54, Goswin von Brederlow pisze: It might help if you would post a testcase that: a) actually compiles

It actually compiles on Windows. I think I should have stated that more clearly in the first post.

With the includes for FD_SET missing and Sleep() undefined?

b) doesn't involve undefined behaviour: (ii) select() may update the timeout argument to indicate how much time was left. pselect() does not change this argument. On Linux, select() modifies timeout to reflect the amount of time not slept; most other implementations do not do this. (POSIX.1-2001 permits either behavior.) This causes problems both when Linux code which reads timeout is ported to other operating systems, and when code is ported to Linux that reuses a struct timeval for multiple select()s in a loop without reinitializing it. Consider timeout to be undefined after select() returns.

Again, it's not undefined on Windows. However, your point is worth remembering in case I need to port my code in the future. Resetting the timeval is cheap enough to do on a regular basis on every attempt to use select() on any platform. I set nfds (which is ignored on Windows) but omitted the timeval. I'll remember that.

All I see is that you use select with a 0 timeout, which makes the select basically pointless. You also wrongly select the FD for writability (which is always possible, so a NOP) and then break when a message was received. Since messages arrive asynchronously, you eventually hit the case where a message gets received at just that time. Nothing wrong there.

Of course a 0 timeout is pointless, the same as using select() on just one descriptor. This was a naive method to simulate an event without actually introducing an external library. Anyway, both of you gave me some valuable input to reconsider the differences between 0mq and BSD sockets. I still have to figure out some elegant way to deal with REQ-REP sockets, but it should be easier now.
With regards, Jarek

MfG Goswin
Re: [zeromq-dev] ZMQ_POLLIN received on WRITE event
On Fri, Sep 26, 2014 at 09:55:59AM +0200, Dorvin wrote: W dniu 2014-09-26 02:05, KIU Shueng Chuan pisze: Your code could be stripped down a lot: 1) The PUB socket is only for sending. There's no need to test it for ZMQ_POLLIN. 2) A PUB socket never blocks when sending. There's no need to test it for ZMQ_POLLOUT. 3) The SUB socket is only for receiving. There's no need to test it for ZMQ_POLLOUT. 4) The ZMQ_FDs are only to be tested for readability. They are NOT to be tested for writability (and in fact should always test positive for writability.)

You are right. My case isn't really minimal, but it allows testing different types of sockets with just minimal changes. I still don't feel right about receiving POLLIN on write, but it might just be my habit from the old days of low-level socket programming. You both made me start to think about 0mq in a different way. Thanks, Jarek

You aren't receiving POLLIN on write. You receive POLLIN on an event, whatever that may be. That's just how an eventfd works. There is no way to make an FD emit a POLLOUT event like you can with POLLIN. Reading from an FD doesn't generate the event due to buffering. Remember that a zmq socket has a fan-in/out structure and the FD has to signal all events for all low-level sockets:

      /----- socket 1
     |/----- socket 2
FD --*------ socket 3
     |\----- socket 4
      \----- socket 5

MfG Goswin
Re: [zeromq-dev] DEALER-ROUTER question
On Fri, Sep 26, 2014 at 11:06:29AM +0200, Pieter Hintjens wrote: This is one of those rare roadmap/vision threads. Concretely, all this has to happen first:

- release 4.0.5 sometime very soon
- release a 4.1.0 RC sometime later
- update the Guide for 4.1
- release a CZMQ v3.0 RC

I think CZMQ can be sliced in several ways, e.g. the project modeling should go together with zproto into a new project (I think we agreed on a name but I forget what it was now...). Then libzmq should use that. One thing at a time. -Pieter

One last thing and then I will get back to work. The zring class in CZMQ I feel is still experimental and I would like to keep it in flux some more, maybe throw it out altogether and replace it with the equipotent API I mentioned. You didn't give any ETAs, but if CZMQ is still a few weeks away, that will give plenty of time to get zring and ztimeout into shape and tested. MfG Goswin
Re: [zeromq-dev] ZMQ_POLLIN received on WRITE event
On Fri, Sep 26, 2014 at 11:25:37AM +0200, Dorvin wrote: W dniu 2014-09-26 10:39, Goswin von Brederlow pisze: a) actually compiles -- It actually compiles on Windows. I think I should have stated that more clearly in the first post. -- With the includes for FD_SET missing and Sleep() undefined?

I don't know what Windows version and which build environment you are using, but zmq.h includes winsock2.h (which defines fd_set). Winsock2.h includes windows.h (which includes winbase.h). Winbase.h defines Sleep(). I just created an empty MSVC project and it worked like a charm. This is interesting, so if you would like we could investigate your situation, but I would not like to spam this group, so let's continue in private. With regards, Jarek

Ok, problem solved then. Under Linux, zmq.h includes far less and you have to actually include the proper headers for the stuff you use. I assume Sleep() is another Windows thing. Portability is hard. MfG Goswin
Re: [zeromq-dev] DEALER-ROUTER question
On Fri, Sep 26, 2014 at 01:46:20PM +0200, Pieter Hintjens wrote: On Fri, Sep 26, 2014 at 11:36 AM, Goswin von Brederlow goswin-...@web.de wrote: One last thing and then I will get back to work.

Ah, work, that mysterious thing we do when we're not talking about it :-)

The zring class in CZMQ I feel is still experimental and I would like to keep it in flux some more, maybe throw it out altogether and replace it with the equipotent API I mentioned.

I like the zring API and will be using it a lot to replace the old zhash/zlist mix that lives in various servers.

Do you actually need it? I think the only place where you need a zhash/zlist combo is when the order in which items are added is relevant. If the order isn't relevant, then zhash already provides all you need. Can you give some examples? Because in czmq only zcertstore used the zhash/zlist combo, and it didn't even need it.

I'm thinking end of the year for CZMQ v3. What do you think about biting the snake's head off and replacing ztimeout with zserver as a container for zpeers? I mean, if this is our use case, why not raise the semantic level until we actually get something tasty? -Pieter

ztimeout is another building block for my protocol. It has a somewhat more complex peer structure because it handles heartbeats, message ack, message nack and message resends. There could be a zserver structure that handles just heartbeating by using ztimeout and zpeer. But that probably wouldn't be useful for me, since my heartbeats also function as ACKs/NACKs and have to be in a specific format. But maybe that could be handled via callbacks. Still, I like having simple building blocks that are self-contained. That way each block can be tested on its own and reimplemented if a better design comes along. They can also be used in different ways. E.g. the gossip protocol should use ztimeout too. MfG Goswin
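The heartbeat-timeout bookkeeping a ztimeout-style building block does can be sketched roughly as follows. The class name and API here are invented for illustration and are not the actual CZMQ class; the trick is that with a fixed interval, re-inserting a touched peer at the back keeps entries in deadline order:

```python
import time
from collections import OrderedDict

class Timeouts:
    """Per-peer deadlines. Touching a peer moves it to the back of the
    OrderedDict, so the front is always the next peer to expire and
    checking for expiry is O(number expired), not O(number of peers)."""

    def __init__(self, interval: float):
        self.interval = interval
        self.deadlines = OrderedDict()   # peer -> absolute deadline

    def touch(self, peer, now=None):
        """Record activity from peer (a message or heartbeat arrived)."""
        now = time.monotonic() if now is None else now
        self.deadlines.pop(peer, None)   # re-insertion moves it to the back
        self.deadlines[peer] = now + self.interval

    def expired(self, now=None):
        """Pop and return all peers whose deadline has passed."""
        now = time.monotonic() if now is None else now
        dead = []
        for peer, deadline in self.deadlines.items():
            if deadline > now:
                break                    # entries are in deadline order
            dead.append(peer)
        for peer in dead:
            del self.deadlines[peer]
        return dead
```

A server loop would call touch() on every message received and expired() once per poll iteration, disconnecting or resending to whatever comes back.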
Re: [zeromq-dev] DEALER-ROUTER question
On Fri, Sep 26, 2014 at 06:34:32PM +0200, Pieter Hintjens wrote: On Fri, Sep 26, 2014 at 6:20 PM, Goswin von Brederlow goswin-...@web.de wrote: Do you actually need it?

After all the work, that's a sad question. What I see are too-large patches done in "I want to try this yet I've not defined a clean problem statement" directions, which stress other people, create lots of discussion and other patches, and then ultimately end in "do we really need it?". Having to ask this question is totally, horribly wrong. I've personally spent something like 30-40 hours on the containers, mostly driven by the original zdlist patch. That seems to be about 75% wasted time.

I'm not talking about the doubly linked list part, just about the doubly linked list + zhash combo part. Both the doubly linked list and zhash are useful on their own, as are all the other changes we made with the global destructor, duplicator and comparator. How much of the time was spent on merging zhash support into zring? How much was spent on improving the containers overall? I don't think you spent 75% of the time on the merging, and only that small part would be wasted.

This is what the whole process is about. I'm going to ask people who contribute to CZMQ to read the C4 document again and again until this kind of thrashing ends.

I think I followed that: "If the Platform implements pull requests as issues, a Contributor MAY directly send a pull request without logging a separate issue." I think what doesn't quite work right is this: "Maintainers SHOULD ask for improvements to incorrect patches and SHOULD reject incorrect patches if the Contributor does not respond constructively." For example, in a recent pull request the zlist_goto was objected to, in accordance with: "To discuss a patch, people MAY comment on the Platform pull request, on the commit, or elsewhere." But minutes later the patch was merged, instead of waiting for the discussion to conclude and the pull request getting updated with a better patch.
I'm not sure how one is supposed to go from a first draft that solves a problem to a release-ready solution under the C4 process without this kind of thrashing. New features and ideas don't pop up perfect. My solution might not be to your liking; you didn't like the zlist_node_t approach, and the zring_goto wasn't liked either. But my stated problem needs exactly that feature. Finding a solution that is acceptable to everyone takes a few tries. When I try to discuss a possible solution before submitting code, you say: show me the code. When I show you code, you complain about thrashing. So I'm a bit lost as to how you think this should work. Or should I go off on my own, write 10 new classes to solve a complex problem, and then after a year submit them all in one go as a finished solution without asking for any feedback along the way? I thought splitting the problem into manageable chunks and sending pull requests as chunks become ready was better. Release early, release often.

Anyhow, please don't remove zring. If it ends up being unused, it'll die naturally. -Pieter

You want to add a new but totally unused class to the next stable release so it can die naturally? If you have a use for the combo, then that is fine. You added that part to zring, so you must have had a reason. I just don't see one at the moment, now that zcertstore simply iterates over zhash. I didn't like mixing zdlist and zhash into a single zring class from the start. It wasn't needed with my original patch. MfG Goswin
[zeromq-dev] Reimplementation of the zring class with equipotent API
Hi,

I just sent a pull request for a reimplementation of the zring class with a new API: https://github.com/zeromq/czmq/pull/698 Note: I renamed the existing zring class to zhashring in case it is still wanted by others.

The ztimeout class, for which I wrote the initial zring class, needs a method to directly remove/move elements of the ring. With the initial API this exposes internals of the implementation and needs a dangerous zring_goto method. So it was time to try something different.

The new zring API uses the zring_t type for each element in the ring. In fact, each element is a zring_t; they are all equipotent. Given any element you can add, remove, iterate, ... the ring, and elements can be moved safely from one ring to another (or within the same ring). Instead of a special head structure, every element of the ring can function as the head. That means you have to have an item in the ring for there to be a ring; an empty ring is just a NULL pointer.

But sometimes one needs to have a head for an empty ring (to set the destructor and duplicator even though the ring is empty). For this I defined guard elements. A guard element is simply an element in the ring that has NULL as its item pointer. Otherwise they are like any other element of the ring, except for zring_first/zring_next: both of them skip over guard elements and only return elements with a real item. And yes, I said elements: you can have more than one guard in a ring, splitting it into sections. zring_first and zring_next allow iterating until the next guard, once around the ring, or until any specific element is reached.

For an example of how zring can be used, look at ztimeout. This includes use of multiple guard elements and the different iterator modes. MfG Goswin
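The equipotent-element design described above (every node can act as head; guards are ordinary nodes with a NULL item that iteration skips) can be sketched in a few lines. This is a Python illustration of the idea, not the actual C API from the pull request:

```python
class Node:
    """Circular doubly linked list where every node is equipotent: any
    node can insert, remove, or start iteration. item=None marks a
    guard element, which iteration skips over."""

    def __init__(self, item=None):
        self.item = item
        self.prev = self.next = self        # a lone node is a ring of one

    def insert_after(self, other):
        other.prev = self
        other.next = self.next
        self.next.prev = other
        self.next = other

    def remove(self):
        """Unlink self; self becomes a ring of one again."""
        self.prev.next = self.next
        self.next.prev = self.prev
        self.prev = self.next = self

    def items(self):
        """Iterate once around the ring from here, skipping guards."""
        node = self
        while True:
            if node.item is not None:
                yield node.item
            node = node.next
            if node is self:
                break

guard = Node()                              # an "empty" ring: just a guard
for x in ("a", "b", "c"):
    guard.prev.insert_after(Node(x))        # append at the tail
```

Because a guard is an ordinary node, the empty ring still exists as an object that could carry a destructor/duplicator, matching the motivation given above.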
Re: [zeromq-dev] libzmq v4.0.5
On Thu, Sep 25, 2014 at 09:00:28AM +0200, Pieter Hintjens wrote: On Thu, Sep 25, 2014 at 4:22 AM, Goswin von Brederlow goswin-...@web.de wrote: Well, and a third time now. Both the now-blocks and the changed monitor events are API changes that break existing stuff. One could even say the ABI hasn't changed, only the API.

The changes to the monitor API broke the rules. We should have defined a new API, left the old one alone, and deprecated it.

Which makes me wonder: how could we prevent this in the future?

We know how to prevent it. Right now a C program can say zsock_new () or zmq_sock () and get new or old behavior. It seems fairly logical to eventually make CZMQ a part of libzmq to reduce building and packaging costs. That's a detail. Can you identify any reason that doesn't work?

That is fine for major changes where a completely new function makes sense, but not for minor behaviour changes that only concern corner cases. It doesn't make sense to end up with 20 zmq_sock()-like functions that all return a slightly differently behaving socket. Calling 'zmq_init_ver (1, 3);' would give the old behaviour, while 'zmq_init_ver (1, 4);' would activate the behaviour we currently have and remain there. When more changes are added, an API version 5 would be added, then 6 and so on.

On the plus side, it lets the caller explicitly demand one version or the other. On the minus side, it is an insane solution to a non-problem. The ideal solution for users is to see no versions at all, and for developers to be able to add and remove pieces without shifting gears. Can you imagine if HTTP had such a scheme, and in order to get a new content header you had to change versions and ask the entire Internet to upgrade? Sorry, you can't watch Youtube because your browser is using Internet v12.0.3.1, you have to upgrade.

HTTP does do exactly that. When you connect you say: GET /index.html HTTP/1.1 ...
The 1.1 at the end specifies the protocol version to use and determines what features and capabilities you have. And a while ago there was talk about a new HTTP/2.0. That is what this scheme aims at.

Public contracts should not be versioned. They should be frameworks

Which is fine if you follow your own contract. ZeroMQ has broken it 3 times now, so that isn't the best track record. And there are a number of things I would like to change eventually. Like, ZMQ_STREAM sockets shouldn't require the MORE flag for every frame, alternating between destination and data frames. Instead they should work like ROUTER sockets and take the first frame of a multi-part message as the destination address and send the rest onwards.

into which pieces can be added and removed asynchronously, where each piece has a rolling contract, and where contract violations are visible upfront (as compile errors, in C).

That I think is the advantage of the FUSE solution. By defining a higher FUSE_USE_VERSION you can not only get new features enabled but also deprecated features removed. So while FUSE_USE_VERSION 21 would still have foobar(), with compatibility code, FUSE_USE_VERSION 26 would give a compile error when you try to use foobar(). The lib would still have foobar() in both cases; the header file just doesn't expose it anymore if you specify the new version. You can even have the function there in version X, emit a deprecation warning on use with version Y, and hide it with version Z.

Insanity is what you often get when you try to solve non-problems. That's why I'm so adamant about starting with problem statements, not "we could do X or Y" discussions.

Problem: You have 2 differently behaving APIs and want to support both, but not necessarily at the same time.

Current solution: Ignore the API change and bump the SOVERSION to allow existing code to function with an old libzmq3 and new code with libzmq4.
Which also means changing the packaging for all reverse dependencies (for Build-Depends: libzmq4-dev) under Debian and recompiling them even though only one of them is affected by the change. Ideally I would like to just have libzmq-dev and libzmq3 and have that work for both old and new sources and binaries.

Our process is to identify a problem with the current situation, then solve that minimally and plausibly, then deliver that to real users, then repeat. Cheers Pieter

FUSE has something similar, but as a compile-time property. /usr/include/fuse/fuse.h:

 * This file defines the library interface of FUSE
 *
 * IMPORTANT: you should define FUSE_USE_VERSION before including this
 * header. To use the newest API define it to 26 (recommended for any
 * new application), to use the old API define it to 21 (default), 22
 * or 25, to use the even older 1.X API define it to 11.
 */
#ifndef FUSE_USE_VERSION
#define FUSE_USE_VERSION 21
#endif

Defining the FUSE_USE_VERSION in the application actually changes
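The FUSE-style gate quoted above boils down to a version macro checked in the header. A hypothetical sketch (mylib, foobar and MYLIB_USE_VERSION are invented names, not FUSE's or libzmq's) of how a library header could expose or hide an API by version, as discussed:

```c
/* Hypothetical header fragment: callers opt into an API generation by
 * defining MYLIB_USE_VERSION before including the header, FUSE-style. */
#ifndef MYLIB_USE_VERSION
#define MYLIB_USE_VERSION 21        /* default: oldest supported API */
#endif

#if MYLIB_USE_VERSION >= 26
/* foobar() is hidden from new code; the symbol stays in the library
 * so old binaries keep working. */
#define MYLIB_HAS_FOOBAR 0
#else
int foobar (void);                  /* deprecated, pre-26 API only */
#define MYLIB_HAS_FOOBAR 1
#endif
```

A source that defines MYLIB_USE_VERSION 26 before the include gets a compile error on any use of foobar(), while untouched sources keep building against the 21 API.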
Re: [zeromq-dev] ZMQ_POLLIN received on WRITE event
On Thu, Sep 25, 2014 at 11:09:15AM +0200, Dorvin wrote: Thank you for your replies so far. I'll clarify in case it matters: I'm using Windows and libzmq 4.0.4. On 2014-09-25 04:38, Goswin von Brederlow wrote:

the ØMQ library shall signal any pending events on the socket in an edge-triggered fashion by making the file descriptor become ready for reading.

This is exactly what bothers me. According to the documentation I'm supposed to get most socket events when the FD gets ready for reading. But when I do:

sel_ret = select(sub_fd + 1, &sub_set, NULL, NULL, &time);
if (sel_ret == 1) {
    zmq_getsockopt(subscriber, ZMQ_EVENTS, &zevents, &zevents_len);
    if (zevents & ZMQ_POLLIN) {
        // retrieve and do something with message
    }
}

I receive only some messages, or none at all. You may see it in the testcase. For now I found 3 options to work around this: 1. Use zmq_poll() periodically - this seems to get read and write notifications at the same time and therefore not lose events. This does not look like an event-driven way of doing things, though. 2. Use getsockopt with ZMQ_EVENTS after each send or receive - this is like a small version of zmq_poll invoked almost constantly. 3. Keep the write notifier active all the time, which is not a good idea because it fires on every pass of the event loop. In the first reply I was already discouraged from this option. If you would be so kind, I would ask you to execute my code, see the results and tell me if it's expected behavior (some kind of optimization or something), if I miss something in how 0mq sockets should work, or if I should file a bug report. You may remove the line if (occurrences > 0) { break; } to see that what I describe is not a single event but repeats. With regards, Jarek

It might help if you would post a testcase that: a) actually compiles b) doesn't involve undefined behaviour: (ii) select() may update the timeout argument to indicate how much time was left. pselect() does not change this argument.
On Linux, select() modifies timeout to reflect the amount of time not slept; most other implementations do not do this. (POSIX.1-2001 permits either behavior.) This causes problems both when Linux code which reads timeout is ported to other operating systems, and when code is ported to Linux that reuses a struct timeval for multiple select()s in a loop without reinitializing it. Consider timeout to be undefined after select() returns.

c) gives an error where the problem occurs, or describes what the expected outcome should be and what it actually is.

All I see is that you use select with a 0 timeout, which makes the select basically pointless. You also wrongly select the FD for writability (which is always possible, so a NOP) and then break when a message was received. Since messages arrive asynchronously, you eventually hit the case where a message gets received just at that time. Nothing wrong there. MfG Goswin
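Because ZMQ_FD is edge-triggered, after select() wakes up the application must drain ZMQ_EVENTS completely before sleeping again, or notifications are lost — which matches the symptoms described in this thread. A sketch of the pattern using the documented libzmq calls (error handling omitted; subscriber stands for an already connected SUB socket):

```c
/* Integrate a zmq socket into select() correctly: ZMQ_FD only signals
 * edges, so every wakeup must be followed by a drain loop that reads
 * until ZMQ_EVENTS no longer reports ZMQ_POLLIN. */
int fd;
size_t fd_size = sizeof fd;
zmq_getsockopt (subscriber, ZMQ_FD, &fd, &fd_size);

while (1) {
    fd_set readable;
    FD_ZERO (&readable);
    FD_SET (fd, &readable);
    select (fd + 1, &readable, NULL, NULL, NULL);  /* block until an edge */

    while (1) {                     /* drain: clear the edge completely */
        int events;
        size_t events_size = sizeof events;
        zmq_getsockopt (subscriber, ZMQ_EVENTS, &events, &events_size);
        if (!(events & ZMQ_POLLIN))
            break;
        zmq_msg_t msg;
        zmq_msg_init (&msg);
        zmq_msg_recv (&msg, subscriber, ZMQ_DONTWAIT);
        /* ... handle the message ... */
        zmq_msg_close (&msg);
    }
}
```

Note that on Windows ZMQ_FD is a SOCKET, not an int, so the fd variable's type has to change there.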
Re: [zeromq-dev] Is my use case viable?
On Thu, Sep 25, 2014 at 01:49:13PM +0200, Arnaud Kapp wrote: Hello everyone, Not too long ago I asked about POLLPRI being handled by ZMQ for a particular project I'm working on. I am still figuring out if ZMQ would be a good fit for this project. I would like your feedback about my use case, and your insight about performance problems I may encounter. I've read that when using ZMQ one shouldn't have to spawn many sockets. However, when thinking about my design I realized I'd have a lot of sockets talking to each other (locally, through inproc://). Each component of the software would somehow be like a zactor: a pipe socket back to the parent thread, a REP socket for other components to connect to, and a PUB socket to publish events. (With an average of 100 components -- each with no CPU-intensive task.) The breakdown into multiple actor threads would be for clarity, not performance (not all target computers have more than 1 hardware thread anyway). I'd say a worst-case scenario would be around 500-1000 sockets. It should be less than that most of the time, though. The message activity is not stable. Most of the time I wouldn't expect more than a dozen messages per second being sent, but sometimes it would be more (300-400 msgs per second being published, plus the control traffic through REQ/REP). This software targets low-performance ARM devices (Raspberry Pi mostly). I know it's not easy to tell based on what I just wrote, but do you think I risk performance problems? Do you think the design is zmq-approved?

You probably should have a master thread that binds a few sockets and then start your 100 components as worker threads. Each component connects to the master and communicates over that with the rest of the world. That way you have more like 100 sockets instead of 1000. MfG Goswin
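The suggested topology -- one bound socket in a master thread, components connecting over inproc:// -- might look like this in outline. A sketch only: error handling and the loop bodies are omitted, and the endpoint name is invented:

```c
#include <zmq.h>
#include <pthread.h>

/* Each of the ~100 components runs in its own thread and connects a
 * single DEALER to the master's ROUTER, instead of binding sockets
 * of its own. The context must be shared for inproc:// to work. */
static void *worker (void *ctx)
{
    void *sock = zmq_socket (ctx, ZMQ_DEALER);
    zmq_connect (sock, "inproc://master");
    /* ... component event loop: recv requests, send replies ... */
    zmq_close (sock);
    return NULL;
}

int main (void)
{
    void *ctx = zmq_ctx_new ();
    void *router = zmq_socket (ctx, ZMQ_ROUTER);
    zmq_bind (router, "inproc://master");  /* bind before workers connect */

    pthread_t threads[100];
    for (int i = 0; i < 100; i++)
        pthread_create (&threads[i], NULL, worker, ctx);
    /* ... master loop: route messages between components ... */
    return 0;
}
```

One detail worth remembering: in libzmq 4.0, inproc:// endpoints must be bound before anyone connects to them, so the master must bind before spawning the workers.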
Re: [zeromq-dev] DEALER-ROUTER question
On Thu, Sep 25, 2014 at 02:08:41PM +0100, Riskybiz wrote: Dear Pieter, As a figurehead of the zeromq project I think you should know that the

Maybe you meant spokesperson or prominent programmer instead of figurehead? I assume you didn't mean Pieter is just a puppet.

zguide and its examples, whilst a worthy effort, are actually a barrier to the progress of a learner. As a newcomer to networking, my interactions and attempts to learn and use zeromq over the past year have been an unyielding source of issues to be overcome. Anyone else would probably have given up, but I recognise the benefits of the superb technology offered by zeromq. Part of the problem is perhaps that the zguide has been written by experts, where seemingly minor details are glossed over. For example in the hwserver

Now is the time for you to change that. You still see yourself as a beginner and you experience the beginner problems, but you seem to have figured some of them out. Write your problems, questions and solutions down and send patches to the guide to improve it. As experts, it is often incredibly hard to write examples. A lot of stuff comes so naturally that one doesn't even think about it. It's so obvious, after having done it that way for years, that one doesn't think to explain or comment it. Only newbies can accurately point out where newbies have problems understanding things.

and hwclient code it is not demonstrated how actually to read a message payload from the socket and extract the received message string. It cheats and prints: printf ("Received World %d\n", request_nbr); When someone tries this early example, the first thing they will experiment with is: how can I customise the message payload and get my own message sent and received? They will be disappointed. They might read on, try to figure it out, and be baffled by the multiple language bindings, APIs and helper files, and be left wondering: what actually needs to be written to make zeromq work?
What are the core underlying zeromq commands which need to be called? What needs to be installed? How do I do that? Where do I find the downloads? I encountered a situation last year when I needed to use the common technique of serialization to pass custom C++ objects over zeromq sockets. While the zguide alludes to the possibility, there was nothing to offer assistance in the practical implementation. It took some considerable time, weeks, to sort this out into a functioning prototype. A simple functional guide example could have saved time and questions; not just for me but also for any number of other users.

One of the problems here is that every language is different. The bindings are different to better fit in with each language environment. For example, the C++ bindings (I think there are 2 competing ones) have functions to send and receive std::string. The Python 3 bindings simply send/receive bytes (byte arrays) and you have to encode/decode to convert from/to strings. Or you use send_pyobj/recv_pyobj with nearly any kind of Python object. I think the best would be a mini guide for each language/binding that explains how they make zmq fit into the language. So now is the chance for you to start one.

I find that the zguide code examples are in themselves problematic. The example code is very sparsely commented. Every line which does something non-trivial or zeromq-specific should be explained. I have found examples to be zeromq version-specific, operating system-specific and requiring modifications to work on Windows. All of these factors just consume time in endeavouring to make them work or debug them with limited understanding of what is actually supposed to be happening. This causes questions and frustrations.
CZMQ was recommended as the API to use, with 'reference' C language code examples. I lost more time trying to compile CZMQ before realising the practical impossibility of this on Windows, despite it alluringly providing Visual Studio project files. The lack of working installation instructions was also a barrier. Then afterwards I discovered that ROUTER sockets in updated zeromq versions no longer use UUID identities, thus invalidating the code example I was endeavouring to get working anyway. More lost time, more questions. No progress.

The guide is sadly for an older version of zmq and desperately needs someone to update it. Every bit helps, and if you find things that simply no longer work then do file an issue and potentially a patch.

Pieter: Would it be possible to put all the examples of the guide into a git project and have them included in the auto-compile done for every pull request?

My suggestion is that if you want fewer basic questions asked in the community then please take time to revisit the zguide, its examples and the necessary zeromq code resources, and make it such that people can easily find the resources they need, confidently learn and demonstrate the examples and
Re: [zeromq-dev] libzmq v4.0.5
On Thu, Sep 25, 2014 at 02:20:07PM +0200, Pieter Hintjens wrote: On Thu, Sep 25, 2014 at 1:39 PM, Goswin von Brederlow goswin-...@web.de wrote: That is fine for major changes where a completely new function makes sense. But not for minor behaviour changes that only concern corner cases. It doesn't make sense to end up with 20 zmq_sock()-like functions that all return a slightly differently behaving socket.

We've already done this, and no-one complains: - zmq_send Lowlevel send - zmq_sendmsg - zmq_msg_send

Same behaviour with a different name. And I would call the msg structure a major change to the first function. And since you asked for it, let me complain about this here. :) Maybe http://api.zeromq.org/4-1:zmq-sendmsg could be removed from the table of contents (or moved to a deprecated section in the table of contents so it remains for reference). When I started with zmq I read a few examples and then started to experiment. For this I kept the API reference open, and when I wanted to do something I would look for a function that sounded right and open its page. That easily leads to learning deprecated functions instead of finding the proper ones. OK, after a while I learned to first scan for the "this API method is deprecated" boxes. But still, wasted time. On the opposite side, the new version has some new functions. Someone coming from an old version might want to read up on the new features. In the online OCaml reference changes are marked: http://www.askra.de/software/ocaml-doc/4.00/index.html That makes it easy to find and read up on changes. Maybe something to think about too.

- zmsg_send - zstr_send - zframe_send - zsock_send - zactor_send

Different library altogether. :) And CZMQ has more of an OO design, so every data object has a send function. One isn't preferable over the other; they are all meant to coexist. All do kind of the same thing.

The only time people complained (rightly) was when 3.0 changed zmq_send. In practice you don't get 20 variations.
You get 2 or 3, one for each stable generation. Remember we can change draft APIs at will. We make a stable release maybe once a year. We save old APIs for a few generations, then delete them.

HTTP does do exactly that. When you connect you say: GET /index.html HTTP/1.1

Yes, except it's used for almost nothing significant, and there was never a successful 1.2 or 2.0. That kind of proves my point. (I've written many complete HTTP server stacks.)

Which is fine if you follow your own contract. ZeroMQ has broken it 3 times now, so that isn't the best track record.

Indeed. However this isn't my project, it's *our* project and we all enforce the rules, or we don't. My job has been to define workable rules. I'll enforce them on aspects of the project I use and thus care about. Anyone using those pieces that got broken had the tools to stop that happening, and didn't say anything. We can investigate why, and fix that, if someone cares.

And there are a number of things I would like to change eventually. Like, ZMQ_STREAM sockets shouldn't require the MORE flag for every frame, alternating between destination and data frames. Instead they should work like ROUTER sockets and take the first frame of a multi-part message as the destination address and send the rest onwards.

Indeed. You can already do this in several ways. You don't need to expose ABI versions for this. Indeed we have a good, existing mechanism for feature updates to socket types. Check the many ROUTER socket options.

Two points here: 1) Then why isn't this done for the "now blocks" issue to preserve API/ABI compatibility? Or for the monitor socket? 2) "Check the *many* ROUTER socket options." That was my point. Over time such options get more and more numerous, so your code gets longer and longer. You can't just open a socket and get to work anymore. You have to set 20 options first, because the defaults must remain compatible with a stone-age version, and that for every socket.
The idea was that by specifying the API version the source expects, the defaults could be adjusted accordingly. By setting one version, all those 20 options that everyone wants changed get set in one go, resulting in improved defaults for all sockets.

That I think is the advantage of the FUSE solution. By defining a higher FUSE_USE_VERSION you can not only get new features enabled but also deprecated features removed.

OK, I do accept the problem statement of multiple versions of an API. There are many solutions. Remember our focus is distributed software and a distributed community. This is important. It means we work on many pieces at the same time, and they evolve independently over a long time, and at a distance. What this means is that we do not have a single API contract stream. Yes, you can say "this is version X"; however, you cannot use the version number as a mechanism for changing semantics. Here's
Re: [zeromq-dev] libzmq v4.0.5
On Mon, Sep 22, 2014 at 12:13:48AM +0200, Pieter Hintjens wrote: Hi all, This is a pre-release announcement, to give time for backporting. If there are any fixes on libzmq master that you'd like to see in 4.0.5, please let us know. For info, this is what's currently in that release:

* Fixed #1191; CURVE mechanism does not verify short term nonces.
* Fixed #1190; stream_engine is vulnerable to downgrade attacks.
* Fixed #1088; assertion failure for WSAENOTSOCK on Windows.
* Fixed #1015; race condition while connecting inproc sockets.
* Fixed #994; bump so library number to 4.0.0
* Fixed #939; assertion failed: !more (fq.cpp:99) after many ZAP requests.
* Fixed #872; lost first part of message over inproc://.
* Fixed #797; keep-alive on Windows.

-Pieter

If you bump the SOVERSION, would it be possible to fix the "void * for everything" problem? Define an abstract ctx_t, sock_t, msg_t, ...? MfG Goswin
Re: [zeromq-dev] libzmq v4.0.5
On Wed, Sep 24, 2014 at 05:46:14PM +0200, Pieter Hintjens wrote: Sorry for using the word criminal. I meant, just, that this would be a contract violation. Since ZeroMQ v3.2, we've held to the principle that new functionality may not break existing apps.

We broke this rule once or twice afaik.

On Wed, Sep 24, 2014 at 12:07 PM, Pieter Hintjens p...@imatix.com wrote: No, we cannot and will not break existing public APIs. Please read the C4 if you're not clear on this. Bumping the ABI version does not excuse criminal behavior. We can at any point add new classes, and new methods. This is what CZMQ is experimenting with. These classes can move into libzmq if that's valuable.

On Wed, Sep 24, 2014 at 8:33 AM, Goswin von Brederlow goswin-...@web.de wrote: On Mon, Sep 22, 2014 at 12:13:48AM +0200, Pieter Hintjens wrote: Hi all, This is a pre-release announcement, to give time for backporting. If there are any fixes on libzmq master that you'd like to see in 4.0.5, please let us know. For info, this is what's currently in that release:

* Fixed #1191; CURVE mechanism does not verify short term nonces.
* Fixed #1190; stream_engine is vulnerable to downgrade attacks.
* Fixed #1088; assertion failure for WSAENOTSOCK on Windows.
* Fixed #1015; race condition while connecting inproc sockets.
* Fixed #994; bump so library number to 4.0.0
* Fixed #939; assertion failed: !more (fq.cpp:99) after many ZAP requests.
* Fixed #872; lost first part of message over inproc://.
* Fixed #797; keep-alive on Windows.

-Pieter

If you bump the SOVERSION, would it be possible to fix the "void * for everything" problem? Define an abstract ctx_t, sock_t, msg_t, ...? MfG Goswin

Well, and a third time now. Both the "now blocks" and changed monitor events are API changes that break existing stuff. One could even say the ABI hasn't changed, only the API. Which makes me wonder. How could we prevent this in the future?
It would be easy enough to add runtime options to the code to switch between the old and new behaviour (at the cost of having to maintain them). But I don't want to end up having to set 1000 flags to switch on new features. Maybe we could add a new

void zmq_init_ver (int io_threads, int api_version);

zmq_init_ver would have a switch statement that toggles behavioural options on and off depending on the version passed to it. Calling 'zmq_init_ver (1, 3);' would give the old behaviour while 'zmq_init_ver (1, 4);' would activate the behaviour we currently have and remain there. When more changes are added, an API version 5 would be added, then 6 and so on.

FUSE has something similar, but as a compile-time property. /usr/include/fuse/fuse.h:

 * This file defines the library interface of FUSE
 *
 * IMPORTANT: you should define FUSE_USE_VERSION before including this
 * header. To use the newest API define it to 26 (recommended for any
 * new application), to use the old API define it to 21 (default), 22
 * or 25, to use the even older 1.X API define it to 11.
 */
#ifndef FUSE_USE_VERSION
#define FUSE_USE_VERSION 21
#endif

Defining the FUSE_USE_VERSION in the application actually changes the API, and if unset you get the base version. What do you think? Could we use one of those methods? Are there better ways? MfG Goswin
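Internally, the proposed zmq_init_ver could reduce to a mapping from API version to a set of behaviour defaults. A self-contained sketch of that mapping (the flag names and the function itself are invented for illustration; nothing here is a real libzmq option):

```c
/* Hypothetical behaviour set selected by the api_version argument of
 * the proposed zmq_init_ver(). Flags are illustrative only, named
 * after the two breakages discussed in this thread. */
typedef struct {
    int term_now_blocks;       /* the "now blocks" change */
    int monitor_v2_events;     /* the changed monitor events */
} api_defaults_t;

api_defaults_t defaults_for_version (int api_version)
{
    api_defaults_t d = { 0, 0 };       /* version 3: old behaviour */
    if (api_version >= 4) {            /* version 4: current behaviour */
        d.term_now_blocks = 1;
        d.monitor_v2_events = 1;
    }
    /* a future version 5 would flip further defaults here */
    return d;
}
```

The point of the scheme is that one call sets the whole group of defaults, instead of every application toggling 20 individual socket options.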
Re: [zeromq-dev] ZMQ_POLLIN received on WRITE event
On Wed, Sep 24, 2014 at 07:51:39PM +0200, Dorvin wrote: Hi, I started using ZeroMQ not long ago and tried to use it in an evented app. When I try to use the FD with a socket notifier, it appears I'm receiving ZMQ_POLLIN events when the FD is signalling a write event. Is there any rationale behind such behavior? A naive testcase to show this behavior is at: http://pastebin.com/iXV2GVVv Thanks in advance for your reply, Jarek

From http://api.zeromq.org/4-1:zmq-getsockopt: ZMQ_FD: Retrieve file descriptor associated with the socket. The ZMQ_FD option shall retrieve the file descriptor associated with the specified socket. The returned file descriptor can be used to integrate the socket into an existing event loop; the ØMQ library shall signal any pending events on the socket in an edge-triggered fashion by making the file descriptor become ready for reading.

The FD isn't the actual low-level socket. Instead it is a signalfd or eventfd or something like it. The IO threads write to it when something happens on any of the low-level sockets attached to the zmq socket, and the event handler reads from it to clear it again.

The ability to read from the returned file descriptor does not necessarily indicate that messages are available to be read from, or can be written to, the underlying socket; applications must retrieve the actual event state with a subsequent retrieval of the ZMQ_EVENTS option. MfG Goswin
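In practice this means the FD becoming readable is only a wakeup signal; the real state check is a sketch like this, using the documented libzmq calls (error handling omitted; socket stands for any zmq socket):

```c
/* Readability of the ZMQ_FD descriptor carries no information by
 * itself; always ask the socket what actually happened. */
int events;
size_t events_size = sizeof events;
zmq_getsockopt (socket, ZMQ_EVENTS, &events, &events_size);
if (events & ZMQ_POLLIN) {
    /* at least one whole message can be received without blocking */
}
if (events & ZMQ_POLLOUT) {
    /* at least one message can be sent without blocking */
}
```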
Re: [zeromq-dev] ELI5: Why can't I get the IP address of the machine that sent a message?
On Wed, Sep 24, 2014 at 07:03:25PM -0700, Michel Pelletier wrote: IP addresses are easily spoofed; they should not be used as a security mechanism. If you want security, you have to either trust all your networks or use CurveZMQ security or some other authentication mechanism like a VPN. 0mq doesn't expose the IP because this is an implementation detail of the tcp transport. -Michel

On Wed, Sep 24, 2014 at 6:57 PM, Scott alcoholi...@gmail.com wrote: Hi folks, We had a problem a while back where some 'unknown machine' was sending malformed messages and causing our ZMQ-based app to lose its mind. It took a while to figure that out... And then once a developer on our team found the nastygram, he had trouble figuring out what machine was doing this. Is there a design reason that we lose this information in the area between plain ol' sockets and ZMQ messages getting delivered to the application? Thanks for your patience and such a great library! -Scott

As a side note: the monitoring interface exposes the IP address. You can use that to monitor who connects (for logging purposes). The other interface is: [1] 27/ZAP - ZeroMQ Authentication Protocol. On every connect a ZAP request is generated that includes the address, the origin network IP address. And you can set a user id and metadata in the reply. For example, you could use the NULL (default) mechanism and set the user id to the address from the request. When you receive a problematic message you can then use zmq_msg_gets("User-Id") to retrieve the address of the peer as set in the ZAP handler. MfG Goswin

[1] http://rfc.zeromq.org/spec:27
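Retrieving the address stored by such a ZAP handler is then a per-message lookup. A sketch (it assumes a ZAP handler bound at inproc://zeromq.zap.01 that copied the request's address field into the user-id reply frame; zmq_msg_gets is available from libzmq 4.1):

```c
#include <stdio.h>

/* Read a message and recover the peer address the ZAP handler stored
 * in the User-Id metadata property. Error handling omitted. */
zmq_msg_t msg;
zmq_msg_init (&msg);
zmq_msg_recv (&msg, socket, 0);
const char *peer = zmq_msg_gets (&msg, "User-Id");
if (peer)
    printf ("message came from %s\n", peer);    /* e.g. an IP address */
zmq_msg_close (&msg);
```

The metadata lives as long as the message, so copy the string out before closing the message if you need to keep it.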
Re: [zeromq-dev] It looks like socket id's can't change with a router-router topology. Is that correct?
On Wed, Sep 17, 2014 at 03:55:09PM -0400, Mark Wright wrote: I have a router-router setup (destination IDs are acquired via a broadcast/response, similar to the Freelance pattern in the ZMQ book). I've noticed that if my destinations go down and come back up with a new ID, clients can't route to that ID. I'm using 4.0.3.

Are you sure they have a new ID? I'm guessing not. You have to allow reused IDs to take over the ID, otherwise the messages keep getting sent to the old connection. See the socket options. MfG Goswin
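The "take over the ID" knob referred to is a socket option in later libzmq releases. A sketch (ZMQ_ROUTER_HANDOVER exists from libzmq 4.1 on; on the 4.0.3 mentioned in the question it is not available, which may itself explain the behaviour):

```c
/* ZMQ_ROUTER_HANDOVER (libzmq >= 4.1): when a new connection presents
 * an identity already in the ROUTER's routing table, hand the entry
 * over to the new connection instead of dropping the newcomer.
 * Set before bind/connect. */
int handover = 1;
zmq_setsockopt (router, ZMQ_ROUTER_HANDOVER, &handover, sizeof handover);
```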
Re: [zeromq-dev] Multiple Router/Dealer
On Fri, Sep 12, 2014 at 11:49:03AM -0700, Mohit Anchlia wrote: For my testing I've built a client (req) - router - dealer - worker (rep). I currently have multiple clients and workers; however, I am trying to figure out how I can have multiple routers/dealers. Is there a way to make router/dealer sockets multi-threaded?

One socket, one thread. If you really need multithreading then you need multiple router/dealer sockets. MfG Goswin
Re: [zeromq-dev] Proposal for a revised zlist API
On Wed, Sep 10, 2014 at 06:42:18PM +0200, Pieter Hintjens wrote: Sorry, I've lost track of some of the details here. Could you re-describe your timeout queue starting from the problem so I can think about how I'd implement it? Thanks Pieter

The problem is that every connected host has a timeout for the heartbeat, and every message sent and received has a timeout for its ACK. Most of the time the timeout is canceled because there is activity or a message gets ACKed in time. And this has to scale well, being able to cope with 1000+ hosts and 100+ messages in flight. So there can be a lot of timers and a lot of activity. It is important that creating, canceling and restarting a timeout timer is fast.

The normal approach to timeouts is to have a priority queue based on a heap or balanced tree. This makes creating, canceling and restarting a timeout an O(log n) operation, while finding the next timeout is O(1).

My timeout queue uses a different approach, taken from the Linux kernel's network code. Most timeouts get canceled early. So the idea is to delay the work for each timeout as long as possible in the hope that it gets canceled and you don't have to spend time on it at all. For this I have an array of buckets (zring). Bucket N holds timeouts that expire between 2^N and 2^(N+1) ticks from now. Every 2^N ticks all timeouts in bucket N are inspected and resorted into new buckets. Over time, timeouts migrate from the highest bucket to the lowest and then expire, unless they get canceled. A timeout also only gets inspected a bounded number of times [iirc number of buckets + 1], less if canceled. The overall work is therefore comparable to a heap or sorted tree if all timeouts expire, but much less if they get canceled.

Creating a new timeout is trivial: I just compute which bucket it belongs to and insert it into the zring. Canceling is harder, since searching the zring for the right timeout would take too long. That's where the node_t pointer comes in.
It allows the timeout to remove itself without searching. MfG Goswin
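The bucket choice described above -- bucket N holds timeouts expiring in [2^N, 2^(N+1)) ticks -- is just the index of the highest set bit of the remaining time. A self-contained sketch (the function name is invented; the real code lives in the ztimeout class):

```c
#include <stdint.h>

/* Pick the bucket for a timeout: bucket N covers deltas in
 * [2^N, 2^(N+1)), so the bucket index is floor(log2(delta)). */
int timeout_bucket (uint64_t now, uint64_t expires)
{
    uint64_t delta = expires - now;     /* caller ensures expires > now */
    int bucket = 0;
    while (delta >>= 1)
        bucket++;
    return bucket;
}
```

When the 2^N tick for bucket N elapses, each timeout in it is re-run through this function against the new "now", which migrates it toward bucket 0 until it expires or is canceled.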
Re: [zeromq-dev] Router socket and connection identities?
On Fri, Sep 12, 2014 at 09:00:02AM +0100, Riskybiz wrote: Dear zeromq developers, I intend to create an example DEALER-ROUTER network arrangement which will pass multipart messages containing a variable number of parts. As I understand it, a ROUTER socket uses an identity for each of the connections with which it corresponds; the identities are stored internally by the ROUTER socket in a lookup container. My question is: Is it necessary in my code to create a unique identity for each connection and then pass it to the ROUTER socket, OR does the ROUTER socket create the necessary identity internally?

Yes. :) If you create identities, they had better be unique. If you don't create identities, then the ROUTER socket will create random ones for you.

Also, does anyone please know of a simple console example of a ROUTER-DEALER arrangement which I may study and which will work on Windows using zeromq-4.0.4? With thanks, Riskybiz.

There should be several in the guide. MfG Goswin
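For the explicit-identity case, the identity is set on the DEALER side before connecting; the ROUTER then uses it as the routing frame for that peer. A minimal sketch using the libzmq 4.0 API (the identity string and endpoint are illustrative):

```c
#include <zmq.h>

/* Give the DEALER a fixed identity before connecting; without this,
 * the peer ROUTER generates a random identity for the connection. */
void *ctx = zmq_ctx_new ();
void *dealer = zmq_socket (ctx, ZMQ_DEALER);
zmq_setsockopt (dealer, ZMQ_IDENTITY, "client-1", 8);
zmq_connect (dealer, "tcp://localhost:5555");
```

Setting the option after zmq_connect has no effect on that connection, which is a common stumbling block.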
Re: [zeromq-dev] Proposal for a revised zlist API
On Mon, Sep 08, 2014 at 07:02:53PM +0200, Pieter Hintjens wrote: On Mon, Sep 8, 2014 at 6:42 PM, Goswin von Brederlow goswin-...@web.de wrote: Urgs. :-) If zlist gets deprecated then there needs to be a new list (zring) that preserves the strict FIFO ordering and simplicity of a list.

We don't need to deprecate zlist then; rather we can keep it as the simple list container.

Merging them together generally just combines the bad aspects while losing the good.

Not really. I've experimented with the zring hybrid and it works very neatly, and it solves the case we have often, of keeping a list+hash of items. In fact it is the only case I've seen of code using two containers for the same data.

This new mix with a hashtable makes the class unusable from an API point of view too.

How? If you don't use the keyed calls, it works just like before.

But my timeout queue needs the key API (or rather the initial zdlist_node_t*) to access items directly. So I have to invent keys and convert them to strings to use the key API and implement my own detach-by-key. And you can't detach/delete an item by key, which would be the only point in all of this for me. (Note: I do need that zring_node_t * return value for append back, by the way.)

If you have keys for an item then you use _insert to store them, and then _delete to remove by key.

But not detach it by key. The delete destroys the item.

On the other hand, the extra arg is invaluable when combining or stacking containers, or stacking free_fn functions. Which is a use case that zring now kills off.

Unless you have other mixes of containers.

I do. The timeout queue I posted a while back does use zring. That's what I wrote it for. And with it being in the critical IO path, the speed of certain operations is important.

It is demonstrably simpler to allow zring to do both list and hashing than manage two containers in the app, and adding user pointers just complexifies further. size_t czmq_hash_fn (void *item, void *hash_private)

Sure, yes.
The hash function also needs to get the key, if any, used to insert the item. My bad. Should have been: size_t czmq_hash_fn (void *key, void *hash_private) The hash function shouldn't need the item to hash a key. Otherwise how is lookup going to work? MfG Goswin ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
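To make the discussed callback shape concrete, here is a minimal self-contained sketch. The djb2-style hash, the seed-in-hash_private idea, and all names here are illustrative assumptions, not actual CZMQ API:

```c
#include <assert.h>
#include <stddef.h>

//  Illustrative sketch only: mirrors the proposed
//  size_t czmq_hash_fn (void *key, void *hash_private) shape.
typedef size_t (hash_fn) (void *key, void *hash_private);

//  Classic djb2 string hash, optionally seeded via hash_private
static size_t
s_string_hash (void *key, void *hash_private)
{
    size_t hash = hash_private ? *(size_t *) hash_private : 5381;
    for (const char *p = key; *p; p++)
        hash = hash * 33 + (unsigned char) *p;
    return hash;
}

//  The container reduces the returned value modulo its bucket
//  count to pick a slot
static size_t
s_bucket_for (hash_fn *fn, void *key, void *hash_private, size_t nr_buckets)
{
    return fn (key, hash_private) % nr_buckets;
}
```

Lookup then only needs the key: hash it again, jump to the bucket, and compare keys inside the bucket. That is exactly why the hash callback must work from the key alone, not the item.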
Re: [zeromq-dev] Automatic reconnection, blessing or curse?
At work we leave GUI clients running connected to remote servers via ZMQ all the time. But when we go home at night our local systems get suspended. When we come back in the morning and wake them up again the GUI clients just keep working because zmq will reconnect in the background. So that is basically your shutting-the-lid-of-the-laptop case. As for flaky networking, there you will run into message loss. You will need more than automatic reconnects and even heartbeating. You need to resend requests when they get lost. So far that is left to the application but I'm working on a protocol module for czmq that will handle that for you so the application doesn't have to. So zmq alone isn't the answer but with addons that are being developed it is getting there. MfG Goswin On Tue, Sep 09, 2014 at 11:10:23AM +, Tom Quarendon wrote: OK, sounds like it wouldn't be bending the technology too far. I don't want to just use zeroMQ for the sake of it, but if there's good value to be had, then I will use it. The good thing is that your guide explores good patterns for making different kinds of network topologies, so there's a lot that can be built upon. Thanks. -Original Message- From: zeromq-dev-boun...@lists.zeromq.org [mailto:zeromq-dev-boun...@lists.zeromq.org] On Behalf Of Pieter Hintjens Sent: 09 September 2014 12:06 To: ZeroMQ development list Subject: Re: [zeromq-dev] Automatic reconnection, blessing or curse? You can certainly set up a CURVE connection that will ignore TCP disconnects and reconnects. At the lower level, the two peers will re-negotiate and create a new session key, but that'll be invisible to your application. On Tue, Sep 9, 2014 at 12:33 PM, Tom Quarendon tom.quaren...@teamwpc.co.uk wrote: OK, thanks Pieter. I was hoping I might get the guru :-) Can I pose my question in another way, just to make sure I understand? I have an SSH connection to a remote machine from my laptop. 
If I close the lid of the laptop to go home, or walk to another office, the SSH connection dies. On the face of it, zeroMQ would seem to provide a solution to that. So would zeroMQ be a good fit as the basis of an SSH implementation that would survive me sleeping my laptop, or the Wifi being flaky? Ignore the fact that it wouldn't be compatible with SSH, and that it's probably simpler just to set up Kerberos, or certificate signon so that reconnecting is easy anyway etc, just can you say whether that would be a good thing to base an implementation of something like an SSH server/client pair on? Or would it be square peg round hole, it would work, but would be bending the technology? I'm just trying to get a handle on the kinds of things zeroMQ might be good for. There's a lot of thought and intelligence gone into zeroMQ and the things that surround it, and on the face of it seems like it and the surrounding patterns have the potential to make writing good network code easier, I just need to work out whether it's applicable to the kinds of things I want to do. Thanks. -Original Message- From: zeromq-dev-boun...@lists.zeromq.org [mailto:zeromq-dev-boun...@lists.zeromq.org] On Behalf Of Pieter Hintjens Sent: 09 September 2014 11:19 To: ZeroMQ development list Subject: Re: [zeromq-dev] Automatic reconnection, blessing or curse? ZeroMQ does indeed hide some things which we're used to seeing with TCP. Even with TCP, if you want to maintain a connection for any length of time, you need heartbeating. Otherwise you will hit cases where TCP reports no error, yet the connection is effectively dead. We don't usually use REQ/REP in practice. Any realistic client-server work runs over DEALER/ROUTER. 
-Pieter On Tue, Sep 9, 2014 at 11:37 AM, Tom Quarendon tom.quaren...@teamwpc.co.uk wrote: I'm trying to get a handle on zeroMQ, and when I read chapter 4 of the guide, the thing I keep thinking is "isn't all that complexity just a by-product of the fact that sockets automatically reconnect after failure?" I want to do a very simple RPC application between two endpoints. Let's say I want to do something a bit like SSH or SFTP (it's on my mind and I'm trying to understand whether zeroMQ would be a good fit for such an application). I can do this in raw TCP fairly easily, or at least I think I can, and if the connection goes down, then any attempt to read or write from the socket gives an error (the socket can survive the disconnection if you don't actually try to send any data), so I know when either the server process crashes, the client crashes or the network cable gets unplugged while I'm trying to send traffic. Now in zeroMQ, using REQ/REP in a simplistic fashion, a number of things might happen. If the network cable is unplugged, then either the client will be unaware (if control is with it,
Re: [zeromq-dev] Proposal for a revised zlist API
On Sun, Sep 07, 2014 at 07:11:52PM +0200, Pieter Hintjens wrote: Hi Goswin, So the upshot of this is, (a) zlist is as-was, and represents a simple, stupid list; (b) zhash has some of the new stuff like item destructor and duplicator, at the cost of slight breakage in zhash_dup; (c) zring is now a hybrid list+hash container that starts to look like something fun. Urgs. If zlist gets deprecated then there needs to be a new list (zring) that preserves the strict fifo ordering and simplicity of a list. zring should not do hashing, that is the job of zhash. Don't try to make a hyper complex one-container-fits-all class. That won't work and will just have a million bugs. Each container type has its own behaviour, complexity and use cases. Merging them together generally just combines the bad aspects while losing the good. I need a simple doubly linked list as a building block in other classes and the overhead of the hashtable is unacceptable there. This new mix with a hashtable makes the class unusable from an API point of view too. I don't have keys for my items and using the items themselves runs into problems with hashing and duplicate frees. Converting the items to hex strings just to insert them is also stupid. And you can't detach/delete an item by key, which would be the only point in all of this for me. (Note: I do need that zring_node_t * return value for append back by the way). Zyre, Zbroker and some other projects work correctly on the new API, without changes. I switched zcertstore over to zring and it works nicely. The problem of one item in two containers disappears because we delegate everything to zring. So no need for that user pointer, and so container destructors remain compatible and simple. While I defined the free_fn to have 2 arguments (item and user) it is perfectly fine to use a normal destructor taking only 1 argument. The second argument will be put on the stack or into a register (according to the C calling convention) and gets ignored. 
All it needs is casting the destructor to the free_fn type. On the other hand the extra arg is invaluable when combining or stacking containers or stacking of free_fn functions. Further steps? It might be useful to add a custom hash calculator now, which takes the item contents and key and returns a pre-hashkey (8 bytes?) that zhash can re-hash minimally. Make it the same size as a void *, and we can do fun things like use the item reference as-is as pre-hashkey. This is the second most common use case after access by string key, IMO. The hash function should return a size_t and the zhash then uses that modulo the size to find the right bucket. It makes sense to have a state argument in the hash function, stored in the container, so a seed or similar can be stored. size_t czmq_hash_fn (void *item, void *hash_private) It is probably profitable now to switch zloop and zsys over to zring, and to measure before/after performance with many inserts/deletes. I may do that later... for now I am going to leave CZMQ alone for a few days. -Pieter MfG Goswin On Sat, Sep 6, 2014 at 2:46 PM, Pieter Hintjens p...@imatix.com wrote: Also, sorted lists... it makes sense, however I think what we're really wanting is a container that provides both keyed and sequential (sorted) access. Either a tree (there was a ztree class that I killed a while back, as it had imported a lot of horrid code from elsewhere), or a combination hash/ring. On Sat, Sep 6, 2014 at 11:27 AM, Pieter Hintjens p...@imatix.com wrote: OK, after spending way too long on this, I'd like to revert all changes to zlist, deprecate that class, and make zring the new V3 list container, doing everything properly and cleanly. This makes it very easy for developers to spot old vs new constructs. I think we've gone too far down the rabbit hole. Anyone using lists in anger wants a doubly-linked list. Let's continue the careful and slow evolution of zring then, and use it in CZMQ. 
I'm still undecided on the void *user argument as it prevents us using standard destructors, which is such a fine thing. I'm going to look for other ways to do this. -Pieter On Fri, Sep 5, 2014 at 7:14 PM, Goswin von Brederlow goswin-...@web.de wrote: On Fri, Sep 05, 2014 at 05:49:20PM +0200, Pieter Hintjens wrote: On Fri, Sep 5, 2014 at 3:56 PM, Goswin von Brederlow goswin-...@web.de wrote: Just saw your commit. You only found 17 issues? :-) I was coding till 1am, surprising it worked at all. 1) Small typos Fixed. 2) zlist_purge does not destroy the cursor allowing zlist_next and zlist_item access to freed memory. Nice catch, fixed. 3) change cursor to node_t ** but that would be internal. I'm not yet sure of that change, will try something else first (separate head/tail item as we have in zring, I think). The insert_before needs access to the previous node->next. You could have the cursor always
Re: [zeromq-dev] Proposal for a revised zlist API
Just saw your commit.
1) Small typos: The comment for zlist_dup mentions zhash_dup_v2. The comments for zlist_sort talk about key and value and straight ASCII comparison. But sorting goes by the compare function.
2) zlist_purge does not destroy the cursor, allowing zlist_next and zlist_item access to freed memory.
3) change cursor to node_t ** but that would be internal.
4) Now there is zlist_detach and zlist_delete. How about zlist_detach_current and zlist_delete_current for the cursor based flavour of the same? (needs 3)
5) zlist_set_comparator for a container wide compare function.
6) zlist_sort could allow NULL as second argument to use the container compare (or compare by address, however useless that is :). (needs 5)
7) zlist_find can be added without name collision. (needs 5)
8) zlist_detach and zlist_delete need to specify their effect on the cursor.
9) I would suggest having zlist_detach and zlist_delete use the container compare function.
10) I would suggest having zlist_detach and zlist_delete use zlist_find + zlist_*_current. (needs 7, solves 9 implicitly)
11) zlist_t still has a bool autofree;. The old behaviour could be achieved by having zlist_autofree set a duplicator (strdup) and destructor (something that calls free) instead.
12) Can we add the void *user argument to czmq_destructor and czmq_comparator? That would be really useful for keeping items in 2 containers like zcertstore does, among other things.
13) zlist_first, zlist_next and zlist_last each duplicate zlist_item at the end.
14) zlist_insert_before could be added without name collision. (needs 3)
15) zlist_insert(_after) could be added without name collision.
16) zlist_insert_sorted could be added without name collision. (needs 3+5+7)
17) zlist_last is a problem for 4 since it updates the cursor. 
Consider this stupid way to purge the list in reverse order: while (zlist_last (list)) zlist_delete_current (list); Either zlist_last needs to search the whole list to set the cursor or zlist_delete_current needs to search the whole list to set the tail. If you are changing the cursor behaviour anyway (see 18) then maybe we could make zlist_last not change the cursor. I don't see any use for that and it would keep the complexity at O(1) instead of O(n). If someone needs to process a zlist from the back like that then they should be using a zring instead. So I think this would be safe. 18) You've changed the API, the cursor behaviour, slightly already. I wonder if any code relies on the destructive cursor behaviour of zlist_push, zlist_append, zlist_remove (and any others I forgot). It is possible to have code like this: void append_item_and_print_list (zlist_t *list, item_t *item) { zlist_append (list, item); do { item = zlist_next (list); print_item (item); } while (item); } The loop relies on the fact that zlist_append sets the cursor to NULL and zlist_next then restarts it from the beginning. This would now break and only print the new item instead of all items. MfG Goswin On Thu, Sep 04, 2014 at 09:00:44PM +0200, Pieter Hintjens wrote: :-) I started, by coincidence, to make some of these changes already. Hope to have this ready soon. Nothing dramatic, just the global callbacks. We can take the other changes one by one. Breaking the v2 API is not an option, and renaming zlist to another name isn't plausible either IMO. So we may rename some methods, like zlist_dup, which make broken assumptions (autofree, in that case). -Pieter On Thu, Sep 4, 2014 at 5:38 PM, Goswin von Brederlow goswin-...@web.de wrote: On Thu, Sep 04, 2014 at 04:59:19PM +0200, Goswin von Brederlow wrote: Hi, I've gone over the zlist API and redesigned it according to some ideas we had last month (partly for the zring class). 
I tried this out on zlist to see how the ideas would fit but the idea would be to change all the classes where appropriate so they behave the same. In case it wasn't clear this is just an experiment to see how it could work. To see if it makes more sense that way. No attempt is made to preserve backward compatibility and none is meant. I wanted to design the cursor behaviour and global callbacks cleanly without having to worry about making them work parallel to the old API. It might be possible to do that though. MfG Goswin
Re: [zeromq-dev] CZMQ: Error checking of z*_new() calls in other z*_new() functions
On Fri, Sep 05, 2014 at 01:35:54PM +0200, Pieter Hintjens wrote: What do you think of this style, for instance (slight change to ziflist constructor): self->names = zlist_new (); if (self->names) self->addresses = zlist_new (); if (self->addresses) self->netmasks = zlist_new (); if (self->netmasks) self->broadcasts = zlist_new (); if (self->broadcasts) ziflist_reload (self); else ziflist_destroy (&self); ...? It's clearly correct by instant visual inspection, and trivial to maintain. Neat trick. That wouldn't work for allocating self itself though. How about this: #define CHECK(x) if (!(x)) goto out CHECK (self = zmalloc (sizeof (self_t))); CHECK (self->names = zlist_new ()); CHECK (self->addresses = zlist_new ()); CHECK (self->netmasks = zlist_new ()); CHECK (self->broadcasts = zlist_new ()); CHECK (ziflist_reload (self)); ... return self; out: ziflist_destroy (&self); return self; On Thu, Sep 4, 2014 at 8:55 PM, Pieter Hintjens p...@imatix.com wrote: I agree, and it was meant to be safe to call the destructor on an unfinished object. I don't like the long lists of checks, they are a maintenance issue. It might be worth finding a better code style for constructors that avoids this duplication and yet gives us clean all-or-nothing construction. A goto might be neat. Or a series of chained if statements. I'll experiment... On Thu, Sep 4, 2014 at 4:06 PM, Goswin von Brederlow goswin-...@web.de wrote: On Thu, Sep 04, 2014 at 11:51:27AM +0200, Pieter Hintjens wrote: I'd prefer assertions in most places. The problem is that adding full error handling on every allocation creates more complex code and overall increases the risk of other errors. There are a few places where it's worth catching heap exhaustion, and that is on message handling, as you'll see this done in zmsg and zframe for instance. 
However for the most part I don't think application developers are competent to deal with out-of-memory situations correctly, and the best strategy is to fail loudly and force the machine or architecture to be fixed. So we can start by asserting inside constructors, and then fix those constructors we want to, to return NULL on failures. What makes it complex is having to check every allocation and then error out if it failed. So you get tons of extra lines all doing the same (other than the check), possibly with more and more cleanup. One thing that I think keeps this manageable is to not check every allocation by itself. Instead do all allocations in a block, then check if any of them has failed and only then start configuring or using them. Also the fact that destructors (and free) are safe to call with a NULL pointer helps. So if any allocation fails you simply call the class's destructor and it will do any cleanup necessary. No need to check which resources were allocated and which not. Example: peer_t * peer_new (zframe_t *own_identity, zframe_t *peer_identity, proto_t *proto) { peer_t *self = (peer_t *) zmalloc (sizeof (peer_t)); if (self) { // allocate stuff self->own_identity = zframe_strhex (own_identity); if (peer_identity) { self->peer_identity = zframe_strhex (peer_identity); } else { self->peer_identity = NULL; } self->proto = proto; self->timeout_handle = proto_timeout_add (self->proto, MIN_TIMEOUT, self); self->out_msg = zlist_new (); self->in_msg = zlist_new (); self->requests = zhash_new (); // did any allocation fail? 
if (!self->own_identity || (peer_identity && !self->peer_identity) || !self->timeout_handle || !self->out_msg || !self->in_msg || !self->requests) { peer_destroy (&self); return NULL; } // safe to use allocated stuff (which this example doesn't :) self->last_send_sequence = 0; self->acked_send_sequence = 0; self->max_recv_sequence = 0; self->acked_recv_sequence = 0; self->missed_beats = 0; self->current_timeout = MIN_TIMEOUT; self->peer_state = ACTIVE; self->own_state = MESSAGE; self->need_to_ack = false; } return self; } I would prefer checks to asserts, and asserts to a crash and burn at some later time. I am able to deal with out-of-memory for the most part and an assert would prevent me from doing that. And if the code doesn't lend itself to handling out-of-memory I can always just assert(). That's my choice then. MfG Goswin
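The allocate-everything-then-check-once style discussed above can be shown as compilable plain C. The struct and its fields are invented for illustration; the key enabler is a destructor that is NULL-safe and safe to call on a half-built object:

```c
#include <assert.h>
#include <stdlib.h>

//  Sketch of "allocate in a block, check once, destroy on failure".
//  thing_t and its fields are invented for illustration.

typedef struct {
    char *name;
    int *counters;
} thing_t;

static void
thing_destroy (thing_t **self_p)
{
    if (self_p && *self_p) {
        thing_t *self = *self_p;
        free (self->name);          //  free (NULL) is a no-op
        free (self->counters);
        free (self);
        *self_p = NULL;
    }
}

static thing_t *
thing_new (void)
{
    thing_t *self = calloc (1, sizeof (thing_t));
    if (self) {
        //  Allocate everything in one block...
        self->name = malloc (32);
        self->counters = calloc (16, sizeof (int));
        //  ...then check once; the destructor frees whatever exists
        if (!self->name || !self->counters) {
            thing_destroy (&self);
            return NULL;
        }
        //  Now it is safe to configure the allocated parts
        self->name[0] = '\0';
    }
    return self;
}
```

Because calloc zeroes the struct, any field that never got assigned is NULL, which is exactly what the destructor and free tolerate.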
Re: [zeromq-dev] Proposal for a revised zlist API
On Fri, Sep 05, 2014 at 03:56:56PM +0200, Goswin von Brederlow wrote: 12) Can we add the void *user argument to czmq_destructor and czmq_comparator? That would be really useful for keeping items in 2 containers like zcertstore does, among other things. czmq_destructor and czmq_duplicator I mean.
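A minimal sketch of what such a user argument to a destructor callback enables. All names here are invented for illustration, not CZMQ API: the callback reaches a second container through user and unregisters the item there before freeing it, in the spirit of the zcertstore list+hash setup:

```c
#include <assert.h>
#include <stdlib.h>

//  Destructor callback with a user argument (invented names):
//  `user` points at the owning store, so destroying an item can
//  also unregister it from a second container.

typedef void (free_fn) (void **item_p, void *user);

typedef struct {
    void *index[8];     //  stand-in for the second container (hash)
} store_t;

static void
s_item_destroy (void **item_p, void *user)
{
    store_t *store = user;
    if (*item_p) {
        //  First unregister the item from the store's index...
        for (int slot = 0; slot < 8; slot++)
            if (store->index[slot] == *item_p)
                store->index[slot] = NULL;
        //  ...then destroy the item itself
        free (*item_p);
        *item_p = NULL;
    }
}
```

Without the user argument the callback could only see the item, and every item would need its own back-pointer to the store.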
Re: [zeromq-dev] Proposal for a revised zlist API
On Fri, Sep 05, 2014 at 05:49:20PM +0200, Pieter Hintjens wrote: On Fri, Sep 5, 2014 at 3:56 PM, Goswin von Brederlow goswin-...@web.de wrote: Just saw your commit. You only found 17 issues? :-) I was coding till 1am, surprising it worked at all. 1) Small typos Fixed. 2) zlist_purge does not destroy the cursor allowing zlist_next and zlist_item access to freed memory. Nice catch, fixed. 3) change cursor to node_t ** but that would be internal. I'm not yet sure of that change, will try something else first (separate head/tail item as we have in zring, I think). The insert_before needs access to the previous node->next. You could have the cursor always point to the previous node and handle the special cases of an empty list or one with just one item. Using the 2-stars trick avoids those special cases. 4) Now there is zlist_detach and zlist_delete. How about zlist_detach_current and zlist_delete_current for the cursor based flavour of the same? (needs 3) Not needed, you pass NULL as item and that does the job. I don't like long method names either... clumsy to use. Great, didn't think of that. 5) zlist_set_comparator for a container wide compare function. Tried that and decided against it, due to not wanting to break zlist_sort and not really seeing the benefit (in fact it's an extra line of code and extra risk of errors). I didn't add that for sort but for find. :) Find is probably done far more often than sort and I use find in detach and delete too. zlist_find can be used in zloop, zmsg, zpoller and zsys (some should be rewritten to zlist_delete/detach) if that uses the comparator. 6) zlist_sort could allow NULL as second argument to use the container compare (or compare by address, however useless that is :). (needs 5) Yes, that was my second idea, except it leaves the method messy and I still didn't see the benefit. It adds one line to the code if the comparator is initialized with a physical equality. Didn't seem so bad. 
7) zlist_find can be added without name collision. (needs 5) So searching on content rather than item value, yes, it could work. zloop searches by timer_id and socket. zsys searches on handler. That's 3 uses already just in czmq itself. 8) zlist_detach and zlist_delete need to specify their effect on the cursor. zlist_detach and zlist_delete are works in progress, I left this for today as I had other work. They should end up working and looking like their zring counterparts. 9) I would suggest having zlist_detach and zlist_delete use the container compare function. OK, I'm partially convinced, however I don't have a use case; this is engineering because it can work, which I'd rather avoid. So far comparison on the item void * has always worked fine. - zsys_close does detach sockets by their handle instead of item. - zloop deletes readers by their socket, pollers by socket or fd 10) I would suggest having zlist_detach and zlist_delete use zlist_find + zlist_*_current. (needs 7, solves 9 implicitly) Again, problem statements first, solutions second. The problem is code duplication vs. reuse. Both delete and detach currently implement their own find loop which would be identical to zlist_find. 3 times the same code is 3 times the work to maintain. 11) zlist_t still has a bool autofree;. The old behaviour could be achieved by having zlist_autofree set a duplicator (strdup) and destructor (something that calls free) instead. Indeed. I'd randomly thought that and didn't do it yet. My goal today was the general shape of the API and keeping existing code from breaking. 12) Can we add the void *user argument to czmq_destructor and czmq_comparator? That would be really useful for keeping items in 2 containers like zcertstore does, among other things. Haven't understood the problem yet enough to evaluate the addition. I'll get there eventually, if you can wait. I came to this idea because I needed it for my protocol implementation. 
The timeout handler for an item needs access to the overall peer structure (to send heartbeats to the peer's socket) and protocol structures (to remove dead peers after too many timeouts). I didn't want to store the same identical pointer in thousands if not millions of items. Storing it once in the container seemed smarter and uses far less code. Check out how I changed the zcertstore to use a container-wide destructor on the zlist for a simple use case. Certificates are kept both in a list and a hashtable. The list is set up so that deleting a cert from the list will automatically delete the cert from the hashtable too and then destroy it. This can also be set up so that deleting from either container deletes from both. 13) zlist_first, zlist_next and zlist_last each duplicate zlist_item at the end. Same in zring. I'll fix that. zlist_head too. 14) zlist_insert_before could be added without name collision. (needs 3) I've
Re: [zeromq-dev] CZMQ: Error checking of z*_new() calls in other z*_new() functions
On Thu, Sep 04, 2014 at 11:51:27AM +0200, Pieter Hintjens wrote: I'd prefer assertions in most places. The problem is that adding full error handling on every allocation creates more complex code and overall increases the risk of other errors. There are a few places where it's worth catching heap exhaustion, and that is on message handling, as you'll see this done in zmsg and zframe for instance. However for the most part I don't think application developers are competent to deal with out-of-memory situations correctly, and the best strategy is to fail loudly and force the machine or architecture to be fixed. So we can start by asserting inside constructors, and then fix those constructors we want to, to return NULL on failures. What makes it complex is having to check every allocation and then error out if it failed. So you get tons of extra lines all doing the same (other than the check), possibly with more and more cleanup. One thing that I think keeps this manageable is to not check every allocation by itself. Instead do all allocations in a block, then check if any of them has failed and only then start configuring or using them. Also the fact that destructors (and free) are safe to call with a NULL pointer helps. So if any allocation fails you simply call the class's destructor and it will do any cleanup necessary. No need to check which resources were allocated and which not. Example: peer_t * peer_new (zframe_t *own_identity, zframe_t *peer_identity, proto_t *proto) { peer_t *self = (peer_t *) zmalloc (sizeof (peer_t)); if (self) { // allocate stuff self->own_identity = zframe_strhex (own_identity); if (peer_identity) { self->peer_identity = zframe_strhex (peer_identity); } else { self->peer_identity = NULL; } self->proto = proto; self->timeout_handle = proto_timeout_add (self->proto, MIN_TIMEOUT, self); self->out_msg = zlist_new (); self->in_msg = zlist_new (); self->requests = zhash_new (); // did any allocation fail? 
if (!self->own_identity || (peer_identity && !self->peer_identity) || !self->timeout_handle || !self->out_msg || !self->in_msg || !self->requests) { peer_destroy (&self); return NULL; } // safe to use allocated stuff (which this example doesn't :) self->last_send_sequence = 0; self->acked_send_sequence = 0; self->max_recv_sequence = 0; self->acked_recv_sequence = 0; self->missed_beats = 0; self->current_timeout = MIN_TIMEOUT; self->peer_state = ACTIVE; self->own_state = MESSAGE; self->need_to_ack = false; } return self; } I would prefer checks to asserts, and asserts to a crash and burn at some later time. I am able to deal with out-of-memory for the most part and an assert would prevent me from doing that. And if the code doesn't lend itself to handling out-of-memory I can always just assert(). That's my choice then. MfG Goswin
[zeromq-dev] Proposal for a revised zlist API
Hi, I've gone over the zlist API and redesigned it according to some ideas we had last month (partly for the zring class). I tried this out on zlist to see how the ideas would fit but the idea would be to change all the classes where appropriate so they behave the same. Preparing for that I defined common types: -- typedef int (zcommon_compare_fn) (void *item1, void *item2); typedef void (zcommon_free_fn) (void **item, void *user); typedef void * (zcommon_dup_fn) (void *item, void *user); The first change is that the free callback now takes a void ** so that it is compatible with destructors. The other (and new) is that both free and dup take a second argument. I've added a new field void *user to the container. The field is initialized to the address of the container itself but it is free for use by the user for whatever they want. The field is then passed to the free and dup callbacks whenever they get invoked. For an example of how this is useful look at zcertstore.c: static void s_zcertstore_cert_destroy (zcert_t **cert_p, zcertstore_t *store) { zcert_t *cert = *cert_p; if (cert) { zhash_delete (store->cert_hash, zcert_public_txt (cert)); zcert_destroy (cert_p); } } Without this the zcertstore could not use a free callback but would have to destroy certs manually. Changed cursor handling: I changed the cursor to point at the pointer to the current object instead of directly pointing to the current object. This simplifies several things:
- no special cases for empty lists
- allows inserting items before the current one
- allows removing the current item without destroying the cursor
Overall the cursor is more stable and will be kept valid across modifications. I renamed zlist_push to zlist_prepend because I think it better describes what happens and is used in other classes too (e.g. zmsg_prepend). 
I implemented a destructive removal (zlist_remove) and a non-destructive removal (zlist_take), both of which act on the current item pointed to by the cursor and will advance the cursor to the next. Since zlist_remove no longer takes a void *item argument, specific items can't be removed anymore without getting the cursor pointed at them. To make this simple zlist_find() will use the compare callback (or item address) to advance the cursor to the item and also return the item. Note: By setting the compare callback and passing the right argument (cast to void *) to zlist_find one can find items by different methods. For example in zloop.c readers are searched by their sock field: static int s_reader_compare (void *reader_, void *sock_) { s_reader_t *reader = reader_; assert (reader); zsock_t *sock = (zsock_t *) sock_; if (reader->sock == sock) return 0; else return 1; } I also added a zlist_insert_sorted that uses the container's compare function to skip all items less than the new one and then insert it. I used this in zdir.c to create the directory and file lists in sorted order from the start instead of sorting them multiple times later. This makes things more expensive [O(n^2) instead of O(n * log n)] but I assumed directories are kept small enough that that doesn't matter. Don't use it for news or mail dirs. If it becomes a problem it can be changed to sort once after building the lists instead of on insert. Effect on other classes: A lot of classes used zlist_pop in a loop to remove all items. Most of those I changed to use a free callback instead. Others I changed to use zlist_take or zlist_remove depending on context. One thing that might be worth adding to zlist (and other containers) is a function to empty the container without destroying the container itself. This seems to get used a few times and currently requires calling zlist_remove in a loop till the list is empty. The other thing that I saw quite often is iterating over the contents of a list. 
Maybe it would make sense to provide a macro for this, like FOR_EACH (zlist, item_t *, item) { // do something with item } If so I would define FOR_EACH and TAKE_EACH. FOR_EACH leaves the list unchanged while TAKE_EACH removes but does not destroy each item. Anyway, I've pushed all the zlist changes into a branch of my czmq fork: https://github.com/Q-Leap-Networks/czmq/commits/feature-zlist I've hopefully changed all other classes correctly to follow the new API. They pass all the self tests at least. Have a look and try it out and let me know if you think this improves the API. MfG Goswin
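The FOR_EACH idea above could be sketched roughly as below, written against a toy cursor-based list so the example is self-contained. The macro and the toy list are illustrative only, not the proposed CZMQ implementation:

```c
#include <assert.h>
#include <stddef.h>

//  Toy singly linked list with a cursor, just enough to carry the
//  zlist_first/zlist_next shape the macro relies on.
typedef struct node_t {
    struct node_t *next;
    void *item;
} node_t;

typedef struct {
    node_t *head;
    node_t *cursor;
} list_t;

static void *
list_first (list_t *self)
{
    self->cursor = self->head;
    return self->cursor ? self->cursor->item : NULL;
}

static void *
list_next (list_t *self)
{
    if (self->cursor)
        self->cursor = self->cursor->next;
    return self->cursor ? self->cursor->item : NULL;
}

//  Iterate over all items; the loop ends when the cursor runs off
//  the tail and list_next returns NULL.
#define FOR_EACH(list, type, var) \
    for (type var = (type) list_first (list); var; \
         var = (type) list_next (list))
```

One caveat such a macro inherits from the cursor design: the cursor lives in the container, so nesting two FOR_EACH loops over the same list would not work.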
Re: [zeromq-dev] Proposal for a revised zlist API
On Thu, Sep 04, 2014 at 04:59:19PM +0200, Goswin von Brederlow wrote: Hi, I've gone over the zlist API and redesigned it according to some ideas we had last month (partly for the zring class). I tried this out on zlist to see how the ideas would fit but the idea would be to change all the classes where appropriate so they behave the same. In case it wasn't clear this is just an experiment to see how it could work. To see if it makes more sense that way. No attempt is made to preserve backward compatibility and none is meant. I wanted to design the cursor behaviour and global callbacks cleanly without having to worry about making them work parallel to the old API. It might be possible to do that though. MfG Goswin
Re: [zeromq-dev] CZMQ: why aren't zframes reference counted?
Oh, maybe that came across wrong. I don't intend to refcount zmsg or zframe. My intention was to utilize the already existing refcount in the zmq_msg_t object, i.e. refcount the data blob itself and only those blobs. All the metadata (zframe and zmsg) would still be copied. I'm fine with duplicating the metadata. Just duplicating 1MB data blobs in messages seems out of the question. All that really needs is a function to create a new zframe from an existing one (zframe_clone) that will share the underlying zmq_msg_t. The rest could be written from existing functions. But I think they could be useful for others so I would like to include them in czmq too. On Mon, Sep 01, 2014 at 05:25:23PM +0200, Pieter Hintjens wrote: OK, a few ideas then. zframe is the wrong level to do reference counting; it should happen at zmsg or a similar multiframe class. Second, you are doing your own queuing and acking, so you can design your own ref counted class on top of zframe. I'd probably not even use zframe then, use zmq_msg directly. On Mon, Sep 1, 2014 at 5:11 PM, Goswin von Brederlow goswin-...@web.de wrote: The zmsg_send_keep() would be using ZFRAME_REUSE. I considered that. But when I send a message to all (or many) peers it gets put into each peer's outgoing list, prefixed by the peer's sequence number, and only gets removed when the peer ACKs the message. Even if I use ZFRAME_REUSE I need a way to know when a message can be destroyed. That means either reference counting the message itself or using a shallow copy since the underlying zmq_msg_t already reference counts (which then means every peer can destroy its shallow copy when it gets an ACK). I considered keeping a global list of all pending messages with a list of peers that are pending and removing peers from the list as they ACK. But that would result in much more complex code.
On Mon, Sep 01, 2014 at 04:14:44PM +0200, Pieter Hintjens wrote: There is an option on zframe_send (ZFRAME_REUSE) which side-steps the destruction. It's a simple exception that lets you send the same frame many times, without the machinery of reference counting, duplicating, etc. On Mon, Sep 1, 2014 at 2:57 PM, Goswin von Brederlow goswin-...@web.de wrote: Hi, the zframe_t type is based on the zmq_msg_t type. But while zmq_msg_t uses reference counting there is no function to clone a zframe_t using a shared zmq_msg_t object. Consequently there is no way to make a shallow copy of a zmsg_t (which holds a zlist of zframe_t). Also sending a zmsg_t is destructive. I need to send the same message (except a different identity frame at the start) to many peers and possibly resend it when a peer reconnects. Currently that means duplicating every message many (100-1000) times which wastes both ram and cpu a lot. Any objections to adding three functions:

    // Create a new zframe_t that shares the same content as frame.
    zframe_t *zframe_clone (zframe_t *frame);

    // Create a new zmsg_t that is a shallow copy of msg.
    // Frame contents are shared, see zframe_clone.
    zmsg_t *zmsg_shallow_dup (zmsg_t *msg);

    // Send message to socket, but do not destroy after sending.
    int zmsg_send_keep (zmsg_t *msg, void *dest);

MfG Goswin
Re: [zeromq-dev] Ideas for a zfork class
On Tue, Sep 02, 2014 at 05:54:47PM -0300, Rodrigo Mosconi wrote: On Tue, Sep 2, 2014 at 2:42 PM, Pieter Hintjens p...@imatix.com wrote: Also, Zyre now supports IPC and inproc between nodes, if you use the gossip discovery feature. One of my goals eventually is to allow actors to start as external processes, connected over IPC. However there's a non-trivial layer of process control involved. That layer of process control is my target. And I agree, it's not trivial. So I think your idea is a good one (I don't like the name as fork is a Unixism and doesn't really help). Ok, zspawn? https://github.com/savke/ztask0 may help, Martin Vala has been working on this problem for a while. I looked now at ztask0, and in ztask_job_proc_exec it forks and execs a new program. I could change it to create new programs, one for each task type, and start it with a pre-defined ID on the command line (like sendmail libexecs). Problem with this approach: on the master, I already load/parsed the config file and completed some parameters with random values, for example: ipc endpoint. This could be solved with a more deterministic parameter generation. Why not pass the socket to connect to as argument? You can then send any other config data when the process connects and says hello. MfG Goswin
Re: [zeromq-dev] Ideas for a zfork class
On Tue, Sep 02, 2014 at 10:32:21AM -0400, Chad Koski wrote: Hi Rodrigo, You might take a look at zyre (https://github.com/zeromq/zyre) to see what you can apply from there before adding a new class. It might save you some work. zyre can handle the discovery, identification, heart beating and message passing between the master and worker (peers in the zyre network). I'm not sure if it's exactly what you are looking for, but if it gets you part of the way there, seems like it's at least worth a look. At the least, it should be able to give you some ideas on how to proceed if it's not a good fit as-is. Chad On Sep 2, 2014, at 9:50 AM, Rodrigo Mosconi mosconi@gmail.com wrote: Hi all, I use zthreads a lot (I know that it's deprecated, but I can't change to zactor now) on a project. But for some external reasons, I need to fork some zmq workers and I would like to manage this fork like zthread/zactor. Let's call it zfork, and I would like to exchange some thoughts before starting to develop it. Some aspects that I already thought about: 1 - master/worker intercommunication can't rely on a ZMQ_PAIR socket; maybe a ROUTER/DEALER over IPC approach could be better. 2 - The master could generate an identification for the slave 3 - presence of some sort of heart-beating That basically means you want a zactor. The zactor then can fork the process in turn and handle the heart-beating. 4 - the master may act like a broker over the control/parent-child zsocket 5 - if over IPC, the master could set filters (ZMQ_IPC_FILTER_...) on the socket Any other ideas or comments? Mosconi MfG Goswin
Re: [zeromq-dev] How to disconnect a peer?
That doesn't answer my question though. I want to really cut the peer's connection and free all its resources. On Thu, Aug 28, 2014 at 07:18:11PM +0200, Pieter Hintjens wrote: My usual strategy is to delete the peer after some timeout, and if the peer returns, then treat it as a new peer. Secondly, treat an out-of-order message as an error so that the client side can re-initialize itself correctly. As an example, see FILEMQ. On Thu, Aug 28, 2014 at 5:33 PM, Goswin von Brederlow goswin-...@web.de wrote: How do I disconnect a single peer from a ROUTER socket? The closest in the API seems to be zmq_disconnect(). But a) I didn't connect to the peer, the peer connected to me, and b) I don't have an endpoint, only a peer identity. And I can't tell the peer to disconnect since it is unresponsive. Is that something that should be there but is missing? MfG Goswin
[zeromq-dev] router probing not probing when reconnecting
Hi, to me it looks like the probe router socket option only probes on the initial connect but not when reconnecting. I have a server and a client process. The client calls (czmq code):

    zsocket_set_probe_router (zsock_resolve (self.sock), 1);

After that the server receives a message containing 2 frames, the client's identity and a zero frame, as expected. The client and server then do some heartbeating. Now I stop the server and restart it. I would expect the server to again receive a message containing 2 frames, the client's identity and a zero frame. Instead I get a heartbeat from the client's backlog. Is that the intended behaviour or a bug? MfG Goswin
[zeromq-dev] CZMQ: why aren't zframes reference counted?
Hi, the zframe_t type is based on the zmq_msg_t type. But while zmq_msg_t uses reference counting there is no function to clone a zframe_t using a shared zmq_msg_t object. Consequently there is no way to make a shallow copy of a zmsg_t (which holds a zlist of zframe_t). Also sending a zmsg_t is destructive. I need to send the same message (except a different identity frame at the start) to many peers and possibly resend it when a peer reconnects. Currently that means duplicating every message many (100-1000) times which wastes both ram and cpu a lot. Any objections to adding three functions:

    // Create a new zframe_t that shares the same content as frame.
    zframe_t *zframe_clone (zframe_t *frame);

    // Create a new zmsg_t that is a shallow copy of msg.
    // Frame contents are shared, see zframe_clone.
    zmsg_t *zmsg_shallow_dup (zmsg_t *msg);

    // Send message to socket, but do not destroy after sending.
    int zmsg_send_keep (zmsg_t *msg, void *dest);

MfG Goswin
Re: [zeromq-dev] Erlang style messaging
On Thu, Aug 28, 2014 at 08:58:49AM -0600, Steve Murphy wrote: Pieter-- Last year, I read the book, Programming Erlang, by Joe Armstrong, and I was fascinated by the ideology behind the general thread-per-object approach, where each object is managed by its own thread, via message passing. Erlang has a really involved message passing scheme, involving pattern matching, a mailbox, a receive timer, a save queue. Needless to say, all this makes a very powerful way of prioritizing messages, so a busy object manager can pull high-priority requests from the mailbox and act on them immediately, saving lower priority requests for later. I see in a paper at http://zeromq.org/blog:multithreading-magic the same sort of admiration for Erlang's methodology. But... I'm not seeing the cure-all, end-all solution to concurrency problems, and it bothers me, because I'm probably missing something fundamental, something I should have picked up on, but didn't. Erlang allows other processes/threads to drop requests in an object's mailbox, but it also has a mechanism for waiting until the action is complete, as the object can send a response. It's this response wait that is the killer. Now, I've done a lot of work on/with Asterisk, and it is heavily multithreaded, in the old-school way, and has a ton of critical sections, and locks, and multiple locks for a single action. They have evolved some fairly involved strategies to avoid deadlocks, including putting a timer on the lock, and if it times out, giving up the lock they already have, and starting over, allowing the contending party to obtain the lock they need, finish their thing, which allows you to continue and obtain the lock you need to do your thing. And on and on it goes. Now, I didn't go and look up the particulars about the N-party dance, etc., but the classic resource deadlock situations still seem in play when you have to wait for completion. A asks B to complete a task, and waits for him to respond.
In order to get that done, B requests A for something, and waits forever for A to finish. And so on. Perhaps C or even D are also involved. I keep thinking that such basic situations aren't solved by switching to the Erlang methods. There must be some architectural, perhaps hierarchical organizing, some sort of general design practice, that can overcome these kinds of problems; I'm just blind to it at the moment. Situations like 'atomic' changes on two or more objects at once, etc. I don't see in the fog how Erlang solves these problems in general. Can someone point me to some literature that might make things clear? murf You can't solve the problem. As soon as you have a wait-for-completion/reply operation it is always possible to have a deadlock. Any time something can block you can have a deadlock. You can analyze the code to see if deadlocks are possible but that gets complex quickly. You can avoid the situation by not blocking, which means you can never change 2 objects atomically. That limits what you can do. You can give objects an order (A B C) and only allow locking of objects in order. So a thread that holds a lock on B can never ever lock object A. That makes it impossible to deadlock. But proving that your source will never violate the order is hard and at runtime all you can do is give an error, often an unrecoverable one (since the case should never happen in the first place nobody writes code to recover from it). You can also hide groups of objects behind a single gatekeeper that never deadlocks and can't be locked. The problem with deadlocks is that a thread can block waiting while blocking something else (or the same thing). If all access to a group of objects goes through a gatekeeper and the gatekeeper itself can't deadlock then the whole can't deadlock. The gatekeeper acts as a lock for the whole group. The drawback is that it acts as a lock for the whole group.
You lose parallelism unless the gatekeeper is clever enough to run non-conflicting operations in parallel. So in reality you have to balance between locking individual objects and risking deadlocks versus locking groups and being slow. MfG Goswin
Re: [zeromq-dev] First draft for PPPP rfc (paranoid pirate publish protocol)
On Thu, Aug 28, 2014 at 06:40:10AM -0400, André Caron wrote: Interesting spec :-) The one thing I might be concerned about is the high volume of bandwidth required by ACK messages. There's a section in the guide that does a quick computation of the bandwidth requirements for heartbeating and the conclusion is to treat any message from a peer as a heartbeat because it's as effective and isn't as prohibitive (indeed, that's what your spec recommends for heartbeats). It might be worth coming up with a quick bandwidth estimate using the same type of formula for ACK messages. If I may, I'd recommend storing the last ACK-ed sequence number in all messages sent to a peer. This assumes your application cannot process messages out of order, but it completely eliminates the need for ACK messages: you will automatically encode the ACK into the next message/reply/error reply you send or in the next heartbeat. Any thoughts on this? Cheers, André That is what I'm doing. Look at the Message Specifications. All messages contain an ACK sequence number and optionally an array of NACK sequence numbers. So as long as there is a steady send and recv of messages the ACK will go with that. If no message is sent then the next heartbeat will carry the ACK. The overhead for ACKs is an 8 byte sequence number + 1 byte length for the zmq on-wire encoding. Another 9 bytes for the message sequence number, 2 bytes for the flags, 1-4 + 8 * num NACKs bytes if there are NACKs. So every message (or heartbeat) is at least 20 bytes larger. Some more with NACKs but that should be rare. NACKs - With tcp NACKs can only occur when a message is lost during transport. I think that is limited to a connection disconnect/reconnect while the message is in the air. In that case everything in the kernel's socket buffers is lost. In the specs I wrote: The ACK sequence number is the highest integer so that all message sequence numbers less or equal were received without gap (excluding gaps listed in NACK, see below).
I changed my mind there while implementing. Say I do have NACKs and I also have 10 messages to send. Do I then send one message with NACKs and a high ACK and the next 9 without NACKs and a lower ACK? Sending the NACKs 10 times seems wasteful. But I don't want the ACK counter to ever fall. It seems better for the ACK to be the highest number up until the first gap. The optional NACKs are then all larger than the ACK. The peer can then decide whether to send NACKs or not on a message-per-message basis and the ACK sequence number is strictly monotonic. The drawback is that the sending peer cannot destroy messages received out-of-order by the peer. If there is a gap in the stream then all messages after the gap must stay cached till the gap can be closed. Maybe everything below the highest NACK that isn't itself a NACK could be destroyed but the complexity probably outweighs the gain. MfG Goswin
[zeromq-dev] How to disconnect a peer?
Hi, I've been thinking of error scenarios for my protocol and one of the cases is a peer connecting to the server and becoming unresponsive. This could be accidental or a malicious peer. So what happens in that case? - zmq puts messages into the kernel buffers - zmq puts messages into the pipe (default 1000 messages) - No messages (real or heartbeat) come back from the client So one of 2 things happens: 1) the SNDHWM is reached and messages can't be sent 2) the peer misses too many heartbeats Both cases flag the peer first as LATE and then as DEAD. I free up all the resources for a DEAD peer that I hold, but what about the resources zmq holds? There could be up to 1000 messages in the pipe and whatever the kernel buffer holds. Also there is a file descriptor, and the peer identity and metadata are kept. That could add up to quite a bit. So here is my question: How do I disconnect a single peer from a ROUTER socket? The closest in the API seems to be zmq_disconnect(). But a) I didn't connect to the peer, the peer connected to me, and b) I don't have an endpoint, only a peer identity. And I can't tell the peer to disconnect since it is unresponsive. Is that something that should be there but is missing? MfG Goswin PS: my code would also be pretty confused when a DEAD peer suddenly becomes alive again. And the tcp timeout is generally higher than my own timeout.
Re: [zeromq-dev] An extension to CZMQ for ephemeral ports
On Mon, Aug 25, 2014 at 12:10:04AM -0600, Steve Murphy wrote: HUH? WHY did you do this? WHY the HECK would this be useful? Here is another good reason: The service might have to be reachable through a firewall, which means adding port forwarding for a range of ports. You don't want too large a range for that or you run out of ports. MfG Goswin
[zeromq-dev] First draft for PPPP rfc (paranoid pirate publish protocol)
Hi, I've written up a first draft for my protocol. It implements a reliable two-way dialog between peers and covers presence, heartbeating, request-reply processing and ACKing of messages. It is based loosely on the Paranoid Pirate Protocol. The goals are to:

* Allow peers to detect disconnection of the other peer, through the use of heartbeating.
* Allow peers to detect out-of-order messages by adding sequence numbers.
* Allow peers to detect message loss when peers reconnect and resend (or re-request) messages.
* Allow peers to acknowledge receiving a message or not-acknowledge lost messages for fast resend.
* Allow peers to associate replies with requests.
* Allow peers to publish messages without disrupting the ability of peers to associate replies with requests.

For the full specs see: https://github.com/Q-Leap-Networks/rfc/blob/pull-/spec_40.txt The RFC defines the on-wire protocol and behaviour for 2 protocol devices. One device for clients talking to a single server and one for servers talking to many clients. The devices will handle all the tedious protocol handling, acks, resends, timeouts and so on. The applications can then simply send-and-forget messages. RFCs for the devices to follow. Comments welcome. MfG Goswin
Re: [zeromq-dev] Using zmq sockets from multiple threads
On Thu, Aug 21, 2014 at 10:59:14AM -0400, Peter Durkan wrote: Thanks Charles. On Thu, Aug 21, 2014 at 10:57 AM, Charles Remes li...@chuckremes.com wrote: Yes. http://zeromq.org/area:faq cr On Aug 21, 2014, at 9:53 AM, Peter Durkan pdur...@lucerahq.com wrote: Hi, Is it safe to use sockets from multiple threads if you prevent concurrent access using a mutex? i.e.

    {
        std::lock_guard<std::mutex> lck (send_mtx_);
        ... send a message on the socket
    }

I can also create a new socket each time a new thread calls the function and use the thread_id as a key, but was wondering if using a mutex like the above would get around that. Thanks, Peter Or give each thread a socket to begin with and pass the socket to any function that will send data. MfG Goswin
Re: [zeromq-dev] Query on zmq_poll API
On Mon, Aug 18, 2014 at 05:14:19PM -0700, Badhrinath Manoharan wrote: Hi, I have the following topology Client --- Broker --- Server Both the client and server are sockets of type ZMQ_REQ while the Broker has a socket connected to the client and another socket connected to the server, both of type ZMQ_BROKER. Do you mean ZMQ_REQ, ZMQ_ROUTER, ZMQ_DEALER and ZMQ_REP? Otherwise your topology is different: Client --- Broker --- Server

    void *frontend = zmq_socket (context, ZMQ_ROUTER);
    void *backend = zmq_socket (context, ZMQ_ROUTER);
    zmq_bind (frontend, "ipc:///users/badmanoh/frontend.ipc");
    zmq_bind (backend, "ipc:///users/badmanoh/backend.ipc");
    poll_items[0].socket = frontend;
    poll_items[0].fd = 0;
    poll_items[0].events = ZMQ_POLLIN;
    poll_items[0].revents = 0;
    poll_items[0].socket = backend;
    poll_items[0].fd = 0;
    poll_items[0].events = ZMQ_POLLIN;
    poll_items[0].revents = 0;
    while (1) {
        ret = zmq_poll (poll_items, 2, -1);
        if (ret == -1) {
            printf ("zmq_poll returned -1. Error: %d\n", errno);
            return 0;
        }
        if (poll_items[0].revents & ZMQ_POLLIN) { }
        if (poll_items[0].revents & ZMQ_POLLIN) { }
    }

On the broker code, I have a zmq_poll(poll_items, 2, -1) in a while loop. I see zmq_poll notifying the first message from each socket. However subsequent messages from both the client and server sockets are not returned at all and zmq_poll just stays in an infinite loop. Could you let us know if I am missing anything? Do I need to reset the revents value as part of the first notification? Thanks Badhri After getting the initial message do you send the right reply back? A REQ socket has a strict send/recv pattern. When you post source don't remove anything. Always post a working source that shows your problem. The bits you cut out are often the bits that are broken. MfG Goswin
Re: [zeromq-dev] need to lower memory usage
On Fri, Aug 01, 2014 at 11:23:24AM -0700, Dave Peacock wrote: Have just run into this same issue. I haven't tried uclibc yet, tho thanks for that suggestion, will investigate later. For those of you running embedded linux or similar, under posix it looks like there are a few options for controlling this. * Before the thread is created, you could call pthread_attr_setstacksize (http://man7.org/linux/man-pages/man3/pthread_attr_setstacksize.3.html). This requires modifying zeromq source. What exactly is the problem with the stack size? Are you running out of address space or don't you have memory overcommit enabled? Because while the thread might get a few MB of stack, only those pages actually being used should get allocated by the kernel. So while a new thread might show 16MB of virtual memory used for the stack, the resident memory will be just the few kiB actually being used (unless you do use more, and then a small stack would segfault). MfG Goswin
Re: [zeromq-dev] Edge-triggered polling vs Level-triggered. Which one ZMQ is using? Why?
On Fri, Aug 15, 2014 at 01:22:53PM +0300, artemv.zmq wrote: hi Goswin You mentioned: With a level trigger you can poll() and then consume one message from every socket that has input available. Rinse and repeat. No socket can starve any other. Why poll every time and do 1-poller-tick/1-msg-handling? I usually do with zmq's poller something like the following:

    ...
    poller.poll();
    ...
    if (poller.pollin(0)) {
        for (;;) {
            frames = socket.recv(DONTWAIT);
            if (frames == null)
                break;
            ... // process frames here.
        }
    }
    ...

I.e. gobble messages until I can't consume any more. BR -artemv Because then one peer might starve every other. Especially when processing a request takes noticeable time. Assume everything is idle to start with. Now messages start to arrive on one socket as fast as they can. A bit later other sockets get messages too. What does your code do? The poller will return the first socket. You then process messages from that socket until they are all consumed, which never happens. So you process and process and process that one socket and never ever check the other sockets. MfG Goswin
Re: [zeromq-dev] zeromq, abort(), and high reliability environments
On Wed, Aug 13, 2014 at 09:35:18AM +1000, Michi Henning wrote: My current view on what constitutes a sane API and behavior from the library is heavily driven by what I want, as a user. That is, my C libraries are things I primarily make to use, not to sell. I think it's been about 30 years since I wrote my first C libraries, and my style and view has shifted massively since then, to what we have in cases like CZMQ today. I can attest to having undergone a similar change of view over the past 30 years :-) Mainly, the API enforces its style upwards, so that you simply *cannot* get strange code paths and bizarre arguments. If you do, your application is corrupt, or incompetent, and the library has a responsibility to stop things immediately, not allow them to continue. It is a safety cord that has proven its usefulness many times. Indeed, some of the hardest bugs to catch in recent months were from older APIs that precisely returned EINVAL on bad arguments, and where the calling code forgot to check the return code. Stuff is then... bizarrely broken and tracking that down can be insanely hard. I hear you, and there is probably not a single one true answer here. Part of the problem is C, which makes it possible to ignore error codes and just blithely stumble on regardless. In languages with exception handling, it's a different matter though, because I can force the caller to pay attention to invalid arguments. My main concern is that, by aborting in the library, it becomes very difficult to write something that needs to have high reliability. My concern is more that I can't give a proper error message and shut down cleanly. E.g. a distributed system might want to send a goodbye message to its peers before going down, or a server might want to log an error. Basically, I can be sure that my program won't dump core only if I have exercised it to the extent that all possible code paths with all possible argument values are tested under all possible combinations.
For any sizeable program (especially with lots of threads and asynchronous things going on), that can be damn near impossible. In turn, if I still want to persist, I now have to wrap the underlying C API and check all the preconditions for every API call myself, just so I can throw an exception when a pre-condition is violated instead of having the program aborted by the library. And you have to do that for the c++ API, the python API, the ruby API, the ocaml API, the Go API, the obj-c API, ... And each and every one has to match exactly, to the tiniest detail, the checks made in zmq itself. And those checks might change over time. How often do we need to duplicate argument validation that libzmq already has? But validating the pre-conditions myself may well be very difficult or very expensive. For example, the cost of verifying that a valid socket pointer is passed to every API call is quite high. Well, use a better language. E.g. with ocaml the type system won't even compile code that tries to pass a non-socket to something expecting a socket. On that note: Why is zmq using void * instead of declaring abstract types? If a context were a context_t and a socket a socket_t then even in C you couldn't accidentally pass a context in place of a socket. And yes, I've passed the wrong thing to zmq in C by accident because I got the argument order wrong and both values were void *. No compiler warning or error. It just fails at runtime. If I'm given the option of catching an exception, I may be able to recover from my own programming error, for example, by terminating only the current operation. At least, the program keeps running, instead of dumping core, and I can splatter my log with error messages or whatever I deem appropriate. The point here is that general-purpose libraries should avoid setting policy, because what should happen under certain error conditions is something that needs to be under control of the caller.
Or a server can log the request that started the sequence leading to the invalid call before going down. I hear you about the difficulty of debugging code that ignores EINVAL from API calls. But that is the price of programming in C. It's no different from making system calls and ignoring the return value; do so at your peril. But a system call is policy-free: it allows me to decide what should happen when I have passed bad arguments, instead of taking that decision away from me. At least for languages that support exceptions, I believe throwing an exception for invalid arguments is far preferable to just killing the process. Cheers, Michi. Many languages also allow using them interactively. For example I can start an ocaml toplevel and then interactively enter commands to quickly try out stuff. But as soon as I type in the wrong thing libzmq would abort the whole session, losing all the work entered before. An error or exception is
Re: [zeromq-dev] zeromq, abort(), and high reliability environments
On Wed, Aug 13, 2014 at 08:37:36AM +0200, Pieter Hintjens wrote: In our APIs we've stripped down error reporting to a minimum. libzmq with its POSIX tendencies still relies IMO far too heavily on subtle error returns (errno == EAGAIN vs. errno == EINVAL?). In higher level language bindings that can be fixed. For example in ocaml you would have

    val recv : ?(wait=true) -> socket -> msg option

Recv is a function taking an optional argument wait [which must come before socket due to language constraints] and a socket, and returning an optional message. Meaning it may return None or Some message. An error of EAGAIN or EINTR would return None (no message was received) while EINVAL or other true errors would throw an exception. CZMQ is much cleaner: a method works, or fails if there's a recoverable error, or asserts if there's an unrecoverable error. The key point there is *recoverable error*. An invalid argument to a function is clearly recoverable. The function simply does nothing and returns EINVAL. So are you contradicting yourself here? EINVAL is bad but CZMQ is better because it returns EINVAL? :) MfG Goswin
Re: [zeromq-dev] zeromq, abort(), and high reliability environments
On Wed, Aug 13, 2014 at 09:39:24AM -0500, Thomas Rodgers wrote: For the other cases where the assert happens in a background thread, I could see retrying before giving up in the event of transient errors, but there's still the fundamental complication of how you communicate the now asynchronous, hard failure back to the caller in some reliable/sane way (as was noted before, the choice the CUDA SDK made here is a great example of how not to do it). One way may be to have an abort callback that language bindings (or applications) can set. Instead of killing the program outright the abort callback would be invoked and the bindings / application can take the proper actions. If unset or if the callback returns, the real abort() can be called. But this would be for unrecoverable errors. I don't think there should be many of those in zmq. There is still a class of errors left though. A background thread can have a persistent error that isn't unrecoverable. For the simplest case, when a connection dies and the client doesn't reconnect then there is an error. It won't go away. It won't fix itself. It doesn't impact any other socket (or even other connections of the same socket). So abort/assert is really the wrong thing. But how to tell the application? Currently these kinds of errors get silently ignored in zmq. Messages get dropped on the floor in most cases. Note: I don't know of a better solution so this isn't criticism. Just an example of another class of errors. MfG Goswin
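The abort-callback proposal could look something like this minimal sketch (all names invented here; libzmq has no such hook): the installed callback gets a chance to run first, and the hard abort() only happens if no handler is set or the handler returns.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical hook: bindings install a handler that is tried before
 * the process is killed. */
typedef void (*abort_handler_fn) (const char *reason);

static abort_handler_fn s_abort_handler = NULL;

static void set_abort_handler (abort_handler_fn fn)
{
    s_abort_handler = fn;
}

static void fatal_error (const char *reason)
{
    if (s_abort_handler)
        s_abort_handler (reason);   /* give the binding a chance */
    /* handler returned (or none set): fall through to the real abort */
    abort ();
}
```

A binding for a language with exceptions could longjmp out of the handler or raise an exception instead of returning, turning the hard failure into something catchable.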
[zeromq-dev] How do I derive a zactor in CZMQ?
Hi, I'm not sure how zactor in CZMQ is to be used when extending it and following the coding style in CZMQ. The problem is that czmq seems to mix the is a and has a relationships. - Messages can be send/recv from a zmq socket. - A zsock is a struct containing a magic and a zmq socket (has a socket) - Messages can be send/recv from a zsock too (is a socket) - A zactor is a struct containing a magic and a zsock (has a zsock) - Messages can be send/recv from a zactor too (is a zsock, is a socket) This works because the low-level code knows the magics for zsock and zactor and will extract (recursively) the underlying zmq socket. So now when I want to extend zactor do I invent a new magic and create a struct containing the magic and a zactor. Do I add that magic to the low-level code too? I see two problems with that: 1) checking for each magic value in turn takes time 2) following each pointer to extract the next level takes time Seems to me like this needs a redesign using a form of inheritance. Maybe something like this: struct Base { magic_t magic; magic_t sub; /* 0 if not derived */ base_t base_data; }; struct Derived { struct Base base; /* base.sub = MAGIC_DERIVED */ magic_t sub; /* 0 if not derived */ derived_t derived_data; } Checking for a derived class still has to go through all the magics so it gets slower as you stack more classes on top of each other. But that should be rare and only the Derived class needs to know about it. On the other hand any function acting on Base can act on any derived class too. The *_destroy() functions can also refuse to destroy a derived object and derived_destroy() would set base.sub=0 before calling base_destroy(). Alternatively the magic_t sub could be the address of the destroy function. This would make them automatically unique and base_destroy() could call sub() if set to destroy the derived object. Think virtual destructor. 
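A compilable toy version of the inheritance sketch above, using the "address of the destroy function" variant as the sub marker (all names are mine, not CZMQ's):

```c
#include <assert.h>
#include <stdlib.h>

/* The base carries a `sub` slot that, when set, points at the derived
 * destructor -- a poor man's virtual destructor. */
typedef void (*destroy_fn) (void *self);

#define BASE_MAGIC 0xCAFEu

typedef struct {
    unsigned magic;
    destroy_fn sub;          /* NULL if not derived */
    int base_data;
} base_t;

typedef struct {
    base_t base;             /* base.sub = derived_destroy */
    int derived_data;
} derived_t;

static void base_destroy (void *self)
{
    base_t *b = (base_t *) self;
    assert (b->magic == BASE_MAGIC);
    if (b->sub) {            /* defer to the derived destructor once */
        destroy_fn sub = b->sub;
        b->sub = NULL;
        sub (self);
        return;
    }
    free (self);
}

static void derived_destroy (void *self)
{
    derived_t *d = (derived_t *) self;
    /* release derived resources here, then destroy the base part */
    base_destroy (&d->base);
}

static derived_t *derived_new (void)
{
    derived_t *d = calloc (1, sizeof *d);
    d->base.magic = BASE_MAGIC;
    d->base.sub = derived_destroy;
    return d;
}
```

Any function that accepts a `base_t *` can now also destroy a `derived_t` correctly, without checking a chain of magics.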
MfG Goswin
Re: [zeromq-dev] zeromq, abort(), and high reliability environments
On Mon, Aug 11, 2014 at 07:52:02AM +0100, Gerry Steele wrote: How about not sending an ack to your users until the unit of work they input has cleared the pipeline? That way the input application can decide what to do. Obviously depends on your application... What if the input application gets the SIGABRT? Zeromq should imho never fail an assertion. That should be reserved for bugs, not exceptional circumstances. Although with out of memory the application simply gets killed by the OOM killer or gets a segfault due to memory overcommit. There isn't much you can do there. My suggestion is that if you find an assertion that gets triggered then patch it out and handle the error properly and send a pull request for the fix. MfG Goswin On 9 Aug 2014 03:12, Dylan Cali calid1...@gmail.com wrote: Hey guys, What is the right way to use zeromq in high reliability environments? In certain insane/impossible situations (e.g. out of memory, out of file descriptors, etc) libzmq assertions will fail and it will abort. I came across a thread by Martin where he addresses a similar situation [1]. If I'm reading his argument correctly, the gist in general is: If it's impossible to connect due to some error, then you're dead in the water anyway. Crash loudly and immediately with the error (the Fail-Fast paradigm), fix the error, and then restart the process. I actually agree with this philosophy, but a user would say You terminated my entire application stack and didn't give me a chance to cleanup! I had very important data in memory and it's gone! This is especially the case with Java programmers who Always Expect an Exception. For example, in the case of being out of file descriptors, the jzmq bindings will abort, but a Java programmer would expect to get an Exception with the Too Many Open Files error. I guess one possible retort is: if the data in memory was so important, why didn't you have redundancy/failover/some kind of playback log?
Why did you put all your eggs in one basket assuming your process would never crash? Is that the right answer here (basically blame the user for not having disaster recovery), or is there a different/better way to address the high reliability scenario? I came across another thread where Martin gets this very complaint (zeromq aborted my application!), and basically says well, if you really, really want to, you can install a signal handler for SIGABRT, but caveat emptor [2]. To me, this is playing with fire, dangerous, and just a Bad Idea. But maybe it's worth the risk in high reliability environments? Thanks in advance for any advice or thoughts. [1] http://lists.zeromq.org/pipermail/zeromq-dev/2009-May/000784.html [2] http://lists.zeromq.org/pipermail/zeromq-dev/2011-October/013608.html MfG Goswin
Re: [zeromq-dev] get socket condition
On Mon, Aug 11, 2014 at 05:10:11PM +0800, bino oetomo wrote: Dear all ... Let's say I (using Python) have a simple: ctx = zmq.Context() socket=ctx.socket(zmq.PUSH) socket.setsockopt(zmq.SNDHWM, 10) socket.connect('tcp://127.0.0.1:9001') I know that when the HWM is reached, I'll get an EAGAIN exception. But, is there any possibility that a socket accidentally/silently crashes/dies? If so ... what is the error code? Is there any docs that explain zmq.ZMQError? I mean: - What error code (int) - What error name - Meaning of the error, or what caused the error? Sincerely -bino- There are lots of ways the underlying tcp socket can die. But zeromq will reconnect the socket again and again and again. Only problem is that you can lose messages when it dies with messages in flight. Zeromq has reliable delivery (if a message arrives then it is all of a message), not guaranteed delivery. MfG Goswin
Re: [zeromq-dev] get socket condition
On Mon, Aug 11, 2014 at 05:47:14PM +0800, bino oetomo wrote: Dear All, c/q Goswin von Brederlow .. Really appreciate your response On Mon, August 11, 2014 5:36 pm, Goswin von Brederlow wrote: There are lots of ways the underlying tcp socket can die. But zeromq will reconnect the socket again and again and again. Only problem is that you can lose messages when it dies with messages in flight. Zeromq has reliable delivery (if a message arrives then it is all of a message), not guaranteed delivery. So, for now, all I need to be concerned about is just the HWM? Sincerely -bino- A PUSH socket will block when the HWM is reached or return EAGAIN with ZMQ_DONTWAIT. Easy enough to handle. MfG Goswin
Re: [zeromq-dev] zeromq, abort(), and high reliability environments
On Mon, Aug 11, 2014 at 04:14:19PM +0200, Pieter Hintjens wrote: On Mon, Aug 11, 2014 at 11:33 AM, Goswin von Brederlow goswin-...@web.de wrote: My suggestion is that if you find an assertion that gets triggered then patch it out and handle the error properly and send a pull request for the fix. Respectfully disagree. Exceptions indicate unrecoverable failure of one kind or another. The fix depends on the case. CZMQ fwiw uses exceptions to check arguments, e.g. asserts if caller passes NULL when not allowed. This is extremely effective. If the application is misusing the API then it's incapable of handling error codes. If the caller passes NULL when not allowed that is a bug. So you can assert there. That is not what I meant. What I meant is, e.g., when a low-level recv() call returns EAGAIN because some signal occurred. zmq must not throw an exception there. Signals do happen from time to time and zmq must deal with syscalls getting interrupted (which it does). I've started adding more aggressive CZMQ-style exceptions to libzmq as well, for the options API, enabled with the --with-militant configure switch. -Pieter MfG Goswin PS: Personally I prefer an error with EINVAL to an assertion failure on bad arguments. Anything recoverable should not abort(). Easier for bindings to deal with in a meaningful way.
[zeromq-dev] forwarding a message with metadata
Hi, I hope you don't mind but I want to brainstorm a bit. I want a generic solution to handle heartbeats that I can reuse in different applications. The most transparent way for this seems to be to create a separate thread for it. So the design looks something like this: [client] PAIR --inproc-- [heartbeat thread] PAIR-DEALER --tcp-- and --tcp-- ROUTER-PAIR [heartbeat thread] --inproc-- PAIR [server] The heartbeat thread just forwards messages between the app and the outside world and back, monitoring the traffic. And when there is no traffic it inserts heartbeats at regular intervals. It also filters out incoming heartbeats from the outside world. So far this is pretty simple. But now for the interesting parts: 1) I want to use CURVE and zmq_msg_gets() When I recv() a message on the ROUTER socket it has metadata attached to it. Can I (and how) forward this message across the PAIR socket safely, keeping the metadata pointer intact so the app can still call zmq_msg_gets() on it? 2) I want to report disappeared peers to the app Basically I have normal messages and control/monitoring messages. I've been thinking that on a ROUTER socket each message starts with an identity frame. So I could send a 0 frame to indicate the message is a control message. But what do I do with the client? I can't use the same with a DEALER socket. I would have to prefix every message with 0 (control) or 1 (data) and that would require changing the application. I couldn't just plug in or remove the heartbeat without rewriting the message parsing. A short time ago we talked about using some bit in the message itself to say whether it is a control message or normal and having in-band control messages for connect/disconnect and so on. Maybe it would be a good time to design and implement something like that now and use it here? 2b) Combining 2 and 1. Can I set my own metadata for control messages? I think the lack of a User-Id on a message could be used to identify control messages. Right?
3) I want to connect/disconnect/bind/unbind or set HWM and so on The app and the DEALER/ROUTER sockets are in different threads. This makes modifying them directly impossible (never share a socket between threads, right?). So this would also need some form of control message but going the other way this time. Should this be just a custom control message that is specific to my use case or would it make sense to define a set of control messages for all the socket operations in zmq so other use cases use the same syntax? I'm thinking this could be reusable for proxies. MfG Goswin
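The type-prefix framing from point 2 could be sketched like this (my own convention for illustration, not an existing zmq wire format): every frame passing through the heartbeat thread gets a one-byte prefix, 0 for control (heartbeat, peer-gone) and 1 for application data.

```c
#include <assert.h>
#include <string.h>

enum { FRAME_CONTROL = 0, FRAME_DATA = 1 };

/* Prepend the one-byte type tag; returns the wrapped length.
 * `out` must have room for len + 1 bytes. */
static size_t frame_wrap (char *out, int type, const char *body, size_t len)
{
    out[0] = (char) type;
    memcpy (out + 1, body, len);
    return len + 1;
}

/* Split a wrapped frame back into type and body; returns the type,
 * or -1 on a malformed (empty) frame. */
static int frame_unwrap (const char *in, size_t len,
                         const char **body, size_t *body_len)
{
    if (len < 1)
        return -1;
    *body = in + 1;
    *body_len = len - 1;
    return in[0];
}
```

The downside named in the message above stands: the application on both ends has to wrap and unwrap, so the heartbeat layer is no longer transparent.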
Re: [zeromq-dev] zeromq, abort(), and high reliability environments
On Mon, Aug 11, 2014 at 04:37:18PM +0200, Goswin von Brederlow wrote: On Mon, Aug 11, 2014 at 04:14:19PM +0200, Pieter Hintjens wrote: On Mon, Aug 11, 2014 at 11:33 AM, Goswin von Brederlow goswin-...@web.de wrote: My suggestion is that if you find an assertion that gets triggered then patch it out and handle the error properly and send a pull request for the fix. Respectfully disagree. Exceptions indicate unrecoverable failure of one kind or another. The fix depends on the case. CZMQ fwiw uses exceptions to check arguments, e.g. asserts if caller passes NULL when not allowed. This is extremely effective. If the application is misusing the API then it's incapable of handling error codes. If the caller passes NULL when not allowed that is a bug. So you can assert there. That is not what I meant. What I meant is, e.g., when a low-level recv() call returns EAGAIN because some signal occurred. zmq must not throw an exception there. Must not throw an assertion there. An exception in the high-level bindings (if the language has any) could be ok. That's a matter of taste and language style. Signals do happen from time to time and zmq must deal with syscalls getting interrupted (which it does). I've started adding more aggressive CZMQ-style exceptions to libzmq as well, for the options API, enabled with the --with-militant configure switch. -Pieter MfG Goswin PS: Personally I prefer an error with EINVAL to an assertion failure on bad arguments. Anything recoverable should not abort(). Easier for bindings to deal with in a meaningful way.
Re: [zeromq-dev] Edge-triggered polling vs Level-triggered. Which one ZMQ is using? Why?
On Wed, Jul 30, 2014 at 12:32:03PM -0700, Michel Pelletier wrote: On Mon, Jul 21, 2014 at 11:23 AM, Goswin von Brederlow goswin-...@web.de wrote: On Mon, Jul 14, 2014 at 08:42:14AM -0700, Michel Pelletier wrote: I think it is a big issue. On read a level trigger is better because when you didn't consume all input then the next zmq_poll() should not block. It's too easy to accidentally run into this. Further you actually do want to only consume some input from each socket and then poll again. The reason for that is so that you can round-robin all sockets fairly, e.g. consume one message from each socket and then poll again. If you instead consume as much as possible from every socket then one socket can be flooded with messages and starve other sockets. So level trigger on read is a must for me. By the time you are calling poll the flood has already occurred, the data has arrived locally and is in memory. The polling strategy isn't a form of The RCVHWM will put a stop to the flow there. So you don't get a total flood. You can still get a bad ratio of messages handled between sockets, 1:1000 with the default HWM. flow control. At some point something has to block or drop when you don't consume the data that has arrived whether in one edge triggered call or many level triggered calls. If you're worried about flooding then credit based flow control is the way to go. -Michel No. Credit based flow control is something different. This is about simplicity. With a level trigger you can poll() and then consume one message from every socket that has input available. Rinse and repeat. No socket can starve any other. With edge trigger you poll(), consume one message from every socket that has input available and then you have a problem. You can't simply poll again. You have to remember the sockets from the last poll call and merge them with the result of the next one since sockets that had messages the last time might or might not appear again if they have more input now.
That is not just a lot harder to put into words but also much harder to implement. MfG Goswin
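The fairness argument can be shown with a toy model that needs no sockets at all (everything here is invented for illustration): queues stand in for sockets, a level-triggered "poll" reports every non-empty queue, and the loop takes one message per ready queue per round, so a flooded queue cannot starve the others.

```c
#include <assert.h>

#define NQUEUES 3

static int pending[NQUEUES] = { 1000, 1, 1 };   /* queue 0 is "flooded" */
static int served[NQUEUES];

/* Level-triggered semantics: a queue is ready as long as it is non-empty. */
static int poll_ready (int ready[NQUEUES])
{
    int n = 0;
    for (int i = 0; i < NQUEUES; i++)
        n += ready[i] = (pending[i] > 0);
    return n;
}

static void run_fair_loop (void)
{
    int ready[NQUEUES];
    while (poll_ready (ready) > 0)
        for (int i = 0; i < NQUEUES; i++)
            if (ready[i]) {              /* one message per queue per round */
                pending[i]--;
                served[i]++;
            }
}
```

The small queues are drained in the very first round even though queue 0 has a thousand messages waiting; under edge triggering the loop would additionally have to carry forward its own ready-set between polls to get the same effect.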
Re: [zeromq-dev] Can't run tests concurrently due to hardcoded port numbers
On Thu, Jul 24, 2014 at 02:27:41PM -0400, Greg Ward wrote: On 22 July 2014, Pieter Hintjens said: Good catch. We could definitely use ephemeral ports (libzmq supports that), though it would take changes to most of the test cases. OK, I'll open a bug. I'll see about starting on a patch too. From the zmq_tcp man page: The TCP port number may be specified by: · A numeric value, usually above 1024 on POSIX systems. · The wild-card *, meaning a system-assigned ephemeral port. When using ephemeral ports, the caller should retrieve the actual assigned port using the ZMQ_LAST_ENDPOINT socket option. See zmq_getsockopt(3) for details. Sounds doable, but tedious. I predict I'll do 4 test cases before I wander off into factoring out the boring part. ;-) Thanks! Greg Most tests create a socket pair so I guess it would make sense to have a helper function void *socks[2]; bool res = make_tcp_pair(socks, ZMQ_PUB, ZMQ_SUB); assert(res); MfG Goswin
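For illustration, here is the same system-assigned-port mechanism in plain BSD sockets (not libzmq code): bind to port 0 and read the kernel's choice back with getsockname(), which is what binding to a wild-card port plus ZMQ_LAST_ENDPOINT does for you at the zmq level.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a loopback TCP socket to an ephemeral port.
 * Returns the fd and stores the assigned port, or returns -1 on error. */
static int bind_ephemeral (int *port_out)
{
    int fd = socket (AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    struct sockaddr_in addr;
    memset (&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
    addr.sin_port = 0;                        /* 0 = let the kernel choose */
    if (bind (fd, (struct sockaddr *) &addr, sizeof addr) < 0) {
        close (fd);
        return -1;
    }
    socklen_t len = sizeof addr;
    if (getsockname (fd, (struct sockaddr *) &addr, &len) < 0) {
        close (fd);
        return -1;
    }
    *port_out = ntohs (addr.sin_port);        /* the assigned port */
    return fd;
}
```

A `make_tcp_pair` helper for the tests would do the zmq equivalent: bind one socket to `tcp://127.0.0.1:*`, read ZMQ_LAST_ENDPOINT, and connect the other socket to that string.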
Re: [zeromq-dev] pub/sub and multicast
On Wed, Jul 23, 2014 at 08:45:37PM +, McMillan, Scott A wrote: Hi, Sorry for the very basic question, I'm new to zeromq. The FAQ (http://zeromq.org/area:faq#toc0) says PUB does multicast. How should I interpret this statement? Does this mean that IP-level multicast is required to use the pub/sub pattern? No. PUB will simply unicast each message to every connected peer. So to you it looks like multicast since zmq hides the details from you. Looking at the examples, it looks like the protocol can be selected independently from the pattern? E.g., several examples contain the following, which I interpret as the pub/sub pattern on top of TCP unicast: void *publisher = zmq_socket (context, ZMQ_PUB); zmq_bind (publisher, "tcp://*:5563"); If one wanted to use multicast, then the second line would be replaced with something like: zmq_bind (publisher, "pgm://..."); Am I understanding this correctly? Thanks, Scott That is my understanding too. MfG Goswin
Re: [zeromq-dev] Looking for the best server/client layout
On Thu, Jul 24, 2014 at 02:44:31PM -0400, Greg Ward wrote: On 24 July 2014, Mike Zupan said: I'm new to zeromq on the dev side and looking for the best layout to use for a server/client setup where the server can send commands to clients and also the clients can send data back to the server without being told to run a command. Pretty much like the client checking in with some data it found on the server as it happens instead of waiting for the server to say ok run this command give me data back. I believe you need a DEALER/ROUTER pair. REQ/REP is limited in that it's a strict back-and-forth communication pattern: 1 request, 1 reply, 1 request, ad infinitum. So REQ/REP is not terribly useful in real-world apps like yours. DEALER/ROUTER is the real-world version of REQ/REP -- either peer can send a message at any time. IIUC, the typical usage is DEALER for the client connected to ROUTER on the server. Greg You can use DEALER/DEALER, DEALER/ROUTER, ROUTER/DEALER or ROUTER/ROUTER. Both kinds allow multiple peers to connect but a DEALER socket will send outgoing messages in a round-robin fashion to some peer, which peer you have no control over. Similarly, incoming messages have no origin attached to them so you don't know where they came from. On the other hand a ROUTER socket will prepend the peer identity to every incoming message and use the first frame of every outgoing message as the peer identity the message should go to. So you are told where each message came from and can say where each message should go. Most often you have a star-shaped topology. A single server will have many clients and clients only have one server. So you tend to have ROUTER on the server and DEALER on the client side. On the other hand if a server has many workers and doesn't care which worker gets each request a DEALER/DEALER setup works just fine. MfG Goswin
Re: [zeromq-dev] Multiple clients with REQ/REP
On Thu, Jul 24, 2014 at 08:30:40PM -0500, Gregg Jensen wrote: I am currently using zeromq 3.2.3 and will be upgrading to the latest soon, but for now my question relates to this older version. I have been using the REQ/REP socket types for single client to single server with great success. And according to the guide, now that I want to move up to the final goal of multiple clients to a single server, I was under the impression that I am supposed to change the clients to include the identity, and then use a ROUTER to DEALER proxy which delivers to REP worker processes that have been spawned off in separate threads. I have code that will do this, however I have some problems with the processing in the threads. But, that is not the question. I backed out the code that does the ROUTER/DEALER/workers and went back to just the one REP socket on the server that spawns the processing on a thread upon recv. I went back to run my multi-client test and bingo, it all worked great. Therein lies my question. How many client REQ connections will the standard REP socket handle? Can I leave my code with just REQ/REP and not move to REQ-ROUTER/DEALER/workers? That is, will it scale up to 100s, 1000s or even 100,000s (given good hardware) of connections? Will the client connections all stay aligned with the socket connection they make the request to? If so, then why would I want that pattern on the server of ROUTER to DEALER (proxy) out to workers in separate threads? REQ/REP works just fine with multiple peers. The only restriction is that you have a strict request/reply/request/reply/request/reply pattern. You can not handle multiple requests in parallel or send replies out of order. Also the different socket types in zmq are pretty much all the same concerning scalability I think. If zmq can't handle N peers on a REP socket then it won't be able to handle N peers on a ROUTER either. And N is 1000 for sure. The scalability problem with 10 peers will be in your server.
It can only handle one request at a time and if a request takes time that will block all other peers. Worse if a request has to wait, e.g. to extract some data from a database. So when a single request has to wait or can't be parallelized to use all cores then you can improve your scalability by switching to a REQ/ROUTER setup and handling multiple requests in parallel. You can also improve each client's performance by switching to DEALER/ROUTER and issuing multiple requests in parallel, if your problem allows for that. Note: If you, like me, are using Python then beware that Python has a global lock allowing only one Python thread to run at any given time. Multiple threads only help if threads block waiting for something. MfG Goswin
Re: [zeromq-dev] Router-Dealer Example not working in C on 4.0.4 ZMQ release
On Wed, Jul 23, 2014 at 12:50:37PM -0700, Badhrinath Manoharan wrote: Hi Peter, Thanks a lot for your response. I was able to try the same as what you had mentioned. I used the raw Zeromq APIs instead of the C binds. Below is my code and the result that I get. I still don't see them working. Please let me know where I am going wrong here. #include "/usr/local/include/zmq.h" #include <stdio.h> #include <unistd.h> #include <string.h> #include <assert.h> int main (void) { char buffer[15]; void *context = zmq_ctx_new (); void *router = zmq_socket (context, ZMQ_ROUTER); void *dealer = zmq_socket (context, ZMQ_DEALER); memset(buffer, 0, 15); zmq_bind(router, "tcp://127.0.0.1:9990"); zmq_connect(dealer, "tcp://127.0.0.1:9990"); printf("Sending Message from Dealer\n"); zmq_send(dealer, "Hello World", 11, 0); printf("Waiting to receive message from Dealer\n"); zmq_recv(router, buffer, 15, 0); printf("Received message from Dealer: %s\n", buffer); zmq_close(dealer); zmq_close(router); zmq_ctx_destroy(context); return 0; } bash-3.2$ ./zmq_router_dealer Sending Message from Dealer Waiting to receive message from Dealer Received message from Dealer: bash-3.2$ Thanks Badhri A side note that I feel is extremely important when something doesn't work: CHECK RETURN VALUES How do you know zmq_bind() worked? zmq_connect()? zmq_send()? Any of them can fail and then your zmq_recv() will wait forever. MfG Goswin
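A minimal "check every return value" helper of the kind this advice calls for; the macro is my own sketch, not a libzmq facility:

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Wrap any call whose failure is signalled by a negative return value.
 * On failure, report the expression, location, and errno, then give up. */
#define CHECKED(expr)                                                   \
    do {                                                                \
        if ((expr) < 0) {                                               \
            fprintf (stderr, "%s failed at %s:%d: %s\n",                \
                     #expr, __FILE__, __LINE__, strerror (errno));      \
            exit (EXIT_FAILURE);                                        \
        }                                                               \
    } while (0)
```

With `CHECKED(zmq_bind (router, "tcp://127.0.0.1:9990"));` and friends, a failing bind or connect would stop the program with a message at the right line instead of letting zmq_recv() block forever.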
Re: [zeromq-dev] queue overhead
On Sun, Jul 27, 2014 at 11:13:31AM -0700, Justin Karneges wrote: I have a stable (in the addressing sense) worker that I want to take advantage of multiple cores. So, I run N instances of this worker, where N is the number of cores on the host machine, and each worker binds to its own socket. Components that wish to make use of this worker service connect to all N worker instances. Unfortunately this is a little awkward. The connecting components must be configured with the N socket specs. And it's hard to automate this, since even if the connecting components could generate socket specs programmatically, this still requires knowing the number of cores of the remote machine. What I'd like to do is put an adapter component in front of the N worker instances (on the same machine as the worker instances) that binds to a single socket. It would route to the N workers, and this is easily done since the adapter lives on the same machine and knows the number of cores. Connecting components could then simply connect to this adapter, and not need to care about the number of remote cores. The question I have is what kind of overhead this introduces. An MxN set of connections between M remote components and the N workers seems like it would be far more efficient than M-1-N, which looks like a bottleneck. But maybe in practice, if the routing is very simple, then it becomes negligible? Justin You want one worker per core? So that is all a single system? So why not multithread your worker? You have one main thread that handles the communication with the outside using a ROUTER socket and talks to the worker threads using inproc://. That way you have a single socket to connect to and inproc:// avoids the overhead of retransmitting messages to other processes or between systems. MfG Goswin
Re: [zeromq-dev] Question about ROUTER queue (what if client dies)
On Mon, Jul 28, 2014 at 02:42:50PM +0100, Pedro Januário wrote: hi Peter, I didn't write C code for a while, but it would be something like this: #include <czmq.h> int main (void) { zsock_t *router = zsock_new_router ("tcp://127.0.0.1:9990"); zsock_t *dealer = zsock_new_dealer ("tcp://127.0.0.1:9990"); // dealer send message zstr_send (dealer, "Hello World"); // router received message zmsg_t *msg = zmsg_recv (router); zmsg_print (msg); // dealer (client) disconnects permanently zsock_destroy (&dealer); // router send message to dealer, because it thinks that is connected zmsg_send (&msg, router); zmsg_destroy (&msg); zsock_destroy (&router); return 0; } I hope this is self explanatory. One way to solve this is by sending a "good bye" message from the dealer (client), but for my use case it would be enough to discard messages that are sent after x minutes. Regards, Pedro Januário Delivering Innovation and Technology That case is simple. The DEALER shuts down normally so the tcp socket sends its FIN and the ROUTER removes the outgoing pipe. But what if the DEALER deadlocks or the kernel crashes (assuming different systems)? Then tcp will not close the connection for a long time. MfG Goswin
Re: [zeromq-dev] Dissecting Message Queues by Tyler Treat
On Fri, Jul 11, 2014 at 12:47:42PM -0400, Steven McCoy wrote: I don't think I saw this posted yet: noted on highscalability.com today, http://www.bravenewgeek.com/dissecting-message-queues/ -- Steve-o Sending messages to PUB is a lot faster than receiving from SUB according to that post, not surprising since PUB will drop messages when the network connection can't keep up. But then it would be interesting how many messages were dropped. And not just for the ZeroMQ case. Or do the same test with a non-dropping socket type. Measuring how fast zmq (or others) can lose data is not really interesting I think. MfG Goswin
Re: [zeromq-dev] Edge-triggered polling vs Level-triggered. Which one ZMQ is using? Why?
On Mon, Jul 14, 2014 at 08:42:14AM -0700, Michel Pelletier wrote: There are some discussions here: http://lwn.net/Articles/25137/ http://www.kegel.com/c10k.html A quick scan of nanomsg source indicates it can use epoll, kqueue, or poll. The first two are edge-triggered and the third is level-triggered. Actually epoll() supports both level and edge trigger. nanomsg appears in the first two cases to keep an internal array of events (self-events) to emulate level triggering (I could be wrong about that as I am not a nanomsg dev). As for one vs the other, I think it's a non-issue. Neither one is right or wrong, correct code can be written for either case. If you are doing event-driven programming you are either already aware of the issues of the underlying api (epoll/kqueue/poll) or will have to come to understand them eventually. -Michel I think it is a big issue. On read a level trigger is better because when you didn't consume all input then the next zmq_poll() should not block. It's too easy to accidentally run into this. Further you actually do want to only consume some input from each socket and then poll again. The reason for that is so that you can round-robin all sockets fairly, e.g. consume one message from each socket and then poll again. If you instead consume as much as possible from every socket then one socket can be flooded with messages and starve other sockets. So level trigger on read is a must for me. Now on write the opposite holds. Normally you write to the socket and the data goes directly into the buffer/queue. But once in a while you write too much and the socket returns EAGAIN. Then you keep the message in some app internal buffers and wait for POLLOUT. When you get a POLLOUT you write the backlog. With level trigger you have to add waiting for POLLOUT every time your buffer/queue runs full and remove it again once the backlog is cleared. Otherwise POLLOUT would get triggered all the time when there is no backlog.
With edge trigger on the other hand you simply add POLLOUT once and you make sure to write as much backlog as possible whenever it gets triggered. In rare cases it might get triggered with no backlog but only at most once every time the backlog is cleared [this happens when the backlog exactly fills the buffer]. So in summary the best behaviour would be (if you can't select it yourself): read - level triggered write - edge triggered On Mon, Jul 14, 2014 at 4:52 AM, artemv zmq artemv@gmail.com wrote: Hi Pieter Not sure if it is related or not, but I experience an issue with using ZMQ.Poller and a java socket (SelectableChannel). I.e. the poller api itself allows me to register a SelectableChannel on the poller (for POLLIN), but the combination isn't workable -- the poller reports the presence of incoming traffic only once.. (( 2014-07-14 14:31 GMT+03:00 Pieter Hintjens p...@imatix.com: This was a design decision from very long ago, 2009 or so, and there was no real discussion of it. I've always assumed if it was really a problem in libzmq, someone would have changed that by now. On Mon, Jul 14, 2014 at 11:50 AM, artemv zmq artemv@gmail.com wrote: Hi community I did read the nanomsg docs, the part where they explain the differences against zmq. They mentioned that they use level-triggered polling instead of the edge-triggered one as in zmq. Since I'm not an expert in this low-level stuff, I'm still very curious why zeromq decided to go the edge-triggered way instead of the level-triggered one. Thanks in advance. I'm pretty sure zmq uses level trigger or my code would surely deadlock all the time. I'm also pretty certain I saw that specified in the docs but now all I see is mention that zmq_poll() behaves like the system's poll(), which is level triggered. MfG Goswin
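The write-side bookkeeping described above boils down to a small state machine; here is a sketch (types and names are mine, no real I/O involved): POLLOUT interest is enabled when a backlog appears and dropped as soon as it drains, because under level triggering a permanently-writable socket would otherwise make poll() report POLLOUT forever.

```c
#include <stdbool.h>

typedef struct {
    int backlog;        /* messages queued after an EAGAIN */
    bool want_pollout;  /* are we currently asking poll() for POLLOUT? */
} write_state_t;

/* A send hit EAGAIN: queue the message and start watching for writability. */
static void on_send_eagain (write_state_t *s)
{
    s->backlog++;
    s->want_pollout = true;
}

/* poll() reported POLLOUT: flush as much backlog as the socket accepts. */
static void on_pollout (write_state_t *s, int can_write)
{
    int n = s->backlog < can_write ? s->backlog : can_write;
    s->backlog -= n;
    if (s->backlog == 0)
        s->want_pollout = false;   /* stop watching, or poll() spins */
}
```

With edge triggering the `want_pollout` toggling disappears, which is exactly the simplification the message above describes.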
Re: [zeromq-dev] Extending zmq_msg_t API
On Wed, Jul 09, 2014 at 04:53:16PM -0500, Thomas Rodgers wrote: Right. This is my trepidation about surfacing the refcount. The sharedness is indicated by the flags field on the msg type, and that, I'm reasonably sure, is not altered once it is set. At least it can only become unshared in the background, not suddenly start being shared. With 2 copies of a message floating around, one can be closed or shared again in between the check and the copy. But with only one copy (the one YOU hold) nobody else can share the message in the background. Assuming you don't share the message pointer between threads. A zmq_msg_get(msg, ZMQ_SHARED) is easy to add and should be thread safe, erring on sometimes returning true when a message is later not shared anymore. Looking forward to a PULL request for that. MfG Goswin

On Wed, Jul 9, 2014 at 4:05 PM, KIU Shueng Chuan nixch...@gmail.com wrote: Couldn't the refcount change after you have obtained its value? E.g.: make a copy; send the 1st; read the refcount (2); background io releases the 1st copy.

On 9 Jul 2014 18:21, Thomas Rodgers rodg...@twrodgers.com wrote: zmq_msg_get() could be extended to give either the refcount or an indicator on whether a message was shared; based on other refcounted designs I'm hesitant to promote surfacing the actual count. Similarly, zmq_msg_set() could allow 'unsharing' by adding a ZMQ_SHARED property #define and setting its value to 0 (no effect on non-shared messages). So the only API surface area change is an additional message property. This seems the cleanest to me.

On Wednesday, July 9, 2014, Goswin von Brederlow goswin-...@web.de wrote: On Tue, Jul 08, 2014 at 10:42:41AM -0500, Thomas Rodgers wrote: tl;dr; Is there any objection to adding some sort of accessor to the API to determine if a given zmq_msg_t is_shared()?
Background/Rationale: Something I encountered while writing a high level C++ wrapper for zmq_msg_t and its API is the following set of behaviors -

    zmq_msg_init_size(&msg_vsm, 20);

Results in a type_vsm message, the body of which is held entirely within the space allocated to zmq_msg_t.

    zmq_msg_init_size(&msg_lmsg, 1024);

Results in a type_lmsg message, the body is held as a reference to a block of size bytes.

    memcpy(zmq_msg_data(&msg_vsm), "VSM", 3);
    memcpy(zmq_msg_data(&msg_lmsg), "LMSG", 4);

So far so good. Now copy -

    zmq_msg_copy(&msg_vsm2, &msg_vsm);
    zmq_msg_copy(&msg_lmsg2, &msg_lmsg);

Now change contents -

    memcpy(zmq_msg_data(&msg_vsm2), "vsm", 3);
    memcpy(zmq_msg_data(&msg_lmsg2), "lmsg", 4);
    assert(memcmp(zmq_msg_data(&msg_vsm), zmq_msg_data(&msg_vsm2), 3) != 0); // ok
    assert(memcmp(zmq_msg_data(&msg_lmsg), zmq_msg_data(&msg_lmsg2), 4) != 0); // fail

This happens by design (lmsg's are refcounted on copy, not deep copied). But it results in a situation where a zmq_msg_t is sometimes a Value and sometimes a Reference. This could lead to astonishment for the unwary. From the perspective of a wrapper (particularly one that takes a strong stand on value semantics and local reasoning), this behavior is ungood. So my options are deep copy always or implement copy-on-write. For efficiency I prefer the latter approach in the case of type_lmsg messages. I have implemented the copy-on-write logic through a horrible brittle hack that examines the last byte of zmq_msg_t. I would prefer a less brittle solution.

lmsg's are refcounted on copy

Can't you access the refcount? Or is that the API call you want to add? Maybe instead of is_shared() an unshare() call would be more useful, which would copy the message payload if it is shared. Or both?
MfG Goswin
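The copy-on-write alternative discussed in this thread can be sketched in a few lines. `CowBuffer` and its `is_shared`/`write` methods are hypothetical stand-ins for a wrapper over zmq_msg_t; CPython's object refcount plays the role of zmq's internal lmsg refcount (so this is CPython-specific):

```python
import sys

class CowBuffer:
    """Minimal copy-on-write sketch: shares the underlying bytearray on copy
    and deep-copies only when a shared buffer is about to be written."""

    def __init__(self, data):
        self._buf = bytearray(data)

    def copy(self):
        # Share the payload, like zmq's refcounted lmsg copy.
        other = CowBuffer.__new__(CowBuffer)
        other._buf = self._buf
        return other

    def is_shared(self):
        # Refs: one per CowBuffer holding the buffer, plus the call argument.
        # More than 2 means some other copy also holds it.
        return sys.getrefcount(self._buf) > 2

    def write(self, offset, data):
        if self.is_shared():
            # "unshare": deep-copy before mutating, preserving value semantics.
            self._buf = bytearray(self._buf)
        self._buf[offset:offset + len(data)] = data

    def data(self):
        return bytes(self._buf)
```

Writing through one handle no longer astonishes holders of the other: the writer silently unshares first, which is exactly the `is_shared()`/`unshare()` pair the thread asks libzmq to expose.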
Re: [zeromq-dev] Curve: Potential DoS with error commands ?
On Fri, Jul 04, 2014 at 07:11:52PM +0200, Diego Duclos wrote: Actually, after some more consideration: as CurveZMQ runs over TCP, which can already be trivially DoS'd using a FIN packet, I don't think adding authentication will add much value to the protocol.

On Fri, Jul 4, 2014 at 4:34 PM, Goswin von Brederlow goswin-...@web.de wrote: On Thu, Jul 03, 2014 at 09:24:59PM +0200, Pieter Hintjens wrote: I guess the error command could be encrypted with the server's long-term private key, yes.

On Thu, Jul 3, 2014 at 8:15 PM, Diego Duclos diego.duc...@palmstonegames.com wrote: I've been reading the Curve spec in more detail, and the way the error packet currently works caught me by surprise. Couldn't a crafted TCP packet with an error command be sent to a client? Tricking it into thinking the server has denied its credentials when it has done no such thing? This allows someone with the ability to listen in but not block packets to do denial of service, which wouldn't be the case if the error packet was authenticated and encrypted.

What if the error was that the server's public key didn't fit? MfG Goswin

Not the same, I think. If you send a FIN to the tcp connection then zmq will reconnect. So you can disrupt zmq somewhat but it will recover. If you send an error to the CURVE authentication then zmq will not retry. A single well-timed attack disables the connection completely. As for the well-timed part, you probably would need to use a FIN or something to disrupt an existing connection and then catch the reconnection to DoS it. MfG Goswin
Re: [zeromq-dev] Creating a Simple HTTP Server using zmq_msg_send and zmq_send
On Sun, Jul 06, 2014 at 08:03:11PM -0700, Satish wrote: Hello Programmers I am trying to get acquainted with the zeromq library and am developing the program provided below. I want to use zeromq messages to retrieve HTTP requests and send HTTP responses to multiple clients. I have gone through the zguide documentation but there is only a mention of multiple clients connecting to a zeromq broker/process via InProcess (inproc). I am deliberately not using inproc for queueing/inter-thread messaging as I want this done via TBB queuing. However, the crux of the issue is that I cannot get zeromq to send back a response to a web browser using zmq_send and zmq_msg_send. I have tried using zmq_msg_send because at runtime there is a possibility of having a large data stream sent back to the client/web browser. With the zmq_msg structure there is no need to specify size limits when receiving and sending streams from and to the client. How can I solve this problem? Please help. Thank you. Satish

Zmq wasn't made for this. The existence of ZMQ_STREAM sockets is more of an ugly hack than proper design and they behave differently from normal zmq sockets. They exist so one can interface non-zmq stuff with zmq, not for their own sake. So maybe that isn't the best way to learn zmq.

---

    void release(void* data, void* hint){}

What's that? It's never used. And where are all the required #include directives?

    void main(int argc, char** argv){
        void* ZMQCONTEXT = ::zmq_ctx_new();
        std::istringstream instr = std::istringstream();
        zmq_msg_t message = { };
        int init_status = 0, multi_status = 0;
        zmq_pollitem_t timer[2] = {
            { ::zmq_socket(ZMQCONTEXT, ZMQ_STREAM), 0, ZMQ_POLLIN, 0 },
            { 0, 0, ZMQ_POLLOUT, 0 }
        };
        int POLLED_SIZE = sizeof(timer) / sizeof(zmq_pollitem_t);
        timer[1].socket = timer[0].socket;

WTF? Why don't you poll for ZMQ_POLLIN | ZMQ_POLLOUT? WTF2? Why poll for ZMQ_POLLOUT at all? Since polling is level triggered that makes no sense.
ZMQ_POLLOUT only makes sense temporarily, when sending data overflows the outgoing buffers, and then only until you managed to send the pending data.

        const char *tcp_desc = "tcp://*:9090";
        ::zmq_bind(timer[0].socket, tcp_desc);
        char http_response[100] =
            "HTTP/1.0 200 OK\r\n"
            "Content-Type: text/plain\r\n"
            "\r\n"
            "Hello, World!";
        int packet_size = sizeof(http_response) / sizeof(char);

By definition sizeof(char) == 1. What you need here is int packet_size = sizeof(http_response) * CHAR_BIT / 8; But if CHAR_BIT != 8 then you are screwed anyway, since the http header needs to be sent as octets.

        ::zmq_msg_init(&message);
        do {
            for (int p = 0; p < POLLED_SIZE; p++){
                ::zmq_poll(&timer[p], 1, 10);
            }//for

So you wait 10ms for incoming data and then 10ms for outgoing not being blocked. So after this loop there might or might not be incoming data and outgoing might or might not be blocked. But maybe you waited 20ms, or not.

            ::zmq_msg_recv(&message, timer[0].socket, 0);

So this will block till there actually is some incoming data.

            while (::zmq_msg_more(&message) == 1) {
                ::zmq_msg_recv(&message, timer[0].socket, ZMQ_DONTWAIT);
            }

ZMQ_STREAM always sends data in pairs of frames: one frame for the peer ID and one frame for the data. Also you get connection and disconnection messages, which are identified by having 0 bytes of data in the second frame. You need to catch (and in this case ignore) those.

            std::cout << std::endl << "Size = " << ::zmq_msg_size(&message) << std::endl;
            instr.str(static_cast<char*>(zmq_msg_data(&message)));
            ::zmq_msg_close(&message);

Who says the data is a 0-terminated C string? Zmq messages are byte arrays of a given size that can contain 0 at any point. You are lucky this doesn't segfault.
            std::cout << "Data Received = " << instr.str() << std::endl;
            //send back the response
            //Prefix the socket with port id
            ::zmq_send(timer[0].socket, instr.str().c_str(), instr.str().size(), ZMQ_SNDMORE);
            //Send payload
            ::zmq_send(timer[0].socket, http_response, packet_size, ZMQ_SNDMORE);
            //Postfix the socket with port id
            ::zmq_send(timer[0].socket, instr.str().c_str(), instr.str().size(), ZMQ_SNDMORE);

WTF? Postfix? There is no postfix. You are starting a new message there.

            ::zmq_send(timer[0].socket, 0, 0, ZMQ_SNDMORE);

Ahh, you mean that you close the connection to the peer.

            std::cout << "Sending... " << http_response;
        } while (std::cin.get() == 0xd);
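For comparison, what the poster is after takes only a few lines with a plain stdlib TCP socket, with none of the ZMQ_STREAM frame-pairing pitfalls. A minimal sketch (one request per connection, OS-assigned port, client and server in one process for demonstration):

```python
import socket
import threading

RESPONSE = (b"HTTP/1.0 200 OK\r\n"
            b"Content-Type: text/plain\r\n"
            b"\r\n"
            b"Hello, World!")

def serve_once(srv):
    # Accept one connection, read the request, send the canned response.
    conn, _ = srv.accept()
    conn.recv(4096)                        # the HTTP request; ignored here
    conn.sendall(RESPONSE)
    conn.close()                           # HTTP/1.0: close ends the body

srv = socket.socket()
srv.bind(("127.0.0.1", 0))                 # port 0: let the OS pick one
srv.listen(1)
port = srv.getsockname()[1]
threading.Thread(target=serve_once, args=(srv,), daemon=True).start()

# Act as the web browser: send a request, read until the server closes.
cli = socket.create_connection(("127.0.0.1", port))
cli.sendall(b"GET / HTTP/1.0\r\n\r\n")
reply = b""
while True:
    chunk = cli.recv(4096)
    if not chunk:
        break
    reply += chunk
cli.close()
srv.close()
print(reply.decode())
```

Note there is no peer-id frame, no empty close frame and no NUL-termination question: the byte stream carries exactly the octets of the HTTP exchange, which is the point of "zmq wasn't made for this".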
Re: [zeromq-dev] Peer ip address ?
On Sun, Jul 06, 2014 at 08:07:17PM -0700, Satish wrote: Hi, Try using getsockname. The only problem is that the zeromq socket descriptor would somehow need to appear as, and be compatible with, a normal socket descriptor. Satish

On Thursday, 16 January 2014 03:14:35 UTC+11, mraptor wrote: hi I was looking for a way to find the peer/client IP address. All of the replies I've seen so far say it is not possible to get the IP address of the peer in ZeroMQ. The main objection to providing the IP address seems to be that zeromq works on top of protocols which may not be TCP/IP. The solution suggested by most people seems to be to figure out the IP address at the client and pass it as part of the message. I currently need the IP address for logging purposes and in the future for filtering and routing. Two problems arise: 1. What happens if you don't have access to the client code, i.e. it is written by a third party? 2. Allowing the client to provide the IP address could be a major security breach, because if it is up to the client, they can place whatever IP they want; how would you know? How do you solve those problems? Unless zeromq already has some means of getting the peer IP; the discussions about this were from 2011? thank you

Or use a ZAP handler that stores the IP address of the connection attempt as User-Id or as metadata. You can then use zmq_msg_gets() to retrieve it when a message is received. MfG Goswin
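As a plain-socket illustration of the point (this is not zmq; a zmq socket multiplexes many peers over internal fds, which is why the thread ends up at the ZAP-metadata route instead): the peer address is a property of one accepted connection's own file descriptor, available both from accept() and from getpeername():

```python
import socket

# Plain TCP on loopback: the server learns the peer's address both from
# accept() and from getpeername() on the accepted connection's own fd.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))                 # OS-assigned port
srv.listen(1)
port = srv.getsockname()[1]

cli = socket.create_connection(("127.0.0.1", port))
conn, addr_from_accept = srv.accept()

addr_from_getpeername = conn.getpeername()  # same (ip, port) tuple
print(addr_from_accept, addr_from_getpeername)

cli.close(); conn.close(); srv.close()
```

A zmq application never holds that per-peer fd directly, which is why "just call getsockname/getpeername" does not translate and the ZAP handler (which does see the connection attempt, including its address) is the workable place to capture it.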
Re: [zeromq-dev] Extending zmq_msg_t API
On Tue, Jul 08, 2014 at 10:42:41AM -0500, Thomas Rodgers wrote: tl;dr; Is there any objection to adding some sort of accessor to the API to determine if a given zmq_msg_t is_shared()? Background/Rationale: Something I encountered while writing a high level C++ wrapper for zmq_msg_t and its API is the following set of behaviors - zmq_msg_init_size(&msg_vsm, 20); Results in a type_vsm message, the body of which is held entirely within the space allocated to zmq_msg_t. zmq_msg_init_size(&msg_lmsg, 1024); Results in a type_lmsg message, the body is held as a reference to a block of size bytes. memcpy(zmq_msg_data(&msg_vsm), "VSM", 3); memcpy(zmq_msg_data(&msg_lmsg), "LMSG", 4); So far so good. Now copy - zmq_msg_copy(&msg_vsm2, &msg_vsm); zmq_msg_copy(&msg_lmsg2, &msg_lmsg); Now change contents - memcpy(zmq_msg_data(&msg_vsm2), "vsm", 3); memcpy(zmq_msg_data(&msg_lmsg2), "lmsg", 4); assert(memcmp(zmq_msg_data(&msg_vsm), zmq_msg_data(&msg_vsm2), 3) != 0); // ok assert(memcmp(zmq_msg_data(&msg_lmsg), zmq_msg_data(&msg_lmsg2), 4) != 0); // fail This happens by design (lmsg's are refcounted on copy, not deep copied). But it results in a situation where a zmq_msg_t is sometimes a Value and sometimes a Reference. This could lead to astonishment for the unwary. From the perspective of a wrapper (particularly one that takes a strong stand on value semantics and local reasoning), this behavior is ungood. So my options are deep copy always or implement copy-on-write. For efficiency I prefer the latter approach in the case of type_lmsg messages. I have implemented the copy-on-write logic through a horrible brittle hack that examines the last byte of zmq_msg_t. I would prefer a less brittle solution.

lmsg's are refcounted on copy

Can't you access the refcount? Or is that the API call you want to add? Maybe instead of is_shared() an unshare() call would be more useful, which would copy the message payload if it is shared. Or both? MfG Goswin
Re: [zeromq-dev] RFC: PPPP - Paranoid Pirate Publishing Protocol
On Thu, Jul 03, 2014 at 11:50:44AM +0200, Pieter Hintjens wrote: On Thu, Jul 3, 2014 at 10:35 AM, Goswin von Brederlow goswin-...@web.de wrote: I can think of 3 ways to implement this (across all socket types):

Do we want this across all socket types? For ROUTER, certainly. Right now we're hacking this using ZMQ_ROUTER_PROBE. For other socket types, it seems to break the abstraction.

1) Sending a message with identity (or command?) flag set containing the identity and whether it is a connect or disconnect as frames. For backwards compatibility that should only be sent when a socket option is set, defaulting to off.

Yes.

2) Include the identity in the DISCONNECT event on the monitor socket. 3) Add 2 new monitoring events: IDENTITY_SET and IDENTITY_DISCONNECTED.

This was the original argument against adding monitoring at all, that it would be abused for topology purposes. Once you start to give the application knowledge of the topology, you break the ZeroMQ abstraction and you kill scalability. (This was Sustrik's argument, and I mostly accept it.) More pragmatically, you *cannot* safely use monitoring for such things. Monitoring events are async and processed out of band. You cannot sanely synchronize monitoring events with message flow, except by injecting them into the message flow itself, and then you get option (1). The simplest backwards compatible solution would be:

1. use a socket option to enable connect/disconnect events
2. deliver these events as messages, on the socket
3. use a message format that is easy to filter, e.g. size=2
4. use a message flag, ZMQ_EVENT, via zmq_msg_get
5. enumerate the events, 1=CONNECT, 2=DISCONNECT, etc.
6. optionally, allow more parts to follow, depending on the event type.

This could be implemented for all socket types. There are other events we could conceivably add, and we could fix XPUB to use events instead of its magic subscribe/unsubscribe messages. ZMQ_STREAM sockets would benefit from that as well.
An event interface is much cleaner than monitoring. It still breaks the scalability abstractions, except for ROUTER where it's a good fit. -Pieter

I agree that an application should not care about the topology and such things. But that assumes the proper protocol exists to handle such things transparently for the application. For simple cases like REQ/REP or PUB/SUB a protocol exists. But building a more complex protocol on top of basic sockets requires some interaction with the topology. Maybe the right thing would be to implement PPP (and PPPP) directly in zmq as new socket types. But I think that would be bad in the long term. Adding a new socket type for every protocol would be a nightmare to maintain, I bet. Exposing events on the socket and implementing protocols as a middle layer seems better. My test setup for PPPP looks like this: QT client --inproc-- SUB helper --tcp-- PUB helper -- server MfG Goswin
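Pieter's sketch above (a 2-byte event frame with an enumerated type, optionally followed by more parts) is easy to prototype. Everything here is hypothetical -- the marker byte, constants and helpers are illustrations of the proposal, not libzmq API:

```python
import struct

# Hypothetical enumeration following the sketch: 1=CONNECT, 2=DISCONNECT.
EVENT_CONNECT = 1
EVENT_DISCONNECT = 2
EVENT_MAGIC = 0xE5          # hypothetical marker byte for event frames

def make_event_frame(event_type):
    # A fixed 2-byte frame: trivially distinguishable from payload by size,
    # which is the "easy to filter" property the proposal asks for.
    return struct.pack("BB", EVENT_MAGIC, event_type)

def classify_frame(frame):
    """Return ('event', type) for event frames, ('data', frame) otherwise."""
    if len(frame) == 2 and frame[0] == EVENT_MAGIC:
        return ("event", frame[1])
    return ("data", frame)

print(classify_frame(make_event_frame(EVENT_CONNECT)))
print(classify_frame(b"hello subscribers"))
```

In the real proposal the discrimination would come from a ZMQ_EVENT message flag via zmq_msg_get rather than a magic byte, so ordinary 2-byte payloads could never be misread as events; the magic byte here only stands in for that flag.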
Re: [zeromq-dev] RFC: PPPP - Paranoid Pirate Publishing Protocol
On Thu, Jul 03, 2014 at 11:50:44AM +0200, Pieter Hintjens wrote: The simplest backwards compatible solution would be: 1. use a socket option to enable connect/disconnect events 2. deliver these events as messages, on the socket 3. use a message format that is easy to filter, e.g. size=2 4. use a message flag, ZMQ_EVENT, via zmq_msg_get 5. enumerate the events, 1=CONNECT, 2=DISCONNECT, etc. 6. optionally, allow more parts to follow, depending on the event type. This could be implemented for all socket types. There are other events we could conceivably add, and we could fix XPUB to use events instead of its magic subscribe/unsubscribe messages. An event interface is much cleaner than monitoring. It still breaks the scalability abstractions, except for ROUTER where it's a good fit. -Pieter

I've been thinking some more about this. Say you have the following topology with a proxy:

    SUB 1 ----------------\
    SUB 2 -----------------= PUB
                          /
    SUB 3 --= XPUB-XSUB -/
    SUB 4 --=

When SUB 1 subscribes to something then PUB gets a subscription event. When SUB 3 subscribes to something then XPUB gets a subscription event and then sends out a subscription on XSUB, so PUB gets a subscription event too. So applications need to be able to generate and send event messages over the network. It's not just events generated by ZMQ internally as a reaction to a change in the socket's state. Now let's look at a more general case of what that would mean for other sockets and connect events in general:

    REQ 1 -------------------\
    REQ 2 --------------------= ROUTER2
                             /
    REQ 3 --= ROUTER-DEALER -/
    REQ 4 --=

When REQ 1 connects to ROUTER2 it gets a connected event with one data frame containing the identity of REQ 1. When REQ 3 connects to ROUTER it gets a connected event with one data frame containing the identity of REQ 3. So far that is all internal in zmq. But the proxy should then forward that event to the DEALER socket so that ROUTER2 gets a connected event with two data frames containing the identities of DEALER and REQ 3 respectively.
That way ROUTER2 then knows that REQ 3 has connected and is reachable via the DEALER. Of course whether the proxy even gets events at all and whether it forwards those is left to the application. But sometimes you want/need that. Does that make sense? MfG Goswin
Re: [zeromq-dev] Full-Duplex communication, some questions.
On Wed, Jul 09, 2014 at 12:40:55PM +0200, Kurt Degiorgio wrote: Hi, I am looking to implement a system where the server controls a number of agents and sends them processing tasks; the agents then update the server (asynchronously) on the progress of the task and finally send the result of the task to the server. Each server can have multiple agents, and each agent can have multiple servers. I was looking at dealer/router: the server won't just randomly push tasks onto the clients (hence why pub/sub won't work), it needs to route them to agents according to how busy the agents are and the task at hand (some tasks require specific agents to be completed). Some questions:

1. Is this the right approach? (using dealer/router)

That depends on more factors than you gave. From what you said so far it would work, though.

2. Will the agent be able to communicate with the server asynchronously? (full duplex)

That depends on you. You need to use async I/O. See 3.

3. Regarding recv, is there some event-based system in place? (i.e. I give zeroMQ a callback and every time data is received zeroMQ calls that callback) Because from my perspective constantly polling zeroMQ for data is not very efficient and will require a dedicated thread.

There are several such systems; it depends on what language you use as well. But they are addons to libzmq and not part of libzmq itself. For example the high-level C bindings for zmq (http://czmq.zeromq.org/) have a zloop module. The low-level solution to this problem is to zmq_poll() the socket(s). This will block until there is activity on one of the sockets or a timeout is reached. It doesn't just loop over all sockets continuously checking them for activity in a busy loop, so efficiency is not an issue.

Thanks! Kurt.

MfG Goswin

DISCLAIMER The information contained in this electronic mail may be confidential or legally privileged. It is for the intended recipient(s) only.
Should you receive this message in error, please notify the sender by replying to this mail. Please do not read, copy, forward or store this message unless you are an intended recipient of it - unauthorized use of contents is strictly prohibited. Unless expressly stated, opinions in this message are those of the individual sender and not of GFI. While all care has been taken, GFI is not responsible for the integrity of this electronic mail and the contents of any attachments included within.

DISCLAIMER This is a public mailing list. The intended recipient(s) are the world, the galaxy, the universe, the multiverse. Any information posted is out there for anyone to see and any expectation of privacy is non-existent. All use of contents is allowed and eventually will happen somewhere, sometime. While all care has been taken, this sender is not responsible for readers without a sense of humor.
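The claim in this thread that poll-style waiting sleeps in the kernel rather than busy-looping is easy to verify with the stdlib poll() that zmq_poll() mirrors (a Unix-only sketch using a plain pipe in place of zmq sockets):

```python
import os
import select
import time

r, w = os.pipe()
p = select.poll()
p.register(r, select.POLLIN)

# With no activity, poll() sleeps until the timeout expires; the process is
# idle in the kernel the whole time, it is not spinning over the fds.
t0 = time.monotonic()
events = p.poll(200)                 # timeout in milliseconds
elapsed = time.monotonic() - t0

# With pending activity, poll() returns immediately with the ready fd.
os.write(w, b"x")
ready = p.poll(0)

print(events, round(elapsed, 2), ready)
os.close(r)
os.close(w)
```

zmq_poll() follows the same contract: block until a socket has activity or the timeout is reached, so a dedicated busy-polling thread is unnecessary.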
Re: [zeromq-dev] RFC: PPPP - Paranoid Pirate Publishing Protocol
On Wed, Jul 02, 2014 at 06:09:04PM +0200, Pieter Hintjens wrote: I believe it's been mooted before, and ZMQ_STREAM sockets do this, sending null messages to signal when there's a new client connection and/or a disconnected client.

On Wed, Jul 2, 2014 at 2:08 PM, Goswin von Brederlow goswin-...@web.de wrote: On Tue, Jul 01, 2014 at 06:23:41PM +0200, Pieter Hintjens wrote: On Tue, Jul 1, 2014 at 4:46 PM, Goswin von Brederlow goswin-...@web.de wrote: 1) with ROUTER/DEALER I don't get a message when a subscriber disconnects without unsubscribing. Eventually the heartbeat will get it, but that can take a long time and a lot of messages can queue up in between for no good reason.

There is no other way, in practice. Even if libzmq reports the TCP disconnect, there are cases where the network connection will block and die without reporting an error. You must eventually use heartbeats. You can tune these to a few seconds.

But with tcp there is a way. And (X)PUB/(X)SUB do use it. I know it isn't totally reliable since it can't catch kernel or network crashes in a timely fashion. But 99.999% of cases will properly close the tcp socket. The goal is to catch the common case early while still handling the exceptional one with heartbeats. So I tried using a monitor socket. But that only gives me the FD and address of the receiving socket: no use IMO.

2) with XPUB/XSUB the CURVE metadata seems to get lost; zmq_msg_gets() always returns Null it seems. I think the problem is that zmq::xpub_t::xread_activated just appends the message data to its internal std::deque. A ROUTER socket on the other hand attaches the pipe to its fair queue. Should this be rewritten to use a fair queue too?

That would make sense, yes. -Pieter

MfG Goswin

I can think of 3 ways to implement this (across all socket types): 1) Sending a message with identity (or command?) flag set containing the identity and whether it is a connect or disconnect as frames.
For backwards compatibility that should only be sent when a socket option is set, defaulting to off. 2) Include the identity in the DISCONNECT event on the monitor socket. Again some flag would have to be there for backward compatibility, as it would increase the DISCONNECT event by one frame. Since the identity is set only after the connection is established, I think this couldn't be used to announce identities as part of an existing event, right? 3) Add 2 new monitoring events: IDENTITY_SET and IDENTITY_DISCONNECTED. For this the event mask would act as the compatibility flag. This would probably be the cleanest solution. MfG Goswin
Re: [zeromq-dev] PUB/SUB question with IPC
On Wed, Jul 02, 2014 at 10:01:18PM +0200, Pieter Hintjens wrote: The queue will fill up in the background.

On Wed, Jul 2, 2014 at 9:33 PM, Martin Townsend martin.towns...@xsilon.com wrote: Hi Pieter, Thanks for the swift reply. I'll give it a go. Another quick question: the process could decide to do something else for minutes, maybe even hours. Would this upset the subscriber or even the publisher, or would it just fill up its receive queue? I just want to know whether it would be better to implement a thread to just process the measurements, or my preference would be to just set the subscriber queue to something like 4 and then, when the process needs to, just empty the queue like you suggested and keep the most recent result. Best Regards, Martin.

Wouldn't it also fill up the socket's in-kernel buffer on both the receiver's and sender's side and the zmq queue on the sender's side? Only then would the sender side start dropping messages. So when your worker comes back it will have a large backlog of messages. Unless your messages are large, the socket's in-kernel buffer will hold more than 4 messages and they will be old. Just calling recv() till there are no more messages and only using the last will still give you an old message the first few times after an hour. MfG Goswin
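The drain-and-keep-last pattern suggested in this thread can be sketched with any non-blocking receive. Here queue.Queue stands in for a SUB socket read with ZMQ_DONTWAIT (the zmq version would loop on recv until EAGAIN):

```python
import queue

# A backlog of ten stale measurements piled up while the worker was away.
inbox = queue.Queue()
for measurement in range(10):
    inbox.put(measurement)

# Drain non-blocking until empty and keep only the most recent item,
# mirroring recv(ZMQ_DONTWAIT) in a loop until it would block.
latest = None
while True:
    try:
        latest = inbox.get_nowait()
    except queue.Empty:
        break

print(latest)
```

As the post warns, with zmq this only drains what has already reached the subscriber's queues; messages still sitting in the kernel buffers or on the sender's side arrive later, so the first few drains after a long pause can still yield stale data.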
Re: [zeromq-dev] RFC: PPPP - Paranoid Pirate Publishing Protocol
On Tue, Jul 01, 2014 at 06:23:41PM +0200, Pieter Hintjens wrote: On Tue, Jul 1, 2014 at 4:46 PM, Goswin von Brederlow goswin-...@web.de wrote: 1) with ROUTER/DEALER I don't get a message when a subscriber disconnects without unsubscribing. Eventually the heartbeat will get it, but that can take a long time and a lot of messages can queue up in between for no good reason.

There is no other way, in practice. Even if libzmq reports the TCP disconnect, there are cases where the network connection will block and die without reporting an error. You must eventually use heartbeats. You can tune these to a few seconds.

But with tcp there is a way. And (X)PUB/(X)SUB do use it. I know it isn't totally reliable since it can't catch kernel or network crashes in a timely fashion. But 99.999% of cases will properly close the tcp socket. The goal is to catch the common case early while still handling the exceptional one with heartbeats. So I tried using a monitor socket. But that only gives me the FD and address of the receiving socket: no use IMO.

2) with XPUB/XSUB the CURVE metadata seems to get lost; zmq_msg_gets() always returns Null it seems. I think the problem is that zmq::xpub_t::xread_activated just appends the message data to its internal std::deque. A ROUTER socket on the other hand attaches the pipe to its fair queue. Should this be rewritten to use a fair queue too?

That would make sense, yes. -Pieter

MfG Goswin
Re: [zeromq-dev] Speed of Subscriptions
On Tue, Jul 01, 2014 at 09:44:49PM -0500, Charles Remes wrote: Should take less than a second assuming a LAN or other low-latency network. cr

On Jul 1, 2014, at 3:32 PM, Johnny Lee j...@peaceatwork.org wrote: Hello, I have a question about how fast an average workstation can subscribe to its publishers using the Pub/Sub method. We have developed a client application (Publisher) that sends simple, pre-formatted messages to receivers (Subscribers). When the receiver program launches, and at specified intervals, it subscribes to its list of publishers. As our program may have a receiver listening to hundreds of senders, the receiver needs to subscribe to those hundreds of publishers first. As those publisher stations may be coming on and off throughout the day, we were going to run this process every 15 minutes or so. But how long would that take? I realize that "average workstation" may have different meanings, but say a standard Windows workstation that can be found in a typical office environment (Example: Intel Core i3-2130 CPU @ 3.40 GHz; 6 GB RAM; 64-bit Operating System) trying to subscribe to 300 publishers. 5 seconds, 10 seconds, 40 seconds, 1 minute? Please let me know if you need any clarification. thank you, Johnny

The successful subscriptions should be fast. The connection attempts to the publishers that are down will take time, though, and will keep retrying (see the reconnect interval in setsockopt). So when a publisher comes online it can take a little while till the next connection attempt. But certainly less than your 15 minutes. Actually, don't run the process every 15 minutes. Just let it run all the time. It will stay connected to publishers that are online, disconnect from publishers that go offline and reconnect to publishers that come online, automatically. If all peers are in a local network then you might consider using (e)pgm to just shout the info out there regardless of who is subscribed.
Or have a central server collect the data from all publishers and rebroadcast it. That would save you from opening 300 connections for every subscriber. MfG Goswin
Re: [zeromq-dev] RFC: PPPP - Paranoid Pirate Publishing Protocol
On Mon, Jun 30, 2014 at 05:12:08PM +0200, Pieter Hintjens wrote: CURVE works over all socket types, when using TCP or IPC (not when using PGM).

On Mon, Jun 30, 2014 at 3:48 PM, Goswin von Brederlow goswin-...@web.de wrote: Hi, I came up with an extension to the Paranoid Pirate Protocol [4] for use in a PUB/SUB pattern. The protocol should work with ROUTER/DEALER or XPUB/XSUB sockets, I hope. With ROUTER/DEALER the publisher has to manage sending messages to all peers itself; with XPUB/XSUB zmq will handle that part. With ROUTER/DEALER the resends can be more specifically targeted. Questions: - Would this work with epgm? - Does XPUB/XSUB work with CURVE? MfG Goswin

I've run into 2 problems trying to implement this:

1) with ROUTER/DEALER I don't get a message when a subscriber disconnects without unsubscribing. Eventually the heartbeat will get it, but that can take a long time and a lot of messages can queue up in between for no good reason. So I tried using a monitor socket. But that only gives me the FD and address of the receiving socket: EVENT_DISCONNECTED: 13 on b'tcp://0.0.0.0:1234' So how do I find out that FD 13 belongs to peer xyz?

2) with XPUB/XSUB the CURVE metadata seems to get lost; zmq_msg_gets() always returns Null it seems. I think the problem is that zmq::xpub_t::xread_activated just appends the message data to its internal std::deque. A ROUTER socket on the other hand attaches the pipe to its fair queue. Should this be rewritten to use a fair queue too?

MfG Goswin
[zeromq-dev] RFC: PPPP - Paranoid Pirate Publishing Protocol
Hi, I came up with an extension to the Paranoid Pirate Protocol [4] for use in a PUB/SUB pattern. The protocol should work with ROUTER/DEALER or XPUB/XSUB sockets, I hope. With ROUTER/DEALER the publisher has to manage sending messages to all peers itself; with XPUB/XSUB zmq will handle that part. With ROUTER/DEALER the resends can be targeted more specifically. Questions: - Would this work with epgm? - Does XPUB/XSUB work with CURVE? MfG Goswin

--

PPPP - Paranoid Pirate Publishing Protocol

The Paranoid Pirate Publishing Protocol (PPPP) defines a reliable publish-subscribe dialog between a publisher and subscribers. PPPP covers publishing, subscribing, ACK/NACK of messages, heartbeating, reconnecting and custom messages.

* Name: rfc.zeromq.org/spec:???/
* Editor: Goswin von Brederlow goswin-...@web.de

License
===

Copyright (c) 2011 iMatix Corporation. Copyright (c) 2014 Goswin von Brederlow goswin-...@web.de

This Specification is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This Specification is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, see http://www.gnu.org/licenses.

Change Process
==

This Specification is a free and open standard[2] and is governed by the Digital Standards Organization's Consensus-Oriented Specification System (COSS)[3].

Language
==

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119[1].
Goals
=

The Paranoid Pirate Publishing Protocol (PPPP) defines a reliable publish-subscribe dialog between a publisher and subscribers. PPPP covers publishing, subscribing, ACK/NACK of messages, heartbeating, reconnecting and custom messages. The goals of PPPP are to:

* Allow the publisher to detect disconnection of a subscriber, through the use of heartbeating.
* Allow a subscriber to detect disconnection of the publisher, through the use of heartbeating.
* Allow subscribers to detect message dropouts, through sequence numbers.
* Allow the publisher to know messages were delivered, through ACK/NACK feedback.
* Allow custom messages from subscribers to the publisher.

Architecture
==

Roles
-

PPPP defines two types of peer:

* A publisher sends messages to subscribers and receives ACK/NACK, subscriptions and custom messages.
* A subscriber subscribes at the start, listens for messages from the publisher and sends ACK/NACK feedback. It can also send custom messages to the publisher.

Overall Conversation
-

PPPP connects a single publisher and a pool of subscribers. We do not specify which peers connect to which, but usually subscribers will connect to the publisher. A conversation consists of two intermingled dialogs, as follows ('P' represents the publisher, 'S' the subscriber):

Synchronous dialog:
    S: SUBSCRIBE
    P: SYNC (optional)
    Repeat:
        P: PUBLISH
        S: ACK/NACK (periodically)

Asynchronous dialogs:
    Repeat:
        S: HEARTBEAT
    Repeat:
        P: HEARTBEAT
    Occasionally:
        S: CUSTOM
        P: PUBLISH (optional)

Breaking this down:

* The subscriber initiates the conversation by subscribing.
* The publisher syncs state with the subscriber.
* The publisher publishes messages.
* The subscriber replies with an ACK/NACK, and this repeats indefinitely.
* The subscriber sends HEARTBEAT at regular intervals to the publisher.
* The publisher sends HEARTBEAT at regular intervals to the subscriber.
* The subscriber can send CUSTOM messages to the publisher. The publisher may or may not publish a reply.
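The dropout-detection goal above needs nothing more than sequence numbers, so it can be prototyped without any sockets. A minimal Python sketch (the class and method names are illustrative, not part of the spec) of how a subscriber could derive NACKs from gaps:

```python
# Sketch: detect message dropouts via sequence numbers, as PPPP proposes.
# Pure Python illustration; no ZeroMQ sockets involved.

class SubscriberState:
    def __init__(self):
        self.expected = 0  # next sequence number we expect from the publisher

    def on_publish(self, seq):
        """Return the list of missed sequence numbers (to NACK), then accept seq."""
        missed = list(range(self.expected, seq))
        self.expected = seq + 1
        return missed

sub = SubscriberState()
assert sub.on_publish(0) == []      # first message, nothing missed
assert sub.on_publish(1) == []      # in order
assert sub.on_publish(4) == [2, 3]  # messages 2 and 3 were dropped: NACK them
```

A real subscriber would send the returned list back to the publisher as a NACK so the missed messages can be resent.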
The first message in any conversation MUST be S:SUBSCRIBE.

Command Specifications
==

SUBSCRIBE
-

Consists of a 1-part message containing the single byte 0x01 followed by the channel to subscribe to. A subscriber can subscribe to multiple channels.

SYNC
-

Consists of either a full state dump or a backlog of messages. Whether a full state dump or a backlog is used (or neither) is application specific.

PUBLISH
-

Consists of a multipart message containing a channel as first frame, a channel specific
Re: [zeromq-dev] REQ to many REP.
On Mon, Jun 23, 2014 at 09:24:56PM +0100, Riskybiz wrote: I'd like to set up a 0MQ REQ-REP arrangement where there are many REP sockets connected to just one REQ socket. The actual number of REP sockets is unknown at design-time; however at run-time a list of the port addresses will be provided to the code running the REQ socket. What I'd like is that, somehow, the code running the solitary REQ socket will loop through the list of port addresses and connect to the multiple REP sockets as they bind and become available. Subsequently the REQ socket would work with (poll??) the established connections to send and receive messages as necessary. It would be simpler to bind the REQ socket and connect the REP sockets. Send and receive just work, and since you only have one socket there is no need to poll. It's the first time I've tried this. Looking at the zmq_poll reference in the manual http://api.zeromq.org/3-2:zmq-poll it's unclear to me whether I can handle a variable number of connections in an iterative manner. For example in: http://zguide.zeromq.org/cpp:mspoller the poll set is hard coded. Poll handles sockets. Connects are handled transparently and internally, according to the strategies listed in the docs for each socket type. Another question is: can a REQ socket handle multiple connections? How best could a message be routed to the desired destination REP socket? Is some more advanced pattern necessary here (ROUTER, a broker)? REQ has a round-robin outgoing strategy. So you can't route a message to a specific peer; libzmq will pick one for you. What I'm trying to achieve is a REQ-REP flow to act as a command/control layer which will coordinate a PUSH-PULL socket pair. There will be a run-time flexible number of PUSH sockets but always just one PULL socket. Is anyone able to offer any guidance to clear my muddy thoughts on how to make this work? Am coding in C++. Start by describing what you want to do, not the solution you think is right. With thanks, Riskybiz.
MfG Goswin
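The round-robin outgoing strategy mentioned above can be modelled in plain Python without sockets; the peer names below are hypothetical, and the point is only that the sender never chooses the destination:

```python
import itertools

# Model of REQ's round-robin outgoing strategy: each send goes to the
# next connected peer in turn; the application cannot pick the destination.
peers = ["rep-1", "rep-2", "rep-3"]  # hypothetical REP endpoints
rr = itertools.cycle(peers)

sent_to = [next(rr) for _ in range(5)]
assert sent_to == ["rep-1", "rep-2", "rep-3", "rep-1", "rep-2"]
```

If the destination matters, this is exactly why a ROUTER socket (which addresses peers by identity) is the better fit.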
Re: [zeromq-dev] ZMQ_STREAM does not receive 16KB but only 8KB, possible information disclosure
On Tue, Jun 24, 2014 at 02:25:48PM +0200, Mathias Hablützel wrote: Hi everyone, I ran into the issue that when sending more than 8KB of data with ZMQ_STREAM (yeah I know, zmq is not intended for that, anyway) it gets truncated on the receiver side. PoC https://gist.github.com/0x6d686b/16f79e092156dae223c9 If you look in the memory dump you'll see that at 0x2000 (or 8192 bytes) it changes from received payload to pre-initialised memory, and also that the received payload gets split into two parts of 8192 bytes. IMO this MAY result in leaking sensitive information (information disclosure) if the server side were to just reply with the received payload (like ping does). The bug is in your code. Line 77 in server.c should read: hexdump (buffer, received_bytes); If you ignore the amount of data actually received, that is your problem. I also suggest documenting in the manpage of zmq_socket ZMQ_STREAM that the biggest batch size is 8KB. Mathias It is interesting that you get 8KB chunks. I would have expected 64KB or 128KB chunks as the limit. Anyway, what you have here are two things: 1) ZMQ_STREAM is a byte stream and gets sent out over tcp in chunks of whatever MTU the tcp connection negotiates. With ethernet, unless you have gigabit ethernet with jumbo frames, this will be far less than 8KB. Your test uses localhost so it won't be ethernet at all and will simply pass the data around in-kernel. So in this special case you don't get the stream cut into even smaller chunks. 2) A socket has a limited receive buffer. A receive will never get more than the full buffer size. If the default buffer size is 8KB on your system then that is the maximum you get. You can get less than the full size though if not enough data has arrived yet. As a side note: Never use ZMQ_STREAM between 2 ZeroMQ apps. It is simply a hack to support connecting to or providing existing non-ZeroMQ interfaces.
MfG Goswin
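The buffer-reuse bug discussed above can be reproduced without any sockets. This Python sketch (the helper name is made up) shows how dumping the whole buffer instead of only the bytes actually received exposes stale data from an earlier message:

```python
# Reusing a receive buffer: each receive overwrites only 'n' bytes.
buf = bytearray(16)

def fake_recv_into(buf, payload):
    """Stand-in for a socket recv: copies the payload in, returns the byte count."""
    n = len(payload)
    buf[:n] = payload
    return n

n = fake_recv_into(buf, b"AAAAAAAAAAAAAAAA")  # first message fills the buffer
n = fake_recv_into(buf, b"BB")                # second message: only 2 bytes

assert bytes(buf[:n]) == b"BB"           # correct: honour the returned count
assert bytes(buf) == b"BB" + b"A" * 14   # wrong: dumping the whole buffer
# leaks 14 stale bytes from the previous message (the "disclosure")
```

The fix in server.c is the same idea: pass the received byte count to hexdump instead of the buffer size.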
Re: [zeromq-dev] API changes from 4.0 to 4.1
On Mon, Jun 23, 2014 at 08:06:30PM +0200, Peter Kleiweg wrote: Pieter Hintjens wrote on the 23rd day of the summer month of the year 2014: On Fri, Jun 20, 2014 at 6:44 PM, Peter Kleiweg pklei...@xs4all.nl wrote: A socket connect for these two addresses returns no error. With version 4.0 they both return the error 'invalid argument': tcp://localhost:invalid tcp://in val id:1234 That's strange, and not normal. I'll investigate. The Go version of test_security_curve fails. This could be due to various things. I don't think we modified the curve test case since 4.0. Can you investigate? test_security_curve.cpp is identical for 4.0.4 and 4.1.0 (except for comments). test_connect_resolve.cpp is different. The two invalid tcp addresses above are supposed to return an error in 4.0.4, but not in 4.1.0. The behaviour has changed when authentication (ZAP) fails. The connecting socket then has no outgoing pipe, causing zmq_msg_send() behaviour to change (it blocks forever, iirc). In the libzmq tests a send timeout was added to the bounce helper to handle this. MfG Goswin
Re: [zeromq-dev] Timeout issue with concurrent send and recv
On Wed, Jun 25, 2014 at 12:56:14PM -0400, br...@openbazaar.org wrote: If I have two servers talking to each other and each listens and sends via the same port and they send messages at the same time to each other, will this create a race condition? I'm seeing a timeout in this situation with my servers. Should zeromq servers that act as client and server listen and send via different ports? Can that even happen? One side must bind and one side must connect. If both are on the same system they can't have the same port. The connecting side will get a random port anyway, and the kernel will not give you one that is already bound, or otherwise won't let you bind to a port already in use. As for listening (I assume you mean receiving) and sending on the same port (I assume you mean socket), that is how it usually works. But not all ZMQ socket types are bidirectional. Check the docs and look into zmq_poll(). MfG Goswin
Re: [zeromq-dev] Running under valgrind shows a lot of possible data races
On Thu, Jun 26, 2014 at 08:03:08AM +0400, Dmitry Antipov wrote: I've tried to run the hwserver and hwclient examples (taken unmodified from the zguide) under 'valgrind --tool=helgrind' and I see a lot of Possible data race during read of size ... errors (?). Can someone please explain them? Obviously hwclient and hwserver don't share sockets between threads, and all valgrind reports trace down to zeromq internals. Dmitry I believe that is an implementation detail of the internal queues. There is a race between the writer and reader of a queue. But no matter in what order the writer and reader run, data is neither lost nor duplicated. So yes, there is a race, but both possible orderings produce correct behaviour. Unless there is some bug there. You should track down the race, verify that the behaviour is actually intended that way and then annotate the code so helgrind does not show that specific race in the future. Repeat till you get no more reports. MfG Goswin
Re: [zeromq-dev] Running under valgrind shows a lot of possible data races
On Thu, Jun 26, 2014 at 12:03:20PM +0200, Pieter Hintjens wrote: There are various false positives with different valgrind tools. You can maybe catch these in the suppression file? On Thu, Jun 26, 2014 at 10:43 AM, Goswin von Brederlow goswin-...@web.de wrote: On Thu, Jun 26, 2014 at 08:03:08AM +0400, Dmitry Antipov wrote: I've tried to run the hwserver and hwclient examples (taken unmodified from the zguide) under 'valgrind --tool=helgrind' and I see a lot of Possible data race during read of size ... errors (?). Can someone please explain them? Obviously hwclient and hwserver don't share sockets between threads, and all valgrind reports trace down to zeromq internals. Dmitry I believe that is an implementation detail of the internal queues. There is a race between the writer and reader of a queue. But no matter in what order the writer and reader run, data is neither lost nor duplicated. So yes, there is a race, but both possible orderings produce correct behaviour. Unless there is some bug there. You should track down the race, verify that the behaviour is actually intended that way and then annotate the code so helgrind does not show that specific race in the future. Repeat till you get no more reports. MfG Goswin The suppression file is what I meant by annotate the code. MfG Goswin
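A helgrind suppression entry for these internal-queue reports might look like the sketch below. The suppression name is arbitrary and the frame pattern is an assumption; the real frames must be taken from the symbols helgrind actually prints for your libzmq build:

```
{
   zmq_ypipe_internal_race
   Helgrind:Race
   ...
   fun:*ypipe*
}
```

Pass the file with `valgrind --tool=helgrind --suppressions=zmq.supp ...`; running once with `--gen-suppressions=all` prints ready-made entries for each report, which is the easiest way to build the file up until no reports remain.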
Re: [zeromq-dev] auth and metadata in Go (ZeroMQ 4.1.0)
On Thu, Jun 26, 2014 at 11:06:06AM +0200, Peter Kleiweg wrote: I wrote some code in Go to deal with metadata. Here is an example of how to use it: http://godoc.org/github.com/pebbe/zmq4#example-AuthStart This makes the most sense to me. Any suggestions? Note: I prefer functions/methods/variables starting with lower case and types starting with upper case. What is AuthStart() supposed to do? Is that a binding for zauth from czmq? Why isn't it called ZAuthNew, returning a struct ZAuth? Am I right to assume that when you send a multi-part message to a socket with authentication (or is it authorisation?), each frame gets the same metadata attached? So I can read it from the first frame, and don't have to bother with metadata from the other frames? Since all frames come from the same connection they will all have identical metadata. So yes, read it from the first frame of a multi-part message. MfG Goswin
[zeromq-dev] RFH: PUB/SUB + REQ/REP + PPP combo needed
Hi, I need a combination of PUB/SUB and REQ/REP with some form of PPP (Paranoid Pirate Protocol) added into the mix and I wonder how best to do this. I have something in mind but I don't want to influence your thinking. So let's look at it fresh from the outside. Peers: - I have a central master (M) that acts as a controller for a large number of workers, an interface to a MariaDB and internal config settings. - I have a large number of workers (W) connected to the master with heartbeat so the master knows what workers are online. Take that part as given. - I have a small number of clients that users start/stop at any time. A client is a frontend for configuration and for running jobs on the workers (through the master). Message traffic: 1) A client sends simple requests to the master, e.g. set config BAR=foo. The master should ACK the request if it is correct or NACK on error. I can make those messages idempotent, I think. So a client can resend a request till it gets an ACK or NACK back. Simple requests are synchronous, atomic and fast. If they aren't done in 1s then something is wrong. 2) The master tells all clients that a config option has changed, now BAR=foo. That message must not be dropped or clients get out of sync. 3) The master tells all clients that a worker is now online/offline (same as 2 but different source). 4) A client sends a work order to the master, e.g. run date on worker beo-[1-5]. The master should ACK the request and send it to the respective workers. Each worker ACKs the command, sends output for the command as it appears and finally sends a FINISHED for the command including the exit status. The worker output needs to be forwarded to the client. When all workers have sent their FINISHED the master sends a final ALLFINISHED to the client. So this is a complex req/rep pattern with many async replies for a single request. Ideally other clients should be able to subscribe to a running work order too. 
Requirements: - must handle network outage - must handle crash/restart of master (pending requests must not be lost) - must handle crash of client (pending requests can be lost) - clients should use a single socket to make port configuration and tunneling easy. So what are your ideas or recommendations? Note: This uses Python 3 and the latest libzmq/pyzmq, and on the client PySide (Qt). MfG Goswin
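The resend-until-ACK idea in message traffic (1) only works if the master deduplicates resends. A minimal Python sketch of that idempotency, with hypothetical request ids (no sockets, no zmq; names are illustrative):

```python
# Sketch: idempotent request handling on the master. A client resends a
# request until it sees ACK/NACK; the master applies each request id once
# and replays the cached reply for duplicates.
class Master:
    def __init__(self):
        self.config = {}
        self.seen = {}  # request id -> cached reply

    def handle(self, req_id, key, value):
        if req_id in self.seen:      # duplicate resend: replay the old reply
            return self.seen[req_id]
        self.config[key] = value     # apply exactly once
        reply = ("ACK", req_id)
        self.seen[req_id] = reply
        return reply

m = Master()
assert m.handle(1, "BAR", "foo") == ("ACK", 1)
assert m.handle(1, "BAR", "foo") == ("ACK", 1)  # resend is harmless
assert m.config == {"BAR": "foo"}
```

A real master would also expire old entries in `seen`, e.g. once the client confirms it received the reply or after a timeout.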
Re: [zeromq-dev] How to view current SUBs and their filters from PUB side?
No. I mean http://api.zeromq.org/4-0:zmq-socket-monitor MfG Goswin On Thu, Jun 12, 2014 at 04:55:58PM +0800, ? wrote: Hi MfG, Do you mean use zmq_proxy() to monitor inbound messages of a XPUB socket? - Zhichang 2014-06-12 16:06 GMT+08:00 Goswin von Brederlow goswin-...@web.de: You can get infos through a monitoring socket and through a ZAP handler. MfG Goswin On Thu, Jun 12, 2014 at 10:44:49AM +0800, ? wrote: I'm looking for debug and/or log information. Thanks for your suggestion! - Zhichang 2014-06-12 10:25 GMT+08:00 Michel Pelletier pelletier.mic...@gmail.com : That would break the transport abstraction in 0mq, not all transports are IP based. If you're thinking of using IP addresses as a whitelist/blacklist security mechanism, I would suggest looking at the curve security stuff instead. IP based security is easily defeated. http://curvezmq.org/ If you're looking to just log connections, you can get that from the OS, for example on Linux: http://gr8idea.info/os/tutorials/security/iptables5.html -Michel On Wed, Jun 11, 2014 at 7:16 PM, ? yuzhich...@gmail.com wrote: Hi Michel, XPUB looks good. However I want more details info: the SUBs' address info(IP + TCP port). Is that possible? Regards, Zhichang 2014-06-12 9:40 GMT+08:00 Michel Pelletier pelletier.mic...@gmail.com: Subscriptions are received in the form of incoming messages you get by calling recv on the xpub socket. http://api.zeromq.org/3-2:zmq-socket Same as ZMQ_PUB except that you can receive subscriptions from the peers in form of incoming messages. Subscription message is a byte 1 (for subscriptions) or byte 0 (for unsubscriptions) followed by the subscription body. -Michel On Wed, Jun 11, 2014 at 6:26 PM, ? yuzhich...@gmail.com wrote: I don't see routines to dump xpub_t::subscriptions. 
Regards, Zhichang
Re: [zeromq-dev] on scalability of PUB/SUB and PUSH/PULL
On Wed, Jun 11, 2014 at 02:05:13AM -0700, Jun Li wrote: Hi, I am using the PUB/SUB socket pattern to distribute commands from the coordinator to the many worker processes, and I also have PUSH/PULL to have each worker process push the processing results to the coordinator. The coordinator is bound to the PUB socket and also the PULL socket, with the current context set to 1 thread. In my test environment, there would be one single coordinator process and up to 200 worker processes. I have just started the scalability testing. But it seems that with 15 worker processes, the end-to-end communication latency is about 15 ms, for the coordinator to distribute (via PUB) the commands and finally aggregate the results back (via PULL) from the worker processes. But when I increased the number of worker processes to 50, I then observed an end-to-end communication latency of about 80 ms. This implies that as the number of worker processes grows, the latency also grows and thus brings up the scalability issue. You can hardly say anything with just two points. Is that a linear increase? Exponential? Logarithmic? Does it jump between 49 and 50? Does it stay at 80ms up to 100 workers? The message size communicated between the coordinator and the worker processes is not that big, less than 100 Bytes. While I am planning to measure the latency spent on each hop, I would like to seek suggestions: * for a large number of worker processes to be handled by a single coordinator with low latency, should the context at the coordinator be set to 1 thread? * Should I use another socket pattern such as ROUTER/DEALER, instead of PUB/SUB and PUSH/PULL, in order to address the scalability issue? Regards, Jun Personally I think that if you depend on latency then you always have a problem. That will be your bottleneck and seriously harm scalability. You need to pipeline your work: send out more jobs ahead of time while the workers are still busy with the last job. 
That way the latency gets completely absorbed and becomes irrelevant. MfG Goswin
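The pipelining advice can be backed with a back-of-envelope calculation. Only the 80 ms figure comes from the thread above; the per-job processing time is an assumption:

```python
# Back-of-envelope: why pipelining absorbs latency.
latency = 0.080   # round-trip latency observed with 50 workers, in seconds
work    = 0.005   # assumed per-job processing time
jobs    = 1000

lockstep  = jobs * (latency + work)   # wait for each reply before the next send
pipelined = latency + jobs * work     # keep jobs in flight; pay latency once

assert abs(lockstep - 85.0) < 1e-6    # ~85 s, dominated by latency
assert abs(pipelined - 5.08) < 1e-6   # ~5 s, dominated by the actual work
```

With enough jobs in flight, throughput is set by the work time alone, so the 15 ms vs 80 ms difference stops mattering.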
Re: [zeromq-dev] How to view current SUBs and their filters from PUB side?
You can get info through a monitoring socket and through a ZAP handler. MfG Goswin On Thu, Jun 12, 2014 at 10:44:49AM +0800, ? wrote: I'm looking for debug and/or log information. Thanks for your suggestion! - Zhichang 2014-06-12 10:25 GMT+08:00 Michel Pelletier pelletier.mic...@gmail.com: That would break the transport abstraction in 0mq, not all transports are IP based. If you're thinking of using IP addresses as a whitelist/blacklist security mechanism, I would suggest looking at the curve security stuff instead. IP based security is easily defeated. http://curvezmq.org/ If you're looking to just log connections, you can get that from the OS, for example on Linux: http://gr8idea.info/os/tutorials/security/iptables5.html -Michel On Wed, Jun 11, 2014 at 7:16 PM, ? yuzhich...@gmail.com wrote: Hi Michel, XPUB looks good. However I want more detailed info: the SUBs' address info (IP + TCP port). Is that possible? Regards, Zhichang 2014-06-12 9:40 GMT+08:00 Michel Pelletier pelletier.mic...@gmail.com: Subscriptions are received in the form of incoming messages you get by calling recv on the xpub socket. http://api.zeromq.org/3-2:zmq-socket Same as ZMQ_PUB except that you can receive subscriptions from the peers in form of incoming messages. Subscription message is a byte 1 (for subscriptions) or byte 0 (for unsubscriptions) followed by the subscription body. -Michel On Wed, Jun 11, 2014 at 6:26 PM, ? yuzhich...@gmail.com wrote: I don't see routines to dump xpub_t::subscriptions. 
Regards, Zhichang
Re: [zeromq-dev] RFC: finer control of socket type / behaviour
On Tue, Jun 03, 2014 at 12:19:53PM +0200, Pieter Hintjens wrote: It's a valid use case yet you're describing a one-to-one pattern, so perhaps PUSH/PULL or DEALER-DEALER. ROUTER is specifically for servers talking to many clients. There is little sense in returning an EAGAIN there. Servers can't treat individual clients favorably or they tend to crash in unpleasant ways. The different socket types do have different semantics, for good reasons. Sure. I can use PUSH or DEALER. But both of them block when the HWM is hit. That would stop the monitoring. So the problem remains. For a ROUTER example: how about transferring files to a number of clients? You can't use (just) PUB/SUB because you need the clients to ACK to ensure file transmission is complete. So ROUTER/DEALER seems better than having both PUB/SUB to send files and PUSH/PULL for the ACKs (assuming you aren't doing epgm). So you send out a large file to each client. If one of the clients stalls you would want to stop sending it more data until it comes back. Maybe that example isn't so convincing, since thanks to the ACKs you have a simple means to do some credit-based flow control and avoid hitting the HWM in the first place. Anyway. I don't feel quite satisfied with the restriction on the action in mute state based on socket type, and I have seen others complain about it too. I guess I should stop bitching and start writing a patch. Maybe it is such a simple change that it isn't worth fighting about. After all it wouldn't add any new mute state behaviour, just more flexibility in which behaviour is used. MfG Goswin On Tue, Jun 3, 2014 at 9:52 AM, Goswin von Brederlow goswin-...@web.de wrote: On Mon, Jun 02, 2014 at 01:11:21PM +0200, Pieter Hintjens wrote: Returning EAGAIN on a full pipe might be a good improvement, though it's unclear how an app could use this. Blocking seems problematic as it exposes the app to failure when a single peer stops reading its messages. 
An app that does multiple things and needs to remain responsive would want EAGAIN. For example we want to monitor every host in a cluster, collecting data every minute. Now say a switch dies, as they occasionally do, and the collected data can't be sent. The app needs to keep monitoring, so blocking is not an option. Dropping is also bad since that would leave gaps in the monitoring. Instead, if send returned EAGAIN, the app could store the data locally for later submission. Or it could thin out and compress the data: instead of keeping every dataset for every minute in memory, only keep every second minute, every 5 minutes, every hour. The amount of memory can thus be limited without leaving a total blackout in the data. The same example could also work with blocking mode and a send timeout. In an app with high traffic flow, occasionally hitting the HWM could be normal and resolve itself after a second or two. Then blocking with a send timeout would be the better option: basically a delayed EAGAIN, so the app doesn't have to do anything if it outpaces the available bandwidth for a second. I agree that dropping messages is rather brutal in this case. However it's also the most robust policy. You should perhaps not be sending unlimited messages to a peer. There are suggestions in the Guide for credit-based flow control that limits how much gets sent to any single peer. -Pieter On Mon, Jun 2, 2014 at 11:19 AM, Goswin von Brederlow goswin-...@web.de wrote: On Wed, May 28, 2014 at 10:07:38AM +0200, Pieter Hintjens wrote: This has been mooted before and I think it's a good idea in some ways. Certainly to allow experimentation. However the current patterns do kind of cover the sane use cases. It's hard to see what the point would be, for instance, of blocking in a ROUTER socket when you can't send a message. There's no real sense to that. If you want to experiment with this, you can build custom socket types on top of ZMQ_ROUTER and virtualize them, e.g. like the CZMQ zactor model. 
Then if you get patterns that work well you've got arguments for pushing this into libzmq. -Pieter Except I can't change the Action in mute state from Drop to Block or EAGAIN. That means that messages get silently lost when one sends too fast or there is a temporary hiccup in the network connection. E.g. when I reset a switch in the network I want messages to block till the switch is back. MfG Goswin MfG Goswin
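The thin-out idea from earlier in this thread (keep every 2nd minute, then every 5th, and so on) is just downsampling the oldest part of the backlog. A rough Python sketch, with the keep-rates as assumptions:

```python
# Sketch: thin out a backlog of per-minute samples while the network is
# down, so memory stays bounded without a total blackout in the data.
def thin(samples, keep_every):
    """Keep every n-th sample (indices 0, n, 2n, ...)."""
    return samples[::keep_every]

backlog = list(range(60))            # one hour of per-minute readings
assert len(thin(backlog, 2)) == 30   # degrade to one sample per 2 minutes
assert len(thin(backlog, 5)) == 12   # then to one per 5 minutes
assert thin(backlog, 5)[:3] == [0, 5, 10]
```

A real monitor would apply progressively larger `keep_every` values to progressively older data, and could additionally average the dropped samples into the kept ones rather than discarding them outright.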