Re: [zeromq-dev] Sometime fails zmq 3.1 with latest czmq on windows
Pieter,

Do you have any example or instructions on how to properly write a log to a file from C? We do not want to interrupt libzmq on a live production system.

Cheers,
Viet

On Jun 24, 2012, at 1:21 PM, Pieter Hintjens wrote:

> Viet,
>
> Sorry for the slow response. To debug such problems we absolutely need a stack trace of some kind. It should be possible to change the definition of zmq_assert in src/err.hpp to log the error properly on Windows.
>
> -Pieter
>
> On Wed, Jun 20, 2012 at 4:48 AM, Viet Hoang (Quant Edge) viet.ho...@quant-edge.com wrote:
>
>> We have a critical issue with ZMQ right now. The connection type is ZMQ DEALER = ZMQ ROUTER. Our code structure is C#_BIZ = CZMQ = LIBZMQ. We wrote our own C# wrapper. It works fine, but out of nowhere it hits an assertion every day or two. We run as a Windows service, so we could not log the assertion info for debugging.
>>
>> The system was fine with 200-250 connections and 400-500 messages per second (on the router socket). Now there are 500-600 concurrent connections and 1000-1200 messages per second in total, and it is starting to show cracks. The failure is at the context level, as other workers running on the same server stop working. We have to manually restart all services.
>>
>> Cheers,
>> Viet

___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev
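[Editor's sketch] The idea of making an assertion macro log before it aborts can be illustrated in plain C. This is only a minimal sketch, not libzmq's actual zmq_assert definition: the names `log_assert_failure`, `logged_assert` and `ASSERT_LOG_PATH` are hypothetical, and the log path is an assumption (a Windows service would normally want an absolute path, since its working directory is often not where you expect).

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical log location; adjust for the service's environment. */
#define ASSERT_LOG_PATH "zmq_assert.log"

/* Append one line describing a failed assertion.
   Returns 0 on success, -1 if the log file could not be opened. */
static int log_assert_failure(const char *path, const char *expr,
                              const char *file, int line)
{
    FILE *fp = fopen(path, "a");
    if (fp == NULL)
        return -1;
    fprintf(fp, "[%ld] Assertion failed: %s (%s:%d)\n",
            (long) time(NULL), expr, file, line);
    fflush(fp);   /* push the line out before the process dies */
    fclose(fp);
    return 0;
}

/* Hypothetical assert replacement: log first, then abort as before. */
#define logged_assert(x) \
    do { \
        if (!(x)) { \
            log_assert_failure(ASSERT_LOG_PATH, #x, __FILE__, __LINE__); \
            abort(); \
        } \
    } while (0)
```

A call such as `logged_assert(rc == 0);` then leaves one timestamped line in the log before the process terminates. The process still aborts, so this only adds diagnostics and does not change the failure behaviour.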
Re: [zeromq-dev] Sometime fails zmq 3.1 with latest czmq on windows
On Mon, Jun 25, 2012 at 4:46 PM, Viet Hoang (Quant Edge) viet.ho...@quant-edge.com wrote:

> Do you have any example or instructions on how to properly write a log to a file from C? We do not want to interrupt libzmq on a live production system.

I don't have code examples right here, but you might write to the Windows event log. In any case, this happens when a process is already crashing, so it's not going to interrupt anything. You can configure the service to restart automatically, right?

-Pieter
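[Editor's sketch] Writing to the Windows event log from C could look roughly like the following. It is an illustration under assumptions, not code from the thread: the event source name "MyZmqService" and both function names are hypothetical, the Windows branch uses the Win32 calls RegisterEventSourceA, ReportEventA and DeregisterEventSource (link against Advapi32), and the non-Windows branch simply falls back to stderr so the sketch compiles elsewhere.

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Build the message text; returns what snprintf returns. */
static int format_assert_message(char *buf, size_t size, const char *expr,
                                 const char *file, int line)
{
    return snprintf(buf, size, "Assertion failed: %s (%s:%d)",
                    expr, file, line);
}

/* Report a failed assertion to the event log (Windows) or stderr. */
static void report_assert_failure(const char *expr, const char *file, int line)
{
    char msg[512];
    format_assert_message(msg, sizeof msg, expr, file, line);
#ifdef _WIN32
    /* "MyZmqService" is a hypothetical event source name. */
    HANDLE source = RegisterEventSourceA(NULL, "MyZmqService");
    if (source != NULL) {
        const char *strings[1] = { msg };
        /* One insertion string, no category, event ID 0, no raw data. */
        ReportEventA(source, EVENTLOG_ERROR_TYPE, 0, 0, NULL,
                     1, 0, strings, NULL);
        DeregisterEventSource(source);
    }
#else
    fprintf(stderr, "%s\n", msg);
#endif
}
```

A call like `report_assert_failure("rc == 0", __FILE__, __LINE__);` would be placed just before the abort. Older Microsoft compilers may lack a conforming snprintf, so that call may need adapting there.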
Re: [zeromq-dev] Client crash triggers zmq_abort in publisher
Michel,

> It seems that if our client crashes (unrelated to 0MQ) it sometimes takes down the publisher. Below is the stack trace of the issue. It's critical for us that the publisher keeps running when a client crashes.

Martin Hurton has pushed a patch for this issue to the libzmq master. Could you test that and confirm that it fixes the problem?

Thanks
Pieter
Re: [zeromq-dev] Client crash triggers zmq_abort in publisher
Wow, that's fast! I'm still working on creating a usable test case for this :) I will test the patch and let you know.

Michel

- Original Message -
From: Pieter Hintjens p...@imatix.com
To: Michel Polder michel.pol...@quality-it.com, ZeroMQ development list zeromq-dev@lists.zeromq.org
Sent: Monday, June 25, 2012 7:10:10 AM
Subject: Re: [zeromq-dev] Client crash triggers zmq_abort in publisher

Michel,

> It seems that if our client crashes (unrelated to 0MQ) it sometimes takes down the publisher. Below is the stack trace of the issue. It's critical for us that the publisher keeps running when a client crashes.

Martin Hurton has pushed a patch for this issue to the libzmq master. Could you test that and confirm that it fixes the problem?

Thanks
Pieter
Re: [zeromq-dev] Client crash triggers zmq_abort in publisher
On Mon, Jun 25, 2012 at 6:42 PM, Michel Polder michel.pol...@quality-it.com wrote:

> Wow, that's fast! I'm still working on creating a usable test case for this :) I will test the patch and let you know.

Thanks to Martin Hurton :-) A test case that you can use to prove the before/after result would be ideal. You can comment on and/or close the issue at https://zeromq.jira.com/browse/LIBZMQ-389.

-Pieter
Re: [zeromq-dev] Sometime fails zmq 3.1 with latest czmq on windows
We have a broker (Majordomo) together with a few worker instances running on the same server. When the broker fails, it just hangs. Restarting only the broker instance does not solve the problem; we have to restart all services (stop ALL services, then start them again, not restart each service individually). We figured out that the service stops because of an assertion at the context level, which prevents other services from accessing libzmq.

What we are missing is a proper way to log the assertion so that we have debug information. I think libzmq should by default log the assertion to a file for easier tracing and debugging; something like log4c would be good.

Cheers,
Viet

On Jun 25, 2012, at 7:05 PM, Pieter Hintjens p...@imatix.com wrote:

> On Mon, Jun 25, 2012 at 4:46 PM, Viet Hoang (Quant Edge) viet.ho...@quant-edge.com wrote:
>
>> Do you have any example or instructions on how to properly write a log to a file from C? We do not want to interrupt libzmq on a live production system.
>
> I don't have code examples right here, but you might write to the Windows event log. In any case, this happens when a process is already crashing, so it's not going to interrupt anything. You can configure the service to restart automatically, right?
>
> -Pieter
Re: [zeromq-dev] Sometime fails zmq 3.1 with latest czmq on windows
On 25 June 2012 15:29, Viet Hoang (Quant Edge) viet.ho...@quant-edge.com wrote:

> We have a broker (Majordomo) together with a few worker instances running on the same server. When the broker fails, it just hangs. Restarting only the broker instance does not solve the problem; we have to restart all services (stop ALL services, then start them again, not restart each service individually). We figured out that the service stops because of an assertion at the context level, which prevents other services from accessing libzmq.
>
> What we are missing is a proper way to log the assertion so that we have debug information. I think libzmq should by default log the assertion to a file for easier tracing and debugging; something like log4c would be good.

It's Windows: attach a remote debugger through Visual Studio. Alternatively, use procdump.exe (http://technet.microsoft.com/en-us/sysinternals/dd996900.aspx) on the hung process and import the mini-dump into Visual Studio.

It's possible to replace assert() with something that raises a fatal log message and a stack trace, but you have other, more convenient means available to try first.

--
Steve-o
Re: [zeromq-dev] missing events ZMQ_FD / ZMQ_EVENTS
On Wednesday, May 02, 2012 03:27:42 AM Paul Colomiets wrote:

> Ok. Behind the scenes, ZMQ_FD is basically a counter, which wakes up poll when it is non-zero. The counter is reset on each getsockopt ZMQ_EVENTS, zmq_send and zmq_recv. The following diagram shows a race condition with two sockets A and B, in a scenario similar to yours:
>
> https://docs.google.com/drawings/d/1F97jpdbYMjjb6-2VzRtiL2LpHy638-AEOyrUX84HL78/edit
>
> Note: the last poll is entered with both counters set to zero, so it will not wake up, despite the fact that there is a pending message.

Was there ever a resolution on this? I am using ZMQ_FD now to integrate into an event loop, and I am seeing some odd behavior when testing a hello world REQ/REP on the REP side.

The REP server binds and waits for data. The fd is indicated as readable twice. First, the events are 0 (maybe this happens when the client connects?), then the events are 1 (ZMQ_POLLIN). The server considers the REP socket readable and so it reads a message without blocking. Now it wants to reply, but it considers the socket not yet writable. I was expecting that after reading from the socket, the fd would be indicated as readable and the events would be 2 (ZMQ_POLLOUT). However, this event never comes and so the server just idles.

Now here's where it gets weird: if I kill the client (which was also waiting around, as it never got a reply), then the server gets new events with ZMQ_POLLOUT set. This causes the server to finally write its reply to the REP socket, without blocking. Of course there is no client, so this write goes into a black hole.

My guess is that the events change with ZMQ_POLLOUT is somehow being backlogged, and the client disconnect helps push the queue another step forward. I found that if I query ZMQ_EVENTS immediately after reading from the REP socket, I can see ZMQ_POLLOUT being flagged even though I never got a read indication on the fd.

Does this mean that I need to check ZMQ_EVENTS not only after read indications on the fd, but also any time I call zmq_recv()?

Thanks for any help.
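[Editor's sketch] The behaviour described above can be reproduced with a toy model in plain C. This is an assumption-laden illustration, not libzmq code: `toy_socket` and every `toy_*` function are hypothetical stand-ins, and the model only encodes what the thread states, namely that the fd signal is reset by reading the events, by receiving and by sending. It contrasts a handler that acts once per fd wake-up with one that re-checks the events after every operation.

```c
#include <stdbool.h>

enum { EV_IN = 1, EV_OUT = 2 };  /* same values as ZMQ_POLLIN / ZMQ_POLLOUT */

typedef struct {
    bool fd_signaled;  /* stands in for "the fd polls readable" */
    int  events;       /* stands in for what ZMQ_EVENTS would report */
} toy_socket;

/* A request arrives: socket becomes readable and the fd is signalled. */
static void toy_request_arrives(toy_socket *s)
{
    s->events = EV_IN;
    s->fd_signaled = true;
}

/* Reading the events resets the signal, as described in the thread. */
static int toy_get_events(toy_socket *s)
{
    s->fd_signaled = false;
    return s->events;
}

/* Receiving makes the REP socket writable but raises no new fd signal. */
static void toy_recv(toy_socket *s)
{
    s->events = EV_OUT;
    s->fd_signaled = false;
}

static void toy_send(toy_socket *s)
{
    s->events = 0;
    s->fd_signaled = false;
}

/* Naive handler: one operation per fd wake-up. Returns replies sent. */
static int handle_once(toy_socket *s)
{
    if (!s->fd_signaled)
        return 0;                 /* poll would still be blocked here */
    int ev = toy_get_events(s);
    if (ev & EV_IN) {
        toy_recv(s);
    } else if (ev & EV_OUT) {
        toy_send(s);
        return 1;
    }
    return 0;
}

/* Draining handler: re-check the events after every recv or send. */
static int drain(toy_socket *s)
{
    int replies = 0;
    for (;;) {
        int ev = toy_get_events(s);
        if (ev & EV_IN) {
            toy_recv(s);
        } else if (ev & EV_OUT) {
            toy_send(s);
            replies++;
        } else {
            break;                /* nothing pending: safe to poll again */
        }
    }
    return replies;
}
```

In this model the naive handler reads the request and then stalls with EV_OUT pending and no fd signal, which matches the idle server described above, while the draining handler sends its reply. With a real socket the equivalent step would be querying ZMQ_EVENTS via zmq_getsockopt after each zmq_recv or zmq_send before returning to the event loop.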
[zeromq-dev] ZeroMQ Tracing
I've been using ETW in Windows and it's pretty neat. I would consider an option to implement an equivalent of Vista's new Winsock Tracing for ZeroMQ:

http://msdn.microsoft.com/en-us/library/windows/desktop/bb892103(v=vs.85).aspx

It's cute, but I'm not sure of its exact value.

--
Steve-o