Re: [zeromq-dev] [ANN] zmq.rs - native stack of ØMQ in Rust
Hi, On Fri, Jul 4, 2014 at 8:08 PM, Pieter Hintjens p...@imatix.com wrote: If GitHub had a configurable one-time "I agree to submit all my patches to this project under license XYZ" that would be workable IMO. There is a way to put contributor guidelines in front of every pull request created: https://github.com/blog/1184-contributing-guidelines Isn't that all that's needed? -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] ZMQ dealer not receiving message from ZMQ router with ZMQ_FD
Hi, On Wed, Oct 31, 2012 at 9:02 PM, Kah-Chan Low kahchan...@yahoo.com wrote: Thanks Paul! I followed your advice and checked for ZMQ_EVENTS after each zmq_send() and it worked! I am still puzzled. 1. Checking ZMQ_EVENTS somehow resets the trigger, even if there is no message to be read. Why is this necessary? All of zmq_send(), zmq_recv(), and reading ZMQ_EVENTS reset the trigger; this is how the internals of ZeroMQ work. 2. The ZMQ router does not need a ZMQ_EVENTS check after zmq_send() for the read trigger to work properly. Why? I believe every socket needs that. The reason it works for you is just that there happen to be no messages to be received when you do zmq_send(). For instance, this is the case when you send a single request at a time (the requester sends a request, then sends the next one only after the reply to the previous one is received). -- Paul
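The reset-on-access behavior described above can be modeled in a few lines. This is a hypothetical pure-Python sketch of the semantics, not the real libzmq internals: the ZMQ_FD signal is edge-triggered, and reading ZMQ_EVENTS (like zmq_send() and zmq_recv()) re-arms it, so the safe pattern is to keep consuming until ZMQ_EVENTS no longer reports ZMQ_POLLIN.

```python
# Toy model of ZMQ_FD / ZMQ_EVENTS semantics (names are illustrative).
class FakeZmqSocket:
    def __init__(self):
        self.inbox = []        # messages queued by the I/O thread
        self.signaled = False  # models the ZMQ_FD becoming readable

    def deliver(self, msg):    # I/O thread enqueues a message
        self.inbox.append(msg)
        self.signaled = True

    def events(self):          # getsockopt(ZMQ_EVENTS): resets the fd
        self.signaled = False
        return "POLLIN" if self.inbox else ""

    def recv(self):            # zmq_recv(): also resets the fd
        self.signaled = False
        return self.inbox.pop(0)

def drain(sock):
    """Correct pattern: re-check ZMQ_EVENTS until POLLIN is gone."""
    out = []
    while "POLLIN" in sock.events():
        out.append(sock.recv())
    return out

sock = FakeZmqSocket()
sock.deliver("m1")
sock.deliver("m2")
received = drain(sock)
```

If the loop stopped after the first recv() without re-checking, the re-armed fd would never fire again for the second message.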
Re: [zeromq-dev] ZMQ dealer not receiving message from ZMQ router with ZMQ_FD
Hi, On Wed, Oct 31, 2012 at 5:03 AM, Kah-Chan Low kahchan...@yahoo.com wrote: I have read about the caveats of using ZMQ_FD, so once an event is triggered, I use ZMQ_EVENTS to test for ZMQ_POLLIN before calling zmq_recv(). I also make sure that I read all messages off a socket once a read event is triggered. [...] I have read that some people had similar problems and they were advised to keep reading until EAGAIN before calling select(). I can't do that since the ZMQ dealer is only one of a number of sockets owned by the thread, and any one of them may receive a message at any time. I'm not sure how those two paragraphs are connected. Do you check ZMQ_EVENTS until the socket becomes unreadable, instead of doing zmq_recv until EAGAIN? If you do, that's OK. You must also check ZMQ_EVENTS after each zmq_send, before calling select. -- Paul
Re: [zeromq-dev] announcing Zurl
Hi Justin, On Wed, Sep 26, 2012 at 8:04 AM, Justin Karneges jus...@affinix.com wrote: Hi folks, I want to share a project I've been working on called Zurl. It's a server with a ZeroMQ interface that makes outbound HTTP requests. Think of it as the inverse of Mongrel2 or Zerogw. This is the project I was discussing earlier on the list that needed two input sockets. I've made it open source: https://github.com/fanout/zurl Introduction article here: http://blog.fanout.io/2012/09/26/make-http-requests-over-zeromq-with-zurl/ Not much in the way of docs at the moment, but the sample scripts in the tools subdir give you an idea of what's possible. Feedback welcome. Nice work. I'm looking for something similar to benchmark my web applications. However, most of our applications are websocket-driven. Any chance websockets will be supported? -- Paul
Re: [zeromq-dev] obtain generated identity of socket
Hi Justin, On Tue, Sep 25, 2012 at 1:08 AM, Justin Karneges jus...@affinix.com wrote: Protocol flow goes like this: 1) client pushes a start request that gets picked up by one of the server instances 2) server pubs a clear-to-send response that includes a reply address field, containing the value of the in_stream identity 3) client sends a series of messages to in_stream, addressed using the reply address As your transfers seem big, I think it's OK to open a separate (PUSH) socket for each transfer (or just have a separate socket connected to each server), i.e. in the clear-to-send message the server gives an address to connect to for sending messages. Creating a socket for each transfer is an antipattern if you are doing it a hundred times per second, but for your use case it may be cleaner than a ROUTER-to-ROUTER connection. -- Paul
Re: [zeromq-dev] High water mark notification for publisher
Hi Edwin, On Sun, Sep 23, 2012 at 6:23 AM, Edwin Amsler edwinams...@thinkboxsoftware.com wrote: I have an out-of-band mechanism over TCP that re-requests pieces once the transfer is done, but I'm never actually sure when it's done sending, so I just wait 1 minute before re-requesting. If I had some indicator of whether or not a message goes missing, I could re-transmit or throttle back the 500MB/s to what the network is actually able to provide. I think you can also have an out-of-band PUB-SUB over TCP to send acks from clients to the server. You need some way to count the acks (e.g. clients may send a presence notification before the start of a transfer), but it may give you a way to limit bandwidth in a more automatic fashion. You may also look at the ZMQ_RATE and ZMQ_RECOVERY_IVL options; they should (theoretically) give you reliability without an out-of-band socket to request data, but I don't know how well that works in practice. -- Paul
Re: [zeromq-dev] Odd numbers with zeromq
Hi Maninder, On Wed, Sep 19, 2012 at 6:21 AM, Maninder Batth whatpuzzle...@gmail.com wrote: Paul, With messages being sent one way, via pub and sub sockets, I am getting very decent performance. About 80% of our network gets saturated. The code is zserver.cpp and zclient.cpp. But if I configure the software such that the client only sends the next message after it has received a response from the server, the throughput is really bad. The code is zserver-ack1.cpp and zclient-ack1.cpp. The difference is that in the former case I can get 110k messages per second, whereas in the latter case I can only get 1k messages per second. The sockets I use in the latter case are of type REQ and REP. Am I using the wrong socket types?
1. Measure the latency (time for a single roundtrip). If the roundtrip is about 1 ms (which is OK for most networks, I think), then you certainly can't do more than 1k roundtrips per second.
2. You can saturate the bandwidth with multiple clients (even if you need 110 of them; each client can be asynchronous).
3. You will saturate the bandwidth with larger messages (and at least two clients, if it's a full-duplex network :) )
-- Paul
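The arithmetic behind point 1 is worth making explicit. A back-of-the-envelope sketch (illustrative numbers, not a benchmark): with a strict request/reply lockstep, throughput is bounded by round-trip latency, not by bandwidth.

```python
# Lockstep REQ/REP: one outstanding request at a time, so the rate
# is capped by the round-trip time, regardless of link speed.
def max_roundtrips_per_sec(rtt_seconds):
    return 1.0 / rtt_seconds

# A 1 ms round trip caps a single synchronous client at ~1000 req/s,
# which matches the ~1k messages/s the poster observed.
cap = max_roundtrips_per_sec(0.001)

# To reach the 110k msg/s seen with one-way PUB/SUB, you would need
# roughly this many lockstep clients (each limited to ~1k req/s):
clients_needed = 110_000 / cap
```

This is why adding clients (or pipelining with DEALER sockets) recovers throughput without changing the network at all.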
Re: [zeromq-dev] Odd numbers with zeromq
Hi Maninder, On Tue, Sep 18, 2012 at 10:21 PM, Maninder Batth whatpuzzle...@gmail.com wrote: Paul, Here is the number of messages as seen by the server in one second. Each message is 1024 bytes excluding tcp/ip and zmq headers. Based on these numbers I am getting a throughput of 1.4 Gb/sec. Enclosed is the source code for the server and the client. ZeroMQ clears the message after sending, so you are effectively sending zero-length messages after the first one. You should use zmq_msg_copy (or whatever the C++ API provides) before calling send() if you want to reuse a message. -- Paul
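The move-on-send semantics can be illustrated with a toy model. This is a hypothetical pure-Python sketch, not the real zmq_msg_t API: send() takes ownership and empties the caller's message, so re-sending the same object sends zero bytes unless you copy it first.

```python
# Toy model of zmq_msg move-on-send semantics (illustrative names).
class Msg:
    def __init__(self, data=b""):
        self.data = data
    def copy(self):               # models zmq_msg_copy()
        return Msg(self.data)

class Sock:
    def __init__(self):
        self.sent = []
    def send(self, msg):          # models zmq_send(): steals the payload
        self.sent.append(msg.data)
        msg.data = b""            # caller's message is now empty

good = Sock()
msg = Msg(b"x" * 1024)
good.send(msg.copy())             # correct: send a copy each time
good.send(msg.copy())
good.send(msg)                    # the last send may move the original

bad = Sock()
m2 = Msg(b"y" * 1024)
bad.send(m2)
bad.send(m2)                      # bug: m2 was emptied by the first send
```

The buggy loop still "succeeds", which is why the benchmark reports an impossibly high message rate: most of the messages on the wire are empty.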
Re: [zeromq-dev] ZeroMQ context threads names
Hi Samuel, On Mon, Sep 17, 2012 at 11:49 PM, DATACOM - Samuel Lucas samuellu...@datacom.ind.br wrote: Hi, I'm writing a Linux application that uses zeromq. I'm also writing a CPU usage monitor for my application that takes into account all threads (reading from /proc/<pid>/task/<tid>/stat). I would like to set the thread name (prctl, PR_SET_NAME) for my zeromq threads so I can easily identify them. Is there any way to do this? As it's very Linux-specific, you can probably just write to /proc/<pid>/task/<tid>/comm. I think all the threads are started when you call zmq_init, so it should be easy to identify them. -- Paul
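A Linux-only sketch of the /proc/comm approach (the same mechanism prctl(PR_SET_NAME) uses). This names the current thread; an external tool could similarly walk /proc/<pid>/task/ right after zmq_init() and rename the freshly started I/O threads. The function names here are illustrative.

```python
# Linux-specific: thread names live in /proc/self/task/<tid>/comm.
import threading

def set_thread_name(name):
    tid = threading.get_native_id()          # kernel thread id
    with open(f"/proc/self/task/{tid}/comm", "w") as f:
        f.write(name[:15])                   # kernel truncates to 15 chars

def get_thread_name():
    tid = threading.get_native_id()
    with open(f"/proc/self/task/{tid}/comm") as f:
        return f.read().strip()

set_thread_name("zmq-io-0")
name = get_thread_name()
```

With the threads named, the per-thread CPU monitor can read the name alongside /proc/<pid>/task/<tid>/stat and report human-readable figures.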
Re: [zeromq-dev] zmq_monitor
Hi Paul, On Thu, Sep 13, 2012 at 1:39 AM, Justin Cook jhc...@gmail.com wrote: On Wednesday, 12 September 2012 at 23:16, Paul Colomiets wrote: In my opinion it's wrong to provide a callback interface for ZMQ_MONITOR in scripting languages. Thoughts? "Scripting language" is fairly vague. Python is an interpreted language that is used in large codebases. What's wrong with providing a callback for ZMQ_MONITOR?

AFAICS, the callback is called directly in the I/O handling thread, so it hurts performance a lot. And by scripting language I mean a language with much larger function-call overhead than C, and maybe with a GIL :)

MinRK has a valid point. Is it just too much work and/or expensive to put this on sockets and not the context? It makes 100% sense to put this on individual sockets versus the entire context.

Yes and no. Please describe use cases to prove the point. For the logging case the context option is better. It's also better to force developers not to do business logic in the callback. Providing logging has the side effect of establishing a standard for logging ZeroMQ errors, and of being a single setting to turn on instead of writing callbacks. It's also future-proof in case ZeroMQ adds more monitor events (pyzmq will add a message type and there will be no need to change the application). Providing statistics has the benefit of collecting them in C (i.e. faster) without the end user writing C code. So what are the use cases for a callback? -- Paul
Re: [zeromq-dev] Odd numbers with zeromq
Hi, On Thu, Sep 13, 2012 at 7:33 PM, Maninder Batth whatpuzzle...@gmail.com wrote: 2. Clients which publish without needing an ack: In this use case, a client publishes data as fast as it can in one direction and the server simply discards the output. Enclosed are the files zserver.cpp and zclient.cpp which accomplish this. What is puzzling to me is that I have a 1Gb network, but based on the numbers published by zclient.cpp, I am able to publish 5GB in one second? With a message size of 1KB, I am able to publish 5157783 messages per second. How is this possible? You can publish at any speed, but do you check what is actually read at the other end? A PUB socket just discards messages internally when the output queue is full. -- Paul
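The silent-drop behavior can be sketched with a toy model (illustrative, not libzmq internals): send() on a PUB socket never blocks, and once a subscriber's pipe is at the high-water mark, further messages to that subscriber are simply discarded while the sender still sees success.

```python
# Toy model of a PUB socket's per-subscriber pipe at the HWM.
class PubPipe:
    def __init__(self, hwm):
        self.hwm = hwm
        self.queue = []
        self.dropped = 0
    def push(self, msg):
        if len(self.queue) >= self.hwm:
            self.dropped += 1      # silent drop: sender never notices
        else:
            self.queue.append(msg)

pipe = PubPipe(hwm=1000)
for i in range(5000):              # "publish at any speed"
    pipe.push(i)
```

This is why the client-side counter can report 5M msgs/s on a 1Gb link: it counts calls to send(), not messages that actually crossed the network.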
Re: [zeromq-dev] Problem with Intergation of libzmq 3.2 with libev
Hi, On Thu, Sep 13, 2012 at 12:37 PM, Tejaswi, Saikrishna saikrishna.teja...@in.verizonbusiness.com wrote: Can you kindly validate these changes and see if there are any repercussions in sending the activate_read even if the reader_thread is active AFAICS, you've made zeromq a thousand times slower (speculating, no benchmarks done). What you really need is:
1. Read all messages from the input queue until EAGAIN is returned from the zmq_recv* call.
2. Read all messages after each zmq_send* call.
We have been using libzmq with libev for years in zerogw (http://zerogw.com), and I doubt anything relevant changed in zmq 3.2. Zerogw isn't the easiest piece of code to read, but it may help. -- Paul
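The two steps above can be sketched as a drain loop. This is a hypothetical model for an edge-triggered reactor like libev (the socket and exception names are illustrative): after the fd fires, and after every send, read until the socket reports "would block".

```python
# Sketch of the drain-until-EAGAIN pattern for edge-triggered loops.
class Again(Exception):
    pass                           # models zmq_recv returning EAGAIN

class FakeSock:
    def __init__(self, msgs):
        self.msgs = list(msgs)
    def recv_noblock(self):
        if not self.msgs:
            raise Again()
        return self.msgs.pop(0)

def drain(sock, handler):
    """Call this on fd readability AND after each zmq_send* call."""
    while True:
        try:
            msg = sock.recv_noblock()
        except Again:
            return                 # queue empty: safe to re-enter the loop
        handler(msg)

got = []
drain(FakeSock(["a", "b", "c"]), got.append)
```

Stopping after one message instead of draining is the classic way to stall an edge-triggered integration: the fd will not fire again for messages already queued.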
Re: [zeromq-dev] finding the right message pattern
Hi David, On Tue, Sep 11, 2012 at 3:21 PM, David Kaufman david.kauf...@gmx.de wrote: Hi ZeroMQ Community, I'm currently working on a small project where multiple machines output data from a recording device (e.g. audio data). Note that a recording device is just another zeromq-capable client. A master machine collects the output of each recording device. The recording device is selected, initiated and stopped by the master machine. To sum up, the master has to communicate with each recording device in a bidirectional fashion. The request/reply message pattern works but produces unnecessary overhead, since the master machine has to reply to every data packet the recording device sends. Is there a cleaner solution, perhaps a paired socket? Do you have any suggestions? Contrary to the other proposal, I advise using two patterns: 1. PUB-SUB for recording, where the recording device is the publisher and the master is the subscriber. 2. Either ROUTER-to-REP, or PUB-SUB (where the master is the publisher), for control info. It looks like a cleaner design and allows you to scale by having failover/redundant master servers, introducing devices (proxies), and so on. -- Paul
Re: [zeromq-dev] zmq_monitor
Hi Benjamin, On Thu, Sep 13, 2012 at 12:48 AM, MinRK benjami...@gmail.com wrote: Hello, We've been investigating adding support for the experimental new monitor functionality in pyzmq, and I have some questions. The callback is per-context, which makes no sense to me. If I wanted to monitor socket events on one socket out of one hundred, I have only two choices: make the callback aware of which socket(s) it cares about, and try to minimize the disruption when called on the 99 sockets I don't care about (troublesome in Python, as acquiring the GIL from the io_thread is already problematic), or register monitored sockets with one Context and non-monitored sockets with another.

1. None of the conditions that trigger the monitor callback are in hot code paths. 2. They are not intended for business logic, so the only thing you should probably do is find a name for the socket and log the message.

Can someone explain why the monitor is per-context and not per-socket? I saw on the list that it was added to the Context for a cleaner interface than cramming it into setsockopt. Is there any reason that zmq_ctx_set_monitor is preferable to zmq_socket_set_monitor, which would solve the exact same problem without a fundamental change in how it is meant to work?

Not that I support this implementation, but it's OK for me. For the Python implementation, I think you should provide something along these lines:
1. Write the monitoring message to another socket (possibly inproc).
2. Write the message to a specified logger directly (as you need to hold the GIL at the moment of writing, and Python logging is quite heavyweight, this option may be implemented with another thread and an inproc socket).
3. Collect the number of occurrences of each event in the socket object, so it can be periodically inspected by the main loop and sent to statistics-collection software.
In my opinion it's wrong to provide a callback interface for ZMQ_MONITOR in scripting languages. Thoughts?
-- Paul
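Option 3 above can be sketched in a few lines. This is a hypothetical illustration, not the pyzmq API: events are counted cheaply as they arrive, and the main loop periodically ships the totals to a stats collector, so no user callback ever runs in the I/O thread.

```python
# Illustrative per-socket monitor-event counter (option 3 above).
from collections import Counter

class SocketEventCounter:
    def __init__(self):
        self.counts = Counter()
    def on_event(self, event_name):   # cheap, called once per event
        self.counts[event_name] += 1
    def snapshot_and_reset(self):     # polled periodically by the main loop
        snap = dict(self.counts)
        self.counts.clear()
        return snap

mon = SocketEventCounter()
for ev in ["CONNECTED", "DISCONNECTED", "CONNECTED", "CONNECTED"]:
    mon.on_event(ev)
snapshot = mon.snapshot_and_reset()
```

The same counter shape works whether the events arrive over an inproc PAIR socket (option 1) or directly from C-level counting.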
Re: [zeromq-dev] Using libsdp and zeromq
On Mon, Aug 27, 2012 at 4:28 PM, Michael Rissi michael.ri...@dectris.com wrote: Hello zeromq devs! I tried to use zeromq with a preloaded libsdp over InfiniBand. This unfortunately fails. The reason is the SOCK_CLOEXEC flag set in ip.cpp. Libsdp checks the socket type to be exactly SOCK_STREAM. If it is not (as in zeromq, where the type is SOCK_STREAM | SOCK_CLOEXEC), it falls back to TCP/IP. This can be found at line 681 of port.c in the libsdp library (download at http://www.openfabrics.org/downloads/libsdp/ ). Being clever in zeromq and unsetting HAVE_SOCK_CLOEXEC will not help, as the zeromq server will crash sooner or later when a client exits. Nothing will crash. It will leak the socket descriptor if you call exec, or fork then exec, which is usually avoidable in zeromq apps. To avoid the descriptor leak you can use one of the techniques described here: http://stackoverflow.com/questions/899038/getting-the-highest-allocated-file-descriptor -- Paul
Re: [zeromq-dev] Keeping processes up and running
On Thu, Aug 16, 2012 at 12:56 PM, andrea crotti andrea.crott...@gmail.com wrote: Now I have many nice processes that do their job communicating via zeromq sockets, and it's all very nice. But how do you ensure that they are running all the time? For example I would like to always have between 50 and 75 workers running, but I also want to be able to stop all or some of them, and restart only the number I need. Is it better to handle all these things from the operating system or from another manager process? Yes, you need a process manager, and it's not related to zeromq. There is one from me: https://github.com/tailhook/procboss It is suited to running many similar processes (it lacks docs, so feel free to contact me privately to ask). There are also plenty of others: runit, daemontools, supervisord, sysvinit, systemd, upstart, launchd, just to name a few. AFAICS, none of them is suited to running tons of similar processes, but YMMV. And Salt has no process supervision, has it? It can run a service or check whether it is alive, but it is also not suited to running lots of identical workers. Please point me to the docs if I'm wrong. -- Paul
Re: [zeromq-dev] Designing a new architecture
On Tue, Aug 7, 2012 at 4:47 PM, andrea crotti andrea.crott...@gmail.com wrote: So now I tried the following: the smart process runs a subprocess in the background, and samplecmd should send the LIST query, but it just hangs there and I don't get any answer. Is there anything missing (I can't find anything)? Are you building something like Salt? (http://saltstack.org) I think you should try Salt, or look at its codebase. -- Paul
Re: [zeromq-dev] [ANNOUNCE] zerogw mailing list
Hi Thomas, On Thu, Aug 2, 2012 at 7:33 PM, Thomas S Hatch thatc...@gmail.com wrote: This looks really cool, is it production ready? I might use it with Salt We have been running it in production for a few months without any big problems. Of course it's not as well tested as nginx or Apache. -- Paul
[zeromq-dev] [ANNOUNCE] zerogw mailing list
Hi, Due to recently grown interest in zerogw, I've set up a mailing list: zer...@googlegroups.com Feel free to join. For those who don't know what zerogw is, here is a short description: Zerogw is a fast HTTP server with a backend protocol that uses zeromq as transport. Zerogw grew out of the web game industry, so it supports websockets with fallback to long polling, and a few very powerful websocket-related features which help to write very efficient chat, gaming, and other near-real-time applications. -- Paul
Re: [zeromq-dev] If not Majordomo for asynchronous workers, then what?
Hi Ivan, On Sat, Jul 28, 2012 at 5:34 PM, Ivan i...@blackpx.com wrote: Modifying MDP and the implementations will be simpler than writing your own from scratch. Right. I believe what I describe above is the correct modification to the Broker. I have to understand the magic by which the broker knows how to route a message back to the client. It would be a great addition to MDP if I could call a function that, given a message, would break the multipart message for me. This way I could hash the route to the client based on some key, in my case, OrderID.

There are actually two schools here. The first proposes to modify the broker, as you described. The second requires decomposing the communication into patterns. In your case it goes as follows:
1. You have two connections: request/reply and pub/sub.
2. Before doing a request, you create an OrderID and subscribe to that OrderID on pub/sub.
3. Then you do the request. There is no reply until the whole order is processed.
4. The order status updates are sent over pub/sub.
5. When everything is done, you receive the final reply on the request.
I usually choose the latter approach, for the following reasons:
1. The solution consists of two clearly defined patterns.
2. It leverages zeromq's ability to introduce intermediate nodes (as it uses bare patterns).
3. It allows running brokerless when the setup is small (e.g. for development).
4. It allows using simple request-reply for clients which do not need intermediate statuses.
5. It allows another peer/application to follow intermediate statuses (giving more flexibility to the client).
-- Paul
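The five-step flow above can be sketched with pure-Python stand-ins for the two connections (a Queue models the PUB/SUB link, a direct call models REQ/REP; all names are illustrative, not MDP or libzmq APIs).

```python
# Sketch of the two-pattern order flow: REQ/REP for the request and
# final reply, PUB/SUB (modeled by a queue) for status updates.
import queue
import uuid

class Broker:
    def __init__(self, status_bus):
        self.status_bus = status_bus          # PUB side
    def handle_request(self, order_id, work):
        for step in work:                     # intermediate statuses
            self.status_bus.put((order_id, step))
        return (order_id, "done")             # final reply on REQ/REP

status_bus = queue.Queue()                    # models the PUB/SUB link
broker = Broker(status_bus)

order_id = str(uuid.uuid4())                  # step 2: client creates OrderID
# client "subscribes" to order_id, then issues the request (step 3):
reply = broker.handle_request(order_id, ["accepted", "processing"])

updates = []
while not status_bus.empty():                 # step 4: SUB side drains updates
    oid, status = status_bus.get()
    if oid == order_id:                       # subscription filter
        updates.append(status)
```

Note how a client that doesn't care about intermediate statuses can simply skip the subscription and still get the final reply, which is reason 4 above.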
Re: [zeromq-dev] Notify send()er that they've hit the high water mark
Hi Edwin, On Mon, Jul 9, 2012 at 11:24 PM, Edwin Amsler edwinams...@thinkboxsoftware.com wrote: So here I am, publishing messages through ZeroMQ's send() function at about 300MB/s, and my network's set to only send at 10MB/s. This is kind of a big problem because, while I don't care if the clients lose data on their end, the server is either using its memory until it crashes, or I'm setting its high water mark and losing about 29 of every 30 messages I produce because I don't know that ZeroMQ can't keep up. Ideally, when a HWM condition happened, send() would return false, then I'd test EAGAIN so I could decide for myself whether I should drop the message or retry later. With that kind of functionality, I could throttle back my producer algorithm so that I exactly meet the demand of ZeroMQ instead of overwhelming/starving it. I'm willing to do the work if this sort of addition makes sense to the rest of the project. I'd rather contribute here instead of forking it off into some forgotten repository. Can/should this be done? Is there someone out there willing to mentor me?

The behavior is intentional for PUB/SUB sockets. If you had only one subscriber you could use PUSH/PULL. PUSH sockets block when they reach the high-water mark, and so do REQ sockets. PUB/SUB sockets can't reliably block in the general case because there can be multiple subscribers, only one of which has reached the high-water mark. Partially this comes from the implementation: when you call zmq_send(), zeromq starts pushing the message onto a pipe for each connected subscriber; if one of the pipes is full, there is no way to roll back the messages already put into the other pipes, and even if there were, it's not clear whether a single slow subscriber should stop the publisher. Also, PGM sockets can't have backpressure, AFAIU. You could try to implement the behavior as a socket option (it can't be the default behavior), but you have to be aware of the problems above (and I don't know if the core developers are willing to accept the patch).
-- Paul
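The contrast between the two behaviors can be made concrete with a toy model (illustrative, not libzmq internals): PUSH reports backpressure at the HWM (EAGAIN in non-blocking mode), while PUB silently drops, because it cannot roll back a message already written into some subscribers' pipes.

```python
# Toy contrast: PUSH backpressure vs PUB silent drop at the HWM.
class Full(Exception):
    pass                           # models EAGAIN from a PUSH socket

def push_send(q, hwm, msg):
    if len(q) >= hwm:
        raise Full()               # sender learns about backpressure
    q.append(msg)

def pub_send(pipes, hwm, msg):
    for q in pipes:
        if len(q) < hwm:
            q.append(msg)          # delivered to fast subscribers
        # a full pipe just misses the message: no error, no rollback

push_q = []
blocked = 0
for i in range(5):
    try:
        push_send(push_q, hwm=3, msg=i)
    except Full:
        blocked += 1               # producer can throttle or drop here

fast, slow = [], [0] * 3           # slow subscriber already at HWM=3
pub_send([fast, slow], hwm=3, msg="m")
```

The PUSH side is exactly the throttling hook Edwin is asking for; the PUB side shows why the same hook is ambiguous with multiple subscribers.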
Re: [zeromq-dev] Websockets as a Transport ?
Hi Mark, Ian, On Thu, Jul 5, 2012 at 12:44 AM, Ian Barber ian.bar...@gmail.com wrote: I'm evaluating ZeroMQ, and one main criterion we have is to support Websockets for client-to-server communications. Has there been any work on making a native Websocket transport for ZeroMQ in the core C++ libraries? No work that made it anywhere near head, but as Apostolis said there are plenty of bridging methods. My personal favourite is Paul's ZeroGW: https://github.com/tailhook/zerogw Thanks Ian, I appreciate your respect :) However, zerogw is designed with browser clients in mind; Mark's use case may be a bit different. Although I would support using zerogw for other applications, I'm not going to implement outbound websocket connections at the current stage of zerogw's evolution, so you would need a complementary client implementation. -- Paul
Re: [zeromq-dev] Releasing RAM in Chunks
Hi Anatoly, On Mon, Jul 2, 2012 at 9:48 PM, Anatoly tolit...@gmail.com wrote: ØMQ Crowd, We read messages from a socket and place them on the zmq queue (PUSH). On the other side we read messages off the queue (PULL) and persist them into a DB. If we get millions of messages, ØMQ takes X GB of RAM (since pushing in this case is much faster than pulling, which waits on the DB for each pull), and does not release these gigabytes until ALL of the messages are consumed (i.e. pulled). Is there a way to configure it to release memory in chunks as the queue is being emptied? Unless you are doing something wrong (as others described), the effect you see may be caused by memory fragmentation. As a last resort you can try linking your app against jemalloc, as there is evidence that jemalloc handles fragmentation better. In any case, please report what fixed the problem. -- Paul
Re: [zeromq-dev] HWM behaviour blocking
Hi Justin, On Thu, Jun 28, 2012 at 9:06 PM, Justin Karneges jus...@affinix.com wrote: It's really just for functional completeness of my event-driven wrapper. The only time I can see this coming up in practice is an application that pushes a message just before exiting. For now, I set ZMQ_LINGER to 0 when a socket object is destroyed, making the above application impossible to create. What I'm thinking of doing now is offering an alternate, blocking-based shutdown method. This would violate the spirit of my wrapper, but may work well enough for apps that finish with a single socket doing a write-and-exit. I think you should just set the linger and use it. zmq_close() doesn't block; zmq_term() does. And usually starting an application has a much bigger overhead than sending a message, so in the case of starting an application, doing a request (send), and shutting down, this delay is probably negligible (unless your data is very big and/or the network is overloaded). -- Paul
Re: [zeromq-dev] HWM behaviour blocking
Hi Justin, On Thu, Jun 28, 2012 at 8:50 AM, Justin Karneges jus...@affinix.com wrote: On Thursday, May 10, 2012 01:53:48 PM Pieter Hintjens wrote: On Thu, May 10, 2012 at 3:44 PM, Paul Colomiets p...@colomiets.name wrote: Can you be more specific, why is setting HWM to 1 a bad thing? Do you mean that it smells bad to set HWM to 1 for reliability? Or do you think that setting it will have other consequences (low performance)? It's bad because you're trying to force a synchronous model on an asynchronous system, and doing it at the wrong level. If you really want synchronization you MUST get some upstream data from the receiver. Just throttling the sender cannot work reliably. I'm about to set HWM to 1, and I recalled a thread about this so I've looked it up. Totally agree with what's been said so far. The reason I want to do this is because I need a way for an event-driven application to determine whether data has been written to the underlying kernel. This is useful in case the application wants to close the socket immediately after writing data. In a traditional blocking application, this is easy: just call zmq_close() and it'll unblock when done. However, in an event-driven application, the only way I can think of imitating this functionality is by setting HWM to 1 and waiting until ZMQ_EVENTS indicates writability, then calling zmq_close(). Why do you need zmq_close in an asynchronous application in the first place? Is your application very connection-hungry? We never close zeromq sockets, even on fairly low-traffic connections, and it works nicely. -- Paul
Re: [zeromq-dev] missing events ZMQ_FD / ZMQ_EVENTS
Hi Justin, On Tue, Jun 26, 2012 at 2:16 AM, Justin Karneges jus...@affinix.com wrote: On Wednesday, May 02, 2012 03:27:42 AM Paul Colomiets wrote: Ok. Behind the scenes ZMQ_FD, is basically a counter, which wakes up poll when is non-zero. The counter is reset on each getsockopt ZMQ_EVENTS, zmq_send and zmq_recv. The following diagram shows race condition with two sockets A and B, in a scenario similar to yours: https://docs.google.com/drawings/d/1F97jpdbYMjjb6-2VzRtiL2LpHy638-AEOyrUX84 HL78/edit Note: the last poll is entered with both counters set to zero, so it will not wake up, despite the fact that there is pending message. Was there ever a resolution on this? I am using ZMQ_FD now to integrate into an event loop, and I am seeing some odd behavior when testing a hello world REQ/REP on the REP side. The REP server binds and waits for data. The fd is indicated as readable twice. First, the events are 0 (maybe this happens when the client connects?), then the events are 1 (ZMQ_POLLIN). The server considers the REP socket readable and so it reads a message without blocking. Now it wants to reply, but it considers the socket not yet writable. I was expecting that after reading from the socket, the fd would be indicated as readable and the events would be 2 (ZMQ_POLLOUT). However, this event never comes and so the server just idles. Now here's where it gets weird: if I kill the client (which was also waiting around, as it never got a reply), then the server gets new events with ZMQ_POLLOUT set. This causes the server to finally write its reply to the REP socket, without blocking. Of course there is no client, so this write goes into a black hole. My guess is that the events change with ZMQ_POLLOUT is somehow being backlogged, and the client disconnect helps push the queue another step forward. 
I found that if, immediately after reading from the REP socket, I query ZMQ_EVENTS, then I can see ZMQ_POLLOUT being flagged even though I never got a read indication on the fd. Does this mean that maybe I need to check ZMQ_EVENTS not only after read indications on the fd, but also any time I call zmq_recv()? I've not tried REP sockets with an asynchronous event loop (XREP is usually needed). But I'm pretty sure you're right. You need to re-check ZMQ_EVENTS after doing zmq_recv(), as the state of the socket changes at that time (it's not writable before, not because of network issues, but because of the state machine). However, checking ZMQ_EVENTS after each zmq_recv and zmq_send is needed anyway, as described in the current documentation and in this ML thread. And it doesn't sound like something that can change in any future version of zeromq. -- Paul
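The interaction between the REP state machine and the edge-triggered fd can be modeled in a few lines. This is a hypothetical sketch, not libzmq internals: recv() flips the socket to its "send" state, but that state change does not re-signal the fd, so the only way to observe the new POLLOUT is to re-check ZMQ_EVENTS after the recv.

```python
# Toy model: REP state machine + edge-triggered ZMQ_FD signal.
class FakeRep:
    def __init__(self):
        self.state = "recv"        # REP alternates recv -> send
        self.inbox = ["request"]
        self.fd_signaled = True    # request arrival signaled the fd
    def events(self):
        self.fd_signaled = False   # reading ZMQ_EVENTS resets the fd
        if self.state == "recv" and self.inbox:
            return {"POLLIN"}
        if self.state == "send":
            return {"POLLOUT"}
        return set()
    def recv(self):
        msg = self.inbox.pop(0)
        self.state = "send"        # state changes, but no new fd edge!
        return msg

rep = FakeRep()
assert "POLLIN" in rep.events()
rep.recv()
writable_without_recheck = rep.fd_signaled          # no edge occurred
writable_after_recheck = "POLLOUT" in rep.events()  # re-check sees it
```

This matches the observed behavior: a server that only waits on the fd idles after reading the request, while a server that re-checks ZMQ_EVENTS after zmq_recv() can reply immediately.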
Re: [zeromq-dev] missing events ZMQ_FD / ZMQ_EVENTS
Hi, On Wed, Jun 27, 2012 at 10:57 PM, Justin Karneges jus...@affinix.com wrote: On Wednesday, June 27, 2012 12:44:45 PM Paul Colomiets wrote: On Tue, Jun 26, 2012 at 2:16 AM, Justin Karneges jus...@affinix.com wrote: Does this mean that maybe I need to check ZMQ_EVENTS not only after read indications on the fd, but also any time I call zmq_recv()? I've not tried REP sockets with an asynchronous event loop (XREP is usually needed). But I'm pretty sure you're right. You need to re-check ZMQ_EVENTS after doing zmq_recv(), as the state of the socket changes at that time (it's not writable before, not because of network issues, but because of the state machine). Yeah, I understand the ability to write is part of the state change that occurs by reading. I just wonder why the ZMQ_FD isn't triggered internally by zmq_recv(). That would have been more intuitive, I think. For performance reasons: it's cheaper to call zmq_getsockopt than to write to, poll, and read from an fd. However, checking ZMQ_EVENTS after each zmq_recv and zmq_send is needed anyway, as described in the current documentation and in this ML thread. In which document is this described? I do not see this in the ZMQ_EVENTS section of the zmq_getsockopt man page in 2.2.0. It seems it was too late for 2.2, but it is in the current master for both the 2.x and 3.x series. In any case, thanks for clarifying. I'd actually gone ahead and changed my code to check ZMQ_EVENTS after all three scenarios (post zmq_recv, post zmq_send, and upon read indication of the ZMQ_FD), and that managed to get things to work properly. Nice. -- Paul
Re: [zeromq-dev] Profiling and 0mq
Hi, On Thu, Jun 14, 2012 at 1:29 PM, hp010170 hp010...@gmail.com wrote: In case of [1], as you may know from REQ/REP examples, it is not just a matter of retrying, since that returns EFSM due to the strict alternating pattern; unless I have read the manual pages incorrectly. Have you tried to retry the request? If you can't retry a request after EINTR, then it's a critical bug in zeromq, and you should file a bug in jira. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] ZMQ_FAIL_UNROUTABLE functionality change / BC
Hi Ian, On Sun, Jun 17, 2012 at 10:46 AM, Ian Barber ian.bar...@gmail.com wrote: Hi all, This is re: https://github.com/zeromq/libzmq/pull/383 I just merged in a commit from Andrey that changes the behavior of the ZMQ_FAIL_UNROUTABLE sockopt - it is renamed to ZMQ_ROUTER_BEHAVIOR (though with the same value), and now causes an EAGAIN rather than EHOSTUNREACH. I believe this functionality was only in 3.x, and since we're still on the RC this may be fine, but it may be a BC break, particularly for any bindings that were checking for EHOSTUNREACH or referencing the constant. It's probably OK to make the change at this stage. But is the feature designed carefully? The problem is that many asynchronous loops probably start polling on EAGAIN, and retry when the poll succeeds. That basically means they will go into a tight loop until the peer appears again (which is probably never). -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] ZMQ_FAIL_UNROUTABLE functionality change / BC
Hi Ian, On Sun, Jun 17, 2012 at 8:07 PM, Ian Barber ian.bar...@gmail.com wrote: Reasonable thought - it is only triggered by setting the sockopt, so presumably someone doing that will think through the implications (and be more likely to have the peer actually reappear if they've specifically asked for the message not to be silently dropped), though I think you are right that it might not play too nicely with some existing reactors, which could lead to user confusion. Hard to say really - I don't have a strong opinion one way or the other. From Andrey's pull req it looks like he was specifically doing this because bindings tend to handle EAGAINs differently than other types of errors, so there is that side to think on. I guess having a third option that would return EHOSTUNREACH as opposed to EAGAIN would probably help those with reactors, so at least they would have the option of throwing that instead of being unable to distinguish between that and other EAGAIN cases. I think that documenting "Never use this option if you don't know the internals of your reactor" might help, but it looks strange. Yet without it, people may submit bugs like "zeromq eats 100% CPU". I don't think two similar options are a good solution either. The pull request description says: '1' forces the socket to fail with an EAGAIN error code on unroutable messages, enabling the library user to use blocking sends, sends with timeout and other useful stuff while waiting for the peer to appear, like with the other socket types. Does this mean that a side effect of this patch is that a blocking send will block until the peer is up again? (Or does it just return EAGAIN?) -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Statistics protocol (ESTP) v0.2
Hi, On Wed, Jun 13, 2012 at 9:42 PM, Schmurfy schmu...@gmail.com wrote: The latest version looks fine :) I will try implementing it on one project tomorrow to see if I find other potential problems. Nice. I will wait a week or so before declaring 1.0, and will try to implement more stuff during this time too. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
[zeromq-dev] Last call for Extensible Statistics Transmission Protocol (ESTP) v1.0
Hi, I'm going to declare protocol version v1.0, as it's now complete and most issues discussed here are fixed. Now is the opportunity to find the last issues, before freezing the specification. The last changes are: * The type is denoted by a colon and an English letter, instead of a cryptic character * There is an 'x' type marker that allows experimental types The spec is now in a github organization, which should aggregate projects using the protocol: https://github.com/estp/estp/blob/master/specification.rst I'll be happy if someone could fix the grammar, as I'm not a native English speaker (we are not at the IETF yet, so grammar can be fixed later if needed, but now is a better time). There is also a collectd plugin implementation (it supports the v0.2 protocol at the time of writing, but will be updated soon): https://github.com/estp/collectd-zeromq-estp After the protocol is declared stable, the following steps come next: * Define a collectd extension which allows transferring data losslessly between collectd instances (and implement it in the plugin) * Implement a collectd plugin for crossroads (mostly the same as the zeromq one) * Define a recommendation for the metric names to use for various application classes and propose it to their communities, HTTP servers being the first in my plan -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Statistics protocol (ESTP) v0.2
Hi Schmurfy, On Sun, Jun 10, 2012 at 1:11 PM, Schmurfy schmu...@gmail.com wrote: I am really not sure on this; while I like sensible defaults when writing code to speed things up, I prefer protocols to clearly say what they try to express. If there is a default, I agree that gauge would be the most common type, but it introduces one problem for me: you have to deal with the presence or absence of this field, and as you mentioned, the number of fields will change between a gauge value and a counter value. How do you handle the multiple-flags scenario without a separator? 43btgh would really be ugly to parse. Well, I consider it a type, not flags. So there is only one type. Making it a non-letter gives the ability to use letters to describe flags or parameters more easily. If you know the value will always be something like 23.4|ghty (with hty being hypothetical flags and g the type) then you can just split the string with | and look for known flags in the second part; it becomes a little harder if the flags are directly added at the end of the value (34.56ght), since in this latter case you need to do some magic to separate what is, for me, two different pieces of information. I've tried to implement a collectd plugin: https://github.com/estp/collectd-zeromq-estp It was easy, even in C. In scripting languages there are even more tools (e.g. regular expressions). If there is a separator (the pipe here) then having gauge as default bothers me less, because a quick split will reveal there are no flags defined without further operations. I agree mostly on the extensibility, but if you wanted to add more options to the value, the logical choice would be to add other one-letter flags in this case. No, the important point is that a lazy parser can skip parameters and almost always be right. Until now I never needed to add much to the value itself and its type; knowing counter size may be useful to detect overflow, though I never worked with counters. This is also the reason why gauge is the default. 
Most people will just use it, without thinking about types. I think the real question is what information we could want linked to a value; allowing anything and everything to be added later may not be the most efficient way ;) (that's funny because some parts of the discussion start to look like low-level concerns for zeromq/libxs) I've added a forward-compatibility section. Can you look it over? P.S.: The latest spec is now at https://github.com/estp/estp/blob/master/specification.rst -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Statistics protocol (ESTP) v0.2
Hi Pieter, On Fri, Jun 8, 2012 at 4:11 AM, Pieter Hintjens p...@imatix.com wrote: Paul, I've made a pull request with a few spelling/grammar fixes. Thanks a lot. Looks interesting, would be fun to see some running code. Sure. Will be done shortly. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Statistics protocol (ESTP) v0.2
Hi Marten, On Fri, Jun 8, 2012 at 2:24 PM, Marten Feldtmann itli...@schrievkrom.de wrote: I can work with this specification, though these symbols do seem strange in a readable protocol. I described the reasoning in another message. Feel free to discuss if I'm wrong. I'm in the process of implementing this specification in our Smalltalk-based system. Great. I will probably start a collectd plugin soon. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
[zeromq-dev] Statistics protocol (ESTP) v0.2
Hi, I've updated the protocol based on the feedback. The description is now mostly complete, except for the forward-compatibility section. https://github.com/tailhook/estp/blob/master/specification.rst Any feedback is appreciated. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Statistics protocol v0.1
Hi Gregg, On Fri, Jun 1, 2012 at 5:56 PM, Gregg Irwin gr...@pointillistic.com wrote: How tightly do you want to couple sources and sinks? I want to decouple them as much as possible e.g., do you need type1 and type2 as part of the protocol? Yes, probably in some form. I haven't used RRD, but that's the model, correct? I don't know what you mean by model. For rrd, a counter type is basically a growing counter sent each time (e.g. messages passed through a gateway). rrdtool subtracts the previous value from the current value to get the rate (messages per second), and stores that rate. It's a very convenient way to collect several kinds of statistics. Starting from a minimum spec, I sometimes write a statement answering the WHY question for things included beyond the minimum. For example: Sure. I just wanted a quick review to know whether I'm not terribly wrong. I will add more explanation in future versions. Given a minimum of [name timestamp value]: NAME has a trailing delimiter (colon) because name segments can contain spaces. No, segments can't contain spaces. The colon is there to have a way to subscribe to the whole value. E.g. subscribing to example.org:cpu will also hit example.org:cpu0, which isn't what's intended. Having to subscribe to a space-terminated topic is ugly. And there is also the tab, which is whitespace too. HOST is there for ... It's nice to know what host the statistics are for. Having a uniform sequence of dot-separated items, like in the graphite/carbon example.org.cpu, means we don't know where the hostname ends. The host name is an essential parameter for grouping data in a GUI. TYPE1 is needed because ... and values must be one of ... TYPE2 is used to ... must be ... and indicates ... Those types mostly come from collectd, which in turn inherited types from RRD. Here is the reference: http://collectd.org/wiki/index.php/Data_source Actually type1 is only needed when type2 is counter. And it seems that collectd mostly uses derive and gauge. 
There are also minimum and maximum values, which can also influence the right interpretation, even if we want to force counting of derivatives at the sources instead of at the sink. There is a big difference between counter/derive/absolute (which are mostly similar) and a gauge. When scaling a graph with a gauge (say, CPU usage) to per-hour values, you would still have 1 to 100 percent of CPU usage (probably averaged). When scaling a graph with messages per second (which can be submitted using counter, derive or absolute), you probably want to see messages per hour instead (the value is a sum over the period and is probably 3600 times bigger). So I'm a bit confused on the types so far. I like ISO8601 timestamps Me too. It will be in the next version. and path syntax (slash as separator) for names. I want to be able to have a path as a component in the name, e.g.: ESTP:example.org:disk:system/root:bytes_written: where system/root is an lvm partition (/dev/system/root aka /dev/mapper/system-root). This is also the reason I've not used a dot, like in graphite, as the dot is used in hostnames. Thanks for the feedback! -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
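The trailing-colon rationale above follows directly from how ZeroMQ PUB/SUB matching works: subscriptions are plain byte prefixes, so only a terminating delimiter makes a name subscribable without also catching its siblings. A minimal sketch (names are illustrative, not taken from the spec):

```python
# ZeroMQ-style subscription matching is a plain byte-prefix test.
def matches(subscription: str, message: str) -> bool:
    return message.startswith(subscription)

# Without a terminating colon, subscribing to "cpu" also hits "cpu0":
assert matches("ESTP:example.org:cpu", "ESTP:example.org:cpu0: 2012-06-01T12:00:00 10 0.5")

# With the colon terminator the subscription is exact:
assert not matches("ESTP:example.org:cpu:", "ESTP:example.org:cpu0: 2012-06-01T12:00:00 10 0.5")
assert matches("ESTP:example.org:cpu:", "ESTP:example.org:cpu: 2012-06-01T12:00:00 10 0.5")
```

The same prefix property lets a subscriber pick up a whole subtree (e.g. everything under `ESTP:example.org:`) with a single subscription.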
Re: [zeromq-dev] Statistics protocol v0.1
Hi Marten, On Fri, Jun 1, 2012 at 4:19 PM, Marten Feldtmann itli...@schrievkrom.de wrote: I could also live with one name, where I concatenate two names before sending and do the reverse work at the collector. Yes. This is how hostnames usually work. They represent hierarchy with dot-separated strings, and the means to separate the hostname from other stuff is provided. We also consider putting even more complicated data into that statistic-telegram and serializing it via TNetString (or other fast serialization methods, NOT like JSON or XML). This is what extensions are for in the spec. You can put arbitrary data there. There is no reason to include in the protocol data that is not universally understandable. What I actually do not like very much: white space as separators in specifications. Why? As you can see in the examples, it's much easier to parse space-separated values with sscanf. And being able to parse values in C means more high-performance tools would be developed. Space separation is also nice for human readability. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
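The "parseable with sscanf" argument can be sketched concretely. The field layout below (name, ISO timestamp, interval in seconds, value) is illustrative only; check the spec for the real format. In C this would be a single `sscanf("%s %s %d %lf", ...)`; here the equivalent in Python:

```python
# Hypothetical parse of a space-separated ESTP-style message.
# Layout assumed for illustration: NAME TIMESTAMP INTERVAL VALUE
def parse_estp(msg: str):
    name, timestamp, interval, value = msg.split(" ")
    return name, timestamp, int(interval), float(value)

name, ts, interval, value = parse_estp(
    "ESTP:example.org:cpu: 2012-06-01T12:00:00 10 0.75")
assert name == "ESTP:example.org:cpu:"
assert interval == 10 and value == 0.75
```

The point is that splitting on a single space is trivial in any language, with no parser library needed.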
Re: [zeromq-dev] Statistics protocol v0.1
Hi Marten, On Fri, Jun 1, 2012 at 9:04 AM, Marten Feldtmann itli...@schrievkrom.de wrote: we are actually in the process of implementing a statistic system for our C#/Smalltalk application system using ZeroMQ and we have included the following information in our telegrams. Thanks for sharing the info. My questions about your use-case are below. Would you like to participate in developing a protocol? And what is the chance that your system will make use of it? We used a UDP approach (subscribe method) and we had several statistic collectors within our network to save the information sent from all our 0MQ communication nodes. So you use UDP instead of zeromq with pub/sub? (Or is there zeromq over UDP somewhere?) Why don't you use pub/sub? We included the following information in our implementation: - subscription filter What does it consist of? - location (computer) of the process sending this information (ip-number - no name) Why do you use the IP number instead of names? I believe it's an internal policy in your company or some such. Or is it just debugging info, along with the node name? - name of the process - process id Is the PID just debugging info, or is it meaningful? (Do you aggregate info from several processes?) - start-time of statistic interval (in ASCII to make it more readable, in a format like: 01.06.2012T21:00:00.000+12 .. in that well known format) and including timezone information. It's interesting from two points: 1. A textual data format may be nicer than a unix timestamp. But I'd prefer a UTC-only timestamp. As it's not intended to be presented to the user as-is (except for debugging), it's easier to deal with UTC timestamps than with diverse timezones. 2. We used to send the timestamp at the time of sending, not the start of the interval. It's easier to produce, and it's more logical when we send a counter instead of a rate value (see below). - duration length of the statistic interval in milliseconds This one I've obviously missed. Will add shortly. 
- symbolic name of the statistic producer (0MQ node) - sub-symbolic name of the statistic producer (0MQ node) So, according to the text below, I think the symbolic name is the DNS name of the node, and the sub-symbolic name is the name of the subsystem inside the process. Am I right? - number of bytes received/sent in that interval - number of telegrams received/sent in that interval 1. The proposed protocol splits the values one per message (this is barely useful for UDP, so some batching should probably be implemented there) 2. We used to send the counter value instead of bytes per interval. I mean: you have a long integer counter of bytes, which is only ever incremented, and we calculate the rate by subtracting the previous value from it (yes, wrapping of the counter is also accounted for). This is how collectd usually works. There are pros and cons to both; this should be discussed more. All in all, it seems to fit the protocol nicely; however, the items have to be sorted out to know what is essential, what is inherently debugging info (e.g. the IP address and node name are duplicated info), and what can be put in extension data as it's not universally understandable. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
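The counter-based scheme described above (send the ever-growing counter, let the sink derive the rate, with wrap-around accounted for) can be sketched like this. The 32-bit width and function name are illustrative assumptions, not from the thread:

```python
# Collectd/RRD-style rate derivation from a monotonically increasing
# counter, with 32-bit wrap-around handled on the sink side.
UINT32_MAX = 2**32 - 1

def rate(prev: int, cur: int, interval: float) -> float:
    if cur >= prev:
        delta = cur - prev
    else:
        # Counter wrapped past 2**32 between the two samples.
        delta = (UINT32_MAX - prev) + cur + 1
    return delta / interval

assert rate(1000, 1500, 10.0) == 50.0   # 500 bytes over 10s
# Wrapped: 4294967290 -> 4 is 10 increments over 10s.
assert rate(4294967290, 4, 10.0) == 1.0
```

This is also why the interval length matters to the sink: without it, a raw counter sample cannot be turned into a rate.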
Re: [zeromq-dev] zmq_poll and fully asynchronous client-server
Hi Jigar, On Fri, Jun 1, 2012 at 3:55 PM, Jigar Gosar jiga...@directi.com wrote: * Java's selector has a wakeup method, which immediately returns from the selector.select(10ms) blocking call. I need something similar so as to be able to achieve the above goal. The exact equivalent of the wakeup method is creating a pair of sockets: one you add to the poll, and writing to the other one will wake up that poll. You should use an inproc:// socket for the task if you want to do that from another thread. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
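The socket-pair wakeup trick can be sketched with a plain OS socketpair; with ZeroMQ you would use PAIR sockets over inproc:// instead, so another thread can do the write. A minimal sketch:

```python
import select
import socket

# One end goes into the poll set; the other end is the "wakeup" handle.
waker_rx, waker_tx = socket.socketpair()

# Nothing written yet: select() times out instead of blocking forever.
readable, _, _ = select.select([waker_rx], [], [], 0.1)
assert readable == []

# Another thread would call this to interrupt a blocking select():
waker_tx.send(b"\x00")
readable, _, _ = select.select([waker_rx], [], [], 0.1)
assert readable == [waker_rx]
waker_rx.recv(1)  # drain the wakeup byte before polling again
```

Draining the byte after waking up is important; otherwise the fd stays readable and the loop spins.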
Re: [zeromq-dev] new zeromq plugins for rsyslog
Hi Brian, On Fri, Jun 1, 2012 at 8:36 PM, Brian Knox bri...@talksum.com wrote: I just wanted to give everyone a heads up that our new zeromq input and output plugins for rsyslog are now in the official rsyslog repo, on head in the master branch ( http://www.rsyslog.com/doc/build_from_repo.html). There's a little info on our company blog about them ( http://www.talksum.com/blog/ ) and we hope to make time to write up some better documentation and example use cases soon. The plugins are written against the CZMQ library and we have been testing them against libzmq head (soon to become libzmq 3.1). What does rsyslog do with XPUB and XSUB sockets? Does it work like a device? Otherwise it seems to have nothing to do with subscriptions (and that's what makes XPUB/XSUB sockets different from the non-X ones, IIRC). Or does it use subscriptions to filter logs? -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Statistics reporting using zeromq/crossroads
Hi Schmurfy, On Wed, May 30, 2012 at 4:44 PM, Schmurfy schmu...@gmail.com wrote: I'm not really sure about using subscriptions; why do you want to use this schema? Currently what I have is many collectd instances sending their data to another collectd; the applications send their statistics to the master collectd for the most part (we have embedded systems, each with its own collectd instance). So for my use the closest pattern would be PUSH/PULL; what is your use case for subscriptions? 1. If you have multiple processors of statistics data (or multiple collectd servers) it's much nicer to have the data duplicated, instead of each packet going to a random one. So it matches pub/sub much better. 2. I have a use case where I do business logic based on statistics (basically load balancing). So I would like to subscribe only to the data needed to execute the business logic. After digging into the core of collectd when I implemented my zeromq plugin (and a lua plugin too) I really feel like it was designed to handle a really large amount of data and/or short cycles, and it makes sacrifices toward that goal. I like libxs/zeromq but I don't think this would fit all needs; if you don't already use these libraries in your application you may reconsider adding them only to provide statistics. In my experiments I tend to prefer allowing multiple entry points (that's one reason I like collectd btw). As an example, sending a udp packet should be fairly easy in any language out there and you don't have to maintain a connection; if the application and the collectd daemon (or equivalent service) are close, or if you consider some loss acceptable, this is a perfectly valid way to send your statistics :) Changing the underlying protocol is OK. 
But what I expect in a perfect world is that if I change my webserver from mongrel to nginx, I can get the same value example.org/http.requests from the statistics even if the transport protocol changes from zeromq to websocket, http or udp. One other project you could look at is Graphite (http://graphite.wikidot.com/) For the naming I usually go with [hostname, application_name, metric_name]; the metric_name could itself include anything: [example.org, disk, sda1/write_operations]. Makes sense, will try something similar for a start. Messagepack is great. But it's not very easy to parse in C (without dependencies) nothing is easy in C :p (I like the language but everything takes time with it) I'm thinking about something that can be parsed by sscanf. Will publish a more concrete proposal soon. PS: Do you have a screenshot of jarred? I've just put them here: https://github.com/tailhook/jarred/wiki -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Statistics reporting using zeromq/crossroads
Hi Schmurfy, On Tue, May 29, 2012 at 3:35 PM, Schmurfy schmu...@gmail.com wrote: What I am trying to do is to have a common infrastructure for application (my applications) and system (disk io, memory used, disk used, ...) statistics. I showed what the collectd protocol could do, but I am not entirely satisfied with it either; here are my problems with collectd: - I want to have statistics available as soon as possible: collectd has a cycle time (default is 10s) and plugins have no way to know when the cycle starts or ends; they are notified when data is received or ready to be sent, but that's all. One problem I had with my zeromq attempt (my original branch is on github: https://github.com/schmurfy/collectd/tree/zeromq but based on collectd 4.x) is that you cannot send one frame per cycle, which is what I wanted to do with it (the original network plugin fills a buffer, and when the buffer reaches a threshold it sends the packet, so your values can be sent now or at now+cycle time, which means 10s later with a 10s cycle). Well, I have no problem with the cycle time. We actually don't even flush the statistics while viewing (which leaves latency at about 15 minutes). However, we use collectd only for statistics and use a separate monitoring system (nagios). And actually all monitoring/statistics systems I've seen have only bigger delays, not smaller. Making more tools which support libxs, and using SURVEY sockets to get fresh data, may fix the problem. - I never used encryption; I consider that it can be better handled at a lower level, and a bonus point is that you don't cripple your code with it Ok. - I don't like the need to predefine the types used; it is a pain when using multiple collectd servers since they all need to have the same configuration file to understand each other; I prefer a more open way. Me too. 
That said, you don't need to know the type definitions to actually parse the data stream; the number of parts is in the packet itself; you need the definitions to match each cell with its label. Yes. The most obvious way to implement that is to use one packet per value. - I am not really fond of plugin/plugin_instance/type/type_instance; most of the time I don't remember which one is supposed to be what, and many existing collectd plugins do not use all fields, so it is more annoying than anything else. Me too. But I'm not sure what format is best here. For simple values it's: example.org/cpu Then there are plugins: example.org/processes/zombie Then there is a namespace inside a plugin (plugin instances in terms of collectd): example.org/disk_usage/sda1/free Then there are complex values: example.org/disk/sda1/write.operations example.org/disk/sda1/write.bytes And there are also host-pair values: example.org-example.net/ping/round-trip-time frontend1.tld-www.h1.tld/http/requests frontend1.tld-www.h2.tld/http/requests frontend2.tld-www.h1.tld/http/requests It seems that collectd does things mostly right, except that it puts the type into the name of the value. This is one of the things I want to fix. Also, pings usually should be viewed the other way around (not by the host which collects the values, but by the host which is pinged), but this case should probably be fixed in the GUI. Do you have another point of view on naming? I don't really like text protocols; sure, they may seem easier to parse (and debug, since a human can read them) but for statistics which could be sent at a high rate you waste a lot of space (binary parsers are not that hard to write). For reference here is my ruby parser: https://github.com/schmurfy/rrd-grapher/blob/master/lib/rrd-grapher/notifier/parsers/ruby_parser.rb I can't think of a system where the rate of the data is really so big that statistics become slow, or a bandwidth waste. 
Statistics are sent at regular intervals (10s is a pretty big value), so it's easy to calculate how much bandwidth you need. I'm not strongly against a binary protocol, but to be able to use subscriptions with zeromq you must make the name of the value textual (at least without a length prefix). I also dislike the collectd protocol for the following reasons: 1. It consists of unordered data parts. It's unclear whether order matters (AFAICS, yes) and whether other fields are preserved after the value field (AFAICS, yes). Also it's unclear how to deal with under-specified data (when not all fields are present). 2. It allows concatenating several packets, which is bad for subscriptions, and is bad when concatenating under-specified packets. All in all, it's easy to write a wrong parser for the packets. And this point is crucial for wide adoption of the protocol. I am currently doing some experiments using a hash-like structure serialized with messagepack and udp as transport; since messagepack supports a lot of languages, it virtually means any language could serialize/deserialize the packets very easily, but I don't have much to show currently since I am in the early phases of the project.
[zeromq-dev] Statistics reporting using zeromq/crossroads
Hi, Every time I develop something using zeromq or libxs I run into the problem of how to report statistics. The Pub/Sub pattern suits statistics reporting very well, but the format of the messages is unclear. The result is a collection of ugly and diverse solutions for reporting real-time statistics. I think the community would benefit from having a common format for reporting statistics, so that all open-source software could use it, and various plugins could be developed for existing statistics and monitoring systems. The protocol should meet the following requirements: 1. Simple, compact format; probably textual or fixed-size binary. It should be easy to craft messages without any dependencies. 2. Each message should carry a compact representation of the type information (so that a centralized system doesn't need to be reconfigured for each new reported value and each software upgrade on a remote system) 3. Subscribers should be able to filter statistics data by pub/sub subscriptions 4. There should be a solution to collect the statistics (draw graphs, etc.) So I have the following questions: 1. What formats are available to (re)use? 2. What protocols and software do you use for your projects? 3. Would you like to participate in standardization of the protocol? At our company we use collectd. And as the zeromq plugin is not in the collectd mainstream, and it has a complex binary format, we report using custom formats and a special application that translates them into the collectd protocol for unix sockets (which in turn is a nice textual protocol). The same is done in zerogw (see the zerogwstat utility). This kind of proxying, and point #2 of the requirements, makes me uncomfortable. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] missing events ZMQ_FD / ZMQ_EVENTS
Hi Gerhard, On Mon, May 14, 2012 at 7:49 PM, Gerhard Lipp gel...@googlemail.com wrote: thanks for the diagram! I would like to locate the variables cntA / cntB in the source to understand what is going on (and why). Could you please point me in the right direction? Look at src/signaler.cpp. On linux, when eventfd is supported, the real counter is inside that eventfd. In other implementations the counter is the number of bytes currently in the pipe's buffer. In any case, its value is read inside signaler_t::recv. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] missing events ZMQ_FD / ZMQ_EVENTS
Hi Gerhard, On Wed, May 2, 2012 at 10:11 AM, Gerhard Lipp gel...@googlemail.com wrote: hello paul! I don't understand the background of your approach. Why should the src fd's io handler check the dst's events (and vice versa)? It's the simplest way I've found to solve the problem. even if this worked in this scenario, wouldn't it be a coincidence? No, it's not a coincidence. well, at least it is better than busy waiting / polling ... Yes. There are various ways to optimize the presented code; I've just picked something off the top of my head. On Wed, May 2, 2012 at 11:26 AM, Gerhard Lipp gel...@googlemail.com wrote: btw, using the built-in poller just works: Yes, the built-in poller works in an intuitive way. ZMQ_FD is meant for experts, so if you don't understand how it works, you can just use the built-in poller without problems. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] missing events ZMQ_FD / ZMQ_EVENTS
Hi Gerhard, On Wed, May 2, 2012 at 12:45 PM, Gerhard Lipp gel...@googlemail.com wrote: I really appreciate any help and ideas to solve this issue! I just did not get the idea behind this attempt. Could you explain it in more detail (something particular to observe)? Ok. Behind the scenes, ZMQ_FD is basically a counter which wakes up poll when it is non-zero. The counter is reset on each getsockopt ZMQ_EVENTS, zmq_send and zmq_recv. The following diagram shows a race condition with two sockets A and B, in a scenario similar to yours: https://docs.google.com/drawings/d/1F97jpdbYMjjb6-2VzRtiL2LpHy638-AEOyrUX84HL78/edit Note: the last poll is entered with both counters set to zero, so it will not wake up, despite the fact that there is a pending message. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
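The counter mechanism in the diagram can be modeled with a plain pipe: one byte in the pipe plays the role of the counter, and draining it stands in for what getsockopt(ZMQ_EVENTS), zmq_send and zmq_recv do internally. This is a simulation of the contract, not the real libzmq signaler:

```python
import os
import select

r, w = os.pipe()
queue = []                  # pending messages on the (simulated) socket

# A message arrives: it is queued and the fd is signaled once.
queue.append(b"hello")
os.write(w, b"\x01")

# The application checks ZMQ_EVENTS (which drains the signal) but,
# through a bug, returns to poll() without reading the pending message...
os.read(r, 1)

# ...and now poll() times out, even though a message is still queued.
readable, _, _ = select.select([r], [], [], 0.1)
assert readable == [] and queue == [b"hello"]
```

This is exactly why ZMQ_EVENTS must be consulted after every zmq_send/zmq_recv and before going back to poll: once the edge is consumed, no further wakeup will come for the message that is already queued.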
Re: [zeromq-dev] missing events ZMQ_FD / ZMQ_EVENTS
Hi Gerhard, On Thu, Apr 26, 2012 at 10:02 AM, Gerhard Lipp gel...@googlemail.com wrote: Hello Paul! On Wed, Apr 25, 2012 at 8:29 PM, Paul Colomiets p...@colomiets.name wrote: Hi Gerhard, On Wed, Apr 25, 2012 at 6:08 PM, Gerhard Lipp gel...@googlemail.com wrote: I figured out how to boil down an example which shows this bug. It consists of three files: 1) x.lua doing the XREP XREQ stuff, must be started once 2) rep.lua implementing a simple echo replier, must be started once 3) req.lua making the request to rep.lua through x.lua, must be started TWICE to produce the error. THIS PROCESS LOCKS. Uncommenting the ev.WRITE is a bad workaround for this issue. As far as I can see, it's not a workaround. It's just the way ZMQ_FD works. Use zmq_poll if you don't feel comfortable with that. The only way you can change that is by returning getopt(zmq.EVENTS) instead of hardcoding ev.READ + ev.WRITE According to the manual, the fd returned by zmq_getsockopt(ZMQ_FD) signals any pending events on the socket in an edge-triggered fashion by making the file descriptor become ready for reading. If ev.WRITE is required to get all ZMQ_POLLIN and/or ZMQ_POLLOUT events, the doc should be clearer. Anyhow, as the source looks, the ZMQ_FD is the fd associated with the socket's mailbox, which is used for all kinds of communication (state transitions?) inside of ZMQ. A selecting/polling user process should not wake up unnecessarily, to avoid context switches, which are really expensive on our (embedded) device. Thus I'd like to minimize the wakeups by just specifying ev.READ. Probably I don't understand the code. You must poll only for reading on ZMQ_FD. But every zmq_send and zmq_recv consumes the mailbox, which means you must update your application's state of the readable and writable flags (I mean your IO framework doesn't know that the socket became readable or writable). If you don't care about the ZMQ_POLLOUT event, you still must check ZMQ_EVENTS for reading after each zmq_send. 
-- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Python / Zmq / Gevent
Hi Antonio, On Thu, Apr 26, 2012 at 6:34 PM, Antonio Teixeira eagle.anto...@gmail.com wrote: Hello Paul. You can find it here http://lists.zeromq.org/pipermail/zeromq-dev/2012-April/016832.html Topic ZMQ Assert I will cook up some demo code when I have some spare time Regards A/T I don't know the core very well, but that assertion looks like you either use a zmq socket in two threads simultaneously, or you fork and try to use the same zmq context in both parent and child. Without the code it's hard to guess more. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] missing events ZMQ_FD / ZMQ_EVENTS
Hi Gerhard, On Fri, Apr 27, 2012 at 11:41 AM, Gerhard Lipp gel...@googlemail.com wrote: Probably I don't understand the code. You must poll only for reading on ZMQ_FD. But every zmq_send and zmq_recv consumes the mailbox. Which means you must update your application's state of the readable and writable flags (I mean your IO framework doesn't know that the socket became readable or writable). If you don't care about the ZMQ_POLLOUT event, you still must check ZMQ_EVENTS for reading after each zmq_send. You are right, I am actually just waiting to be able to zmq_recv with ZMQ_NOBLOCK. I don't care about ZMQ_POLLOUT in this example. As the docs state, either event is signaled by the mailbox (ZMQ_FD) becoming ready to read (ev.READ). That is why I am checking for ZMQ_POLLIN before entering the zmq_recv/zmq_send. So the real problem is misleading documentation? I think it would be nice if you'd update the documentation in a way that's understandable for you, and send a pull request. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] missing events ZMQ_FD / ZMQ_EVENTS
Hi Gerhard, On Fri, Apr 27, 2012 at 12:37 PM, Gerhard Lipp gel...@googlemail.com wrote: I don't think it is a documentation thing. What the docs say (and what the source looks like): zmq_getsockopt(ZMQ_FD) returns an fd (the mailbox's), which becomes readable whenever the corresponding socket might have become readable and/or writable for operations with the NOBLOCK option. To check which of these conditions are true, you have to use zmq_getsockopt(ZMQ_EVENTS) and check for ZMQ_POLLIN / ZMQ_POLLOUT respectively. This is totally true. But it's silent on some things. If this is true, users should ONLY select/poll for the read event, e.g. using libev EV_READ, regardless of whether the user wants to zmq_recv(ZMQ_NOBLOCK) or zmq_send(ZMQ_NOBLOCK). Yes. Only for EV_READ. I don't know how lua works with libev, so I gave some ill-advised suggestions, sorry (see below). I guess the solution/workaround of the example (using ev.READ + ev.WRITE) does not work reliably under all circumstances, but just in this primitive scenario. Adding ev.WRITE helps only because the socket is *always* ready for writing. So you are actually in a tight loop. The real missing part of the documentation is here: https://github.com/zeromq/libzmq/pull/328/files -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] missing events ZMQ_FD / ZMQ_EVENTS
Hi Gerhard, On Fri, Apr 27, 2012 at 2:10 PM, Gerhard Lipp gel...@googlemail.com wrote: Ok, so I must always check if there are more events to process before returning from the io handler (frankly I don't understand the explanation). A short test still shows the lock explained earlier: Try the following:

    local zmq = require'zmq'
    local ev = require'ev'
    local c = zmq.init(1)
    local xreq = c:socket(zmq.XREQ)
    xreq:bind('tcp://127.0.0.1:1')
    local xrep = c:socket(zmq.XREP)
    xrep:bind('tcp://127.0.0.1:13334')

    local is_readable = function(sock)
      local events = sock:getopt(zmq.EVENTS)
      return events == zmq.POLLIN or events == (zmq.POLLIN + zmq.POLLOUT)
    end

    local forward_io = function(src, dst)
      return ev.IO.new(
        function(loop, io)
          -- called whenever src:getopt(zmq.FD) becomes readable
          while is_readable(src) or is_readable(dst) do
            if is_readable(src) then
              repeat
                local data = assert(src:recv(zmq.NOBLOCK))
                local more = src:getopt(zmq.RCVMORE) > 0
                dst:send(data, more and zmq.SNDMORE or 0)
              until not more
            end
            if is_readable(dst) then
              repeat
                local data = assert(dst:recv(zmq.NOBLOCK))
                local more = dst:getopt(zmq.RCVMORE) > 0
                src:send(data, more and zmq.SNDMORE or 0)
              until not more
            end
          end
        end,
        src:getopt(zmq.FD), ev.READ)
    end

    local xrep_io = forward_io(xrep, xreq)
    local xreq_io = forward_io(xreq, xrep)
    xreq_io:start(ev.Loop.default)
    xrep_io:start(ev.Loop.default)
    ev.Loop.default:loop()

If this works, you can optimize (and clarify) it further. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] missing events ZMQ_FD / ZMQ_EVENTS
Hi Gerhard, On Wed, Apr 25, 2012 at 6:08 PM, Gerhard Lipp gel...@googlemail.com wrote: I managed to boil down an example which shows this bug. It consists of three files: 1) x.lua doing the XREP/XREQ stuff, must be started once 2) rep.lua implementing a simple echo replier, must be started once 3) req.lua making the request to rep.lua through x.lua; must be started TWICE to produce the error. THIS PROCESS LOCKS. Uncommenting the ev.WRITE is a bad workaround for this issue. As far as I can see, it is not a workaround. It's just the way ZMQ_FD works. Use zmq_poll if you don't feel comfortable with that. The only way you can change that is returning getopt(zmq.EVENTS) instead of hardcoding ev.READ + ev.WRITE -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] HA: HA: НА: s390x build failure
Hi Sergey, On Wed, Apr 25, 2012 at 7:35 PM, Sergey Hripchenko shripche...@intermedia.net wrote: I was hoping that you have a more exotic OS ^) About the issue: zmq_sleep(1) should be _enough_ for everything. However, for example I found that: PUSH->connect() PUSH->recv() 0 PUSH->disconnect() // and this will leave PUSH -> session_base_t -> tcp_connecter_t forever until you call some io functions like PUSH->recv(ZMQ_DONTWAIT) == -1 // the TERM command simply is _NOT_ propagated from session_base_t::process_term_req() (called in the application thread) to tcp_connecter_t::process_term() (called in the ZMQ IO thread) Not sure if anyone is interested in this issue... I think you should try replacing sleep() with zmq_poll on that socket; this may fix the problem. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] missing events ZMQ_FD / ZMQ_EVENTS
Hi Gerhard, On Mon, Apr 23, 2012 at 3:53 PM, Gerhard Lipp gel...@googlemail.com wrote: Hello, I can observe the same behavior as stated here (http://lists.zeromq.org/pipermail/zeromq-dev/2011-November/014615.html). What I observe is also an XREP/XREQ (ROUTER/DEALER) problem, where the XREQ is waiting forever to receive a message (which has definitely been sent). When I poll (timer based) the ZMQ_EVENTs, the XREQ is readable as expected. I am using libev (select based) for doing IO and I am aware of the edge-triggered behaviour (I am reading/forwarding messages until ZMQ_EVENTs no longer includes the ZMQ_POLLIN bit). What is the status of this issue? Unfortunately my setup is a bit complicated to share, but I would like to help as much as possible. We are using zeromq with libev without any issues. The only non-obvious thing is that even if you are doing a send on a socket, you need to check whether it became readable (and vice versa). You can look at the code at: https://github.com/tailhook/zerogw/blob/master/src/http.c:300 It looks like: // Must wake up reading on each send, because of the way zmq sockets work ev_feed_event(root.loop, route->zmq_forward.socket._watch, EV_READ); For simplicity we just feed a libev event when doing a send, so the socket is checked for reading at the next loop iteration (and we never block for writing, in case you care) -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
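The zerogw trick quoted above can be sketched outside libev. `Loop`, `StubSocket` and `Connection` below are stand-ins invented for illustration (not real libev or zerogw objects); the pattern is feeding a synthetic "readable" event after every send so ZMQ_EVENTS is re-checked on the next loop iteration:

```python
from collections import deque

class Loop:
    """Toy event loop: callbacks fed in run on the next iteration."""
    def __init__(self):
        self.pending = deque()

    def feed_event(self, callback):        # models ev_feed_event(..., EV_READ)
        self.pending.append(callback)

    def run_once(self):
        while self.pending:
            self.pending.popleft()()

class StubSocket:
    """Stand-in zmq socket whose peer replies to every message immediately."""
    def __init__(self):
        self.inbox = deque()

    def send(self, msg):
        self.inbox.append(b"reply:" + msg)  # peer's reply lands in our inbox

    def readable(self):                     # models ZMQ_EVENTS & ZMQ_POLLIN
        return bool(self.inbox)

    def recv(self):
        return self.inbox.popleft()

class Connection:
    def __init__(self, loop, socket):
        self.loop, self.socket, self.received = loop, socket, []

    def send(self, msg):
        self.socket.send(msg)
        # the send may have consumed the mailbox signal, so schedule a
        # read-check for the next iteration instead of trusting select()
        self.loop.feed_event(self.on_readable)

    def on_readable(self):
        while self.socket.readable():       # drain until not readable
            self.received.append(self.socket.recv())
```

The reply that arrived during `send` is picked up on the next `run_once()` even though the real fd never signaled again.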
Re: [zeromq-dev] Python / Zmq / Gevent
Hi Antonio, On Mon, Apr 23, 2012 at 7:30 PM, Antonio Teixeira eagle.anto...@gmail.com wrote: Hello Paul. I will try to reduce the code that does this :) By the way Paul, you may want to check the logging module when using Gevent: it SIGABORTs ZMQ. I wrote a previous mail regarding that :) I can't find it. Can you point me to it? (We are using greenlets and zeromq successfully, so it's some gevent problem) -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Python / Zmq / Gevent
Hi Antonio, On Fri, Apr 20, 2012 at 6:15 PM, Antonio Teixeira eagle.anto...@gmail.com wrote: So everything works fine when the client doesn't use a device... I may be missing something but it's only working this way (the context is being shared across greenlets). I've submitted a bug: https://github.com/zeromq/pyzmq/issues/199 It emerged from inspecting the code. If you can provide code to reproduce the problem, that would be nice. But yes, it really can be because of a broken implementation of greenlet support in pyzmq. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Python / Zmq / Gevent
Hi Antonio, On Thu, Apr 19, 2012 at 6:42 PM, Antonio Teixeira eagle.anto...@gmail.com wrote: Hello Once Again :) I have deployed the following scenario: REQ (ipc://IPC/SOCKET - Connect - Using Multiple Threads/Greenlets) - (IPC ROUTER - Bind ipc://IPC/SOCKET) - (DEALER TCP - Connect:127.0.0.1:) - (Router TCP Bind :127.0.0.1:) - (IPC DEALER - Bind ipc://IPC/SOCKET2) - REP (Using Multiple Threads/Greenlets Connect - ipc://IPC/SOCKET2) A thread takes the place of the device with a simple: data = recv() send(data) All code based on this: https://github.com/zeromq/pyzmq/blob/master/examples/gevent/reqrep.py On the REP side 10 greenlets, on the REQ side 1 greenlet, 100 tasks (a simple print and return). Everything works well. The same as above, with 10 greenlets on the REP side and 5 greenlets on the REQ side, works well until ... Did Not Receive A Response The Destination Server Is Unreachable. Ok, so Pieter and others on the mailing list pointed out that if the socket disconnects we should clear the socket: self.workerSocket.setsockopt(zmq.LINGER, 0) self.workerSocket.close() and make a new one; well, this still doesn't work. The REP side is still online and available, since I can use another machine with the client software (the REP part) and it works perfectly until the same happens. So to The Guide we go. Set LINGER 0 to ensure everything is dropped, checked OK. Close the socket and start a new one, checked OK. Use an identity in case stuff gets funky, so one UUID for each worker inside the client, set before we connect to the IPC, checked OK. So after all this the problem remains, but when I'm terminating the client (SIGKILL) I can see that some pending messages get dumped and sent to the server (REQ part). Maybe something jams the device, or do I have a misconfig? You seem to use a single REQ socket for multiple requesters, right? Either use an XREQ socket or create a socket per greenlet.
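The REQ lockstep behind this advice can be modeled without pyzmq. `ReqStub` below is a toy state machine showing why a single REQ socket shared across greenlets fails as soon as two requests interleave (real libzmq reports this as an EFSM error):

```python
class ReqStub:
    """Toy model of REQ's strict send/recv alternation."""
    def __init__(self):
        self.awaiting_reply = False

    def send(self, msg):
        if self.awaiting_reply:
            # libzmq would return EFSM here: a reply is still pending
            raise RuntimeError("EFSM: recv the pending reply before sending again")
        self.awaiting_reply = True

    def recv(self):
        if not self.awaiting_reply:
            raise RuntimeError("EFSM: no request outstanding")
        self.awaiting_reply = False
        return b"reply"
```

With one shared `ReqStub`, a second greenlet's `send()` before the first greenlet's `recv()` raises; with a socket per greenlet (or an XREQ/DEALER socket, which has no such lockstep) each requester keeps its own state.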
-- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] [crossroads-dev] [ANNOUNCE] paperjam -- device implementation for zeromq and crossroads
Hi, On Wed, Apr 18, 2012 at 9:17 AM, Martin Sustrik sust...@250bpm.com wrote: Btw, I believe implementing socket options is crucial atm. Without them it'll be hard to use it in real-world deployments. Sure. It's the top item on my priority list. On Wed, Apr 18, 2012 at 2:27 PM, Ian Barber ian.bar...@gmail.com wrote: Very nice Paul! Is python just used here for the build and the tests? Exactly. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] [crossroads-dev] [ANNOUNCE] paperjam -- device implementation for zeromq and crossroads
Hi Pieter, On Thu, Apr 19, 2012 at 12:03 AM, Pieter Hintjens p...@imatix.com wrote: Note that libzmq does include zmq_device(), we put it back in since people wanted it back and there was really no valid reason for removing it in the first place. That's nice. But for my use case we have a lot of python processes (not threads, because of the global interpreter lock), so an in-process device is not useful for us. I'm also too greedy to have a device per thread, because each node would have a number of devices proportional to the total number of nodes. However, I haven't done any benchmarks, so I'm not sure if it's a real problem. It would be useful IMO to clean-up zmq_device(), add the monitor socket, . Yeah. If I need an in-process device I'll definitely choose the built-in one. I'm not sure the monitor socket is a good idea. It looks useful, but I doubt it's a nice addition to zeromq. I think we should wait until it proves itself useful. E.g. I haven't tried to use it in pyzmq. and remove the device type which was never really needed IIRC you argued that we need to keep the device type in ZDCF, didn't you? -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] [crossroads-dev] [ANNOUNCE] paperjam -- device implementation for zeromq and crossroads
Hi Pieter, On Thu, Apr 19, 2012 at 12:41 AM, Pieter Hintjens p...@imatix.com wrote: In ZDCF and the generic device we implemented socket options afair. If anyone actually wants this, I'll make it all work again. Not sure what you mean. If you mean my excuse for not implementing socket options in version v0.1, then it's no longer relevant. All documented options were implemented in paperjam about 3 hours ago :) -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
[zeromq-dev] [ANNOUNCE] paperjam -- device implementation for zeromq and crossroads
Hi, We have been searching for a long time for a replacement for the devices from early zeromq 2.x. Here is my answer: https://github.com/tailhook/paperjam Paperjam supports both zeromq 2.x and libxs, and is able to send messages received from a zeromq socket to a libxs socket and vice versa, so it may be used to migrate between zeromq and libxs. It also supports monitored devices, inspired by pyzmq, which basically means that there is a third socket which receives a copy of all messages sent between frontend and backend. It is mostly useful for debugging purposes. Unlike the old devices, paperjam provides a standalone process with multiple devices in a single process (and with a YAML'y config). Future plans are modest: implement handling of various socket options, add some statistics, and maybe implement zeromq 3 support. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Single-Point-of-Failure removal
Hi Gregory, On Mon, Apr 9, 2012 at 4:21 AM, Gregory Taylor gtay...@gc-taylor.com wrote: If I understand correctly, the possibility I outlined (changing the Worker-Announcer link to Push/Pull) would eliminate the duplicate messages, since Push/Pull is fair-queued when multiple connections are made (each Worker to multiple Announcers), and the Clients would be Sub'd to the two announcers (Pub). Am I on the right track with this? Yes and no. It depends on the needed reliability guarantees. If you turn that into push/pull and one announcer crashes, you will lose some messages. With pub/sub and two announcers, the client will receive every message twice, but if one announcer crashes it will not lose any messages. Similar considerations apply to other parts of the architecture. You might want req/rep between gateway and broker, and between broker and worker (to be able to repeat a request if it is lost), if you tolerate duplicates better than lost messages. But leave it as is (with push/pull) if duplicates are more harmful than lost messages. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
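In the pub/sub variant above, each announcement arrives twice (once per announcer). A client can keep that redundancy yet avoid double processing by de-duplicating on a per-message id. This is a generic sketch, not part of the original design; the `msg_id` field is an assumption:

```python
def dedup(messages):
    """Yield each payload once, given (msg_id, payload) pairs that may
    arrive twice (once from each redundant announcer)."""
    seen = set()
    for msg_id, payload in messages:
        if msg_id not in seen:       # second copy of the same id is dropped
            seen.add(msg_id)
            yield payload
```

In a long-running client the `seen` set should be bounded, e.g. by keeping only recent sequence numbers per announcer.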
Re: [zeromq-dev] how to get it to run faster?
Hi Sean, On Mon, Apr 2, 2012 at 10:34 AM, Sean Ochoa sean.m.oc...@gmail.com wrote: Hey all. In my attempt at trying to create a message queue with a persistence layer, I may have slowed things down. Here's my code so far: http://paste.pound-python.org/show/18411/ I'm wondering if someone could help me tune this thing so that I could put the persistence layer back in and still get good performance: 20,000 messages/sec or more. Any help is much appreciated! I'm still learning how to use zeromq. Is it slow even without persistence? For faster persistence you need something along the lines of (untested): http://paste.pound-python.org/show/18412/ You'll also need log rotation, error handling, etc., which I've skipped for brevity -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
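The linked paste is not preserved in the archive, so the following is only a guess at the general technique: a common way to make a persistence layer keep up with tens of thousands of messages per second is an append-only log with batched fsyncs. The class name, framing, and the `sync_every` policy below are illustrative assumptions, not Paul's original code:

```python
import os

class AppendLog:
    """Append length-prefixed messages to a log file, syncing to disk only
    every N messages to amortize fsync cost (trading a small durability
    window for throughput)."""
    def __init__(self, path, sync_every=100):
        self.f = open(path, "ab")
        self.sync_every = sync_every
        self.count = 0

    def append(self, msg):
        # 4-byte big-endian length prefix, then the payload
        self.f.write(len(msg).to_bytes(4, "big") + msg)
        self.count += 1
        if self.count % self.sync_every == 0:
            self.f.flush()
            os.fsync(self.f.fileno())

    def close(self):
        self.f.flush()
        os.fsync(self.f.fileno())
        self.f.close()
```

As the email notes, a real version also needs log rotation and error handling.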
Re: [zeromq-dev] Dbus like bus based on zeromq?
Hi Benjamin, On Sun, Mar 25, 2012 at 11:29 AM, Benjamin Henrion b...@udev.org wrote: Hi, I was just reading this article about a Dbus replacement: http://linuxfr.org/news/a-bus-un-autre-bus-dedie-gnu-linux-embarque Someone mentioned zeromq; how easy would it be to have a daemon that replaces dbus? How would the applications talk to such a daemon? Via json? I think the capability of zeromq to have ipc+tcp is very interesting in this context. Yes, switching transports is very interesting. But it's unclear what problem they solve. The text says there are too many context switches and there is a central daemon. The daemon can be turned into a coordinator instead of a proxy like dbus. But the context switches are still there (and even more, I expect), since zeromq does io in a separate thread. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Epolling on FD doesn't work for 2.1.11
Hi Andrzej, On Fri, Mar 23, 2012 at 12:31 PM, Andrzej K. Haczewski ahaczew...@gmail.com wrote: There is one thing that bothers me though: why does the scheme I used work for ZeroMQ 3.1.0 and CrossroadsIO, as I tried both and they work with registering the FD right away with no recv() calls in between connect() and epoll(), and it doesn't work for ZeroMQ 2.1? I haven't looked at the examples, but the differences are, at least: 1. zeromq 3 uses eventfd instead of a socketpair to wake up the application thread 2. zeromq 2 uses a single notification channel for both zmq_send and zmq_recv, so you *must* try to do zmq_recv after you have just done zmq_send, as Robert pointed out (I'm not sure whether that's still the case for zmq 3) -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] More on ZDCF
Hi Steffen, On Fri, Mar 23, 2012 at 8:01 PM, Steffen Mueller m...@steffen-mueller.net wrote: Again, I'd appreciate review before I run with it in my Perl implementation. Is there any good reason to specify both socket types and the device type? I'd say the device type is redundant. The point is even stronger as the device type names are misleading. I'd say they should be request-reply, publish-subscribe, push-pull, if needed at all. Is there any good reason to specify xrep vs rep socket types? I think it's implementation-defined whether it supports request multiplexing or other specific features of the x kind of socket. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
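The redundancy argument can be made concrete: the conventional pairings of the standard zeromq devices determine the backend socket type from the frontend type, so a ZDCF validator needs no separate device-type field. A sketch (the pairing table follows the three standard devices; the function itself is illustrative, not part of any ZDCF implementation):

```python
# Conventional pairings of the three standard zeromq devices:
#   queue:     ROUTER (frontend) <-> DEALER (backend)
#   forwarder: XSUB   (frontend) <-> XPUB   (backend)
#   streamer:  PULL   (frontend) <-> PUSH   (backend)
EXPECTED_BACKEND = {"ROUTER": "DEALER", "XSUB": "XPUB", "PULL": "PUSH"}

def validate_device(frontend, backend):
    """Check a ZDCF-style device config using only the socket types."""
    expected = EXPECTED_BACKEND.get(frontend)
    if expected is None:
        raise ValueError("unknown frontend socket type: %s" % frontend)
    if backend != expected:
        raise ValueError("frontend %s expects backend %s, got %s"
                         % (frontend, expected, backend))
```

Given both socket types, the device kind (request-reply, publish-subscribe, push-pull) follows automatically.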
Re: [zeromq-dev] More on ZDCF
Hi Pieter, On Sat, Mar 24, 2012 at 12:09 AM, Pieter Hintjens p...@imatix.com wrote: On Fri, Mar 23, 2012 at 5:07 PM, Paul Colomiets p...@colomiets.name wrote: Is there any good reason to specify both socket types and the device type? You can, and people sometimes do, mix these. Otherwise, indeed it's not useful. Sure. But checking that both sockets match accomplishes the task, doesn't it? And the thing I really do mix up is the name of the device, compared to the socket types. So I still propose to make all built-in types implicit (and of course all application-specific devices must be explicitly specified). BTW, why is only the frontend socket type specified in the examples? Is there any good reason to specify xrep vs rep socket types? I think it's implementation-defined whether it supports request multiplexing or other specific features of the x kind of socket. I renamed xrep/xreq to router/dealer. If the page still shows old names. But is there still any practical difference between xrep and router, or xreq and dealer, for devices? There are no xpub/xsub sockets; is that a mistake? -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Duplicate identities - #325
Hi Pieter, On Sat, Mar 24, 2012 at 12:45 AM, Pieter Hintjens p...@imatix.com wrote: I've just made a pull request for a fix for issue #325 (https://zeromq.jira.com/browse/LIBZMQ-325) which caused libzmq to crash if two clients connected to a ROUTER or REP socket with the same identity. This assert doesn't affect 2.1 as far as I can tell. Seems to need a unit test. Even more so if it was broken between 2.1 and 3.1. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Request for co-maintainer(s) for CZMQ
Hi Felipe, On Wed, Mar 21, 2012 at 4:48 PM, Felipe Cruz felipec...@loogica.net wrote: I'm also writing an http server that uses zmq to connect with workers.. (also using some czmq stuff). Is it an open-source project? And what's wrong with mongrel2 or zerogw? -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Proposal for next stable release
Hi Pieter, On Sun, Mar 18, 2012 at 10:24 PM, Pieter Hintjens p...@imatix.com wrote: Would you take a look and critique it? http://rfc.zeromq.org/spec:16 At least this item needs more explanation: The project SHALL NOT use topic branches. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Socket type checking
Hi Martin, On Tue, Nov 15, 2011 at 11:19 AM, Martin Sustrik sust...@250bpm.com wrote: On 11/13/2011 09:28 PM, Paul Colomiets wrote: Anyone any ideas about how to do this in a backward compatible way btw? Sure, introduce another message flag. Messages that contain it can be sent at any time, and checked against the topology id set as a socket option each time such a message comes in. Also, sockets insert them before the first message in any pipe. And for multicast they are just periodically sent. As it's just a sanity check, not a security one, we could live with messages coming from pipes which didn't send a topology id (though they can't actually be sent, because I believe messages with a reserved flag set will drop the connection). Makes sense. However, keep in mind that the thing that's periodically sent by PGM is a 6 byte MD hash rather than a user-defined string. It's inconvenient. Does sending the topology id as an ordinary message feel too wrong? Also, new implementations that don't set a topology id should accept any one (so you can upgrade nodes one by one); eventually this way of using sockets should be deprecated. I've described a fallback mechanism in the SP mailing list: http://groups.google.com/group/sp-discuss-group/msg/e880d1b2c6b02ca4 It should be able to solve this problem. I'm not sure I understand the fallback. As I've described in the SP mailing list, I'm not happy with UUIDs, but if the other way is more complex, I could live with that. (Although I don't see a problem with keeping an ordinary message with the topology id around, to insert into every new pipe.) Again, my rationale for using UUIDs is explained in the email mentioned above. Will answer there. Still, to solve the egg problem, rather than trying to address the whole breakfast, the community may settle on sending strings or such. And please, do both checks: socket type and application's id, it will help a lot. Yes. The fallback mechanism should solve that.
Note that it works because specifying the topology ID implies the pattern ID (NASDAQ stock quotes implies PUB/SUB) but not vice versa. Well, the problem I'm trying to solve is the following. We have a device with two ports, one for pub and one for sub, and they are named the same either by intention or by coincidence. Then connecting pub to pub will emit no error, which is strange. So if we don't implement both checks, you will need to invent a name for each socket rather than for each topology. Checking both also helps catch programming errors. Again, in zerogw in some places we have a choice between pub and push, and configuring the names right doesn't mean you have configured the types correctly. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Socket type checking
Hi Martin, On Sat, Nov 12, 2011 at 9:16 AM, Martin Sustrik sust...@250bpm.com wrote: On 11/12/2011 08:15 AM, Martin Sustrik wrote: We have no good idea how to solve the breakfast problem, so let's move the breakfast discussion to the SP mailing list and focus on the egg here, namely on what kind of type checking we are able to provide today and in a backward compatible way. Anyone any ideas about how to do this in a backward compatible way btw? Sure, introduce another message flag. Messages that contain it can be sent at any time, and checked against the topology id set as a socket option each time such a message comes in. Also, sockets insert them before the first message in any pipe. And for multicast they are just periodically sent. As it's just a sanity check, not a security one, we could live with messages coming from pipes which didn't send a topology id (though they can't actually be sent, because I believe messages with a reserved flag set will drop the connection). Also, new implementations that don't set a topology id should accept any one (so you can upgrade nodes one by one); eventually this way of using sockets should be deprecated. As I've described in the SP mailing list, I'm not happy with UUIDs, but if the other way is more complex, I could live with that. (Although I don't see a problem with keeping an ordinary message with the topology id around, to insert into every new pipe.) And please, do both checks: socket type and application's id, it will help a lot. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] Socket type checking
2011/11/11 Ilja Golshtein ilej...@narod.ru: Hello. I am sorry to ask, but is this type check really useful? Assuming the goal is to make 0mq more user friendly and to reduce the chance of an erroneous setup, I suggest we should introduce socket names (identity would be a much better word, but it is already in use) and check that these names match. Type match is too weak even for a sanity check, and it could be implemented only because it is not too difficult... I second this proposal, as I've described in the SP mailing list. There is a slight difference in the proposed semantics. I'd like to have a single topology_id which is shared between endpoints and devices. So the type check is also useful for deciding whether this is the right place to connect (e.g. whether it's an input or output port of a device). The problem is that it's another option to configure on each socket. But it can be a plus for migration, like:

    s1, s2 = zmq.socket(zmq.REQ), zmq.socket(zmq.REQ)
    s1.setsockopt(zmq.TOPOLOGY_ID, "dummy_service")
    s1.connect("ipc://dummy1")  # speaks new protocol, providing type checks
    s2.connect("ipc://dummy2")  # speaks old protocol

-- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] 0MQ/3.1
Hi Mikko, On Thu, Nov 10, 2011 at 12:11 AM, Mikko Koppanen mikko.koppa...@gmail.com wrote: Hi, I never got the merge to work well but I can't remember what the issue was back then. With this setup, from the github perspective, am I still maintaining three different forks from which I send pull requests to three different repos? Basically yes. Locally you can pull all three into a single repo, and push to all three, but on github you have three. Merges may work better if maintainers cherry-picked from another branch (or repo) instead of extracting and applying patches. Though I'm not sure about that. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
[zeromq-dev] Reliability of zeromq based project
Hi, We have a design problem with our zeromq application. We have two clusters, A and B, each serving its own pool of clients. We have RPC based on the zeromq request-reply pattern between them. RPC is used for serving 1 to 10 percent of requests, and we expect at least 1000 requests per second at each cluster. The problem we encountered is when cluster B is down. We have to time out each request, for example after 5 seconds. This obviously takes down cluster A, because each worker in cluster A is waiting for RPC. Ideally we want requests which can be served without RPC to B to be served with minimal latency changes. We have several options: 1. Lower the timeout. One that probably solves the problem of a failed cluster B is about 10ms or even 1ms, which will probably fail most requests under high load (cluster B has its own request queue, even if we prioritize RPC over serving clients). 2. Make all workers asynchronous. Then we can set the right HWM and begin to get failures immediately after the buffers are full. It's a huge amount of work. (Well, the reason I like zeromq is that usually synchronous code using zeromq is as fast as asynchronous code.) 3. Place a device between them, count unanswered requests, and fail subsequent requests immediately until some timeout. The problem with this one is that when unanswered_request_count is lower than the number of workers at A, each worker will wait for the response anyway. If we have it the other way around, then if all workers at A request RPC at roughly the same time, we will get failures (the latter is fairly common, as client requests which do not need RPC to B are much faster). 4. Add monitoring and do not send requests to B. This solution solves the problem when B is totally down (no pings, etc.), but not when it's overloaded by requests. (We would probably be monitoring another service, like ICMP, not RPC, as the only way to monitor RPC is in (3).) 5. We can also split the topologies that need RPC to B from the ones that don't,
so the latter are much faster. Which means the client must choose a topology, which doesn't sound very good. Also it sometimes can't be determined ahead of time. We are currently thinking about some smarter device for (3), but it would be great if there were a better way. With traditional networking, the most dumb way would be to try to connect on each request, which often fails much faster than timed-out requests in zeromq. Also, you could try connecting on every 10th request if the previous connection/request attempt failed, and so on. I'm open both to good ideas and to practical experience of solving similar issues. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
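Option (3) — failing RPC fast while B looks dead instead of blocking every worker for the full timeout — is essentially a circuit breaker. A sketch with made-up knobs (`failure_threshold` and `reset_after` are illustrative, not values from the thread):

```python
import time

class CircuitBreaker:
    """Fail calls fast after repeated failures; retry one probe later."""
    def __init__(self, failure_threshold=5, reset_after=5.0, clock=time.monotonic):
        self.failures = 0
        self.threshold = failure_threshold
        self.reset_after = reset_after
        self.opened_at = None          # None means the circuit is closed
        self.clock = clock

    def allow(self):
        """Should this RPC be attempted at all?"""
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.reset_after:
            self.opened_at = None      # half-open: let one probe through
            self.failures = 0
            return True
        return False                   # fail immediately, no timeout burned

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()
```

A worker (or the smarter device of option 3) would call `allow()` before each RPC to B and return an error to the client immediately while the circuit is open, so workers serving non-RPC requests are never tied up.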
Re: [zeromq-dev] Reliability of zeromq based project
Hi Ilja, On Wed, Nov 9, 2011 at 11:44 AM, Ilja Golshtein ilej...@narod.ru wrote: Hi Paul. Although your description is not complete, of course, my choice is somewhere near (2), although I understand your grief about the extra complexity. And what tools do you use? Are you a happy node.js or tornado user? It is terrible to block threads waiting for a response from another box (and I'd say from another process) in most production-alike cases. Even if a timeout exists. Ah, well, most of today's database wrappers are synchronous. And databases are almost always separate processes, and usually on separate boxes. Similarly, there are various problems with different asynchronous frameworks; the biggest one is probably unreadable code. So in a perfect world you would probably be right, but there are not so many tools to do it right now. It would be great to delegate all (or some) further processing to another cluster instead of waiting. One good alternative we use sometimes is to just push data from A to B and wait for another push from B to A (and it can be wrapped in request-reply if needed). This works for some stuff, but there are a lot of cases where we must hold a lock or keep some complex state while doing the request. -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev
Re: [zeromq-dev] 0MQ/3.1
On Wed, Nov 9, 2011 at 6:56 PM, Mikko Koppanen mikko.koppa...@gmail.com wrote: the single biggest disadvantage from my point of view is related to the workflow using three separate repositories. If everything was in a single repository we could finally get rid of manually having to export/merge diff files between directories. Also, at the moment people working on all repositories have to go through three different directories to make sure that git clones are up to date before starting to work on something. Another added benefit of a single repository is being able to follow progress and commits through different branches with more ease. As far as I know most of the tools, such as gitk, are also geared towards a single repository. It may be harder to follow the work on github, but not on a local copy. Just do:

git remote add zeromq2-1 git://github.com/zeromq/zeromq2-1
git remote add zeromq2-2 git://github.com/zeromq/zeromq2-2
git remote add libzmq git://github.com/zeromq/libzmq

(exact URLs are untested) And you can do all of the following:

# show combined log of all projects
git log --all
# update data from all repositories
git fetch --all
# work on any version's master branch
git checkout -b master zeromq2-1/master
# merge changes from all the master branches into the current branch
git merge zeromq2-1/master zeromq2-2/master libzmq/master
# pull also works
git pull zeromq2-1 master; git pull zeromq2-2 master

Sure, you can cherry-pick from any branch, and gitk will also show all of them. What else do you want? On Wed, Nov 9, 2011 at 10:23 PM, Pieter Hintjens p...@imatix.com wrote: There is no latest production master of 0MQ, and trying to present this almost throttled our release cycle, last year. As we've seen, the freedom to maintain multiple conflicting realities (2.1 vs. 3.0 vs. 4.0) is essential. The alternative would by now have been a choice between forced upgrade, or no experimentation. Sure. You can change the default branch for a git repository. Just delete the master branch.
Make 2.1.x, 2.2.x, 3.0.x and 3.1.x branches. Set the default to either the 3.1.x branch or one named unstable, so that it's clear that it's just for experimentation. git flow itself makes the develop branch the default, not master. You can also add a stable branch which is synchronized with 3.1.x or whatever the latest stable is, so new projects can just grab it and work, instead of examining what the best release is. If you do not like the 3.1.x notation you can use libzmq3-1-current or v3.1-master notation or whatever you like. You can also use folders for feature branches. Usually they may be called feature/yyy, but if you want a feature for some version you can adopt a for-3.1.x/yyy naming scheme. The only problem with several repositories is github forks. People will be forced to have several forks, and maybe even to sync all their zmq forks, for example to propose a pull request for several versions. Just my 2 cents. Not to force either way. Just to educate people that neither way is really a problem for git. -- Paul
Re: [zeromq-dev] multiplexer pattern
On Fri, Sep 16, 2011 at 11:13 PM, MinRK benjami...@gmail.com wrote: * clients can make requests of particular engines * engine replies propagate to the requesting client * engines and clients can come and go over time (message loss for vanished endpoints is fine) * clients and engines may not bind, and only connect to the mux and Hub (the mux also connects to the hub) IDENTITY made this pattern *extremely* easy, and I don't see the right way to build a MUX in its absence. We have a similar setup today, without identities. We use unix sockets for connections between workers and the router, and each worker has a unique socket (actually we have several workers per named service, which seems to be what you call an engine, but that's not important). It's hard if you have too many engines. But you can somewhat soften the pain by aggregating them into groups with an intermediate device (which, similarly to your router, mostly shuffles message parts in some way). With zeromq 4.0 you can probably do something similar with ZMQ_GENERIC, but you need to keep track of identities yourself. This is compensated by a bonus: you can keep track of several workers per service (engine) and you can queue messages instead of losing them when there are no workers available (maybe there are other bonuses too). -- Paul
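The "track identities yourself, queue instead of losing" bookkeeping described above can be sketched as a toy routing table (plain Python, no sockets; `ServiceTable` and its methods are invented names — the real mux would feed it identities from a ROUTER socket):

```python
from collections import defaultdict, deque

class ServiceTable:
    """Toy routing table for a ROUTER-based mux: several workers per
    service, and per-service queuing instead of dropping requests when
    no worker is currently registered. Identities are plain bytes, as
    they would appear on a ROUTER socket."""
    def __init__(self):
        self.workers = defaultdict(deque)   # service -> ready worker ids
        self.backlog = defaultdict(deque)   # service -> queued requests

    def worker_ready(self, service, identity):
        """A worker announced itself; dispatch a queued request if any."""
        if self.backlog[service]:
            return identity, self.backlog[service].popleft()
        self.workers[service].append(identity)
        return None

    def request(self, service, payload):
        """Route a request; returns (worker_id, payload) or queues it."""
        if self.workers[service]:
            return self.workers[service].popleft(), payload
        self.backlog[service].append(payload)   # queue instead of losing
        return None
```

The device would then prepend the returned worker identity to the outgoing message parts.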
Re: [zeromq-dev] recovering from an unanswered reply
Hi, On Sat, Aug 20, 2011 at 8:08 PM, Pieter Hintjens p...@imatix.com wrote: On Sat, Aug 20, 2011 at 5:46 PM, Mathijs Kwik bluescreen...@gmail.com wrote: Cool, I needed some confirmation that close/reconnect isn't evil/frowned upon. It depends. It could be good design, it could be bad design. For example, reconnecting is a simple way to handle certain error conditions (see the Lazy Pirate pattern), but frequent close/reconnect in normal situations may exhaust system resources (running out of file handles or sockets). General advice would be: never design anything that's not a precise minimal answer to a problem you can identify and have to solve. And in my opinion the minimal answer would be to use XREQ. -- Paul
Re: [zeromq-dev] process control revisited
Hi Andrew, On Sat, Aug 6, 2011 at 10:09 PM, Andrew Hume and...@research.att.com wrote: each server is now a black box. all data flow into a server goes through one of 2-3 portal processes (think 0MQ device), and there is a global config specifying how to map the key field for each of these data flow types into a server. thus, any process needing to send a datum d with key k, can simply look up how to map k into a server name and the port number for the portal for d on that server. By coincidence, we migrated to a roughly similar setup a week ago. It really simplifies configuration a lot. We use two ports for different patterns (req/rep and push/pull), and that's pretty much all the network configuration. All the unix sockets are created automatically. Although I should mention that we have more traffic going from frontend servers directly to backend servers, and only a percent of it (the more complex interactions between components) going through the devices. To make this work we had to push the destination address to the XREQ socket before the data. It smells a bit like a hack, but it works well for us. I also think we could all benefit if a similar device were publicly available. We have prototyped one in Python. If its configuration and routing concepts sustain for some time, we will probably rewrite it in C and publish it as an open-source project. -- Paul
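"Push the destination address before the data" amounts to building the multipart message yourself. A hedged sketch of the frame layout (names invented; whether an empty delimiter frame is included depends on the device — I assume here the REQ-compatible envelope with a delimiter):

```python
def address_message(destination, payload):
    """Build the frame list an XREQ/DEALER socket would send so that a
    ROUTER-style device on the other side can route by `destination`.
    The empty frame is the usual envelope delimiter; the device pops
    the destination frame and uses it for routing."""
    return [destination, b'', payload]
```

With pyzmq this list would go straight into a multipart send; the receiving device pops frames until the delimiter to recover the route.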
Re: [zeromq-dev] [PATCH] Commands are now stored in ypipes instead of socketpairs
Hi Martin, On Sun, Jul 3, 2011 at 2:40 PM, Martin Sustrik sust...@250bpm.com wrote: Hi all, This patch aims to fix the long-standing problems with asserts in mailbox.cpp due to insufficient OS socket buffers. It does so at the expense of adding ~0.5us to the latency (tested on Linux 2.6.32). Isn't it better to use eventfd instead of a socketpair on Linux? -- Paul
Re: [zeromq-dev] [PATCH] Race condition in eventfd signaler fixed
Hi Martin, Why do you check for 2? There can be any value >= 1. Why do you check the value anyway? If I understand the code correctly, it puts the pipe into the active state and reads until there are no more messages, so you just don't care about the number. On Sun, Jul 3, 2011 at 4:32 PM, Martin Sustrik sust...@250bpm.com wrote: -- Paul
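The point about "any value >= 1" follows from how eventfd works: writes add to a 64-bit counter, and a (non-semaphore) read returns the whole accumulated value and clears it. A pure-Python model of that semantics (no real file descriptor; `ToyEventfd` is an invented name, and a real eventfd read would block or return EAGAIN at zero rather than return 0):

```python
class ToyEventfd:
    """Model of Linux eventfd counter semantics: each write adds to
    the counter, a read returns the accumulated total and resets it.
    So after N signals a reader sees N, not necessarily 2 -- testing
    for a specific value is meaningless."""
    def __init__(self):
        self.counter = 0

    def write(self, value=1):
        self.counter += value

    def read(self):
        value, self.counter = self.counter, 0
        return value
```

This is why the reader should just drain the pipe after any read, rather than interpret the count.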
Re: [zeromq-dev] [PATCH] Race condition in eventfd signaler fixed
Hi Martin, On Sun, Jul 3, 2011 at 5:40 PM, Martin Sustrik sust...@250bpm.com wrote: Note that the signal is not removed immediately from the eventfd when the reader is activated; rather, it's left lingering there so that polling (ZMQ_FD) reports that the associated socket is readable (POLLIN). Thanks for the explanation, it seems pretty sane now. But does the statement above mean that you are going to turn ZMQ_FD into level-triggered? -- Paul
Re: [zeromq-dev] Survey: use of 0MQ request reply
On Wed, Jun 15, 2011 at 12:06 PM, Pieter Hintjens p...@imatix.com wrote: Hi all, Following some discussion on IRC about reviewing the REQ/REP designs, I'd like to measure how people are using these. So, a short survey, please answer inline, if you can. 1. Are you using ROUTER (XREP) sockets in custom devices or brokers? Probably yes. If you think of zerogw as a broker, a cross-protocol broker actually. Also planning to write another broker for reliability and load-balancing. 2. Are you using ROUTER sockets in end-nodes (application code)? If so, can you explain how and why? Yes, when writing asynchronous applications. Usually no for synchronous. We are trying to use monitoring to restart processes and detect crashed nodes, until we implement our own load-balancing. 3. Are you using DEALER sockets in end-nodes? If so, can you explain how and why? Yes. Always, instead of REQ, to have timeouts on requests. 4. Are you using the REQ-REP multi-hop functionality, either with a queue device, or a custom device? Can you explain how and why? Yes. Currently the queue device. We need that for many-to-many connections to cut the total number of TCP streams from tens of thousands to hundreds. It also eases configuration. Also planning to implement our own device for reliability and better load-balancing. 5. How do you handle failures (lost requests or replies)? Timing out and either resending the request or propagating the failure, depending on the type of the request. 6. Anything you feel is missing or inadequate in the REQ/REP designs? Please explain with a use case. Wrote a lot about that. Req/Rep is almost unusable, but the failure is different for each one. For REQ it's the inability to recover from failure/timeout (I've seen some mentions of timeouts on the ml; are they implemented?). For REP it's the inability to send presence/heartbeat/status update info. The best solution is to use a pub/sub pair for that, but there are drawbacks: 1.
increased traffic, BTW sending info that is usually already known, unless the process crashed or something happened; 2. another port, which makes configuration complex (we use from two to five different request-reply services on each of our quite small projects, each service also has several nodes, and that's not counting other patterns, so adding another layer of complexity is quite distracting); 3. being unable to report presence immediately, only on the next heartbeat (we don't know when a reconnect takes place in zeromq). I guess the use cases are quite clear from the above. -- Paul
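The "DEALER instead of REQ, to have timeouts" answer above boils down to a retry loop that REQ's strict send/recv alternation cannot express. A minimal sketch, with `send_and_wait` standing in for sending on a DEALER socket plus polling it with a timeout (both the function and its contract are assumptions for illustration):

```python
def request_with_retry(send_and_wait, payload, retries=3):
    """Lazy-Pirate-style loop: send, wait with a timeout, resend on
    timeout. `send_and_wait` returns the reply or None on timeout.
    A plain REQ socket cannot do this: after an unanswered send it
    refuses to send again, so the resend path needs DEALER/XREQ."""
    for attempt in range(retries):
        reply = send_and_wait(payload)
        if reply is not None:
            return reply
    raise TimeoutError('peer did not answer after %d attempts' % retries)
```

Note this makes a request idempotent-or-bust: a reply that arrives after the timeout must be matched or discarded by the caller.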
Re: [zeromq-dev] ZeroMQ integration with Boost
On Fri, May 27, 2011 at 11:38 AM, Maciej Gajewski maciej.gajew...@tibra.com wrote: I've managed to successfully observe 0MQ sockets with a boost asio io_service in the following way: I. The file descriptor returned by getsockopt ZMQ_FD is a descriptor of the internal stream socket used to send commands from the io thread to. You can wrap it in a boost stream socket. Unfortunately the boost stream socket closes the descriptor when destroyed, so I had to modify the stream descriptor service in a way that it does not close the socket Wouldn't it be easier to dup() the socket? This way both zeromq and boost can close their file descriptors separately. -- Paul
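The dup() suggestion works because a duplicated descriptor refers to the same open file description, so each owner can close its own copy without affecting the other. A small demonstration with a pipe standing in for the descriptor ZMQ_FD would return (Python's os module, same semantics as the C call):

```python
import os

# A pipe stands in for the descriptor returned by ZMQ_FD.
r, w = os.pipe()

r2 = os.dup(r)   # hand the duplicate to the wrapper (e.g. boost asio)
os.close(r2)     # the wrapper closing its copy on destruction...

os.write(w, b'x')
data = os.read(r, 1)   # ...leaves the original descriptor usable
os.close(r)
os.close(w)
```

In the boost case, the asio stream socket would own `r2` and destroy it freely, while zeromq keeps closing the original when the context terminates.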
Re: [zeromq-dev] Should zeromq handle OOM? (Re: [PATCH] Fixed OOM handling while writing to a pipe)
On Sat, May 21, 2011 at 2:31 AM, Pieter Hintjens p...@imatix.com wrote: I think what you need to do is make a test bench that proves the case. Perhaps an EC2 instance with low virtual memory, or somesuch. Something people can play with. The problem is that fixing all the failures will take a big amount of time and would be quite a huge refactoring (minor in terms of functionality, but huge in the number of lines of code), which would take time and would be hard to merge afterwards. Something realistic I can try to hack up is: 1. turn off disconnecting on OOM (so that a new connection would not crash the application) 2. fix everything about sending/receiving messages 3. set up two zmq_streamer-like devices (old code and new code) open to the external world, so that anybody can test; set a small ulimit for both 4. set up a slow consumer which also shows statistics from both devices Will having this example working convince you? -- Paul
[zeromq-dev] Should zeromq handle OOM? (Re: [PATCH] Fixed OOM handling while writing to a pipe)
Hi Pieter, I've changed the subject to let others follow the discussion more easily. On Fri, May 20, 2011 at 1:30 PM, Pieter Hintjens p...@imatix.com wrote: On Fri, May 20, 2011 at 12:21 PM, Martin Sustrik sust...@250bpm.com wrote: There's one important point to be made: 0MQ currently behaves 100% predictably in OOM conditions -- it terminates the process. The user is then free to restart the process or take whatever emergency measures are necessary. Any patches to OOM handling should preserve this 100% predictability. zmq_send() can return ENOMEM instead of terminating the process, however, it must do so consistently. Introducing undefined behaviour under OOM conditions is not an option. Sorry to say this rather late, but before we change the behavior of 0MQ under OOM conditions, I'd want the consensus of users here. It is a radical change in semantics to go from asserting, to continuing with an error response. We cannot make such changes without being certain there is a consensus of approval for them. My own experience goes strongly against handling OOM in any way except assertion. We explored this quite exhaustively in OpenAMQ and found that returning errors in case of OOM was very fragile. It is not even clear that an application can deal with such errors sanely, since many system calls will themselves fail if memory is exhausted. We tried hard to make this work, and in the end had to choose assert as the only robust answer. It's particularly important for services because most of the time there is a problem that must be raised and resolved, whether it's the too-low default VM size, or the lack of HWMs on queues, or too-slow subscribers, etc. The only exception to assertion, afaics, is for allocation requests that are clearly unreasonable. And even then, assertion seems the right response if these requests are internal. If they're driven by user data (i.e.
someone sending a 4GB message to a service), the correct response is detecting over-sized messages and discarding them (and we have this code in 2.2 and 3.0). tl;dr - +1 for asserting on OOM, -1 for returning ENOMEM. The problem with asserting on OOM is that you exclude zeromq from a whole class of applications. All of today's high-performance databases use a writeback cache, and it's totally bad for them not to be able to flush the dirty cache (well, it's technically possible by installing a handler on SIGABRT, but that is much less reliable). You exclude all kinds of databases: persistent queues, caches, whatever. Probably this is not the only kind of application excluded, just something that came to my mind. So I'm -1 on asserting and +1 for ENOMEM (but the situation where two core developers have exactly opposite opinions is unfortunate) -- Paul
Re: [zeromq-dev] [PATCH] Fixed OOM handling while writing to a pipe
Hi Martin, On Fri, May 20, 2011 at 12:57 PM, Martin Sustrik sust...@250bpm.com wrote: Hi Paul, In case of OOM zmq_send will return EAGAIN, because there is no way to return another error from the pipe (writer_t). Do you think it's worth fixing? This basically means writer_t::write should return 0 and set errno instead of returning bool. (Or should it return ENOMEM directly?) Yes. Definitely. If an error is to be returned it should be the correct error. Ok As for returning the error, the most common POSIX practice is to return a negative number and set errno. A return value of 0 is used for success. Yup, but citing http://www.unix.org/whitepapers/reentrant.html: In addition, all POSIX.1c functions avoid using errno and, instead, return the error number directly as the function return value, with a return value of zero indicating that no error was detected. This strategy is, in fact, being followed on a POSIX-wide basis for all new functions. but, nevermind, it should be consistent with the other functions Sometimes a write fails even if `more` is set on the previous message; this will trigger some assertions in the zeromq code (they will be fixed in future patches), What assertions? If send fails we should revert the state to what it was before send() was called. That shouldn't trigger any assert. e.g. at lb.cpp:118 your code expects that a pipe write never fails after the first message. I would improve that, but you just said you want small incremental steps, and this patch really fixed a crash in one of my test scripts. and probably can trigger assertions in user code. Same as above. Well, I usually do something like: rc = zmq_send(msg, ZMQ_SNDMORE); if (rc) return false; rc = zmq_send(msg2, 0); assert(!rc); Because previously, sending all subsequent message parts could never fail. One additional issue: If you change the function prototype to return an error, you should check the error at each place the function is called from. Sure. Have I missed something?
But as currently nobody expects zeromq to work under OOM, it's probably fine to live with this for some time. There's one important point to be made: 0MQ currently behaves 100% predictably in OOM conditions -- it terminates the process. The user is then free to restart the process or take whatever emergency measures are necessary. Any patches to OOM handling should preserve this 100% predictability. zmq_send() can return ENOMEM instead of terminating the process, however, it must do so consistently. Introducing undefined behaviour under OOM conditions is not an option. Sure. Termination at lb.cpp:118 would be quite unfortunate, but as I've said it's a single step, and I think it's ok for a development version. All in all, I'll fix the patch, if we decide it's needed at all (the discussion is in the other thread). -- Paul
Re: [zeromq-dev] Improving zeromq in OOM conditions
On Tue, May 17, 2011 at 11:26 AM, Martin Sustrik sust...@250bpm.com wrote: 1. Use finite default HWM. 2. Use finite default MAXMSGSIZE. 3. Implement a MAXCONNECTIONS option with finite default. 4. Think hard about whether it makes sense to allow infinite as a valid option for any of the above. Ok, I'd like to configure applications in the following way: 1. I should have up to 10 connections max (which is a very small cluster) on 10 application sockets (we build applications as many interconnected services) 2. Maximum msg size is 1Mb (which is ok, even for some web pages) 3. HWM should be about 100 Which means I should reserve 10Gb for this application, which under usual load takes less than 1Gb (most messages are 10Kb). When the application is designed well, and zerogw has no memory-related asserts, I can set a ulimit for the application (or run it on a 2Gb machine with overcommit turned off), and be sure that when lots of big messages are sent to the application it will be a bit slow for some time, but will not crash, dropping hundreds of messages. However, that's my personal opinion. If you still feel like you should handle the OOM situation when it hits, feel free to try. However, don't do any guesswork. With OOM handling the guesses mostly turn out to be false. Make a test instead and fix the problem you'll get. You don't scare me, sorry :) I do start with testing, but the aim is to remove all the alloc_assert's anyway. Also keep in mind that your sophisticated OOM handling is likely to be spoiled by the OS OOM killer kicking in and killing the whole process. No. There are usually two cases: 1. overcommit is on: malloc always returns a valid pointer, and the OOM killer stops applications on memory access 2. overcommit is off: malloc returns NULL on OOM, and no applications are killed There is an exception to rule 1: if the ulimit is reached, malloc returns NULL, but the application is not killed.
Anyway, you don't scare me, so let's stop bikeshedding and talk about the implementation (I will start another thread for that shortly). -- Paul
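The worst-case reservation mentioned above is just the product of the proposed limits. A worked version of that arithmetic (reading the configuration as 10 connections on each of 10 sockets, which is one plausible reading of the email):

```python
connections = 10   # peers per socket (a very small cluster)
sockets = 10       # application sockets
hwm = 100          # messages buffered per connection (HWM)
max_msg_mb = 1     # MAXMSGSIZE, in megabytes

# Every connection on every socket may buffer `hwm` full-size messages.
worst_case_gb = connections * sockets * hwm * max_msg_mb / 1024
# Just under 10 GB that would have to be reserved, against the ~1 GB
# the application actually uses under normal load (10 KB messages).
```

This gap between the worst-case bound and typical usage is the argument for handling ENOMEM instead of sizing for the product of all limits.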
Re: [zeromq-dev] Improving zeromq in OOM conditions
On Tue, May 17, 2011 at 1:06 AM, Paul Colomiets p...@colomiets.name wrote: On Mon, May 16, 2011 at 10:38 AM, Martin Sustrik sust...@250bpm.com wrote: 1. src/msg.cpp:56 Fixed in one of the patches attached to the previous email Sorry, that's not true :) I was skimming the code in a hurry. Well, this one is something that actually closes the connection. Do you think we should do anything better here? I mean the code in decoder.cpp:77 and decoder.cpp:112.
[zeromq-dev] Improving zeromq in OOM conditions
Hi, As Martin encouraged me to fix zeromq in out-of-memory conditions, here are the first patches and first questions. There are a lot of explicit and implicit (e.g. inserting into an STL container) memory allocations in constructors in the zeromq code. As long as we are encouraged not to use exceptions in zeromq code, we can't gracefully propagate exceptions from there. So I see three options: 1. Refactor the code to have all the memory allocations in an `init()` method (other name?) 2. Allow throwing and catching exceptions in code which is not on the critical path 3. Move memory allocation code into an overridden `new` (which will probably turn it into a mess) BTW, if catching exceptions is discouraged at all, we need to rewrite all code which uses STL containers. Thoughts? -- Paul 0001-zmq_msg_init_data-returns-ERRNO-instead-aborting.patch Description: Binary data 0003-Better-handling-of-memory-error-in-resolve_ip_hostna.patch Description: Binary data
Re: [zeromq-dev] HWM default
On Thu, May 12, 2011 at 9:56 AM, Martin Sustrik sust...@250bpm.com wrote: On 05/09/2011 02:41 PM, Paul Colomiets wrote: Well, I have no practical experience, but a lot of alloc_assert's in the code say that it will be aborted. BTW, it will be so not only if overcommit_memory is set right, but also if a ulimit is set for the application's memory. Which is awfully bad IMO. What else would you do if you run out of memory? Any recovery mechanism in user space is likely to fail because there's no memory available :) Killing (and possibly restarting) the app seems like a reasonable option to me. On the application side you can flush buffers, flush caches, close connections. On the zeromq side you can stop accepting connections, stop processing incoming data, drop incoming messages, drop messages already in the queue (delivery is not reliable anyway), wait (for data to be processed), or notify the user about the memory failure. Enough? For some applications it's crucial, e.g. for databases which have a write cache, for persistent queues which postpone their writes, for game servers keeping some state in memory, whatever. You are right that it's reasonable for some *applications*. But it's quite mad for a networking library. It will be fixed in the kernel implementation anyway (maybe using other methods). Still, I'm pretty sure it should be fixed in the library.
Re: [zeromq-dev] HWM default
On Mon, May 9, 2011 at 9:17 AM, Blair Bethwaite blair.bethwa...@monash.edu wrote: As an aside, any idea of the 0MQ behaviour under flooding with no HWM but sane kernel virtual memory settings (i.e., /proc/sys/vm/overcommit_memory=2)...? Well, I have no practical experience, but a lot of alloc_assert's in the code say that it will be aborted. BTW, it will be so not only if overcommit_memory is set right, but also if a ulimit is set for the application's memory. Which is awfully bad IMO. We should distinguish deadlocks inside 0MQ (such as the one introduced by the shutdown functionality) which should be considered 0MQ bugs and deadlocks in applications on top of 0MQ (the ones we are discussing now) which should be considered application bugs. I don't see why shutdown would introduce deadlocks inside 0MQ. But thinking about the default HWM a bit more, it seems that the issues are the same using TCP or unix domain sockets. And for a future in-kernel implementation HWM should probably always be set. So it's probably ok to have a reasonable default.
Re: [zeromq-dev] HWM default
I should second Andrew Hume. The HWM is a really easy way for a novice to get a deadlock. And if you are proficient with zeromq, you know when to set HWM and which value suits your usage. I should also remind you that deadlock probability was the reason why my shutdown proposal was rejected. On Sun, May 8, 2011 at 9:15 AM, Martin Sustrik sust...@250bpm.com wrote: Hi all, I thought of changing the default value for HWM to, say, 1000 in 0MQ/3.0. The rationale being that an unlimited buffer is a stability threat. In other words, a default socket, when overloaded or DoS'ed, will run out of memory and crash the application rather than behave decently. So, I think, we should offer stable behaviour as the default and leave it to the user to opt out if an infinite buffer is really what he wants. Martin
Re: [zeromq-dev] Improving message patterns
13.04.2011, 09:10, Martin Sustrik sust...@250bpm.com: Most people want some kind of route-to-address mechanism so that they can send messages to specific instances of services. Once you start working on a solution for that problem you'll find out that you are basically duplicating the IP's routing functionality, distribution of routing tables etc. So the question at the moment is: What exactly should 0MQ provide in this scenario that is not already covered by IP? When we have the answer, it will be much easier to think about the solution. Yes, you can say that this is covered by IP. But isn't what is covered by IP also covered by ethernet? What we want is logical addresses at the zeromq level, which means both the possibility that two services would be on the same machine, and the possibility that two machines would be used as a single logical node. The same happens with the network at every level: we could theoretically live with a single site per IP and give a machine several IP addresses, but usually we use domain names both to group sites on a single IP and to load-balance a single site across different IPs. -- Paul
Re: [zeromq-dev] 0MQ protocol stack
Hi Martin, 08.04.2011, 08:52, Martin Sustrik sust...@250bpm.com: On 04/06/2011 09:45 PM, Paul Colomiets wrote: We can introduce a notion of out-of-band messages. This should be a flag on messages which otherwise look like normal ones. They can be used for connection initiation and heartbeating. (Maybe also subscriptions, but I'm not sure.) This way some options sent at initiation could probably be updated later. You mean using OOB to send/recv hop-to-hop messages, right? Right. The problem is that once the option is available in the API, people would start using it for business logic. Those applications would then become tied to single-hop topology, with no way to scale. Just don't let users read these messages. Connection init should just transmit some options set with setsockopt. And heartbeating could probably also be read via getsockopt (e.g. the time of the last successful heartbeat). Heartbeating and connection init are inherently hop-to-hop, so what's wrong with that? Probably setting some socket option (documented as an internal implementation detail which could change) may enable receiving OOB messages, if you really need them (it's useful for monitoring, or maybe for authorizing connections?). -- Paul
Re: [zeromq-dev] Important: backward incompatible changes for 0MQ/3.0!
Hi Martin, 04.04.2011, 09:16, Martin Sustrik sust...@250bpm.com: Hi Paul, The documentation is actually a bit misleading. After you call shutdown(s, SHUT_RD) you *can* read, up to the point when shutdown was called. It means everything already buffered will be read, and you will read until 0 (zero) is returned from the read call. What implementation is that? Both the POSIX spec and Stevens seem to suggest that you can't read after shutdown(s, SHUT_RD). This is Linux behavior. I just noted another behavior on FreeBSD, and currently have no access to other platforms to test. But if Linux does that for TCP, why can't zeromq? 2. The handshake with all the peers during the shutdown can take an arbitrarily long time and even cause a deadlock. Probably yes. It's up to you to use it. Many small applications will actually never care. Many huge applications will probably use failure resistance to overcome the reconfiguration problem. But there are plenty of others where you would stop the entire system when you add, remove or replace a node in the configuration if you have no chance to shut down the socket cleanly. And time is not always very important. Well, I don't like introducing an API that works well for small apps and deadlocks for large apps. Let's rather think of something that works consistently in either scenario. I don't understand your fear of deadlocks. You are always polling before sending/receiving messages. If you don't, you are in a bit of trouble anyway. The IO thread works in a fully non-blocking manner, so it can't deadlock on sending/receiving bytes on the wire. If you don't want to shut down properly, you can always do that. Consider the following scenario. You have a node A which pushes work to node B along with 10 other nodes. And you should remove node B (probably because of replacing it with C). Node A has a bound socket. Currently you have two ways: * stop producing on A until all the work is consumed by B, then disconnect it, connect C and continue.
It can take a lot of time while the other workers are also stopped * never mind losing messages and react to them downstream, which takes a lot of time to notice (because of the timeout), and probably some more time to find the lost messages in the logs at node A. You have to keep in mind that messages may be lost at any point on the route from A to B. Thus, you can't count on being notified about the loss. The only way to handle it is a timeout. Btw, even in a simple single-hop scenario, the TCP spec mandates keep-alives of at least 2 hours. So, if B is killed brutally (such as when the power is switched off) A won't be notified for at least 2 hours. Yes. I do have timeouts. The timeout is about 5 seconds. I have lots of messages per second (at least a hundred, and up to 10 thousand in critical places), which means that the TCP connection will break very fast if a machine goes down. If I had a small application which could be idle for 2 hours, I'd never care to pause all producers to make any reconfiguration or software update. 1. Request/reply. In this case the requester can re-send the request after the timeout has expired. There are a couple of nice properties of this system: It's fully end-to-end, so you are resilient against middle-node failures. What you get is actually TCP-level reliability (as long as the app is alive it will ultimately get the data through). The downside is that in some rare circumstances a message may be processed twice. Which does not really matter as the services are assumed to be stateless. Actually, about 50% of the services I code are stateful. And indeed it's quite easy to implement the bookkeeping needed to re-send a request. It could easily be done in libzapi or in a language binding or in your own project. Probably the socket option would work for the small applications you've said you don't care about :) For big applications I guess they would write it in their own way (e.g.
I usually send both stateful and idempotent requests using the same socket, determining which one it is by application-specific means). Frankly, it would be great if I could just send a request and receive a reply without a loop around zmq_poll. But you can't disable EINTR, so the loop will obviously be there anyway, unless it's a language binding (which can throw an exception) or a networking library with its own main loop (which can reap the signal itself and resume), and both of them can do this sort of thing today and will not become much simpler either way. 2. Publish/subscribe. In this case we assume an infinite feed of messages that individual subscribers consume. When shutting down, the subscriber has to cut off the feed at some point. Whether it cuts off immediately, dropping the messages in flight, or whether it waits for the messages in flight to be delivered, is irrelevant. The only difference is that the point of cut-off is slightly delayed in the latter case. Currently if a subscriber disconnected for some reason, it can connect and continue where it left off (if its identity is
Re: [zeromq-dev] Important: backward incompatible changes for 0MQ/3.0!
Hi Martin, 03.04.2011, 11:58, Martin Sustrik sust...@250bpm.com: Hi Paul, I would say the question is how we can improve the reliability of 0mq (NB: not make it perfect, just improve it) without dragging all this madness in. That was exactly my intention. Maybe I haven't been clear about that. I'm thinking about an API similar to POSIX shutdown. First we call: zmq_shutdown(sock, SHUT_RD) Ok. A couple of points: 1. Your proposal doesn't map to POSIX shutdown semantics. POSIX shutdown is a non-blocking operation, i.e. it initiates a half-close and returns immediately. No more messages can be read/written. Well, I don't understand your point. Zeromq shutdown also must be non-blocking. The documentation is actually a bit misleading. After you call shutdown(s, SHUT_RD) you *can* read the data received up to the point when shutdown was called. It means everything already buffered will be read, and you will read until 0 (zero) is returned from the read call. 2. The handshake with all the peers during the shutdown can take an arbitrarily long time and even cause a deadlock. Probably yes. It's up to you whether to use it. Many small applications will actually never care. Many huge applications will probably use failure resiliency to overcome the reconfiguration problem. But there are plenty of others where you would have to stop the entire system when you add, remove or replace a node in the configuration, if you have no way to shut down the socket cleanly. And time is not always very important. Consider the following scenario. You have a node A which pushes work to node B along with 10 other nodes. And you need to remove node B (probably because you are replacing it with C). Node A has a bound socket. Currently you have two ways: * stop producing on A until all the work is consumed by B, then disconnect it, connect C and continue. 
It can take a lot of time, while the other workers are also stopped * never mind losing messages and react to them downstream, which takes a lot of time to notice (because of the timeout), and probably some more time to find the lost messages in the logs at node A. With the shutdown function, you shut down the socket at B and consume messages until the shutdown message comes (probably you can even forward them back to A, to be injected again); it doesn't matter if it takes time, because the replacement node can be connected immediately, and the other nodes still work. It is only a single scenario where it's crucial; there are plenty of others. Ah, well, if you care about zmq_close needing to send this message, then I would say: if you have an outstanding queue of messages, then sending a few more bytes doesn't matter. And if you have no queue, then the write call on the socket will return immediately and the OS will take care of it, so it wouldn't add considerable time to application shutdown. This kind of thing is extremely vulnerable to DoS attacks. Why? Timeouts also apply to the last request served. An application is much more vulnerable if it must entirely stop one producer to replace one of the consumers (see above). And of course if you remove the only (or last) consumer from the chain, then it's vulnerable. But with the described semantics you can start a new one within a minimal amount of time (I guess it's about 100 ms, the time needed for the other side to reconnect). 3. Note that the intention here is to improve reliability rather than make 0MQ reliable. See the previous email for the detailed discussion. Given that we are trying to provide some heuristic semi-reliable behaviour, it should not change the API in any way. Why do you want heuristics instead of clean behavior? Of course it will not actually improve reliability against crashes or network failures; it will improve the ability to reconfigure the application on the fly. 
Maybe a socket option would work, instead of an entirely new function, if that's your major concern. Probably if we add sentinel messages we can make PUB/SUB more reliable. When the connection from a publisher is closed unexpectedly we can send the application an EIO error (or whatever we choose). For tcp we know when the connection is broken; for ipc it is broken only on application crash, and we also know it; for pgm we have the retry timeout. Also, we have to inject this kind of message when the queue is full and we lose some messages. This way you don't need to count messages to know when to die if the message stream is broken (and don't need to duplicate complex bookkeeping when there are several publishers). For devices it's up to the application what to do with the error. It has to forward it as some application-specific message if it needs to. The problem here is that PUB/SUB allows for multiple publishers. Thus numbering the messages wouldn't help. The real solution AFAICS is breaking the pub/sub pattern into two distinct patterns: true pub/sub with a single stream of messages (numbering makes sense here) and aggregation, where streams from multiple publishers are aggregated as they are forwarded towards the consumer (no point in
Re: [zeromq-dev] Important: backward incompatible changes for 0MQ/3.0!
30.03.2011, 12:12, Martin Sustrik sust...@250bpm.com: Can you spell out more clearly what's the problem with zmq_content_t? The only problem is another memory allocation. If I want to use a different memory allocation technique for performance reasons, the allocation of the content_t structure can make all the benefits negligible. Probably an API for it could be introduced, like zmq_msg_metadata_size(), to get the size of the metadata structure, and zmq_msg_init_metadata(), which would accept a data pointer prefixed with the data size. Not sure if this complication is worth it in practice, though. As for the recv() side, there's no way at the moment to specify which allocation mechanism should be used. You are right that it should be done at the socket level (setting alloc/free function pointers using setsockopt, for example). Great news! And thanks for the good explanation. BTW, if we're discussing features for 3.0, I'd like to propose a few things. 1. Have you considered implementing devices inside the IO thread? I'm sure it's controversial, but there are lots of cases where devices are unavoidable, and adding tens of context switches for each message affects performance very negatively (we have a recv call for each zmq_recv and each zmq_send of a single message part). 2. Currently there is no way to close a PUSH/PULL or PUB/SUB socket, or even probably a REQ/REP socket, without losing a message. The only combination that works is XREP/XREQ (and XREP/XREP), and only while doing all the bookkeeping yourself. I know that it's intentional, because by design the system needs to be failure resistant and so on. But there are lots of use cases which need that. E.g. I want a local process to start, process a single message and die peacefully. Today, I'll lose more messages than I process. The second use case is if I have a device which load-balances messages between local processes over ipc://. 
I know that the only reason for processes to disconnect is when I want to restart a process to get a software update (crashes are quite rare). Okay, here is an even more controversial example. I have a client which will certainly repeat the request if there is network trouble. But a restart is going to lose at least tens of messages, and the timeout is for very hard edge cases, because of latency. And really, what's the purpose of a message queue which can't queue messages for me? :) -- Paul ___ zeromq-dev mailing list zeromq-dev@lists.zeromq.org http://lists.zeromq.org/mailman/listinfo/zeromq-dev