Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-30 Thread andrea crotti
2012/8/29 Julie Anderson julie.anderson...@gmail.com:


 I understand your frustration. I don't put the code here not because I don't
 want to, but because I am legally unable to. If you have a boss or employer
 you can understand that. :) I will try to come up with a simple version to
 do the same thing. It should not be hard to do a ping-pong in Java.


Frankly, I really don't believe you're not legally able to paste some code.
If there is some sensitive data in the tests you're doing,
delete it and use dummy data.  Do you think they are going to fire
you because you wrote some code which has absolutely no relevance to
your company on a mailing list?

If not, rewrite it when you're at home; I'm sure it's not that hard.
Complaining about something while not being able to prove anything is not
that good imho..
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-30 Thread Julie Anderson
As I said in the text you quoted: I will try to come up with a simple
version to do the same thing.

But Stuart did that for me in C. My thanks to him.

I am not complaining about anything... Just trying to understand why the
extra latency is necessary. There are already some very good answers here
about that. This extra latency by itself does not make ZeroMQ bad or slow.
I think Robert was the one who addressed that very well. Only a minority of
financial systems (hedge funds and exchanges) will care about 10
microseconds.

On Thu, Aug 30, 2012 at 10:10 AM, andrea crotti
andrea.crott...@gmail.com wrote:

 2012/8/29 Julie Anderson julie.anderson...@gmail.com:
 
 
  I understand your frustration. I don't put the code here not because I
  don't want to, but because I am legally unable to. If you have a boss or
  employer you can understand that. :) I will try to come up with a simple
  version to do the same thing. It should not be hard to do a ping-pong in
  Java.
 

 Frankly, I really don't believe you're not legally able to paste some code.
 If there is some sensitive data in the tests you're doing,
 delete it and use dummy data.  Do you think they are going to fire
 you because you wrote some code which has absolutely no relevance to
 your company on a mailing list?

 If not, rewrite it when you're at home; I'm sure it's not that hard.
 Complaining about something while not being able to prove anything is not
 that good imho..
 ___
 zeromq-dev mailing list
 zeromq-dev@lists.zeromq.org
 http://lists.zeromq.org/mailman/listinfo/zeromq-dev

___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-30 Thread Bennie Kloosteman
On Thu, Aug 30, 2012 at 11:56 PM, Julie Anderson 
julie.anderson...@gmail.com wrote:

 As I said in the text you quoted: I will try to come up with a simple
 version to do the same thing.

 But Stuart did that for me in C. My thanks to him.

 I am not complaining about anything... Just trying to understand why the
 extra latency is necessary. There are already some very good answers here
 about that. This extra latency by itself does not make ZeroMQ bad or slow.
 I think Robert was the one who addressed that very well. Only a minority of
 financial systems (hedge funds and exchanges) will care about
 10 microseconds.



Exchanges will; hedge funds won't (since they are dealing with at least
100 microseconds from the exchange, plus the links and their business logic).
Anyway, except for the exchange itself (which doesn't deal with links in
its quotes) I haven't seen a system that beats 1 ms consistently (though
there are probably a handful) in a real-life environment over real WAN
links. 10 us is 1% of that. Unless you host at the exchange or have some
special traffic-shaped connection you're also lucky to get 1 ms through their
routers and firewall. So get to 1 ms first, if you can, and then worry about
the microseconds.

Ben
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-30 Thread Pieter Hintjens
On Thu, Aug 30, 2012 at 12:13 AM, Julie Anderson
julie.anderson...@gmail.com wrote:

 Just tested ZeroMQ and Java NIO in the same machine.

You're comparing apples to a factory that can process apples into
juice at the rate of millions a second.

For that extra latency in 0MQ you get things like message batching,
asynch i/o, routing patterns. The cost could be brought down (see
Martin Sustrik's nano project, which brings it way down) by
redesigning the 0MQ internals.

Having said this, it's probably worth taking a profiler to 0MQ and
seeing if the critical path can't be improved somewhat.

-Pieter
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


[zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Julie Anderson
Just tested ZeroMQ and Java NIO in the same machine.

The results:
*
- ZeroMQ:*

message size: 13 [B]
roundtrip count: 10
average latency: *19.620* [us] *== ONE-WAY LATENCY*

*- Java NIO Selector:* (EPoll)

Average RTT (round-trip time) latency of a 13-byte message: 15.342 [us]
Min Time: 11.664 [us]
99.999% percentile: *15.340* [us] *== RTT LATENCY*

*Conclusion:* That's *39.240 versus 15.340*, so ZeroMQ overhead on top of
TCP is *156%*, or *23.900 microseconds* !!! That's excessive. I would expect
1 or 2 microseconds there.

So my questions are:

1) What does ZeroMQ do under the hood that justifies so many extra clock
cycles? (I am really curious to know)

2) Do people agree that 23 microseconds are just too much?

-Julie
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Brian Knox
As far as I see you haven't included your test methodology or your test
code.  Without any information about your test I can't have any opinion on
your results.  Maybe I missed an earlier email where you included
information about your test environment and methodology?

Brian

On Wed, Aug 29, 2012 at 11:13 AM, Julie Anderson 
julie.anderson...@gmail.com wrote:

 Just tested ZeroMQ and Java NIO in the same machine.

 The results:
 *
 - ZeroMQ:*

 message size: 13 [B]
 roundtrip count: 10
 average latency: *19.620* [us] *== ONE-WAY LATENCY*

 *- Java NIO Selector:* (EPoll)

 Average RTT (round-trip time) latency of a 13-byte message: 15.342 [us]
 Min Time: 11.664 [us]
 99.999% percentile: *15.340* [us] *== RTT LATENCY*

 *Conclusion:* That's *39.240 versus 15.340*, so ZeroMQ overhead on top of
 TCP is *156%*, or *23.900 microseconds* !!! That's excessive. I would
 expect 1 or 2 microseconds there.

 So my questions are:

 1) What does ZeroMQ do under the hood that justifies so many extra clock
 cycles? (I am really curious to know)

 2) Do people agree that 23 microseconds are just too much?

 -Julie


 ___
 zeromq-dev mailing list
 zeromq-dev@lists.zeromq.org
 http://lists.zeromq.org/mailman/listinfo/zeromq-dev


___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Chuck Remes

On Aug 29, 2012, at 10:13 AM, Julie Anderson wrote:

 Just tested ZeroMQ and Java NIO in the same machine.
 
 The results:
 
 - ZeroMQ:
 
 message size: 13 [B]
 roundtrip count: 10
 average latency: 19.620 [us] == ONE-WAY LATENCY
 
 - Java NIO Selector: (EPoll)
 
 Average RTT (round-trip time) latency of a 13-byte message: 15.342 [us]
 Min Time: 11.664 [us]
 99.999% percentile: 15.340 [us] == RTT LATENCY
 
 Conclusion: That's 39.240 versus 15.340, so ZeroMQ overhead on top of TCP is 
 156%, or 23.900 microseconds !!! That's excessive. I would expect 1 or 2 
 microseconds there.
 
 So my questions are: 
 
 1) What does ZeroMQ do under the hood that justifies so many extra clock 
 cycles? (I am really curious to know)
 
 2) Do people agree that 23 microseconds are just too much?

As a favor to me, please rerun the tests so that at least 1 million (10 million 
is better) messages are sent. This shouldn't take more than a few minutes to 
run. Thanks.

Secondly, are you using the local_lat and remote_lat programs that are included 
with zeromq or did you write your own? If you wrote your own, please share the 
code.

Thirdly, a pastie containing the code for both tests so others could 
independently reproduce your results would be very handy.

cr

___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Julie Anderson
New numbers (fun!). Firstly, to make sure I was comparing apples with
apples, I modified my tests to compute the one-way trip instead of the
round-trip. I can't paste code, but I am simply using Java NIO (non-blocking
I/O) optimized with busy spinning to send and receive TCP data. This is
*standard* Java NIO code, nothing too fancy. You can google around for Java
NIO. I found this link
http://www.cordinc.com/blog/2010/08/java-nio-server-example.html that
shows the basics. You can also do the same thing in C, as you can see here:
http://stackoverflow.com/questions/27247/could-you-recommend-some-guides-about-epoll-on-linux/6150841#6150841

My test now consists of:

- JVM A sends a message which consists of the ASCII representation of a
timestamp in nanos.
- JVM B receives this message, parses the long, computes the one-way
latency and echoes back the message to JVM A.
- JVM A receives the echo, parses the ASCII long and makes sure that it
matches the one it sent out.
- Loop back and send the next message.
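
In sketch form, the JVM A side looks roughly like this (standard Java NIO,
a non-blocking channel with busy spinning; this is NOT my real code - the
host, port and iteration count are made up and the percentile bookkeeping
is omitted):

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

public class PingPongClient {
    public static void main(String[] args) throws Exception {
        SocketChannel ch = SocketChannel.open(new InetSocketAddress("127.0.0.1", 9999));
        ch.configureBlocking(false);        // non-blocking; we busy-spin instead of selecting
        ch.socket().setTcpNoDelay(true);    // disable Nagle batching
        ByteBuffer buf = ByteBuffer.allocateDirect(64);
        for (int i = 0; i < 1_000_000; i++) {
            String msg = Long.toString(System.nanoTime());   // ASCII timestamp in nanos
            buf.clear();
            buf.put(msg.getBytes(StandardCharsets.US_ASCII));
            buf.flip();
            while (buf.hasRemaining()) ch.write(buf);        // spin until fully written
            buf.clear();
            buf.limit(msg.length());
            while (buf.hasRemaining()) ch.read(buf);         // spin until the echo is fully read
            buf.flip();
            byte[] echo = new byte[buf.remaining()];
            buf.get(echo);
            if (!msg.equals(new String(echo, StandardCharsets.US_ASCII)))
                throw new IllegalStateException("echo mismatch");
            // round-trip time for this message = System.nanoTime() - Long.parseLong(msg)
        }
        ch.close();
    }
}

JVM B just does the mirror image: spin-read the message, compute (receive
time minus the parsed timestamp) for the one-way number, and write the same
bytes back.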

So now I have both times: one-way and round-trip.

I ran my test for 1 million messages over loopback.

For ZeroMQ I am using the local_lat and remote_lat programs included with
latest zeromq from here: git://github.com/zeromq/libzmq.git

The results:

*- ZeroMQ:*

./local_lat tcp://lo: 13 100
./remote_lat tcp://127.0.0.1: 13 100

message size: 13 [B]
roundtrip count: 100
average latency: *19.674* [us] * this is one-way*

*- Java NIO:* (EPoll with busy spinning)

Round-trip: Iterations: 1,000,000 | Avg Time: *16552.15 nanos* | Min Time:
12515 nanos | Max Time: 129816 nanos | 75%: 16290 nanos | 90%: 16369 nanos
| 99%: 16489 nanos | 99.999%: *16551 nanos*

One-way trip: Iterations: 1,110,000 | Avg Time: *8100.12 nanos* | Min Time:
6150 nanos | Max Time: 118035 nanos | 75%: 7966 nanos | 90%: 8010 nanos |
99%: 8060 nanos | 99.999%: *8099 nanos*

*Conclusions:* That's *19.674 versus 8.100*, so ZeroMQ overhead on top of
TCP is *142%*, or *11.574 microseconds* !!! That's excessive. I would expect
1 microsecond overhead there.

So questions remain:

1) What does ZeroMQ do under the hood that justifies so many extra clock
cycles? (I am really curious to know)

2) Do people agree that 11 microseconds are just too much?

My rough guess: ZeroMQ uses threads? (the beauty of NIO is that it is
single-threaded, so there is always only one thread reading and writing to
the network)

-Julie

On Wed, Aug 29, 2012 at 10:24 AM, Chuck Remes li...@chuckremes.com wrote:


 On Aug 29, 2012, at 10:13 AM, Julie Anderson wrote:

 Just tested ZeroMQ and Java NIO in the same machine.

 The results:
 *
 - ZeroMQ:*

 message size: 13 [B]
 roundtrip count: 10
 average latency: *19.620* [us] *== ONE-WAY LATENCY*

 *- Java NIO Selector:* (EPoll)

 Average RTT (round-trip time) latency of a 13-byte message: 15.342 [us]
 Min Time: 11.664 [us]
 99.999% percentile: *15.340* [us] *== RTT LATENCY*

 *Conclusion:* That's *39.240 versus 15.340*, so ZeroMQ overhead on top of
 TCP is *156%*, or *23.900 microseconds* !!! That's excessive. I would
 expect 1 or 2 microseconds there.

 So my questions are:

 1) What does ZeroMQ do under the hood that justifies so many extra clock
 cycles? (I am really curious to know)

 2) Do people agree that 23 microseconds are just too much?


 As a favor to me, please rerun the tests so that at least 1 million (10
 million is better) messages are sent. This shouldn't take more than a few
 minutes to run. Thanks.

 Secondly, are you using the local_lat and remote_lat programs that are
 included with zeromq or did you write your own? If you wrote your own,
 please share the code.

 Thirdly, a pastie containing the code for both tests so others could
 independently reproduce your results would be very handy.

 cr


 ___
 zeromq-dev mailing list
 zeromq-dev@lists.zeromq.org
 http://lists.zeromq.org/mailman/listinfo/zeromq-dev


___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread gonzalo diethelm
Julie, it is a little exasperating that you keep posting these numbers (and 
related questions) but, to date, have not shown the CODE used to get them. It 
is not possible to give a meaningful answer to your questions without looking 
at the EXACT code you are using. Furthermore, it would be very useful to be 
able to RUN the same code in one's machine, to ascertain whether the behavior 
is the same as you are reporting, and maybe fix something in 0MQ.

Best regards,

--
Gonzalo Diethelm
DCV Chile

From: zeromq-dev-boun...@lists.zeromq.org 
[mailto:zeromq-dev-boun...@lists.zeromq.org] On Behalf Of Julie Anderson
Sent: Wednesday, August 29, 2012 1:19 PM
To: ZeroMQ development list
Subject: Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO 
Epoll (with measurements)

New numbers (fun!). Firstly, to make sure I was comparing apples with apples, I 
modified my tests to compute one-way trip instead of round-trip. I can't paste 
code, but I am simply using a Java NIO (non-blocking I/O) optimized with busy 
spinning to send and receive tcp data. This is *standard* Java NIO code, 
nothing too fancy. You can google around for Java NIO. I found this 
link http://www.cordinc.com/blog/2010/08/java-nio-server-example.html that 
shows the basics. You can also do the same thing in C, as you can see here: 
http://stackoverflow.com/questions/27247/could-you-recommend-some-guides-about-epoll-on-linux/6150841#6150841.

My test now consists of:

- JVM A sends a message which consists of the ASCII representation of a 
timestamp in nanos.
- JVM B receives this message, parses the long, computes the one-way latency 
and echoes back the message to JVM A.
- JVM A receives the echo, parses the ASCII long and makes sure that it matches 
the one it sent out.
- Loop back and send the next message.

So now I have both times: one-way and round-trip.

I ran my test for 1 million messages over loopback.

For ZeroMQ I am using the local_lat and remote_lat programs included with 
latest zeromq from here: 
git://github.com/zeromq/libzmq.git

The results:

- ZeroMQ:

./local_lat tcp://lo: 13 100
./remote_lat tcp://127.0.0.1: 13 100

message size: 13 [B]
roundtrip count: 100
average latency: 19.674 [us]  this is one-way

- Java NIO: (EPoll with busy spinning)

Round-trip: Iterations: 1,000,000 | Avg Time: 16552.15 nanos | Min Time: 12515 
nanos | Max Time: 129816 nanos | 75%: 16290 nanos | 90%: 16369 nanos | 99%: 
16489 nanos | 99.999%: 16551 nanos

One-way trip: Iterations: 1,110,000 | Avg Time: 8100.12 nanos | Min Time: 6150 
nanos | Max Time: 118035 nanos | 75%: 7966 nanos | 90%: 8010 nanos | 99%: 8060 
nanos | 99.999%: 8099 nanos

Conclusions: That's 19.674 versus 8.100, so ZeroMQ overhead on top of TCP is 
142%, or 11.574 microseconds !!! That's excessive. I would expect 1 microsecond 
overhead there.

So questions remain:

1) What does ZeroMQ do under the hood that justifies so many extra clock 
cycles? (I am really curious to know)

2) Do people agree that 11 microseconds are just too much?

My rough guess: ZeroMQ uses threads? (the beauty of NIO is that it is 
single-threaded, so there is always only one thread reading and writing to the 
network)

-Julie
On Wed, Aug 29, 2012 at 10:24 AM, Chuck Remes 
li...@chuckremes.com wrote:

On Aug 29, 2012, at 10:13 AM, Julie Anderson wrote:


Just tested ZeroMQ and Java NIO in the same machine.

The results:

- ZeroMQ:

message size: 13 [B]
roundtrip count: 10
average latency: 19.620 [us] == ONE-WAY LATENCY

- Java NIO Selector: (EPoll)

Average RTT (round-trip time) latency of a 13-byte message: 15.342 [us]
Min Time: 11.664 [us]
99.999% percentile: 15.340 [us] == RTT LATENCY

Conclusion: That's 39.240 versus 15.340, so ZeroMQ overhead on top of TCP is 
156%, or 23.900 microseconds !!! That's excessive. I would expect 1 or 2 
microseconds there.

So my questions are:

1) What does ZeroMQ do under the hood that justifies so many extra clock 
cycles? (I am really curious to know)

2) Do people agree that 23 microseconds are just too much?

As a favor to me, please rerun the tests so that at least 1 million (10 million 
is better) messages are sent. This shouldn't take more than a few minutes to 
run. Thanks.

Secondly, are you using the local_lat and remote_lat programs that are included 
with zeromq or did you write your own? If you wrote your own, please share the 
code.

Thirdly, a pastie containing the code for both tests so others could 
independently reproduce your results would be very handy.

cr


___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev






Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Julie Anderson
Here are the UDP numbers, for whom it may concern. As one would expect,
much better than TCP.

RTT: (round-trip time)

Iterations: 1,000,000 | Avg Time: *10373.9 nanos* | Min Time: 8626 nanos |
Max Time: 136269 nanos | 75%: 10186 nanos | 90%: 10253 nanos | 99%: 10327
nanos | 99.999%: 10372 nanos

OWT: (one-way time)

Iterations: 2,221,118 | Avg Time: *5095.66 nanos* | Min Time: 4220 nanos |
Max Time: 135584 nanos | 75%: 5001 nanos | 90%: 5037 nanos | 99%: 5071
nanos | 99.999%: 5094 nanos

-Julie


On Wed, Aug 29, 2012 at 12:18 PM, Julie Anderson 
julie.anderson...@gmail.com wrote:

 New numbers (fun!). Firstly, to make sure I was comparing apples with
 apples, I modified my tests to compute one-way trip instead of round-trip.
 I can't paste code, but I am simply using a Java NIO (non-blocking I/O)
 optimized with busy spinning to send and receive tcp data. This is
 *standard* Java NIO code, nothing too fancy. You can google around for Java
 NIO. I found this 
 link http://www.cordinc.com/blog/2010/08/java-nio-server-example.html that 
 shows the basics. You can also do the same thing in C, as you can see here:
 http://stackoverflow.com/questions/27247/could-you-recommend-some-guides-about-epoll-on-linux/6150841#6150841

 My test now consists of:

 - JVM A sends a message which consists of the ASCII representation of a
 timestamp in nanos.
 - JVM B receives this message, parses the long, computes the one-way
 latency and echoes back the message to JVM A.
 - JVM A receives the echo, parses the ASCII long and makes sure that it
 matches the one it sent out.
 - Loop back and send the next message.

 So now I have both times: one-way and round-trip.

 I ran my test for 1 million messages over loopback.

 For ZeroMQ I am using the local_lat and remote_lat programs included with
 latest zeromq from here: git://github.com/zeromq/libzmq.git

 The results:

 *- ZeroMQ:*

 ./local_lat tcp://lo: 13 100
 ./remote_lat tcp://127.0.0.1: 13 100

 message size: 13 [B]
 roundtrip count: 100
 average latency: *19.674* [us] * this is one-way*

 *- Java NIO:* (EPoll with busy spinning)

 Round-trip: Iterations: 1,000,000 | Avg Time: *16552.15 nanos* | Min
 Time: 12515 nanos | Max Time: 129816 nanos | 75%: 16290 nanos | 90%: 16369
 nanos | 99%: 16489 nanos | 99.999%: *16551 nanos*

 One-way trip: Iterations: 1,110,000 | Avg Time: *8100.12 nanos* | Min
 Time: 6150 nanos | Max Time: 118035 nanos | 75%: 7966 nanos | 90%: 8010
 nanos | 99%: 8060 nanos | 99.999%: *8099 nanos*

 *Conclusions:* That's *19.674 versus 8.100*, so ZeroMQ overhead on top of
 TCP is *142%*, or *11.574 microseconds* !!! That's excessive. I would
 expect 1 microsecond overhead there.

 So questions remain:


 1) What does ZeroMQ do under the hood that justifies so many extra clock
 cycles? (I am really curious to know)

 2) Do people agree that 11 microseconds are just too much?

 My rough guess: ZeroMQ uses threads? (the beauty of NIO is that it is
 single-threaded, so there is always only one thread reading and writing to
 the network)

 -Julie

 On Wed, Aug 29, 2012 at 10:24 AM, Chuck Remes li...@chuckremes.com wrote:


 On Aug 29, 2012, at 10:13 AM, Julie Anderson wrote:

 Just tested ZeroMQ and Java NIO in the same machine.

 The results:
 *
 - ZeroMQ:*

 message size: 13 [B]
 roundtrip count: 10
 average latency: *19.620* [us] *== ONE-WAY LATENCY*

 *- Java NIO Selector:* (EPoll)

 Average RTT (round-trip time) latency of a 13-byte message: 15.342 [us]
 Min Time: 11.664 [us]
 99.999% percentile: *15.340* [us] *== RTT LATENCY*

 *Conclusion:* That's *39.240 versus 15.340*, so ZeroMQ overhead on top of
 TCP is *156%*, or *23.900 microseconds* !!! That's excessive. I would
 expect 1 or 2 microseconds there.

 So my questions are:

 1) What does ZeroMQ do under the hood that justifies so many extra clock
 cycles? (I am really curious to know)

 2) Do people agree that 23 microseconds are just too much?


 As a favor to me, please rerun the tests so that at least 1 million (10
 million is better) messages are sent. This shouldn't take more than a few
 minutes to run. Thanks.

 Secondly, are you using the local_lat and remote_lat programs that are
 included with zeromq or did you write your own? If you wrote your own,
 please share the code.

 Thirdly, a pastie containing the code for both tests so others could
 independently reproduce your results would be very handy.

 cr


 ___
 zeromq-dev mailing list
 zeromq-dev@lists.zeromq.org
 http://lists.zeromq.org/mailman/listinfo/zeromq-dev



___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Robert G. Jakabosky
On Wednesday 29, Julie Anderson wrote:
 So questions remain:
 
 1) What does ZeroMQ do under the hood that justifies so many extra clock
 cycles? (I am really curious to know)

ZeroMQ is using background IO threads to do the sending/receiving.  So the 
extra latency is due to passing the messages between the application thread and 
the IO thread.


 2) Do people agree that 11 microseconds are just too much?

No.  A simple IO event loop using epoll is fine for an IO (network) bound 
application, but if you need to do complex work (cpu bound) mixed with non-
blocking IO, then ZeroMQ can make it easy to scale.

Also try comparing the latency of Java NIO using TCP/UDP against ZeroMQ using 
the inproc transport using two threads in the same JVM instance.
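
For example, a rough sketch with the jzmq/JeroMQ binding (the endpoint name,
message size and loop count are arbitrary, and a real test would record
percentiles rather than just the average):

import org.zeromq.ZMQ;

public class InprocPingPong {
    public static void main(String[] args) throws Exception {
        final ZMQ.Context ctx = ZMQ.context(1);
        final int count = 1_000_000;

        // REQ side binds first, so the inproc endpoint exists before the echo thread connects
        ZMQ.Socket req = ctx.socket(ZMQ.REQ);
        req.bind("inproc://latency-test");

        Thread echo = new Thread(() -> {
            ZMQ.Socket rep = ctx.socket(ZMQ.REP);
            rep.connect("inproc://latency-test");
            for (int i = 0; i < count; i++) {
                rep.send(rep.recv(0), 0);        // echo each message straight back
            }
            rep.close();
        });
        echo.start();

        byte[] payload = new byte[13];           // same 13-byte message size as the TCP test
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            req.send(payload, 0);
            req.recv(0);
        }
        long avgRtt = (System.nanoTime() - start) / count;
        System.out.println("average round trip over inproc: " + avgRtt + " ns");

        echo.join();
        req.close();
        ctx.term();
    }
}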

With ZeroMQ it is easy to do thread to thread, process to process, and/or 
server to server communication all at the same time using the same interface.

Basically ZeroMQ has a different use-case than a simple IO event loop.

-- 
Robert G. Jakabosky
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Julie Anderson
See my comments below:

On Wed, Aug 29, 2012 at 4:06 PM, Robert G. Jakabosky
bo...@sharedrealm.com wrote:

 On Wednesday 29, Julie Anderson wrote:
  So questions remain:
 
  1) What does ZeroMQ do under the hood that justifies so many extra clock
  cycles? (I am really curious to know)

 ZeroMQ is using background IO threads to do the sending/receiving.  So the
 extra latency is due to passing the messages between the application thread
 and
 the IO thread.


This kind of thread architecture sucks for latency sensitive applications.
That's why non-blocking I/O exists. That's my humble opinion and the
numbers support it.



  2) Do people agree that 11 microseconds are just too much?

 No.  A simple IO event loop using epoll is fine for an IO (network) bound
 application, but if you need to do complex work (cpu bound) mixed with non-
 blocking IO, then ZeroMQ can make it easy to scale.


Totally agree, but that has nothing to do with a financial application.
Financial applications do not need to do complex CPU-bound analysis like an
image processing application would need. Financial applications only care
about LATENCY and network I/O.



 Also try comparing the latency of Java NIO using TCP/UDP against ZeroMQ
 using
 the inproc transport using two threads in the same JVM instance.


What is the problem with inproc? Just use a method call in the same JVM or
shared memory for different JVMs. If you want inter-thread communication
there are blazing-fast solutions in Java for that too. For example, I would
be surprised if ZeroMQ can come close to Disruptor for inter-thread
communication.



 With ZeroMQ it is easy to do thread to thread, process to process, and/or
 server to server communication all at the same time using the same
 interface.


This generic API is cool, but it is solving a problem financial systems do
not have and creating a bigger problem by adding latency.



 Basically ZeroMQ has a different use-case than a simple IO event loop.


I thought ZeroMQ flagship customers were financial institutions. Then maybe
I was wrong.



 --
 Robert G. Jakabosky
 ___
 zeromq-dev mailing list
 zeromq-dev@lists.zeromq.org
 http://lists.zeromq.org/mailman/listinfo/zeromq-dev

___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Chuck Remes

On Aug 29, 2012, at 4:46 PM, Julie Anderson wrote:

 See my comments below:

And mine too.

 On Wed, Aug 29, 2012 at 4:06 PM, Robert G. Jakabosky bo...@sharedrealm.com 
 wrote:
 On Wednesday 29, Julie Anderson wrote:
  So questions remain:
 
  1) What does ZeroMQ do under the hood that justifies so many extra clock
  cycles? (I am really curious to know)
 
 ZeroMQ is using background IO threads to do the sending/receiving.  So the
 extra latency is due to passing the messages between the application thread and
 the IO thread.
 
 
 This kind of thread architecture sucks for latency sensitive applications. 
 That's why non-blocking I/O exists. That's my humble opinion and the numbers 
 support it.

What numbers? Only you have produced them so far. 

We have been quite patient with you. It appears you have some experience, so 
I'm confused as to why you refuse to provide any code for the rest of us to run 
to duplicate your results. If the roles were reversed I am certain you would 
want to run it yourself.

If you want our help, don't tell us to google some code to run. If it's 
really that easy then provide a link and make sure that your numbers are coming 
from the exact same code. Until someone else can independently verify your 
numbers then everything you have written is just smoke.


 Also try comparing the latency of Java NIO using TCP/UDP against ZeroMQ using
 the inproc transport using two threads in the same JVM instance.
 
 What is the problem with inproc? Just use a method call in the same JVM or 
 shared memory for different JVMs. If you want inter-thread communication 
 there are blazing-fast solutions in Java for that too. For example, I would 
 be surprised if ZeroMQ can come close to Disruptor for inter-thread 
 communication.

I would be surprised too. Zeromq doesn't solve the same problem as Disruptor.

 Basically ZeroMQ has a different use-case than a simple IO event loop.
 
 I thought ZeroMQ flagship customers were financial institutions. Then maybe I 
 was wrong.

Don't be insulting. It doesn't help you or inspire anyone here to look into 
your claims.

cr

___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Julie Anderson
See my comments below:

On Wed, Aug 29, 2012 at 5:28 PM, Chuck Remes li...@chuckremes.com wrote:


 On Aug 29, 2012, at 4:46 PM, Julie Anderson wrote:

 See my comments below:


 And mine too.

 On Wed, Aug 29, 2012 at 4:06 PM, Robert G. Jakabosky 
 bo...@sharedrealm.com wrote:

 On Wednesday 29, Julie Anderson wrote:
  So questions remain:
 
  1) What does ZeroMQ do under the hood that justifies so many extra clock
  cycles? (I am really curious to know)

 ZeroMQ is using background IO threads to do the sending/receiving.  So the
 extra latency is due to passing the messages between the application
 thread and
 the IO thread.


 This kind of thread architecture sucks for latency sensitive applications.
 That's why non-blocking I/O exists. That's my humble opinion and the
 numbers support it.


 What numbers? Only you have produced them so far.

 We have been quite patient with you. It appears you have some experience,
 so I'm confused as to why you refuse to provide any code for the rest of us
 to run to duplicate your results. If the roles were reversed I am certain
 you would want to run it yourself.

 If you want our help, don't tell us to google some code to run. If it's
 really that easy then provide a link and make sure that your numbers are
 coming from the exact same code. Until someone else can independently
 verify your numbers then everything you have written is just smoke.



I understand your frustration. I don't put the code here not because I don't
want to, but because I am legally unable to. If you have a boss or employer
you can understand that. :) I will try to come up with a simple version to
do the same thing. It should not be hard to do a ping-pong in Java.



 Also try comparing the latency of Java NIO using TCP/UDP against ZeroMQ
 using
 the inproc transport using two threads in the same JVM instance.


 What is the problem with inproc? Just use a method call in the same JVM or
 shared memory for different JVMs. If you want inter-thread communication
 there are blazing-fast solutions in Java for that too. For example, I would
 be surprised if ZeroMQ can come close to Disruptor for inter-thread
 communication.


I would be surprised too. Zeromq doesn't solve the same problem as
 Disruptor.


Disruptor solves inter-thread communication without synchronization latency
(light blocking using memory barriers). So if you have two threads and need
them to talk to each other as fast as possible you would use Disruptor.
That's what I thought the colleague was addressing here: *Also try
comparing the latency of Java NIO using TCP/UDP against ZeroMQ using the
inproc transport using two threads in the same JVM instance.*
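
For illustration, a minimal hand-off between two threads with the LMAX
Disruptor 3.x API looks roughly like this (the event class, ring size and
single publish are just a sketch, not benchmark code):

import com.lmax.disruptor.RingBuffer;
import com.lmax.disruptor.dsl.Disruptor;
import com.lmax.disruptor.util.DaemonThreadFactory;

public class DisruptorHandOff {
    // one mutable slot per ring-buffer entry, preallocated, so nothing is allocated per message
    static class TimestampEvent { long nanos; }

    public static void main(String[] args) throws Exception {
        Disruptor<TimestampEvent> disruptor = new Disruptor<>(
                TimestampEvent::new, 1024, DaemonThreadFactory.INSTANCE);

        // consumer thread: prints how long the thread-to-thread hand-off took
        disruptor.handleEventsWith((event, sequence, endOfBatch) ->
                System.out.println("hand-off: " + (System.nanoTime() - event.nanos) + " ns"));

        RingBuffer<TimestampEvent> ring = disruptor.start();

        // producer (here the main thread) claims a slot, fills it, publishes it
        ring.publishEvent((event, sequence, ts) -> event.nanos = ts, System.nanoTime());

        Thread.sleep(100);          // give the daemon consumer thread time to drain
        disruptor.shutdown();
    }
}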



  Basically ZeroMQ has a different use-case than a simple IO event loop.


 I thought ZeroMQ flagship customers were financial institutions. Then
 maybe I was wrong.


 Don't be insulting. It doesn't help you or inspire anyone here to look
 into your claims.



Insulting? I really think I was not insulting anyone or anything, but if
you got that impression please accept my sincere apologies.

Nothing is perfect. I am just trying to understand ZeroMQ approach and its
overhead on top of the raw network latency. Maybe a single-threaded ZeroMQ
implementation for the future using non-blocking I/O?



 cr


 ___
 zeromq-dev mailing list
 zeromq-dev@lists.zeromq.org
 http://lists.zeromq.org/mailman/listinfo/zeromq-dev


___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Robert G. Jakabosky
On Wednesday 29, Julie Anderson wrote:
 See my comments below:
 
 On Wed, Aug 29, 2012 at 4:06 PM, Robert G. Jakabosky
 
 bo...@sharedrealm.comwrote:
  On Wednesday 29, Julie Anderson wrote:
   So questions remain:
   
   1) What does ZeroMQ do under the hood that justifies so many extra
   clock cycles? (I am really curious to know)
  
  ZeroMQ is using background IO threads to do the sending/receiving.  So
  the extra latency is due to passing the messages between the application
  thread and
  the IO thread.
 
 This kind of thread architecture sucks for latency sensitive applications.
 That's why non-blocking I/O exists. That's my humble opinion and the
 numbers support it.

If low-latency is the most important thing for your application, then use a 
custom protocol and highly tuned network code.

ZeroMQ is not a low-level networking library; it provides some high-level 
features that are not available with raw sockets.

If you are planning on doing high-frequency trading, then you will need to 
write your own networking code (or FPGA logic) to squeeze out every last 
micro/nanosecond.  ZeroMQ is not going to be the right solution to every use-
case.


   2) Do people agree that 11 microseconds are just too much?
  
  No.  A simple IO event loop using epoll is fine for an IO (network) bound
  application, but if you need to do complex work (cpu bound) mixed with
  non- blocking IO, then ZeroMQ can make it easy to scale.
 
 Totally agree, but that has nothing to do with a financial application.
 Financial applications do not need to do complex CPU-bound analysis like an
 image processing application would need. Financial applications only care
 about LATENCY and network I/O.

Not all financial applications care only about latency.  For some systems it is 
important to scale out to a very large number of subscribers and a large volume of 
messages.

When comparing ZeroMQ to raw network IO for one connection, ZeroMQ will have 
more latency overhead.  Try your test with many thousands of connections with 
subscriptions to lots of different topics, then ZeroMQ will start to come out 
ahead.


  Also try comparing the latency of Java NIO using TCP/UDP against ZeroMQ
  using
  the inproc transport using two threads in the same JVM instance.
 
 What is the problem with inproc? Just use a method call in the same JVM or
 shared memory for different JVMs. If you want inter-thread communication
 there are blazing-fast solutions in Java for that too. For example, I would
 be surprised if ZeroMQ can come close to Disruptor for inter-thread
 communication.

ZeroMQ's inproc transport can be used in an event loop alongside the TCP and 
IPC transports.  With ZeroMQ you can mix-and-match transports as needed.  If 
you can do all that with custom code with lower latency, then do it.  ZeroMQ 
is for people who don't have the experience to do that kind of thread-safe 
programming, or who just want to scale out their application.
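
For example (a sketch assuming the jzmq/JeroMQ binding and its ZMQ.Poller;
socket types and endpoints are arbitrary), one loop can drain an inproc
socket fed by a worker thread and a tcp socket fed by the network:

import org.zeromq.ZMQ;

public class MixedTransportLoop {
    public static void main(String[] args) {
        ZMQ.Context ctx = ZMQ.context(1);

        ZMQ.Socket local = ctx.socket(ZMQ.PULL);
        local.bind("inproc://worker-results");     // messages from threads in this process

        ZMQ.Socket remote = ctx.socket(ZMQ.PULL);
        remote.bind("tcp://*:5555");                // messages from other machines

        ZMQ.Poller poller = ctx.poller(2);
        int localIdx = poller.register(local, ZMQ.Poller.POLLIN);
        int remoteIdx = poller.register(remote, ZMQ.Poller.POLLIN);

        while (!Thread.currentThread().isInterrupted()) {
            poller.poll();                          // block until either socket has a message
            if (poller.pollin(localIdx)) {
                byte[] msg = local.recv(0);
                // ... handle the in-process message ...
            }
            if (poller.pollin(remoteIdx)) {
                byte[] msg = remote.recv(0);
                // ... handle the network message ...
            }
        }
        local.close();
        remote.close();
        ctx.term();
    }
}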


  With ZeroMQ it is easy to do thread to thread, process to process, and/or
  server to server communication all at the same time using the same
  interface.
 
 This generic API is cool, but it is solving a problem financial systems do
 not have and creating a bigger problem by adding latency.

ZeroMQ is not adding latency for no reason.  If you think that the latency can 
be eliminated, then go ahead and change the core code to not use IO threads.

  Basically ZeroMQ has a different use-case than a simple IO event loop.
 
 I thought ZeroMQ flagship customers were financial institutions. Then maybe
 I was wrong.

ZeroMQ is competing with other Message-oriented middleware, like RabbitMQ, 
SwiftMQ, JMS, or other Message queuing systems.  These systems are popular 
with financial institutions.


-- 
Robert G. Jakabosky
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Robert G. Jakabosky
On Wednesday 29, Julie Anderson wrote:
 
 Nothing is perfect. I am just trying to understand ZeroMQ approach and its
 overhead on top of the raw network latency. Maybe a single-threaded ZeroMQ
 implementation for the future using non-blocking I/O?

You might be interested in xsnano [1], which is an experimental project to try 
different threading models (it should be possible to support a single-thread 
model).  I am not sure how far along it is.


1. https://github.com/sustrik/xsnano



-- 
Robert G. Jakabosky
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Steven McCoy
On 29 August 2012 17:46, Julie Anderson julie.anderson...@gmail.com wrote:

 ZeroMQ is using background IO threads to do the sending/receiving.  So the
 extra latency is due to passing the messages between the application
 thread and
 the IO thread.


 This kind of thread architecture sucks for latency sensitive applications.
 That's why non-blocking I/O exists.


Non-blocking IO is a legacy from early structured programming; it just so
happens it fits well into a proactor model for IOCP performance.  Many
people prefer to develop code using the reactor model, and 0mq fits well
there.


  Totally agree, but that has nothing to do with a financial application.
 Financial applications do not need to do complex CPU-bound analysis like an
 image processing application would need. Financial applications only care
 about LATENCY and network I/O.




That's not a financial application, that's an automated trading
application.  99.9% of financial applications do not remotely care about
latency; your sweeping generalisation just wiped out domains such as hedge
fund analytics and news.


 What is the problem with inproc? Just use a method call in the same JVM or
 shared memory for different JVMs. If you want inter-thread communication
 there are blazing-fast solutions in Java for that too. For example, I would
 be surprised if ZeroMQ can come close to Disruptor for inter-thread
 communication.




This is relying on the proactor model and cores to waste.  We had a discussion
of Disruptor on the mailing list in the past; it's quite nice but doesn't
remotely scale up for complicated applications like 0mq can.


 This generic API is cool, but it is solving a problem financial systems do
 not have and creating a bigger problem by adding latency.


Financial systems have a huge problem with integration; look at the
billion-dollar messaging industry of TIBCO, IBM and Informatica.


 I thought ZeroMQ flagship customers were financial institutions. Then
 maybe I was wrong.


I think it's HPC, although I'm certainly using it at financial institutions
for applications where latency and network IO are surprisingly irrelevant
but memory IO, flexibility, simplicity and a low learning curve are.

-- 
Steve-o
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Michel Pelletier
On Wed, Aug 29, 2012 at 3:45 PM, Julie Anderson
julie.anderson...@gmail.com wrote:

 I understand your frustration.

It's abundantly clear to me that whatever expertise you have on the
absolute fastest and most trivial way to send data between two
programs, you do not understand Chuck's frustration.

 I don't put the code here not because I don't
 want to, but because I am legally unable to. If you have a boss or employer
 you can understand that.

We ask the same understanding of you.  We are all busy and have
limited time, part of which we invest in working on a free library and
providing support for free to people like you.  If you don't want us
to help you, then leave us alone.  If you want to help us, then follow
the rules.  Please demonstrate your test methodology that will allow
us to reproduce your performance claims.

-Michel
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Stuart Brandt
Not sure I want to step into the middle of this, but here we go. I'd be 
really hesitant to base any evaluation of ZMQ's suitability for a highly 
scalable low latency application on local_lat/remote_lat. They appear to 
be single-threaded synchronous tests, which seems very unlike the kinds 
of applications being discussed (esp. if you're using NIO). More 
realistic is a network connection getting slammed with lots of 
concurrent sends and recvs, which is where lots of mistakes can be 
made if you roll your own.

As a purely academic discussion, though, I've uploaded raw C socket 
versions of a client and server that can be used to mimic local_lat and 
remote_lat -- at least for TCP sockets. On my MacBook, I get ~18 
microseconds per 40 byte packet across a test of 100 packets on 
local loopback. This is indeed about half of what I get with 
local_lat/remote_lat on tcp://127.0.0.1.
   http://pastebin.com/4SSKbAgx   (echoloopcli.c)
   http://pastebin.com/rkc6itTg  (echoloopsrv.c)

There's probably some amount of slop/unfairness in there since I cut a 
lot of corners, so if folks want to pursue the comparison further, I'm 
more than willing to bring it closer to apples-to-apples.
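
For the Java-only folks, the same synchronous pattern with plain blocking
sockets would look roughly like the untested sketch below (host/port, packet
size and iteration count are placeholders):

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.net.Socket;

public class EchoLoopClient {
    public static void main(String[] args) throws Exception {
        try (Socket s = new Socket("127.0.0.1", 9999)) {
            s.setTcpNoDelay(true);                        // don't let Nagle batch the pings
            DataOutputStream out = new DataOutputStream(s.getOutputStream());
            DataInputStream in = new DataInputStream(s.getInputStream());
            byte[] packet = new byte[40];                 // 40-byte payload, like echoloopcli.c
            int iterations = 1_000_000;
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                out.write(packet);                        // synchronous send
                in.readFully(packet);                     // block until the full echo is back
            }
            System.out.println("avg round trip: "
                    + (System.nanoTime() - start) / iterations + " ns");
        }
    }
}

The server side is just the mirror image: readFully the 40 bytes, write them
back, repeat.
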
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Julie Anderson
See my comments below:



 They appear to
 be single threaded synchronous tests which seems very unlike the kinds
 of applications being discussed (esp. if you're using NIO). More
 realistic is a network connection getting slammed with lots of
 concurrent sends and recvs, which is where lots of mistakes can be
 made if you roll your own.


That's the point of NIO, which attracted me from the very beginning. Of
course it is not the solution for everything but for fast clients you can
SERIALIZE them inside a SINGLE thread so all reads and writes from all of
them are non-blocking and thread-safe. This is not just faster (debatable,
but specifically for fast and not too many clients) but most importantly
easier to code, optimize and keep it clean/bug free. Anyone who has done
serious multithreading programming in C or even Java knows it is not easy
to get it right and context-switches + blocking is the root of all latency
and bugs.



 As a purely academic discussion, though, I've uploaded raw C socket
 versions of a client and server that can be used to mimic local_lat and
 remote_lat -- at least for TCP sockets. On my MacBook, I get ~18
 microseconds per 40 byte packet across a test of 100 packets on
 local loopback. This is indeed about half of what I get with
 local_lat/remote_lat on tcp://127.0.0.1.
http://pastebin.com/4SSKbAgx   (echoloopcli.c)
http://pastebin.com/rkc6itTg  (echoloopsrv.c)

 There's probably some amount of slop/unfairness in there since I cut a
 lot of corners, so if folks want to pursue the comparison further, I'm
 more than willing to bring it closer to apples-to-apples.



Very awesome!!! Are 18 micros the round-trip time or one-way time? Are you
waiting to send the next packet ONLY after you get the ack from the
previous one sent? Sorry but C looks like japanese to me. :)))

-Julie



 ___
 zeromq-dev mailing list
 zeromq-dev@lists.zeromq.org
 http://lists.zeromq.org/mailman/listinfo/zeromq-dev

___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Robert G. Jakabosky
On Wednesday 29, Stuart Brandt wrote:
 Not sure I want to step into the middle of this, but here we go. I'd be
 really hesitant to base any evaluation of ZMQ's suitability for a highly
 scalable low latency application on local_lat/remote_lat. They appear to
 be single threaded synchronous tests which seems very unlike the kinds
 of applications being discussed (esp. if you're using NIO). More
 realistic is a network connection getting slammed with lots of
  concurrent sends and recvs, which is where lots of mistakes can be
 made if you roll your own.

local_lat/remote_lat both have two threads (one for the application and one 
for IO).  So each request message goes from:
1. local_lat to IO thread
2. IO thread send to tcp socket
- network stack.
3. recv from tcp socket in remote_lat's IO thread
4. from IO thread to remote_lat
5. remote_lat back to IO thread
6. IO thread send to tcp socket
- network stack.
7. recv from tcp socket in local_lat's IO thread
8. IO thread to local_lat.

So each message has to pass between threads 4 times (1,4,5,8) and go across 
the tcp socket 2 times (2-3, 6-7).

I think it would be interesting to see how latency is affected when there are 
many clients sending requests to a server (with one or more worker threads).  
With ZeroMQ it is very easy to create a server with one or many worker threads 
and handle many thousands of clients.  Doing the same without ZeroMQ is 
possible, but requires writing a lot more code.  But then writing it yourself 
will allow you to optimize it to your needs (latency vs throughput).

 As a purely academic discussion, though, I've uploaded raw C socket
 versions of a client and server that can be used to mimic local_lat and
 remote_lat -- at least for TCP sockets. On my MacBook, I get ~18
 microseconds per 40 byte packet across a test of 100 packets on
 local loopback. This is indeed about half of what I get with
 local_lat/remote_lat on tcp://127.0.0.1.
http://pastebin.com/4SSKbAgx   (echoloopcli.c)
http://pastebin.com/rkc6itTg  (echoloopsrv.c)
 
 There's probably some amount of slop/unfairness in there since I cut a
 lot of corners, so if folks want to pursue the comparison further, I'm
 more than willing to bring it closer to apples-to-apples.

echoloop*.c is testing throughput not latency, since it sends all messages at 
once instead of sending one message and waiting for it to return before 
sending the next message.  Try comparing it with local_thr/remote_thr.

-- 
Robert G. Jakabosky
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Bennie Kloosteman



 2) Do people agree that 11 microseconds are just too much?



Nope, once you go cross-machine those 11 microseconds become irrelevant.
The fastest exchange I'm aware of for frequent trading is 80 microseconds
(+ transport costs) best case, so who are you talking to? And if you're not
doing frequent trading then milliseconds are fine. The rest of your system
and algorithms are far more crucial, so IMHO you're wasting time in the wrong
place. For example, you can use ZeroMQ to build an async pub-sub solution
that can do market scanning in parallel from different machines a lot
faster than if you did all the tcp/ip yourself.

ZeroMQ uses a different system for messages of less than 30 bytes, e.g. they
are copied. I'm also unaware of any messages so small in the financial
industry. Cross-machine will add the TCP/IP header, which some transports
optimize out on the same machine, so unless you're looking only at the IPC
case I would re-run your tests with 100M 64- and 256-byte messages
cross-machine. As far as interprocess communication goes there are better
ways (e.g. writing directly to the destination's semi-polled lockless buffer
using 256/512-bit SIMD non-temporal writes would blow away anything Java can
do), but they are all dedicated solutions and don't play nicely with other
messages coming from the IP stack, and that is the challenge for
communication frameworks. If you keep reinventing the wheel with custom
solutions, sure, you can get better results, but at what cost, and will you
finish? And obviously tuning your higher-level algorithms gets better
results than the low-level stuff. Once your whole system with business
logic is sub-millisecond and that is not enough, then I would revisit the
lower-level transport.

Lastly, building a low-latency message system on Java is dangerous. Java
creates messages very quickly, but if they are not disposed of quickly, e.g.
under peak load or when some receivers are slower, then you get a big
permanent memory pool and you are in trouble - you won't see this in micro
benchmarks. I had one complete system that worked great and fast and then had
huge GC pauses, and we're talking almost seconds here, pretty much defeating
any gains. So unless you manage the memory yourself (e.g. a byte array you
serialise into so the GC is not aware of the individual messages) you are
better off using a system to store the messages outside of Java's knowledge,
and C++ / ZeroMQ is a good fit for that.
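
For example (rough illustration only; slot size and count are arbitrary), one
big direct ByteBuffer allocated up front keeps the payload bytes off the Java
heap, so the GC never has per-message objects to trace:

import java.nio.ByteBuffer;

public class OffHeapMessagePool {
    private static final int SLOT_SIZE = 256;
    private static final int SLOTS = 1 << 16;                       // 16 MB pool, allocated once
    private final ByteBuffer pool = ByteBuffer.allocateDirect(SLOT_SIZE * SLOTS);

    // serialise a message into slot i - only primitives cross into the buffer
    void put(int slot, long timestampNanos, byte[] payload) {
        int base = slot * SLOT_SIZE;
        pool.putLong(base, timestampNanos);
        pool.putInt(base + 8, payload.length);
        for (int i = 0; i < payload.length; i++)
            pool.put(base + 12 + i, payload[i]);
    }

    // read the timestamp back out of slot i
    long timestamp(int slot) {
        return pool.getLong(slot * SLOT_SIZE);
    }
}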



Ben
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Bennie Kloosteman
On Thu, Aug 30, 2012 at 10:35 AM, Julie Anderson 
julie.anderson...@gmail.com wrote:

 See my comments below:



 They appear to
 be single threaded synchronous tests which seems very unlike the kinds
 of applications being discussed (esp. if you're using NIO). More
 realistic is a network connection getting slammed with lots of
  concurrent sends and recvs, which is where lots of mistakes can be
 made if you roll your own.


 That's the point of NIO, which attracted me from the very beginning. Of
 course it is not the solution for everything but for fast clients you can
 SERIALIZE them inside a SINGLE thread so all reads and writes from all of
 them are non-blocking and thread-safe. This is not just faster (debatable,
 but specifically for fast and not too many clients) but most importantly
 easier to code, optimize and keep it clean/bug free. Anyone who has done
 serious multithreading programming in C or even Java knows it is not easy
 to get it right and context-switches + blocking is the root of all latency
 and bugs.


If you can't handle state that is true.. but if each thread has its own
state or caches state then you don't have to deal with it.  If you want
scalable high performance you want async, many threads. Putting them
all on one thread will eventually give you grief - either capacity, or you
will hit some slow work and then be forced to offload stuff onto threads, and
that's where the bugs come from, as it was not your design.  Good async designs
which minimize state have few bugs; web servers are a great example.

Re: "context-switches + blocking is the root of all latency and bugs."

Context switches are less of an issue these days due to processor
improvements, especially when you have more cores than active threads.
Use your cores; in 4 years you will have 100-thread standard servers and
you're using one.  Blocking is an issue, but putting it all on one thread is
IMHO more risky for a complex / high-performance app.
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Stuart Brandt
Inline

On 8/29/2012 10:37 PM, Robert G. Jakabosky wrote:
 echoloop*.c is testing throughput not latency, since it sends all 
 messages at once instead of sending one message and waiting for it to 
 return before sending the next message. Try comparing it with 
 local_thr/remote_thr. 

Echoloopcli does a synchronous send, then a synchronous recv , then does 
it all again.  Echoloopsrv does a synchronous recv, then a synchronous 
send, then does it all again.  I stuck a while loop around the send call 
because it isn't guaranteed to complete with all bytes of my 40 byte 
packet having been sent. But since my send queue never maxes out, the 
'while' around send is overkill -- I get exactly 100 sends 
interleaved with 100 recvs.

On 8/29/2012 10:35 PM, Julie Anderson wrote:

 Very awesome!!! Are 18 micros the round-trip time or one-way time? Are 
 you waiting to send the next packet ONLY after you get the ack from 
 the previous one sent? Sorry but C looks like japanese to me. :)))


Round-trip.
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

2012-08-29 Thread Robert G. Jakabosky
On Wednesday 29, Stuart Brandt wrote:
 Inline
 
 On 8/29/2012 10:37 PM, Robert G. Jakabosky wrote:
  echoloop*.c is testing throughput not latency, since it sends all
  messages at once instead of sending one message and waiting for it to
  return before sending the next message. Try comparing it with
  local_thr/remote_thr.
 
 Echoloopcli does a synchronous send, then a synchronous recv , then does
 it all again.  Echoloopsrv does a synchronous recv, then a synchronous
 send, then does it all again.  I stuck a while loop around the send call
 because it isn't guaranteed to complete with all bytes of my 40 byte
 packet having been sent. But since my send queue never maxes out, the
 'while' around send is overkill -- I get exactly 100 sends
 interleaved with 100 recvs.

Ah, sorry, I overlooked the outer loop.  So it is doing request/response, 
instead of bulk send/recv like I had thought.

-- 
Robert G. Jakabosky
___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
http://lists.zeromq.org/mailman/listinfo/zeromq-dev