[zeromq-dev] zeromq EPGM PUB threads eating 100% of each core

2017-11-15 Thread 21848863
Hi,
 We are using ZeroMQ (3.2.0) on Red Hat Linux 6.4-6.7 and Windows 2008-2012
x64 with the EPGM PUB/SUB model. We have one master node (Linux), two slave
nodes (Linux + Windows) and one monitor node (Windows).
 Every node uses the same three PUB ports, and each node subscribes to the same
multicast IPs and ports. Two PUB ports carry business traffic and one carries
the heartbeat. If the monitor finds something wrong with a node (no heartbeat
received for 10 seconds), it instructs that node to close its heartbeat PUB
port and reopen it.
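
The monitor's detection rule described above (no heartbeat from a node for 10 seconds) can be sketched as follows. This is only an illustration of the rule, not the poster's code: the class name HeartbeatMonitor and its methods are hypothetical, and the transport is left out entirely, so the caller is assumed to feed in heartbeat arrivals from its SUB socket.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: remembers when each node was last heard from and
// reports the nodes that have been silent for longer than the timeout.
class HeartbeatMonitor {
public:
    using Clock = std::chrono::steady_clock;

    explicit HeartbeatMonitor(std::chrono::seconds timeout) : timeout_(timeout) {}

    // Record a heartbeat received from `node` at time `now`.
    void on_heartbeat(const std::string& node, Clock::time_point now) {
        last_seen_[node] = now;
    }

    // Return every node whose last heartbeat is older than the timeout.
    std::vector<std::string> stale_nodes(Clock::time_point now) const {
        std::vector<std::string> stale;
        for (const auto& entry : last_seen_) {
            if (now - entry.second > timeout_) {
                stale.push_back(entry.first);
            }
        }
        return stale;
    }

private:
    std::chrono::seconds timeout_;
    std::map<std::string, Clock::time_point> last_seen_;
};
```

A monitor loop would call on_heartbeat() for every heartbeat message and periodically pass the current time to stale_nodes(); each node returned is a candidate for the close-and-reopen instruction.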


 From time to time, all of our PUB threads start consuming 100% of a core
each. We don't know yet what exactly triggers this, and therefore we can't
reproduce it.


 Here are some callstacks of the three PUB threads.


Thread 1 (process 42443):
#0  0x7f70b2986b55 in pgm_rate_remaining2 () from libzmq.so.3
#1  0x7f70b296d43a in pgm_getsockopt () from libzmq.so.3
#2  0x7f70b2934b68 in zmq::pgm_socket_t::get_tx_timeout() () from libzmq.so.3
#3  0x7f70b29333ee in zmq::pgm_sender_t::out_event() () from libzmq.so.3
#4  0x7f70b2938cac in zmq::poller_base_t::execute_timers() () from libzmq.so.3
#5  0x7f70b2926abb in zmq::epoll_t::loop() () from libzmq.so.3
#6  0x7f70b294747c in thread_routine () from libzmq.so.3
#7  0x003bbb407851 in start_thread () from /lib64/libpthread.so.0
#8  0x003bbb0e890d in clone () from /lib64/libc.so.6


Thread 1 (process 42443):
#0  0x003bbb07a8f3 in malloc () from /lib64/libc.so.6
#1  0x003bc94bd09d in operator new(unsigned long) () from /usr/lib64/libstdc++.so.6
#2  0x7f70b29390a4 in std::_Rb_tree<...>::_M_insert(std::_Rb_tree_node_base*, std::_Rb_tree_node_base*, std::pair<...> const&) () from libzmq.so.3
#3  0x7f70b2939165 in std::_Rb_tree<...>::insert_equal(std::pair<...> const&) () from libzmq.so.3
#4  0x7f70b2938baf in zmq::poller_base_t::add_timer(int, zmq::i_poll_events*, int) () from libzmq.so.3
#5  0x7f70b29333fd in zmq::pgm_sender_t::out_event() () from libzmq.so.3
#6  0x7f70b2938cac in zmq::poller_base_t::execute_timers() () from libzmq.so.3
#7  0x7f70b2926abb in zmq::epoll_t::loop() () from libzmq.so.3
#8  0x7f70b294747c in thread_routine () from libzmq.so.3
#9  0x003bbb407851 in start_thread () from /lib64/libpthread.so.0
#10 0x003bbb0e890d in clone () from /lib64/libc.so.6


Thread 2 (process 42448):
#0  0x7f70b2920a82 in zmq::clock_t::rdtsc() () from libzmq.so.3
#1  0x7f70b2920b69 in zmq::clock_t::now_ms() () from libzmq.so.3
#2  0x7f70b2938b90 in zmq::poller_base_t::add_timer(int, zmq::i_poll_events*, int) () from libzmq.so.3
#3  0x7f70b29333fd in zmq::pgm_sender_t::out_event() () from libzmq.so.3
#4  0x7f70b2938cac in zmq::poller_base_t::execute_timers() () from libzmq.so.3
#5  0x7f70b2926abb in zmq::epoll_t::loop() () from libzmq.so.3
#6  0x7f70b294747c in thread_routine () from libzmq.so.3
#7  0x003bbb407851 in start_thread () from /lib64/libpthread.so.0
#8  0x003bbb0e890d in clone () from /lib64/libc.so.6


Thread 2 (process 42448):
#0  0x7f70b297820b in pgm_send () from libzmq.so.3
#1  0x7f70b293532f in zmq::pgm_socket_t::send(unsigned char*, unsigned long) () from libzmq.so.3
#2  0x7f70b293334e in zmq::pgm_sender_t::out_event() () from libzmq.so.3
#3  0x7f70b2938cac in zmq::poller_base_t::execute_timers() () from libzmq.so.3
#4  0x7f70b2926abb in zmq::epoll_t::loop() () from libzmq.so.3
#5  0x7f70b294747c in thread_routine () from libzmq.so.3
#6  0x003bbb407851 in start_thread () from /lib64/libpthread.so.0
#7  0x003bbb0e890d in clone () from /lib64/libc.so.6


Thread 2 (process 42448):
#0  0x7f70b296bf87 in pgm_rwlock_reader_unlock () from libzmq.so.3
#1  0x7f70b296de98 in pgm_getsockopt () from libzmq.so.3
#2  0x7f70b2934b68 in zmq::pgm_socket_t::get_tx_timeout() () from libzmq.so.3
#3  0x7f70b29333ee in zmq::pgm_sender_t::out_event() () from libzmq.so.3
#4  0x7f70b2938cac in zmq::poller_base_t::execute_timers() () from libzmq.so.3
#5  0x7f70b2926abb in zmq::epoll_t::loop() () from libzmq.so.3
#6  0x7f70b294747c in thread_routine () from libzmq.so.3
#7  0x003bbb407851 in start_thread () from /lib64/libpthread.so.0
#8  0x003bbb0e890d in clone () from /lib64/libc.so.6


Thread 2 (process 42448):
#0  0x003bbb40a659 in pthread_mutex_unlock () from /lib64/libpthread.so.0
#1  0x7f70b297560d in pgm_mutex_unlock () from libzmq.so.3
#2  0x7f70b29783e6 in pgm_send () from libzmq.so.3
#3  0x7f70b293532f in zmq::pgm_socket_t::send(unsigned char*, unsigned long) () from libzmq.so.3
#4  0x7f70b293334e in zmq::pgm_sender_t::out_event() () from libzmq.so.3
#5  0x7f70b2938cac in zmq::poller_base_t::execute_timers() () from libzmq.so.3
#6  0x7f70b2926abb in zmq::epoll_t::loop() 

[zeromq-dev] Assertion failed: input_stopped (..\..\..\..\src\stream_engine.cpp:444)

2017-11-15 Thread Jensen, Jesper
Hi

I have the subscriber below. On Windows (the only platform tested) with version
4.2.2, running it from a command line, I get an exception:

   KernelBase.dll!7649c54f()   Unknown   No symbols loaded.
   [Frames below may be incorrect and/or missing, no symbols loaded for KernelBase.dll]   Annotated Frame
   hwclient_wait.exe!zmq::zmq_abort() Line 83   C++   Symbols loaded.
>  hwclient_wait.exe!zmq::stream_engine_t::restart_input() Line 444   C++   Symbols loaded.
   hwclient_wait.exe!zmq::session_base_t::write_activated() Line 300   C++   Symbols loaded.
   hwclient_wait.exe!zmq::pipe_t::process_activate_write() Line 273   C++   Symbols loaded.
   hwclient_wait.exe!zmq::object_t::process_command() Line 82   C++   Symbols loaded.
   hwclient_wait.exe!zmq::io_thread_t::in_event() Line 86   C++   Symbols loaded.
   hwclient_wait.exe!zmq::select_t::loop() Line 317   C++   Symbols loaded.
   hwclient_wait.exe!zmq::select_t::worker_routine() Line 393   C++   Symbols loaded.
   hwclient_wait.exe!thread_routine() Line 46   C++   Symbols loaded.
   msvcr100d.dll!_callthreadstartex() Line 314   C   Symbols loaded.
   msvcr100d.dll!_threadstartex(void * ptd) Line 297   C   Symbols loaded.
   kernel32.dll!75f4336a()   Unknown   No symbols loaded.
   ntdll.dll!770098f2()   Unknown   No symbols loaded.
   ntdll.dll!770098c5()   Unknown   No symbols loaded.

The assertion fails here, in src\stream_engine.cpp:

void zmq::stream_engine_t::restart_input ()
{
    zmq_assert (input_stopped);

As far as I can see, the other place that calls restart_input first checks
that input_stopped is true, but zmq::session_base_t::write_activated does
not. My question is: should it? Or am I doing something wrong in the code
below?
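
To make the question concrete, here is a toy model of the two calling styles. It is not libzmq source: ToyEngine and restart_input_if_stopped are hypothetical names, and the model only mirrors the shape described above (a function that asserts a precondition, and a caller that either checks the flag first or does not). Whether a guard is the right fix in libzmq, or whether the failed assertion points at a real state bug, is for the maintainers to decide.

```cpp
#include <cassert>

// Toy model only, NOT libzmq code.
struct ToyEngine {
    bool input_stopped = false;
    int restarts = 0;

    // Precondition: input must currently be stopped, otherwise this aborts,
    // in the same way zmq_assert (input_stopped) does in the report above.
    void restart_input() {
        assert(input_stopped);
        input_stopped = false;
        ++restarts;
    }

    // Guarded caller: restarts only when input is actually stopped.
    // Returns true if a restart happened.
    bool restart_input_if_stopped() {
        if (!input_stopped) {
            return false;
        }
        restart_input();
        return true;
    }
};
```

Calling restart_input() directly while input_stopped is false trips the assertion; the guarded variant turns the same situation into a no-op.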


My test subscriber:

#define ZMQ_STATIC

// The header names were stripped by the list archive. The includes below are
// an assumption: they are the headers this program needs in order to compile.
#include <zmq.hpp>
#include <zmq_addon.hpp>
#include <windows.h>

#include <cstdio>
#include <cstring>
#include <iostream>
#include <string>

using namespace std;

// Report a zmq::error_t and carry on.
#define CATCH_ERROR_T                                \
    catch (zmq::error_t &e)                          \
    {                                                \
        printf("Exception caught: %s\n", e.what());  \
    }

int main()
{
    //  Prepare our context and socket
    try
    {
        zmq::context_t context(1);

        zmq::socket_t socket(context, ZMQ_SUB);
        try
        {
            //  Subscribe to every message
            socket.setsockopt(ZMQ_SUBSCRIBE, "", 0);
        }
        CATCH_ERROR_T;

        try
        {
            int hwm = 10;
            socket.setsockopt(ZMQ_RCVHWM, &hwm, sizeof(hwm));
        }
        CATCH_ERROR_T;

        try
        {
            int no = 250;
            socket.setsockopt(ZMQ_HEARTBEAT_IVL, &no, sizeof(no));
        }
        CATCH_ERROR_T;

        try
        {
            int no = 500;
            socket.setsockopt(ZMQ_HEARTBEAT_TIMEOUT, &no, sizeof(no));
        }
        CATCH_ERROR_T;

        std::cout << "Connecting to pub server!" << std::endl;
        try
        {
            socket.connect("tcp://localhost:");
        }
        CATCH_ERROR_T;

        Sleep(1000);

        int seq = 0;
        int last = -1;
        for (;;) {
            //  Get the next message: frame 0 is the type, frame 1 the sequence.
            zmq::multipart_t sub;
            try
            {
                sub.recv(socket);
            }
            CATCH_ERROR_T;

            if (sub.size() == 2)
            {
                string t(static_cast<char*>(sub.at(0).data()), sub.at(0).size());
                memcpy(&seq, sub.at(1).data(), sizeof(seq));
                ++last;
                if (last != seq)
                {
                    //  A gap in the sequence: report it and resynchronise.
                    std::cout << "Sequence error last " << last
                              << " Sequence: " << seq << std::endl;
                    last = seq;
                }
                if (!(seq % 10))
                {
                    std::cout << "Type: " << t << " Sequence: " << seq << std::endl;
                }
            }
            else
            {
                std::cout << "Invalid format!" << std::endl;
            }
        }
    }
    CATCH_ERROR_T;

    return 0;
}

Jesper K


___
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
https://lists.zeromq.org/mailman/listinfo/zeromq-dev


Re: [zeromq-dev] Design problem of distributed environment

2017-11-15 Thread Matej Puk
I am trying to implement a simple example using Pyre, but with no luck so
far. Do you have any experience with Pyre or Zyre?



Re: [zeromq-dev] Design problem of distributed environment

2017-11-15 Thread Matej Puk
Thanks for the answer. ZeroMQ is new to me and I had no idea something like
Zyre existed, so thanks for bringing it up. It looks like the best solution
for my problem so far.

To your question of how I define a neighbor: the simulator I am developing is
supposed to simulate a distributed environment on one or more machines. It is
actually a diploma thesis, so there is not much time to make it work across
several machines, but the assumption is that in some distant future it will.

For now I am focused on a version that correctly simulates the environment on
a single machine. Entities in this environment are threads. Every thread
represents a single entity and has its own behavior.
I should have said at the beginning that this project is a simulator for
distributed algorithms. Basically, the user writes an algorithm in a simplified
language that I developed, then the compiler compiles it and produces data
about the behavior of the entities described in the algorithm. Part of
compilation is also processing a file with the environment topology, where
every entity has its own id, IP, port and neighbors.

So every entity knows who its neighbors are, and therefore knows their IP
addresses and ports.

The problem is that I do not think I can approach this via patterns. The user
can enter whatever algorithm he decides, which means the behavior of the
entities is not predictable and must be flexible.

I will look into the Zyre documentation, but at first glance I think it offers
everything I need. I wanted a simple solution because someone suggested, for
example, that every entity should have a socket for each neighbor, but I do not
think that having 20 entities with 19 sockets each on a single machine is the
right approach.
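
The topology described above (every entity has an id, IP, port and a list of neighbors) can be sketched as a small data structure. This is a minimal illustration, not the simulator's code: Entity, Topology, endpoint, targets and sockets_per_neighbor are hypothetical names, and no ZeroMQ calls are made. It shows how the "send to all or a few selected neighbors" requirement reduces to a lookup, and where the 20 x 19 socket count comes from.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical topology record, one per entity in the topology file.
struct Entity {
    int id;
    std::string ip;
    int port;
    std::vector<int> neighbors;  // ids of neighboring entities
};

using Topology = std::map<int, Entity>;

// TCP endpoint string for an entity, e.g. "tcp://127.0.0.1:5001".
std::string endpoint(const Entity& e) {
    return "tcp://" + e.ip + ":" + std::to_string(e.port);
}

// Endpoints that entity `from` would send to: all of its neighbors when
// `selected` is empty, otherwise only the selected ids that are neighbors.
std::vector<std::string> targets(const Topology& topo, int from,
                                 const std::vector<int>& selected = {}) {
    std::vector<std::string> out;
    const Entity& src = topo.at(from);
    for (int n : src.neighbors) {
        bool wanted = selected.empty();
        for (int s : selected) {
            if (s == n) {
                wanted = true;
            }
        }
        if (wanted) {
            out.push_back(endpoint(topo.at(n)));
        }
    }
    return out;
}

// Total sockets needed if every entity opens one socket per neighbor.
std::size_t sockets_per_neighbor(const Topology& topo) {
    std::size_t total = 0;
    for (const auto& kv : topo) {
        total += kv.second.neighbors.size();
    }
    return total;
}
```

For 20 fully connected entities, sockets_per_neighbor() gives 380, against 20 if each entity owns a single socket. Since one ZeroMQ socket can be connected to several endpoints, the list returned by targets() could in principle all be served from that one socket.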



Re: [zeromq-dev] Design problem of distributed environment

2017-11-15 Thread Wes Young
> I am currently fighting with some design issues for my project and as a
> newbie to ZeroMQ I am here to look for advice. My project should simulate
> a distributed algorithm in a distributed environment.

welcome :)

> For example an algorithm like broadcasting. I need to implement communication
> between nodes in this environment and I chose to use ZeroMQ.
>
> My question is what do you think is the best approach when you want to design
> an environment where an entity must do the following:
> - be able to receive messages from neighbours,
> - be able to send a message to all or maybe a few selected neighbours.

some of this may depend how you define "a neighbor" (e.g.: pre-configured env,
or zero-conf env). if you're looking for little-to-zero configuration, may
wanna checkout zyre:

https://rfc.zeromq.org/spec:36/ZRE
https://github.com/zeromq/zyre

> My first idea was to give every entity a ROUTER socket. Is it possible to
> achieve it with only routers?

I believe so, but you may wanna start by reading about the different patterns
first and prototyping some simple one-to-one, one-to-two patterns to get your
hands dirty first, get a feel for the different socket types:

http://zguide.zeromq.org/page:all#Messaging-Patterns

You'd probably start out with REQ/REP, which is easy, then move into async
(router/dealer), then into something like Zyre where the learning curve is a
little steeper (but probably closer to what you want long term if I'm
understanding your question).

> As I looked at ROUTER, it can send messages to multiple nodes and can also
> receive messages from multiple nodes.
>
> Every entity knows who its neighbors are and has information about their
> IPs and ports.

Yea, sounds like a problem ZYRE solves, but if you're new may wanna work your
way into it a bit with some of the simpler types first..

make sense?

--
wes
wesyoung.me




Re: [zeromq-dev] polling performances on windows

2017-11-15 Thread brunobodin .
Hi Francesco,

thanks for the heads up.


Bruno

On Tue, Nov 14, 2017 at 7:26 PM, Francesco wrote:

> Hi Bruno,
> I noticed your email and it reminds me of a performance issue with
> polling that I hit recently with ZMQ.
> You may be interested in the following excerpt:
>
> -
> just to update on this topic, in case it's useful to others: it turned out
> that the bottleneck [...] was more the fact that I was calling
> zmq_poll() before each zmq_msg_recv() (to implement a timeout on the
> recv()).
> The way I fixed the problem was to implement a slightly smarter polling
> logic: I try to receive with the ZMQ_DONTWAIT flag. If 0 messages are
> received, then next time a poll() operation with my desired timeout is
> performed before attempting the recv(). If 1 message is received instead,
> then next time the same recv() (always with the ZMQ_DONTWAIT flag) is
> repeated.
> -
>
> from this email thread:
> https://lists.zeromq.org/pipermail/zeromq-dev/2017-October/031974.html
>
> This substantially changed the performance of my application. But looking
> at your test application, perhaps you are adding the zmq_poll() operation
> on purpose, to test its impact, so maybe what I wrote above does not solve
> anything for you, not sure :)
>
> HTH,
> Francesco
>
>
> 2017-11-13 20:55 GMT+01:00 brunobodin . :
>
>> Hi all
>>
>> I ran a couple of tests to evaluate the cost of polling (on
>> Windows). To do so, I added polling to the local_lat and local_thr tests.
>>
>> The code is here
>>
>> https://github.com/bbdb68/libzmq/tree/test_polling_cost
>>
>> and here is what I noticed:
>> * since the fix to the memcpy of the FD_SET structure, the performance of
>> local_thr is excellent, close to 1Gb/s, i.e. the hardware throughput.
>>
>> * when I add the polling, the latency test seems unaffected, while the
>> thr test falls to 200Mb/s, that is a 5x drop
>>
>> So here are my questions
>>
>> * is this way of testing polling meaningful?
>> * how do you explain the difference in behaviour between the latency and
>> throughput tests?
>> * what are the results on a Linux box?
>>
>> Thanks
>>
>> Bruno
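
The adaptive polling logic Francesco describes (receive without blocking first, and only poll before the next receive when the previous one came back empty) can be sketched as below. This is a sketch under stated assumptions rather than his actual code: AdaptiveReceiver and FakeSocket are hypothetical names, and the socket is abstracted behind two assumed methods, try_recv() standing in for a receive with ZMQ_DONTWAIT and wait_readable() standing in for zmq_poll() with a timeout, so that the logic can be exercised without libzmq.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical socket interface assumed by this sketch:
//   bool try_recv(std::string& out);    // non-blocking receive
//   bool wait_readable(int timeout_ms); // poll-style wait, true if readable
template <typename Socket>
class AdaptiveReceiver {
public:
    AdaptiveReceiver(Socket& socket, int timeout_ms)
        : socket_(socket), timeout_ms_(timeout_ms) {}

    // Returns true and fills `out` if a message was received, false if
    // nothing was available (or the poll timed out).
    bool recv(std::string& out) {
        // Pay for a poll only when the previous receive found nothing.
        if (poll_first_ && !socket_.wait_readable(timeout_ms_)) {
            return false;  // timed out, poll again next time
        }
        bool got = socket_.try_recv(out);
        poll_first_ = !got;
        return got;
    }

private:
    Socket& socket_;
    int timeout_ms_;
    bool poll_first_ = false;
};

// In-memory stand-in for a real socket, used only to exercise the logic.
struct FakeSocket {
    std::deque<std::string> queue;
    int polls = 0;  // how many times wait_readable() was called

    bool try_recv(std::string& out) {
        if (queue.empty()) {
            return false;
        }
        out = queue.front();
        queue.pop_front();
        return true;
    }

    bool wait_readable(int /*timeout_ms*/) {
        ++polls;
        return !queue.empty();
    }
};
```

While messages keep arriving, recv() never polls, which is where the saving comes from; the poll (and its timeout) only comes back into play once the queue has run dry.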


Re: [zeromq-dev] CZMQ VS2015 build errors

2017-11-15 Thread Luca Boccassi
On Fri, 2017-11-10 at 20:17 +, Stephen Gray wrote:
> > Have you checked this issue and the solution at the bottom?
> > 
> > https://github.com/zeromq/czmq/issues/1617
> > 
> > --
> > Kind regards,
> > Luca Boccassi
> 
> Thanks for the tip Luca.
> 
> It cleared the original error but threw up a different one related to
> arguments of incorrect type https://pastebin.com/3NYMkKTA 
> 
> With regards,
> Stephen

This is now fixed as well.

Kind regards,
Luca Boccassi
