RabbitMqIO pull/consume + high cpu on no message

Mostafa Aghajani Thu, 08 Apr 2021 15:27:36 -0700

Hi,
I am quite new to Beam framework and working on using RabbitMQ as input source 
for my pipeline. I have some challenges and questions:


  *   When there is no message in the queue to process, the flink runner (and 
direct runner too) frequently (many times per second) runs the createReader 
function of UnboundedSource and then call the start method of the reader. I 
checked the source code and this can be a heavy operation on MQ channel which 
will lead to the connection to be dropped and errors + super high CPU usage. Is 
this normal or am I missing something?
  *   The provided RabbitMQ io in SDK is using basicGet function of rabbitmq 
amqp lib to fetch single item from the queue, which according to rabbitmq docs 
is not recommend at all. So why we are using the pull approach here and not 
using the consume and push from server? I tried this but since I put the 
basicConsume inside the start method, it will create many consumers leading to 
other issues. I have two questions here: 1- Is this expected that the the 
runner on each loop (after all produced messages are passed through pipeline or 
there is no message) create a new reader and start it? I expected this to 
happen only once and the runner just ask reader with calling "advance" 
function. 2- Is it a good idea to implement the reader using recommended push 
by server approach (as a consumer using basicConsume) and put the items in a 
memory queue for further calls of advance functionn? Then it would also 
requires the consumer and channel to be static or shared and not attached to 
the UnboundedReader if a new instance is going to be created frequently.


Best Regards
Mostafa Aghajani

RabbitMqIO pull/consume + high cpu on no message

Reply via email to