RabbitMQ and CheckpointMark feasibility

Daniel Robert Thu, 07 Nov 2019 12:07:46 -0800

(Background: I recently upgraded RabbitMqIO from the 4.x to 5.x library. As part of this I switched to a pull-based API rather than the previously-used push-based one. This has caused some nebulous problems, so I put up a correction PR that I think needs some eyes fairly quickly, as I'd consider master to be broken for rabbitmq right now. The PR keeps the upgrade but reverts to the same push-based implementation as in 4.x: https://github.com/apache/beam/pull/9977 )

Regardless, in trying to get the pull-based API to work, I'm finding the interactions between rabbitmq and beam with CheckpointMark to be fundamentally impossible to implement, so I'm hoping for some input here.

CheckpointMark itself must be Serializable; presumably this means it gets shuffled around between nodes. However 'Channel', the tunnel through which it communicates with Rabbit to ack messages and finalize the checkpoint, is non-Serializable. As in most other CheckpointMark implementations, the Channel field is 'transient'. When a new CheckpointMark is instantiated, it's given a Channel. If an existing one is supplied to the Reader's constructor (part of the 'startReader()' interface), the channel is overwritten.
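
To make the serialization point concrete, here is a minimal sketch using only the JDK. FakeChannel and DemoCheckpointMark are hypothetical stand-ins, not the actual RabbitMqIO or RabbitMQ client classes; the only thing it demonstrates is that a 'transient' field comes back null after a Java serialization round trip, while the delivery tags survive.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class TransientChannelDemo {

  /** Hypothetical stand-in for a live, non-serializable broker channel. */
  static class FakeChannel {
    final int id;

    FakeChannel(int id) {
      this.id = id;
    }
  }

  /** Hypothetical checkpoint mark: the tags are serialized, the channel is not. */
  static class DemoCheckpointMark implements Serializable {
    private static final long serialVersionUID = 1L;

    final List<Long> deliveryTags = new ArrayList<>();
    transient FakeChannel channel; // skipped by Java serialization

    DemoCheckpointMark(FakeChannel channel) {
      this.channel = channel;
    }
  }

  /** Serializes and deserializes the mark, roughly what moving it between nodes implies. */
  static DemoCheckpointMark roundTrip(DemoCheckpointMark mark)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(mark);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (DemoCheckpointMark) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    DemoCheckpointMark mark = new DemoCheckpointMark(new FakeChannel(1));
    mark.deliveryTags.add(42L);

    DemoCheckpointMark copy = roundTrip(mark);
    System.out.println("tags survive: " + copy.deliveryTags);
    System.out.println("channel survives: " + (copy.channel != null));
  }
}
```

So whoever receives a deserialized mark has the delivery tags but has to be handed some channel again, and nothing in the mark itself says whether that is the channel the messages were consumed on.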

*However*, Rabbit does not support 'ack'ing messages on a channel other than the one that consumed them in the first place. Attempting to do so results in a '406 (PRECONDITION-FAILED) - unknown delivery tag'. (See https://www.grzegorowski.com/rabbitmq-406-channel-closed-precondition-failed).
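
The failure mode can be illustrated with a toy model, under the assumption (which matches my understanding of AMQP 0-9-1) that delivery tags are counters scoped to the channel that delivered the message. ToyChannel below is hypothetical and does not use the RabbitMQ client; it only mimics the bookkeeping, and the exception message is made up to resemble the real error.

```java
import java.util.HashSet;
import java.util.Set;

public class ChannelScopedAckDemo {

  /** Toy model of a channel: delivery tags are counters local to this channel. */
  static class ToyChannel {
    private long nextTag = 1;
    private final Set<Long> unacked = new HashSet<>();

    /** Delivers one message and returns its channel-local delivery tag. */
    long deliver() {
      long tag = nextTag++;
      unacked.add(tag);
      return tag;
    }

    /** Acks a tag; fails if this channel has no outstanding delivery with that tag. */
    void ack(long tag) {
      if (!unacked.remove(tag)) {
        throw new IllegalStateException(
            "PRECONDITION_FAILED - unknown delivery tag " + tag);
      }
    }
  }

  public static void main(String[] args) {
    ToyChannel consuming = new ToyChannel();
    ToyChannel replacement = new ToyChannel(); // e.g. the channel a new reader opened

    long tag = consuming.deliver();

    try {
      replacement.ack(tag); // the tag means nothing on this channel
    } catch (IllegalStateException e) {
      System.out.println("ack on new channel failed: " + e.getMessage());
    }

    consuming.ack(tag); // only the consuming channel can ack it
    System.out.println("ack on consuming channel succeeded");
  }
}
```

In this model, a checkpoint whose channel has been overwritten with a fresh one can never ack the tags it carries, which is the situation described above.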

Truthfully, I don't really understand how the current implementation is working; it seems like a happy accident. But I'm curious if someone could help me debug and implement how to bridge the re-usable/serializable CheckpointMark requirement in Beam with this limitation of Rabbit.


Thanks,
-Daniel Robert
