ppawel commented on PR #30218:
URL: https://github.com/apache/beam/pull/30218#issuecomment-2037745694

   > We have to recreate the consumer/session to properly handle checkpoints, 
otherwise there were data losses. This is due to the limitation that message 
acknowledge in Jms is per session. All messages that delivered by the time of 
acknowledging a message within a session will be marked as acknowledged.
   
   I've never looked this deeply into the JMS spec. To be honest, it is a bit 
strange if acks really work like this (that acking one single message is 
enough). I think this should also be tested across implementations. I can check 
it at some point with my case (Solace broker), but first I need to deal with 
this deadlock situation.
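
To make the quoted behaviour concrete, here is a minimal sketch of session-scoped acknowledgement as the JMS `CLIENT_ACKNOWLEDGE` mode documents it: calling `acknowledge()` on one message acknowledges every message the session has consumed so far. This is a toy model in plain Java, not the `javax.jms` API; `ToySession`, `ToyMessage`, `deliver` and `acknowledgeAll` are hypothetical names made up for illustration, and real brokers (Solace included) may differ in details that only a test against the broker would show.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of session-scoped acknowledgement; NOT the javax.jms API. */
public class SessionAckSketch {

    /** Hypothetical stand-in for a session that tracks consumed messages. */
    static final class ToySession {
        private final List<ToyMessage> unacked = new ArrayList<>();
        private final List<String> acked = new ArrayList<>();

        /** Simulates a message being consumed through this session. */
        ToyMessage deliver(String body) {
            ToyMessage message = new ToyMessage(this, body);
            unacked.add(message);
            return message;
        }

        /** Marks every message consumed so far as acknowledged. */
        void acknowledgeAll() {
            for (ToyMessage message : unacked) {
                acked.add(message.body);
            }
            unacked.clear();
        }

        List<String> ackedBodies() {
            return new ArrayList<>(acked);
        }

        int unackedCount() {
            return unacked.size();
        }
    }

    /** Hypothetical message whose acknowledge() is scoped to its session. */
    static final class ToyMessage {
        private final ToySession session;
        final String body;

        ToyMessage(ToySession session, String body) {
            this.session = session;
            this.body = body;
        }

        /** Acknowledging one message acknowledges the whole session. */
        void acknowledge() {
            session.acknowledgeAll();
        }
    }

    public static void main(String[] args) {
        ToySession session = new ToySession();
        session.deliver("m1");
        ToyMessage m2 = session.deliver("m2");
        session.deliver("m3");

        // Only m2 is acknowledged explicitly, yet m1 and m3 are covered too.
        m2.acknowledge();

        System.out.println("acked: " + session.ackedBodies());    // acked: [m1, m2, m3]
        System.out.println("unacked: " + session.unackedCount()); // unacked: 0
    }
}
```

In this model, a checkpoint that only covers `m1` and `m2` would still acknowledge `m3` when it is finalized. If that matches the real client behaviour, it would explain the data loss mentioned in the quote and why a fresh session per checkpoint was introduced.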
   
   > Agree with the analysis and I think the cause is same here: 
https://github.com/apache/beam/pull/30218/files#diff-a63812b51f93708cc60430f314b496ae1110425c6a8ae4c85e59573cfb8f0938R204-R207
   
   OK, but as I understand it, this is only for the direct runner. In a runner 
like Dataflow, finalizing the checkpoint might or might not happen at some 
point and "fix" the deadlock, but the root cause will still be there.
   
   > Is there a way to release the messages in internal buffer but do not 
acknowledge the receive message within the same session?
   
   I don't think it's possible to do this at the JMS API level, and even at the 
implementation level I don't see any easy access to those internal 
queues/buffers in either client. There are some parameters to control how big 
the buffer is, etc., but I think JmsIO should ideally work regardless of such 
parameters. After all, those buffers are there for a reason (performance).
   
   I see the comment in 
`org.apache.beam.sdk.io.jms.JmsIOTest#testCheckpointMark` regarding testing 
without "prefetch". I think it would be good to also test some scenarios that 
include prefetch and this internal buffer. I can't promise anything due to 
time constraints, but I might try to implement a test case that reproduces my 
issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
