jwomeara commented on issue #3087:
URL: https://github.com/apache/accumulo/issues/3087#issuecomment-1322596544

   I'm not asking for any default behavior to be changed.  If the way it's 
written today works for 99% of the use cases, then I wouldn't argue for 
changing that.  
   
   The problem is that when there is a networking issue, or my ZooKeeper
instances are (temporarily) unavailable, my code gets stuck in the
createBatchWriter call indefinitely, which takes away my ability to react.
Giving me a way to take back control, either by specifying a maximum number
of retries or a timeout, would be useful.
   
   In the specific use case I'm dealing with, I have a limited number of
RabbitMQ consumer threads that read audit messages from a queue and then write
those messages to Accumulo using a batch writer.  All of my consumer threads
end up blocked waiting on createBatchWriter to return, and the call still hangs
even after the ZooKeeper instances come back.  If I were able to configure
Accumulo to throw an exception instead, I could react and possibly write my
audit messages somewhere else, like HDFS, or even attempt to create a new batch
writer.  As it is now, all of my threads lock up and RabbitMQ backs up, and
my only recourse is to restart the audit service.
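
   To make the reaction I have in mind concrete, here is a rough sketch of the
fallback logic a consumer thread could run if the Accumulo path threw instead
of hanging.  The AuditSink interface and the FallbackDelivery class are
hypothetical names I made up for illustration; neither is part of Accumulo or
RabbitMQ.

```java
final class FallbackDelivery {
    /** Hypothetical destination for audit messages, e.g. Accumulo or HDFS. */
    interface AuditSink {
        void write(String message) throws Exception;
    }

    private FallbackDelivery() {}

    /**
     * Tries the primary sink first; if it throws, writes to the fallback.
     * Returns true if the primary sink accepted the message, false if the
     * fallback was used. Throws only if both sinks fail.
     */
    static boolean deliver(String message, AuditSink primary, AuditSink fallback)
            throws Exception {
        try {
            primary.write(message);
            return true;
        } catch (Exception primaryFailure) {
            try {
                fallback.write(message);
            } catch (Exception fallbackFailure) {
                // Keep the original cause visible for debugging.
                fallbackFailure.addSuppressed(primaryFailure);
                throw fallbackFailure;
            }
            return false;
        }
    }
}
```

   This only works if the primary sink fails fast, which is exactly what I
cannot get today.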
   
   If adding a timeout option is a no-go, then I will have to resort to
creating the batch writer in a separate thread that I can kill myself
after a certain amount of time.  That seems sloppy to me, but I don't think I
have any other options short of changing Accumulo.
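
   For reference, a minimal sketch of that separate-thread workaround, using
only java.util.concurrent.  BoundedCall and callWithTimeout are hypothetical
names of my own, not Accumulo API, and the sketch assumes the blocked call
responds to thread interruption, which I have not verified for
createBatchWriter.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedCall {
    private BoundedCall() {}

    /**
     * Runs a blocking call on a worker thread and gives up after the timeout.
     * Throws TimeoutException if the call has not returned in time, or
     * ExecutionException if the call itself threw.
     */
    static <T> T callWithTimeout(Callable<T> call, long timeout, TimeUnit unit)
            throws TimeoutException, ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true); // a stuck call must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(call);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Interrupts the worker; this only frees the thread if the
            // blocked call honors interrupts (an assumption here).
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

   A consumer would then wrap the call, for example
BoundedCall.callWithTimeout(() -> client.createBatchWriter(tableName), 30,
TimeUnit.SECONDS), where client and tableName stand in for my own
AccumuloClient and table name, and treat a TimeoutException as the signal to
fall back.  If the underlying call ignores the interrupt, the worker thread
leaks until the call returns on its own, which is part of why this feels
sloppy.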


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
