Dear Jordan, 

Briefly, the technical details are:

        * The application is built with Spring Boot and ZooKeeper 3.4.9

        * Curator is used to elect a leader (LeaderLatch) and to provide a
distributed queue that is lock-aware. Here is the queue initialization code:

    private void distributedQueueInit() throws Exception {
        FileConsumer consumer = new FileConsumer(receiver);
        FileReferenceSerializer serializer = new FileReferenceSerializer();
        QueueBuilder<FileReference> builder =
                QueueBuilder.builder(client, consumer, serializer, queuePath);
        // Setting a lock path makes the queue lock-aware
        builder.lockPath(lockPath);
        fileRefs = builder.buildQueue();
        fileRefs.start();
    }

        * FileConsumer implements QueueConsumer and its consumeMessage() is
very short:

    @Override
    public void consumeMessage(FileReference reference) throws Exception {
        log.info("Processing file: " + reference.getReference() + "|"
                + reference.getRefType());
        receiver.processFile(reference);
    }

        * The problem is in the receiver.processFile(reference) call. It
makes an HTTP call to an external resource that retrieves the file by
reference and saves it to disk. When the problem arises (it happened once
in 3 days), I can see in my logs that "Processing file:" is logged. Normally
the next log message appears after the file is saved to disk. When the error
happened, it never appeared. At the same time, I can see the lock and the
message in the queue, both still unprocessed. I think there are only 2
reasons for the consumer to die:

        * The HTTP call takes too long and the consumer dies
        * The HTTP connection dies without an exception, so the consumer never
receives an error and hangs
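
Both reasons come down to the HTTP call never returning. One way to make a stalled connection fail loudly is to set explicit connect and read timeouts. The sketch below assumes the JDK's own HttpURLConnection is used, which this message does not confirm. TimedDownload and fetch() are hypothetical names, not part of the application or of Curator.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, for illustration only.
final class TimedDownload {

    // Downloads 'source' to 'target'. Throws java.net.SocketTimeoutException
    // if connecting, or any single read, takes longer than the given limits.
    static void fetch(URL source, Path target, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) source.openConnection();
        conn.setConnectTimeout(connectTimeoutMs); // limit for establishing the connection
        conn.setReadTimeout(readTimeoutMs);       // limit for each blocking read
        try (InputStream in = conn.getInputStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            conn.disconnect();
        }
    }
}
```

Note that the read timeout bounds each individual read, not the whole transfer, so a server that trickles data slowly can still keep the call busy for a long time.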

   If you need more details, I can provide information about the HTTP call,
but I think it is not relevant for this kind of problem. I am curious
about what to do next, because there may be more than 2 reasons for
such behavior. What do you think?
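
For reference, the timeout idea from the original message (run the work on a separate thread and wait for it with a deadline) could be sketched roughly as follows. TimeLimitedRunner is a hypothetical helper built only on java.util.concurrent. It is not part of Curator, and the timeout value is an assumption to be tuned.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, for illustration only.
final class TimeLimitedRunner {
    // Daemon threads so that a stuck task cannot keep the JVM alive.
    private final ExecutorService executor = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "time-limited-task");
        t.setDaemon(true);
        return t;
    });
    private final long timeoutMillis;

    TimeLimitedRunner(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Runs the task on a worker thread and waits at most timeoutMillis for it.
    <T> T run(Callable<T> task) throws Exception {
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker. Blocking socket I/O may ignore the
            // interrupt, so the worker thread itself can stay stuck.
            future.cancel(true);
            throw e;
        } catch (ExecutionException e) {
            // Rethrow the task's own exception when possible.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        }
    }
}
```

Inside consumeMessage() this would be used as runner.run(() -> { receiver.processFile(reference); return null; }); so that a silent hang surfaces as a TimeoutException in the consumer instead of blocking it forever. The caller is released either way, but a worker stuck in blocking I/O may linger until the underlying call itself gives up.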

Regards, 

Vadim

On 2016-11-16 17:31, Jordan Zimmerman wrote:

> Can you send a code snippet or test that shows the issue? 
> 
> -Jordan 
> 
>> On Nov 16, 2016, at 4:35 AM, Vadim <[email protected]> wrote: 
>> 
>> Hello all, 
>> 
>> My name is Vadim and I am a new Curator user. I am very happy with Curator, 
>> but think that I may be using it in a slightly wrong way. In particular, I am 
>> using the Distributed Queue recipe, and the code runs well until the Consumer silently 
>> dies. Messages for the queue are small and do not exceed 50 bytes. I use the queue 
>> to distribute tasks between Workers. 
>> 
>> My QueueConsumer method consumeMessage() calls a part of the code that may 
>> fail silently (without an exception). Since message delivery in my scenario is 
>> "durable", I can see locks that are never freed. Curator does not "cure" such 
>> a stale consumer either. Am I doing something wrong? 
>> 
>> I have an idea to call the code inside "consumeMessage" on a separate 
>> thread and block on it with a timeout. Then, when my code fails silently, the consumer will 
>> get a timeout exception. But I think this is not an elegant solution. What do 
>> you think? 
>> 
>> Thank you in advance, 
>> 
>> Vadim.
 
