[ 
https://issues.apache.org/jira/browse/STORM-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139831#comment-16139831
 ] 

Jungtaek Lim commented on STORM-2231:
-------------------------------------

[~chemist] [~kevinconaway]
I think I have found a suspicious spot, but the possible fixes may affect 
performance, so I would like to spend some time running performance tests on 
them. I have not reproduced the issue and don't know whether it is easy to 
reproduce, so if one of you could help test and verify the patch, that would 
be really helpful.

> NULL in DisruptorQueue while multi-threaded ack
> -----------------------------------------------
>
>                 Key: STORM-2231
>                 URL: https://issues.apache.org/jira/browse/STORM-2231
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>    Affects Versions: 1.0.1, 1.1.0
>            Reporter: Alexander Kharitonov
>            Priority: Critical
>
> I use a simple topology with one spout (9 workers) and one bolt (9 workers).
> I have topology.backpressure.enable: false in storm.yaml.
> The spouts send about 10,000,000 tuples in 10 minutes. Pending for the spout 
> is 80,000.
> The bolts buffer their tuples for 60 seconds, then flush them to the database 
> and ack them in parallel (10 threads).
> I read that OutputCollector can be used safely from many threads, so I do so.
> I don't have any bottleneck in the bolts (flushing to the database) or the 
> spouts (Kafka spout), but about 2% of tuples fail due to the tuple processing 
> timeout (the failures are recorded in the spout stats only).
> I am sure that the bolts ack all tuples, but some of the acks don't reach the 
> spouts.
> During multi-threaded acking I see many errors like this in the worker logs:
> 2016-12-01 13:21:10.741 o.a.s.u.DisruptorQueue [ERROR] NULL found in 
> disruptor-executor[3 3]-send-queue:853877
> I tried using a synchronized wrapper around OutputCollector to fix the error, 
> but it didn't help.
> I found a workaround that helps: I do all the processing in the bolt in 
> multiple threads but call the OutputCollector.ack methods from a single 
> separate thread.
> I think Storm has a bug in the multi-threaded use of OutputCollector.
> If my topology has much less load, such as 500,000 tuples per 10 minutes, 
> then I don't lose any acks.
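
For anyone who wants to try the reporter's workaround in the meantime, it can 
be sketched roughly as below. This is only a minimal illustration, not code 
from the reporter or from Storm: the class name SingleThreadAcker is 
hypothetical, and the Consumer passed in is an assumed stand-in for a call 
such as OutputCollector.ack. Worker threads hand tuples to a single-threaded 
executor, so the ack function is only ever invoked from one thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/**
 * Hypothetical helper: funnels acks from many worker threads onto one
 * dedicated thread. The Consumer stands in for the real ack call
 * (assumption: something like collector::ack); it is not a Storm API.
 */
public class SingleThreadAcker<T> {
    // One thread only, so ackFn is never called concurrently.
    private final ExecutorService ackThread = Executors.newSingleThreadExecutor();
    private final Consumer<T> ackFn;

    public SingleThreadAcker(Consumer<T> ackFn) {
        this.ackFn = ackFn;
    }

    /** Safe to call from any worker thread; the ack itself runs on the ack thread. */
    public void ack(T tuple) {
        ackThread.execute(() -> ackFn.accept(tuple));
    }

    /** Stops accepting new acks and waits for the queued ones to finish. */
    public boolean shutdown(long timeout, TimeUnit unit) throws InterruptedException {
        ackThread.shutdown();
        return ackThread.awaitTermination(timeout, unit);
    }
}
```

In a bolt this would presumably be created in prepare() with the collector's 
ack method as the Consumer and shut down in cleanup(). It only sidesteps the 
concurrent calls; it does not address the underlying queue problem.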



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
