Also:

* What's your setting for topology.message.timeout?
* You said you're seeing this in both indexing and enrichment — what enrichments do you have in place?
* Is ES being taxed heavily?
* What's your ES batch size for the sensor?
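[Editor's note: for anyone following the thread, a sketch of where these knobs typically live. Names follow the Storm and Metron docs, but the exact values and the `bro` sensor name are illustrative assumptions — verify against your own version.]

```
# Storm side (storm.yaml or per-topology config); the timeout is in seconds:
topology.message.timeout.secs: 30
topology.max.spout.pending: 1000    # bounding in-flight tuples often matters too

# Metron side: the ES batch size is per-sensor, in the indexing config pushed
# to ZooKeeper, e.g. for a hypothetical "bro" sensor:
# {
#   "elasticsearch": { "index": "bro", "batchSize": 100, "enabled": true }
# }
```

A batch size that is too small relative to the ingest rate can keep tuples waiting past the message timeout, which produces exactly the "failures in Storm but data visible in ES" symptom discussed here.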
On Fri, Apr 21, 2017 at 10:46 AM, Casey Stella <[email protected]> wrote:

> So you're seeing failures in the Storm topology but no errors in the logs.
> Would you mind sending over a screenshot of the indexing topology from the
> Storm UI? You might not be able to paste the image on the mailing list, so
> maybe an imgur link would be in order.
>
> Thanks,
>
> Casey
>
> On Fri, Apr 21, 2017 at 10:34 AM, Ali Nazemian <[email protected]> wrote:
>
>> Hi Ryan,
>>
>> No, I cannot see any errors inside the indexing error topic. Also, the
>> number of tuples emitted and transferred to the error indexing bolt is
>> zero!
>>
>> On Sat, Apr 22, 2017 at 12:29 AM, Ryan Merriman <[email protected]> wrote:
>>
>>> Do you see any errors in the error* index in Elasticsearch? There are
>>> several catch blocks across the different topologies that transform
>>> errors into JSON objects and forward them on to the indexing topology.
>>> If you're not seeing anything in the worker logs, it's likely the
>>> errors were captured there instead.
>>>
>>> Ryan
>>>
>>> On Fri, Apr 21, 2017 at 9:19 AM, Ali Nazemian <[email protected]> wrote:
>>>
>>>> No, everything is fine at the log level. Also, when I checked resource
>>>> consumption on the workers, there were still plenty of resources
>>>> available!
>>>>
>>>> On Fri, Apr 21, 2017 at 10:04 PM, Casey Stella <[email protected]> wrote:
>>>>
>>>>> Seeing anything in the Storm logs for the workers?
>>>>>
>>>>> On Fri, Apr 21, 2017 at 07:41, Ali Nazemian <[email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> After tuning Metron for performance, I noticed that the failure rate
>>>>>> for the indexing/enrichment topologies is very high (about 95%).
>>>>>> However, I can see the messages in Elasticsearch. I tried increasing
>>>>>> the acknowledgement timeout value, but it didn't fix the problem.
>>>>>> I can set the number of acker executors to 0 to temporarily fix the
>>>>>> problem, but that is not a good idea at all. Do you have any idea
>>>>>> what could have caused this issue? The percentage of failures
>>>>>> decreases when I reduce parallelism, but even without any
>>>>>> parallelism it is still high!
>>>>>>
>>>>>> Cheers,
>>>>>> Ali
>>>>
>>>> --
>>>> A.Nazemian
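[Editor's note: on the workaround mentioned in the thread — setting ackers to 0 disables at-least-once delivery entirely, so tuples are acked immediately and failures (and replays) simply vanish rather than being fixed. A minimal sketch of the relevant Storm settings, with illustrative values:]

```
# Workaround discussed above (masks the problem rather than fixing it):
topology.acker.executors: 0

# Usual tuning alternative: raise the ack timeout and throttle in-flight
# tuples so slow ES bulk writes can be acked before they time out:
topology.message.timeout.secs: 120
topology.max.spout.pending: 500
```

With ackers disabled, a tuple dropped anywhere in the indexing path is lost silently, which is why the thread treats it as a diagnostic step only.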