I would start by increasing the task timeouts Config.*NIMBUS_TASK_LAUNCH_SECS* and Config.*NIMBUS_TASK_TIMEOUT_SECS* so the supervisor doesn't mark the task as dead and restart it.

On Aug 24, 2016 2:17 AM, "Simon Cooper" <simon.coo...@featurespace.co.uk> wrote:

> We’re decompressing and deserializing several hundreds-of-megabytes files
> containing data (statistical classifier definitions, mostly) that the bolt
> needs to do its thing. The bolt can’t process events without deserializing
> and indexing the data in those files, which could take anything up to
> several minutes. This can’t easily be farmed out to an external service,
> due to various processing and infrastructure limitations.
>
> SimonC
>
> *From:* Hart, James W. [mailto:jwh...@seic.com]
> *Sent:* 23 August 2016 15:04
> *To:* user@storm.apache.org
> *Subject:* RE: Running a long task in bolt prepare() method
>
> Can you elaborate on what kind of work is being done at startup?
>
> If you are building some kind of cacheable lookup data, I would build that
> elsewhere in a persistent cache, like redis, and then fetch and access it
> through redis.
>
> *From:* Simon Cooper [mailto:simon.coo...@featurespace.co.uk]
> *Sent:* Tuesday, August 23, 2016 9:36 AM
> *To:* user@storm.apache.org
> *Subject:* RE: Running a long task in bolt prepare() method
>
> We’ve got a similar issue, where the prepare() takes a long time (could be
> up to several minutes), and the bolt can’t process tuples until that is
> completed. The topology seems to send in tuples before the prepare is
> completed, and things go wrong.
>
> We’re having to implement our own mechanism for notification – an external
> way for the bolt to report to the spout that it is ready. This is also an
> issue on multi-worker topologies where one of the workers goes down, is
> recreated, and it’s several minutes before it can process tuples.
>
> It would be good if there was a way for storm to deal with this, so we
> don’t have to implement our own back-channel back to the spout…
>
> SimonC
>
> *From:* Andrea Gazzarini [mailto:gxs...@gmail.com]
> *Sent:* 23 August 2016 13:08
> *To:* user@storm.apache.org
> *Subject:* Re: Running a long task in bolt prepare() method
>
> Not sure if there's a "built-in" approach in Storm for doing that. After
> making sure there isn't, I'd do the following:
>
> - I'd start such a long task asynchronously in the prepare method and
>   I'd register a callback
> - if the execute method logic depends on the completion of such a task,
>   I'd use a basic state pattern with two states ON/OFF (where the off state
>   is basically a NullObject). The callback would be responsible for switching
>   the bolt state from OFF (initial state) to ON (working state)
>
> Best,
> Andrea
>
> On 23/08/16 09:12, Xiang Wang wrote:
>
> > Hi All,
> >
> > I am trying to do some long-time initialisation task in bolt prepare()
> > method in local mode.
> >
> > I always got error like this:
> >
> > *WARN o.a.s.s.o.a.z.s.p.FileTxnLog - fsync-ing the write ahead log in
> > SyncThread:0 took 1197ms which will adversely effect operation latency. See
> > the ZooKeeper troubleshooting guide*
> >
> > And then the task fails.
> >
> > Could anyone tell me how to fix this problem? Or is it a good practice to
> > run long-time task in prepare() method? If not, what is supposed to be the
> > correct way to do it?
> >
> > Many thanks for your kind help.
> >
> > Best,
> > Xiang
> > -------------------------------
> > Xiang Wang, PhD Candidate
> > Database Research Group
> > School of Computer Science and Engineering
> > The University of New South Wales
> > SYDNEY, AUSTRALIA
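
For what it's worth, Andrea's suggestion quoted above (start the load asynchronously in prepare() and let a callback flip the bolt from OFF to ON) could be sketched roughly as follows. This is plain Java with no Storm dependency. WarmupGate and its method names are hypothetical, made up for illustration, and the OFF state here simply returns null, whereas a real bolt would have to decide what to do with tuples that arrive early.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Storm): holds the handler for the current state. */
class WarmupGate<I, O> {
    // OFF state: a null object that does no work and returns null.
    private final Function<I, O> off = input -> null;

    // Starts in the OFF state; replaced by the real handler once loading finishes.
    private final AtomicReference<Function<I, O>> state = new AtomicReference<>(off);

    /**
     * Intended to be called from prepare(). Runs the slow loader on another
     * thread and returns immediately; the callback switches the state to ON.
     */
    CompletableFuture<Void> start(Supplier<Function<I, O>> slowLoader) {
        return CompletableFuture.supplyAsync(slowLoader).thenAccept(state::set);
    }

    /** True once the loader has completed and the ON handler is installed. */
    boolean isReady() {
        return state.get() != off;
    }

    /** Intended to be called from execute(). Delegates to whichever state is current. */
    O handle(I input) {
        return state.get().apply(input);
    }
}
```

In a bolt, prepare() would call start(...) with the deserialization and indexing code and return straight away, and execute() would call handle(...) for each tuple. Whether tuples that arrive while the gate is still OFF should be failed for replay, buffered, or dropped is a separate decision that this sketch does not make.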