We’re decompressing and deserializing several files of hundreds of megabytes 
each, containing data (statistical classifier definitions, mostly) that the bolt 
needs to do its thing. The bolt can’t process events without deserializing and 
indexing the data in those files, which can take up to several 
minutes. This can’t easily be farmed out to an external service, due to various 
processing and infrastructure limitations.

SimonC

From: Hart, James W. [mailto:[email protected]]
Sent: 23 August 2016 15:04
To: [email protected]
Subject: RE: Running a long task in bolt prepare() method

Can you elaborate on what kind of work is being done at startup?

If you are building some kind of cacheable lookup data, I would build that 
elsewhere in a persistent cache, like redis, and then fetch and access it 
through redis.

From: Simon Cooper [mailto:[email protected]]
Sent: Tuesday, August 23, 2016 9:36 AM
To: [email protected]
Subject: RE: Running a long task in bolt prepare() method

We’ve got a similar issue, where the prepare() takes a long time (could be up 
to several minutes), and the bolt can’t process tuples until that is completed. 
The topology seems to send in tuples before the prepare is completed, and 
things go wrong.

We’re having to implement our own mechanism for notification – an external way 
for the bolt to report to the spout that it is ready. This is also an issue on 
multi-worker topologies where one of the workers goes down, is recreated, and 
it’s several minutes before it can process tuples.

It would be good if there were a way for Storm to deal with this, so we don’t 
have to implement our own back-channel to the spout…

SimonC

From: Andrea Gazzarini [mailto:[email protected]]
Sent: 23 August 2016 13:08
To: [email protected]
Subject: Re: Running a long task in bolt prepare() method


Not sure if there's a "built-in" approach in Storm for doing that. After making 
sure there isn't, I'd do the following:

  *   I'd start such long task asynchronously in the prepare method and I'd 
register a callback
  *   if the execute method logic depends on the completion of such task, I'd 
use a basic state pattern with two states ON/OFF (where the off state is 
basically a NullObject). The callback would be responsible for switching the 
bolt state from OFF (initial state) to ON (working state)
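
The two points above can be sketched roughly as follows. This is only a 
minimal illustration in plain Java, not Storm code: LazyInitBolt, State, 
slowLoader and results() are hypothetical names, events are plain strings 
instead of tuples, and the "index" is just a map. It assumes the long task can 
run on a background thread and that it is acceptable for execute() to report 
"not processed" while the bolt is still OFF.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical stand-in for a bolt. It mirrors the prepare()/execute()
// lifecycle but deliberately does not use any Storm classes.
class LazyInitBolt {

    // Each state decides what happens to an incoming event.
    // Returns true if the event was processed, false if it was not.
    interface State {
        boolean handle(String event);
    }

    // OFF state (the NullObject): does nothing and reports "not processed".
    private static final State OFF = event -> false;

    // The bolt starts in OFF; the callback swaps in the ON state later.
    private final AtomicReference<State> state = new AtomicReference<>(OFF);
    private final List<String> results = new CopyOnWriteArrayList<>();

    // Starts the slow load on another thread and returns immediately.
    // The returned future completes once the bolt has switched to ON.
    CompletableFuture<Void> prepare(Supplier<Map<String, Integer>> slowLoader) {
        return CompletableFuture.supplyAsync(slowLoader)
                .thenAccept(index -> state.set(onState(index)));
    }

    // Delegates to whichever state is current.
    boolean execute(String event) {
        return state.get().handle(event);
    }

    List<String> results() {
        return results;
    }

    // ON state: looks the event up in the loaded index and records the result.
    private State onState(Map<String, Integer> index) {
        return event -> {
            results.add(event + "=" + index.getOrDefault(event, -1));
            return true;
        };
    }

    public static void main(String[] args) {
        LazyInitBolt bolt = new LazyInitBolt();
        CompletableFuture<Void> ready =
                bolt.prepare(() -> Collections.singletonMap("a", 1));
        ready.join(); // wait for the demo load to finish
        System.out.println(bolt.execute("a") + " " + bolt.results());
    }
}
```

In a real bolt, whatever arrives while the state is still OFF has to go 
somewhere: presumably it would be failed (so the spout can replay it) or 
buffered, rather than silently dropped. Note also that if the loader throws, 
this sketch simply stays OFF, so the failure would need to be surfaced 
separately.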
Best,
Andrea
On 23/08/16 09:12, Xiang Wang wrote:
Hi All,

I am trying to run a long initialisation task in the bolt's prepare() method 
in local mode.

I always get an error like this:
WARN  o.a.s.s.o.a.z.s.p.FileTxnLog - fsync-ing the write ahead log in 
SyncThread:0 took 1197ms which will adversely effect operation latency. See the 
ZooKeeper troubleshooting guide

And then the task fails.

Could anyone tell me how to fix this problem? Is it good practice to run a 
long task in the prepare() method? If not, what is the correct way to do it?

Many thanks for your kind help.

Best,
Xiang
-------------------------------
Xiang Wang, PhD Candidate
Database Research Group
School of Computer Science and Engineering
The University of New South Wales
SYDNEY, AUSTRALIA
