We tried implementing our own thread safe output collector but it seems 
ridiculous for such a concurrent system. Why don’t they implement it in core 
Storm? They can have the current one, and then build a separate one called 
Concurrent_OutputCollector or some such. I was hoping they fixed that in 
version 1.0 and I missed it.

From: Stephen Powis <[email protected]<mailto:[email protected]>>
Reply-To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Date: Thursday, April 28, 2016 at 10:58 AM
To: "[email protected]<mailto:[email protected]>" 
<[email protected]<mailto:[email protected]>>
Subject: Re: thread safe output collector

So the Spout documentation (assuming its correct...) here 
(http://storm.apache.org/releases/current/Concepts.html#spouts) mentions this:

"The main method on spouts is nextTuple. nextTuple either emits a new tuple 
into the topology or simply returns if there are no new tuples to emit. It is 
imperative that nextTuple does not block for any spout implementation, because 
Storm calls all the spout methods on the same thread."

When developing a custom spout we interpreted it to mean that any "real work" 
done by a spout should be done in a separate thread, and decided on the 
following pattern which seems some what relevant to what you are trying to do 
in your bolts.

On Spout prepare, we create a concurrent/thread safe queue.  We then create a 
new Thread passing it a reference to our thread safe queue.  This thread 
handles finding new data that needs to be emitted.  When that thread finds 
data, it adds it to the shared queue.  When the spout's nextTuple() method is 
called, it looks for data on the shared queue and emits it.

I imagine doing async processing in a bolt using one or more threads could work 
with a similar pattern.  On prepare you setup your thread(s) with references to 
a shared queue.  The bolt passes work to be completed to the thread(s), the 
thread(s) communicate back to the bolt the result via a shared queue.  Add in 
the concept of tick tuples to ensure your bolt checks for completed work on a 
regular basis?

Is there a better way to do this?

On Thu, Apr 28, 2016 at 11:22 AM, Julien Nioche 
<[email protected]<mailto:[email protected]>> wrote:
Thanks for the clarification

On 28 April 2016 at 15:12, P. Taylor Goetz 
<[email protected]<mailto:[email protected]>> wrote:
The documentation is wrong. See:

https://issues.apache.org/jira/browse/STORM-841

At some point it looks like the change made there got reverted. I will reopen 
it to make sure the documentation is corrected.

OutputCollector is NOT thread-safe.

-Taylor

On Apr 28, 2016, at 9:06 AM, Stephen Powis 
<[email protected]<mailto:[email protected]>> wrote:


"Its perfectly fine to launch new threads in bolts that do processing 
asynchronously. 
OutputCollector<http://storm.apache.org/releases/current/javadocs/org/apache/storm/task/OutputCollector.html>
 is thread-safe and can be called at any time."


From the docs for 0.9.6: 
http://storm.apache.org/releases/0.9.6/Concepts.html#bolts

On Thu, Apr 28, 2016 at 9:03 AM, P. Taylor Goetz 
<[email protected]<mailto:[email protected]>> wrote:
IIRC there was discussion about making it thread safe, but I don't believe it 
was implemented.

-Taylor

On Apr 28, 2016, at 3:52 AM, Julien Nioche 
<[email protected]<mailto:[email protected]>> wrote:

Hi Stephen

I asked the same question in February but did not get a reply

https://mail-archives.apache.org/mod_mbox/storm-user/201602.mbox/%3cca+-fm0urpf3fuerozywpzmxu-kdbgf-zj3wbyr8evsaqjc6...@mail.gmail.com%3E

Anyone who could confirm this?

Thanks

On 27 April 2016 at 14:05, Steven Lewis 
<[email protected]<mailto:[email protected]>> wrote:
I have conflicting information, and have not checked personally but has the 
output collector finally been made thread safe for emitting in version 1.0 or 
0.10? I know it was a huge problem in 0.9.5 when trying to do threading in a 
bolt for async future calls and emitting once it returns.

This email and any files transmitted with it are confidential and intended 
solely for the individual or entity to whom they are addressed. If you have 
received this email in error destroy it immediately. *** Walmart Confidential 
***



--

Open Source Solutions for Text Engineering

http://www.digitalpebble.com<http://www.digitalpebble.com/>
http://digitalpebble.blogspot.com/
#digitalpebble<http://twitter.com/digitalpebble>





--

Open Source Solutions for Text Engineering

http://www.digitalpebble.com<http://www.digitalpebble.com/>
http://digitalpebble.blogspot.com/
#digitalpebble<http://twitter.com/digitalpebble>


This email and any files transmitted with it are confidential and intended 
solely for the individual or entity to whom they are addressed. If you have 
received this email in error destroy it immediately. *** Walmart Confidential 
***

Reply via email to