Reusing Bolts in Trident

2014-05-01 Thread Tarek Amr
Hello, In one of the projects I am working on, we use a basic topology that includes a spout that collects some real-time updates; we then do some processing on that stream in a number of intermediate bolts before saving the results to the database. Now, we want to generate some statistics for this

Re: Question on Acking

2014-05-01 Thread Nishu
Hi, I am also currently working with KafkaSpout, but my bolt is not emitting any messages. The Kafka topic has various messages. When I consume messages with the Kafka consumer on the terminal, it shows all the messages. But while executing the topology, I get the following logs: 68503 [Thread-18-spout] INFO

Re: Machine specs

2014-05-01 Thread Software Dev
Seems like all of these setups involve a small number of CPUs? Does Storm typically require more RAM than CPU, i.e., which is usually the bottleneck? On Wed, Apr 30, 2014 at 8:54 PM, Michael Rose mich...@fullcontact.com wrote: In AWS, we're fans of c1.xlarges, m3.xlarges, and c3.2xlarges, but

Topologies are disappearing??? How to debug?

2014-05-01 Thread Software Dev
Over the last several days some/all of our topologies are disappearing from Nimbus. What are some possible explanations for this? Where should I look to debug this problem? Thanks

Re: Topologies are disappearing??? How to debug?

2014-05-01 Thread Derek Dagit
Make sure you do not have a second nimbus daemon running by accident. I saw this one time after someone had launched nimbus on a different host, yet the file system on which nimbus was storing its state was an NFS mount. It took a comically long time to figure out that the second remote

Best practice for shutting down storm

2014-05-01 Thread P Ghosh
I have a few topologies running. The spout puts the ID of the object it is emitting into a WIP list in Redis. When the spout's ack or fail method is called, it takes the ID out of the WIP list. The environment and application are undergoing a lot of changes, and as a result I'm required to

Re: Machine specs

2014-05-01 Thread Cody A. Ray
I hate to give this answer, but I think it really depends on your application. If you're doing distributed machine learning or video compression or something that's CPU heavy, then it'll be CPU heavy. If you're doing pre-aggregation or rolling windows or other CPU-light analysis, you're more

Re: Topologies are disappearing??? How to debug?

2014-05-01 Thread Software Dev
That actually may be the issue as we had some other employees mess around with our Nimbus vagrant setup. On Thu, May 1, 2014 at 12:00 PM, Derek Dagit der...@yahoo-inc.com wrote: Make sure you do not have a second nimbus daemon running by accident. I saw this one time after someone had launched

Re: Best practice for shutting down storm

2014-05-01 Thread Nathan Leung
Hi Prasun, Acks and fails should continue to be handled. For step 2 I would consider adding a timeout just in case. -Nathan On Thu, May 1, 2014 at 3:36 PM, Prasun Ghosh prasun_gh...@apple.com wrote: Thanks Nathan, So, my shutdown script should be 1. Deactivate the topology 2. Wait for
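The advice to put a timeout on the waiting step can be sketched as a bounded polling loop. This is only an illustration under stated assumptions: the preview is cut off before it says what step 2 waits for, so `pendingCount` is a hypothetical supplier (for example, the size of the WIP list mentioned earlier in this thread), and `DrainWait` / `awaitDrained` are made-up names, not part of Storm.

```java
import java.util.function.IntSupplier;

final class DrainWait {
    private DrainWait() {}

    /**
     * Polls pendingCount (hypothetical: e.g. the number of in-flight IDs)
     * until it reports zero or timeoutMillis has elapsed.
     * Returns true if everything drained, false if the wait timed out.
     */
    static boolean awaitDrained(IntSupplier pendingCount, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (pendingCount.getAsInt() > 0) {
            // Overflow-safe comparison against the deadline.
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up; the caller decides whether to kill anyway
            }
            Thread.sleep(pollMillis);
        }
        return true;
    }
}
```

A shutdown script could call something like `awaitDrained(...)` after deactivating the topology and proceed to the next step either when it returns true or when the timeout expires.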

block rather than sleep until local cluster is in business?

2014-05-01 Thread Eelco Hillenius
Hi, I'm writing unit tests for some Storm code and have been trying to find a way to wait for LocalCluster to be initialized. Is there a straightforward way to do that? Instead of letting the current thread sleep for a little, which is what I see a lot in examples, I'd like to block (e.g. using a
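One general way to block instead of sleeping is a `java.util.concurrent.CountDownLatch` that the code under test releases once it is running. The sketch below is not a LocalCluster API; `ReadySignal`, `markReady`, and `awaitReady` are hypothetical names, and it assumes the test controls some component (for instance a test spout or bolt) that can call `markReady()` when it starts.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class ReadySignal {
    // Released exactly once, when the component under test reports it is up.
    private final CountDownLatch ready = new CountDownLatch(1);

    /** Called by the component under test once it has started (assumption). */
    void markReady() {
        ready.countDown();
    }

    /** Blocks until markReady() has been called; returns false on timeout. */
    boolean awaitReady(long timeout, TimeUnit unit) throws InterruptedException {
        return ready.await(timeout, unit);
    }
}
```

The test thread would call `awaitReady(...)` with a generous timeout, so a cluster that never comes up fails the test quickly rather than hanging it.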

Re: RedStorm: How to Add More than 1 JRubyShellBolt to a Topology

2014-05-01 Thread Fred Miller
The solution ... thanks to Jess Hottenstein:

    bolt JRubyShellBolt, ["python", "splitsentence.py"] do
      output_fields "word"
      source SimpleSpout, :shuffle
    end
    bolt JRubyShellBolt, ["python", "splitsentence2.py"], :id => :sentence2 do
      output_fields "word2"
      source SimpleSpout, :shuffle

Question about OpaqueTridentKafkaSpout

2014-05-01 Thread Ashok Gupta
Hi, I have a theoretical question about the guarantees OpaqueTridentKafkaSpout provides. I would like to use an example to illustrate the question. Suppose a batch with txId 10 has tuples t1, t2, t3, t4, which come from Kafka partitions p1, p2, p3, p4, respectively. When this batch

partitionPersist Example

2014-05-01 Thread Dan
Is there an example of using partitionPersist to persist data to ElasticSearch and/or Cassandra? Preferably something slightly more complicated than the canonical WordCount example. Much obliged, Dan Cieslak