Dynamically changing running topology configuration.

2015-05-29 Thread Nipur Patodi
Hi All, I need to scale out a running topology as per load. We can use the Storm rebalance command to achieve this to some extent. What are the possible ways to dynamically change the configuration of a running topology for parameters like the number of tasks executing? Can we achieve this with a Storm signal? Pl

Re: Supervisor believes worker has not started.

2015-05-29 Thread Jeffery Maass
When you look at the worker logs, do some of the workers sometimes kill themselves because there is a missing stormconf.ser file? If so, grab that error message and have fun googling. Some will say those problems went away with the latest release. Apparently, it is complicated. My best advice is

Re: Global Count - Trident - Please help

2015-05-29 Thread Ashish Soni
Can you point me to the example? I am not able to understand what you mean by partitioning the data accordingly. If I have a 3-node Storm cluster and I want to do a global count, how will it work? Please explain if possible. Regards On Fri, May 29, 2015 at 12:04 PM, Andrew Xor wrote: > I am guessi

Supervisor believes worker has not started.

2015-05-29 Thread Grant Overby (groverby)
Supervisor is reporting that the worker “still hasn’t started” and eventually kills and restarts the worker. However, the worker has started and is processing tuples. This repeats indefinitely. Debugging steps?

Re: No logs directory in apache storm on Mavericks

2015-05-29 Thread Susheel Kumar Gadalay
You mentioned that the storm script has the logs directory set to "./logs". It will then create the logs directory in the current working directory from which you run the storm command. On 5/29/15, Jeffery Maass wrote: > Maybe if you create the logs directory, then restart the processes? > > Thank you f

Re: Global Count - Trident - Please help

2015-05-29 Thread Andrew Xor
I am guessing that as you currently do it you spawn the different tasks counting the same thing, hence you are basically reading the data twice. I suspect if you set a parallelism hint of 3,4 and so on you would get 21 x that number. To do a global count you need to partition the data accordingly,

Re: Global Count - Trident - Please help

2015-05-29 Thread P. Taylor Goetz
Try moving “.parallelismHint(2)” to after the groupBy. With the current placement (before the groupBy) Storm is creating two instances of your spout, each outputting the same data set. -Taylor On May 29, 2015, at 11:09 AM, Ashish Soni wrote: > HI All , > > I am trying to run a global count
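
The diagnosis above can be illustrated without Storm. What follows is a plain-Python sketch, not Trident code: `CALLS`, `count_with_duplicated_sources`, and `count_with_partitioned_workers` are hypothetical names invented for this illustration. It only shows why counts double when two sources each replay the same data set, and why they stay correct when the data is read once and the parallelism applies to the grouped counting stage.

```python
from collections import Counter

# Hypothetical sample data: one record per call, keyed by phone number.
CALLS = ["555-0100", "555-0101", "555-0100", "555-0102", "555-0100"]


def count_with_duplicated_sources(records, source_instances):
    """Each source instance replays the full data set, so every record
    is counted once per instance."""
    counts = Counter()
    for _ in range(source_instances):
        counts.update(records)
    return counts


def count_with_partitioned_workers(records, workers):
    """The data is read once; records are split across workers by key,
    counted locally, and the partial counts are merged at the end."""
    partials = [Counter() for _ in range(workers)]
    for record in records:
        # The same key always lands on the same worker within one run.
        partials[hash(record) % workers][record] += 1
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

With two duplicated sources every count is twice the true value, while the partitioned variant returns the same totals regardless of the number of workers.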

Global Count - Trident - Please help

2015-05-29 Thread Ashish Soni
Hi All, I am trying to run a global count using Trident, and when I use a parallelism hint of 2 the result is double counted. Please tell me what I am doing wrong; below is the code and a sample data set. I am just trying to count the number of calls made by a particular phone number, and when I do not specify

Re: Newbie Question: Can two different bolts subscribe to each other??

2015-05-29 Thread Andrew Xor
Another way would be to create a direct stream between the two, which would accomplish what you suggest. If you elect not to use Trident, which simplifies a lot of this, then look to the implementation of CoordinatedBolt for inspiration.

Re: No logs directory in apache storm on Mavericks

2015-05-29 Thread Jeffery Maass
Maybe if you create the logs directory, then restart the processes? Thank you for your time! + Jeff Maass linkedin.com/in/jeffmaass stackoverflow.com/users/373418/maassql + On Fri, May 29, 2015 at 12:42 AM, Abhishek Raj wrote: > With nimbus the values

Re: Supervisor repeatedly killing worker

2015-05-29 Thread Jeffery Maass
Set logging to info level. The reason is explained every time. Sorry, I don't have any examples. You have to look at 3 logs: * nimbus - will say that it is killing a task/executor. As I recall, you have to figure out that the task/executor links up to the supervisor. * supervisor - will say

Re: tuple size limitation?

2015-05-29 Thread Carlos Perelló Marín
Right, that's why I don't understand why serializing the tuple with JSON before emitting it fixes the issue. If the whole message is going to be serialized with JSON anyway, I would expect it to work. (I'm ignoring the JSON encoding/decoding performance, just talking about functionality). Also, the

Dynamic rebalancing and reconfiguration

2015-05-29 Thread Nipur Patodi
Hi All, I know that we have the Storm rebalance command available to rebalance a running topology as per the number of tasks per component assigned while submitting the topology (source). My question is: if I want to reconfig

Re: Newbie Question: Can two different bolts subscribe to each other??

2015-05-29 Thread Nathan Leung
This is possible but if you need to do this on a per tuple basis I would consider doing it in the spout ack method. If you are doing batches I would consider using trident. On May 29, 2015 8:28 AM, "Michail Toutoudakis" wrote: > I would like to ask if it is possible two different bolts to subscri

Re: tuple size limitation?

2015-05-29 Thread Nathan Leung
The default (and in old releases ONLY) multilang serializer is JSON, which is in fact slow. On May 29, 2015 8:04 AM, "Andrew Xor" wrote: > I think in the storm documentation it clearly says that not only you have > to serialize your objects but when using custom types it is better to > implemen

Newbie Question: Can two different bolts subscribe to each other??

2015-05-29 Thread Michail Toutoudakis
I would like to ask if it is possible for two different bolts to subscribe to each other on specific streams. I would like to do this for sync purposes. For example, suppose we have bolt1, which initializes class one and runs in 4 instances. When each bolt1 instance finishes its task it sends an

Re: Throughput : local mode is faster than cluster mode

2015-05-29 Thread Denis DEBARBIEUX
Dear all, I investigated a little more. I guess that the partitionBy primitive has a large impact on my performance, since my data are not well balanced (70% of my tuples are assigned to a single task). It looks like cluster mode takes a lot of time to route the data. Denis On 28/05/2015 21:28, Jef
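
The effect of key skew on a key-based partitioner can be sketched in plain Python. This is an illustration under stated assumptions, not Storm's actual routing: the CRC32 hash, the names `task_loads` and `busiest_share`, and the sample keys are all hypothetical. The point is only that when 70% of tuples share one key, one task receives at least 70% of the work no matter how many tasks exist.

```python
import zlib
from collections import Counter


def task_loads(keys, num_tasks):
    """Assign each key to a task by hashing it, and count tuples per task.
    CRC32 is used only as a stand-in for a deterministic hash."""
    loads = Counter()
    for key in keys:
        task = zlib.crc32(key.encode("utf-8")) % num_tasks
        loads[task] += 1
    return loads


def busiest_share(keys, num_tasks):
    """Fraction of all tuples handled by the most loaded task."""
    loads = task_loads(keys, num_tasks)
    return max(loads.values()) / len(keys)


# Hypothetical workload: 70 tuples share one hot key, 30 have distinct keys.
SKEWED_KEYS = ["hot"] * 70 + [f"key-{i}" for i in range(30)]
```

Calling `busiest_share(SKEWED_KEYS, n)` for increasing `n` shows that adding tasks does not reduce the share of the task that owns the hot key.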

Re: tuple size limitation?

2015-05-29 Thread Andrew Xor
I think the Storm documentation clearly says that not only do you have to serialize your objects, but when using custom types it is better to implement your own serializer to avoid the "native" serializer, which is quite slow. I have not used Storm multi-lang though, to be honest. Regards. On Fri, May 29,

Re: tuple size limitation?

2015-05-29 Thread Carlos Perelló Marín
Found the problem... I'm not serializing the JSON object, so when I call emit, it's a Python dictionary. It works most of the time, but for some reason we found several values that break it. I'm not 100% sure it's not a problem with Storm's multilang support, given that the emit ends up doing a json.du
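
The workaround described in this message, serializing the dictionary to a string before it is emitted, might look roughly like the sketch below. It uses only Python's standard `json` module; the helper names `to_tuple_field` and `from_tuple_field` and the sample payload are hypothetical, and the actual emit call is left out because it depends on the multilang module in use. The thread does not establish why this avoids the failure, so treat it as an illustration of the workaround, not an explanation of the cause.

```python
import json


def to_tuple_field(payload):
    """Serialize a dictionary to a JSON string so that the tuple field
    handed to emit is a plain string, not a nested structure."""
    # ensure_ascii=True escapes non-ASCII characters as \uXXXX sequences.
    return json.dumps(payload, ensure_ascii=True, sort_keys=True)


def from_tuple_field(field):
    """Decode the JSON string back into a dictionary in the receiving bolt."""
    return json.loads(field)
```

The receiving side then calls `from_tuple_field` on the string field to recover the original dictionary.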

How to configure multi-node Apache Storm cluster

2015-05-29 Thread Chun Yuen Lim
https://stackoverflow.com/questions/30525661/how-to-configure-multi-node-apache-storm-cluster