Re: Apache Storm Twitter account

2019-06-17 Thread Arun Mahadevan
Flink recently published a blog on their networking model [1]. I believe Storm 2.0 does it better in some aspects with the new threading/back-pressure model and we should come out with something that explains the new model and the tuning parameters. - Arun [1]

Re: when are the storm window start?

2018-12-05 Thread Arun Mahadevan
The window start time would be based on which window the events falls into (based on the window length and sliding interval that you set) In the master branch you can access this start and end timestamps. See

Re: Batch processing.

2018-07-05 Thread Arun Mahadevan
> I'm aware of some ideas so far: one simple idea is killing topology when > spout gets all acknowledge messages and data source has no more data. There would be more work needed to optimize the shuffle and scheduling (batch), but I agree with minor modifications to the Spout API (like having a

Re: Batch processing.

2018-07-05 Thread Arun Mahadevan
Your use case seems to be a simple ETL (read from a data source and write to a Sink), which is very well addressed by Storm. With Storm you don’t necessarily need to split the data into batches, but can continuously load the data into ES. If your data set is bounded, you can just kill the

Re: Can I emit a Map?

2017-11-21 Thread Arun Mahadevan
I think it would be fine as long as you ensure your map is Immutable or at-least its not changed after its constructed. I don’t think there would be any issues with sizes. From: Toy Reply-To: "user@storm.apache.org" Date: Wednesday, November 22,

Re: Behavior of Storm when buffers fill

2017-09-26 Thread Arun Mahadevan
If Bolt B’s receive queue is full the tuple will be put into an overflow buffer. If back pressure is not enabled it will keep on filling the overflow buffer and eventually cause an OOM. You might also want to see the proposed changes in STORM-2306 where the receive queues are going to be

Re: is stateful bolts production ready?

2017-08-14 Thread Arun Mahadevan
.com> wrote: > Hi Arun, > > > > Could you please help me with my questions 2 and 3 if possible? > > > > > > Thanks > > Manusha > > > > *From:* Arun Iyer [mailto:ai...@hortonworks.com] *On Behalf Of *Arun > Mahadevan > *Sent:* 11 Au

Re: is stateful bolts production ready?

2017-08-11 Thread Arun Mahadevan
related methods called by the same bolt or spout thread? Thanks Manusha ____ From: Arun Iyer [ai...@hortonworks.com] on behalf of Arun Mahadevan [ar...@apache.org] Sent: Monday, July 24, 2017 2:29 PM To: user@storm.apache.org Subject: Re: is stateful bolts production re

Re: Why The BaseWindowedBolt Excuted 2,428,080 But Only acked 282,380

2017-07-03 Thread Arun Mahadevan
The tuples are ack-ed only once they fall out of the window. The ‘executed’ are typically the tuples that are received by the windowed bolt and enqueued for processing. Once the window moves forward, tuples fall out of the window and are acked. Since you have set a timestamp field, the window

Re: Delay in CHKPT message for stateful task

2017-03-28 Thread Arun Mahadevan
grouping so that every task of downstream one get a copy of $CHKPT message. (nothing like that is observed in topologyBuilder code). On Sat, Mar 25, 2017 at 2:42 PM, Arun Mahadevan <ar...@apache.org> wrote: The checkpoint tuples have to go through the same queue and

Re: Tuple processing

2017-03-26 Thread Arun Mahadevan
With a single worker, bolts A & B would be receiving reference to the same tuple since they are running in the same JVM (splitter bolt emits it once and the thread/task running A & B get the same instance of the tuple). Now if you mutate any value in the tuple (collection of scores in your

Re: Delay in CHKPT message for stateful task

2017-03-25 Thread Arun Mahadevan
The checkpoint tuples have to go through the same queue and follow the tuples emitted before it to make the state consistent across the bolts. When bolt ‘A’ receives a checkpoint (say C1 from the spout), it saves its state (of processing the tuples up to C1) and emits ’C1’ to the next bolt

Re: Benchmarking streaming technologies

2017-03-23 Thread Arun Mahadevan
You can take a look at the Yahoo streaming benchmark. https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at Regards, Arun From: Giselle van Dongen Reply-To: "user@storm.apache.org" Date:

Re: Converting storm tuple to bytearray

2017-03-21 Thread Arun Mahadevan
apache.org> Subject: Re: Converting storm tuple to bytearray But suppose I want to replay value in the tuple to the older taskID or StreamID etc. all those details will be lost . (I am doing this for replaying tuples after state migration.) On Tue, Mar 21, 2017 at 9:31 AM, Arun Mahadevan

Re: Converting storm tuple to bytearray

2017-03-20 Thread Arun Mahadevan
Storm uses Kryo to serialize Tuples. Check this https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java Instead of serializing the entire tuple yourself may be you just want to serialize the relevant values within the tuple.

Re: Rebalancing Stateful bolts in storm 1.0.2

2017-02-19 Thread Arun Mahadevan
This is expected with in-memory state, which stores the state in a local hash map and is not intended for any real use cases. And I don’t think there is any value in serializing the in-memory state during rebalance. How would you resurrect the state if the task gets reassigned to a different

Re: implmement state management with Apache Ignite

2017-02-15 Thread Arun Mahadevan
7 14:39,shawn.du<shawn...@neulion.com.cn> wrote: Hi Arun, Thanks your reply. I have read the document your mentioned. For current storm 1.0.2 only provide a simple KeyValueState interface, We are still considering whether it can fulfill our requirements. Thanks Shawn On 02/7/201

Re: Fault tolerance for stateful operator

2017-02-13 Thread Arun Mahadevan
> 1- Does scaling up/down using rebalance will maintain the state in case of > stateful task proposed in storm 1.0.2. The current rebalance and state logic takes care of this. However, if we do dynamic task scaling in future, the state migration also needs to be taken care of. Thanks,

Re: implmement state management with Apache Ignite

2017-02-07 Thread Arun Mahadevan
state correlates to <key,val> pair in these stateful tasks mentioned in the links. Is it separate for every task ? 2- Whats is the impact fo rebalance operation on these bolts. On Tue, Feb 7, 2017 at 11:40 AM, Arun Mahadevan <ar...@apache.org> wrote: > #1 when

Re: implmement state management with Apache Ignite

2017-02-06 Thread Arun Mahadevan
> #1 when a worker is died or killed by manually, storm framework will restart > this worker, is there a ID which doesn't change for the new worker and the > died worker? if there is, how to get it? Task Id does not change. The worker can be restarted in the same or different host and the

Re: Good examples of StatefulWindowedBoltExecutor

2017-01-27 Thread Arun Mahadevan
There is an example here - https://github.com/apache/storm/blob/master/examples/storm-starter/src/jvm/storm/starter/StatefulWindowingTopology.java. Also checkout the “Stateful windowing” section under

Re: simple question about grouping

2017-01-23 Thread Arun Mahadevan
? On Mon, Jan 23, 2017 at 1:27 PM, Arun Mahadevan <ar...@apache.org> wrote: > builder.setBolt("MyBolt", new MyBolt(), 4).shuffleGrouping("MySpout"); i > found this example but couldn't know why he use number 4 ? This is the “parallelism hint” (the number

Re: simple question about grouping

2017-01-23 Thread Arun Mahadevan
he bolt doing 2 process so can i do like this builder.setSpout("MySpout", new mySpout(), 1); builder.setBolt("MyBolt", new MyBolt(), 4).shuffleGrouping("MySpout"); i found this example but couldn't know why he use number 4 ? On Mon, Jan 23, 2017 at 1:13 PM, sam mohe

Re: simple question about grouping

2017-01-23 Thread Arun Mahadevan
Grouping makes sense only when you have more than one task for a bolt. If your bolt has more than one task, then the grouping will decide how the tuples from the spout are distributed to the individual tasks of the bolt. (shuffe = random, fields = keyed on some field and so on). See

Re: Storm BaseStatefulBolt not working

2016-12-15 Thread Arun Mahadevan
> A CheckpointSpout is active and checkpoints the state like it should, but the > state isn't synchronized between the instances of the StatefulBolt. The state is private to the bolt instance (task). Each bolt task’s state is independent and is not synchronized between the instances of the

Re: Stateful bolts

2016-11-03 Thread Arun Mahadevan
Hi Abhishek, Right now delete/clear is not supported, so you need to workaround this by putting ‘null’ values. Clear/delete will be added in future. There’s a pending patch to support delete (https://github.com/apache/storm/pull/1470). Thanks, Arun From: Abhishek Raj

Re: When to use MemoryMapState while performing a persistentAggregate in Trident?

2016-10-26 Thread Arun Mahadevan
MemoryMapState is more for testing and does not provide any persistence. It uses a HashMap internally. If you want persistence you need use the one based on redis or other. Thanks, Arun From: Dinesh Babu K G Reply-To: "user@storm.apache.org"

Re: windowing & max spout pending

2016-09-01 Thread Arun Mahadevan
length + sliding >interval)), right? > >Best regards, >Balazs > >On 08/24/2016 05:31 PM, Arun Mahadevan wrote: >> Hi Balazs, >> >> >> >> Tuples are acked only when the window slides and the events fall out of the >> window. >> >> So

Re: windowing & max spout pending

2016-08-24 Thread Arun Mahadevan
Hi Balazs, Tuples are acked only when the window slides and the events fall out of the window. So max.spout.pending should be more than max number of tuples in window length + sliding interval. Thanks, Arun On 8/24/16, 8:33 PM, "Balázs Kossovics" wrote: >Hello,

Re: Trident HBaseState query and update ordering

2016-07-19 Thread Arun Mahadevan
If I understand correctly, the issue is that your trident topology queries the same state that’s being updated to compute the result. You can control the number of batches that trident processes simultaneously by adjusting the value of “topology.max.spout.pending”, if you set it to 1 the

Re: Question on Storm 1.0 State Management

2016-07-19 Thread Arun Mahadevan
elism increase, it is possible to have the state maintained in a different bolt instance right ? Any suggestion on handling such cases ? Thanks, Jins George On Mon, Jul 18, 2016 at 9:17 PM, Arun Mahadevan <ar...@apache.org> wrote: Each bolt instance (task) has its own state, so in your

Re: About count windows behavior

2016-07-12 Thread Arun Mahadevan
It may be a bug. You can raise an issue here - https://issues.apache.org/jira/browse/STORM If you just want count based windows (without accounting for event time/watermarks) you don’t need to set withTimestampField() and it should work. Thanks, Arun From: Lorenzo Affetti

Re: Tumbling Count and Time-Based WindowedBolt

2016-06-06 Thread Arun Mahadevan
Hi Cody, Tumbling window can be either count or time based, not both. I think what you want is a count based window with a time based sliding interval. The window will activate every sliding interval and will give you the last “count” events. You can use the “getNew” if you want only the new

Re: State Checkpointing & spout state

2016-05-18 Thread Arun Mahadevan
POLGY_MESSAGE_TIMEOUT)? Many thanks for your help. Olivier. On Tue, May 17, 2016 at 2:12 PM, Arun Mahadevan <ar...@apache.org> wrote: Hi Oliver, The state checkpointing currently does not checkpoint the state of the spout. It checkpoints the states of all the bolts and once that’s

Re: State Checkpointing & spout state

2016-05-17 Thread Arun Mahadevan
Hi Oliver, The state checkpointing currently does not checkpoint the state of the spout. It checkpoints the states of all the bolts and once that’s successful, the tuples emitted by the spout are acked. So currently it provides at-least once guarantee. In the ack method of the spout, you can

Re: initState method not invoked in Storm 1.0

2016-04-15 Thread Arun Mahadevan
rds, Alex On Apr 15, 2016 11:16 AM, "Arun Mahadevan" <ar...@apache.org> wrote: Ah, I see what you mean. The “setBolt” method without parallelism hint is not overloaded for stateful bolts so if parallelism hint is not specified it ends up as being normal bolt. Will raise a JIRA for

Re: initState method not invoked in Storm 1.0

2016-04-15 Thread Arun Mahadevan
red tuples (that way the the exception was complaining)? I look forward for your answers, Florin On Fri, Apr 15, 2016 at 12:16 PM, Arun Mahadevan <ar...@apache.org> wrote: Ah, I see what you mean. The “setBolt” method without parallelism hint is not overloaded for stateful bolts so if paralle

Re: initState method not invoked in Storm 1.0

2016-04-15 Thread Arun Mahadevan
Bolt method overload by mistake, since stateful bolts are supertypes of stateless ones. Regards Alex On Apr 15, 2016 10:54 AM, "Arun Mahadevan" <ar...@apache.org> wrote: Its the same method (builder.setBolt) that adds stateful bolts to a topology. Heres an example - https://github.c

Re: Storm 1.0.0 Windowing by id

2016-04-04 Thread Arun Mahadevan
f all ids? Thank you! Filipa From: Arun Iyer [mailto:ai...@hortonworks.com] On Behalf Of Arun Mahadevan Sent: 2 de abril de 2016 20:33 To: user@storm.apache.org Subject: Re: Storm 1.0.0 Windowing by id Hi Filipa, Yes, you could have a separate stream per id and have a separate wind

Re: Aw: Re: Combining group by and time window

2016-04-02 Thread Arun Mahadevan
Hi Daniela, > Okay, could I do the grouping already in Kafka? For example would it be > possible to use one topic per region or to use one topic with a partition for > every region? Then the messages would already be grouped when the arrive at > Storm. Is this correct? You would need a kafka

Re: Storm 1.0.0 Windowing by id

2016-04-02 Thread Arun Mahadevan
Hi Filipa, Yes, you could have a separate stream per id and have a separate windowed bolt subscribe to the corresponding stream. The other option is to do the id based grouping within the windowed bolt each time its "execute" method is triggered with the tuples in the current window. Thanks,

Re: Intermittent error with stateful topology

2016-03-10 Thread Arun Mahadevan
Hi Alexander, Can you turn on debug logs and see if the logs have any more information ? What is your topology like ? You might want to file a JIRA and upload the debug logs. Thanks, Arun From: Alexander T Reply-To: "user@storm.apache.org" Date: Thursday, March 10, 2016 at 7:28 PM To:

Re: Stateful bolts and acking

2016-03-08 Thread Arun Mahadevan
lt, and if I replace it with a simple stateful bolt, the acking starts working as expected again. Is there support for non-stateful windowed bolts? Best regards Alexander On Mon, Mar 7, 2016 at 12:16 PM, Arun Mahadevan <ar...@apache.org> wrote: Hi Alexander, You are right, the acking for the non-st

Re: Stateful bolts and acking

2016-03-07 Thread Arun Mahadevan
use bolts which were written for stateless topologies in stateful ones. Are there any plans of adapters or to change the interface so that they can interoperate? Best regards Alexander On Fri, Mar 4, 2016 at 7:26 PM, Arun Mahadevan <ar...@apache.org> wrote: Hi Alexander, The simple topo

Re: Stateful bolts and acking

2016-03-04 Thread Arun Mahadevan
ght have missed? Regards, Alexander On Mar 4, 2016 5:30 PM, "Arun Mahadevan" <ar...@apache.org> wrote: Hi Alexander, For a stateful topology the anchoring and acking is automatically taken care of. Can you check if any of your bolts inherit BaseBasicBolt or if you are manual