Flink recently published a blog on their networking model [1].
I believe Storm 2.0 does it better in some aspects with the new
threading/back-pressure model and we should come out with something that
explains the new model and the tuning parameters.
- Arun
[1]
The window start time would be based on which window the events falls into
(based on the window length and sliding interval that you set)
In the master branch you can access this start and end timestamps.
See
> I'm aware of some ideas so far: one simple idea is killing topology when
> spout gets all acknowledge messages and data source has no more data.
There would be more work needed to optimize the shuffle and scheduling (batch),
but I agree with minor modifications to the Spout API (like having a
Your use case seems to be a simple ETL (read from a data source and write to a
Sink), which is very well addressed by Storm. With Storm you don’t necessarily
need to split the data into batches, but can continuously load the data into
ES. If your data set is bounded, you can just kill the
I think it would be fine as long as you ensure your map is Immutable or
at-least its not changed after its constructed. I don’t think there would be
any issues with sizes.
From: Toy
Reply-To: "user@storm.apache.org"
Date: Wednesday, November 22,
If Bolt B’s receive queue is full the tuple will be put into an overflow
buffer. If back pressure is not enabled it will keep on filling the overflow
buffer and eventually cause an OOM.
You might also want to see the proposed changes in STORM-2306 where the receive
queues are going to be
.com>
wrote:
> Hi Arun,
>
>
>
> Could you please help me with my questions 2 and 3 if possible?
>
>
>
>
>
> Thanks
>
> Manusha
>
>
>
> *From:* Arun Iyer [mailto:ai...@hortonworks.com] *On Behalf Of *Arun
> Mahadevan
> *Sent:* 11 Au
related methods called by the same bolt or spout thread?
Thanks
Manusha
____
From: Arun Iyer [ai...@hortonworks.com] on behalf of Arun Mahadevan
[ar...@apache.org]
Sent: Monday, July 24, 2017 2:29 PM
To: user@storm.apache.org
Subject: Re: is stateful bolts production re
The tuples are ack-ed only once they fall out of the window. The ‘executed’ are
typically the tuples that are received by the windowed bolt and enqueued for
processing. Once the window moves forward, tuples fall out of the window and
are acked. Since you have set a timestamp field, the window
grouping so that every task of
downstream one get a copy of $CHKPT message. (nothing like that is observed in
topologyBuilder code).
On Sat, Mar 25, 2017 at 2:42 PM, Arun Mahadevan <ar...@apache.org> wrote:
The checkpoint tuples have to go through the same queue and
With a single worker, bolts A & B would be receiving reference to the same
tuple since they are running in the same JVM (splitter bolt emits it once and
the thread/task running A & B get the same instance of the tuple). Now if you
mutate any value in the tuple (collection of scores in your
The checkpoint tuples have to go through the same queue and follow the tuples
emitted before it to make the state consistent across the bolts.
When bolt ‘A’ receives a checkpoint (say C1 from the spout), it saves its state
(of processing the tuples up to C1) and emits ’C1’ to the next bolt
You can take a look at the Yahoo streaming benchmark.
https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
Regards,
Arun
From: Giselle van Dongen
Reply-To: "user@storm.apache.org"
Date:
apache.org>
Subject: Re: Converting storm tuple to bytearray
But suppose I want to replay value in the tuple to the older taskID or StreamID
etc. all those details will be lost . (I am doing this for replaying tuples
after state migration.)
On Tue, Mar 21, 2017 at 9:31 AM, Arun Mahadevan
Storm uses Kryo to serialize Tuples. Check this
https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java
Instead of serializing the entire tuple yourself may be you just want to
serialize the relevant values within the tuple.
This is expected with in-memory state, which stores the state in a local hash
map and is not intended for any real use cases. And I don’t think there is any
value in serializing the in-memory state during rebalance. How would you
resurrect the state if the task gets reassigned to a different
7 14:39,shawn.du<shawn...@neulion.com.cn> wrote:
Hi Arun,
Thanks your reply. I have read the document your mentioned.
For current storm 1.0.2 only provide a simple KeyValueState interface, We are
still considering whether it can fulfill our requirements.
Thanks
Shawn
On 02/7/201
> 1- Does scaling up/down using rebalance will maintain the state in case of
> stateful task proposed in storm 1.0.2.
The current rebalance and state logic takes care of this. However, if we do
dynamic task scaling in future, the state migration also needs to be taken care
of.
Thanks,
state correlates to <key,val> pair in these stateful tasks
mentioned in the links. Is it separate for every task ?
2- Whats is the impact fo rebalance operation on these bolts.
On Tue, Feb 7, 2017 at 11:40 AM, Arun Mahadevan <ar...@apache.org> wrote:
> #1 when
> #1 when a worker is died or killed by manually, storm framework will restart
> this worker, is there a ID which doesn't change for the new worker and the
> died worker? if there is, how to get it?
Task Id does not change. The worker can be restarted in the same or different
host and the
There is an example here -
https://github.com/apache/storm/blob/master/examples/storm-starter/src/jvm/storm/starter/StatefulWindowingTopology.java.
Also checkout the “Stateful windowing” section under
?
On Mon, Jan 23, 2017 at 1:27 PM, Arun Mahadevan <ar...@apache.org> wrote:
> builder.setBolt("MyBolt", new MyBolt(), 4).shuffleGrouping("MySpout"); i
> found this example but couldn't know why he use number 4 ?
This is the “parallelism hint” (the number
he bolt doing 2 process
so can i do like this
builder.setSpout("MySpout", new mySpout(), 1);
builder.setBolt("MyBolt", new MyBolt(), 4).shuffleGrouping("MySpout"); i found
this example but couldn't know why he use number 4 ?
On Mon, Jan 23, 2017 at 1:13 PM, sam mohe
Grouping makes sense only when you have more than one task for a bolt. If your
bolt has more than one task, then the grouping will decide how the tuples from
the spout are distributed to the individual tasks of the bolt. (shuffe =
random, fields = keyed on some field and so on).
See
> A CheckpointSpout is active and checkpoints the state like it should, but the
> state isn't synchronized between the instances of the StatefulBolt.
The state is private to the bolt instance (task). Each bolt task’s state is
independent and is not synchronized between the instances of the
Hi Abhishek,
Right now delete/clear is not supported, so you need to workaround this by
putting ‘null’ values.
Clear/delete will be added in future. There’s a pending patch to support delete
(https://github.com/apache/storm/pull/1470).
Thanks,
Arun
From: Abhishek Raj
MemoryMapState is more for testing and does not provide any persistence. It
uses a HashMap internally. If you want persistence you need use the one based
on redis or other.
Thanks,
Arun
From: Dinesh Babu K G
Reply-To: "user@storm.apache.org"
length + sliding
>interval)), right?
>
>Best regards,
>Balazs
>
>On 08/24/2016 05:31 PM, Arun Mahadevan wrote:
>> Hi Balazs,
>>
>>
>>
>> Tuples are acked only when the window slides and the events fall out of the
>> window.
>>
>> So
Hi Balazs,
Tuples are acked only when the window slides and the events fall out of the
window.
So max.spout.pending should be more than max number of tuples in window length
+ sliding interval.
Thanks,
Arun
On 8/24/16, 8:33 PM, "Balázs Kossovics" wrote:
>Hello,
If I understand correctly, the issue is that your trident topology queries the
same state that’s being updated to compute the result. You can control the
number of batches that trident processes simultaneously by adjusting the value
of “topology.max.spout.pending”, if you set it to 1 the
elism increase, it is possible to have the state maintained
in a different bolt instance right ?
Any suggestion on handling such cases ?
Thanks,
Jins George
On Mon, Jul 18, 2016 at 9:17 PM, Arun Mahadevan <ar...@apache.org> wrote:
Each bolt instance (task) has its own state, so in your
It may be a bug. You can raise an issue here -
https://issues.apache.org/jira/browse/STORM
If you just want count based windows (without accounting for event
time/watermarks) you don’t need to set withTimestampField() and it should work.
Thanks,
Arun
From: Lorenzo Affetti
Hi Cody,
Tumbling window can be either count or time based, not both.
I think what you want is a count based window with a time based sliding
interval. The window will activate every sliding interval and will give you the
last “count” events. You can use the “getNew” if you want only the new
POLGY_MESSAGE_TIMEOUT)?
Many thanks for your help.
Olivier.
On Tue, May 17, 2016 at 2:12 PM, Arun Mahadevan <ar...@apache.org> wrote:
Hi Oliver,
The state checkpointing currently does not checkpoint the state of the spout.
It checkpoints the states of all the bolts and once that’s
Hi Oliver,
The state checkpointing currently does not checkpoint the state of the spout.
It checkpoints the states of all the bolts and once that’s successful, the
tuples emitted by the spout are acked. So currently it provides at-least once
guarantee.
In the ack method of the spout, you can
rds,
Alex
On Apr 15, 2016 11:16 AM, "Arun Mahadevan" <ar...@apache.org> wrote:
Ah, I see what you mean. The “setBolt” method without parallelism hint is not
overloaded for stateful bolts so if parallelism hint is not specified it ends
up as being normal bolt. Will raise a JIRA for
red tuples (that way the the exception was complaining)?
I look forward for your answers,
Florin
On Fri, Apr 15, 2016 at 12:16 PM, Arun Mahadevan <ar...@apache.org> wrote:
Ah, I see what you mean. The “setBolt” method without parallelism hint is not
overloaded for stateful bolts so if paralle
Bolt method overload by
mistake, since stateful bolts are supertypes of stateless ones.
Regards
Alex
On Apr 15, 2016 10:54 AM, "Arun Mahadevan" <ar...@apache.org> wrote:
Its the same method (builder.setBolt) that adds stateful bolts to a topology.
Heres an example -
https://github.c
f all ids?
Thank you!
Filipa
From: Arun Iyer [mailto:ai...@hortonworks.com] On Behalf Of Arun Mahadevan
Sent: 2 de abril de 2016 20:33
To: user@storm.apache.org
Subject: Re: Storm 1.0.0 Windowing by id
Hi Filipa,
Yes, you could have a separate stream per id and have a separate wind
Hi Daniela,
> Okay, could I do the grouping already in Kafka? For example would it be
> possible to use one topic per region or to use one topic with a partition for
> every region? Then the messages would already be grouped when the arrive at
> Storm. Is this correct?
You would need a kafka
Hi Filipa,
Yes, you could have a separate stream per id and have a separate windowed bolt
subscribe to the corresponding stream.
The other option is to do the id based grouping within the windowed bolt each
time its "execute" method is triggered with the tuples in the current window.
Thanks,
Hi Alexander,
Can you turn on debug logs and see if the logs have any more information ? What
is your topology like ?
You might want to file a JIRA and upload the debug logs.
Thanks,
Arun
From: Alexander T
Reply-To: "user@storm.apache.org"
Date: Thursday, March 10, 2016 at 7:28 PM
To:
lt, and if I replace
it with a simple stateful bolt, the acking starts working as expected again. Is
there support for non-stateful windowed bolts?
Best regards
Alexander
On Mon, Mar 7, 2016 at 12:16 PM, Arun Mahadevan <ar...@apache.org> wrote:
Hi Alexander,
You are right, the acking for the non-st
use bolts which were written for
stateless topologies in stateful ones. Are there any plans of adapters or to
change the interface so that they can interoperate?
Best regards
Alexander
On Fri, Mar 4, 2016 at 7:26 PM, Arun Mahadevan <ar...@apache.org> wrote:
Hi Alexander,
The simple topo
ght have missed?
Regards,
Alexander
On Mar 4, 2016 5:30 PM, "Arun Mahadevan" <ar...@apache.org> wrote:
Hi Alexander,
For a stateful topology the anchoring and acking is automatically taken care
of.
Can you check if any of your bolts inherit BaseBasicBolt or if you are manual
45 matches
Mail list logo