Thanks for all the help. :)
On Wed, Sep 21, 2016 at 11:56 AM, Harsh Choudhary wrote:
It is real-time. I get streaming JSONs from Kafka.
On Wed, Sep 21, 2016 at 4:15 AM, Ambud Sharma wrote:
Is this real-time or batch?
If batch this is perfect for MapReduce or Spark.
If real-time then you should use Spark or Storm Trident.
On Sep 20, 2016 9:39 AM, "Harsh Choudhary" wrote:
My use case is that I have a JSON which contains an array. I need to split
that array into multiple JSONs and do some computations on each of them.
After that, the results from each JSON have to be used together in a further
calculation to come up with the final result.
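The split-and-recombine flow described above can be sketched in plain Python, outside Storm; the `items` field name, the `batch_id` key, and the per-record computation are placeholder assumptions, not part of the original question:

```python
import json

# Hypothetical input: one streaming JSON message whose "items" field
# holds the array to be split (the field names are assumptions).
message = '{"batch_id": 7, "items": [{"v": 2}, {"v": 3}, {"v": 5}]}'

def split(msg: str):
    """Split one JSON message into one document per array element."""
    doc = json.loads(msg)
    for item in doc["items"]:
        # Carry the parent id so partial results can be matched up later.
        yield {"batch_id": doc["batch_id"], "record": item}

def compute(record: dict) -> int:
    """Placeholder per-record computation (here: square the value)."""
    return record["record"]["v"] ** 2

# Fan out, compute per record, then combine all partial results.
partials = [compute(r) for r in split(message)]
final_result = sum(partials)  # stand-in for the "further calculation"
print(final_result)  # 4 + 9 + 25 = 38
```

In a topology, `split` would live in one bolt that emits one tuple per array element, and the recombination would happen in a downstream aggregating bolt.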
What's your use case?
The complexities can be managed as long as your tuple branching is
reasonable, i.e. one tuple creates several other tuples and you need to sync
results between them.
On Sep 20, 2016 8:19 AM, "Harsh Choudhary" wrote:
You're right. For that I have to manage the queue and all those
complexities of timeouts. If Storm is not the right place to do this, then
what else?
On Tue, Sep 20, 2016 at 8:25 PM, Ambud Sharma wrote:
The correct way is to perform time window aggregation using bucketing.
Use the timestamp on your event, computed from the various stages, and send
it to a single bolt where the aggregation happens. You only emit from this
bolt once you receive results from both parts.
It's like creating a barrier or
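A minimal sketch of that barrier/bucketing idea, independent of Storm (the branch names and event keys here are hypothetical, and a real bolt would also expire stale buckets on a tick-tuple timeout):

```python
from collections import defaultdict

class BarrierAggregator:
    """Hold partial results per event key and release (emit) only once
    every expected branch has reported -- the "barrier" described above."""

    def __init__(self, expected_branches):
        self.expected = set(expected_branches)
        self.buckets = defaultdict(dict)  # event_key -> {branch: result}

    def accept(self, event_key, branch, result):
        """Store one branch's result; return the merged results when the
        barrier for this key is complete, else None."""
        self.buckets[event_key][branch] = result
        if set(self.buckets[event_key]) == self.expected:
            return self.buckets.pop(event_key)  # emit once, then forget
        return None

agg = BarrierAggregator(expected_branches=["bolt_a", "bolt_b"])
assert agg.accept("evt-1", "bolt_a", 10) is None   # still waiting
done = agg.accept("evt-1", "bolt_b", 20)           # barrier complete
print(done)  # {'bolt_a': 10, 'bolt_b': 20}
```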
But how would that solve the syncing problem?
On Tue, Sep 20, 2016 at 8:12 PM, Alberto São Marcos wrote:
I would dump the *Bolt-A* results in a shared-data-store/queue and have a
separate workflow with another spout and Bolt-B draining from there
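A toy sketch of that handoff, using Python's stdlib `queue` as a stand-in for the shared data store (in practice this role would be played by Kafka, Redis, or similar; all names here are illustrative):

```python
import queue

# Stand-in for the shared data store / queue between the two workflows.
shared = queue.Queue()

def bolt_a(event):
    """First workflow: compute a partial result and hand it off."""
    shared.put({"id": event["id"], "partial": event["value"] * 2})

def bolt_b_drain():
    """Second workflow: its spout drains the queue and Bolt-B finishes."""
    results = []
    while not shared.empty():
        item = shared.get()
        results.append((item["id"], item["partial"] + 1))
    return results

bolt_a({"id": "e1", "value": 3})
bolt_a({"id": "e2", "value": 4})
print(bolt_b_drain())  # [('e1', 7), ('e2', 9)]
```

The design trade-off is the one raised in the reply above: decoupling via a store removes in-topology syncing, but shifts the matching and timeout problems to the second workflow.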
On Tue, Sep 20, 2016 at 9:20 AM, Harsh Choudhary wrote:
Hi

I am thinking of doing the following.

Spout subscribed to