There is no way to have a larger dop (degree of parallelism) than the
number of tasks. If you don't specify the number of tasks, the default
is #tasks = dop. Thus, the only way to increase the dop beyond the
current #tasks is to resubmit the topology.
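
To illustrate the relationship, here is a minimal Python sketch. It is a
toy model, not Storm code: the helper names task_for_key and assign_tasks
are hypothetical, and both the CRC32 hash and the round-robin placement
are assumptions made for the example, not a description of Storm's
scheduler. It only shows why #executors <= #tasks, and why a key keeps
its task when the number of executors changes.

```python
import zlib


def task_for_key(key, num_tasks):
    """Map a key to a task index with a stable hash (toy stand-in for a keyed grouping)."""
    return zlib.crc32(key.encode("utf-8")) % num_tasks


def assign_tasks(num_tasks, num_executors):
    """Spread task ids over executors round-robin (assumed placement, for illustration)."""
    if not 1 <= num_executors <= num_tasks:
        # An executor without a task would have nothing to run.
        raise ValueError("need 1 <= #executors <= #tasks")
    executors = [[] for _ in range(num_executors)]
    for task in range(num_tasks):
        executors[task % num_executors].append(task)
    return executors


if __name__ == "__main__":
    num_tasks = 6  # fixed when the topology is submitted
    print(assign_tasks(num_tasks, 2))  # [[0, 2, 4], [1, 3, 5]]
    print(assign_tasks(num_tasks, 6))  # one task per executor: dop = #tasks
    # The key-to-task mapping depends only on num_tasks, so changing the
    # number of executors never moves a key to a different task.
    print(task_for_key("foo", num_tasks))
```

In this model, going from 2 to 6 executors only regroups the same six
tasks, while asking for 7 executors is rejected. Changing num_tasks
itself would change the result of task_for_key for some keys, which is
the state problem discussed further down the thread.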

-Matthias

On 06/23/2015 03:02 AM, Harshit Gupta wrote:
> Thanks Bobby. 
> 
> The issue of a bolt losing its state looks pretty valid. However, what
> I actually wanted to ask is: what if I don't want to specify the number
> of tasks in the topology? Say I have logic that figures out how many
> instances of each component to run, and that can be done once the
> topology has been submitted. Is there a way of doing that?
> 
> On Tue, Jun 23, 2015 at 5:47 AM, Bobby Evans wrote:
> 
>     The issue with this is the routing of tuples.  Say I want a keyed
>     grouping where a tuple with "foo" in it will always go to the same
>     instance of a bolt.  I don't see how it is possible to go from a
>     situation where I have one bolt instance that has seen all of the
>     tuples up to that point, and has some arbitrary state computed from
>     them, to 2 instances of the bolt.  If I do that, I either have to
>     throw all of the state away for both bolts, which is what
>     redeploying your topology does, or I have to provide a way to
>     checkpoint, split, and combine the state of these bolts.  That is an
>     incredibly difficult problem to solve, especially if the routing is
>     user-pluggable.  Instead we ask you ahead of time what the maximum
>     amount of state partitioning is that you want for each bolt, and
>     then let you potentially run each of these partitions in parallel.
> 
>     I guess we could do something like S4, where every key got a new
>     bolt instance, but they had a lot of issues with checkpointing all
>     of these bolt instances and swapping them out.  They also didn't
>     allow for pluggable groupings.  Everything was keyed grouping.
>      
>     - Bobby
> 
> 
> 
>     On Friday, June 19, 2015 6:35 AM, Matthias J. Sax wrote:
> 
> 
>     Yes. The number of tasks is the maximum parallelism. However, you can
>     have less parallelism than the number of tasks. If you know the
>     maximum number of distinct keys in your data set, you can set the
>     number of tasks accordingly (more parallelism than the number of
>     distinct keys is not possible anyway).
> 
>     -Matthias
> 
> 
>     On 06/19/2015 01:01 PM, Harshit Gupta wrote:
>     > That's the thing. I want to have an arbitrary degree of parallelism.
>     > I don't wish to hard-code it. The current release doesn't allow
>     > that, does it?
>     >
>     > On 19/06/2015 8:55 pm, "Matthias J. Sax" wrote:
>     >
>     >    If the number of tasks is 3, you can have a maximum dop of 3.
>     >
>     >    ->  #executors <= #tasks
>     >
>     >    Have a look here:
>     >
>     >    https://storm.apache.org/documentation/Understanding-the-parallelism-of-a-Storm-topology.html
>     >
>     >    -Matthias
>     >
>     >    On 06/19/2015 12:31 PM, Harshit Gupta wrote:
>     >    > Hi Matthias,
>     >    >
>     >    > Thanks for your reply.
>     >    >
>     >    > Consider this: say the max number of tasks for a bolt B is set
>     >    > to 3, but at some point in time I want to deploy B on 6
>     >    > different machines. How would I do that?
>     >    >
>     >    > I am new to Storm and your answer will improve my understanding
>     >    > of the platform.
>     >    >
>     >    > Thanks a lot.
>     >    >
>     >    > On 19/06/2015 6:59 pm, "Matthias J. Sax" wrote:
>     >    >
>     >    >    Just want to clarify: the number of tasks is not the number of
>     >    >    parallel running bolt instances (those are called executors,
>     >    >    which are threads). So I don't understand why you don't want to
>     >    >    start with the maximum number of tasks. There should be almost
>     >    >    no overhead if you have more tasks than executors (executors can
>     >    >    process multiple tasks, and switching between tasks is
>     >    >    lightweight). Adjusting the number of executors at runtime can
>     >    >    be done without redeploying (-> "rebalance"), giving you the
>     >    >    flexibility you need.
>     >    >
>     >    >    -Matthias
>     >    >
>     >    >    On 06/19/2015 10:09 AM, Nilesh Chhapru wrote:
>     >    >    > Hi Harshit,
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > No, there isn't any way to achieve this without redeploying
>     >    >    > your topology. You may get this feature in an upcoming
>     >    >    > release of Storm, as it is on their roadmap.
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > *Regards*,
>     >    >    >
>     >    >    > *Nilesh Chhapru.*
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > *From:* Harshit Gupta [mailto:[email protected]]
>     >    >    > *Sent:* 19 June 2015 11:43 AM
>     >    >    > *To:* [email protected]
>     >    >    > *Subject:* Fwd: DYNAMIC ADJUSTMENT OF NUMBER OF TASKS
>     >    >    >
>     >    >    >
>     >    >    >
>     >    >    > Hello,
>     >    >    >
>     >    >    > I am working on extending the Storm platform and would like
>     >    >    > to know the scope of dynamically adjusting the number of
>     >    >    > tasks for a topology.
>     >    >    >
>     >    >    > I don't want to work with a worst-case ceiling on the number
>     >    >    > of tasks.
>     >    >    >
>     >    >    > Please let me know whether there is a method for dynamically
>     >    >    > changing the number of tasks without restarting the topology.
>     >    >    >
>     >    >    > Thanks.
>     >    >    >
>     >    >    > --
>     >    >    >
>     >    >    > /With regards,/
>     >    >    >
>     >    >    > * *
>     >    >    >
>     >    >    > *HARSHIT GUPTA*
>     >    >    >
>     >    >    > Fourth Year Undergraduate Student,
>     >    >    >
>     >    >    > Department Of Computer Science And Engineering,
>     >    >    >
>     >    >    > Indian Institute Of Technology, Kharagpur.
>     >    >    >
>     >    >
>     >
> 
> 
> 
> 
> 
> -- 
> /With regards,/
> *HARSHIT GUPTA*
> Fourth Year Undergraduate Student,
> Department Of Computer Science And Engineering,
> Indian Institute Of Technology, Kharagpur.
