[
https://issues.apache.org/jira/browse/STORM-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983491#comment-14983491
]
ASF GitHub Bot commented on STORM-855:
--------------------------------------
GitHub user knusbaum opened a pull request:
https://github.com/apache/storm/pull/838
[STORM-855] Heartbeat Server (Pacemaker)
This pull request redirects worker heartbeats away from ZooKeeper and into
a new server, 'Pacemaker'.
The redirection is accomplished by making `ClusterState` pluggable through
the `ClusterStateFactory` interface.
By default, Pacemaker is not enabled. It can be enabled by setting
`storm.cluster.state.store` from its default value of
`"backtype.storm.cluster_state.zookeeper_state_factory"` to
`"org.apache.storm.pacemaker.pacemaker_state_factory"`
Pacemaker includes both digest-based and Kerberos-based security, but it is
primitive.
Pacemaker is not yet HA. However, if Pacemaker fails, Nimbus will NOT start
killing and reassigning workers, so Pacemaker going down won't bring down a
cluster. It does need to be brought back up before new topologies can be
submitted, though.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/knusbaum/incubator-storm STORM-855
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/storm/pull/838.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #838
----
commit 444ec05e5a9f38f9a9472c54b39f1371c839683b
Author: Kyle Nusbaum <[email protected]>
Date: 2015-10-30T22:21:27Z
PACEMAKER OPEN SOURCE!
----
> Add tuple batching
> ------------------
>
> Key: STORM-855
> URL: https://issues.apache.org/jira/browse/STORM-855
> Project: Apache Storm
> Issue Type: New Feature
> Components: storm-core
> Reporter: Matthias J. Sax
> Assignee: Matthias J. Sax
> Priority: Minor
>
> In order to increase Storm's throughput, multiple tuples can be grouped
> together into a batch of tuples (i.e., a fat-tuple) and transferred from
> producer to consumer at once.
> The initial idea is taken from https://github.com/mjsax/aeolus. However, we
> aim to integrate this feature deeply into the system (in contrast to building
> it on top), which has multiple advantages:
> - batching can be even more transparent to the user (e.g., no extra
> direct-streams needed to mimic Storm's data distribution patterns)
> - fault-tolerance (anchoring/acking) can be done at tuple granularity
> (rather than at batch granularity, which leads to many more replayed tuples
> -- and result duplicates -- in case of failure)
> The aim is to extend the TopologyBuilder interface with an additional
> parameter 'batch_size' to expose this feature to the user (see the sketch
> after this description). By default, batching will be disabled.
> This batching feature serves a pure tuple-transport purpose, i.e.,
> tuple-by-tuple processing semantics are preserved. An output batch is
> assembled at the producer and completely disassembled at the consumer. The
> consumer's output can be batched again, independently of whether its input
> was batched. Thus, batches can be of different sizes for each
> producer-consumer pair. Furthermore, consumers can receive batches of
> different sizes from different producers (including regular, non-batched
> input).
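The description above proposes exposing batching via an extra 'batch_size'
parameter on TopologyBuilder. As a purely hypothetical sketch (this overload is
an assumption, not part of the current Storm API), the user-facing side could
look roughly like this:

```java
import backtype.storm.topology.BoltDeclarer;
import backtype.storm.topology.IRichBolt;

// Hypothetical extension of the TopologyBuilder API (assumption, not the
// actual interface): an extra batchSize argument per bolt.
public interface BatchingTopologyBuilder {
    // Up to batchSize tuples are assembled into one fat-tuple per
    // producer-consumer connection; the batch is disassembled at the
    // consumer, so anchoring/acking still happens per tuple.
    // A batchSize of 1 keeps batching disabled (the default).
    BoltDeclarer setBolt(String id, IRichBolt bolt,
                         Number parallelismHint, int batchSize);
}
```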
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)