Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Running-topologies-on-a-production-cluster.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Running-topologies-on-a-production-cluster.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Running-topologies-on-a-production-cluster.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Running-topologies-on-a-production-cluster.md Thu Mar 17 02:10:04 2016 @@ -2,12 +2,13 @@ title: Running Topologies on a Production Cluster layout: documentation documentation: true +version: v0.10.0 --- Running topologies on a production cluster is similar to running in [Local mode](Local-mode.html). Here are the steps: -1) Define the topology (Use [TopologyBuilder](/javadoc/apidocs/backtype/storm/topology/TopologyBuilder.html) if defining using Java) +1) Define the topology (Use [TopologyBuilder](javadocs/backtype/storm/topology/TopologyBuilder.html) if defining using Java) -2) Use [StormSubmitter](/javadoc/apidocs/backtype/storm/StormSubmitter.html) to submit the topology to the cluster. `StormSubmitter` takes as input the name of the topology, a configuration for the topology, and the topology itself. For example: +2) Use [StormSubmitter](javadocs/backtype/storm/StormSubmitter.html) to submit the topology to the cluster. `StormSubmitter` takes as input the name of the topology, a configuration for the topology, and the topology itself. For example: ```java Config conf = new Config(); @@ -47,7 +48,7 @@ You can find out how to configure your ` ### Common configurations -There are a variety of configurations you can set per topology. A list of all the configurations you can set can be found [here](/javadoc/apidocs/backtype/storm/Config.html). 
The ones prefixed with "TOPOLOGY" can be overridden on a topology-specific basis (the other ones are cluster configurations and cannot be overridden). Here are some common ones that are set for a topology: +There are a variety of configurations you can set per topology. A list of all the configurations you can set can be found [here](javadocs/backtype/storm/Config.html). The ones prefixed with "TOPOLOGY" can be overridden on a topology-specific basis (the other ones are cluster configurations and cannot be overridden). Here are some common ones that are set for a topology: 1. **Config.TOPOLOGY_WORKERS**: This sets the number of worker processes to use to execute the topology. For example, if you set this to 25, there will be 25 Java processes across the cluster executing all the tasks. If you had a combined 150 parallelism across all components in the topology, each worker process will have 6 tasks running within it as threads. 2. **Config.TOPOLOGY_ACKERS**: This sets the number of tasks that will track tuple trees and detect when a spout tuple has been fully processed. Ackers are an integral part of Storm's reliability model and you can read more about them on [Guaranteeing message processing](Guaranteeing-message-processing.html). @@ -74,4 +75,4 @@ To update a running topology, the only o The best place to monitor a topology is using the Storm UI. The Storm UI provides information about errors happening in tasks and fine-grained stats on the throughput and latency performance of each component of each running topology. -You can also look at the worker logs on the cluster machines. \ No newline at end of file +You can also look at the worker logs on the cluster machines.
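The worker/task arithmetic described for **Config.TOPOLOGY_WORKERS** (25 worker processes and a combined parallelism of 150 giving 6 tasks per worker) can be checked with a plain-Java sketch. `TasksPerWorker` is a hypothetical helper for illustration only, not part of the Storm API:

```java
// Sketch of how Storm spreads tasks across worker processes, per the
// TOPOLOGY_WORKERS description above. Hypothetical helper, not Storm code.
public class TasksPerWorker {
    static int tasksPerWorker(int combinedParallelism, int numWorkers) {
        // Tasks are distributed roughly evenly across the worker processes.
        return combinedParallelism / numWorkers;
    }

    public static void main(String[] args) {
        // 150 tasks across 25 workers -> 6 tasks running in each worker.
        System.out.println(tasksPerWorker(150, 25));
    }
}
```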
Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Serialization.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Serialization.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Serialization.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Serialization.md Thu Mar 17 02:10:04 2016 @@ -2,6 +2,7 @@ title: Serialization layout: documentation documentation: true +version: v0.10.0 --- This page is about how the serialization system in Storm works for versions 0.6.0 and onwards. Storm used a different serialization system prior to 0.6.0 which is documented on [Serialization (prior to 0.6.0)](Serialization-\(prior-to-0.6.0\).html). @@ -41,7 +42,7 @@ topology.kryo.register: `com.mycompany.CustomType1` and `com.mycompany.CustomType3` will use the `FieldsSerializer`, whereas `com.mycompany.CustomType2` will use `com.mycompany.serializer.CustomType2Serializer` for serialization. -Storm provides helpers for registering serializers in a topology config. The [Config](/javadoc/apidocs/backtype/storm/Config.html) class has a method called `registerSerialization` that takes in a registration to add to the config. +Storm provides helpers for registering serializers in a topology config. The [Config](javadocs/backtype/storm/Config.html) class has a method called `registerSerialization` that takes in a registration to add to the config. There's an advanced config called `Config.TOPOLOGY_SKIP_MISSING_KRYO_REGISTRATIONS`. If you set this to true, Storm will ignore any serializations that are registered but do not have their code available on the classpath. Otherwise, Storm will throw errors when it can't find a serialization. 
This is useful if you run many topologies on a cluster that each have different serializations, but you want to declare all the serializations across all topologies in the `storm.yaml` files. @@ -59,4 +60,4 @@ Storm 0.7.0 lets you set component-speci When a topology is submitted, a single set of serializations is chosen to be used by all components in the topology for sending messages. This is done by merging the component-specific serializer registrations with the regular set of serialization registrations. If two components define serializers for the same class, one of the serializers is chosen arbitrarily. -To force a serializer for a particular class if there's a conflict between two component-specific registrations, just define the serializer you want to use in the topology-specific configuration. The topology-specific configuration has precedence over component-specific configurations for serialization registrations. \ No newline at end of file +To force a serializer for a particular class if there's a conflict between two component-specific registrations, just define the serializer you want to use in the topology-specific configuration. The topology-specific configuration has precedence over component-specific configurations for serialization registrations. 
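The registration-merging behavior described above (component-specific registrations are merged with the topology-level set, and topology-specific registrations take precedence on conflict) can be modeled in plain Java. This is only an illustrative sketch of the precedence rule, not Storm's actual implementation; the `RegistrationMerge` class and the `OtherSerializer` name are made up for the example, while the `com.mycompany.CustomType2` names come from the registration list shown earlier:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of serializer-registration precedence, not Storm code.
// A null value would stand for the default FieldsSerializer, mirroring how a
// bare class name in topology.kryo.register means "use FieldsSerializer".
public class RegistrationMerge {
    static Map<String, String> merge(Map<String, String> componentRegs,
                                     Map<String, String> topologyRegs) {
        Map<String, String> merged = new HashMap<>(componentRegs);
        // Topology-specific registrations override component-specific ones.
        merged.putAll(topologyRegs);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> component = new HashMap<>();
        component.put("com.mycompany.CustomType2",
                      "com.mycompany.serializer.OtherSerializer"); // hypothetical
        Map<String, String> topology = new HashMap<>();
        topology.put("com.mycompany.CustomType2",
                     "com.mycompany.serializer.CustomType2Serializer");
        // The topology-level registration wins the conflict.
        System.out.println(merge(component, topology)
                               .get("com.mycompany.CustomType2"));
    }
}
```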
Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Serializers.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Serializers.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Serializers.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Serializers.md Thu Mar 17 02:10:04 2016 @@ -1,4 +1,5 @@ --- layout: documentation +version: v0.10.0 --- -* [storm-json](https://github.com/rapportive-oss/storm-json): Simple JSON serializer for Storm \ No newline at end of file +* [storm-json](https://github.com/rapportive-oss/storm-json): Simple JSON serializer for Storm Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Setting-up-a-Storm-cluster.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Setting-up-a-Storm-cluster.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Setting-up-a-Storm-cluster.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Setting-up-a-Storm-cluster.md Thu Mar 17 02:10:04 2016 @@ -2,6 +2,7 @@ title: Setting up a Storm Cluster layout: documentation documentation: true +version: v0.10.0 --- This page outlines the steps for getting a Storm cluster up and running. If you're on AWS, you should check out the [storm-deploy](https://github.com/nathanmarz/storm-deploy/wiki) project. [storm-deploy](https://github.com/nathanmarz/storm-deploy/wiki) completely automates the provisioning, configuration, and installation of Storm clusters on EC2. It also sets up Ganglia for you so you can monitor CPU, disk, and network usage. 
@@ -40,7 +41,7 @@ Next, download a Storm release and extra ### Fill in mandatory configurations into storm.yaml -The Storm release contains a file at `conf/storm.yaml` that configures the Storm daemons. You can see the default configuration values [here](https://github.com/apache/storm/blob/master/conf/defaults.yaml). storm.yaml overrides anything in defaults.yaml. There's a few configurations that are mandatory to get a working cluster: +The Storm release contains a file at `conf/storm.yaml` that configures the Storm daemons. You can see the default configuration values [here](https://github.com/apache/storm/blob/{{page.version}}/conf/defaults.yaml). storm.yaml overrides anything in defaults.yaml. There's a few configurations that are mandatory to get a working cluster: 1) **storm.zookeeper.servers**: This is a list of the hosts in the Zookeeper cluster for your Storm cluster. It should look something like: Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Setting-up-development-environment.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Setting-up-development-environment.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Setting-up-development-environment.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Setting-up-development-environment.md Thu Mar 17 02:10:04 2016 @@ -2,6 +2,7 @@ title: Setting Up a Development Environment layout: documentation documentation: true +version: v0.10.0 --- This page outlines what you need to do to get a Storm development environment set up. 
In summary, the steps are: @@ -38,4 +39,4 @@ Alternatively, if you use the [storm-dep lein run :deploy --attach --name mystormcluster ``` -More information is on the storm-deploy [wiki](https://github.com/nathanmarz/storm-deploy/wiki) \ No newline at end of file +More information is on the storm-deploy [wiki](https://github.com/nathanmarz/storm-deploy/wiki) Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Spout-implementations.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Spout-implementations.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Spout-implementations.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Spout-implementations.md Thu Mar 17 02:10:04 2016 @@ -2,9 +2,10 @@ title: Spout Implementations layout: documentation documentation: true +version: v0.10.0 --- * [storm-kestrel](https://github.com/nathanmarz/storm-kestrel): Adapter to use Kestrel as a spout * [storm-amqp-spout](https://github.com/rapportive-oss/storm-amqp-spout): Adapter to use AMQP source as a spout * [storm-jms](https://github.com/ptgoetz/storm-jms): Adapter to use a JMS source as a spout * [storm-redis-pubsub](https://github.com/sorenmacbeth/storm-redis-pubsub): A spout that subscribes to a Redis pubsub stream -* [storm-beanstalkd-spout](https://github.com/haitaoyao/storm-beanstalkd-spout): A spout that subscribes to a beanstalkd queue \ No newline at end of file +* [storm-beanstalkd-spout](https://github.com/haitaoyao/storm-beanstalkd-spout): A spout that subscribes to a beanstalkd queue Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Structure-of-the-codebase.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Structure-of-the-codebase.md?rev=1735360&r1=1735359&r2=1735360&view=diff 
============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Structure-of-the-codebase.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Structure-of-the-codebase.md Thu Mar 17 02:10:04 2016 @@ -2,6 +2,7 @@ title: Structure of the Codebase layout: documentation documentation: true +version: v0.10.0 --- There are three distinct layers to Storm's codebase. @@ -15,18 +16,18 @@ The following sections explain each of t ### storm.thrift -The first place to look to understand the structure of Storm's codebase is the [storm.thrift](https://github.com/apache/storm/blob/master/storm-core/src/storm.thrift) file. +The first place to look to understand the structure of Storm's codebase is the [storm.thrift](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/storm.thrift) file. Storm uses [this fork](https://github.com/nathanmarz/thrift/tree/storm) of Thrift (branch 'storm') to produce the generated code. This "fork" is actually Thrift 7 with all the Java packages renamed to be `org.apache.thrift7`. Otherwise, it's identical to Thrift 7. This fork was done because of the lack of backwards compatibility in Thrift and the need for many people to use other versions of Thrift in their Storm topologies. -Every spout or bolt in a topology is given a user-specified identifier called the "component id". The component id is used to specify subscriptions from a bolt to the output streams of other spouts or bolts. A [StormTopology](https://github.com/apache/storm/blob/master/storm-core/src/storm.thrift#L91) structure contains a map from component id to component for each type of component (spouts and bolts). +Every spout or bolt in a topology is given a user-specified identifier called the "component id". The component id is used to specify subscriptions from a bolt to the output streams of other spouts or bolts. 
A [StormTopology](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/storm.thrift#L91) structure contains a map from component id to component for each type of component (spouts and bolts). -Spouts and bolts have the same Thrift definition, so let's just take a look at the [Thrift definition for bolts](https://github.com/apache/storm/blob/master/storm-core/src/storm.thrift#L79). It contains a `ComponentObject` struct and a `ComponentCommon` struct. +Spouts and bolts have the same Thrift definition, so let's just take a look at the [Thrift definition for bolts](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/storm.thrift#L79). It contains a `ComponentObject` struct and a `ComponentCommon` struct. The `ComponentObject` defines the implementation for the bolt. It can be one of three types: -1. A serialized java object (that implements [IBolt](https://github.com/apache/storm/blob/master/storm-core/src/jvm/backtype/storm/task/IBolt.java)) -2. A `ShellComponent` object that indicates the implementation is in another language. Specifying a bolt this way will cause Storm to instantiate a [ShellBolt](https://github.com/apache/storm/blob/master/storm-core/src/jvm/backtype/storm/task/ShellBolt.java) object to handle the communication between the JVM-based worker process and the non-JVM-based implementation of the component. +1. A serialized java object (that implements [IBolt](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/jvm/backtype/storm/task/IBolt.java)) +2. A `ShellComponent` object that indicates the implementation is in another language. Specifying a bolt this way will cause Storm to instantiate a [ShellBolt](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/jvm/backtype/storm/task/ShellBolt.java) object to handle the communication between the JVM-based worker process and the non-JVM-based implementation of the component. 3. 
A `JavaObject` structure which tells Storm the classname and constructor arguments to use to instantiate that bolt. This is useful if you want to define a topology in a non-JVM language. This way, you can make use of JVM-based spouts and bolts without having to create and serialize a Java object yourself. `ComponentCommon` defines everything else for this component. This includes: @@ -36,107 +37,107 @@ The `ComponentObject` defines the implem 3. The parallelism for this component 4. The component-specific [configuration](https://github.com/apache/storm/wiki/Configuration) for this component -Note that the structure spouts also have a `ComponentCommon` field, and so spouts can also have declarations to consume other input streams. Yet the Storm Java API does not provide a way for spouts to consume other streams, and if you put any input declarations there for a spout you would get an error when you tried to submit the topology. The reason that spouts have an input declarations field is not for users to use, but for Storm itself to use. Storm adds implicit streams and bolts to the topology to set up the [acking framework](https://github.com/apache/storm/wiki/Guaranteeing-message-processing), and two of these implicit streams are from the acker bolt to each spout in the topology. The acker sends "ack" or "fail" messages along these streams whenever a tuple tree is detected to be completed or failed. The code that transforms the user's topology into the runtime topology is located [here](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/daemon/common.clj#L279). +Note that the structure spouts also have a `ComponentCommon` field, and so spouts can also have declarations to consume other input streams. Yet the Storm Java API does not provide a way for spouts to consume other streams, and if you put any input declarations there for a spout you would get an error when you tried to submit the topology.
The reason that spouts have an input declarations field is not for users to use, but for Storm itself to use. Storm adds implicit streams and bolts to the topology to set up the [acking framework](https://github.com/apache/storm/wiki/Guaranteeing-message-processing), and two of these implicit streams are from the acker bolt to each spout in the topology. The acker sends "ack" or "fail" messages along these streams whenever a tuple tree is detected to be completed or failed. The code that transforms the user's topology into the runtime topology is located [here](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/daemon/common.clj#L279). ### Java interfaces The interfaces for Storm are generally specified as Java interfaces. The main interfaces are: -1. [IRichBolt](/javadoc/apidocs/backtype/storm/topology/IRichBolt.html) -2. [IRichSpout](/javadoc/apidocs/backtype/storm/topology/IRichSpout.html) -3. [TopologyBuilder](/javadoc/apidocs/backtype/storm/topology/TopologyBuilder.html) +1. [IRichBolt](javadocs/backtype/storm/topology/IRichBolt.html) +2. [IRichSpout](javadocs/backtype/storm/topology/IRichSpout.html) +3. [TopologyBuilder](javadocs/backtype/storm/topology/TopologyBuilder.html) The strategy for the majority of the interfaces is to: 1. Specify the interface using a Java interface 2. Provide a base class that provides default implementations when appropriate -You can see this strategy at work with the [BaseRichSpout](/javadoc/apidocs/backtype/storm/topology/base/BaseRichSpout.html) class. +You can see this strategy at work with the [BaseRichSpout](javadocs/backtype/storm/topology/base/BaseRichSpout.html) class. Spouts and bolts are serialized into the Thrift definition of the topology as described above. -One subtle aspect of the interfaces is the difference between `IBolt` and `ISpout` vs. `IRichBolt` and `IRichSpout`. 
The main difference between them is the addition of the `declareOutputFields` method in the "Rich" versions of the interfaces. The reason for the split is that the output fields declaration for each output stream needs to be part of the Thrift struct (so it can be specified from any language), but as a user you want to be able to declare the streams as part of your class. What `TopologyBuilder` does when constructing the Thrift representation is call `declareOutputFields` to get the declaration and convert it into the Thrift structure. The conversion happens [at this portion](https://github.com/apache/storm/blob/master/storm-core/src/jvm/backtype/storm/topology/TopologyBuilder.java#L205) of the `TopologyBuilder` code. +One subtle aspect of the interfaces is the difference between `IBolt` and `ISpout` vs. `IRichBolt` and `IRichSpout`. The main difference between them is the addition of the `declareOutputFields` method in the "Rich" versions of the interfaces. The reason for the split is that the output fields declaration for each output stream needs to be part of the Thrift struct (so it can be specified from any language), but as a user you want to be able to declare the streams as part of your class. What `TopologyBuilder` does when constructing the Thrift representation is call `declareOutputFields` to get the declaration and convert it into the Thrift structure. The conversion happens [at this portion](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/jvm/backtype/storm/topology/TopologyBuilder.java#L205) of the `TopologyBuilder` code. ### Implementation Specifying all the functionality via Java interfaces ensures that every feature of Storm is available via Java. Moreso, the focus on Java interfaces ensures that the user experience from Java-land is pleasant as well. -The implementation of Storm, on the other hand, is primarily in Clojure. 
While the codebase is about 50% Java and 50% Clojure in terms of LOC, most of the implementation logic is in Clojure. There are two notable exceptions to this, and that is the [DRPC](https://github.com/apache/storm/wiki/Distributed-RPC) and [transactional topologies](https://github.com/apache/storm/wiki/Transactional-topologies) implementations. These are implemented purely in Java. This was done to serve as an illustration for how to implement a higher level abstraction on Storm. The DRPC and transactional topologies implementations are in the [backtype.storm.coordination](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/coordination), [backtype.storm.drpc](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/drpc), and [backtype.storm.transactional](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/transactional) packages. +The implementation of Storm, on the other hand, is primarily in Clojure. While the codebase is about 50% Java and 50% Clojure in terms of LOC, most of the implementation logic is in Clojure. There are two notable exceptions to this, and that is the [DRPC](https://github.com/apache/storm/wiki/Distributed-RPC) and [transactional topologies](https://github.com/apache/storm/wiki/Transactional-topologies) implementations. These are implemented purely in Java. This was done to serve as an illustration for how to implement a higher level abstraction on Storm. The DRPC and transactional topologies implementations are in the [backtype.storm.coordination](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/coordination), [backtype.storm.drpc](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/drpc), and [backtype.storm.transactional](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/transactional) packages. 
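The `IBolt`/`IRichBolt` split described in the Java interfaces section above can be illustrated with stand-in types. `MockBolt`, `MockRichBolt`, and `buildDeclarations` are simplified stand-ins invented for this sketch, not Storm's real classes; the point is that only the "Rich" interface carries `declareOutputFields`, which a `TopologyBuilder`-like component calls once at build time to pull the stream declaration out of the user's class and fold it into the Thrift-serializable topology structure:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified mock of the IBolt / IRichBolt split; stand-ins, not Storm code.
public class RichSplitSketch {
    interface MockBolt {
        void execute(String tuple);
    }

    // Only the "Rich" variant knows how to declare its output fields.
    interface MockRichBolt extends MockBolt {
        void declareOutputFields(List<String> declarer);
    }

    // A builder-like step: call declareOutputFields once to collect the
    // declaration, so it can be stored in a language-neutral structure.
    static List<String> buildDeclarations(MockRichBolt bolt) {
        List<String> fields = new ArrayList<>();
        bolt.declareOutputFields(fields);
        return fields;
    }

    public static void main(String[] args) {
        MockRichBolt wordBolt = new MockRichBolt() {
            public void execute(String tuple) { /* processing elided */ }
            public void declareOutputFields(List<String> d) { d.add("word"); }
        };
        System.out.println(buildDeclarations(wordBolt)); // prints [word]
    }
}
```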
Here's a summary of the purpose of the main Java packages and Clojure namespace: #### Java packages -[backtype.storm.coordination](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/coordination): Implements the pieces required to coordinate batch-processing on top of Storm, which both DRPC and transactional topologies use. `CoordinatedBolt` is the most important class here. +[backtype.storm.coordination](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/coordination): Implements the pieces required to coordinate batch-processing on top of Storm, which both DRPC and transactional topologies use. `CoordinatedBolt` is the most important class here. -[backtype.storm.drpc](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/drpc): Implementation of the DRPC higher level abstraction +[backtype.storm.drpc](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/drpc): Implementation of the DRPC higher level abstraction -[backtype.storm.generated](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/generated): The generated Thrift code for Storm (generated using [this fork](https://github.com/nathanmarz/thrift) of Thrift, which simply renames the packages to org.apache.thrift7 to avoid conflicts with other Thrift versions) +[backtype.storm.generated](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/generated): The generated Thrift code for Storm (generated using [this fork](https://github.com/nathanmarz/thrift) of Thrift, which simply renames the packages to org.apache.thrift7 to avoid conflicts with other Thrift versions) -[backtype.storm.grouping](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/grouping): Contains interface for making custom stream groupings 
+[backtype.storm.grouping](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/grouping): Contains interface for making custom stream groupings -[backtype.storm.hooks](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/hooks): Interfaces for hooking into various events in Storm, such as when tasks emit tuples, when tuples are acked, etc. User guide for hooks is [here](https://github.com/apache/storm/wiki/Hooks). +[backtype.storm.hooks](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/hooks): Interfaces for hooking into various events in Storm, such as when tasks emit tuples, when tuples are acked, etc. User guide for hooks is [here](https://github.com/apache/storm/wiki/Hooks). -[backtype.storm.serialization](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/serialization): Implementation of how Storm serializes/deserializes tuples. Built on top of [Kryo](http://code.google.com/p/kryo/). +[backtype.storm.serialization](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/serialization): Implementation of how Storm serializes/deserializes tuples. Built on top of [Kryo](http://code.google.com/p/kryo/). -[backtype.storm.spout](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/spout): Definition of spout and associated interfaces (like the `SpoutOutputCollector`). Also contains `ShellSpout` which implements the protocol for defining spouts in non-JVM languages. +[backtype.storm.spout](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/spout): Definition of spout and associated interfaces (like the `SpoutOutputCollector`). Also contains `ShellSpout` which implements the protocol for defining spouts in non-JVM languages. 
-[backtype.storm.task](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/task): Definition of bolt and associated interfaces (like `OutputCollector`). Also contains `ShellBolt` which implements the protocol for defining bolts in non-JVM languages. Finally, `TopologyContext` is defined here as well, which is provided to spouts and bolts so they can get data about the topology and its execution at runtime. +[backtype.storm.task](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/task): Definition of bolt and associated interfaces (like `OutputCollector`). Also contains `ShellBolt` which implements the protocol for defining bolts in non-JVM languages. Finally, `TopologyContext` is defined here as well, which is provided to spouts and bolts so they can get data about the topology and its execution at runtime. -[backtype.storm.testing](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/testing): Contains a variety of test bolts and utilities used in Storm's unit tests. +[backtype.storm.testing](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/testing): Contains a variety of test bolts and utilities used in Storm's unit tests. -[backtype.storm.topology](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/topology): Java layer over the underlying Thrift structure to provide a clean, pure-Java API to Storm (users don't have to know about Thrift). `TopologyBuilder` is here as well as the helpful base classes for the different spouts and bolts. The slightly-higher level `IBasicBolt` interface is here, which is a simpler way to write certain kinds of bolts. +[backtype.storm.topology](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/topology): Java layer over the underlying Thrift structure to provide a clean, pure-Java API to Storm (users don't have to know about Thrift). 
`TopologyBuilder` is here as well as the helpful base classes for the different spouts and bolts. The slightly-higher level `IBasicBolt` interface is here, which is a simpler way to write certain kinds of bolts. -[backtype.storm.transactional](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/transactional): Implementation of transactional topologies. +[backtype.storm.transactional](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/transactional): Implementation of transactional topologies. -[backtype.storm.tuple](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/tuple): Implementation of Storm's tuple data model. +[backtype.storm.tuple](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/tuple): Implementation of Storm's tuple data model. -[backtype.storm.utils](https://github.com/apache/storm/tree/master/storm-core/src/jvm/backtype/storm/tuple): Data structures and miscellaneous utilities used throughout the codebase. +[backtype.storm.utils](https://github.com/apache/storm/tree/{{page.version}}/storm-core/src/jvm/backtype/storm/utils): Data structures and miscellaneous utilities used throughout the codebase. #### Clojure namespaces -[backtype.storm.bootstrap](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/bootstrap.clj): Contains a helpful macro to import all the classes and namespaces that are used throughout the codebase. +[backtype.storm.bootstrap](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/bootstrap.clj): Contains a helpful macro to import all the classes and namespaces that are used throughout the codebase. -[backtype.storm.clojure](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/clojure.clj): Implementation of the Clojure DSL for Storm. 
+[backtype.storm.clojure](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/clojure.clj): Implementation of the Clojure DSL for Storm. -[backtype.storm.cluster](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/cluster.clj): All Zookeeper logic used in Storm daemons is encapsulated in this file. This code manages how cluster state (like what tasks are running where, what spout/bolt each task runs as) is mapped to the Zookeeper "filesystem" API. +[backtype.storm.cluster](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/cluster.clj): All Zookeeper logic used in Storm daemons is encapsulated in this file. This code manages how cluster state (like what tasks are running where, what spout/bolt each task runs as) is mapped to the Zookeeper "filesystem" API. -[backtype.storm.command.*](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/command): These namespaces implement various commands for the `storm` command line client. These implementations are very short. +[backtype.storm.command.*](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/command): These namespaces implement various commands for the `storm` command line client. These implementations are very short. -[backtype.storm.config](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/config.clj): Implementation of config reading/parsing code for Clojure. Also has utility functions for determining what local path nimbus/supervisor/daemons should be using for various things. e.g. the `master-inbox` function will return the local path that Nimbus should use when jars are uploaded to it. +[backtype.storm.config](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/config.clj): Implementation of config reading/parsing code for Clojure. 
Also has utility functions for determining what local path nimbus/supervisor/daemons should be using for various things. e.g. the `master-inbox` function will return the local path that Nimbus should use when jars are uploaded to it. -[backtype.storm.daemon.acker](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/daemon/acker.clj): Implementation of the "acker" bolt, which is a key part of how Storm guarantees data processing. +[backtype.storm.daemon.acker](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/daemon/acker.clj): Implementation of the "acker" bolt, which is a key part of how Storm guarantees data processing. -[backtype.storm.daemon.common](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/daemon/common.clj): Implementation of common functions used in Storm daemons, like getting the id for a topology based on the name, mapping a user's topology into the one that actually executes (with implicit acking streams and acker bolt added - see `system-topology!` function), and definitions for the various heartbeat and other structures persisted by Storm. +[backtype.storm.daemon.common](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/daemon/common.clj): Implementation of common functions used in Storm daemons, like getting the id for a topology based on the name, mapping a user's topology into the one that actually executes (with implicit acking streams and acker bolt added - see `system-topology!` function), and definitions for the various heartbeat and other structures persisted by Storm. -[backtype.storm.daemon.drpc](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/daemon/drpc.clj): Implementation of the DRPC server for use with DRPC topologies. 
+[backtype.storm.daemon.drpc](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/daemon/drpc.clj): Implementation of the DRPC server for use with DRPC topologies. -[backtype.storm.daemon.nimbus](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/daemon/nimbus.clj): Implementation of Nimbus. +[backtype.storm.daemon.nimbus](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/daemon/nimbus.clj): Implementation of Nimbus. -[backtype.storm.daemon.supervisor](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/daemon/supervisor.clj): Implementation of Supervisor. +[backtype.storm.daemon.supervisor](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/daemon/supervisor.clj): Implementation of Supervisor. -[backtype.storm.daemon.task](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/daemon/task.clj): Implementation of an individual task for a spout or bolt. Handles message routing, serialization, stats collection for the UI, as well as the spout-specific and bolt-specific execution implementations. +[backtype.storm.daemon.task](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/daemon/task.clj): Implementation of an individual task for a spout or bolt. Handles message routing, serialization, stats collection for the UI, as well as the spout-specific and bolt-specific execution implementations. -[backtype.storm.daemon.worker](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/daemon/worker.clj): Implementation of a worker process (which will contain many tasks within). Implements message transferring and task launching. 
+[backtype.storm.daemon.worker](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/daemon/worker.clj): Implementation of a worker process (which will contain many tasks within). Implements message transferring and task launching. -[backtype.storm.event](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/event.clj): Implements a simple asynchronous function executor. Used in various places in Nimbus and Supervisor to make functions execute in serial to avoid any race conditions. +[backtype.storm.event](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/event.clj): Implements a simple asynchronous function executor. Used in various places in Nimbus and Supervisor to make functions execute in serial to avoid any race conditions. -[backtype.storm.log](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/log.clj): Defines the functions used to log messages to log4j. +[backtype.storm.log](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/log.clj): Defines the functions used to log messages to log4j. -[backtype.storm.messaging.*](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/messaging): Defines a higher level interface to implementing point to point messaging. In local mode Storm uses in-memory Java queues to do this; on a cluster, it uses ZeroMQ. The generic interface is defined in protocol.clj. +[backtype.storm.messaging.*](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/messaging): Defines a higher level interface to implementing point to point messaging. In local mode Storm uses in-memory Java queues to do this; on a cluster, it uses ZeroMQ. The generic interface is defined in protocol.clj. 
-[backtype.storm.stats](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/stats.clj): Implementation of stats rollup routines used when sending stats to ZK for use by the UI. Does things like windowed and rolling aggregations at multiple granularities. +[backtype.storm.stats](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/stats.clj): Implementation of stats rollup routines used when sending stats to ZK for use by the UI. Does things like windowed and rolling aggregations at multiple granularities. -[backtype.storm.testing](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/testing.clj): Implementation of facilities used to test Storm topologies. Includes time simulation, `complete-topology` for running a fixed set of tuples through a topology and capturing the output, tracker topologies for having fine grained control over detecting when a cluster is "idle", and other utilities. +[backtype.storm.testing](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/testing.clj): Implementation of facilities used to test Storm topologies. Includes time simulation, `complete-topology` for running a fixed set of tuples through a topology and capturing the output, tracker topologies for having fine grained control over detecting when a cluster is "idle", and other utilities. -[backtype.storm.thrift](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/thrift.clj): Clojure wrappers around the generated Thrift API to make working with Thrift structures more pleasant. +[backtype.storm.thrift](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/thrift.clj): Clojure wrappers around the generated Thrift API to make working with Thrift structures more pleasant. 
-[backtype.storm.timer](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/timer.clj): Implementation of a background timer to execute functions in the future or on a recurring interval. Storm couldn't use the [Timer](http://docs.oracle.com/javase/1.4.2/docs/api/java/util/Timer.html) class because it needed integration with time simulation in order to be able to unit test Nimbus and the Supervisor. +[backtype.storm.timer](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/timer.clj): Implementation of a background timer to execute functions in the future or on a recurring interval. Storm couldn't use the [Timer](http://docs.oracle.com/javase/1.4.2/docs/api/java/util/Timer.html) class because it needed integration with time simulation in order to be able to unit test Nimbus and the Supervisor. -[backtype.storm.ui.*](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/ui): Implementation of Storm UI. Completely independent from rest of code base and uses the Nimbus Thrift API to get data. +[backtype.storm.ui.*](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/ui): Implementation of Storm UI. Completely independent from rest of code base and uses the Nimbus Thrift API to get data. -[backtype.storm.util](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/util.clj): Contains generic utility functions used throughout the code base. +[backtype.storm.util](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/util.clj): Contains generic utility functions used throughout the code base. -[backtype.storm.zookeeper](https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/zookeeper.clj): Clojure wrapper around the Zookeeper API and implements some "high-level" stuff like "mkdirs" and "delete-recursive". 
\ No newline at end of file +[backtype.storm.zookeeper](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/clj/backtype/storm/zookeeper.clj): Clojure wrapper around the Zookeeper API and implements some "high-level" stuff like "mkdirs" and "delete-recursive". Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Support-for-non-java-languages.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Support-for-non-java-languages.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Support-for-non-java-languages.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Support-for-non-java-languages.md Thu Mar 17 02:10:04 2016 @@ -2,6 +2,7 @@ title: Support for Non-Java Languages layout: documentation documentation: true +version: v0.10.0 --- * [Scala DSL](https://github.com/velvia/ScalaStorm) * [JRuby DSL](https://github.com/colinsurprenant/storm-jruby) Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Transactional-topologies.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Transactional-topologies.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Transactional-topologies.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Transactional-topologies.md Thu Mar 17 02:10:04 2016 @@ -2,6 +2,7 @@ title: Transactional Topologies layout: documentation documentation: true +version: v0.10.0 --- **NOTE**: Transactional topologies have been deprecated -- use the [Trident](Trident-tutorial.html) framework instead. 
@@ -81,7 +82,7 @@ Finally, another thing to note is that t ## The basics through example -You build transactional topologies by using [TransactionalTopologyBuilder](/javadoc/apidocs/backtype/storm/transactional/TransactionalTopologyBuilder.html). Here's the transactional topology definition for a topology that computes the global count of tuples from the input stream. This code comes from [TransactionalGlobalCount](https://github.com/apache/storm/blob/master/examples/storm-starter/src/jvm/storm/starter/TransactionalGlobalCount.java) in storm-starter. +You build transactional topologies by using [TransactionalTopologyBuilder](javadocs/backtype/storm/transactional/TransactionalTopologyBuilder.html). Here's the transactional topology definition for a topology that computes the global count of tuples from the input stream. This code comes from [TransactionalGlobalCount](https://github.com/apache/storm/blob/{{page.version}}/examples/storm-starter/src/jvm/storm/starter/TransactionalGlobalCount.java) in storm-starter. ```java MemoryTransactionalSpout spout = new MemoryTransactionalSpout(DATA, new Fields("word"), PARTITION_TAKE_PER_BATCH); @@ -132,7 +133,7 @@ public static class BatchCount extends B A new instance of this object is created for every batch that's being processed. The actual bolt this runs within is called [BatchBoltExecutor](https://github.com/apache/storm/blob/0.7.0/src/jvm/backtype/storm/coordination/BatchBoltExecutor.java) and manages the creation and cleanup for these objects. -The `prepare` method parameterizes this batch bolt with the Storm config, the topology context, an output collector, and the id for this batch of tuples. In the case of transactional topologies, the id will be a [TransactionAttempt](/javadoc/apidocs/backtype/storm/transactional/TransactionAttempt.html) object. The batch bolt abstraction can be used in Distributed RPC as well which uses a different type of id for the batches. 
`BatchBolt` can actually be parameterized with the type of the id, so if you only intend to use the batch bolt for transactional topologies, you can extend `BaseTransactionalBolt` which has this definition: +The `prepare` method parameterizes this batch bolt with the Storm config, the topology context, an output collector, and the id for this batch of tuples. In the case of transactional topologies, the id will be a [TransactionAttempt](javadocs/backtype/storm/transactional/TransactionAttempt.html) object. The batch bolt abstraction can be used in Distributed RPC as well which uses a different type of id for the batches. `BatchBolt` can actually be parameterized with the type of the id, so if you only intend to use the batch bolt for transactional topologies, you can extend `BaseTransactionalBolt` which has this definition: ```java public abstract class BaseTransactionalBolt extends BaseBatchBolt<TransactionAttempt> { @@ -201,7 +202,7 @@ First, notice that this bolt implements The code for `finishBatch` in `UpdateGlobalCount` gets the current value from the database and compares its transaction id to the transaction id for this batch. If they are the same, it does nothing. Otherwise, it increments the value in the database by the partial count for this batch. -A more involved transactional topology example that updates multiple databases idempotently can be found in storm-starter in the [TransactionalWords](https://github.com/apache/storm/blob/master/examples/storm-starter/src/jvm/storm/starter/TransactionalWords.java) class. +A more involved transactional topology example that updates multiple databases idempotently can be found in storm-starter in the [TransactionalWords](https://github.com/apache/storm/blob/{{page.version}}/examples/storm-starter/src/jvm/storm/starter/TransactionalWords.java) class. ## Transactional Topology API @@ -211,9 +212,9 @@ This section outlines the different piec There are three kinds of bolts possible in a transactional topology: -1. 
[BasicBolt](/javadoc/apidocs/backtype/storm/topology/base/BaseBasicBolt.html): This bolt doesn't deal with batches of tuples and just emits tuples based on a single tuple of input. -2. [BatchBolt](/javadoc/apidocs/backtype/storm/topology/base/BaseBatchBolt.html): This bolt processes batches of tuples. `execute` is called for each tuple, and `finishBatch` is called when the batch is complete. -3. BatchBolt's that are marked as committers: The only difference between this bolt and a regular batch bolt is when `finishBatch` is called. A committer bolt has `finishedBatch` called during the commit phase. The commit phase is guaranteed to occur only after all prior batches have successfully committed, and it will be retried until all bolts in the topology succeed the commit for the batch. There are two ways to make a `BatchBolt` a committer, by having the `BatchBolt` implement the [ICommitter](/javadoc/apidocs/backtype/storm/transactional/ICommitter.html) marker interface, or by using the `setCommiterBolt` method in `TransactionalTopologyBuilder`. +1. [BasicBolt](javadocs/backtype/storm/topology/base/BaseBasicBolt.html): This bolt doesn't deal with batches of tuples and just emits tuples based on a single tuple of input. +2. [BatchBolt](javadocs/backtype/storm/topology/base/BaseBatchBolt.html): This bolt processes batches of tuples. `execute` is called for each tuple, and `finishBatch` is called when the batch is complete. +3. BatchBolts that are marked as committers: The only difference between this bolt and a regular batch bolt is when `finishBatch` is called. A committer bolt has `finishBatch` called during the commit phase. The commit phase is guaranteed to occur only after all prior batches have successfully committed, and it will be retried until all bolts in the topology succeed the commit for the batch. 
There are two ways to make a `BatchBolt` a committer: by having the `BatchBolt` implement the [ICommitter](javadocs/backtype/storm/transactional/ICommitter.html) marker interface, or by using the `setCommitterBolt` method in `TransactionalTopologyBuilder`. #### Processing phase vs. commit phase in bolts @@ -237,7 +238,7 @@ Notice that you don't have to do any ack #### Failing a transaction -When using regular bolts, you can call the `fail` method on `OutputCollector` to fail the tuple trees of which that tuple is a member. Since transactional topologies hide the acking framework from you, they provide a different mechanism to fail a batch (and cause the batch to be replayed). Just throw a [FailedException](/javadoc/apidocs/backtype/storm/topology/FailedException.html). Unlike regular exceptions, this will only cause that particular batch to replay and will not crash the process. +When using regular bolts, you can call the `fail` method on `OutputCollector` to fail the tuple trees of which that tuple is a member. Since transactional topologies hide the acking framework from you, they provide a different mechanism to fail a batch (and cause the batch to be replayed). Just throw a [FailedException](javadocs/backtype/storm/topology/FailedException.html). Unlike regular exceptions, this will only cause that particular batch to replay and will not crash the process. ### Transactional spout @@ -251,11 +252,11 @@ The coordinator on the left is a regular The need to be idempotent with respect to the tuples it emits requires a `TransactionalSpout` to store a small amount of state. The state is stored in Zookeeper. -The details of implementing a `TransactionalSpout` are in [the Javadoc](/javadoc/apidocs/backtype/storm/transactional/ITransactionalSpout.html). +The details of implementing a `TransactionalSpout` are in [the Javadoc](javadocs/backtype/storm/transactional/ITransactionalSpout.html). 
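The commit-phase bookkeeping described above (store the transaction id alongside the value, and skip the update when a replayed commit carries the same txid) can be sketched in plain Java. This is an illustrative sketch only: the class and method names are hypothetical, not Storm API, and a `HashMap` stands in for the external database.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the txid bookkeeping a committer bolt performs in
// finishBatch(). The names here are hypothetical; a HashMap stands in for
// the external key/value store.
public class TxidCommitSketch {
    static final class StoredValue {
        final long txid;
        final long count;
        StoredValue(long txid, long count) { this.txid = txid; this.count = count; }
    }

    private final Map<String, StoredValue> db = new HashMap<>();

    // Apply a batch's partial count exactly once. Because commits are strongly
    // ordered by txid, finding the current batch's txid already stored means
    // this batch was applied on a previous attempt, so the update is skipped.
    public long commit(String key, long batchTxid, long partialCount) {
        StoredValue current = db.get(key);
        if (current != null && current.txid == batchTxid) {
            return current.count; // replayed commit: already applied
        }
        long updated = (current == null ? 0 : current.count) + partialCount;
        db.put(key, new StoredValue(batchTxid, updated));
        return updated;
    }

    public static void main(String[] args) {
        TxidCommitSketch state = new TxidCommitSketch();
        System.out.println(state.commit("global-count", 1, 10)); // prints 10
        System.out.println(state.commit("global-count", 1, 10)); // replay: prints 10
        System.out.println(state.commit("global-count", 2, 5));  // prints 15
    }
}
```

This is also why a replayed batch must contain the same tuples: if the batch contents could change between attempts, skipping the update on a matching txid would no longer be safe, which is precisely the problem opaque transactional spouts address.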
#### Partitioned Transactional Spout -A common kind of transactional spout is one that reads the batches from a set of partitions across many queue brokers. For example, this is how [TransactionalKafkaSpout](https://github.com/apache/storm/tree/master/external/storm-kafka/src/jvm/storm/kafka/TransactionalKafkaSpout.java) works. An `IPartitionedTransactionalSpout` automates the bookkeeping work of managing the state for each partition to ensure idempotent replayability. See [the Javadoc](/javadoc/apidocs/backtype/storm/transactional/partitioned/IPartitionedTransactionalSpout.html) for more details. +A common kind of transactional spout is one that reads the batches from a set of partitions across many queue brokers. For example, this is how [TransactionalKafkaSpout](https://github.com/apache/storm/tree/{{page.version}}/external/storm-kafka/src/jvm/storm/kafka/TransactionalKafkaSpout.java) works. An `IPartitionedTransactionalSpout` automates the bookkeeping work of managing the state for each partition to ensure idempotent replayability. See [the Javadoc](javadocs/backtype/storm/transactional/partitioned/IPartitionedTransactionalSpout.html) for more details. ### Configuration @@ -325,7 +326,7 @@ In this scenario, tuples 41-50 are skipp By failing all subsequent transactions on failure, no tuples are skipped. This also shows that a requirement of transactional spouts is that they always emit where the last transaction left off. -A non-idempotent transactional spout is more concisely referred to as an "OpaqueTransactionalSpout" (opaque is the opposite of idempotent). [IOpaquePartitionedTransactionalSpout](/javadoc/apidocs/backtype/storm/transactional/partitioned/IOpaquePartitionedTransactionalSpout.html) is an interface for implementing opaque partitioned transactional spouts, of which [OpaqueTransactionalKafkaSpout](https://github.com/apache/storm/tree/master/external/storm-kafka/src/jvm/storm/kafka/OpaqueTransactionalKafkaSpout.java) is an example. 
`OpaqueTransactionalKafkaSpout` can withstand losing individual Kafka nodes without sacrificing accuracy as long as you use the update strategy as explained in this section. +A non-idempotent transactional spout is more concisely referred to as an "OpaqueTransactionalSpout" (opaque is the opposite of idempotent). [IOpaquePartitionedTransactionalSpout](javadocs/backtype/storm/transactional/partitioned/IOpaquePartitionedTransactionalSpout.html) is an interface for implementing opaque partitioned transactional spouts, of which [OpaqueTransactionalKafkaSpout](https://github.com/apache/storm/tree/{{page.version}}/external/storm-kafka/src/jvm/storm/kafka/OpaqueTransactionalKafkaSpout.java) is an example. `OpaqueTransactionalKafkaSpout` can withstand losing individual Kafka nodes without sacrificing accuracy as long as you use the update strategy as explained in this section. ## Implementation Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Trident-API-Overview.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Trident-API-Overview.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Trident-API-Overview.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Trident-API-Overview.md Thu Mar 17 02:10:04 2016 @@ -2,6 +2,7 @@ title: Trident API Overview layout: documentation documentation: true +version: v0.10.0 --- The core data model in Trident is the "Stream", processed as a series of batches. A stream is partitioned among the nodes in the cluster, and operations applied to a stream are applied in parallel across each partition. @@ -278,7 +279,7 @@ The groupBy operation repartitions the s  -If you run aggregators on a grouped stream, the aggregation will be run within each group instead of against the whole batch. 
persistentAggregate can also be run on a GroupedStream, in which case the results will be stored in a [MapState](https://github.com/apache/storm/blob/master/storm-core/src/jvm/storm/trident/state/map/MapState.java) with the key being the grouping fields. You can read more about persistentAggregate in the [Trident state doc](Trident-state.html). +If you run aggregators on a grouped stream, the aggregation will be run within each group instead of against the whole batch. persistentAggregate can also be run on a GroupedStream, in which case the results will be stored in a [MapState](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/jvm/storm/trident/state/map/MapState.java) with the key being the grouping fields. You can read more about persistentAggregate in the [Trident state doc](Trident-state.html). Like regular streams, aggregators on grouped streams can be chained. @@ -309,4 +310,4 @@ When a join happens between streams orig You might be wondering: how do you do something like a "windowed join", where tuples from one side of the join are joined against the last hour of tuples from the other side of the join? -To do this, you would make use of partitionPersist and stateQuery. The last hour of tuples from one side of the join would be stored and rotated in a source of state, keyed by the join field. Then the stateQuery would do lookups by the join field to perform the "join". \ No newline at end of file +To do this, you would make use of partitionPersist and stateQuery. The last hour of tuples from one side of the join would be stored and rotated in a source of state, keyed by the join field. Then the stateQuery would do lookups by the join field to perform the "join". 
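The windowed-join recipe just described can be illustrated with a small, self-contained sketch: one side of the join is kept in keyed, time-rotated buckets (the bookkeeping a partitionPersist would maintain), and the other side looks tuples up by the join field (the role stateQuery plays). Every name here is a hypothetical illustration, not Trident API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of windowed-join state: tuples from one side of the
// join are stored in a fixed number of buckets keyed by the join field, and
// the oldest bucket is dropped whenever the window rotates.
public class WindowedJoinSketch {
    private final Deque<Map<String, List<String>>> buckets = new ArrayDeque<>();

    public WindowedJoinSketch(int numBuckets) {
        for (int i = 0; i < numBuckets; i++) {
            buckets.addFirst(new HashMap<>());
        }
    }

    // Called on a timer (e.g. once a minute with 60 buckets for an hour-long
    // window): drop the oldest bucket and start a fresh one.
    public void rotate() {
        buckets.removeLast();
        buckets.addFirst(new HashMap<>());
    }

    // partitionPersist side: store a tuple keyed by the join field.
    public void store(String joinKey, String value) {
        buckets.peekFirst().computeIfAbsent(joinKey, k -> new ArrayList<>()).add(value);
    }

    // stateQuery side: the "join" is a lookup of everything still in the window.
    public List<String> lookup(String joinKey) {
        List<String> matches = new ArrayList<>();
        for (Map<String, List<String>> bucket : buckets) {
            matches.addAll(bucket.getOrDefault(joinKey, Collections.emptyList()));
        }
        return matches;
    }

    public static void main(String[] args) {
        WindowedJoinSketch window = new WindowedJoinSketch(2);
        window.store("user-1", "click");
        window.rotate();
        window.store("user-1", "purchase");
        System.out.println(window.lookup("user-1").size()); // prints 2
        window.rotate();
        System.out.println(window.lookup("user-1").size()); // prints 1: "click" aged out
    }
}
```

In a real topology the stored side would live in a distributed, partitioned source of state rather than local memory, but the rotate/store/lookup shape of the bookkeeping is the same.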
Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Trident-spouts.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Trident-spouts.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Trident-spouts.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Trident-spouts.md Thu Mar 17 02:10:04 2016 @@ -2,6 +2,7 @@ title: Trident Spouts layout: documentation documentation: true +version: v0.10.0 --- # Trident spouts @@ -34,10 +35,10 @@ Even while processing multiple batches s Here are the following spout APIs available: -1. [ITridentSpout](https://github.com/apache/storm/blob/master/storm-core/src/jvm/storm/trident/spout/ITridentSpout.java): The most general API that can support transactional or opaque transactional semantics. Generally you'll use one of the partitioned flavors of this API rather than this one directly. -2. [IBatchSpout](https://github.com/apache/storm/blob/master/storm-core/src/jvm/storm/trident/spout/IBatchSpout.java): A non-transactional spout that emits batches of tuples at a time -3. [IPartitionedTridentSpout](https://github.com/apache/storm/blob/master/storm-core/src/jvm/storm/trident/spout/IPartitionedTridentSpout.java): A transactional spout that reads from a partitioned data source (like a cluster of Kafka servers) -4. [IOpaquePartitionedTridentSpout](https://github.com/apache/storm/blob/master/storm-core/src/jvm/storm/trident/spout/IOpaquePartitionedTridentSpout.java): An opaque transactional spout that reads from a partitioned data source +1. [ITridentSpout](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/jvm/storm/trident/spout/ITridentSpout.java): The most general API that can support transactional or opaque transactional semantics. Generally you'll use one of the partitioned flavors of this API rather than this one directly. 
+2. [IBatchSpout](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/jvm/storm/trident/spout/IBatchSpout.java): A non-transactional spout that emits batches of tuples at a time +3. [IPartitionedTridentSpout](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/jvm/storm/trident/spout/IPartitionedTridentSpout.java): A transactional spout that reads from a partitioned data source (like a cluster of Kafka servers) +4. [IOpaquePartitionedTridentSpout](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/jvm/storm/trident/spout/IOpaquePartitionedTridentSpout.java): An opaque transactional spout that reads from a partitioned data source And, like mentioned in the beginning of this tutorial, you can use regular IRichSpout's as well. Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Trident-state.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Trident-state.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Trident-state.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Trident-state.md Thu Mar 17 02:10:04 2016 @@ -1,6 +1,7 @@ --- title: Trident State layout: documentation +version: v0.10.0 --- @@ -28,7 +29,7 @@ Remember, Trident processes tuples as sm 2. There's no overlap between batches of tuples (tuples are in one batch or another, never multiple). 3. Every tuple is in a batch (no tuples are skipped) -This is a pretty easy type of spout to understand, the stream is divided into fixed batches that never change. storm-contrib has [an implementation of a transactional spout](https://github.com/apache/storm/tree/master/external/storm-kafka/src/jvm/storm/kafka/trident/TransactionalTridentKafkaSpout.java) for Kafka. +This is a pretty easy type of spout to understand, the stream is divided into fixed batches that never change. 
storm-contrib has [an implementation of a transactional spout](https://github.com/apache/storm/tree/{{page.version}}/external/storm-kafka/src/jvm/storm/kafka/trident/TransactionalTridentKafkaSpout.java) for Kafka. You might be wondering: why wouldn't you just always use a transactional spout? They're simple and easy to understand. One reason you might not use one is that they're not necessarily very fault-tolerant. For example, the way TransactionalTridentKafkaSpout works is that the batch for a txid will contain tuples from all the Kafka partitions for a topic. Once a batch has been emitted, any time that batch is re-emitted in the future the exact same set of tuples must be emitted to meet the semantics of transactional spouts. Now suppose a batch is emitted from TransactionalTridentKafkaSpout, the batch fails to process, and at the same time one of the Kafka nodes goes down. You're now incapable of replaying the same batch as you did before (since the node is down and some partitions for the topic are unavailable), and processing will halt. @@ -72,7 +73,7 @@ As described before, an opaque transacti 1. Every tuple is *successfully* processed in exactly one batch. However, it's possible for a tuple to fail to process in one batch and then succeed to process in a later batch. -[OpaqueTridentKafkaSpout](https://github.com/apache/storm/tree/master/external/storm-kafka/src/jvm/storm/kafka/trident/OpaqueTridentKafkaSpout.java) is a spout that has this property and is fault-tolerant to losing Kafka nodes. Whenever it's time for OpaqueTridentKafkaSpout to emit a batch, it emits tuples starting from where the last batch finished emitting. This ensures that no tuple is ever skipped or successfully processed by multiple batches. +[OpaqueTridentKafkaSpout](https://github.com/apache/storm/tree/{{page.version}}/external/storm-kafka/src/jvm/storm/kafka/trident/OpaqueTridentKafkaSpout.java) is a spout that has this property and is fault-tolerant to losing Kafka nodes. 
Whenever it's time for OpaqueTridentKafkaSpout to emit a batch, it emits tuples starting from where the last batch finished emitting. This ensures that no tuple is ever skipped or successfully processed by multiple batches. With opaque transactional spouts, it's no longer possible to use the trick of skipping state updates if the transaction id in the database is the same as the transaction id for the current batch. This is because the batch may have changed between state updates. @@ -309,7 +310,7 @@ public interface Snapshottable<T> extend } ``` -[MemoryMapState](https://github.com/apache/storm/blob/master/storm-core/src/jvm/storm/trident/testing/MemoryMapState.java) and [MemcachedState](https://github.com/nathanmarz/trident-memcached/blob/master/src/jvm/trident/memcached/MemcachedState.java) each implement both of these interfaces. +[MemoryMapState](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/jvm/storm/trident/testing/MemoryMapState.java) and [MemcachedState](https://github.com/nathanmarz/trident-memcached/blob/master/src/jvm/trident/memcached/MemcachedState.java) each implement both of these interfaces. ## Implementing Map States @@ -322,10 +323,10 @@ public interface IBackingMap<T> { } ``` -OpaqueMap's will call multiPut with [OpaqueValue](https://github.com/apache/storm/blob/master/storm-core/src/jvm/storm/trident/state/OpaqueValue.java)'s for the vals, TransactionalMap's will give [TransactionalValue](https://github.com/apache/storm/blob/master/storm-core/src/jvm/storm/trident/state/TransactionalValue.java)'s for the vals, and NonTransactionalMaps will just pass the objects from the topology through. 
+OpaqueMaps will call multiPut with [OpaqueValue](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/jvm/storm/trident/state/OpaqueValue.java)s for the vals, TransactionalMaps will give [TransactionalValue](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/jvm/storm/trident/state/TransactionalValue.java)s for the vals, and NonTransactionalMaps will just pass the objects from the topology through. -Trident also provides the [CachedMap](https://github.com/apache/storm/blob/master/storm-core/src/jvm/storm/trident/state/map/CachedMap.java) class to do automatic LRU caching of map key/vals. +Trident also provides the [CachedMap](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/jvm/storm/trident/state/map/CachedMap.java) class to do automatic LRU caching of map key/vals. -Finally, Trident provides the [SnapshottableMap](https://github.com/apache/storm/blob/master/storm-core/src/jvm/storm/trident/state/map/SnapshottableMap.java) class that turns a MapState into a Snapshottable object, by storing global aggregations into a fixed key. +Finally, Trident provides the [SnapshottableMap](https://github.com/apache/storm/blob/{{page.version}}/storm-core/src/jvm/storm/trident/state/map/SnapshottableMap.java) class that turns a MapState into a Snapshottable object, by storing global aggregations into a fixed key. -Take a look at the implementation of [MemcachedState](https://github.com/nathanmarz/trident-memcached/blob/master/src/jvm/trident/memcached/MemcachedState.java) to see how all these utilities can be put together to make a high performance MapState implementation. MemcachedState allows you to choose between opaque transactional, transactional, and non-transactional semantics. 
+Take a look at the implementation of [MemcachedState](https://github.com/nathanmarz/trident-memcached/blob/{{page.version}}/src/jvm/trident/memcached/MemcachedState.java) to see how all these utilities can be put together to make a high performance MapState implementation. MemcachedState allows you to choose between opaque transactional, transactional, and non-transactional semantics. Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Trident-tutorial.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Trident-tutorial.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Trident-tutorial.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Trident-tutorial.md Thu Mar 17 02:10:04 2016 @@ -2,6 +2,7 @@ title: Trident Tutorial layout: documentation documentation: true +version: v0.10.0 --- Trident is a high-level abstraction for doing realtime computing on top of Storm. It allows you to seamlessly intermix high throughput (millions of messages per second), stateful stream processing with low latency distributed querying. If you're familiar with high level batch processing tools like Pig or Cascading, the concepts of Trident will be very familiar: Trident has joins, aggregations, grouping, functions, and filters. In addition to these, Trident adds primitives for doing stateful, incremental processing on top of any database or persistence store. Trident has consistent, exactly-once semantics, so it is easy to reason about Trident topologies. @@ -251,4 +252,4 @@ It would compile into Storm spouts/bolts ## Conclusion -Trident makes realtime computation elegant. You've seen how high throughput stream processing, state manipulation, and low-latency querying can be seamlessly intermixed via Trident's API. 
Trident lets you express your realtime computations in a natural way while still getting maximal performance. \ No newline at end of file +Trident makes realtime computation elegant. You've seen how high throughput stream processing, state manipulation, and low-latency querying can be seamlessly intermixed via Trident's API. Trident lets you express your realtime computations in a natural way while still getting maximal performance. Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Troubleshooting.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Troubleshooting.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Troubleshooting.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Troubleshooting.md Thu Mar 17 02:10:04 2016 @@ -2,6 +2,7 @@ title: Troubleshooting layout: documentation documentation: true +version: v0.10.0 --- This page lists issues people have run into when using Storm along with their solutions. @@ -142,4 +143,4 @@ Caused by: java.lang.NullPointerExceptio Solution: - * This is caused by having multiple threads issue methods on the `OutputCollector`. All emits, acks, and fails must happen on the same thread. One subtle way this can happen is if you make an `IBasicBolt` that emits on a separate thread. `IBasicBolt`s automatically ack after execute is called, so this would cause multiple threads to use the `OutputCollector` leading to this exception. When using a basic bolt, all emits must happen in the same thread that runs `execute`. \ No newline at end of file + * This is caused by having multiple threads issue methods on the `OutputCollector`. All emits, acks, and fails must happen on the same thread. One subtle way this can happen is if you make an `IBasicBolt` that emits on a separate thread. 
`IBasicBolt`s automatically ack after execute is called, so this would cause multiple threads to use the `OutputCollector` leading to this exception. When using a basic bolt, all emits must happen in the same thread that runs `execute`. Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Tutorial.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Tutorial.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Tutorial.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Tutorial.md Thu Mar 17 02:10:04 2016 @@ -2,12 +2,13 @@ title: Tutorial layout: documentation documentation: true +version: v0.10.0 --- In this tutorial, you'll learn how to create Storm topologies and deploy them to a Storm cluster. Java will be the main language used, but a few examples will use Python to illustrate Storm's multi-language capabilities. ## Preliminaries -This tutorial uses examples from the [storm-starter](https://github.com/apache/storm/blob/master/examples/storm-starter) project. It's recommended that you clone the project and follow along with the examples. Read [Setting up a development environment](Setting-up-development-environment.html) and [Creating a new Storm project](Creating-a-new-Storm-project.html) to get your machine set up. +This tutorial uses examples from the [storm-starter](https://github.com/apache/storm/blob/{{page.version}}/examples/storm-starter) project. It's recommended that you clone the project and follow along with the examples. Read [Setting up a development environment](Setting-up-development-environment.html) and [Creating a new Storm project](Creating-a-new-Storm-project.html) to get your machine set up. 
## Components of a Storm cluster @@ -103,11 +104,11 @@ This topology contains a spout and two b This code defines the nodes using the `setSpout` and `setBolt` methods. These methods take as input a user-specified id, an object containing the processing logic, and the amount of parallelism you want for the node. In this example, the spout is given id "words" and the bolts are given ids "exclaim1" and "exclaim2". -The object containing the processing logic implements the [IRichSpout](/javadoc/apidocs/backtype/storm/topology/IRichSpout.html) interface for spouts and the [IRichBolt](/javadoc/apidocs/backtype/storm/topology/IRichBolt.html) interface for bolts. +The object containing the processing logic implements the [IRichSpout](javadocs/backtype/storm/topology/IRichSpout.html) interface for spouts and the [IRichBolt](javadocs/backtype/storm/topology/IRichBolt.html) interface for bolts. The last parameter, how much parallelism you want for the node, is optional. It indicates how many threads should execute that component across the cluster. If you omit it, Storm will only allocate one thread for that node. -`setBolt` returns an [InputDeclarer](/javadoc/apidocs/backtype/storm/topology/InputDeclarer.html) object that is used to define the inputs to the Bolt. Here, component "exclaim1" declares that it wants to read all the tuples emitted by component "words" using a shuffle grouping, and component "exclaim2" declares that it wants to read all the tuples emitted by component "exclaim1" using a shuffle grouping. "shuffle grouping" means that tuples should be randomly distributed from the input tasks to the bolt's tasks. There are many ways to group data between components. These will be explained in a few sections. +`setBolt` returns an [InputDeclarer](javadocs/backtype/storm/topology/InputDeclarer.html) object that is used to define the inputs to the Bolt. 
Here, component "exclaim1" declares that it wants to read all the tuples emitted by component "words" using a shuffle grouping, and component "exclaim2" declares that it wants to read all the tuples emitted by component "exclaim1" using a shuffle grouping. "shuffle grouping" means that tuples should be randomly distributed from the input tasks to the bolt's tasks. There are many ways to group data between components. These will be explained in a few sections. If you wanted component "exclaim2" to read all the tuples emitted by both component "words" and component "exclaim1", you would write component "exclaim2"'s definition like this: @@ -163,7 +164,7 @@ public static class ExclamationBolt impl The `prepare` method provides the bolt with an `OutputCollector` that is used for emitting tuples from this bolt. Tuples can be emitted at any time from the bolt -- in the `prepare`, `execute`, or `cleanup` methods, or even asynchronously in another thread. This `prepare` implementation simply saves the `OutputCollector` as an instance variable to be used later on in the `execute` method. -The `execute` method receives a tuple from one of the bolt's inputs. The `ExclamationBolt` grabs the first field from the tuple and emits a new tuple with the string "!!!" appended to it. If you implement a bolt that subscribes to multiple input sources, you can find out which component the [Tuple](/javadoc/apidocs/backtype/storm/tuple/Tuple.html) came from by using the `Tuple#getSourceComponent` method. +The `execute` method receives a tuple from one of the bolt's inputs. The `ExclamationBolt` grabs the first field from the tuple and emits a new tuple with the string "!!!" appended to it. If you implement a bolt that subscribes to multiple input sources, you can find out which component the [Tuple](javadocs/backtype/storm/tuple/Tuple.html) came from by using the `Tuple#getSourceComponent` method. 
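As an illustration, a bolt subscribed to multiple components might branch on the source inside `execute`. This is only a sketch: the component id "words" follows the example topology above, and the per-branch handling is invented for demonstration.

```java
// Sketch: routing logic for a bolt with several input components.
// The ids and the per-branch behavior are illustrative, not from storm-starter.
public void execute(Tuple tuple) {
    if ("words".equals(tuple.getSourceComponent())) {
        // tuples straight from the spout: append the exclamation marks
        _collector.emit(tuple, new Values(tuple.getString(0) + "!!!"));
    } else {
        // tuples from any other input: pass them through unchanged
        _collector.emit(tuple, new Values(tuple.getString(0)));
    }
    _collector.ack(tuple);
}
```

Note that the emitted tuple is anchored to the input tuple and the input is acked on every path, keeping the reliability semantics described below intact.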
There are a few other things going on in the `execute` method, namely that the input tuple is passed as the first argument to `emit` and the input tuple is acked on the final line. These are part of Storm's reliability API for guaranteeing no data loss and will be explained later in this tutorial. @@ -225,7 +226,7 @@ The configuration is used to tune variou 1. **TOPOLOGY_WORKERS** (set with `setNumWorkers`) specifies how many _processes_ you want allocated around the cluster to execute the topology. Each component in the topology will execute as many _threads_. The number of threads allocated to a given component is configured through the `setBolt` and `setSpout` methods. Those _threads_ exist within worker _processes_. Each worker _process_ contains within it some number of _threads_ for some number of components. For instance, you may have 300 threads specified across all your components and 50 worker processes specified in your config. Each worker process will execute 6 threads, each of which could belong to a different component. You tune the performance of Storm topologies by tweaking the parallelism for each component and the number of worker processes those threads should run within. 2. **TOPOLOGY_DEBUG** (set with `setDebug`), when set to true, tells Storm to log every message emitted by a component. This is useful in local mode when testing topologies, but you probably want to keep this turned off when running topologies on the cluster. -There are many other configurations you can set for the topology. The various configurations are detailed on [the Javadoc for Config](/javadoc/apidocs/backtype/storm/Config.html). +There are many other configurations you can set for the topology. The various configurations are detailed on [the Javadoc for Config](javadocs/backtype/storm/Config.html). 
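Taken together, these settings are typically applied at submission time. The following is a minimal sketch; the topology name and the numbers are placeholders, not recommendations:

```java
// Sketch: applying common topology configurations before submitting.
// Worker count and debug flag are placeholder values; tune for your cluster.
Config conf = new Config();
conf.setNumWorkers(50);  // TOPOLOGY_WORKERS: worker processes across the cluster
conf.setDebug(false);    // TOPOLOGY_DEBUG: avoid logging every tuple in production
StormSubmitter.submitTopology("exclamation-topology", conf, builder.createTopology());
```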
To learn about how to set up your development environment so that you can run topologies in local mode (such as in Eclipse), see [Creating a new Storm project](Creating-a-new-Storm-project.html). @@ -237,7 +238,7 @@ A stream grouping tells a topology how t When a task for Bolt A emits a tuple to Bolt B, which task should it send the tuple to? -A "stream grouping" answers this question by telling Storm how to send tuples between sets of tasks. Before we dig into the different kinds of stream groupings, let's take a look at another topology from [storm-starter](http://github.com/apache/storm/blob/master/examples/storm-starter). This [WordCountTopology](https://github.com/apache/storm/blob/master/examples/storm-starter/src/jvm/storm/starter/WordCountTopology.java) reads sentences off of a spout and streams out of `WordCountBolt` the total number of times it has seen that word before: +A "stream grouping" answers this question by telling Storm how to send tuples between sets of tasks. Before we dig into the different kinds of stream groupings, let's take a look at another topology from [storm-starter](http://github.com/apache/storm/blob/{{page.version}}/examples/storm-starter). This [WordCountTopology](https://github.com/apache/storm/blob/{{page.version}}/examples/storm-starter/src/jvm/storm/starter/WordCountTopology.java) reads sentences off of a spout and streams out of `WordCountBolt` the total number of times it has seen that word before: ```java TopologyBuilder builder = new TopologyBuilder(); @@ -309,4 +310,4 @@ This tutorial showed how to do basic str ## Conclusion -This tutorial gave a broad overview of developing, testing, and deploying Storm topologies. The rest of the documentation dives deeper into all the aspects of using Storm. \ No newline at end of file +This tutorial gave a broad overview of developing, testing, and deploying Storm topologies. The rest of the documentation dives deeper into all the aspects of using Storm. 
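The stream-grouping discussion above can be sketched as a complete wiring. The class names below mirror the storm-starter word-count example but should be treated as assumptions about your checkout:

```java
// Sketch of the word-count wiring: shuffle grouping randomizes sentence
// distribution across splitter tasks, while the fields grouping on "word"
// sends equal words to the same counter task so the counts stay correct.
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("sentences", new RandomSentenceSpout(), 5);
builder.setBolt("split", new SplitSentence(), 8)
       .shuffleGrouping("sentences");
builder.setBolt("count", new WordCount(), 12)
       .fieldsGrouping("split", new Fields("word"));
```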
Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Understanding-the-parallelism-of-a-Storm-topology.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Understanding-the-parallelism-of-a-Storm-topology.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Understanding-the-parallelism-of-a-Storm-topology.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Understanding-the-parallelism-of-a-Storm-topology.md Thu Mar 17 02:10:04 2016 @@ -2,6 +2,7 @@ title: Understanding the Parallelism of a Storm Topology layout: documentation documentation: true +version: v0.10.0 --- ## What makes a running topology: worker processes, executors and tasks @@ -30,25 +31,25 @@ The following sections give an overview ### Number of worker processes * Description: How many worker processes to create _for the topology_ across machines in the cluster. -* Configuration option: [TOPOLOGY_WORKERS](/javadoc/apidocs/backtype/storm/Config.html#TOPOLOGY_WORKERS) +* Configuration option: [TOPOLOGY_WORKERS](javadocs/backtype/storm/Config.html#TOPOLOGY_WORKERS) * How to set in your code (examples): - * [Config#setNumWorkers](/javadoc/apidocs/backtype/storm/Config.html) + * [Config#setNumWorkers](javadocs/backtype/storm/Config.html) ### Number of executors (threads) * Description: How many executors to spawn _per component_. * Configuration option: ? 
* How to set in your code (examples): - * [TopologyBuilder#setSpout()](/javadoc/apidocs/backtype/storm/topology/TopologyBuilder.html) - * [TopologyBuilder#setBolt()](/javadoc/apidocs/backtype/storm/topology/TopologyBuilder.html) + * [TopologyBuilder#setSpout()](javadocs/backtype/storm/topology/TopologyBuilder.html) + * [TopologyBuilder#setBolt()](javadocs/backtype/storm/topology/TopologyBuilder.html) * Note that as of Storm 0.8 the ``parallelism_hint`` parameter now specifies the initial number of executors (not tasks!) for that bolt. ### Number of tasks * Description: How many tasks to create _per component_. -* Configuration option: [TOPOLOGY_TASKS](/javadoc/apidocs/backtype/storm/Config.html#TOPOLOGY_TASKS) +* Configuration option: [TOPOLOGY_TASKS](javadocs/backtype/storm/Config.html#TOPOLOGY_TASKS) * How to set in your code (examples): - * [ComponentConfigurationDeclarer#setNumTasks()](/javadoc/apidocs/backtype/storm/topology/ComponentConfigurationDeclarer.html) + * [ComponentConfigurationDeclarer#setNumTasks()](javadocs/backtype/storm/topology/ComponentConfigurationDeclarer.html) Here is an example code snippet to show these settings in practice: @@ -91,7 +92,7 @@ StormSubmitter.submitTopology( And of course Storm comes with additional configuration settings to control the parallelism of a topology, including: -* [TOPOLOGY_MAX_TASK_PARALLELISM](/javadoc/apidocs/backtype/storm/Config.html#TOPOLOGY_MAX_TASK_PARALLELISM): This setting puts a ceiling on the number of executors that can be spawned for a single component. It is typically used during testing to limit the number of threads spawned when running a topology in local mode. You can set this option via e.g. [Config#setMaxTaskParallelism()](/javadoc/apidocs/backtype/storm/Config.html#setMaxTaskParallelism(int)). 
+* [TOPOLOGY_MAX_TASK_PARALLELISM](javadocs/backtype/storm/Config.html#TOPOLOGY_MAX_TASK_PARALLELISM): This setting puts a ceiling on the number of executors that can be spawned for a single component. It is typically used during testing to limit the number of threads spawned when running a topology in local mode. You can set this option via e.g. [Config#setMaxTaskParallelism()](javadocs/backtype/storm/Config.html#setMaxTaskParallelism(int)). ## How to change the parallelism of a running topology @@ -119,5 +120,5 @@ $ storm rebalance mytopology -n 5 -e blu * [Running topologies on a production cluster](Running-topologies-on-a-production-cluster.html) * [Local mode](Local-mode.html) * [Tutorial](Tutorial.html) -* [Storm API documentation](/javadoc/apidocs/), most notably the class ``Config`` +* [Storm API documentation](javadocs/), most notably the class ``Config`` Modified: storm/branches/bobby-versioned-site/releases/0.10.0/Using-non-JVM-languages-with-Storm.md URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/Using-non-JVM-languages-with-Storm.md?rev=1735360&r1=1735359&r2=1735360&view=diff ============================================================================== --- storm/branches/bobby-versioned-site/releases/0.10.0/Using-non-JVM-languages-with-Storm.md (original) +++ storm/branches/bobby-versioned-site/releases/0.10.0/Using-non-JVM-languages-with-Storm.md Thu Mar 17 02:10:04 2016 @@ -1,5 +1,6 @@ --- layout: documentation +version: v0.10.0 --- - two pieces: creating topologies and implementing spouts and bolts in other languages - creating topologies in another language is easy since topologies are just thrift structures (link to storm.thrift) @@ -49,4 +50,4 @@ Then you can connect to Nimbus using the ``` void submitTopology(1: string name, 2: string uploadedJarLocation, 3: string jsonConf, 4: StormTopology topology) throws (1: AlreadyAliveException e, 2: InvalidTopologyException ite); -``` \ No newline at end of file 
+``` Added: storm/branches/bobby-versioned-site/releases/0.10.0/images/architecture.png URL: http://svn.apache.org/viewvc/storm/branches/bobby-versioned-site/releases/0.10.0/images/architecture.png?rev=1735360&view=auto ============================================================================== Binary file - no diff available. Propchange: storm/branches/bobby-versioned-site/releases/0.10.0/images/architecture.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream
