http://git-wip-us.apache.org/repos/asf/storm-site/blob/6e122a12/content/releases/1.2.1/Command-line-client.html ---------------------------------------------------------------------- diff --git a/content/releases/1.2.1/Command-line-client.html b/content/releases/1.2.1/Command-line-client.html deleted file mode 100644 index 1359555..0000000 --- a/content/releases/1.2.1/Command-line-client.html +++ /dev/null @@ -1,501 +0,0 @@ -<!DOCTYPE html> -<html> - <head> - <meta charset="utf-8"> - <meta http-equiv="X-UA-Compatible" content="IE=edge"> - <meta name="viewport" content="width=device-width, initial-scale=1"> - - <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon"> - <link rel="icon" href="/favicon.ico" type="image/x-icon"> - - <title>Command Line Client</title> - - <!-- Bootstrap core CSS --> - <link href="/assets/css/bootstrap.min.css" rel="stylesheet"> - <!-- Bootstrap theme --> - <link href="/assets/css/bootstrap-theme.min.css" rel="stylesheet"> - - <!-- Custom styles for this template --> - <link rel="stylesheet" href="http://fortawesome.github.io/Font-Awesome/assets/font-awesome/css/font-awesome.css"> - <link href="/css/style.css" rel="stylesheet"> - <link href="/assets/css/owl.theme.css" rel="stylesheet"> - <link href="/assets/css/owl.carousel.css" rel="stylesheet"> - <script type="text/javascript" src="/assets/js/jquery.min.js"></script> - <script type="text/javascript" src="/assets/js/bootstrap.min.js"></script> - <script type="text/javascript" src="/assets/js/owl.carousel.min.js"></script> - <script type="text/javascript" src="/assets/js/storm.js"></script> - <!-- Just for debugging purposes. Don't actually copy these 2 lines! 
--> - <!--[if lt IE 9]><script src="../../assets/js/ie8-responsive-file-warning.js"></script><![endif]--> - - <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries --> - <!--[if lt IE 9]> - <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script> - <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script> - <![endif]--> - </head> - - - <body> - <header> - <div class="container-fluid"> - <div class="row"> - <div class="col-md-5"> - <a href="/index.html"><img src="/images/logo.png" class="logo" /></a> - </div> - <div class="col-md-5"> - - <h1>Version: 1.2.1</h1> - - </div> - <div class="col-md-2"> - <a href="/downloads.html" class="btn-std btn-block btn-download">Download</a> - </div> - </div> - </div> -</header> -<!--Header End--> -<!--Navigation Begin--> -<div class="navbar" role="banner"> - <div class="container-fluid"> - <div class="navbar-header"> - <button class="navbar-toggle" type="button" data-toggle="collapse" data-target=".bs-navbar-collapse"> - <span class="icon-bar"></span> - <span class="icon-bar"></span> - <span class="icon-bar"></span> - </button> - </div> - <nav class="collapse navbar-collapse bs-navbar-collapse" role="navigation"> - <ul class="nav navbar-nav"> - <li><a href="/index.html" id="home">Home</a></li> - <li><a href="/getting-help.html" id="getting-help">Getting Help</a></li> - <li><a href="/about/integrates.html" id="project-info">Project Information</a></li> - <li class="dropdown"> - <a href="#" class="dropdown-toggle" data-toggle="dropdown" id="documentation">Documentation <b class="caret"></b></a> - <ul class="dropdown-menu"> - - - <li><a href="/releases/2.0.0-SNAPSHOT/index.html">2.0.0-SNAPSHOT</a></li> - - - - <li><a href="/releases/1.2.1/index.html">1.2.1</a></li> - - - - <li><a href="/releases/1.1.2/index.html">1.1.2</a></li> - - - - - - <li><a href="/releases/1.0.6/index.html">1.0.6</a></li> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - </ul> - </li> 
- <li><a href="/talksAndVideos.html">Talks and Slideshows</a></li> - <li class="dropdown"> - <a href="#" class="dropdown-toggle" data-toggle="dropdown" id="contribute">Community <b class="caret"></b></a> - <ul class="dropdown-menu"> - <li><a href="/contribute/Contributing-to-Storm.html">Contributing</a></li> - <li><a href="/contribute/People.html">People</a></li> - <li><a href="/contribute/BYLAWS.html">ByLaws</a></li> - </ul> - </li> - <li><a href="/2018/06/04/storm122-released.html" id="news">News</a></li> - </ul> - </nav> - </div> -</div> - - - - <div class="container-fluid"> - <h1 class="page-title">Command Line Client</h1> - <div class="row"> - <div class="col-md-12"> - <!-- Documentation --> - -<p class="post-meta"></p> - -<div class="documentation-content"><p>This page describes all the commands that are possible with the "storm" command line client. To learn how to set up your "storm" client to talk to a remote cluster, follow the instructions in <a href="Setting-up-development-environment.html">Setting up development environment</a>. See <a href="Classpath-handling.html">Classpath handling</a> for details on using external libraries in these commands.</p> - -<p>These commands are:</p> - -<ol> -<li>jar</li> -<li>sql</li> -<li>kill</li> -<li>activate</li> -<li>deactivate</li> -<li>rebalance</li> -<li>repl</li> -<li>classpath</li> -<li>localconfvalue</li> -<li>remoteconfvalue</li> -<li>nimbus</li> -<li>supervisor</li> -<li>ui</li> -<li>drpc</li> -<li>blobstore</li> -<li>dev-zookeeper</li> -<li>get-errors</li> -<li>heartbeats</li> -<li>kill_workers</li> -<li>list</li> -<li>logviewer</li> -<li>monitor</li> -<li>node-health-check</li> -<li>pacemaker</li> -<li>set_log_level</li> -<li>shell</li> -<li>upload-credentials</li> -<li>version</li> -<li>help</li> -</ol> - -<h3 id="jar">jar</h3> - -<p>Syntax: <code>storm jar topology-jar-path class ...</code></p> - -<p>Runs the main method of <code>class</code> with the specified arguments. 
The storm jars and configs in <code>~/.storm</code> are put on the classpath. The process is configured so that <a href="javadocs/org/apache/storm/StormSubmitter.html">StormSubmitter</a> will upload the jar at <code>topology-jar-path</code> when the topology is submitted.</p> - -<p>To ship additional jars that are not included in the application jar, pass them to the <code>--jars</code> option as a comma-separated string. -For example, --jars "your-local-jar.jar,your-local-jar2.jar" will load your-local-jar.jar and your-local-jar2.jar. -To ship Maven artifacts and their transitive dependencies, pass them to <code>--artifacts</code> as a comma-separated string. You can also exclude some dependencies, as you would in a Maven pom: list the artifacts to exclude after the artifact, separated by '^'. For example, <code>--artifacts "redis.clients:jedis:2.9.0,org.apache.kafka:kafka_2.10:0.8.2.2^org.slf4j:slf4j-log4j12"</code> will load the jedis and kafka artifacts and all of their transitive dependencies, but exclude slf4j-log4j12 from kafka.</p> - -<p>To pull artifacts from repositories other than Maven Central, pass the remote repositories to the --artifactRepositories option as a comma-separated string. The repository format is "&lt;name&gt;^&lt;url&gt;". '^' is used as the separator because URLs can contain a wide range of characters. 
For example, <code>--artifactRepositories "jboss-repository^http://repository.jboss.com/maven2,HDPRepo^http://repo.hortonworks.com/content/groups/public/"</code> will add the JBoss and HDP repositories to the dependency resolver.</p> - -<p>A complete example using all three options: <code>./bin/storm jar example/storm-starter/storm-starter-topologies-*.jar org.apache.storm.starter.RollingTopWords blobstore-remote2 remote --jars "./external/storm-redis/storm-redis-1.1.0.jar,./external/storm-kafka/storm-kafka-1.1.0.jar" --artifacts "redis.clients:jedis:2.9.0,org.apache.kafka:kafka_2.10:0.8.2.2^org.slf4j:slf4j-log4j12" --artifactRepositories "jboss-repository^http://repository.jboss.com/maven2,HDPRepo^http://repo.hortonworks.com/content/groups/public/"</code></p> - -<p>When you pass the jars and/or artifacts options, StormSubmitter will upload them when the topology is submitted, and they will be included in the classpath of both the process that runs the class and the workers for that topology.</p> - -<h3 id="sql">sql</h3> - -<p>Syntax: <code>storm sql sql-file topology-name</code></p> - -<p>Compiles the SQL statements into a Trident topology and submits it to Storm.</p> - -<p>The <code>--jars</code>, <code>--artifacts</code>, and <code>--artifactRepositories</code> options available for jar also apply to the sql command. Please refer to "help jar" to see how to use the --jars, --artifacts, and --artifactRepositories options. You normally want to pass these options, since the data source for your SQL is an external storage system in many cases.</p> - -<h3 id="kill">kill</h3> - -<p>Syntax: <code>storm kill topology-name [-w wait-time-secs]</code></p> - -<p>Kills the topology with the name <code>topology-name</code>. 
Storm will first deactivate the topology's spouts for the duration of the topology's message timeout to allow all messages currently being processed to finish processing. Storm will then shut down the workers and clean up their state. You can override the length of time Storm waits between deactivation and shutdown with the -w flag.</p> - -<h3 id="activate">activate</h3> - -<p>Syntax: <code>storm activate topology-name</code></p> - -<p>Activates the specified topology's spouts.</p> - -<h3 id="deactivate">deactivate</h3> - -<p>Syntax: <code>storm deactivate topology-name</code></p> - -<p>Deactivates the specified topology's spouts.</p> - -<h3 id="rebalance">rebalance</h3> - -<p>Syntax: <code>storm rebalance topology-name [-w wait-time-secs] [-n new-num-workers] [-e component=parallelism]*</code></p> - -<p>Sometimes you may wish to spread out where the workers for a topology are running. For example, suppose you have a 10-node cluster running 4 workers per node, and you then add another 10 nodes to the cluster. You may wish to have Storm spread out the workers for the running topology so that each node runs 2 workers. One way to do this is to kill the topology and resubmit it, but the "rebalance" command is an easier way to achieve the same result.</p> - -<p>Rebalance will first deactivate the topology for the duration of the message timeout (overridable with the -w flag) and then redistribute the workers evenly around the cluster. The topology will then return to its previous state of activation (so a deactivated topology will still be deactivated and an activated topology will go back to being activated).</p> - -<p>The rebalance command can also be used to change the parallelism of a running topology. 
Use the -n and -e switches to change the number of workers or number of executors of a component respectively.</p> - -<h3 id="repl">repl</h3> - -<p>Syntax: <code>storm repl</code></p> - -<p>Opens up a Clojure REPL with the storm jars and configuration on the classpath. Useful for debugging.</p> - -<h3 id="classpath">classpath</h3> - -<p>Syntax: <code>storm classpath</code></p> - -<p>Prints the classpath used by the storm client when running commands.</p> - -<h3 id="localconfvalue">localconfvalue</h3> - -<p>Syntax: <code>storm localconfvalue conf-name</code></p> - -<p>Prints out the value for <code>conf-name</code> in the local Storm configs. The local Storm configs are the ones in <code>~/.storm/storm.yaml</code> merged in with the configs in <code>defaults.yaml</code>.</p> - -<h3 id="remoteconfvalue">remoteconfvalue</h3> - -<p>Syntax: <code>storm remoteconfvalue conf-name</code></p> - -<p>Prints out the value for <code>conf-name</code> in the cluster's Storm configs. The cluster's Storm configs are the ones in <code>$STORM-PATH/conf/storm.yaml</code> merged in with the configs in <code>defaults.yaml</code>. This command must be run on a cluster machine.</p> - -<h3 id="nimbus">nimbus</h3> - -<p>Syntax: <code>storm nimbus</code></p> - -<p>Launches the nimbus daemon. This command should be run under supervision with a tool like <a href="http://cr.yp.to/daemontools.html">daemontools</a> or <a href="http://mmonit.com/monit/">monit</a>. See <a href="Setting-up-a-Storm-cluster.html">Setting up a Storm cluster</a> for more information.</p> - -<h3 id="supervisor">supervisor</h3> - -<p>Syntax: <code>storm supervisor</code></p> - -<p>Launches the supervisor daemon. This command should be run under supervision with a tool like <a href="http://cr.yp.to/daemontools.html">daemontools</a> or <a href="http://mmonit.com/monit/">monit</a>. 
See <a href="Setting-up-a-Storm-cluster.html">Setting up a Storm cluster</a> for more information.</p> - -<h3 id="ui">ui</h3> - -<p>Syntax: <code>storm ui</code></p> - -<p>Launches the UI daemon. The UI provides a web interface for a Storm cluster and shows detailed stats about running topologies. This command should be run under supervision with a tool like <a href="http://cr.yp.to/daemontools.html">daemontools</a> or <a href="http://mmonit.com/monit/">monit</a>. See <a href="Setting-up-a-Storm-cluster.html">Setting up a Storm cluster</a> for more information.</p> - -<h3 id="drpc">drpc</h3> - -<p>Syntax: <code>storm drpc</code></p> - -<p>Launches a DRPC daemon. This command should be run under supervision with a tool like <a href="http://cr.yp.to/daemontools.html">daemontools</a> or <a href="http://mmonit.com/monit/">monit</a>. See <a href="Distributed-RPC.html">Distributed RPC</a> for more information.</p> - -<h3 id="blobstore">blobstore</h3> - -<p>Syntax: <code>storm blobstore cmd</code></p> - -<p>list [KEY...] - lists blobs currently in the blob store.</p> - -<p>cat [-f FILE] KEY - read a blob and then write it either to a file or to STDOUT (requires read access).</p> - -<p>create [-f FILE] [-a ACL ...] [--replication-factor NUMBER] KEY - create a new blob. Contents come from FILE or STDIN. ACL is in the form [uo]:[username]:[r-][w-][a-] and can be a comma-separated list.</p> - -<p>update [-f FILE] KEY - update the contents of a blob. Contents come from FILE or STDIN (requires write access).</p> - -<p>delete KEY - delete an entry from the blob store (requires write access).</p> - -<p>set-acl [-s ACL] KEY - ACL is in the form [uo]:[username]:[r-][w-][a-] and can be a comma-separated list (requires admin access).</p> - -<p>replication --read KEY - read the replication factor of the blob.</p> - -<p>replication --update --replication-factor NUMBER KEY where NUMBER > 0. 
It is used to update the replication factor of a blob.</p> - -<p>For example, the following would create a mytopo:data.tgz key using the data stored in data.tgz. User alice would have full access, bob would have read/write access, and everyone else would have read access.</p> - -<p><code>storm blobstore create mytopo:data.tgz -f data.tgz -a u:alice:rwa,u:bob:rw,o::r</code></p> - -<p>See <a href="distcache-blobstore.html">Blobstore (Distcache)</a> for more information.</p> - -<h3 id="dev-zookeeper">dev-zookeeper</h3> - -<p>Syntax: <code>storm dev-zookeeper</code></p> - -<p>Launches a fresh Zookeeper server using "dev.zookeeper.path" as its local dir and "storm.zookeeper.port" as its port. This is intended only for development and testing; the Zookeeper instance launched is not configured for production use.</p> - -<h3 id="get-errors">get-errors</h3> - -<p>Syntax: <code>storm get-errors topology-name</code></p> - -<p>Gets the latest error from the running topology. The returned result contains key-value pairs of component-name and component-error for the components in error. The result is returned in JSON format.</p> - -<h3 id="heartbeats">heartbeats</h3> - -<p>Syntax: <code>storm heartbeats [cmd]</code></p> - -<p>list PATH - lists heartbeat nodes under PATH currently in the ClusterState. -get PATH - gets the heartbeat data at PATH.</p> - -<h3 id="kill_workers">kill_workers</h3> - -<p>Syntax: <code>storm kill_workers</code></p> - -<p>Kills the workers running on this supervisor. This command should be run on a supervisor node. If the cluster is running in secure mode, the user needs admin rights on the node to be able to kill all workers successfully.</p> - -<h3 id="list">list</h3> - -<p>Syntax: <code>storm list</code></p> - -<p>Lists the running topologies and their statuses.</p> - -<h3 id="logviewer">logviewer</h3> - -<p>Syntax: <code>storm logviewer</code></p> - -<p>Launches the log viewer daemon. It provides a web interface for viewing storm log files. 
This command should be run under supervision with a tool like daemontools or monit.</p> - -<p>See Setting up a Storm cluster for more information (<a href="http://storm.apache.org/documentation/Setting-up-a-Storm-cluster">http://storm.apache.org/documentation/Setting-up-a-Storm-cluster</a>).</p> - -<h3 id="monitor">monitor</h3> - -<p>Syntax: <code>storm monitor topology-name [-i interval-secs] [-m component-id] [-s stream-id] [-w [emitted | transferred]]</code></p> - -<p>Monitors the given topology's throughput interactively. -You can specify poll-interval, component-id, stream-id, and watch-item [emitted | transferred]. - By default, - poll-interval is 4 seconds; - all component-ids will be listed; - stream-id is 'default'; - watch-item is 'emitted'.</p> - -<h3 id="node-health-check">node-health-check</h3> - -<p>Syntax: <code>storm node-health-check</code></p> - -<p>Runs health checks on the local supervisor.</p> - -<h3 id="pacemaker">pacemaker</h3> - -<p>Syntax: <code>storm pacemaker</code></p> - -<p>Launches the Pacemaker daemon. This command should be run under -supervision with a tool like daemontools or monit.</p> - -<p>See Setting up a Storm cluster for more information (<a href="http://storm.apache.org/documentation/Setting-up-a-Storm-cluster">http://storm.apache.org/documentation/Setting-up-a-Storm-cluster</a>).</p> - -<h3 id="set_log_level">set_log_level</h3> - -<p>Syntax: <code>storm set_log_level -l [logger name]=[log level][:optional timeout] -r [logger name] topology-name</code></p> - -<p>Dynamically changes topology log levels,</p> - -<p>where log level is one of: ALL, TRACE, DEBUG, INFO, WARN, ERROR, FATAL, OFF -and timeout is an integer number of seconds.</p> - -<p>e.g. 
- ./bin/storm set_log_level -l ROOT=DEBUG:30 topology-name</p> - -<p>Sets the root logger's level to DEBUG for 30 seconds.</p> - -<p>./bin/storm set_log_level -l com.myapp=WARN topology-name</p> - -<p>Sets the com.myapp logger's level to WARN indefinitely, since no timeout is given.</p> - -<p>./bin/storm set_log_level -l com.myapp=WARN -l com.myOtherLogger=ERROR:123 topology-name</p> - -<p>Sets the com.myapp logger's level to WARN indefinitely, and com.myOtherLogger to ERROR for 123 seconds.</p> - -<p>./bin/storm set_log_level -r com.myOtherLogger topology-name</p> - -<p>Clears the settings, resetting the logger back to its original level.</p> - -<h3 id="shell">shell</h3> - -<p>Syntax: <code>storm shell resourcesdir command args</code></p> - -<p>Constructs a jar and uploads it to Nimbus, for use with non-JVM languages.</p> - -<p>e.g. <code>storm shell resources/ python topology.py arg1 arg2</code></p> - -<h3 id="upload-credentials">upload-credentials</h3> - -<p>Syntax: <code>storm upload_credentials topology-name [credkey credvalue]*</code></p> - -<p>Uploads a new set of credentials to a running topology.</p> - -<h3 id="version">version</h3> - -<p>Syntax: <code>storm version</code></p> - -<p>Prints the version number of this Storm release.</p> - -<h3 id="help">help</h3> - -<p>Syntax: <code>storm help [command]</code></p> - -<p>Prints the help message for one command, or the list of available commands.</p> -</div> - - - </div> - </div> - </div> -<footer> - <div class="container-fluid"> - <div class="row"> - <div class="col-md-3"> - <div class="footer-widget"> - <h5>Meetups</h5> - <ul class="latest-news"> - - <li><a href="http://www.meetup.com/Apache-Storm-Apache-Kafka/">Apache Storm & Apache Kafka</a> <span class="small">(Sunnyvale, CA)</span></li> - - <li><a href="http://www.meetup.com/Apache-Storm-Kafka-Users/">Apache Storm & Kafka Users</a> <span class="small">(Seattle, WA)</span></li> - - <li><a href="http://www.meetup.com/New-York-City-Storm-User-Group/">NYC Storm User Group</a> <span class="small">(New York, NY)</span></li> - - 
<li><a href="http://www.meetup.com/Bay-Area-Stream-Processing">Bay Area Stream Processing</a> <span class="small">(Emeryville, CA)</span></li> - - <li><a href="http://www.meetup.com/Boston-Storm-Users/">Boston Realtime Data</a> <span class="small">(Boston, MA)</span></li> - - <li><a href="http://www.meetup.com/storm-london">London Storm User Group</a> <span class="small">(London, UK)</span></li> - - <!-- <li><a href="http://www.meetup.com/Apache-Storm-Kafka-Users/">Seatle, WA</a> <span class="small">(27 Jun 2015)</span></li> --> - </ul> - </div> - </div> - <div class="col-md-3"> - <div class="footer-widget"> - <h5>About Storm</h5> - <p>Storm integrates with any queueing system and any database system. Storm's spout abstraction makes it easy to integrate a new queuing system. Likewise, integrating Storm with database systems is easy.</p> - </div> - </div> - <div class="col-md-3"> - <div class="footer-widget"> - <h5>First Look</h5> - <ul class="footer-list"> - <li><a href="/releases/current/Rationale.html">Rationale</a></li> - <li><a href="/releases/current/Tutorial.html">Tutorial</a></li> - <li><a href="/releases/current/Setting-up-development-environment.html">Setting up development environment</a></li> - <li><a href="/releases/current/Creating-a-new-Storm-project.html">Creating a new Storm project</a></li> - </ul> - </div> - </div> - <div class="col-md-3"> - <div class="footer-widget"> - <h5>Documentation</h5> - <ul class="footer-list"> - <li><a href="/releases/current/index.html">Index</a></li> - <li><a href="/releases/current/javadocs/index.html">Javadoc</a></li> - <li><a href="/releases/current/FAQ.html">FAQ</a></li> - </ul> - </div> - </div> - </div> - <hr/> - <div class="row"> - <div class="col-md-12"> - <p align="center">Copyright © 2015 <a href="http://www.apache.org">Apache Software Foundation</a>. All Rights Reserved. - <br>Apache Storm, Apache, the Apache feather logo, and the Apache Storm project logos are trademarks of The Apache Software Foundation. 
- <br>All other marks mentioned may be trademarks or registered trademarks of their respective owners.</p> - </div> - </div> - </div> -</footer> -<!--Footer End--> -<!-- Scroll to top --> -<span class="totop"><a href="#"><i class="fa fa-angle-up"></i></a></span> - -</body> - -</html> -
http://git-wip-us.apache.org/repos/asf/storm-site/blob/6e122a12/content/releases/1.2.1/Common-patterns.html ---------------------------------------------------------------------- diff --git a/content/releases/1.2.1/Common-patterns.html b/content/releases/1.2.1/Common-patterns.html deleted file mode 100644 index 6b33baa..0000000 --- a/content/releases/1.2.1/Common-patterns.html +++ /dev/null @@ -1,290 +0,0 @@ -<!DOCTYPE html> -<html> - <head> - <meta charset="utf-8"> - <meta http-equiv="X-UA-Compatible" content="IE=edge"> - <meta name="viewport" content="width=device-width, initial-scale=1"> - - <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon"> - <link rel="icon" href="/favicon.ico" type="image/x-icon"> - - <title>Common Topology Patterns</title> - - <!-- Bootstrap core CSS --> - <link href="/assets/css/bootstrap.min.css" rel="stylesheet"> - <!-- Bootstrap theme --> - <link href="/assets/css/bootstrap-theme.min.css" rel="stylesheet"> - - <!-- Custom styles for this template --> - <link rel="stylesheet" href="http://fortawesome.github.io/Font-Awesome/assets/font-awesome/css/font-awesome.css"> - <link href="/css/style.css" rel="stylesheet"> - <link href="/assets/css/owl.theme.css" rel="stylesheet"> - <link href="/assets/css/owl.carousel.css" rel="stylesheet"> - <script type="text/javascript" src="/assets/js/jquery.min.js"></script> - <script type="text/javascript" src="/assets/js/bootstrap.min.js"></script> - <script type="text/javascript" src="/assets/js/owl.carousel.min.js"></script> - <script type="text/javascript" src="/assets/js/storm.js"></script> - <!-- Just for debugging purposes. Don't actually copy these 2 lines! 
--> - <!--[if lt IE 9]><script src="../../assets/js/ie8-responsive-file-warning.js"></script><![endif]--> - - <!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries --> - <!--[if lt IE 9]> - <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script> - <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script> - <![endif]--> - </head> - - - <body> - <header> - <div class="container-fluid"> - <div class="row"> - <div class="col-md-5"> - <a href="/index.html"><img src="/images/logo.png" class="logo" /></a> - </div> - <div class="col-md-5"> - - <h1>Version: 1.2.1</h1> - - </div> - <div class="col-md-2"> - <a href="/downloads.html" class="btn-std btn-block btn-download">Download</a> - </div> - </div> - </div> -</header> -<!--Header End--> -<!--Navigation Begin--> -<div class="navbar" role="banner"> - <div class="container-fluid"> - <div class="navbar-header"> - <button class="navbar-toggle" type="button" data-toggle="collapse" data-target=".bs-navbar-collapse"> - <span class="icon-bar"></span> - <span class="icon-bar"></span> - <span class="icon-bar"></span> - </button> - </div> - <nav class="collapse navbar-collapse bs-navbar-collapse" role="navigation"> - <ul class="nav navbar-nav"> - <li><a href="/index.html" id="home">Home</a></li> - <li><a href="/getting-help.html" id="getting-help">Getting Help</a></li> - <li><a href="/about/integrates.html" id="project-info">Project Information</a></li> - <li class="dropdown"> - <a href="#" class="dropdown-toggle" data-toggle="dropdown" id="documentation">Documentation <b class="caret"></b></a> - <ul class="dropdown-menu"> - - - <li><a href="/releases/2.0.0-SNAPSHOT/index.html">2.0.0-SNAPSHOT</a></li> - - - - <li><a href="/releases/1.2.1/index.html">1.2.1</a></li> - - - - <li><a href="/releases/1.1.2/index.html">1.1.2</a></li> - - - - - - <li><a href="/releases/1.0.6/index.html">1.0.6</a></li> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - </ul> - </li> 
- <li><a href="/talksAndVideos.html">Talks and Slideshows</a></li> - <li class="dropdown"> - <a href="#" class="dropdown-toggle" data-toggle="dropdown" id="contribute">Community <b class="caret"></b></a> - <ul class="dropdown-menu"> - <li><a href="/contribute/Contributing-to-Storm.html">Contributing</a></li> - <li><a href="/contribute/People.html">People</a></li> - <li><a href="/contribute/BYLAWS.html">ByLaws</a></li> - </ul> - </li> - <li><a href="/2018/06/04/storm122-released.html" id="news">News</a></li> - </ul> - </nav> - </div> -</div> - - - - <div class="container-fluid"> - <h1 class="page-title">Common Topology Patterns</h1> - <div class="row"> - <div class="col-md-12"> - <!-- Documentation --> - -<p class="post-meta"></p> - -<div class="documentation-content"><p>This page lists a variety of common patterns in Storm topologies.</p> - -<ol> -<li>Batching</li> -<li>BasicBolt</li> -<li>In-memory caching + fields grouping combo</li> -<li>Streaming top N</li> -<li>TimeCacheMap for efficiently keeping a cache of things that have been recently updated</li> -<li>CoordinatedBolt and KeyedFairBolt for Distributed RPC</li> -</ol> - -<h3 id="batching">Batching</h3> - -<p>Oftentimes for efficiency reasons or otherwise, you want to process a group of tuples in batch rather than individually. For example, you may want to batch updates to a database or do a streaming aggregation of some sort.</p> - -<p>If you want reliability in your data processing, the right way to do this is to hold on to tuples in an instance variable while the bolt waits to do the batching. Once you do the batch operation, you then ack all the tuples you were holding onto.</p> - -<p>If the bolt emits tuples, then you may want to use multi-anchoring to ensure reliability. It all depends on the specific application. 
See <a href="Guaranteeing-message-processing.html">Guaranteeing message processing</a> for more details on how reliability works.</p> - -<h3 id="basicbolt">BasicBolt</h3> - -<p>Many bolts follow a similar pattern of reading an input tuple, emitting zero or more tuples based on that input tuple, and then acking that input tuple immediately at the end of the execute method. Bolts that match this pattern are things like functions and filters. This is such a common pattern that Storm exposes an interface called <a href="javadocs/org/apache/storm/topology/IBasicBolt.html">IBasicBolt</a> that automates this pattern for you. See <a href="Guaranteeing-message-processing.html">Guaranteeing message processing</a> for more information.</p> - -<h3 id="in-memory-caching-fields-grouping-combo">In-memory caching + fields grouping combo</h3> - -<p>It's common to keep caches in-memory in Storm bolts. Caching becomes particularly powerful when you combine it with a fields grouping. For example, suppose you have a bolt that expands short URLs (like bit.ly, t.co, etc.) into long URLs. You can increase performance by keeping an LRU cache of short URL to long URL expansions to avoid doing the same HTTP requests over and over. Suppose component "urls" emits short URLs, and component "expand" expands short URLs into long URLs and keeps a cache internally. 
Consider the difference between the two following snippets of code:</p> -<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">builder</span><span class="o">.</span><span class="na">setBolt</span><span class="o">(</span><span class="s">"expand"</span><span class="o">,</span> <span class="k">new</span> <span class="n">ExpandUrl</span><span class="o">(),</span> <span class="n">parallelism</span><span class="o">)</span> - <span class="o">.</span><span class="na">shuffleGrouping</span><span class="o">(</span><span class="s">"urls"</span><span class="o">);</span> -</code></pre></div><div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">builder</span><span class="o">.</span><span class="na">setBolt</span><span class="o">(</span><span class="s">"expand"</span><span class="o">,</span> <span class="k">new</span> <span class="n">ExpandUrl</span><span class="o">(),</span> <span class="n">parallelism</span><span class="o">)</span> - <span class="o">.</span><span class="na">fieldsGrouping</span><span class="o">(</span><span class="s">"urls"</span><span class="o">,</span> <span class="k">new</span> <span class="n">Fields</span><span class="o">(</span><span class="s">"url"</span><span class="o">));</span> -</code></pre></div> -<p>The second approach will have vastly more effective caches, since the same URL will always go to the same task. This avoids duplicating entries across the caches of different tasks and makes it much more likely that a short URL will hit the cache.</p> - -<h3 id="streaming-top-n">Streaming top N</h3> - -<p>A common continuous computation done on Storm is a "streaming top N" of some sort. Suppose you have a bolt that emits tuples of the form ["value", "count"] and you want a bolt that emits the top N tuples based on count. 
The simplest way to do this is to have a bolt that does a global grouping on the stream and maintains a list in memory of the top N items.</p> - -<p>This approach obviously doesn't scale to large streams since the entire stream has to go through one task. A better way to do the computation is to do many top N's in parallel across partitions of the stream, and then merge those top N's together to get the global top N. The pattern looks like this:</p> -<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">builder</span><span class="o">.</span><span class="na">setBolt</span><span class="o">(</span><span class="s">"rank"</span><span class="o">,</span> <span class="k">new</span> <span class="n">RankObjects</span><span class="o">(),</span> <span class="n">parallelism</span><span class="o">)</span> - <span class="o">.</span><span class="na">fieldsGrouping</span><span class="o">(</span><span class="s">"objects"</span><span class="o">,</span> <span class="k">new</span> <span class="n">Fields</span><span class="o">(</span><span class="s">"value"</span><span class="o">));</span> -<span class="n">builder</span><span class="o">.</span><span class="na">setBolt</span><span class="o">(</span><span class="s">"merge"</span><span class="o">,</span> <span class="k">new</span> <span class="n">MergeObjects</span><span class="o">())</span> - <span class="o">.</span><span class="na">globalGrouping</span><span class="o">(</span><span class="s">"rank"</span><span class="o">);</span> -</code></pre></div> -<p>This pattern works because of the fields grouping done by the first bolt which gives the partitioning you need for this to be semantically correct. 
You can see an example of this pattern in storm-starter <a href="http://github.com/apache/storm/blob/v1.2.1/examples/storm-starter/src/jvm/org/apache/storm/starter/RollingTopWords.java">here</a>.</p> - -<p>If, however, you have a known skew in the data being processed, it can be advantageous to use partialKeyGrouping instead of fieldsGrouping. This will distribute the load for each key between two downstream bolts instead of a single one.</p> -<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">builder</span><span class="o">.</span><span class="na">setBolt</span><span class="o">(</span><span class="s">"count"</span><span class="o">,</span> <span class="k">new</span> <span class="n">CountObjects</span><span class="o">(),</span> <span class="n">parallelism</span><span class="o">)</span> - <span class="o">.</span><span class="na">partialKeyGrouping</span><span class="o">(</span><span class="s">"objects"</span><span class="o">,</span> <span class="k">new</span> <span class="n">Fields</span><span class="o">(</span><span class="s">"value"</span><span class="o">));</span> -<span class="n">builder</span><span class="o">.</span><span class="na">setBolt</span><span class="o">(</span><span class="s">"rank"</span><span class="o">,</span> <span class="k">new</span> <span class="n">AggregateCountsAndRank</span><span class="o">(),</span> <span class="n">parallelism</span><span class="o">)</span> - <span class="o">.</span><span class="na">fieldsGrouping</span><span class="o">(</span><span class="s">"count"</span><span class="o">,</span> <span class="k">new</span> <span class="n">Fields</span><span class="o">(</span><span class="s">"key"</span><span class="o">));</span> -<span class="n">builder</span><span class="o">.</span><span class="na">setBolt</span><span class="o">(</span><span class="s">"merge"</span><span class="o">,</span> <span class="k">new</span> <span class="n">MergeRanksObjects</span><span class="o">())</span> - <span class="o">.</span><span 
class="na">globalGrouping</span><span class="o">(</span><span class="s">"rank"</span><span class="o">);</span> -</code></pre></div> -<p>The topology needs an extra layer of processing to aggregate the partial counts from the upstream bolts, but this layer only processes aggregated values, so the bolt is not subject to the load caused by the skewed data. You can see an example of this pattern in storm-starter <a href="http://github.com/apache/storm/blob/v1.2.1/examples/storm-starter/src/jvm/org/apache/storm/starter/SkewedRollingTopWords.java">here</a>.</p> - -<h3 id="timecachemap-for-efficiently-keeping-a-cache-of-things-that-have-been-recently-updated">TimeCacheMap for efficiently keeping a cache of things that have been recently updated</h3> - -<p>You sometimes want to keep a cache in memory of items that have been recently "active" and have items that have been inactive for some time expire automatically. <a href="javadocs/org/apache/storm/utils/TimeCacheMap.html">TimeCacheMap</a> is an efficient data structure for doing this and provides hooks so you can insert callbacks whenever an item is expired.</p> - -<h3 id="coordinatedbolt-and-keyedfairbolt-for-distributed-rpc">CoordinatedBolt and KeyedFairBolt for Distributed RPC</h3> - -<p>When building distributed RPC applications on top of Storm, there are two common patterns that are usually needed. These are encapsulated by <a href="javadocs/org/apache/storm/task/CoordinatedBolt.html">CoordinatedBolt</a> and <a href="javadocs/org/apache/storm/task/KeyedFairBolt.html">KeyedFairBolt</a> which are part of the "standard library" that ships with the Storm codebase.</p> - -<p><code>CoordinatedBolt</code> wraps the bolt containing your logic and figures out when your bolt has received all the tuples for any given request. 
It makes heavy use of direct streams to do this.</p> - -<p><code>KeyedFairBolt</code> also wraps the bolt containing your logic and makes sure your topology processes multiple DRPC invocations at the same time, instead of doing them serially one at a time.</p> - -<p>See <a href="Distributed-RPC.html">Distributed RPC</a> for more details.</p> -</div> - - - </div> - </div> - </div> -<footer> - <div class="container-fluid"> - <div class="row"> - <div class="col-md-3"> - <div class="footer-widget"> - <h5>Meetups</h5> - <ul class="latest-news"> - - <li><a href="http://www.meetup.com/Apache-Storm-Apache-Kafka/">Apache Storm & Apache Kafka</a> <span class="small">(Sunnyvale, CA)</span></li> - - <li><a href="http://www.meetup.com/Apache-Storm-Kafka-Users/">Apache Storm & Kafka Users</a> <span class="small">(Seattle, WA)</span></li> - - <li><a href="http://www.meetup.com/New-York-City-Storm-User-Group/">NYC Storm User Group</a> <span class="small">(New York, NY)</span></li> - - <li><a href="http://www.meetup.com/Bay-Area-Stream-Processing">Bay Area Stream Processing</a> <span class="small">(Emeryville, CA)</span></li> - - <li><a href="http://www.meetup.com/Boston-Storm-Users/">Boston Realtime Data</a> <span class="small">(Boston, MA)</span></li> - - <li><a href="http://www.meetup.com/storm-london">London Storm User Group</a> <span class="small">(London, UK)</span></li> - - <!-- <li><a href="http://www.meetup.com/Apache-Storm-Kafka-Users/">Seatle, WA</a> <span class="small">(27 Jun 2015)</span></li> --> - </ul> - </div> - </div> - <div class="col-md-3"> - <div class="footer-widget"> - <h5>About Storm</h5> - <p>Storm integrates with any queueing system and any database system. Storm's spout abstraction makes it easy to integrate a new queuing system. 
Likewise, integrating Storm with database systems is easy.</p> - </div> - </div> - <div class="col-md-3"> - <div class="footer-widget"> - <h5>First Look</h5> - <ul class="footer-list"> - <li><a href="/releases/current/Rationale.html">Rationale</a></li> - <li><a href="/releases/current/Tutorial.html">Tutorial</a></li> - <li><a href="/releases/current/Setting-up-development-environment.html">Setting up development environment</a></li> - <li><a href="/releases/current/Creating-a-new-Storm-project.html">Creating a new Storm project</a></li> - </ul> - </div> - </div> - <div class="col-md-3"> - <div class="footer-widget"> - <h5>Documentation</h5> - <ul class="footer-list"> - <li><a href="/releases/current/index.html">Index</a></li> - <li><a href="/releases/current/javadocs/index.html">Javadoc</a></li> - <li><a href="/releases/current/FAQ.html">FAQ</a></li> - </ul> - </div> - </div> - </div> - <hr/> - <div class="row"> - <div class="col-md-12"> - <p align="center">Copyright © 2015 <a href="http://www.apache.org">Apache Software Foundation</a>. All Rights Reserved. - <br>Apache Storm, Apache, the Apache feather logo, and the Apache Storm project logos are trademarks of The Apache Software Foundation. 
- <br>All other marks mentioned may be trademarks or registered trademarks of their respective owners.</p> - </div> - </div> - </div> -</footer> -<!--Footer End--> -<!-- Scroll to top --> -<span class="totop"><a href="#"><i class="fa fa-angle-up"></i></a></span> - -</body> - -</html> - http://git-wip-us.apache.org/repos/asf/storm-site/blob/6e122a12/content/releases/1.2.1/Concepts.html ---------------------------------------------------------------------- diff --git a/content/releases/1.2.1/Concepts.html b/content/releases/1.2.1/Concepts.html deleted file mode 100644 index e2f4ce0..0000000 --- a/content/releases/1.2.1/Concepts.html +++ /dev/null @@ -1,346 +0,0 @@ -<!DOCTYPE html> -<html> - <head> - <meta charset="utf-8"> - <title>Concepts</title> - </head> - - - <body> - 
 <div class="container-fluid"> - <h1 class="page-title">Concepts</h1> - <div class="row"> - <div class="col-md-12"> - <!-- Documentation --> - -<p class="post-meta"></p> - -<div class="documentation-content"><p>This page lists the main concepts of Storm and links to resources where you can find more information. The concepts discussed are:</p> - -<ol> -<li>Topologies</li> -<li>Streams</li> -<li>Spouts</li> -<li>Bolts</li> -<li>Stream groupings</li> -<li>Reliability</li> -<li>Tasks</li> -<li>Workers</li> -</ol> - -<h3 id="topologies">Topologies</h3> - -<p>The logic for a realtime application is packaged into a Storm topology. A Storm topology is analogous to a MapReduce job. One key difference is that a MapReduce job eventually finishes, whereas a topology runs forever (or until you kill it, of course). A topology is a graph of spouts and bolts that are connected with stream groupings. 
These concepts are described below.</p> - -<p><strong>Resources:</strong></p> - -<ul> -<li><a href="javadocs/org/apache/storm/topology/TopologyBuilder.html">TopologyBuilder</a>: use this class to construct topologies in Java</li> -<li><a href="Running-topologies-on-a-production-cluster.html">Running topologies on a production cluster</a></li> -<li><a href="Local-mode.html">Local mode</a>: Read this to learn how to develop and test topologies in local mode.</li> -</ul> - -<h3 id="streams">Streams</h3> - -<p>The stream is the core abstraction in Storm. A stream is an unbounded sequence of tuples that is processed and created in parallel in a distributed fashion. Streams are defined with a schema that names the fields in the stream's tuples. By default, tuples can contain integers, longs, shorts, bytes, strings, doubles, floats, booleans, and byte arrays. You can also define your own serializers so that custom types can be used natively within tuples.</p> - -<p>Every stream is given an id when declared. Since single-stream spouts and bolts are so common, <a href="javadocs/org/apache/storm/topology/OutputFieldsDeclarer.html">OutputFieldsDeclarer</a> has convenience methods for declaring a single stream without specifying an id. In this case, the stream is given the default id of "default".</p> - -<p><strong>Resources:</strong></p> - -<ul> -<li><a href="javadocs/org/apache/storm/tuple/Tuple.html">Tuple</a>: streams are composed of tuples</li> -<li><a href="javadocs/org/apache/storm/topology/OutputFieldsDeclarer.html">OutputFieldsDeclarer</a>: used to declare streams and their schemas</li> -<li><a href="Serialization.html">Serialization</a>: Information about Storm's dynamic typing of tuples and declaring custom serializations</li> -</ul> - -<h3 id="spouts">Spouts</h3> - -<p>A spout is a source of streams in a topology. Generally spouts will read tuples from an external source and emit them into the topology (e.g. a Kestrel queue or the Twitter API). 
Spouts can either be <strong>reliable</strong> or <strong>unreliable</strong>. A reliable spout is capable of replaying a tuple if it failed to be processed by Storm, whereas an unreliable spout forgets about the tuple as soon as it is emitted.</p> - -<p>Spouts can emit more than one stream. To do so, declare multiple streams using the <code>declareStream</code> method of <a href="javadocs/org/apache/storm/topology/OutputFieldsDeclarer.html">OutputFieldsDeclarer</a> and specify the stream to emit to when using the <code>emit</code> method on <a href="javadocs/org/apache/storm/spout/SpoutOutputCollector.html">SpoutOutputCollector</a>.</p> - -<p>The main method on spouts is <code>nextTuple</code>. <code>nextTuple</code> either emits a new tuple into the topology or simply returns if there are no new tuples to emit. It is imperative that <code>nextTuple</code> does not block for any spout implementation, because Storm calls all the spout methods on the same thread.</p> - -<p>The other main methods on spouts are <code>ack</code> and <code>fail</code>. These are called when Storm detects that a tuple emitted from the spout either successfully completed through the topology or failed to be completed. <code>ack</code> and <code>fail</code> are only called for reliable spouts. See <a href="javadocs/org/apache/storm/spout/ISpout.html">the Javadoc</a> for more information.</p> - -<p><strong>Resources:</strong></p> - -<ul> -<li><a href="javadocs/org/apache/storm/topology/IRichSpout.html">IRichSpout</a>: this is the interface that spouts must implement.</li> -<li><a href="Guaranteeing-message-processing.html">Guaranteeing message processing</a></li> -</ul> - -<h3 id="bolts">Bolts</h3> - -<p>All processing in topologies is done in bolts. Bolts can do anything from filtering, functions, aggregations, joins, talking to databases, and more. </p> - -<p>Bolts can do simple stream transformations. 
Doing complex stream transformations often requires multiple steps and thus multiple bolts. For example, transforming a stream of tweets into a stream of trending images requires at least two steps: a bolt to do a rolling count of retweets for each image, and one or more bolts to stream out the top X images (you can do this particular stream transformation in a more scalable way with three bolts than with two). </p> - -<p>Bolts can emit more than one stream. To do so, declare multiple streams using the <code>declareStream</code> method of <a href="javadocs/org/apache/storm/topology/OutputFieldsDeclarer.html">OutputFieldsDeclarer</a> and specify the stream to emit to when using the <code>emit</code> method on <a href="javadocs/org/apache/storm/task/OutputCollector.html">OutputCollector</a>.</p> - -<p>When you declare a bolt's input streams, you always subscribe to specific streams of another component. If you want to subscribe to all the streams of another component, you have to subscribe to each one individually. <a href="javadocs/org/apache/storm/topology/InputDeclarer.html">InputDeclarer</a> has syntactic sugar for subscribing to streams declared on the default stream id. Saying <code>declarer.shuffleGrouping("1")</code> subscribes to the default stream on component "1" and is equivalent to <code>declarer.shuffleGrouping("1", DEFAULT_STREAM_ID)</code>.</p> - -<p>The main method in bolts is the <code>execute</code> method, which takes a new tuple as input. Bolts emit new tuples using the <a href="javadocs/org/apache/storm/task/OutputCollector.html">OutputCollector</a> object. Bolts must call the <code>ack</code> method on the <code>OutputCollector</code> for every tuple they process so that Storm knows when tuples are completed (and can eventually determine that it's safe to ack the original spout tuples). 
For the common case of processing an input tuple, emitting 0 or more tuples based on that tuple, and then acking the input tuple, Storm provides an <a href="javadocs/org/apache/storm/topology/IBasicBolt.html">IBasicBolt</a> interface which does the acking automatically.</p> - -<p>It's perfectly fine to launch new threads in bolts that do processing asynchronously. <a href="javadocs/org/apache/storm/task/OutputCollector.html">OutputCollector</a> is thread-safe and can be called at any time.</p> - -<p><strong>Resources:</strong></p> - -<ul> -<li><a href="javadocs/org/apache/storm/topology/IRichBolt.html">IRichBolt</a>: this is the general interface for bolts.</li> -<li><a href="javadocs/org/apache/storm/topology/IBasicBolt.html">IBasicBolt</a>: this is a convenience interface for defining bolts that do filtering or simple functions.</li> -<li><a href="javadocs/org/apache/storm/task/OutputCollector.html">OutputCollector</a>: bolts emit tuples to their output streams using an instance of this class</li> -<li><a href="Guaranteeing-message-processing.html">Guaranteeing message processing</a></li> -</ul> - -<h3 id="stream-groupings">Stream groupings</h3> - -<p>Part of defining a topology is specifying for each bolt which streams it should receive as input. A stream grouping defines how that stream should be partitioned among the bolt's tasks.</p> - -<p>There are eight built-in stream groupings in Storm, and you can implement a custom stream grouping by implementing the <a href="javadocs/org/apache/storm/grouping/CustomStreamGrouping.html">CustomStreamGrouping</a> interface:</p> - -<ol> -<li><strong>Shuffle grouping</strong>: Tuples are randomly distributed across the bolt's tasks in a way such that each task is guaranteed to get an equal number of tuples.</li> -<li><strong>Fields grouping</strong>: The stream is partitioned by the fields specified in the grouping. 
For example, if the stream is grouped by the "user-id" field, tuples with the same "user-id" will always go to the same task, but tuples with different "user-id"'s may go to different tasks.</li> -<li><strong>Partial Key grouping</strong>: The stream is partitioned by the fields specified in the grouping, like the Fields grouping, but is load balanced between two downstream bolts, which provides better utilization of resources when the incoming data is skewed. <a href="https://melmeric.files.wordpress.com/2014/11/the-power-of-both-choices-practical-load-balancing-for-distributed-stream-processing-engines.pdf">This paper</a> provides a good explanation of how it works and the advantages it provides.</li> -<li><strong>All grouping</strong>: The stream is replicated across all the bolt's tasks. Use this grouping with care.</li> -<li><strong>Global grouping</strong>: The entire stream goes to a single one of the bolt's tasks. Specifically, it goes to the task with the lowest id.</li> -<li><strong>None grouping</strong>: This grouping specifies that you don't care how the stream is grouped. Currently, none groupings are equivalent to shuffle groupings. Eventually though, Storm will push down bolts with none groupings to execute in the same thread as the bolt or spout they subscribe from (when possible).</li> -<li><strong>Direct grouping</strong>: This is a special kind of grouping. A stream grouped this way means that the <strong>producer</strong> of the tuple decides which task of the consumer will receive this tuple. Direct groupings can only be declared on streams that have been declared as direct streams. Tuples emitted to a direct stream must be emitted using one of the <a href="javadocs/org/apache/storm/task/OutputCollector.html#emitDirect(int, int, java.util.List)">emitDirect</a> methods. 
A bolt can get the task ids of its consumers by either using the provided <a href="javadocs/org/apache/storm/task/TopologyContext.html">TopologyContext</a> or by keeping track of the output of the <code>emit</code> method in <a href="javadocs/org/apache/storm/task/OutputCollector.html">OutputCollector</a> (which returns the task ids that the tuple was sent to).</li> -<li><strong>Local or shuffle grouping</strong>: If the target bolt has one or more tasks in the same worker process, tuples will be shuffled to just those in-process tasks. Otherwise, this acts like a normal shuffle grouping.</li> -</ol> - -<p><strong>Resources:</strong></p> - -<ul> -<li><a href="javadocs/org/apache/storm/topology/TopologyBuilder.html">TopologyBuilder</a>: use this class to define topologies</li> -<li><a href="javadocs/org/apache/storm/topology/InputDeclarer.html">InputDeclarer</a>: this object is returned whenever <code>setBolt</code> is called on <code>TopologyBuilder</code> and is used for declaring a bolt's input streams and how those streams should be grouped</li> -</ul> - -<h3 id="reliability">Reliability</h3> - -<p>Storm guarantees that every spout tuple will be fully processed by the topology. It does this by tracking the tree of tuples triggered by every spout tuple and determining when that tree of tuples has been successfully completed. Every topology has a "message timeout" associated with it. If Storm fails to detect that a spout tuple has been completed within that timeout, then it fails the tuple and replays it later. </p> - -<p>To take advantage of Storm's reliability capabilities, you must tell Storm when new edges in a tuple tree are being created and tell Storm whenever you've finished processing an individual tuple. These are done using the <a href="javadocs/org/apache/storm/task/OutputCollector.html">OutputCollector</a> object that bolts use to emit tuples. 
Anchoring is done in the <code>emit</code> method, and you declare that you're finished with a tuple using the <code>ack</code> method.</p> - -<p>This is all explained in much more detail in <a href="Guaranteeing-message-processing.html">Guaranteeing message processing</a>. </p> - -<h3 id="tasks">Tasks</h3> - -<p>Each spout or bolt executes as many tasks across the cluster. Each task corresponds to one thread of execution, and stream groupings define how to send tuples from one set of tasks to another set of tasks. You set the parallelism for each spout or bolt in the <code>setSpout</code> and <code>setBolt</code> methods of <a href="javadocs/org/apache/storm/topology/TopologyBuilder.html">TopologyBuilder</a>.</p> - -<h3 id="workers">Workers</h3> - -<p>Topologies execute across one or more worker processes. Each worker process is a physical JVM and executes a subset of all the tasks for the topology. For example, if the combined parallelism of the topology is 300 and 50 workers are allocated, then each worker will execute 6 tasks (as threads within the worker). 
Storm tries to spread the tasks evenly across all the workers.</p> - -<p><strong>Resources:</strong></p> - -<ul> -<li><a href="javadocs/org/apache/storm/Config.html#TOPOLOGY_WORKERS">Config.TOPOLOGY_WORKERS</a>: this config sets the number of workers to allocate for executing the topology</li> -</ul> -</div> - - - </div> - </div> - </div> - -</body> - -</html> - http://git-wip-us.apache.org/repos/asf/storm-site/blob/6e122a12/content/releases/1.2.1/Configuration.html ---------------------------------------------------------------------- diff --git a/content/releases/1.2.1/Configuration.html b/content/releases/1.2.1/Configuration.html deleted file mode 100644 index 29385d5..0000000 --- a/content/releases/1.2.1/Configuration.html +++ /dev/null @@ -1,253 +0,0 @@ -<!DOCTYPE html> -<html> - <head> - <meta charset="utf-8"> - <title>Configuration</title> - </head> - - - <body> - 
 <div class="container-fluid"> - <h1 class="page-title">Configuration</h1> - <div class="row"> - <div class="col-md-12"> - <!-- Documentation --> - -<p class="post-meta"></p> - -<div class="documentation-content"><p>Storm has a variety of configurations for tweaking the behavior of nimbus, supervisors, and running topologies. Some configurations are system configurations and cannot be modified on a topology by topology basis, whereas other configurations can be modified per topology. </p> - -<p>Every configuration has a default value defined in <a href="http://github.com/apache/storm/blob/v1.2.1/conf/defaults.yaml">defaults.yaml</a> in the Storm codebase. You can override these configurations by defining a storm.yaml in the classpath of Nimbus and the supervisors. Finally, you can define a topology-specific configuration that you submit along with your topology when using <a href="javadocs/org/apache/storm/StormSubmitter.html">StormSubmitter</a>. However, the topology-specific configuration can only override configs prefixed with "TOPOLOGY".</p> - -<p>Storm 0.7.0 and onwards lets you override configuration on a per-bolt/per-spout basis. 
The only configurations that can be overridden this way are:</p> - -<ol> -<li>"topology.debug"</li> -<li>"topology.max.spout.pending"</li> -<li>"topology.max.task.parallelism"</li> -<li>"topology.kryo.register": This one works slightly differently from the others, since the serializations will be available to all components in the topology. See <a href="Serialization.html">Serialization</a> for more details.</li> -</ol> - -<p>The Java API lets you specify component-specific configurations in two ways:</p> - -<ol> -<li><em>Internally:</em> Override <code>getComponentConfiguration</code> in any spout or bolt and return the component-specific configuration map.</li> -<li><em>Externally:</em> <code>setSpout</code> and <code>setBolt</code> in <code>TopologyBuilder</code> return an object with methods <code>addConfiguration</code> and <code>addConfigurations</code> that can be used to override the configurations for the component.</li> -</ol> - -<p>The preference order for configuration values, from lowest to highest precedence, is defaults.yaml &lt; storm.yaml &lt; topology-specific configuration &lt; internal component-specific configuration &lt; external component-specific configuration. 
</p> - -<p><strong>Resources:</strong></p> - -<ul> -<li><a href="javadocs/org/apache/storm/Config.html">Config</a>: a listing of all configurations as well as a helper class for creating topology specific configurations</li> -<li><a href="http://github.com/apache/storm/blob/v1.2.1/conf/defaults.yaml">defaults.yaml</a>: the default values for all configurations</li> -<li><a href="Setting-up-a-Storm-cluster.html">Setting up a Storm cluster</a>: explains how to create and configure a Storm cluster</li> -<li><a href="Running-topologies-on-a-production-cluster.html">Running topologies on a production cluster</a>: lists useful configurations when running topologies on a cluster</li> -<li><a href="Local-mode.html">Local mode</a>: lists useful configurations when using local mode</li> -</ul> -</div> - - - </div> - </div> - </div> -<footer> - <div class="container-fluid"> - <div class="row"> - <div class="col-md-3"> - <div class="footer-widget"> - <h5>Meetups</h5> - <ul class="latest-news"> - - <li><a href="http://www.meetup.com/Apache-Storm-Apache-Kafka/">Apache Storm & Apache Kafka</a> <span class="small">(Sunnyvale, CA)</span></li> - - <li><a href="http://www.meetup.com/Apache-Storm-Kafka-Users/">Apache Storm & Kafka Users</a> <span class="small">(Seattle, WA)</span></li> - - <li><a href="http://www.meetup.com/New-York-City-Storm-User-Group/">NYC Storm User Group</a> <span class="small">(New York, NY)</span></li> - - <li><a href="http://www.meetup.com/Bay-Area-Stream-Processing">Bay Area Stream Processing</a> <span class="small">(Emeryville, CA)</span></li> - - <li><a href="http://www.meetup.com/Boston-Storm-Users/">Boston Realtime Data</a> <span class="small">(Boston, MA)</span></li> - - <li><a href="http://www.meetup.com/storm-london">London Storm User Group</a> <span class="small">(London, UK)</span></li> - - <!-- <li><a href="http://www.meetup.com/Apache-Storm-Kafka-Users/">Seatle, WA</a> <span class="small">(27 Jun 2015)</span></li> --> - </ul> - </div> - </div> - 
<div class="col-md-3"> - <div class="footer-widget"> - <h5>About Storm</h5> - <p>Storm integrates with any queueing system and any database system. Storm's spout abstraction makes it easy to integrate a new queuing system. Likewise, integrating Storm with database systems is easy.</p> - </div> - </div> - <div class="col-md-3"> - <div class="footer-widget"> - <h5>First Look</h5> - <ul class="footer-list"> - <li><a href="/releases/current/Rationale.html">Rationale</a></li> - <li><a href="/releases/current/Tutorial.html">Tutorial</a></li> - <li><a href="/releases/current/Setting-up-development-environment.html">Setting up development environment</a></li> - <li><a href="/releases/current/Creating-a-new-Storm-project.html">Creating a new Storm project</a></li> - </ul> - </div> - </div> - <div class="col-md-3"> - <div class="footer-widget"> - <h5>Documentation</h5> - <ul class="footer-list"> - <li><a href="/releases/current/index.html">Index</a></li> - <li><a href="/releases/current/javadocs/index.html">Javadoc</a></li> - <li><a href="/releases/current/FAQ.html">FAQ</a></li> - </ul> - </div> - </div> - </div> - <hr/> - <div class="row"> - <div class="col-md-12"> - <p align="center">Copyright © 2015 <a href="http://www.apache.org">Apache Software Foundation</a>. All Rights Reserved. - <br>Apache Storm, Apache, the Apache feather logo, and the Apache Storm project logos are trademarks of The Apache Software Foundation. - <br>All other marks mentioned may be trademarks or registered trademarks of their respective owners.</p> - </div> - </div> - </div> -</footer> -<!--Footer End--> -<!-- Scroll to top --> -<span class="totop"><a href="#"><i class="fa fa-angle-up"></i></a></span> - -</body> - -</html> -
