Github user priyank5485 commented on a diff in the pull request:
https://github.com/apache/storm/pull/1586#discussion_r72305457
--- Diff: external/storm-kinesis/README.md ---
@@ -0,0 +1,139 @@
+#Storm Kinesis Spout
+Provides core storm spout for consuming data from a stream in Amazon
Kinesis Streams. It stores the sequence numbers that can be committed in
zookeeper and
+starts consuming records after that sequence number on restart by default.
Below is the code sample to create a sample topology that uses the spout. Each
+object used in configuring the spout is explained below. Ideally, the
number of spout tasks should be equal to number of shards in kinesis. However
each task
+can read from more than one shard.
+
+```java
+public class KinesisSpoutTopology {
+ public static void main (String args[]) throws
InvalidTopologyException, AuthorizationException, AlreadyAliveException {
+ String topologyName = args[0];
+ RecordToTupleMapper recordToTupleMapper = new
TestRecordToTupleMapper();
+ KinesisConnectionInfo kinesisConnectionInfo = new
KinesisConnectionInfo(new CredentialsProviderChain(), new
ClientConfiguration(), Regions.US_WEST_2,
+ 1000);
+ org.apache.storm.kinesis.spout.Config config = new
org.apache.storm.kinesis.spout.Config(args[1], ShardIteratorType.TRIM_HORIZON,
+ recordToTupleMapper, new Date(), new
ExponentialBackoffRetrier(), new ZkInfo(), kinesisConnectionInfo, 10000L);
+ KinesisSpout kinesisSpout = new KinesisSpout(config);
+ TopologyBuilder topologyBuilder = new TopologyBuilder();
+ topologyBuilder.setSpout("spout", kinesisSpout, 3);
+ topologyBuilder.setBolt("bolt", new KinesisBoltTest(),
1).shuffleGrouping("spout");
+ Config topologyConfig = new Config();
+ topologyConfig.setDebug(true);
+ topologyConfig.setNumWorkers(3);
+ StormSubmitter.submitTopology(topologyName, topologyConfig,
topologyBuilder.createTopology());
+ }
+}
+```
+As you can see above the spout takes an object of Config in its
constructor. The constructor of Config takes 8 objects as explained below.
+
+#### `String` streamName
+name of kinesis stream to consume data from
+
+#### `ShardIteratorType` shardIteratorType
+3 types are supported - TRIM_HORIZON(beginning of shard), LATEST and
AT_TIMESTAMP. By default this argument is ignored if state for shards
+is found in zookeeper. Hence they will apply the first time a topology is
started. If you want to use any of these in subsequent runs of the topology,
you
+will need to clear the state of zookeeper node used for storing sequence
numbers
+#### `RecordToTupleMapper` recordToTupleMapper
--- End diff --
Will change it
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---