rmatharu commented on a change in pull request #1064: SAMZA-1949: Add java docs and configuration documentation for side inputs
URL: https://github.com/apache/samza/pull/1064#discussion_r297777416
##########
File path: docs/learn/documentation/versioned/jobs/samza-configurations.md
##########
@@ -274,6 +274,8 @@ These properties define Samza's storage mechanism for efficient [stateful stream
 |stores.**_store-name_**.<br>rocksdb.max.log.file.size.bytes|67108864|The maximum size in bytes of the RocksDB LOG file before it is rotated.|
 |stores.**_store-name_**.<br>rocksdb.keep.log.file.num|2|The number of RocksDB LOG files (including rotated LOG.old.* files) to keep.|
 |stores.**_store-name_**.<br>rocksdb.metrics.list|(none)|A list of [RocksDB properties](https://github.com/facebook/rocksdb/blob/master/include/rocksdb/db.h#L409) to expose as metrics (gauges).|
+|stores.**_store-name_**.<br>side.inputs|(none)|For Samza applications with large stores that are periodically populated by a secondary data sources such as HDFS, but otherwise ready-only; Side inputs feature makes it easier and efficient to populate data. Stores configured with side inputs use the the source streams to bootstrap in case of failure thereby, reducing additional copy of the data in changelog. The value is a comma-separated list of streams.<br> Each stream is of the format `system-name.stream-name`. Additionally, applications should add the side inputs to job inputs (`task.inputs`) and configure side input processor (`stores.store-name.side.inputs.processor.factory`).

Review comment:
Immense nitpick:
"Samza applications with stores that are updated by secondary data sources, e.g., output of a batch job written to HDFS, can leverage Samza's side inputs feature, which allows applications to plug in and use such inputs seamlessly without any special case handling."....
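
For illustration, the three settings that the new table row mentions might fit together as in the sketch below. Only the property keys are taken from the diff. The store name `profile-store`, the system and stream names, and the processor factory class are hypothetical placeholders, not values from the pull request.

```properties
# Hypothetical store "profile-store" populated from one side input stream.
# The value is a comma-separated list of system-name.stream-name entries.
stores.profile-store.side.inputs=hdfs.profile-snapshots

# Side inputs must also be listed among the job inputs.
# "kafka.page-views" stands in for the job's regular (assumed) input stream.
task.inputs=kafka.page-views,hdfs.profile-snapshots

# Factory for the side input processor; the class name is a placeholder.
stores.profile-store.side.inputs.processor.factory=com.example.ProfileSideInputsProcessorFactory
```

A store configured this way would, per the description in the diff, bootstrap from the source stream after a failure rather than from a changelog copy of the same data.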
