----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34974/#review93553 -----------------------------------------------------------
Lgtm, overall! I was able to test it as well. Some feedback: Would it possible to mark the SSP of the IncomingMessageEnvelope with a boolean flag indicating whether it is from a broadcast stream or not? When I tried writing a sample code, I realized that the streamTask has to be aware of the exact topic name in order to figure out if the message was from a broadcast system stream. If we had an API such as "envelope.getSystemStreamPartition().isBroadcast()", I think it will be more convenient. What do you think? docs/learn/documentation/versioned/jobs/configuration-table.html (line 453) <https://reviews.apache.org/r/34974/#comment147930> Rephrase - "This property specifies the partitions that all tasks should consume." samza-api/src/main/java/org/apache/samza/system/SystemAdmin.java (line 73) <https://reviews.apache.org/r/34974/#comment147929> malformed html error in javadocs for '<' and '>' Also, in Line 79 I think the delay in my review caused a few more commits to go through. So, you have to rebase and update your patch again. I found this error: samza-hdfs/src/main/scala/org/apache/samza/system/hdfs/HdfsSystemAdmin.scala:27: error: class HdfsSystemAdmin needs to be abstract, since method offsetComparator in trait SystemAdmin of type (x$1: String, x$2: String)Integer is not defined class HdfsSystemAdmin extends SystemAdmin with Logging { ^ one error found I guess you can just throw an UnsupportedMethodException for now. - Navina Ramesh On July 29, 2015, 10:49 p.m., Yan Fang wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/34974/ > ----------------------------------------------------------- > > (Updated July 29, 2015, 10:49 p.m.) > > > Review request for samza. > > > Bugs: SAMZA-676 > https://issues.apache.org/jira/browse/SAMZA-676 > > > Repository: samza > > > Description > ------- > > 1. added offsetComparator method in SystemAdmin Interface > > 2. added "task.global.inputs" config > > 3. rewrote Grouper classes using Java; allows to assign global streams during > grouping > > 4. used LinkedHashSet instead of HashSet in CoordinatorStreamSystemConsumer > to preserve messages order > > 5. added taskNames to the offsets in OffsetManager > > 6. allowed to assign one SSP to multiple taskInstances > > 7. skipped already-processed messages in RunLoop > > 8. unit tests for all changes > > > Diffs > ----- > > checkstyle/import-control.xml 6654319 > docs/learn/documentation/versioned/container/samza-container.md 9f46414 > docs/learn/documentation/versioned/jobs/configuration-table.html ea73b40 > samza-api/src/main/java/org/apache/samza/system/SystemAdmin.java 7a588eb > > samza-api/src/main/java/org/apache/samza/util/SinglePartitionWithoutOffsetsSystemAdmin.java > 249b8ae > samza-core/src/main/java/org/apache/samza/config/TaskConfigJava.java > PRE-CREATION > > samza-core/src/main/java/org/apache/samza/container/grouper/stream/GroupByPartition.java > PRE-CREATION > > samza-core/src/main/java/org/apache/samza/container/grouper/stream/GroupByPartitionFactory.java > PRE-CREATION > > samza-core/src/main/java/org/apache/samza/container/grouper/stream/GroupBySystemStreamPartition.java > PRE-CREATION > > samza-core/src/main/java/org/apache/samza/container/grouper/stream/GroupBySystemStreamPartitionFactory.java > PRE-CREATION > samza-core/src/main/scala/org/apache/samza/checkpoint/OffsetManager.scala > 20e5d26 > samza-core/src/main/scala/org/apache/samza/config/JobConfig.scala e4b14f4 > samza-core/src/main/scala/org/apache/samza/container/RunLoop.scala c292ae4 > samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala > 27b2517 > samza-core/src/main/scala/org/apache/samza/container/TaskInstance.scala > c5a5ea5 > > samza-core/src/main/scala/org/apache/samza/container/TaskInstanceMetrics.scala > 9dc7051 > > samza-core/src/main/scala/org/apache/samza/container/grouper/stream/GroupByPartition.scala > 44e95fc > > samza-core/src/main/scala/org/apache/samza/container/grouper/stream/GroupBySystemStreamPartition.scala > 3c0acad > > samza-core/src/main/scala/org/apache/samza/system/filereader/FileReaderSystemAdmin.scala > 097f410 > samza-core/src/test/java/org/apache/samza/config/TestTaskConfigJava.java > PRE-CREATION > > samza-core/src/test/java/org/apache/samza/container/grouper/stream/TestGroupByPartition.java > PRE-CREATION > > samza-core/src/test/java/org/apache/samza/container/grouper/stream/TestGroupBySystemStreamPartition.java > PRE-CREATION > > samza-core/src/test/scala/org/apache/samza/checkpoint/TestOffsetManager.scala > 8d54c46 > samza-core/src/test/scala/org/apache/samza/container/TestRunLoop.scala > 64a5844 > > samza-core/src/test/scala/org/apache/samza/container/TestSamzaContainer.scala > 84fdeaa > samza-core/src/test/scala/org/apache/samza/container/TestTaskInstance.scala > 7caad28 > > samza-core/src/test/scala/org/apache/samza/container/grouper/stream/GroupByTestBase.scala > a14169b > > samza-core/src/test/scala/org/apache/samza/container/grouper/stream/TestGroupByPartition.scala > 74daf72 > > samza-core/src/test/scala/org/apache/samza/container/grouper/stream/TestGroupBySystemStreamPartition.scala > deb3895 > > samza-core/src/test/scala/org/apache/samza/coordinator/TestJobCoordinator.scala > 4097ac7 > > samza-elasticsearch/src/main/java/org/apache/samza/system/elasticsearch/ElasticsearchSystemAdmin.java > 1fd5dd3 > > samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemAdmin.scala > 35086f5 > > samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemConsumer.scala > de00320 > > samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemFactory.scala > 1629035 > > samza-kafka/src/test/scala/org/apache/samza/system/kafka/TestKafkaSystemConsumer.scala > 2a84328 > samza-test/src/main/java/org/apache/samza/system/mock/MockSystemAdmin.java > b063366 > > samza-yarn/src/test/scala/org/apache/samza/job/yarn/TestSamzaAppMasterTaskManager.scala > 1e936b4 > > Diff: https://reviews.apache.org/r/34974/diff/ > > > Testing > ------- > > > Thanks, > > Yan Fang > >