[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117871&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117871 ]

ASF GitHub Bot logged work on BEAM-3079:

Author: ASF GitHub Bot
Created on: 30/Jun/18 15:55
Start Date: 30/Jun/18 15:55
Worklog Time Spent: 10m
Work Description: xinyuiscool edited a comment on issue #471: [BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-401548761

Thanks. It wasn't intended to make Samza logo huge compared to other runners :). Just saw you already put up a fix for them. Thanks a lot for the help!

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 117871)
Time Spent: 5h 10m (was: 5h)

> Samza runner
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
> Issue Type: Wish
> Components: runner-samza
> Reporter: Xinyu Liu
> Assignee: Kenneth Knowles
> Priority: Major
> Fix For: Not applicable
>
> Time Spent: 5h 10m
> Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both
> stream and batch processing. It'll be awesome if we can run BEAM's advanced
> data transform and multi-language sdks on top of Samza.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117867&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117867 ]

ASF GitHub Bot logged work on BEAM-3079:

Author: ASF GitHub Bot
Created on: 30/Jun/18 15:44
Start Date: 30/Jun/18 15:44
Worklog Time Spent: 10m
Work Description: xinyuiscool commented on issue #471: [BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-401548761

Thanks. It wasn't intended to make Samza logo huge compared to other runners :). I will try to open another pr today to make it smaller.

Issue Time Tracking
---
Worklog Id: (was: 117867)
Time Spent: 5h (was: 4h 50m)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117758&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117758 ]

ASF GitHub Bot logged work on BEAM-3079:

Author: ASF GitHub Bot
Created on: 30/Jun/18 04:51
Start Date: 30/Jun/18 04:51
Worklog Time Spent: 10m
Work Description: kennknowles commented on issue #471: [BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-401517206

I just made the tiny fixes.

Issue Time Tracking
---
Worklog Id: (was: 117758)
Time Spent: 4h 50m (was: 4h 40m)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117757=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117757 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 30/Jun/18 04:49 Start Date: 30/Jun/18 04:49 Worklog Time Spent: 10m Work Description: asfgit closed pull request #471: [BEAM-3079]: Samza Runner docs and capability matrix URL: https://github.com/apache/beam-site/pull/471 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/src/_data/capability-matrix.yml b/src/_data/capability-matrix.yml index acac0ad40..508ac1f42 100644 --- a/src/_data/capability-matrix.yml +++ b/src/_data/capability-matrix.yml @@ -17,6 +17,8 @@ columns: name: JStorm - class: ibmstreams name: IBM Streams + - class: samza +name: Apache Samza categories: - description: What is being computed? @@ -64,6 +66,10 @@ categories: l1: 'Yes' l2: fully supported l3: '' + - class: samza +l1: 'Yes' +l2: fully supported +l3: Supported with per-element transformation. - name: GroupByKey values: - class: model @@ -102,6 +108,10 @@ categories: l1: 'Yes' l2: fully supported l3: '' + - class: samza +l1: 'Yes' +l2: fully supported +l3: "Uses Samza's partitionBy for key grouping and Beam's logic for window aggregation and triggering." - name: Flatten values: - class: model @@ -140,6 +150,10 @@ categories: l1: 'Yes' l2: fully supported l3: '' + - class: samza +l1: 'Yes' +l2: fully supported +l3: '' - name: Combine values: - class: model @@ -178,6 +192,10 @@ categories: l1: 'Yes' l2: fully supported l3: '' + - class: samza +l1: 'Yes' +l2: fully supported +l3: Use combiner for efficient pre-aggregation. 
- name: Composite Transforms values: - class: model @@ -216,6 +234,10 @@ categories: l1: 'Partially' l2: supported via inlining l3: '' + - class: samza +l1: 'Partially' +l2: supported via inlining +l3: '' - name: Side Inputs values: - class: model @@ -254,6 +276,10 @@ categories: l1: 'Yes' l2: fully supported l3: '' + - class: samza +l1: 'Yes' +l2: fully supported +l3: Uses Samza's broadcast operator to distribute the side inputs. - name: Source API values: - class: model @@ -292,6 +318,10 @@ categories: l1: 'Yes' l2: fully supported l3: '' + - class: samza +l1: 'Yes' +l2: fully supported +l3: '' - name: Splittable DoFn values: - class: model @@ -329,7 +359,11 @@ categories: - class: ibmstreams l1: 'No' l2: not implemented -l3: +l3: + - class: samza +l1: 'No' +l2: not implemented +l3: - name: Metrics values: - class: model @@ -368,6 +402,10 @@ categories: l1: 'Partially' l2: All metrics types are supported. l3: Only attempted values are supported. No committed values for metrics. + - class: samza +l1: 'Partially' +l2: Counter and Gauge are supported. +l3: Only attempted values are supported. No committed values for metrics. - name: Stateful Processing values: - class: model @@ -406,6 +444,10 @@ categories: l1: 'Partially' l2: non-merging windows l3: '' + - class: samza +l1: 'Partially' +l2: non-merging windows +l3: 'States are backed up by either rocksDb KV store or in-memory hash map, and persist using changelog.' - description: Where in event time? anchor: where color-b: '37d' @@ -451,6 +493,10 @@ categories: l1: 'Yes' l2: supported l3: '' + - class: samza +l1: 'Yes' +l2: supported +l3: '' - name: Fixed windows values: - class: model @@ -489,6 +535,10 @@ categories: l1: 'Yes' l2: supported
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117742&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117742 ]

ASF GitHub Bot logged work on BEAM-3079:

Author: ASF GitHub Bot
Created on: 30/Jun/18 03:31
Start Date: 30/Jun/18 03:31
Worklog Time Spent: 10m
Work Description: kennknowles commented on a change in pull request #471: [BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#discussion_r199311660

## File path: src/get-started/beam-overview.md ##
@@ -37,6 +37,7 @@ Beam currently supports Runners that work with the following distributed process
 * Apache Gearpump (incubating) ![Apache Gearpump logo]({{ "/images/logos/runners/gearpump.png" | prepend: site.baseurl }})
 * Apache Spark ![Apache Spark logo]({{ "/images/logos/runners/spark.png" | prepend: site.baseurl }})
 * Google Cloud Dataflow ![Google Cloud Dataflow logo]({{ "/images/logos/runners/dataflow.png" | prepend: site.baseurl }})
+* Apache Samza ![Apache Samza logo]({{ "/images/logos/runners/samza.png" | prepend: site.baseurl }})

Review comment:
If you check out the generated page, this logo really is huge! It might be easy to style it in markdown, or find a smaller image.

Issue Time Tracking
---
Worklog Id: (was: 117742)
Time Spent: 4.5h (was: 4h 20m)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117743&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117743 ]

ASF GitHub Bot logged work on BEAM-3079:

Author: ASF GitHub Bot
Created on: 30/Jun/18 03:31
Start Date: 30/Jun/18 03:31
Worklog Time Spent: 10m
Work Description: kennknowles commented on a change in pull request #471: [BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#discussion_r199311732

## File path: src/documentation/index.md ##
@@ -46,6 +46,7 @@ A Beam Runner runs a Beam pipeline on a specific (often distributed) data proces
 * [SparkRunner]({{ site.baseurl }}/documentation/runners/spark/): Runs on [Apache Spark](http://spark.apache.org).
 * [DataflowRunner]({{ site.baseurl }}/documentation/runners/dataflow/): Runs on [Google Cloud Dataflow](https://cloud.google.com/dataflow), a fully managed service within [Google Cloud Platform](https://cloud.google.com/).
 * [GearpumpRunner]({{ site.baseurl }}/documentation/runners/gearpump/): Runs on [Apache Gearpump (incubating)](http://gearpump.apache.org).
+* [SamzaRunner]({{ site.baseurl }}/documentation/runners/samza/): Runs on [Apache Samza](http://samza.apache.org).

Review comment:
It is OK to have things on the site that are not yet released. We haven't really got a good process for that. So we put stuff up when it hits master, usually, and sometimes with a little note. You can leave it here. As a follow-up, they should be in alphabetical order.

Issue Time Tracking
---
Worklog Id: (was: 117743)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117575&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117575 ]

ASF GitHub Bot logged work on BEAM-3079:

Author: ASF GitHub Bot
Created on: 29/Jun/18 20:17
Start Date: 29/Jun/18 20:17
Worklog Time Spent: 10m
Work Description: xinyuiscool commented on issue #471: [BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-401462943

no problem. It's easy to revert it :)

Issue Time Tracking
---
Worklog Id: (was: 117575)
Time Spent: 4h 20m (was: 4h 10m)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117568&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117568 ]

ASF GitHub Bot logged work on BEAM-3079:

Author: ASF GitHub Bot
Created on: 29/Jun/18 19:54
Start Date: 29/Jun/18 19:54
Worklog Time Spent: 10m
Work Description: kennknowles commented on issue #471: [BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-401457589

Oh that was an old comment. Now it should go to the main page. Sorry!

Issue Time Tracking
---
Worklog Id: (was: 117568)
Time Spent: 4h 10m (was: 4h)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117498&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117498 ]

ASF GitHub Bot logged work on BEAM-3079:

Author: ASF GitHub Bot
Created on: 29/Jun/18 17:43
Start Date: 29/Jun/18 17:43
Worklog Time Spent: 10m
Work Description: xinyuiscool commented on a change in pull request #471: [BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#discussion_r199232774

## File path: src/documentation/index.md ##
@@ -46,6 +46,7 @@ A Beam Runner runs a Beam pipeline on a specific (often distributed) data proces
 * [SparkRunner]({{ site.baseurl }}/documentation/runners/spark/): Runs on [Apache Spark](http://spark.apache.org).
 * [DataflowRunner]({{ site.baseurl }}/documentation/runners/dataflow/): Runs on [Google Cloud Dataflow](https://cloud.google.com/dataflow), a fully managed service within [Google Cloud Platform](https://cloud.google.com/).
 * [GearpumpRunner]({{ site.baseurl }}/documentation/runners/gearpump/): Runs on [Apache Gearpump (incubating)](http://gearpump.apache.org).
+* [SamzaRunner]({{ site.baseurl }}/documentation/runners/samza/): Runs on [Apache Samza](http://samza.apache.org).

Review comment:
Sure. Since Samza runner is already on master, I assume it will go out as 2.6.0 release? So we will update this part after the release, if I understand correctly.

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 117498)
Time Spent: 4h (was: 3h 50m)

> Samza runner
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
> Issue Type: Wish
> Components: runner-ideas
> Reporter: Xinyu Liu
> Assignee: Kenneth Knowles
> Priority: Major
> Fix For: Not applicable
>
> Time Spent: 4h
> Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both
> stream and batch processing. It'll be awesome if we can run BEAM's advanced
> data transform and multi-language sdks on top of Samza.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117497&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117497 ]

ASF GitHub Bot logged work on BEAM-3079:

Author: ASF GitHub Bot
Created on: 29/Jun/18 17:43
Start Date: 29/Jun/18 17:43
Worklog Time Spent: 10m
Work Description: xinyuiscool commented on a change in pull request #471: [BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#discussion_r199232774

## File path: src/documentation/index.md ##
@@ -46,6 +46,7 @@ A Beam Runner runs a Beam pipeline on a specific (often distributed) data proces
 * [SparkRunner]({{ site.baseurl }}/documentation/runners/spark/): Runs on [Apache Spark](http://spark.apache.org).
 * [DataflowRunner]({{ site.baseurl }}/documentation/runners/dataflow/): Runs on [Google Cloud Dataflow](https://cloud.google.com/dataflow), a fully managed service within [Google Cloud Platform](https://cloud.google.com/).
 * [GearpumpRunner]({{ site.baseurl }}/documentation/runners/gearpump/): Runs on [Apache Gearpump (incubating)](http://gearpump.apache.org).
+* [SamzaRunner]({{ site.baseurl }}/documentation/runners/samza/): Runs on [Apache Samza](http://samza.apache.org).

Review comment:
Sure. Since Samza runner is already on master, I assume it will go out as 2.6.0 release? So we will update this part after the release, if I understand correctly.

Issue Time Tracking
---
Worklog Id: (was: 117497)
Time Spent: 3h 50m (was: 3h 40m)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117427&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117427 ]

ASF GitHub Bot logged work on BEAM-3079:

Author: ASF GitHub Bot
Created on: 29/Jun/18 15:51
Start Date: 29/Jun/18 15:51
Worklog Time Spent: 10m
Work Description: kennknowles commented on issue #471: [BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-401396220

OK it turns out we should continue here until the web site is truly merged all the way.

Issue Time Tracking
---
Worklog Id: (was: 117427)
Time Spent: 3h 40m (was: 3.5h)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117426 ]

ASF GitHub Bot logged work on BEAM-3079:

Author: ASF GitHub Bot
Created on: 29/Jun/18 15:51
Start Date: 29/Jun/18 15:51
Worklog Time Spent: 10m
Work Description: kennknowles commented on a change in pull request #471: [BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#discussion_r196122361

## File path: src/documentation/index.md ##
@@ -46,6 +46,7 @@ A Beam Runner runs a Beam pipeline on a specific (often distributed) data proces
 * [SparkRunner]({{ site.baseurl }}/documentation/runners/spark/): Runs on [Apache Spark](http://spark.apache.org).
 * [DataflowRunner]({{ site.baseurl }}/documentation/runners/dataflow/): Runs on [Google Cloud Dataflow](https://cloud.google.com/dataflow), a fully managed service within [Google Cloud Platform](https://cloud.google.com/).
 * [GearpumpRunner]({{ site.baseurl }}/documentation/runners/gearpump/): Runs on [Apache Gearpump (incubating)](http://gearpump.apache.org).
+* [SamzaRunner]({{ site.baseurl }}/documentation/runners/samza/): Runs on [Apache Samza](http://samza.apache.org).

Review comment:
This area is actually reserved for runners that are on `master` and released as part of Beam. It is a bit confusing, since we include them all in the capability matrix. Currently the place we reference in-progress runners is http://apache-beam-website-pull-requests.storage.googleapis.com/471/contribute/index.html#works-in-progress

Issue Time Tracking
---
Worklog Id: (was: 117426)
Time Spent: 3.5h (was: 3h 20m)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=114953&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-114953 ]

ASF GitHub Bot logged work on BEAM-3079:

Author: ASF GitHub Bot
Created on: 22/Jun/18 22:33
Start Date: 22/Jun/18 22:33
Worklog Time Spent: 10m
Work Description: kennknowles commented on issue #471: [BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-399600876

The website is actually being moved to the main repo, so let's wait just a little and then merge this over there.

Issue Time Tracking
---
Worklog Id: (was: 114953)
Time Spent: 3h 20m (was: 3h 10m)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=112783&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112783 ]

ASF GitHub Bot logged work on BEAM-3079:

Author: ASF GitHub Bot
Created on: 18/Jun/18 15:27
Start Date: 18/Jun/18 15:27
Worklog Time Spent: 10m
Work Description: kennknowles commented on issue #471: [BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-398094170

http://apache-beam-website-pull-requests.storage.googleapis.com/471/index.html

Issue Time Tracking
---
Worklog Id: (was: 112783)
Time Spent: 3h 10m (was: 3h)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=112779=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112779 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 18/Jun/18 15:25 Start Date: 18/Jun/18 15:25 Worklog Time Spent: 10m Work Description: kennknowles commented on a change in pull request #5517: [BEAM-3079] Update samza-runner with more features and improvements URL: https://github.com/apache/beam/pull/5517#discussion_r196119260 ## File path: runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/DoFnRunnerWithKeyedInternals.java ## @@ -0,0 +1,216 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.beam.runners.samza.runtime; + +import static com.google.common.base.Preconditions.checkState; + +import java.util.List; +import org.apache.beam.runners.core.DoFnRunner; +import org.apache.beam.runners.core.DoFnRunners; +import org.apache.beam.runners.core.KeyedWorkItem; +import org.apache.beam.runners.core.SideInputReader; +import org.apache.beam.runners.core.StateInternals; +import org.apache.beam.runners.core.StateInternalsFactory; +import org.apache.beam.runners.core.StatefulDoFnRunner; +import org.apache.beam.runners.core.StepContext; +import org.apache.beam.runners.core.TimerInternals; +import org.apache.beam.runners.core.TimerInternalsFactory; +import org.apache.beam.runners.samza.metrics.DoFnRunnerWithMetrics; +import org.apache.beam.runners.samza.metrics.SamzaMetricsContainer; +import org.apache.beam.sdk.coders.Coder; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.state.TimeDomain; +import org.apache.beam.sdk.transforms.DoFn; +import org.apache.beam.sdk.transforms.reflect.DoFnSignature; +import org.apache.beam.sdk.transforms.reflect.DoFnSignatures; +import org.apache.beam.sdk.transforms.windowing.BoundedWindow; +import org.apache.beam.sdk.util.WindowedValue; +import org.apache.beam.sdk.values.KV; +import org.apache.beam.sdk.values.TupleTag; +import org.apache.beam.sdk.values.TypeDescriptor; +import org.apache.beam.sdk.values.WindowingStrategy; +import org.joda.time.Instant; + +/** + * This class wraps a DoFnRunner with keyed StateInternals and TimerInternals access. 
+ */ +public class DoFnRunnerWithKeyedInternals implements DoFnRunner { + private final DoFnRunner underlying; + private final KeyedInternals keyedInternals; + + public static DoFnRunner of( + PipelineOptions options, + DoFn doFn, + SideInputReader sideInputReader, + DoFnRunners.OutputManager outputManager, + TupleTag mainOutputTag, + List> additionalOutputTags, + StateInternalsFactory stateInternalsFactory, + TimerInternalsFactory timerInternalsFactory, + WindowingStrategy windowingStrategy, + SamzaMetricsContainer metricsContainer, + String stepName) { + +final DoFnSignature signature = DoFnSignatures.getSignature(doFn.getClass()); +final KeyedInternals keyedInternals; +final TimerInternals timerInternals; +final StateInternals stateInternals; + +if (signature.usesState()) { + keyedInternals = new KeyedInternals(stateInternalsFactory, timerInternalsFactory); + stateInternals = keyedInternals.stateInternals(); + timerInternals = keyedInternals.timerInternals(); +} else { + keyedInternals = null; + stateInternals = stateInternalsFactory.stateInternalsForKey(null); + timerInternals = timerInternalsFactory.timerInternalsForKey(null); +} + +final DoFnRunner doFnRunner = DoFnRunners.simpleRunner( +options, +doFn, +sideInputReader, +outputManager, +mainOutputTag, +additionalOutputTags, +createStepContext(stateInternals, timerInternals), +windowingStrategy); + +final DoFnRunner doFnRunnerWithMetrics = DoFnRunnerWithMetrics +.wrap(doFnRunner, metricsContainer, stepName); + +if (keyedInternals != null) { Review comment: This conditional goes right along with the one above to make just two code paths: the one
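
The review comment above is cut off in this archive, so its full wording is unknown. One plausible reading is that the `signature.usesState()` check and the later `keyedInternals != null` check test the same thing, so the factory has only two code paths: stateful and stateless. The following is a minimal sketch of that idea under stated assumptions. `TwoPathSketch`, `Runner`, `SimpleRunner` and `KeyedRunner` are hypothetical stand-ins and are not Beam or Samza classes; the metrics wrapper from the quoted diff is left out for brevity.

```java
/** Hypothetical sketch; none of these types are Beam or Samza APIs. */
public class TwoPathSketch {

  /** Stand-in for a runner that processes elements. */
  public interface Runner {
    String describe();
  }

  /** Stand-in for the plain runner built around the user's function. */
  static final class SimpleRunner implements Runner {
    private final String internals;

    SimpleRunner(String internals) {
      this.internals = internals;
    }

    @Override
    public String describe() {
      return "simple(" + internals + ")";
    }
  }

  /** Stand-in for the wrapper that adds keyed state and timer access. */
  static final class KeyedRunner implements Runner {
    private final Runner underlying;

    KeyedRunner(Runner underlying) {
      this.underlying = underlying;
    }

    @Override
    public String describe() {
      return "keyed(" + underlying.describe() + ")";
    }
  }

  /** One decision point instead of two: the stateful path and the stateless path. */
  public static Runner of(boolean usesState) {
    if (usesState) {
      // Stateful path: keyed internals feed the runner, and the result is wrapped.
      return new KeyedRunner(new SimpleRunner("keyedInternals"));
    }
    // Stateless path: internals for the null key, no wrapper.
    return new SimpleRunner("nullKeyInternals");
  }

  public static void main(String[] args) {
    System.out.println(of(true).describe());  // prints keyed(simple(keyedInternals))
    System.out.println(of(false).describe()); // prints simple(nullKeyInternals)
  }
}
```

Branching once keeps the creation of the internals and the decision to wrap in the same place, so the two cannot drift apart. Whether this matches what the reviewer went on to suggest cannot be confirmed from the truncated text.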
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=112780=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112780 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 18/Jun/18 15:25 Start Date: 18/Jun/18 15:25 Worklog Time Spent: 10m Work Description: kennknowles commented on a change in pull request #5517: [BEAM-3079] Update samza-runner with more features and improvements URL: https://github.com/apache/beam/pull/5517#discussion_r196119675

## File path: runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/DoFnRunnerWithKeyedInternals.java ##
@@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.runners.samza.runtime;
+
+import static com.google.common.base.Preconditions.checkState;
+
+import java.util.List;
+import org.apache.beam.runners.core.DoFnRunner;
+import org.apache.beam.runners.core.DoFnRunners;
+import org.apache.beam.runners.core.KeyedWorkItem;
+import org.apache.beam.runners.core.SideInputReader;
+import org.apache.beam.runners.core.StateInternals;
+import org.apache.beam.runners.core.StateInternalsFactory;
+import org.apache.beam.runners.core.StatefulDoFnRunner;
+import org.apache.beam.runners.core.StepContext;
+import org.apache.beam.runners.core.TimerInternals;
+import org.apache.beam.runners.core.TimerInternalsFactory;
+import org.apache.beam.runners.samza.metrics.DoFnRunnerWithMetrics;
+import org.apache.beam.runners.samza.metrics.SamzaMetricsContainer;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.state.TimeDomain;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.reflect.DoFnSignature;
+import org.apache.beam.sdk.transforms.reflect.DoFnSignatures;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.sdk.values.TypeDescriptor;
+import org.apache.beam.sdk.values.WindowingStrategy;
+import org.joda.time.Instant;
+
+/**
+ * This class wraps a DoFnRunner with keyed StateInternals and TimerInternals access.
+ */
+public class DoFnRunnerWithKeyedInternals implements DoFnRunner {
+  private final DoFnRunner underlying;
+  private final KeyedInternals keyedInternals;
+
+  public static DoFnRunner of(
+      PipelineOptions options,
+      DoFn doFn,
+      SideInputReader sideInputReader,
+      DoFnRunners.OutputManager outputManager,
+      TupleTag mainOutputTag,
+      List<TupleTag<?>> additionalOutputTags,
+      StateInternalsFactory stateInternalsFactory,
+      TimerInternalsFactory timerInternalsFactory,
+      WindowingStrategy windowingStrategy,
+      SamzaMetricsContainer metricsContainer,
+      String stepName) {
+
+    final DoFnSignature signature = DoFnSignatures.getSignature(doFn.getClass());
+    final KeyedInternals keyedInternals;
+    final TimerInternals timerInternals;
+    final StateInternals stateInternals;
+
+    if (signature.usesState()) {
+      keyedInternals = new KeyedInternals(stateInternalsFactory, timerInternalsFactory);
+      stateInternals = keyedInternals.stateInternals();
+      timerInternals = keyedInternals.timerInternals();
+    } else {
+      keyedInternals = null;
+      stateInternals = stateInternalsFactory.stateInternalsForKey(null);
+      timerInternals = timerInternalsFactory.timerInternalsForKey(null);
+    }
+
+    final DoFnRunner doFnRunner = DoFnRunners.simpleRunner(
+        options,
+        doFn,
+        sideInputReader,
+        outputManager,
+        mainOutputTag,
+        additionalOutputTags,
+        createStepContext(stateInternals, timerInternals),

Review comment: Ah, is this why it is organized this way? OK.

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=112781=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112781 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 18/Jun/18 15:25 Start Date: 18/Jun/18 15:25 Worklog Time Spent: 10m Work Description: kennknowles commented on a change in pull request #5517: [BEAM-3079] Update samza-runner with more features and improvements URL: https://github.com/apache/beam/pull/5517#discussion_r196119044

## File path: runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/DoFnRunnerWithKeyedInternals.java ##
@@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.runners.samza.runtime;
+
+import static com.google.common.base.Preconditions.checkState;
+
+import java.util.List;
+import org.apache.beam.runners.core.DoFnRunner;
+import org.apache.beam.runners.core.DoFnRunners;
+import org.apache.beam.runners.core.KeyedWorkItem;
+import org.apache.beam.runners.core.SideInputReader;
+import org.apache.beam.runners.core.StateInternals;
+import org.apache.beam.runners.core.StateInternalsFactory;
+import org.apache.beam.runners.core.StatefulDoFnRunner;
+import org.apache.beam.runners.core.StepContext;
+import org.apache.beam.runners.core.TimerInternals;
+import org.apache.beam.runners.core.TimerInternalsFactory;
+import org.apache.beam.runners.samza.metrics.DoFnRunnerWithMetrics;
+import org.apache.beam.runners.samza.metrics.SamzaMetricsContainer;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.state.TimeDomain;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.reflect.DoFnSignature;
+import org.apache.beam.sdk.transforms.reflect.DoFnSignatures;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.sdk.values.TypeDescriptor;
+import org.apache.beam.sdk.values.WindowingStrategy;
+import org.joda.time.Instant;
+
+/**
+ * This class wraps a DoFnRunner with keyed StateInternals and TimerInternals access.
+ */
+public class DoFnRunnerWithKeyedInternals implements DoFnRunner {
+  private final DoFnRunner underlying;
+  private final KeyedInternals keyedInternals;
+
+  public static DoFnRunner of(
+      PipelineOptions options,
+      DoFn doFn,
+      SideInputReader sideInputReader,
+      DoFnRunners.OutputManager outputManager,
+      TupleTag mainOutputTag,
+      List<TupleTag<?>> additionalOutputTags,
+      StateInternalsFactory stateInternalsFactory,
+      TimerInternalsFactory timerInternalsFactory,
+      WindowingStrategy windowingStrategy,
+      SamzaMetricsContainer metricsContainer,
+      String stepName) {
+
+    final DoFnSignature signature = DoFnSignatures.getSignature(doFn.getClass());
+    final KeyedInternals keyedInternals;
+    final TimerInternals timerInternals;
+    final StateInternals stateInternals;
+
+    if (signature.usesState()) {

Review comment: If it doesn't use state, you don't need to create a `DoFnRunnerWithKeyedInternals`, do you? I think it would clean this up to refactor the constructor without the conditionals here and below.

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 112781) Time Spent: 2h 50m (was: 2h 40m) > Samza runner > > > Key: BEAM-3079 > URL: https://issues.apache.org/jira/browse/BEAM-3079 > Project: Beam > Issue Type: Wish > Components: runner-ideas >Reporter: Xinyu Liu >Assignee: Kenneth Knowles >Priority:
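
The refactor suggested in the review comment above can be illustrated with a minimal, self-contained sketch. It assumes the factory decides once whether the function uses state, returns the plain runner on the stateless path, and builds the keyed wrapper only on the stateful path, so the wrapper never holds a null. All names here (`RunnerFactorySketch`, `Runner`, `KeyContext`, `KeyedRunner`) are hypothetical stand-ins, not Beam or Samza classes, and the PR may have resolved the comment differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of "two code paths, decided once"; not the Beam API. */
public final class RunnerFactorySketch {

  /** Simplified stand-in for a DoFnRunner. */
  public interface Runner {
    void processElement(String key, String value);
  }

  /** Stand-in for per-key state/timer access (the role KeyedInternals plays in the diff). */
  public static final class KeyContext {
    private String currentKey;

    void setKey(String key) {
      currentKey = key;
    }

    void clearKey() {
      currentKey = null;
    }

    String currentKey() {
      return currentKey;
    }
  }

  /** Wrapper built only for stateful functions, so it needs no null checks. */
  public static final class KeyedRunner implements Runner {
    private final Runner underlying;
    private final KeyContext keyContext;

    KeyedRunner(Runner underlying, KeyContext keyContext) {
      this.underlying = underlying;
      this.keyContext = keyContext;
    }

    @Override
    public void processElement(String key, String value) {
      // Scope the key to this one element, then always clear it.
      keyContext.setKey(key);
      try {
        underlying.processElement(key, value);
      } finally {
        keyContext.clearKey();
      }
    }
  }

  /** Decides once, up front, which of the two code paths to build. */
  public static Runner create(boolean usesState, List<String> log) {
    if (!usesState) {
      // Stateless path: no key context at all; return the plain runner unwrapped.
      return (key, value) -> log.add("stateless:" + value);
    }
    // Stateful path: the base runner reads the key that the wrapper sets.
    KeyContext keyContext = new KeyContext();
    Runner base =
        (key, value) -> log.add("stateful[" + keyContext.currentKey() + "]:" + value);
    return new KeyedRunner(base, keyContext);
  }

  public static void main(String[] args) {
    List<String> log = new ArrayList<>();
    create(false, log).processElement("k1", "a");
    create(true, log).processElement("k2", "b");
    System.out.println(log); // prints [stateless:a, stateful[k2]:b]
  }
}
```

With this shape the `usesState` check appears in one place only, and the wrapper's constructor takes no nullable arguments.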
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=112414=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112414 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 15/Jun/18 18:45 Start Date: 15/Jun/18 18:45 Worklog Time Spent: 10m Work Description: xinyuiscool opened a new pull request #471: [BEAM-3079]: Samza Runner docs and capability matrix URL: https://github.com/apache/beam-site/pull/471 Add the Samza runner docs, and update the capability matrix and wordcount examples with Samza runner. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 112414) Time Spent: 2.5h (was: 2h 20m) > Samza runner > > > Key: BEAM-3079 > URL: https://issues.apache.org/jira/browse/BEAM-3079 > Project: Beam > Issue Type: Wish > Components: runner-ideas >Reporter: Xinyu Liu >Assignee: Kenneth Knowles >Priority: Major > Fix For: Not applicable > > Time Spent: 2.5h > Remaining Estimate: 0h > > Apache Samza is a distributed data-processing platform which supports both > stream and batch processing. It'll be awesome if we can run BEAM's advanced > data transform and multi-language sdks on top of Samza. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=63=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-63 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 12/Jun/18 17:38 Start Date: 12/Jun/18 17:38 Worklog Time Spent: 10m Work Description: xinyuiscool commented on issue #5517: [BEAM-3079] Update samza-runner with more features and improvements URL: https://github.com/apache/beam/pull/5517#issuecomment-396673409 hmm, seems the headers are missing again in this patch. Let me quickly add them. Issue Time Tracking --- Worklog Id: (was: 63) Time Spent: 2h 20m (was: 2h 10m)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=56=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-56 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 12/Jun/18 17:27 Start Date: 12/Jun/18 17:27 Worklog Time Spent: 10m Work Description: xinyuiscool commented on issue #5517: [BEAM-3079] Update samza-runner with more features and improvements URL: https://github.com/apache/beam/pull/5517#issuecomment-396669181 @kennknowles : I squashed all my commits into one so the changes are separated from the upstream merge. I think UsesImpulses tests are added in the master and Samza doesn't support it right now. Issue Time Tracking --- Worklog Id: (was: 56) Time Spent: 2h 10m (was: 2h)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=50=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-50 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 12/Jun/18 17:25 Start Date: 12/Jun/18 17:25 Worklog Time Spent: 10m Work Description: xinyuiscool commented on issue #5517: [BEAM-3079] Update samza-runner with more features and improvements URL: https://github.com/apache/beam/pull/5517#issuecomment-396669181 @kennknowles : I squashed all the commits into one so the changes are separated from the upstream merge. I think UsesImpulses tests are added in the master and Samza doesn't support it right now. Issue Time Tracking --- Worklog Id: (was: 50) Time Spent: 2h (was: 1h 50m)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=49=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-49 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 12/Jun/18 17:24 Start Date: 12/Jun/18 17:24 Worklog Time Spent: 10m Work Description: xinyuiscool commented on issue #5517: [BEAM-3079] Update samza-runner with more features and improvements URL: https://github.com/apache/beam/pull/5517#issuecomment-396669181 @kennknowles : I squashed all the commits into one so the changes are separated from the merge. I think UsesImpulses tests are added in the master and Samza doesn't support it right now. Issue Time Tracking --- Worklog Id: (was: 49) Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=05=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-05 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 12/Jun/18 16:15 Start Date: 12/Jun/18 16:15 Worklog Time Spent: 10m Work Description: xinyuiscool commented on issue #5517: [BEAM-3079] Update samza-runner with more features and improvements URL: https://github.com/apache/beam/pull/5517#issuecomment-396648065 Sorry about it. Let me take a look. Issue Time Tracking --- Worklog Id: (was: 05) Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=110949=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110949 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 12/Jun/18 04:51 Start Date: 12/Jun/18 04:51 Worklog Time Spent: 10m Work Description: kennknowles commented on issue #5517: [BEAM-3079] Update samza-runner with more features and improvements URL: https://github.com/apache/beam/pull/5517#issuecomment-396464975 There's a merge commit I see from `master` in there. And in the diff I see `UsesImpulse` being added. Can you separate the upstream sync from the upgrade, or is that not possible? Issue Time Tracking --- Worklog Id: (was: 110949) Time Spent: 1.5h (was: 1h 20m)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=110707=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110707 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 11/Jun/18 18:05 Start Date: 11/Jun/18 18:05 Worklog Time Spent: 10m Work Description: kennknowles commented on issue #5517: [BEAM-3079] Update samza-runner with more features and improvements URL: https://github.com/apache/beam/pull/5517#issuecomment-39623 Yes, sorry for the delay! Reviewing today. Issue Time Tracking --- Worklog Id: (was: 110707) Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=110706=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110706 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 11/Jun/18 18:01 Start Date: 11/Jun/18 18:01 Worklog Time Spent: 10m Work Description: xinyuiscool commented on issue #5517: [BEAM-3079] Update samza-runner with more features and improvements URL: https://github.com/apache/beam/pull/5517#issuecomment-396331962 @kennknowles : could you please help review it when you get a chance? Thanks! Issue Time Tracking --- Worklog Id: (was: 110706) Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=107520=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107520 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 31/May/18 03:05 Start Date: 31/May/18 03:05 Worklog Time Spent: 10m Work Description: xinyuiscool commented on issue #5517: [BEAM-3079] Update samza-runner with more features and improvements URL: https://github.com/apache/beam/pull/5517#issuecomment-393386979 Fixed the headers. Thanks! Issue Time Tracking --- Worklog Id: (was: 107520) Time Spent: 1h (was: 50m)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=107518=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107518 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 31/May/18 02:32 Start Date: 31/May/18 02:32 Worklog Time Spent: 10m Work Description: kennknowles commented on issue #5517: [BEAM-3079] Update samza-runner with more features and improvements URL: https://github.com/apache/beam/pull/5517#issuecomment-393382260 The reason all three precommits failed is `./gradlew :rat` which checks license headers. Go ahead and fix that and I will go ahead and review. Issue Time Tracking --- Worklog Id: (was: 107518) Time Spent: 50m (was: 40m)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=107255=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107255 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 30/May/18 17:00 Start Date: 30/May/18 17:00 Worklog Time Spent: 10m Work Description: xinyuiscool opened a new pull request #5517: [BEAM-3079] Update samza-runner with more features and improvements URL: https://github.com/apache/beam/pull/5517

Add the following feature support:
- Stateful ParDo
- Trigger using processing-time

Improvements:
- Use URN to translate PTransform
- Direct translation of Combine to avoid events buffering
- Allow sideinput watermark to populate separately
- Support broadcasting PCollectionView
- Make state store point to /tmp when running the tests

Issue Time Tracking --- Worklog Id: (was: 107255) Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (BEAM-3079) Samza runner
[ https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=106883=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106883 ] ASF GitHub Bot logged work on BEAM-3079: Author: ASF GitHub Bot Created on: 29/May/18 22:22 Start Date: 29/May/18 22:22 Worklog Time Spent: 10m Work Description: xinyuiscool opened a new pull request #5505: [BEAM-3079] Rebase Samza runner with master URL: https://github.com/apache/beam/pull/5505 Merge the latest master to samza-runner branch. Issue Time Tracking --- Worklog Id: (was: 106883) Time Spent: 10m Remaining Estimate: 0h