[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117871&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117871
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 30/Jun/18 15:55
Start Date: 30/Jun/18 15:55
Worklog Time Spent: 10m 
  Work Description: xinyuiscool edited a comment on issue #471: 
[BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-401548761
 
 
   Thanks. It wasn't intended to make the Samza logo huge compared to the other runners 
:). Just saw you already put up a fix for them. Thanks a lot for the help!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 117871)
Time Spent: 5h 10m  (was: 5h)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-samza
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117867&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117867
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 30/Jun/18 15:44
Start Date: 30/Jun/18 15:44
Worklog Time Spent: 10m 
  Work Description: xinyuiscool commented on issue #471: [BEAM-3079]: Samza 
Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-401548761
 
 
   Thanks. It wasn't intended to make the Samza logo huge compared to the other runners 
:). I will try to open another PR today to make it smaller.




Issue Time Tracking
---

Worklog Id: (was: 117867)
Time Spent: 5h  (was: 4h 50m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-samza
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.





[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117758&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117758
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 30/Jun/18 04:51
Start Date: 30/Jun/18 04:51
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #471: [BEAM-3079]: Samza 
Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-401517206
 
 
   I just made the tiny fixes.




Issue Time Tracking
---

Worklog Id: (was: 117758)
Time Spent: 4h 50m  (was: 4h 40m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-samza
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.





[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117757&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117757
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 30/Jun/18 04:49
Start Date: 30/Jun/18 04:49
Worklog Time Spent: 10m 
  Work Description: asfgit closed pull request #471: [BEAM-3079]: Samza 
Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/src/_data/capability-matrix.yml b/src/_data/capability-matrix.yml
index acac0ad40..508ac1f42 100644
--- a/src/_data/capability-matrix.yml
+++ b/src/_data/capability-matrix.yml
@@ -17,6 +17,8 @@ columns:
 name: JStorm
   - class: ibmstreams
 name: IBM Streams
+  - class: samza
+name: Apache Samza
 
 categories:
   - description: What is being computed?
@@ -64,6 +66,10 @@ categories:
 l1: 'Yes'
 l2: fully supported
 l3: ''
+  - class: samza
+l1: 'Yes'
+l2: fully supported
+l3: Supported with per-element transformation.
   - name: GroupByKey
 values:
   - class: model
@@ -102,6 +108,10 @@ categories:
 l1: 'Yes'
 l2: fully supported
 l3: ''
+  - class: samza
+l1: 'Yes'
+l2: fully supported
+l3: "Uses Samza's partitionBy for key grouping and Beam's logic 
for window aggregation and triggering."
   - name: Flatten
 values:
   - class: model
@@ -140,6 +150,10 @@ categories:
 l1: 'Yes'
 l2: fully supported
 l3: ''
+  - class: samza
+l1: 'Yes'
+l2: fully supported
+l3: ''
   - name: Combine
 values:
   - class: model
@@ -178,6 +192,10 @@ categories:
 l1: 'Yes'
 l2: fully supported
 l3: ''
+  - class: samza
+l1: 'Yes'
+l2: fully supported
+l3: Use combiner for efficient pre-aggregation.
   - name: Composite Transforms
 values:
   - class: model
@@ -216,6 +234,10 @@ categories:
 l1: 'Partially'
 l2: supported via inlining
 l3: ''
+  - class: samza
+l1: 'Partially'
+l2: supported via inlining
+l3: ''
   - name: Side Inputs
 values:
   - class: model
@@ -254,6 +276,10 @@ categories:
 l1: 'Yes'
 l2: fully supported
 l3: ''
+  - class: samza
+l1: 'Yes'
+l2: fully supported
+l3: Uses Samza's broadcast operator to distribute the side inputs.
   - name: Source API
 values:
   - class: model
@@ -292,6 +318,10 @@ categories:
 l1: 'Yes'
 l2: fully supported
 l3: ''
+  - class: samza
+l1: 'Yes'
+l2: fully supported
+l3: ''
   - name: Splittable DoFn
 values:
   - class: model
@@ -329,7 +359,11 @@ categories:
   - class: ibmstreams
 l1: 'No'
 l2: not implemented
-l3: 
+l3:
+  - class: samza
+l1: 'No'
+l2: not implemented
+l3:
   - name: Metrics
 values:
   - class: model
@@ -368,6 +402,10 @@ categories:
 l1: 'Partially'
 l2: All metrics types are supported.
 l3: Only attempted values are supported. No committed values for 
metrics.
+  - class: samza
+l1: 'Partially'
+l2: Counter and Gauge are supported.
+l3: Only attempted values are supported. No committed values for 
metrics.
   - name: Stateful Processing
 values:
   - class: model
@@ -406,6 +444,10 @@ categories:
 l1: 'Partially'
 l2: non-merging windows
 l3: ''
+  - class: samza
+l1: 'Partially'
+l2: non-merging windows
+l3: 'States are backed up by either rocksDb KV store or in-memory 
hash map, and persist using changelog.'
   - description: Where in event time?
 anchor: where
 color-b: '37d'
@@ -451,6 +493,10 @@ categories:
 l1: 'Yes'
 l2: supported
 l3: ''
+  - class: samza
+l1: 'Yes'
+l2: supported
+l3: ''
   - name: Fixed windows
 values:
   - class: model
@@ -489,6 +535,10 @@ categories:
 l1: 'Yes'
 l2: supported

[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117742&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117742
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 30/Jun/18 03:31
Start Date: 30/Jun/18 03:31
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on a change in pull request #471: 
[BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#discussion_r199311660
 
 

 ##
 File path: src/get-started/beam-overview.md
 ##
 @@ -37,6 +37,7 @@ Beam currently supports Runners that work with the following 
distributed process
 * Apache Gearpump (incubating) ![Apache Gearpump logo]({{ 
"/images/logos/runners/gearpump.png" | prepend: site.baseurl }})
 * Apache Spark ![Apache Spark logo]({{ "/images/logos/runners/spark.png" | 
prepend: site.baseurl }})
 * Google Cloud Dataflow ![Google Cloud Dataflow logo]({{ 
"/images/logos/runners/dataflow.png" | prepend: site.baseurl }})
+* Apache Samza ![Apache Samza logo]({{ "/images/logos/runners/samza.png" | 
prepend: site.baseurl }})
 
 Review comment:
   If you check out the generated page, this logo really is huge! It might be 
easy to style it in markdown, or find a smaller image.




Issue Time Tracking
---

Worklog Id: (was: 117742)
Time Spent: 4.5h  (was: 4h 20m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-samza
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.





[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117743&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117743
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 30/Jun/18 03:31
Start Date: 30/Jun/18 03:31
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on a change in pull request #471: 
[BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#discussion_r199311732
 
 

 ##
 File path: src/documentation/index.md
 ##
 @@ -46,6 +46,7 @@ A Beam Runner runs a Beam pipeline on a specific (often 
distributed) data proces
 * [SparkRunner]({{ site.baseurl }}/documentation/runners/spark/): Runs on 
[Apache Spark](http://spark.apache.org).
 * [DataflowRunner]({{ site.baseurl }}/documentation/runners/dataflow/): Runs 
on [Google Cloud Dataflow](https://cloud.google.com/dataflow), a fully managed 
service within [Google Cloud Platform](https://cloud.google.com/).
 * [GearpumpRunner]({{ site.baseurl }}/documentation/runners/gearpump/): Runs 
on [Apache Gearpump (incubating)](http://gearpump.apache.org).
+* [SamzaRunner]({{ site.baseurl }}/documentation/runners/samza/): Runs on 
[Apache Samza](http://samza.apache.org).
 
 Review comment:
   It is OK to have things on the site that are not yet released. We haven't 
really got a good process for that. So we put stuff up when it hits master, 
usually, and sometimes with a little note. You can leave it here. As a 
follow-up, they should be in alphabetical order.




Issue Time Tracking
---

Worklog Id: (was: 117743)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-samza
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.





[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117575&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117575
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 29/Jun/18 20:17
Start Date: 29/Jun/18 20:17
Worklog Time Spent: 10m 
  Work Description: xinyuiscool commented on issue #471: [BEAM-3079]: Samza 
Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-401462943
 
 
   No problem. It's easy to revert it :)




Issue Time Tracking
---

Worklog Id: (was: 117575)
Time Spent: 4h 20m  (was: 4h 10m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-samza
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.





[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117568&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117568
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 29/Jun/18 19:54
Start Date: 29/Jun/18 19:54
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #471: [BEAM-3079]: Samza 
Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-401457589
 
 
   Oh that was an old comment. Now it should go to the main page. Sorry!




Issue Time Tracking
---

Worklog Id: (was: 117568)
Time Spent: 4h 10m  (was: 4h)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-samza
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.





[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117498&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117498
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 29/Jun/18 17:43
Start Date: 29/Jun/18 17:43
Worklog Time Spent: 10m 
  Work Description: xinyuiscool commented on a change in pull request #471: 
[BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#discussion_r199232774
 
 

 ##
 File path: src/documentation/index.md
 ##
 @@ -46,6 +46,7 @@ A Beam Runner runs a Beam pipeline on a specific (often 
distributed) data proces
 * [SparkRunner]({{ site.baseurl }}/documentation/runners/spark/): Runs on 
[Apache Spark](http://spark.apache.org).
 * [DataflowRunner]({{ site.baseurl }}/documentation/runners/dataflow/): Runs 
on [Google Cloud Dataflow](https://cloud.google.com/dataflow), a fully managed 
service within [Google Cloud Platform](https://cloud.google.com/).
 * [GearpumpRunner]({{ site.baseurl }}/documentation/runners/gearpump/): Runs 
on [Apache Gearpump (incubating)](http://gearpump.apache.org).
+* [SamzaRunner]({{ site.baseurl }}/documentation/runners/samza/): Runs on 
[Apache Samza](http://samza.apache.org).
 
 Review comment:
   Sure. Since the Samza runner is already on master, I assume it will go out as 
part of the 2.6.0 release? So we will update this part after the release, if I 
understand correctly.




Issue Time Tracking
---

Worklog Id: (was: 117498)
Time Spent: 4h  (was: 3h 50m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.





[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117497&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117497
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 29/Jun/18 17:43
Start Date: 29/Jun/18 17:43
Worklog Time Spent: 10m 
  Work Description: xinyuiscool commented on a change in pull request #471: 
[BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#discussion_r199232774
 
 

 ##
 File path: src/documentation/index.md
 ##
 @@ -46,6 +46,7 @@ A Beam Runner runs a Beam pipeline on a specific (often 
distributed) data proces
 * [SparkRunner]({{ site.baseurl }}/documentation/runners/spark/): Runs on 
[Apache Spark](http://spark.apache.org).
 * [DataflowRunner]({{ site.baseurl }}/documentation/runners/dataflow/): Runs 
on [Google Cloud Dataflow](https://cloud.google.com/dataflow), a fully managed 
service within [Google Cloud Platform](https://cloud.google.com/).
 * [GearpumpRunner]({{ site.baseurl }}/documentation/runners/gearpump/): Runs 
on [Apache Gearpump (incubating)](http://gearpump.apache.org).
+* [SamzaRunner]({{ site.baseurl }}/documentation/runners/samza/): Runs on 
[Apache Samza](http://samza.apache.org).
 
 Review comment:
   Sure. Since the Samza runner is already on master, I assume it will go out as 
part of the 2.6.0 release? So we will update this part after the release, if I 
understand correctly.




Issue Time Tracking
---

Worklog Id: (was: 117497)
Time Spent: 3h 50m  (was: 3h 40m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.





[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117427&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117427
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 29/Jun/18 15:51
Start Date: 29/Jun/18 15:51
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #471: [BEAM-3079]: Samza 
Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-401396220
 
 
   OK it turns out we should continue here until the web site is truly merged 
all the way.




Issue Time Tracking
---

Worklog Id: (was: 117427)
Time Spent: 3h 40m  (was: 3.5h)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.





[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=117426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-117426
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 29/Jun/18 15:51
Start Date: 29/Jun/18 15:51
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on a change in pull request #471: 
[BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#discussion_r196122361
 
 

 ##
 File path: src/documentation/index.md
 ##
 @@ -46,6 +46,7 @@ A Beam Runner runs a Beam pipeline on a specific (often 
distributed) data proces
 * [SparkRunner]({{ site.baseurl }}/documentation/runners/spark/): Runs on 
[Apache Spark](http://spark.apache.org).
 * [DataflowRunner]({{ site.baseurl }}/documentation/runners/dataflow/): Runs 
on [Google Cloud Dataflow](https://cloud.google.com/dataflow), a fully managed 
service within [Google Cloud Platform](https://cloud.google.com/).
 * [GearpumpRunner]({{ site.baseurl }}/documentation/runners/gearpump/): Runs 
on [Apache Gearpump (incubating)](http://gearpump.apache.org).
+* [SamzaRunner]({{ site.baseurl }}/documentation/runners/samza/): Runs on 
[Apache Samza](http://samza.apache.org).
 
 Review comment:
   This area is actually reserved for runners that are on `master` and released 
as part of Beam. It is a bit confusing, since we include them all in the 
capability matrix. Currently the place we reference in-progress runners is 
http://apache-beam-website-pull-requests.storage.googleapis.com/471/contribute/index.html#works-in-progress




Issue Time Tracking
---

Worklog Id: (was: 117426)
Time Spent: 3.5h  (was: 3h 20m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.





[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=114953&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-114953
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 22/Jun/18 22:33
Start Date: 22/Jun/18 22:33
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #471: [BEAM-3079]: Samza 
Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-399600876
 
 
   The website is actually being moved to the main repo, so let's wait just a 
little and then merge this over there.




Issue Time Tracking
---

Worklog Id: (was: 114953)
Time Spent: 3h 20m  (was: 3h 10m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.





[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=112783&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112783
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 18/Jun/18 15:27
Start Date: 18/Jun/18 15:27
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #471: [BEAM-3079]: Samza 
Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471#issuecomment-398094170
 
 
   
http://apache-beam-website-pull-requests.storage.googleapis.com/471/index.html




Issue Time Tracking
---

Worklog Id: (was: 112783)
Time Spent: 3h 10m  (was: 3h)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.





[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=112779&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112779
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 18/Jun/18 15:25
Start Date: 18/Jun/18 15:25
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on a change in pull request 
#5517: [BEAM-3079] Update samza-runner with more features and improvements
URL: https://github.com/apache/beam/pull/5517#discussion_r196119260
 
 

 ##
 File path: 
runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/DoFnRunnerWithKeyedInternals.java
 ##
 @@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.runners.samza.runtime;
+
+import static com.google.common.base.Preconditions.checkState;
+
+import java.util.List;
+import org.apache.beam.runners.core.DoFnRunner;
+import org.apache.beam.runners.core.DoFnRunners;
+import org.apache.beam.runners.core.KeyedWorkItem;
+import org.apache.beam.runners.core.SideInputReader;
+import org.apache.beam.runners.core.StateInternals;
+import org.apache.beam.runners.core.StateInternalsFactory;
+import org.apache.beam.runners.core.StatefulDoFnRunner;
+import org.apache.beam.runners.core.StepContext;
+import org.apache.beam.runners.core.TimerInternals;
+import org.apache.beam.runners.core.TimerInternalsFactory;
+import org.apache.beam.runners.samza.metrics.DoFnRunnerWithMetrics;
+import org.apache.beam.runners.samza.metrics.SamzaMetricsContainer;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.state.TimeDomain;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.reflect.DoFnSignature;
+import org.apache.beam.sdk.transforms.reflect.DoFnSignatures;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.sdk.values.TypeDescriptor;
+import org.apache.beam.sdk.values.WindowingStrategy;
+import org.joda.time.Instant;
+
+/**
+ * This class wraps a DoFnRunner with keyed StateInternals and TimerInternals access.
+ */
+public class DoFnRunnerWithKeyedInternals implements 
DoFnRunner {
+  private final DoFnRunner underlying;
+  private final KeyedInternals keyedInternals;
+
+  public static  DoFnRunner of(
+  PipelineOptions options,
+  DoFn doFn,
+  SideInputReader sideInputReader,
+  DoFnRunners.OutputManager outputManager,
+  TupleTag mainOutputTag,
+  List<TupleTag<?>> additionalOutputTags,
+  StateInternalsFactory stateInternalsFactory,
+  TimerInternalsFactory timerInternalsFactory,
+  WindowingStrategy windowingStrategy,
+  SamzaMetricsContainer metricsContainer,
+  String stepName) {
+
+final DoFnSignature signature = DoFnSignatures.getSignature(doFn.getClass());
+final KeyedInternals keyedInternals;
+final TimerInternals timerInternals;
+final StateInternals stateInternals;
+
+if (signature.usesState()) {
+  keyedInternals = new KeyedInternals(stateInternalsFactory, timerInternalsFactory);
+  stateInternals = keyedInternals.stateInternals();
+  timerInternals = keyedInternals.timerInternals();
+} else {
+  keyedInternals = null;
+  stateInternals = stateInternalsFactory.stateInternalsForKey(null);
+  timerInternals = timerInternalsFactory.timerInternalsForKey(null);
+}
+
+final DoFnRunner doFnRunner = DoFnRunners.simpleRunner(
+options,
+doFn,
+sideInputReader,
+outputManager,
+mainOutputTag,
+additionalOutputTags,
+createStepContext(stateInternals, timerInternals),
+windowingStrategy);
+
+final DoFnRunner doFnRunnerWithMetrics = DoFnRunnerWithMetrics
+.wrap(doFnRunner, metricsContainer, stepName);
+
+if (keyedInternals != null) {
 
 Review comment:
   This conditional goes right along with the one above to make just two code 
paths: the one 

[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=112780=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112780
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 18/Jun/18 15:25
Start Date: 18/Jun/18 15:25
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on a change in pull request 
#5517: [BEAM-3079] Update samza-runner with more features and improvements
URL: https://github.com/apache/beam/pull/5517#discussion_r196119675
 
 

 ##
 File path: 
runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/DoFnRunnerWithKeyedInternals.java
 ##
 @@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.runners.samza.runtime;
+
+import static com.google.common.base.Preconditions.checkState;
+
+import java.util.List;
+import org.apache.beam.runners.core.DoFnRunner;
+import org.apache.beam.runners.core.DoFnRunners;
+import org.apache.beam.runners.core.KeyedWorkItem;
+import org.apache.beam.runners.core.SideInputReader;
+import org.apache.beam.runners.core.StateInternals;
+import org.apache.beam.runners.core.StateInternalsFactory;
+import org.apache.beam.runners.core.StatefulDoFnRunner;
+import org.apache.beam.runners.core.StepContext;
+import org.apache.beam.runners.core.TimerInternals;
+import org.apache.beam.runners.core.TimerInternalsFactory;
+import org.apache.beam.runners.samza.metrics.DoFnRunnerWithMetrics;
+import org.apache.beam.runners.samza.metrics.SamzaMetricsContainer;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.state.TimeDomain;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.reflect.DoFnSignature;
+import org.apache.beam.sdk.transforms.reflect.DoFnSignatures;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.sdk.values.TypeDescriptor;
+import org.apache.beam.sdk.values.WindowingStrategy;
+import org.joda.time.Instant;
+
+/**
+ * This class wraps a DoFnRunner with keyed StateInternals and TimerInternals access.
+ */
+public class DoFnRunnerWithKeyedInternals implements DoFnRunner {
+  private final DoFnRunner underlying;
+  private final KeyedInternals keyedInternals;
+
+  public static  DoFnRunner of(
+  PipelineOptions options,
+  DoFn doFn,
+  SideInputReader sideInputReader,
+  DoFnRunners.OutputManager outputManager,
+  TupleTag mainOutputTag,
+  List<TupleTag<?>> additionalOutputTags,
+  StateInternalsFactory stateInternalsFactory,
+  TimerInternalsFactory timerInternalsFactory,
+  WindowingStrategy windowingStrategy,
+  SamzaMetricsContainer metricsContainer,
+  String stepName) {
+
+final DoFnSignature signature = DoFnSignatures.getSignature(doFn.getClass());
+final KeyedInternals keyedInternals;
+final TimerInternals timerInternals;
+final StateInternals stateInternals;
+
+if (signature.usesState()) {
+  keyedInternals = new KeyedInternals(stateInternalsFactory, timerInternalsFactory);
+  stateInternals = keyedInternals.stateInternals();
+  timerInternals = keyedInternals.timerInternals();
+} else {
+  keyedInternals = null;
+  stateInternals = stateInternalsFactory.stateInternalsForKey(null);
+  timerInternals = timerInternalsFactory.timerInternalsForKey(null);
+}
+
+final DoFnRunner doFnRunner = DoFnRunners.simpleRunner(
+options,
+doFn,
+sideInputReader,
+outputManager,
+mainOutputTag,
+additionalOutputTags,
+createStepContext(stateInternals, timerInternals),
 
 Review comment:
   Ah, is this why it is organized this way? OK.



[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=112781=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112781
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 18/Jun/18 15:25
Start Date: 18/Jun/18 15:25
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on a change in pull request 
#5517: [BEAM-3079] Update samza-runner with more features and improvements
URL: https://github.com/apache/beam/pull/5517#discussion_r196119044
 
 

 ##
 File path: 
runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/DoFnRunnerWithKeyedInternals.java
 ##
 @@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.runners.samza.runtime;
+
+import static com.google.common.base.Preconditions.checkState;
+
+import java.util.List;
+import org.apache.beam.runners.core.DoFnRunner;
+import org.apache.beam.runners.core.DoFnRunners;
+import org.apache.beam.runners.core.KeyedWorkItem;
+import org.apache.beam.runners.core.SideInputReader;
+import org.apache.beam.runners.core.StateInternals;
+import org.apache.beam.runners.core.StateInternalsFactory;
+import org.apache.beam.runners.core.StatefulDoFnRunner;
+import org.apache.beam.runners.core.StepContext;
+import org.apache.beam.runners.core.TimerInternals;
+import org.apache.beam.runners.core.TimerInternalsFactory;
+import org.apache.beam.runners.samza.metrics.DoFnRunnerWithMetrics;
+import org.apache.beam.runners.samza.metrics.SamzaMetricsContainer;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.state.TimeDomain;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.reflect.DoFnSignature;
+import org.apache.beam.sdk.transforms.reflect.DoFnSignatures;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.sdk.values.TypeDescriptor;
+import org.apache.beam.sdk.values.WindowingStrategy;
+import org.joda.time.Instant;
+
+/**
+ * This class wraps a DoFnRunner with keyed StateInternals and TimerInternals access.
+ */
+public class DoFnRunnerWithKeyedInternals implements DoFnRunner {
+  private final DoFnRunner underlying;
+  private final KeyedInternals keyedInternals;
+
+  public static  DoFnRunner of(
+  PipelineOptions options,
+  DoFn doFn,
+  SideInputReader sideInputReader,
+  DoFnRunners.OutputManager outputManager,
+  TupleTag mainOutputTag,
+  List<TupleTag<?>> additionalOutputTags,
+  StateInternalsFactory stateInternalsFactory,
+  TimerInternalsFactory timerInternalsFactory,
+  WindowingStrategy windowingStrategy,
+  SamzaMetricsContainer metricsContainer,
+  String stepName) {
+
+final DoFnSignature signature = DoFnSignatures.getSignature(doFn.getClass());
+final KeyedInternals keyedInternals;
+final TimerInternals timerInternals;
+final StateInternals stateInternals;
+
+if (signature.usesState()) {
 
 Review comment:
   If it doesn't use state, you don't need to create a 
`DoFnRunnerWithKeyedInternals`, do you? I think it would clean this up to 
refactor the constructor without the conditionals here and below.
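
A minimal sketch of the refactor suggested above, under stated assumptions: `Runner`, `KeyScope`, `KeyedRunner` and `RunnerFactory` are hypothetical stand-ins for `DoFnRunner`, `KeyedInternals`, `DoFnRunnerWithKeyedInternals` and its `of(...)` factory, not the real Beam classes. The idea is that the factory makes the single stateful/stateless decision, so the wrapper itself never has to check for a null `keyedInternals`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for Beam's DoFnRunner; only the shape matters here. */
interface Runner {
  void process(String key, String element);
}

/** Hypothetical stand-in for KeyedInternals: records which key is in scope. */
final class KeyScope {
  private String currentKey;
  final List<String> log = new ArrayList<>();

  void setKey(String key) {
    currentKey = key;
    log.add("set:" + key);
  }

  void clearKey() {
    log.add("clear:" + currentKey);
    currentKey = null;
  }
}

/** Wrapper that always holds a non-null KeyScope, so it needs no null checks. */
final class KeyedRunner implements Runner {
  private final Runner underlying;
  private final KeyScope scope;

  KeyedRunner(Runner underlying, KeyScope scope) {
    this.underlying = underlying;
    this.scope = scope;
  }

  @Override
  public void process(String key, String element) {
    scope.setKey(key);
    try {
      underlying.process(key, element);
    } finally {
      // Always clear the key, even if the underlying runner throws.
      scope.clearKey();
    }
  }
}

final class RunnerFactory {
  private RunnerFactory() {}

  /** The only conditional lives here: wrap when stateful, else return the plain runner. */
  static Runner of(Runner plain, boolean usesState, KeyScope scope) {
    return usesState ? new KeyedRunner(plain, scope) : plain;
  }
}
```

With this shape a stateless DoFn gets the plain runner back unchanged, and only a stateful one pays for the key bookkeeping.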


Issue Time Tracking
---

Worklog Id: (was: 112781)
Time Spent: 2h 50m  (was: 2h 40m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: 

[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=112414=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112414
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 15/Jun/18 18:45
Start Date: 15/Jun/18 18:45
Worklog Time Spent: 10m 
  Work Description: xinyuiscool opened a new pull request #471: 
[BEAM-3079]: Samza Runner docs and capability matrix
URL: https://github.com/apache/beam-site/pull/471
 
 
   Add the Samza runner docs, and update the capability matrix and wordcount 
examples with Samza runner.


Issue Time Tracking
---

Worklog Id: (was: 112414)
Time Spent: 2.5h  (was: 2h 20m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.



[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=63=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-63
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 12/Jun/18 17:38
Start Date: 12/Jun/18 17:38
Worklog Time Spent: 10m 
  Work Description: xinyuiscool commented on issue #5517: [BEAM-3079] 
Update samza-runner with more features and improvements
URL: https://github.com/apache/beam/pull/5517#issuecomment-396673409
 
 
   hmm, seems the headers are missing again in this patch. Let me quickly add 
them.


Issue Time Tracking
---

Worklog Id: (was: 63)
Time Spent: 2h 20m  (was: 2h 10m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.



[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=56=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-56
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 12/Jun/18 17:27
Start Date: 12/Jun/18 17:27
Worklog Time Spent: 10m 
  Work Description: xinyuiscool commented on issue #5517: [BEAM-3079] 
Update samza-runner with more features and improvements
URL: https://github.com/apache/beam/pull/5517#issuecomment-396669181
 
 
   @kennknowles : I squashed all my commits into one so the changes are 
separated from the upstream merge. I think UsesImpulses tests are added in the 
master and Samza doesn't support it right now.


Issue Time Tracking
---

Worklog Id: (was: 56)
Time Spent: 2h 10m  (was: 2h)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.



[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=50=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-50
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 12/Jun/18 17:25
Start Date: 12/Jun/18 17:25
Worklog Time Spent: 10m 
  Work Description: xinyuiscool commented on issue #5517: [BEAM-3079] 
Update samza-runner with more features and improvements
URL: https://github.com/apache/beam/pull/5517#issuecomment-396669181
 
 
   @kennknowles : I squashed all the commits into one so the changes are 
separated from the upstream merge. I think UsesImpulses tests are added in the 
master and Samza doesn't support it right now.


Issue Time Tracking
---

Worklog Id: (was: 50)
Time Spent: 2h  (was: 1h 50m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.



[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=49=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-49
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 12/Jun/18 17:24
Start Date: 12/Jun/18 17:24
Worklog Time Spent: 10m 
  Work Description: xinyuiscool commented on issue #5517: [BEAM-3079] 
Update samza-runner with more features and improvements
URL: https://github.com/apache/beam/pull/5517#issuecomment-396669181
 
 
   @kennknowles : I squashed all the commits into one so the changes are 
separated from the merge. I think UsesImpulses tests are added in the master 
and Samza doesn't support it right now.


Issue Time Tracking
---

Worklog Id: (was: 49)
Time Spent: 1h 50m  (was: 1h 40m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.



[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=05=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-05
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 12/Jun/18 16:15
Start Date: 12/Jun/18 16:15
Worklog Time Spent: 10m 
  Work Description: xinyuiscool commented on issue #5517: [BEAM-3079] 
Update samza-runner with more features and improvements
URL: https://github.com/apache/beam/pull/5517#issuecomment-396648065
 
 
   Sorry about it. Let me take a look.


Issue Time Tracking
---

Worklog Id: (was: 05)
Time Spent: 1h 40m  (was: 1.5h)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.



[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=110949=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110949
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 12/Jun/18 04:51
Start Date: 12/Jun/18 04:51
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #5517: [BEAM-3079] 
Update samza-runner with more features and improvements
URL: https://github.com/apache/beam/pull/5517#issuecomment-396464975
 
 
   There's a merge commit I see from `master` in there. And in the diff I see 
`UsesImpulse` being added. Can you separate the upstream sync from the upgrade, 
or is that not possible?


Issue Time Tracking
---

Worklog Id: (was: 110949)
Time Spent: 1.5h  (was: 1h 20m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.



[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=110707=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110707
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 11/Jun/18 18:05
Start Date: 11/Jun/18 18:05
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #5517: [BEAM-3079] 
Update samza-runner with more features and improvements
URL: https://github.com/apache/beam/pull/5517#issuecomment-39623
 
 
   Yes, sorry for the delay! Reviewing today.


Issue Time Tracking
---

Worklog Id: (was: 110707)
Time Spent: 1h 20m  (was: 1h 10m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.



[jira] [Work logged] (BEAM-3079) Samza runner

2018-06-11 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=110706=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-110706
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 11/Jun/18 18:01
Start Date: 11/Jun/18 18:01
Worklog Time Spent: 10m 
  Work Description: xinyuiscool commented on issue #5517: [BEAM-3079] 
Update samza-runner with more features and improvements
URL: https://github.com/apache/beam/pull/5517#issuecomment-396331962
 
 
   @kennknowles : could you please help review it when you get a chance? Thanks!


Issue Time Tracking
---

Worklog Id: (was: 110706)
Time Spent: 1h 10m  (was: 1h)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.



[jira] [Work logged] (BEAM-3079) Samza runner

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=107520=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107520
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 31/May/18 03:05
Start Date: 31/May/18 03:05
Worklog Time Spent: 10m 
  Work Description: xinyuiscool commented on issue #5517: [BEAM-3079] 
Update samza-runner with more features and improvements
URL: https://github.com/apache/beam/pull/5517#issuecomment-393386979
 
 
   Fixed the headers. Thanks!


Issue Time Tracking
---

Worklog Id: (was: 107520)
Time Spent: 1h  (was: 50m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.



[jira] [Work logged] (BEAM-3079) Samza runner

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=107518=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107518
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 31/May/18 02:32
Start Date: 31/May/18 02:32
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #5517: [BEAM-3079] 
Update samza-runner with more features and improvements
URL: https://github.com/apache/beam/pull/5517#issuecomment-393382260
 
 
   The reason all three precommits failed is `./gradlew :rat` which checks 
license headers. Go ahead and fix that and I will go ahead and review.
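
As a rough illustration of what a license-header check looks for (this is not how the `:rat` task is implemented; `HeaderCheck`, its `MARKER` string and the `maxLines` parameter are assumptions made for this sketch):

```java
import java.util.List;

/** Hypothetical, much-simplified header check; the real :rat task is far more thorough. */
final class HeaderCheck {
  // Assumed marker: the opening words of the ASF header quoted elsewhere in this thread.
  static final String MARKER = "Licensed to the Apache Software Foundation (ASF)";

  private HeaderCheck() {}

  /** Returns true if the marker appears within the first maxLines lines of the file. */
  static boolean hasLicenseHeader(List<String> lines, int maxLines) {
    int limit = Math.min(lines.size(), maxLines);
    for (int i = 0; i < limit; i++) {
      if (lines.get(i).contains(MARKER)) {
        return true;
      }
    }
    return false;
  }
}
```

A source file that starts directly with its `package` line would fail a check like this, which is consistent with the precommit failure described above.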


Issue Time Tracking
---

Worklog Id: (was: 107518)
Time Spent: 50m  (was: 40m)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.



[jira] [Work logged] (BEAM-3079) Samza runner

2018-05-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=107255=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-107255
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 30/May/18 17:00
Start Date: 30/May/18 17:00
Worklog Time Spent: 10m 
  Work Description: xinyuiscool opened a new pull request #5517: 
[BEAM-3079] Update samza-runner with more features and improvements
URL: https://github.com/apache/beam/pull/5517
 
 
   Add the following feature support:
   - Stateful ParDo
   - Trigger using processing-time
   
   Improvements:
   - Use URN to translate PTransform
   - Direct translation of Combine to avoid events buffering
   - Allow sideinput watermark to populate separately
   - Support broadcasting PCollectionView
   - Make state store point to /tmp when running the tests


Issue Time Tracking
---

Worklog Id: (was: 107255)
Time Spent: 40m  (was: 0.5h)

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.



[jira] [Work logged] (BEAM-3079) Samza runner

2018-05-29 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3079?focusedWorklogId=106883=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-106883
 ]

ASF GitHub Bot logged work on BEAM-3079:


Author: ASF GitHub Bot
Created on: 29/May/18 22:22
Start Date: 29/May/18 22:22
Worklog Time Spent: 10m 
  Work Description: xinyuiscool opened a new pull request #5505: 
[BEAM-3079] Rebase Samza runner with master
URL: https://github.com/apache/beam/pull/5505
 
 
   Merge the latest master to samza-runner branch.


Issue Time Tracking
---

Worklog Id: (was: 106883)
Time Spent: 10m
Remaining Estimate: 0h

> Samza runner
> 
>
> Key: BEAM-3079
> URL: https://issues.apache.org/jira/browse/BEAM-3079
> Project: Beam
>  Issue Type: Wish
>  Components: runner-ideas
>Reporter: Xinyu Liu
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Apache Samza is a distributed data-processing platform which supports both 
> stream and batch processing. It'll be awesome if we can run BEAM's advanced 
> data transform and multi-language sdks on top of Samza.


