[jira] [Work logged] (BEAM-6557) Add IPython notebooks for quickstarts and custom I/O

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6557?focusedWorklogId=230568=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230568
 ]

ASF GitHub Bot logged work on BEAM-6557:


Author: ASF GitHub Bot
Created on: 22/Apr/19 02:35
Start Date: 22/Apr/19 02:35
Worklog Time Spent: 10m 
  Work Description: melap commented on issue #7679: [BEAM-6557] Adds an 
interactive "Try Apache Beam" page
URL: https://github.com/apache/beam/pull/7679#issuecomment-485306953
 
 
   I made some minor edit suggestions and added a language switcher in a couple 
commits -- PTAL. The content LGTM. I'm not familiar with the gradle build files 
though, so not a good person to review those.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 230568)
Time Spent: 3h 40m  (was: 3.5h)

> Add IPython notebooks for quickstarts and custom I/O
> 
>
> Key: BEAM-6557
> URL: https://issues.apache.org/jira/browse/BEAM-6557
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: David Cavazos
>Assignee: David Cavazos
>Priority: Minor
>  Labels: triaged
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5775) Make the spark runner not serialize data unless spark is spilling to disk

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5775?focusedWorklogId=230558=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230558
 ]

ASF GitHub Bot logged work on BEAM-5775:


Author: ASF GitHub Bot
Created on: 22/Apr/19 00:19
Start Date: 22/Apr/19 00:19
Worklog Time Spent: 10m 
  Work Description: mikekap commented on pull request #8371: [BEAM-5775] 
Move (most) of the batch spark pipelines' transformations to using lazy 
serialization.
URL: https://github.com/apache/beam/pull/8371
 
 
   This avoids unnecessary serialization. For example, if a groupByKey is 
happening & part of the shuffle ends up on the current worker, we'll avoid the 
unnecessary serialize/deserialize cycle.
   
   The main actual change in this PR (other than replacing `byte[]` with 
`ValueAndCoderSerializable`) is in `GroupNonMergingWindowsFunctions`. The 
semantics are slightly different in that we defer to Spark's serializer to 
serialize the values. This allows the previous optimization to keep working in 
a lazy way - if there are a lot of windows for a single value, Spark *should* 
serialize them only once since it's the same reference. In case Kryo is being 
used, the option `spark.kryo.referenceTracking` controls this behavior & 
defaults to true. For Java serialization, it's the only behavior available.
   
   I didn't touch spark streaming in this PR because I'm not sure how to 
address the backwards compatibility problem. Any thoughts there?
   
   R: @iemejia 
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 

[jira] [Work logged] (BEAM-5822) Vendor bytebuddy

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5822?focusedWorklogId=230534=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230534
 ]

ASF GitHub Bot logged work on BEAM-5822:


Author: ASF GitHub Bot
Created on: 21/Apr/19 21:09
Start Date: 21/Apr/19 21:09
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #8357: [BEAM-5822] 
Vendor bytebuddy
URL: https://github.com/apache/beam/pull/8357#issuecomment-485282514
 
 
   There are not very good instructions. We have only published a few vendored 
artifacts. You can read the mailing lists for the discussions and votes. That 
is the best I know of.
   
   What you will need to do is check in the vendored project first in a 
separate pull request.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 230534)
Time Spent: 1h  (was: 50m)

> Vendor bytebuddy
> 
>
> Key: BEAM-5822
> URL: https://issues.apache.org/jira/browse/BEAM-5822
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3489) Expose the message id of received messages within PubsubMessage

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3489?focusedWorklogId=230531=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230531
 ]

ASF GitHub Bot logged work on BEAM-3489:


Author: ASF GitHub Bot
Created on: 21/Apr/19 20:18
Start Date: 21/Apr/19 20:18
Worklog Time Spent: 10m 
  Work Description: thinhha commented on issue #8370: [BEAM-3489] add 
PubSub messageId in PubsubMessage
URL: https://github.com/apache/beam/pull/8370#issuecomment-485279607
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 230531)
Time Spent: 0.5h  (was: 20m)

> Expose the message id of received messages within PubsubMessage
> ---
>
> Key: BEAM-3489
> URL: https://issues.apache.org/jira/browse/BEAM-3489
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Luke Cwik
>Assignee: Thinh Ha
>Priority: Minor
>  Labels: newbie, starter
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This task is about passing forward the message id from the pubsub proto to 
> the java PubsubMessage.
> Add a message id field to PubsubMessage.
> Update the coder for PubsubMessage to encode the message id.
> Update the translation from the Pubsub proto message to the Dataflow message:
> https://github.com/apache/beam/blob/2e275264b21db45787833502e5e42907b05e28b8/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubUnboundedSource.java#L976



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5795) Can SQL Query 5 be simplified?

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5795?focusedWorklogId=230530=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230530
 ]

ASF GitHub Bot logged work on BEAM-5795:


Author: ASF GitHub Bot
Created on: 21/Apr/19 19:54
Start Date: 21/Apr/19 19:54
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #6757: [BEAM-5795] 
Simplify SQL Query 5
URL: https://github.com/apache/beam/pull/6757#issuecomment-485278143
 
 
   Looks like the plan for the current version even results in a join with a 
condition that is non-equijoin, which we don't support. So there's more to it. 
Probably the plan for the revised version is the same, but not necessarily. So 
that would be another benefit.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 230530)
Time Spent: 1h 10m  (was: 1h)

> Can SQL Query 5 be simplified?
> --
>
> Key: BEAM-5795
> URL: https://issues.apache.org/jira/browse/BEAM-5795
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Labels: triaged
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The original CQL query uses the ALL operator over the set of rows that are 
> within a certain period from the watermark. We instead have a fancy join, due 
> to windowing. Nonetheless, can this be simplified?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-7127) Timer parameter not supported in timer callback

2019-04-21 Thread Thomas Weise (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16822730#comment-16822730
 ] 

Thomas Weise commented on BEAM-7127:


{code:java}
  File 
"/Users/tweise/src/beam/sdks/python/apache_beam/examples/flink/flink_state.py", 
line 78, in run

    | 'statefulCount' >> beam.ParDo(StateTimerFn())

  File "apache_beam/transforms/core.py", line 979, in __init__

    super(ParDo, self).__init__(fn, *args, **kwargs)

  File "apache_beam/transforms/ptransform.py", line 689, in __init__

    self.fn = pickler.loads(pickler.dumps(self.fn))

  File "apache_beam/internal/pickler.py", line 230, in dumps

    s = dill.dumps(o)

  File 
"/Users/tweise/python-ve/beam/lib/python2.7/site-packages/dill/_dill.py", line 
294, in dumps

    dump(obj, file, protocol, byref, fmode, recurse)#, strictio)

  File 
"/Users/tweise/python-ve/beam/lib/python2.7/site-packages/dill/_dill.py", line 
287, in dump

    pik.dump(obj)

  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
 line 224, in dump

    self.save(obj)

  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
 line 331, in save

    self.save_reduce(obj=obj, *rv)

  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
 line 396, in save_reduce

    save(cls)

  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
 line 286, in save

    f(self, obj) # Call unbound method with explicit self

  File "apache_beam/internal/pickler.py", line 107, in wrapper

    return fun(pickler, obj)

  File 
"/Users/tweise/python-ve/beam/lib/python2.7/site-packages/dill/_dill.py", line 
1315, in save_type

    obj.__bases__, _dict), obj=obj)

  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
 line 401, in save_reduce

    save(args)

  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
 line 286, in save

    f(self, obj) # Call unbound method with explicit self

  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
 line 562, in save_tuple

    save(element)

  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
 line 286, in save

    f(self, obj) # Call unbound method with explicit self

  File "apache_beam/internal/pickler.py", line 198, in new_save_module_dict

    return old_save_module_dict(pickler, obj)

  File 
"/Users/tweise/python-ve/beam/lib/python2.7/site-packages/dill/_dill.py", line 
902, in save_module_dict

    StockPickler.save_dict(pickler, obj)

  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
 line 649, in save_dict

    self._batch_setitems(obj.iteritems())

  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
 line 681, in _batch_setitems

    save(v)

  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
 line 286, in save

    f(self, obj) # Call unbound method with explicit self

  File 
"/Users/tweise/python-ve/beam/lib/python2.7/site-packages/dill/_dill.py", line 
1394, in save_function

    obj.__dict__), obj=obj)

  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
 line 405, in save_reduce

    self.memoize(obj)

  File 
"/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py",
 line 244, in memoize

    assert id(obj) not in self.memo

AssertionError{code}

> Timer parameter not supported in timer callback
> ---
>
> Key: BEAM-7127
> URL: https://issues.apache.org/jira/browse/BEAM-7127
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Affects Versions: 2.12.0
>Reporter: Thomas Weise
>Priority: Major
>  Labels: portability
>
> Referencing the timer in its on_timer callback produces a recursive pickle 
> error.  
> {code:java}
> @userstate.on_timer(timer_spec)
> def process_timer(self, timer_1=beam.DoFn.TimerParam(timer_spec)):
> {code}
> Unit test: 
> [https://github.com/apache/beam/blob/cbe4dfbdbe5d0da5152568853ee5e17334dd1b54/sdks/python/apache_beam/transforms/userstate_test.py#L67]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-7127) Timer parameter not supported in timer callback

2019-04-21 Thread Thomas Weise (JIRA)
Thomas Weise created BEAM-7127:
--

 Summary: Timer parameter not supported in timer callback
 Key: BEAM-7127
 URL: https://issues.apache.org/jira/browse/BEAM-7127
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-harness
Affects Versions: 2.12.0
Reporter: Thomas Weise


Referencing the timer in its on_timer callback produces a recursive pickle 
error.  
{code:java}
@userstate.on_timer(timer_spec)
def process_timer(self, timer_1=beam.DoFn.TimerParam(timer_spec)):
{code}
Unit test: 
[https://github.com/apache/beam/blob/cbe4dfbdbe5d0da5152568853ee5e17334dd1b54/sdks/python/apache_beam/transforms/userstate_test.py#L67]
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3489) Expose the message id of received messages within PubsubMessage

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3489?focusedWorklogId=230519=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230519
 ]

ASF GitHub Bot logged work on BEAM-3489:


Author: ASF GitHub Bot
Created on: 21/Apr/19 15:16
Start Date: 21/Apr/19 15:16
Worklog Time Spent: 10m 
  Work Description: thinhha commented on issue #8370: [BEAM-3489] add 
PubSub messageId in PubsubMessage
URL: https://github.com/apache/beam/pull/8370#issuecomment-485259621
 
 
   R: @lukecwik 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 230519)
Time Spent: 20m  (was: 10m)

> Expose the message id of received messages within PubsubMessage
> ---
>
> Key: BEAM-3489
> URL: https://issues.apache.org/jira/browse/BEAM-3489
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Luke Cwik
>Assignee: Thinh Ha
>Priority: Minor
>  Labels: newbie, starter
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This task is about passing forward the message id from the pubsub proto to 
> the java PubsubMessage.
> Add a message id field to PubsubMessage.
> Update the coder for PubsubMessage to encode the message id.
> Update the translation from the Pubsub proto message to the Dataflow message:
> https://github.com/apache/beam/blob/2e275264b21db45787833502e5e42907b05e28b8/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubUnboundedSource.java#L976



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3489) Expose the message id of received messages within PubsubMessage

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3489?focusedWorklogId=230518=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230518
 ]

ASF GitHub Bot logged work on BEAM-3489:


Author: ASF GitHub Bot
Created on: 21/Apr/19 15:11
Start Date: 21/Apr/19 15:11
Worklog Time Spent: 10m 
  Work Description: thinhha commented on pull request #8370: [BEAM-3489] 
add PubSub messageId in PubsubMessage
URL: https://github.com/apache/beam/pull/8370
 
 
   Make the `recordId` field from `PubsubClient.IncomingMessage` available as a 
new field called `messageId` in `PubsubMessage`.
   Updated the coder for `PubsubMessage` to encode and decode the `messageId`.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)
 | --- | --- | ---
   
   Pre-Commit Tests Status (on master branch)
   

[jira] [Work logged] (BEAM-7112) State cleanup interferes with user timer callback

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7112?focusedWorklogId=230514=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230514
 ]

ASF GitHub Bot logged work on BEAM-7112:


Author: ASF GitHub Bot
Created on: 21/Apr/19 13:59
Start Date: 21/Apr/19 13:59
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #8351: [BEAM-7112] [flink] 
Defer state cleanup till bundle completion
URL: https://github.com/apache/beam/pull/8351#issuecomment-485253923
 
 
   @mxm thanks for the expedited review!
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 230514)
Time Spent: 3h 40m  (was: 3.5h)

> State cleanup interferes with user timer callback
> -
>
> Key: BEAM-7112
> URL: https://issues.apache.org/jira/browse/BEAM-7112
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.12.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Cleanup timers and user timers are fired at the watermark. Processing of 
> timers in the SDK worker is asynchronous, so it is possible that the state is 
> already removed when the user timer callback executes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7112) State cleanup interferes with user timer callback

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7112?focusedWorklogId=230513=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230513
 ]

ASF GitHub Bot logged work on BEAM-7112:


Author: ASF GitHub Bot
Created on: 21/Apr/19 13:58
Start Date: 21/Apr/19 13:58
Worklog Time Spent: 10m 
  Work Description: tweise commented on pull request #8351: [BEAM-7112] 
[flink] Defer state cleanup till bundle completion
URL: https://github.com/apache/beam/pull/8351#discussion_r277169275
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
 ##
 @@ -400,8 +398,35 @@ private void setTimer(WindowedValue timerElement, 
TimerInternals.TimerDa
 }
   }
 
+  @SuppressWarnings("ByteBufferBackingArray")
+  private void fireCleanupTimers() {
+while (!cleanupTimers.isEmpty()) {
+  InternalTimer timer = 
cleanupTimers.remove();
+  final ByteBuffer encodedKey = (ByteBuffer) timer.getKey();
+  try {
+// still need to process as timer, see CleanupTimer
+stateBackendLock.lock();
 
 Review comment:
   Probably, it would also require to first check that there is at least one 
timer before calling this method to not take the lock unnecessarily.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 230513)
Time Spent: 3.5h  (was: 3h 20m)

> State cleanup interferes with user timer callback
> -
>
> Key: BEAM-7112
> URL: https://issues.apache.org/jira/browse/BEAM-7112
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.12.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Cleanup timers and user timers are fired at the watermark. Processing of 
> timers in the SDK worker is asynchronous, so it is possible that the state is 
> already removed when the user timer callback executes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7112) State cleanup interferes with user timer callback

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7112?focusedWorklogId=230512=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230512
 ]

ASF GitHub Bot logged work on BEAM-7112:


Author: ASF GitHub Bot
Created on: 21/Apr/19 13:56
Start Date: 21/Apr/19 13:56
Worklog Time Spent: 10m 
  Work Description: tweise commented on pull request #8351: [BEAM-7112] 
[flink] Defer state cleanup till bundle completion
URL: https://github.com/apache/beam/pull/8351
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 230512)
Time Spent: 3h 20m  (was: 3h 10m)

> State cleanup interferes with user timer callback
> -
>
> Key: BEAM-7112
> URL: https://issues.apache.org/jira/browse/BEAM-7112
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.12.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Cleanup timers and user timers are fired at the watermark. Processing of 
> timers in the SDK worker is asynchronous, so it is possible that the state is 
> already removed when the user timer callback executes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3489) Expose the message id of received messages within PubsubMessage

2019-04-21 Thread Thinh Ha (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thinh Ha updated BEAM-3489:
---
Attachment: index.html

> Expose the message id of received messages within PubsubMessage
> ---
>
> Key: BEAM-3489
> URL: https://issues.apache.org/jira/browse/BEAM-3489
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Luke Cwik
>Assignee: Ahmed El.Hussaini
>Priority: Minor
>  Labels: newbie, starter
> Attachments: index.html
>
>
> This task is about passing forward the message id from the pubsub proto to 
> the java PubsubMessage.
> Add a message id field to PubsubMessage.
> Update the coder for PubsubMessage to encode the message id.
> Update the translation from the Pubsub proto message to the Dataflow message:
> https://github.com/apache/beam/blob/2e275264b21db45787833502e5e42907b05e28b8/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubUnboundedSource.java#L976



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7112) State cleanup interferes with user timer callback

2019-04-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7112?focusedWorklogId=230509=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230509
 ]

ASF GitHub Bot logged work on BEAM-7112:


Author: ASF GitHub Bot
Created on: 21/Apr/19 12:06
Start Date: 21/Apr/19 12:06
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #8351: [BEAM-7112] 
[flink] Defer state cleanup till bundle completion
URL: https://github.com/apache/beam/pull/8351#discussion_r277164943
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
 ##
 @@ -400,8 +398,35 @@ private void setTimer(WindowedValue timerElement, 
TimerInternals.TimerDa
 }
   }
 
+  @SuppressWarnings("ByteBufferBackingArray")
+  private void fireCleanupTimers() {
+while (!cleanupTimers.isEmpty()) {
+  InternalTimer timer = 
cleanupTimers.remove();
+  final ByteBuffer encodedKey = (ByteBuffer) timer.getKey();
+  try {
+// still need to process as timer, see CleanupTimer
+stateBackendLock.lock();
 
 Review comment:
   Consider only locking once before processing all the timers, but probably 
negligible.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 230509)
Time Spent: 3h 10m  (was: 3h)

> State cleanup interferes with user timer callback
> -
>
> Key: BEAM-7112
> URL: https://issues.apache.org/jira/browse/BEAM-7112
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.12.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Cleanup timers and user timers are fired at the watermark. Processing of 
> timers in the SDK worker is asynchronous, so it is possible that the state is 
> already removed when the user timer callback executes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)