[jira] [Commented] (BEAM-7274) Protobuf Beam Schema support

2019-08-30 Thread Alex Van Boxel (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920033#comment-16920033
 ] 

Alex Van Boxel commented on BEAM-7274:
--

PR ready for review

> Protobuf Beam Schema support
> 
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (BEAM-5967) ProtoCoder doesn't support DynamicMessage

2019-08-30 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel resolved BEAM-5967.
--
Fix Version/s: 2.16.0
   Resolution: Fixed

Object equality now handled by ProtoDomain. Upgradability is tested from 2.14.0 
to  -- 2.16.0-SNAPSHOT. Waiting for reviewers.

> ProtoCoder doesn't support DynamicMessage
> -
>
> Key: BEAM-5967
> URL: https://issues.apache.org/jira/browse/BEAM-5967
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.8.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The ProtoCoder does make some assumptions about static messages being 
> available. The DynamicMessage doesn't have some of them, mainly because the 
> proto schema is defined at runtime and not at compile time.
> Does it make sense to make a special coder for DynamicMessage or build it 
> into the normal ProtoCoder.
> Here is an example of the assumtion being made in the current Codec:
> {code:java}
> try {
>   @SuppressWarnings("unchecked")
>   T protoMessageInstance = (T) 
> protoMessageClass.getMethod("getDefaultInstance").invoke(null);
>   @SuppressWarnings("unchecked")
>   Parser tParser = (Parser) protoMessageInstance.getParserForType();
>   memoizedParser = tParser;
> } catch (IllegalAccessException | InvocationTargetException | 
> NoSuchMethodException e) {
>   throw new IllegalArgumentException(e);
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Closed] (BEAM-7312) SchemaProvider can't be used with dynamic types

2019-08-30 Thread Alex Van Boxel (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Van Boxel closed BEAM-7312.

Fix Version/s: 2.14.0
   Resolution: Won't Fix

This is not the right mechanism for handling dynamic types. Closing with Won't 
Fix.

> SchemaProvider can't be used with dynamic types
> ---
>
> Key: BEAM-7312
> URL: https://issues.apache.org/jira/browse/BEAM-7312
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.14.0
>
>
> Looking at the java doc comment of SchemaProvider it hints at getting 
> schema's from external system. But as the provider only access type this is 
> in general impossible:
> Say you have 2 dynamic types, say Avro, as a java type they have both 
> GenericRecord. Using the current interface it's impossible to make the 
> difference between both dynamic types.
> As getting information from an external system I propose extending the 
> Provider interface by adding an extra parameter to the interface. It would be 
> a string with a URN.
> The URN could indicated for example
>  * Pub/Sub subscription/topic
>  * Kafka topic
>  * whatever... 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (BEAM-8125) SchemaCoder/RowCoder.verifyDeterministic throws NPE when schema has a logical type

2019-08-30 Thread Brian Hulette (Jira)
Brian Hulette created BEAM-8125:
---

 Summary: SchemaCoder/RowCoder.verifyDeterministic throws NPE when 
schema has a logical type
 Key: BEAM-8125
 URL: https://issues.apache.org/jira/browse/BEAM-8125
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Affects Versions: 2.15.0
Reporter: Brian Hulette
Assignee: Brian Hulette






--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=304688=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304688
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:32
Start Date: 31/Aug/19 00:32
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #9433: [BEAM-7389] Update to 
use util.ToString transform
URL: https://github.com/apache/beam/pull/9433#issuecomment-526784639
 
 
   @davidcavazos for future PRs, could you keep adding new commits instead of 
squashing and force pushing. Reviewing deltas are much easier that way. Before 
merging I/you one of us can do the squashing.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304688)
Time Spent: 54h 20m  (was: 54h 10m)

> Colab examples for element-wise transforms (Python)
> ---
>
> Key: BEAM-7389
> URL: https://issues.apache.org/jira/browse/BEAM-7389
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: David Cavazos
>Priority: Minor
>  Time Spent: 54h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=304687=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304687
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:32
Start Date: 31/Aug/19 00:32
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #9433: [BEAM-7389] 
Update to use util.ToString transform
URL: https://github.com/apache/beam/pull/9433
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304687)
Time Spent: 54h 10m  (was: 54h)

> Colab examples for element-wise transforms (Python)
> ---
>
> Key: BEAM-7389
> URL: https://issues.apache.org/jira/browse/BEAM-7389
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: David Cavazos
>Priority: Minor
>  Time Spent: 54h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=304686=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304686
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:31
Start Date: 31/Aug/19 00:31
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #9257: [BEAM-7389] Add DoFn 
methods sample
URL: https://github.com/apache/beam/pull/9257#issuecomment-526784513
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304686)
Time Spent: 54h  (was: 53h 50m)

> Colab examples for element-wise transforms (Python)
> ---
>
> Key: BEAM-7389
> URL: https://issues.apache.org/jira/browse/BEAM-7389
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: David Cavazos
>Priority: Minor
>  Time Spent: 54h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=304685=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304685
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:31
Start Date: 31/Aug/19 00:31
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #9262: [BEAM-7389] Add code 
examples for Regex page
URL: https://github.com/apache/beam/pull/9262#issuecomment-526784475
 
 
   I think it is confusing. I will rather fix that and assume that people 
understand basics of regex patterns.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304685)
Time Spent: 53h 50m  (was: 53h 40m)

> Colab examples for element-wise transforms (Python)
> ---
>
> Key: BEAM-7389
> URL: https://issues.apache.org/jira/browse/BEAM-7389
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: David Cavazos
>Priority: Minor
>  Time Spent: 53h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-6858) Support side inputs injected into a DoFn

2019-08-30 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919976#comment-16919976
 ] 

Ahmet Altay commented on BEAM-6858:
---

PR #9275 is resulting in test issues 
(https://issues.apache.org/jira/browse/BEAM-8102). Could you look at that 
please?

> Support side inputs injected into a DoFn
> 
>
> Key: BEAM-6858
> URL: https://issues.apache.org/jira/browse/BEAM-6858
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> Beam currently supports injecting main inputs into a DoFn process method. A 
> user can write the following:
> @ProcessElement public void process(@Element InputT element)
> And Beam will (using ByteBuddy code generation) inject the input element into 
> the process method.
> We would like to also support the same for side inputs. For example:
> @ProcessElement public void process(@Element InputT element, 
> @SideInput("tag1") String input1, @SideInput("tag2") Integer input2) 
> This requires the existing process-method analysis framework to capture these 
> side inputs. The ParDo code would have to verify the type of the side input 
> and include them in the list of side inputs. This would also eliminate the 
> need for the user to explicitly call withSideInputs on the ParDo.
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7060) Design Py3-compatible typehints annotation support in Beam 3.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7060?focusedWorklogId=304683=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304683
 ]

ASF GitHub Bot logged work on BEAM-7060:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:21
Start Date: 31/Aug/19 00:21
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9283: [BEAM-7060] Type 
hints from Python 3 annotations
URL: https://github.com/apache/beam/pull/9283#discussion_r319599614
 
 

 ##
 File path: sdks/python/apache_beam/transforms/ptransform.py
 ##
 @@ -821,7 +823,9 @@ def expand(self, pcoll):
 
 # TODO(BEAM-5878) Support keyword-only arguments.
 try:
-  if 'type_hints' in getfullargspec(self._fn).args:
+  # TODO(udim): This looks like unused code. When is 'type_hints' used as 
an
 
 Review comment:
   I added a few tests in `TestPTransformFn`. The interface relies on the 
internal data structure `IOTypeHints`, which is not ideal. What problem is the 
'type_hints' argument intended to solve? 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304683)
Time Spent: 13h 10m  (was: 13h)

> Design Py3-compatible typehints annotation support in Beam 3.
> -
>
> Key: BEAM-7060
> URL: https://issues.apache.org/jira/browse/BEAM-7060
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> Existing [Typehints implementaiton in 
> Beam|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/
> ] heavily relies on internal details of CPython implementation, and some of 
> the assumptions of this implementation broke as of Python 3.6, see for 
> example: https://issues.apache.org/jira/browse/BEAM-6877, which makes  
> typehints support unusable on Python 3.6 as of now. [Python 3 Kanban 
> Board|https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245=detail]
>  lists several specific typehints-related breakages, prefixed with "TypeHints 
> Py3 Error".
> We need to decide whether to:
> - Deprecate in-house typehints implementation.
> - Continue to support in-house implementation, which at this point is a stale 
> code and has other known issues.
> - Attempt to use some off-the-shelf libraries for supporting 
> type-annotations, like  Pytype, Mypy, PyAnnotate.
> WRT to this decision we also need to plan on immediate next steps to unblock 
> adoption of Beam for  Python 3.6+ users. One potential option may be to have 
> Beam SDK ignore any typehint annotations on Py 3.6+.
> cc: [~udim], [~altay], [~robertwb].



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7060) Design Py3-compatible typehints annotation support in Beam 3.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7060?focusedWorklogId=304682=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304682
 ]

ASF GitHub Bot logged work on BEAM-7060:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:21
Start Date: 31/Aug/19 00:21
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9283: [BEAM-7060] Type 
hints from Python 3 annotations
URL: https://github.com/apache/beam/pull/9283#discussion_r319695820
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -265,23 +368,34 @@ def _unpack_positional_arg_hints(arg, hint):
   return hint
 
 
-def getcallargs_forhints(func, *typeargs, **typekwargs):
-  """Like inspect.getcallargs, but understands that Tuple[] and an Any unpack.
+def getcallargs_forhints(using_var_hints, func, *typeargs, **typekwargs):
+  """Like inspect.getcallargs, with support for declaring default args as Any.
+
+  In Python 2, understands that Tuple[] and an Any unpack.
+
+  Args:
+using_var_hints: For variable length arguments, whether to expect the bound
 
 Review comment:
   It's only possible to tell which hints apply to a variadic argument by 
calling getacallargs or signature.bind, so I don't see how to preprocess the 
hint (unless we create special types, which complicates usage and consistency 
checks).
   
   Post-processing is similarly inelegant, since we would have to export 
parameter types (vararg, varkw, etc.) and logic that doesn't belong outside of 
the typehints module.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304682)
Time Spent: 13h  (was: 12h 50m)

> Design Py3-compatible typehints annotation support in Beam 3.
> -
>
> Key: BEAM-7060
> URL: https://issues.apache.org/jira/browse/BEAM-7060
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 13h
>  Remaining Estimate: 0h
>
> Existing [Typehints implementaiton in 
> Beam|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/
> ] heavily relies on internal details of CPython implementation, and some of 
> the assumptions of this implementation broke as of Python 3.6, see for 
> example: https://issues.apache.org/jira/browse/BEAM-6877, which makes  
> typehints support unusable on Python 3.6 as of now. [Python 3 Kanban 
> Board|https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245=detail]
>  lists several specific typehints-related breakages, prefixed with "TypeHints 
> Py3 Error".
> We need to decide whether to:
> - Deprecate in-house typehints implementation.
> - Continue to support in-house implementation, which at this point is a stale 
> code and has other known issues.
> - Attempt to use some off-the-shelf libraries for supporting 
> type-annotations, like  Pytype, Mypy, PyAnnotate.
> WRT to this decision we also need to plan on immediate next steps to unblock 
> adoption of Beam for  Python 3.6+ users. One potential option may be to have 
> Beam SDK ignore any typehint annotations on Py 3.6+.
> cc: [~udim], [~altay], [~robertwb].



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7060) Design Py3-compatible typehints annotation support in Beam 3.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7060?focusedWorklogId=304681=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304681
 ]

ASF GitHub Bot logged work on BEAM-7060:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:21
Start Date: 31/Aug/19 00:21
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9283: [BEAM-7060] Type 
hints from Python 3 annotations
URL: https://github.com/apache/beam/pull/9283#discussion_r319613537
 
 

 ##
 File path: sdks/python/apache_beam/typehints/typehints_test.py
 ##
 @@ -1062,57 +797,92 @@ def test_hint_helper(self):
 self.assertTrue(is_consistent_with(str, Union[str, int]))
 self.assertFalse(is_consistent_with(Union[str, int], str))
 
-  def test_positional_arg_hints(self):
-self.assertEqual(typehints.Any, _positional_arg_hints('x', {}))
-self.assertEqual(int, _positional_arg_hints('x', {'x': int}))
-self.assertEqual(typehints.Tuple[int, typehints.Any],
- _positional_arg_hints(['x', 'y'], {'x': int}))
-
   def test_getcallargs_forhints(self):
 def func(a, b_c, *d):
   b, c = b_c # pylint: disable=unused-variable
   return None
 self.assertEqual(
 {'a': Any, 'b_c': Any, 'd': Tuple[Any, ...]},
-getcallargs_forhints(func, *[Any, Any]))
+getcallargs_forhints(False, func, *[Any, Any]))
+if sys.version_info >= (3,):
+  self.assertEqual(
+  {'a': Any, 'b_c': Any, 'd': Tuple[Union[int, str], ...]},
+  getcallargs_forhints(False, func, *[Any, Any, str, int]))
+else:
+  self.assertEqual(
+  {'a': Any, 'b_c': Any, 'd': Tuple[Any, ...]},
+  getcallargs_forhints(False, func, *[Any, Any, Any, int]))
 
 Review comment:
   Done and done (BEAM-8122 opened).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304681)
Time Spent: 13h  (was: 12h 50m)

> Design Py3-compatible typehints annotation support in Beam 3.
> -
>
> Key: BEAM-7060
> URL: https://issues.apache.org/jira/browse/BEAM-7060
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 13h
>  Remaining Estimate: 0h
>
> Existing [Typehints implementaiton in 
> Beam|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/
> ] heavily relies on internal details of CPython implementation, and some of 
> the assumptions of this implementation broke as of Python 3.6, see for 
> example: https://issues.apache.org/jira/browse/BEAM-6877, which makes  
> typehints support unusable on Python 3.6 as of now. [Python 3 Kanban 
> Board|https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245=detail]
>  lists several specific typehints-related breakages, prefixed with "TypeHints 
> Py3 Error".
> We need to decide whether to:
> - Deprecate in-house typehints implementation.
> - Continue to support in-house implementation, which at this point is a stale 
> code and has other known issues.
> - Attempt to use some off-the-shelf libraries for supporting 
> type-annotations, like  Pytype, Mypy, PyAnnotate.
> WRT to this decision we also need to plan on immediate next steps to unblock 
> adoption of Beam for  Python 3.6+ users. One potential option may be to have 
> Beam SDK ignore any typehint annotations on Py 3.6+.
> cc: [~udim], [~altay], [~robertwb].



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7060) Design Py3-compatible typehints annotation support in Beam 3.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7060?focusedWorklogId=304684=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304684
 ]

ASF GitHub Bot logged work on BEAM-7060:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:21
Start Date: 31/Aug/19 00:21
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9283: [BEAM-7060] Type 
hints from Python 3 annotations
URL: https://github.com/apache/beam/pull/9283#discussion_r319600523
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -171,22 +269,55 @@ def simple_output_type(self, context):
 if self.output_types:
   args, kwargs = self.output_types
   if len(args) != 1 or kwargs:
-raise TypeError('Expected simple output type hint for %s' % context)
+raise TypeError(
+'Expected single output type hint for %s but got: %s' % (
+context, self.output_types))
   return args[0]
 
+  def has_simple_output_type(self):
+"""Whether there's a single positional output type."""
+return (self.output_types and len(self.output_types[0]) == 1 and
+not self.output_types[1])
+
+  def strip_iterable(self):
+"""Removes outer Iterable (or equivalent) from output type.
+
+Only affects instances with simple output types, otherwise is a no-op.
+
+Example: Generator[Tuple(int, int)] becomes Tuple(int, int)
+
+Raises:
+  ValueError if output type is simple and not iterable.
+"""
+if not self.has_simple_output_type():
+  return
+yielded_type = typehints.get_yielded_type(self.output_types[0][0])
+self.output_types = ((yielded_type,), {})
+
   def copy(self):
 return IOTypeHints(self.input_types, self.output_types)
 
   def with_defaults(self, hints):
 if not hints:
   return self
-elif not self:
-  return hints
-return IOTypeHints(self.input_types or hints.input_types,
-   self.output_types or hints.output_types)
+if self._has_input_types():
+  input_types = self.input_types
+else:
+  input_types = hints.input_types
+if self._has_output_types():
+  output_types = self.output_types
+else:
+  output_types = hints.output_types
+return IOTypeHints(input_types, output_types)
+
+  def _has_input_types(self):
+return self.input_types is not None and any(self.input_types)
 
 Review comment:
   self.input_types can be `None` or `((), {})`, and both of these values mean 
"no input hints."
   ```py
   >>> a = ((), {})
   >>> bool(a)
   True
   >>> any(a)
   False
   ```
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304684)
Time Spent: 13h 10m  (was: 13h)

> Design Py3-compatible typehints annotation support in Beam 3.
> -
>
> Key: BEAM-7060
> URL: https://issues.apache.org/jira/browse/BEAM-7060
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> Existing [Typehints implementaiton in 
> Beam|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/
> ] heavily relies on internal details of CPython implementation, and some of 
> the assumptions of this implementation broke as of Python 3.6, see for 
> example: https://issues.apache.org/jira/browse/BEAM-6877, which makes  
> typehints support unusable on Python 3.6 as of now. [Python 3 Kanban 
> Board|https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245=detail]
>  lists several specific typehints-related breakages, prefixed with "TypeHints 
> Py3 Error".
> We need to decide whether to:
> - Deprecate in-house typehints implementation.
> - Continue to support in-house implementation, which at this point is a stale 
> code and has other known issues.
> - Attempt to use some off-the-shelf libraries for supporting 
> type-annotations, like  Pytype, Mypy, PyAnnotate.
> WRT to this decision we also need to plan on immediate next steps to unblock 
> adoption of Beam for  Python 3.6+ users. One potential option may be to have 
> Beam SDK ignore any typehint annotations on Py 3.6+.
> cc: [~udim], [~altay], [~robertwb].



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7060) Design Py3-compatible typehints annotation support in Beam 3.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7060?focusedWorklogId=304679=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304679
 ]

ASF GitHub Bot logged work on BEAM-7060:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:21
Start Date: 31/Aug/19 00:21
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #9283: [BEAM-7060] Type hints 
from Python 3 annotations
URL: https://github.com/apache/beam/pull/9283#issuecomment-526783557
 
 
   run python 2 postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304679)
Time Spent: 12h 50m  (was: 12h 40m)

> Design Py3-compatible typehints annotation support in Beam 3.
> -
>
> Key: BEAM-7060
> URL: https://issues.apache.org/jira/browse/BEAM-7060
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 12h 50m
>  Remaining Estimate: 0h
>
> Existing [Typehints implementaiton in 
> Beam|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/
> ] heavily relies on internal details of CPython implementation, and some of 
> the assumptions of this implementation broke as of Python 3.6, see for 
> example: https://issues.apache.org/jira/browse/BEAM-6877, which makes  
> typehints support unusable on Python 3.6 as of now. [Python 3 Kanban 
> Board|https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245=detail]
>  lists several specific typehints-related breakages, prefixed with "TypeHints 
> Py3 Error".
> We need to decide whether to:
> - Deprecate in-house typehints implementation.
> - Continue to support in-house implementation, which at this point is a stale 
> code and has other known issues.
> - Attempt to use some off-the-shelf libraries for supporting 
> type-annotations, like  Pytype, Mypy, PyAnnotate.
> WRT to this decision we also need to plan on immediate next steps to unblock 
> adoption of Beam for  Python 3.6+ users. One potential option may be to have 
> Beam SDK ignore any typehint annotations on Py 3.6+.
> cc: [~udim], [~altay], [~robertwb].



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7060) Design Py3-compatible typehints annotation support in Beam 3.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7060?focusedWorklogId=304680=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304680
 ]

ASF GitHub Bot logged work on BEAM-7060:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:21
Start Date: 31/Aug/19 00:21
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9283: [BEAM-7060] Type 
hints from Python 3 annotations
URL: https://github.com/apache/beam/pull/9283#discussion_r319691656
 
 

 ##
 File path: sdks/python/apache_beam/typehints/typed_pipeline_test.py
 ##
 @@ -290,5 +306,26 @@ def test_flat_type_hint(self):
   self.test_input | self.CustomTransform().with_output_types(int)
 
 
+class AnnotationsTest(unittest.TestCase):
+
+  def test_pardo_wrapper_builtin(self):
+th = beam.ParDo(str.strip).get_type_hints()
+if sys.version_info < (3, 7):
+  self.assertEqual(th.input_types, ((str,), {}))
+else:
+  # Python 3.7+ has annotations for CPython builtins
+  # (_MethodDescriptorType).
+  self.assertEqual(th.input_types, ((str, typehints.Any), {}))
+self.assertEqual(th.output_types, ((typehints.Any,), {}))
+
+th = beam.ParDo(list).get_type_hints()
+self.assertIsNone(th.input_types)
+self.assertIsNone(th.output_types)
 
 Review comment:
   Added inference of return types for callables that are instances of `type`.
   Removed below test case for `len`, since it doesn't return a valid iterable 
(couldn't find another 'builtin_function_or_method' to replace it with, see 
typed_pipeline_test_py3.py for cases where we raise a ValueError if return type 
annotation is not iterable).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304680)
Time Spent: 13h  (was: 12h 50m)

> Design Py3-compatible typehints annotation support in Beam 3.
> -
>
> Key: BEAM-7060
> URL: https://issues.apache.org/jira/browse/BEAM-7060
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 13h
>  Remaining Estimate: 0h
>
> Existing [Typehints implementaiton in 
> Beam|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/
> ] heavily relies on internal details of CPython implementation, and some of 
> the assumptions of this implementation broke as of Python 3.6, see for 
> example: https://issues.apache.org/jira/browse/BEAM-6877, which makes  
> typehints support unusable on Python 3.6 as of now. [Python 3 Kanban 
> Board|https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245=detail]
>  lists several specific typehints-related breakages, prefixed with "TypeHints 
> Py3 Error".
> We need to decide whether to:
> - Deprecate in-house typehints implementation.
> - Continue to support in-house implementation, which at this point is a stale 
> code and has other known issues.
> - Attempt to use some off-the-shelf libraries for supporting 
> type-annotations, like  Pytype, Mypy, PyAnnotate.
> WRT to this decision we also need to plan on immediate next steps to unblock 
> adoption of Beam for  Python 3.6+ users. One potential option may be to have 
> Beam SDK ignore any typehint annotations on Py 3.6+.
> cc: [~udim], [~altay], [~robertwb].



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7060) Design Py3-compatible typehints annotation support in Beam 3.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7060?focusedWorklogId=304678=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304678
 ]

ASF GitHub Bot logged work on BEAM-7060:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:21
Start Date: 31/Aug/19 00:21
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #9283: [BEAM-7060] Type hints 
from Python 3 annotations
URL: https://github.com/apache/beam/pull/9283#issuecomment-526783520
 
 
   run python 3.7 postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304678)
Time Spent: 12h 40m  (was: 12.5h)

> Design Py3-compatible typehints annotation support in Beam 3.
> -
>
> Key: BEAM-7060
> URL: https://issues.apache.org/jira/browse/BEAM-7060
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 12h 40m
>  Remaining Estimate: 0h
>
> Existing [Typehints implementaiton in 
> Beam|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/
> ] heavily relies on internal details of CPython implementation, and some of 
> the assumptions of this implementation broke as of Python 3.6, see for 
> example: https://issues.apache.org/jira/browse/BEAM-6877, which makes  
> typehints support unusable on Python 3.6 as of now. [Python 3 Kanban 
> Board|https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245=detail]
>  lists several specific typehints-related breakages, prefixed with "TypeHints 
> Py3 Error".
> We need to decide whether to:
> - Deprecate in-house typehints implementation.
> - Continue to support in-house implementation, which at this point is a stale 
> code and has other known issues.
> - Attempt to use some off-the-shelf libraries for supporting 
> type-annotations, like  Pytype, Mypy, PyAnnotate.
> WRT to this decision we also need to plan on immediate next steps to unblock 
> adoption of Beam for  Python 3.6+ users. One potential option may be to have 
> Beam SDK ignore any typehint annotations on Py 3.6+.
> cc: [~udim], [~altay], [~robertwb].



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-7993) portable python precommit is flaky

2019-08-30 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919973#comment-16919973
 ] 

Valentyn Tymofieiev commented on BEAM-7993:
---

If the error is reoccurring, we could try to SSH into the VM, SSH into the 
failed container and inspect contents of SDK tarball, or installed version of 
Beam SDK in the container. If SDK tarball differs in passing (say, python2) and 
failing (python3) container for a postcommit run against the same PR, it may 
mean that we build SDK tarball twice, and as we know, it does not work well if 
we build it in parallel. Perhaps we can find out other clues. 

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Affects Versions: 2.15.0
>Reporter: Udi Meiri
>Assignee: Mark Liu
>Priority: Major
>  Labels: currently-failing
> Fix For: 2.16.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:129)
> 11:51:22   

[jira] [Work logged] (BEAM-7060) Design Py3-compatible typehints annotation support in Beam 3.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7060?focusedWorklogId=304676=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304676
 ]

ASF GitHub Bot logged work on BEAM-7060:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:07
Start Date: 31/Aug/19 00:07
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #9283: [BEAM-7060] Type hints 
from Python 3 annotations
URL: https://github.com/apache/beam/pull/9283#issuecomment-526782079
 
 
   run python 2.7 postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304676)
Time Spent: 12.5h  (was: 12h 20m)

> Design Py3-compatible typehints annotation support in Beam 3.
> -
>
> Key: BEAM-7060
> URL: https://issues.apache.org/jira/browse/BEAM-7060
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Udi Meiri
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> Existing [Typehints implementaiton in 
> Beam|[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/
> ] heavily relies on internal details of CPython implementation, and some of 
> the assumptions of this implementation broke as of Python 3.6, see for 
> example: https://issues.apache.org/jira/browse/BEAM-6877, which makes  
> typehints support unusable on Python 3.6 as of now. [Python 3 Kanban 
> Board|https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=245=detail]
>  lists several specific typehints-related breakages, prefixed with "TypeHints 
> Py3 Error".
> We need to decide whether to:
> - Deprecate in-house typehints implementation.
> - Continue to support in-house implementation, which at this point is a stale 
> code and has other known issues.
> - Attempt to use some off-the-shelf libraries for supporting 
> type-annotations, like  Pytype, Mypy, PyAnnotate.
> WRT to this decision we also need to plan on immediate next steps to unblock 
> adoption of Beam for  Python 3.6+ users. One potential option may be to have 
> Beam SDK ignore any typehint annotations on Py 3.6+.
> cc: [~udim], [~altay], [~robertwb].



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7972) Portable Python Reshuffle does not work with windowed pcollection

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7972?focusedWorklogId=304674=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304674
 ]

ASF GitHub Bot logged work on BEAM-7972:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:05
Start Date: 31/Aug/19 00:05
Worklog Time Spent: 10m 
  Work Description: angoenka commented on pull request #9334: [BEAM-7972] 
Always use Global window in reshuffle and then apply wind…
URL: https://github.com/apache/beam/pull/9334
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304674)
Time Spent: 4h 10m  (was: 4h)

> Portable Python Reshuffle does not work with windowed pcollection
> -
>
> Key: BEAM-7972
> URL: https://issues.apache.org/jira/browse/BEAM-7972
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Streaming pipeline gets stuck when using Reshuffle with windowed pcollection.
> The issue happen because of window function gets deserialized on java side 
> which is not possible and hence default to global window function and result 
> into window function mismatch later down the code.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7972) Portable Python Reshuffle does not work with windowed pcollection

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7972?focusedWorklogId=304675=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304675
 ]

ASF GitHub Bot logged work on BEAM-7972:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:05
Start Date: 31/Aug/19 00:05
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #9334: [BEAM-7972] Always 
use Global window in reshuffle and then apply wind…
URL: https://github.com/apache/beam/pull/9334#issuecomment-526781791
 
 
   Thanks!
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304675)
Time Spent: 4h 20m  (was: 4h 10m)

> Portable Python Reshuffle does not work with windowed pcollection
> -
>
> Key: BEAM-7972
> URL: https://issues.apache.org/jira/browse/BEAM-7972
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Streaming pipeline gets stuck when using Reshuffle with windowed pcollection.
> The issue happen because of window function gets deserialized on java side 
> which is not possible and hence default to global window function and result 
> into window function mismatch later down the code.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7993) portable python precommit is flaky

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7993?focusedWorklogId=304673=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304673
 ]

ASF GitHub Bot logged work on BEAM-7993:


Author: ASF GitHub Bot
Created on: 31/Aug/19 00:04
Start Date: 31/Aug/19 00:04
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #9460: [BEAM-7993] Run 
Portable PreCommit tests sequentially
URL: https://github.com/apache/beam/pull/9460#issuecomment-526781712
 
 
   Run Portable_Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304673)
Time Spent: 1.5h  (was: 1h 20m)

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Affects Versions: 2.15.0
>Reporter: Udi Meiri
>Assignee: Mark Liu
>Priority: Major
>  Labels: currently-failing
> Fix For: 2.16.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> 

[jira] [Work logged] (BEAM-8116) Beam Release Retrospective Improvements

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8116?focusedWorklogId=304669=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304669
 ]

ASF GitHub Bot logged work on BEAM-8116:


Author: ASF GitHub Bot
Created on: 30/Aug/19 23:32
Start Date: 30/Aug/19 23:32
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on pull request #9457: 
[BEAM-8116] Update website instructions in the build_rc script.
URL: https://github.com/apache/beam/pull/9457#discussion_r319699103
 
 

 ##
 File path: release/src/main/scripts/build_release_candidate.sh
 ##
 @@ -280,15 +280,12 @@ echo "2. Source distribution deployed to 
https://dist.apache.org/repos/dist/dev/
 echo "3. Website pull request published the Java API reference manual the 
Python API reference manual."
 
 echo "==Things Needed To Be Done Manually=="
-echo "1.You need to update website updates PR with a new commit: "
+echo "1.Make sure a pull request is created to update the javadoc and pydoc to 
the beam-site: "
 echo "  - cd 
~/${LOCAL_WEBSITE_UPDATE_DIR}/${LOCAL_WEBSITE_REPO}/${WEBSITE_ROOT_DIR}"
 echo "  - git checkout updates_release_${RELEASE}"
-echo "  - Add new release into src/get-started/downloads.md "
+echo "  - Check if both javadoc/ and pydoc/ exist."
 echo "  - commit your changes"
-echo "2.You need to update website updates PR with another commit: 
src/get-started/downloads.md"
-echo "  - add new release download links like commit: "
-echo "
https://github.com/apache/beam-site/commit/29394625ce54f0c5584c3db730d3eb6bf365a80c#diff-abdcc989e94369c2324cf64b66659eda;
-echo "  - update last release download links from release to archive like 
commit: "
-echo "
https://github.com/apache/beam-site/commit/6b9bdb31324d5c0250a79224507da0ea7ae8ccbf#diff-abdcc989e94369c2324cf64b66659eda;
+echo "2.Create a pull request to update the release in the beam/website:"
+echo "  - An example pull request:https://github.com/apache/beam/pull/9341;
 
 Review comment:
   The `Release Note` url may not very obvious to get. We have an instruction 
[here](https://beam.apache.org/contribute/release-guide/#review-release-notes-in-jira).
 Can you add this link somewhere in this script in case people don't know how 
to find that?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304669)
Time Spent: 40m  (was: 0.5h)

> Beam Release Retrospective Improvements
> ---
>
> Key: BEAM-8116
> URL: https://issues.apache.org/jira/browse/BEAM-8116
> Project: Beam
>  Issue Type: Task
>  Components: project-management, testing, website
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> We summarized some pain points during the 2.15 release. Log bugs under this 
> task to track the progress.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8116) Beam Release Retrospective Improvements

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8116?focusedWorklogId=304668=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304668
 ]

ASF GitHub Bot logged work on BEAM-8116:


Author: ASF GitHub Bot
Created on: 30/Aug/19 23:31
Start Date: 30/Aug/19 23:31
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on pull request #9457: 
[BEAM-8116] Update website instructions in the build_rc script.
URL: https://github.com/apache/beam/pull/9457#discussion_r319699103
 
 

 ##
 File path: release/src/main/scripts/build_release_candidate.sh
 ##
 @@ -280,15 +280,12 @@ echo "2. Source distribution deployed to 
https://dist.apache.org/repos/dist/dev/
 echo "3. Website pull request published the Java API reference manual the 
Python API reference manual."
 
 echo "==Things Needed To Be Done Manually=="
-echo "1.You need to update website updates PR with a new commit: "
+echo "1.Make sure a pull request is created to update the javadoc and pydoc to 
the beam-site: "
 echo "  - cd 
~/${LOCAL_WEBSITE_UPDATE_DIR}/${LOCAL_WEBSITE_REPO}/${WEBSITE_ROOT_DIR}"
 echo "  - git checkout updates_release_${RELEASE}"
-echo "  - Add new release into src/get-started/downloads.md "
+echo "  - Check if both javadoc/ and pydoc/ exist."
 echo "  - commit your changes"
-echo "2.You need to update website updates PR with another commit: 
src/get-started/downloads.md"
-echo "  - add new release download links like commit: "
-echo "
https://github.com/apache/beam-site/commit/29394625ce54f0c5584c3db730d3eb6bf365a80c#diff-abdcc989e94369c2324cf64b66659eda;
-echo "  - update last release download links from release to archive like 
commit: "
-echo "
https://github.com/apache/beam-site/commit/6b9bdb31324d5c0250a79224507da0ea7ae8ccbf#diff-abdcc989e94369c2324cf64b66659eda;
+echo "2.Create a pull request to update the release in the beam/website:"
+echo "  - An example pull request:https://github.com/apache/beam/pull/9341;
 
 Review comment:
   The `Release Note` url may not very obvious to get. We have an instruction 
[here](https://beam.apache.org/contribute/release-guide/#review-release-notes-in-jira).
 Can you add this guide here in case people don't know how to find that?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304668)
Time Spent: 0.5h  (was: 20m)

> Beam Release Retrospective Improvements
> ---
>
> Key: BEAM-8116
> URL: https://issues.apache.org/jira/browse/BEAM-8116
> Project: Beam
>  Issue Type: Task
>  Components: project-management, testing, website
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We summarized some pain points during the 2.15 release. Log bugs under this 
> task to track the progress.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7993) portable python precommit is flaky

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7993?focusedWorklogId=304666=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304666
 ]

ASF GitHub Bot logged work on BEAM-7993:


Author: ASF GitHub Bot
Created on: 30/Aug/19 23:26
Start Date: 30/Aug/19 23:26
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #9460: [BEAM-7993] Run 
Portable PreCommit tests sequentially
URL: https://github.com/apache/beam/pull/9460#issuecomment-526777132
 
 
   Run Portable_Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304666)
Time Spent: 1h 20m  (was: 1h 10m)

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Affects Versions: 2.15.0
>Reporter: Udi Meiri
>Assignee: Mark Liu
>Priority: Major
>  Labels: currently-failing
> Fix For: 2.16.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> 

[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=304664=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304664
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 30/Aug/19 23:15
Start Date: 30/Aug/19 23:15
Worklog Time Spent: 10m 
  Work Description: davidcavazos commented on pull request #9433: 
[BEAM-7389] Update to use util.ToString transform
URL: https://github.com/apache/beam/pull/9433#discussion_r319697186
 
 

 ##
 File path: 
sdks/python/apache_beam/examples/snippets/transforms/element_wise/to_string_test.py
 ##
 @@ -19,36 +19,81 @@
 from __future__ import absolute_import
 from __future__ import print_function
 
+import sys
 import unittest
 
 import mock
 
-from apache_beam.examples.snippets.transforms.element_wise.to_string import *
 from apache_beam.testing.test_pipeline import TestPipeline
 from apache_beam.testing.util import assert_that
 from apache_beam.testing.util import equal_to
 
+from . import to_string
+
+
+def check_plants(actual):
+  # [START plants]
+  plants = [
+  ',Strawberry',
+  '凌,Carrot',
+  ',Eggplant',
+  ',Tomato',
+  '凜,Potato',
+  ]
+  # [END plants]
+  assert_that(actual, equal_to(plants))
+
+
+def check_plant_lists(actual):
+  # [START plant_lists]
+  plant_lists = [
+  "['', 'Strawberry', 'perennial']",
+  "['凌', 'Carrot', 'biennial']",
+  "['', 'Eggplant', 'perennial']",
+  "['', 'Tomato', 'annual']",
+  "['凜', 'Potato', 'perennial']",
+  ]
+  # [END plant_lists]
+
+  # Some unicode characters become escaped with double backslashes.
+  import apache_beam as beam
+
+  def normalize_escaping(elem):
 
 Review comment:
   Sure, I just added it alongside another comment.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304664)
Time Spent: 53h 40m  (was: 53.5h)

> Colab examples for element-wise transforms (Python)
> ---
>
> Key: BEAM-7389
> URL: https://issues.apache.org/jira/browse/BEAM-7389
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: David Cavazos
>Priority: Minor
>  Time Spent: 53h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (BEAM-8124) Clean tests after Python 2 deprecation

2019-08-30 Thread David Cavazos (Jira)
David Cavazos created BEAM-8124:
---

 Summary: Clean tests after Python 2 deprecation
 Key: BEAM-8124
 URL: https://issues.apache.org/jira/browse/BEAM-8124
 Project: Beam
  Issue Type: Task
  Components: sdk-py-core
Reporter: David Cavazos


Clean up or simplify tests that have special handling for Python 2 after its 
deprecation.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7972) Portable Python Reshuffle does not work with windowed pcollection

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7972?focusedWorklogId=304661=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304661
 ]

ASF GitHub Bot logged work on BEAM-7972:


Author: ASF GitHub Bot
Created on: 30/Aug/19 22:58
Start Date: 30/Aug/19 22:58
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #9334: [BEAM-7972] Always 
use Global window in reshuffle and then apply wind…
URL: https://github.com/apache/beam/pull/9334#issuecomment-526773193
 
 
   LGTM
   
   On Fri, Aug 30, 2019, 3:37 PM Ankur  wrote:
   
   > *@angoenka* commented on this pull request.
   > --
   >
   > In sdks/python/apache_beam/transforms/util.py
   > :
   >
   > >  for value in values]
   >
   >  ungrouped = pcoll | Map(reify_timestamps)
   > +
   > +# TODO(BEAM-8104) Using global window as one of the standard window.
   > +# This is to mitigate the Java Runner Harness limitation to
   >
   > done
   >
   > —
   > You are receiving this because you were mentioned.
   > Reply to this email directly, view it on GitHub
   > 
,
   > or mute the thread
   > 

   > .
   >
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304661)
Time Spent: 4h  (was: 3h 50m)

> Portable Python Reshuffle does not work with windowed pcollection
> -
>
> Key: BEAM-7972
> URL: https://issues.apache.org/jira/browse/BEAM-7972
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Streaming pipeline gets stuck when using Reshuffle with windowed pcollection.
> The issue happen because of window function gets deserialized on java side 
> which is not possible and hence default to global window function and result 
> into window function mismatch later down the code.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-6158) Using --save_main_session fails on Python 3 when main module has invocations of superclass method using 'super' .

2019-08-30 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919945#comment-16919945
 ] 

Valentyn Tymofieiev commented on BEAM-6158:
---

The error is happens when main pipeline module has class methods that refer to 
superclass methods using super(). A reference to super in the method code 
creates a cyclical reference inside the object, which dill  currently handles 
via pickling objects by reference. Such approach does not work for restoring a 
pickled  a main session, since object classes need to be defined at the moment 
of unpickling . This issue will be addressed after  
https://github.com/uqfoundation/dill/issues/300. is fixed or we start using 
CloudPickle as a pickler, which is investigated in BEAM-8123. 

In the meantime following workarounds are available:
- don't use super() in the main module.
- refer to superclass methods via SuperClassName.method(self, ...). This is NOT 
an equivalent replacement, but may work in simple class hierarchies. 

> Using --save_main_session fails on Python 3 when main module has invocations 
> of superclass method using 'super' .
> -
>
> Key: BEAM-6158
> URL: https://issues.apache.org/jira/browse/BEAM-6158
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-harness
>Reporter: Mark Liu
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> A typical manifestation of this failure, which can be observed on several 
> Beam examples:
> {noformat}
> Traceback (most recent call last):
>   File "/usr/lib/python3.5/runpy.py", line 193, in _run_module_as_main
> "__main__", mod_spec)
>   File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
> exec(code, run_globals)
>   File 
> "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py",
>  line 164, in 
> run()
>   File 
> "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py",
>  line 158, in run 
> | 'WriteUserScoreSums' >> beam.io.WriteToText(args.output))
>   File 
> "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/pipeline.py",
>  line 426, in __exit__
>  
> self.run().wait_until_finish()
>   File 
> "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/runners/dataflow/dataflow_runner.py",
>  line 1338, in wait_until_finish   
> (self.state, getattr(self._runner, 'last_error_msg', None)), self)
> apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: 
> Dataflow pipeline failed. State: FAILED, Error:   
>  
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 
> 773, in run
> self._load_main_session(self.local_staging_directory)
>   File 
> "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 
> 489, in _load_main_session
>
> pickler.load_session(session_file)
>   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", 
> line 280, in load_session 
>
> return dill.load_session(file_path)
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 410, in 
> load_session
> module = unpickler.load()
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 474, in 
> find_class
> return StockUnpickler.find_class(self, module, name)
> AttributeError: Can't get attribute 'ParseGameEventFn' on  'dataflow_worker.start' from 
> '/usr/local/lib/python3.5/site-packages/dataflow_worker/start.py'> {noformat}
>  
> Note that the example has the following code [1]:
> {code:python}
> class ParseGameEventFn(beam.DoFn):
>   def __init__(self):
>     super(ParseGameEventFn, self).__init__()
> {code}
> https://github.com/apache/beam/blob/0325c360bef17a6673e2d43051e59174b8e5ccc9/sdks/python/apache_beam/examples/complete/game/user_score.py#L81
> +cc: [~tvalentyn] [~robertwb] [~altay]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=304659=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304659
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 30/Aug/19 22:54
Start Date: 30/Aug/19 22:54
Worklog Time Spent: 10m 
  Work Description: davidcavazos commented on issue #9262: [BEAM-7389] Add 
code examples for Regex page
URL: https://github.com/apache/beam/pull/9262#issuecomment-526772472
 
 
   It's to show that `' *'` can match zero or *more* spaces. If you think it's 
confusing I can make them all the same.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304659)
Time Spent: 53.5h  (was: 53h 20m)

> Colab examples for element-wise transforms (Python)
> ---
>
> Key: BEAM-7389
> URL: https://issues.apache.org/jira/browse/BEAM-7389
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: David Cavazos
>Priority: Minor
>  Time Spent: 53.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=304657=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304657
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 30/Aug/19 22:48
Start Date: 30/Aug/19 22:48
Worklog Time Spent: 10m 
  Work Description: davidcavazos commented on pull request #9257: 
[BEAM-7389] Add DoFn methods sample
URL: https://github.com/apache/beam/pull/9257#discussion_r319693091
 
 

 ##
 File path: 
sdks/python/apache_beam/examples/snippets/transforms/element_wise/pardo_test.py
 ##
 @@ -58,24 +58,44 @@ def check_dofn_params(actual):
 type(window) -> 
 window.start -> Timestamp(1584675660) (2020-03-20 03:41:00)
 window.end -> Timestamp(1584675690) (2020-03-20 03:41:30)
-window.max_timestamp() -> Timestamp(1584675689.99) (2020-03-20 
03:41:29.99)'''
-  # [END dofn_params]
+window.max_timestamp() -> Timestamp(1584675689.99) (2020-03-20 
03:41:29.99)
+[END dofn_params]'''.splitlines()[1:-1])
   # pylint: enable=line-too-long
   assert_that(actual, equal_to([dofn_params]))
 
 
+def check_dofn_methods(actual):
+  results = '''[START results]
+__init__
+setup
+start_bundle
+* process: 
+* process: 凌
+* process: 
+* process: 
+* process: 凜
+* finish_bundle: 
+teardown
+[END results]'''.splitlines()[1:-1]
+  results = [line for line in results if line.startswith('*')]
+  assert_that(actual, equal_to(results))
 
 Review comment:
   I'm changing the test to actually check for both the order of the methods 
executed, as well as all the elements being present without ordering 
constraints.
   
   Regarding if `DoFn` methods work on all runners, looking at #7994 I could 
only find that [BEAM-7885 `DoFn.setup` doesn't run for streaming jobs in the 
DirectRunner](https://issues.apache.org/jira/browse/BEAM-7885), and [BEAM-7340 
`DoFn.teardown` metrics are 
lost](https://issues.apache.org/jira/browse/BEAM-7340). I'll add those notes 
into the docs as well (rather than comments in the code, for screen clutter :).
   
   I looked into Jira too, but that was all that I could find, are you aware of 
any other issues with other runners?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304657)
Time Spent: 53h 20m  (was: 53h 10m)

> Colab examples for element-wise transforms (Python)
> ---
>
> Key: BEAM-7389
> URL: https://issues.apache.org/jira/browse/BEAM-7389
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: David Cavazos
>Priority: Minor
>  Time Spent: 53h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=304655=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304655
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 30/Aug/19 22:43
Start Date: 30/Aug/19 22:43
Worklog Time Spent: 10m 
  Work Description: davidcavazos commented on pull request #9257: 
[BEAM-7389] Add DoFn methods sample
URL: https://github.com/apache/beam/pull/9257#discussion_r319693091
 
 

 ##
 File path: 
sdks/python/apache_beam/examples/snippets/transforms/element_wise/pardo_test.py
 ##
 @@ -58,24 +58,44 @@ def check_dofn_params(actual):
 type(window) -> 
 window.start -> Timestamp(1584675660) (2020-03-20 03:41:00)
 window.end -> Timestamp(1584675690) (2020-03-20 03:41:30)
-window.max_timestamp() -> Timestamp(1584675689.99) (2020-03-20 
03:41:29.99)'''
-  # [END dofn_params]
+window.max_timestamp() -> Timestamp(1584675689.99) (2020-03-20 
03:41:29.99)
+[END dofn_params]'''.splitlines()[1:-1])
   # pylint: enable=line-too-long
   assert_that(actual, equal_to([dofn_params]))
 
 
+def check_dofn_methods(actual):
+  results = '''[START results]
+__init__
+setup
+start_bundle
+* process: 
+* process: 凌
+* process: 
+* process: 
+* process: 凜
+* finish_bundle: 
+teardown
+[END results]'''.splitlines()[1:-1]
+  results = [line for line in results if line.startswith('*')]
+  assert_that(actual, equal_to(results))
 
 Review comment:
   I'm changing the test to actually check for both the order of the methods 
executed, as well as all the elements being present without ordering 
constraints.
   
   Regarding if `DoFn` methods work on all runners, looking at #7994 I could 
only find that [BEAM-7885 `DoFn.setup` doesn't run for streaming jobs in the 
DirectRunner](https://issues.apache.org/jira/browse/BEAM-7885), and [BEAM-7340 
`DoFn.teardown` metrics are 
lost](https://issues.apache.org/jira/browse/BEAM-7340). I'm adding those into 
comments.
   
   I looked into Jira too, but that was all that I could find, are you aware of 
any other issues with other runners?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304655)
Time Spent: 53h 10m  (was: 53h)

> Colab examples for element-wise transforms (Python)
> ---
>
> Key: BEAM-7389
> URL: https://issues.apache.org/jira/browse/BEAM-7389
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: David Cavazos
>Priority: Minor
>  Time Spent: 53h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=304654=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304654
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 30/Aug/19 22:39
Start Date: 30/Aug/19 22:39
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #9374: [BEAM-5428] 
Implement Runner support for cache tokens
URL: https://github.com/apache/beam/pull/9374#discussion_r319690110
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
 ##
 @@ -248,31 +250,37 @@ private StateRequestHandler 
getStateRequestHandler(ExecutableStage executableSta
 return StateRequestHandlers.delegateBasedUponType(handlerMap);
   }
 
-  private static class BagUserStateFactory
+  static class BagUserStateFactory
   implements StateRequestHandlers.BagUserStateHandlerFactory {
 
 private final StateInternals stateInternals;
 private final KeyedStateBackend keyedStateBackend;
 private final Lock stateBackendLock;
+/** Holds the current valid cache token for this operator. */
+private final ByteString cacheToken;
 
-private BagUserStateFactory(
+BagUserStateFactory(
 StateInternals stateInternals,
 KeyedStateBackend keyedStateBackend,
 Lock stateBackendLock) {
 
   this.stateInternals = stateInternals;
   this.keyedStateBackend = keyedStateBackend;
   this.stateBackendLock = stateBackendLock;
+  this.cacheToken = ByteString.copyFrom(UUID.randomUUID().toString(), 
Charsets.UTF_8);
 
 Review comment:
   nit: Please take in an `IdGenerator` as a parameter since it will allow one 
to explicitly test knowing what the cache token value(s) will be. Also consider 
using `IdGenerators#incrementingLongs`
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304654)
Time Spent: 12.5h  (was: 12h 20m)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Rakesh Kumar
>Priority: Major
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7972) Portable Python Reshuffle does not work with windowed pcollection

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7972?focusedWorklogId=304652=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304652
 ]

ASF GitHub Bot logged work on BEAM-7972:


Author: ASF GitHub Bot
Created on: 30/Aug/19 22:37
Start Date: 30/Aug/19 22:37
Worklog Time Spent: 10m 
  Work Description: angoenka commented on pull request #9334: [BEAM-7972] 
Always use Global window in reshuffle and then apply wind…
URL: https://github.com/apache/beam/pull/9334#discussion_r319692131
 
 

 ##
 File path: sdks/python/apache_beam/transforms/util.py
 ##
 @@ -612,29 +611,27 @@ def restore_timestamps(element):
 for (value, timestamp) in values]
 
 else:
-  # The linter is confused.
-  # hash(1) is used to force "runtime" selection of _IdentityWindowFn
-  # pylint: disable=abstract-class-instantiated
-  cls = hash(1) and _IdentityWindowFn
-  window_fn = cls(
-  windowing_saved.windowfn.get_window_coder())
-
-  def reify_timestamps(element, timestamp=DoFn.TimestampParam):
+  def reify_timestamps(element,
+   timestamp=DoFn.TimestampParam,
+   window=DoFn.WindowParam):
 key, value = element
-return key, TimestampedValue(value, timestamp)
+# Transport the window as part of the value and restore it later.
+return key, windowed_value.WindowedValue(value, timestamp, [window])
 
-  def restore_timestamps(element, window=DoFn.WindowParam):
-# Pass the current window since _IdentityWindowFn wouldn't know how
-# to generate it.
+  def restore_timestamps(element):
 key, values = element
 return [
 windowed_value.WindowedValue(
-(key, value.value), value.timestamp, [window])
+(key, value.value), value.timestamp, value.windows)
 
 Review comment:
   done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304652)
Time Spent: 3h 40m  (was: 3.5h)

> Portable Python Reshuffle does not work with windowed pcollection
> -
>
> Key: BEAM-7972
> URL: https://issues.apache.org/jira/browse/BEAM-7972
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Streaming pipeline gets stuck when using Reshuffle with windowed pcollection.
> The issue happen because of window function gets deserialized on java side 
> which is not possible and hence default to global window function and result 
> into window function mismatch later down the code.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7972) Portable Python Reshuffle does not work with windowed pcollection

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7972?focusedWorklogId=304653=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304653
 ]

ASF GitHub Bot logged work on BEAM-7972:


Author: ASF GitHub Bot
Created on: 30/Aug/19 22:37
Start Date: 30/Aug/19 22:37
Worklog Time Spent: 10m 
  Work Description: angoenka commented on pull request #9334: [BEAM-7972] 
Always use Global window in reshuffle and then apply wind…
URL: https://github.com/apache/beam/pull/9334#discussion_r319692142
 
 

 ##
 File path: sdks/python/apache_beam/transforms/util.py
 ##
 @@ -612,29 +611,27 @@ def restore_timestamps(element):
 for (value, timestamp) in values]
 
 else:
-  # The linter is confused.
-  # hash(1) is used to force "runtime" selection of _IdentityWindowFn
-  # pylint: disable=abstract-class-instantiated
-  cls = hash(1) and _IdentityWindowFn
-  window_fn = cls(
-  windowing_saved.windowfn.get_window_coder())
-
-  def reify_timestamps(element, timestamp=DoFn.TimestampParam):
+  def reify_timestamps(element,
+   timestamp=DoFn.TimestampParam,
+   window=DoFn.WindowParam):
 key, value = element
-return key, TimestampedValue(value, timestamp)
+# Transport the window as part of the value and restore it later.
+return key, windowed_value.WindowedValue(value, timestamp, [window])
 
-  def restore_timestamps(element, window=DoFn.WindowParam):
-# Pass the current window since _IdentityWindowFn wouldn't know how
-# to generate it.
+  def restore_timestamps(element):
 key, values = element
 return [
 windowed_value.WindowedValue(
-(key, value.value), value.timestamp, [window])
+(key, value.value), value.timestamp, value.windows)
 for value in values]
 
 ungrouped = pcoll | Map(reify_timestamps)
+
+# TODO(BEAM-8104) Using global window as one of the standard window.
+# This is to mitigate the Java Runner Harness limitation to
 
 Review comment:
   done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304653)
Time Spent: 3h 50m  (was: 3h 40m)

> Portable Python Reshuffle does not work with windowed pcollection
> -
>
> Key: BEAM-7972
> URL: https://issues.apache.org/jira/browse/BEAM-7972
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Streaming pipeline gets stuck when using Reshuffle with windowed pcollection.
> The issue happen because of window function gets deserialized on java side 
> which is not possible and hence default to global window function and result 
> into window function mismatch later down the code.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-6158) Using --save_main_session fails on Python 3 when main module has invocations of superclass method using 'super' .

2019-08-30 Thread Valentyn Tymofieiev (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-6158:
--
Summary: Using --save_main_session fails on Python 3 when main module has 
invocations of superclass method using 'super' .  (was: Using 
--save_main_session fails on Python 3 when main module has superclass 
constructor calls.)

> Using --save_main_session fails on Python 3 when main module has invocations 
> of superclass method using 'super' .
> -
>
> Key: BEAM-6158
> URL: https://issues.apache.org/jira/browse/BEAM-6158
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-harness
>Reporter: Mark Liu
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> A typical manifestation of this failure, which can be observed on several 
> Beam examples:
> {noformat}
> Traceback (most recent call last):
>   File "/usr/lib/python3.5/runpy.py", line 193, in _run_module_as_main
> "__main__", mod_spec)
>   File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
> exec(code, run_globals)
>   File 
> "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py",
>  line 164, in 
> run()
>   File 
> "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/examples/complete/game/user_score.py",
>  line 158, in run 
> | 'WriteUserScoreSums' >> beam.io.WriteToText(args.output))
>   File 
> "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/pipeline.py",
>  line 426, in __exit__
>  
> self.run().wait_until_finish()
>   File 
> "/usr/local/google/home/valentyn/tmp/r2.14.0_py3.5_env/lib/python3.5/site-packages/apache_beam/runners/dataflow/dataflow_runner.py",
>  line 1338, in wait_until_finish   
> (self.state, getattr(self._runner, 'last_error_msg', None)), self)
> apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: 
> Dataflow pipeline failed. State: FAILED, Error:   
>  
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 
> 773, in run
> self._load_main_session(self.local_staging_directory)
>   File 
> "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", line 
> 489, in _load_main_session
>
> pickler.load_session(session_file)
>   File 
> "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", 
> line 280, in load_session 
>
> return dill.load_session(file_path)
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 410, in 
> load_session
> module = unpickler.load()
>   File "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 474, in 
> find_class
> return StockUnpickler.find_class(self, module, name)
> AttributeError: Can't get attribute 'ParseGameEventFn' on  'dataflow_worker.start' from 
> '/usr/local/lib/python3.5/site-packages/dataflow_worker/start.py'> {noformat}
>  
> Note that the example has the following code [1]:
> {code:python}
> class ParseGameEventFn(beam.DoFn):
>   def __init__(self):
>     super(ParseGameEventFn, self).__init__()
> {code}
> https://github.com/apache/beam/blob/0325c360bef17a6673e2d43051e59174b8e5ccc9/sdks/python/apache_beam/examples/complete/game/user_score.py#L81
> +cc: [~tvalentyn] [~robertwb] [~altay]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8123) Support CloudPickle as pickler for Apache Beam.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8123?focusedWorklogId=304649=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304649
 ]

ASF GitHub Bot logged work on BEAM-8123:


Author: ASF GitHub Bot
Created on: 30/Aug/19 22:30
Start Date: 30/Aug/19 22:30
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8978: [BEAM-8123]  
[DO NOT MERGE] POC: use cloudpickle for pickling.
URL: https://github.com/apache/beam/pull/8978#discussion_r302340299
 
 

 ##
 File path: sdks/python/apache_beam/transforms/ptransform.py
 ##
 @@ -696,8 +697,9 @@ def __init__(self, fn, *args, **kwargs):
 # Ensure fn and side inputs are picklable for remote execution.
 try:
   self.fn = pickler.loads(pickler.dumps(self.fn))
-except RuntimeError:
-  raise RuntimeError('Unable to pickle fn %s' % self.fn)
+except Exception:
+  message = traceback.format_exc()
+  raise RuntimeError('Unable to pickle fn %s %s.' % (self.fn, message))
 
 Review comment:
   Comment without fixup.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304649)
Time Spent: 0.5h  (was: 20m)

> Support CloudPickle as pickler for Apache Beam.
> ---
>
> Key: BEAM-8123
> URL: https://issues.apache.org/jira/browse/BEAM-8123
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7993) portable python precommit is flaky

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7993?focusedWorklogId=304648=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304648
 ]

ASF GitHub Bot logged work on BEAM-7993:


Author: ASF GitHub Bot
Created on: 30/Aug/19 22:30
Start Date: 30/Aug/19 22:30
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #9460: [BEAM-7993] Run 
Portable PreCommit tests sequentially
URL: https://github.com/apache/beam/pull/9460#issuecomment-526768601
 
 
   Run Seed Job
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304648)
Time Spent: 1h 10m  (was: 1h)

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Affects Versions: 2.15.0
>Reporter: Udi Meiri
>Assignee: Mark Liu
>Priority: Major
>  Labels: currently-failing
> Fix For: 2.16.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> 

[jira] [Work logged] (BEAM-8123) Support CloudPickle as pickler for Apache Beam.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8123?focusedWorklogId=304647=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304647
 ]

ASF GitHub Bot logged work on BEAM-8123:


Author: ASF GitHub Bot
Created on: 30/Aug/19 22:30
Start Date: 30/Aug/19 22:30
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #8978: [BEAM-8123]  [DO 
NOT MERGE] POC: use cloudpickle for pickling.
URL: https://github.com/apache/beam/pull/8978#issuecomment-526768597
 
 
   Run Python 3.6 postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304647)
Time Spent: 20m  (was: 10m)

> Support CloudPickle as pickler for Apache Beam.
> ---
>
> Key: BEAM-8123
> URL: https://issues.apache.org/jira/browse/BEAM-8123
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-7993) portable python precommit is flaky

2019-08-30 Thread Mark Liu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919934#comment-16919934
 ] 

Mark Liu commented on BEAM-7993:


Thank you for investigating Hannah. You are right, only one sdist runs and both 
py2 and py35 container use same output tarball. So now, I don't know what cause 
this ImportError. It's likely py35 container wasn't built correctly, or some 
step in pipeline execution is wrong under current test configurations. 

Just in case, I still refactord the Jenkins config to make those tests run in 
sequential. Let's see if the failure still exist or not.

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Affects Versions: 2.15.0
>Reporter: Udi Meiri
>Assignee: Mark Liu
>Priority: Major
>  Labels: currently-failing
> Fix For: 2.16.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:129)
> 11:51:22  at 
> 

[jira] [Work logged] (BEAM-8123) Support CloudPickle as pickler for Apache Beam.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8123?focusedWorklogId=304643=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304643
 ]

ASF GitHub Bot logged work on BEAM-8123:


Author: ASF GitHub Bot
Created on: 30/Aug/19 22:25
Start Date: 30/Aug/19 22:25
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #8978: [BEAM-8123]  [DO 
NOT MERGE] POC: use cloudpickle for pickling.
URL: https://github.com/apache/beam/pull/8978#issuecomment-526767723
 
 
   retest this please
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304643)
Remaining Estimate: 0h
Time Spent: 10m

> Support CloudPickle as pickler for Apache Beam.
> ---
>
> Key: BEAM-8123
> URL: https://issues.apache.org/jira/browse/BEAM-8123
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-8123) Support CloudPickle as pickler for Apache Beam.

2019-08-30 Thread Valentyn Tymofieiev (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-8123:
--
Status: Open  (was: Triage Needed)

> Support CloudPickle as pickler for Apache Beam.
> ---
>
> Key: BEAM-8123
> URL: https://issues.apache.org/jira/browse/BEAM-8123
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (BEAM-8123) Support CloudPickle as pickler for Apache Beam.

2019-08-30 Thread Valentyn Tymofieiev (Jira)
Valentyn Tymofieiev created BEAM-8123:
-

 Summary: Support CloudPickle as pickler for Apache Beam.
 Key: BEAM-8123
 URL: https://issues.apache.org/jira/browse/BEAM-8123
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev
Assignee: Valentyn Tymofieiev






--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7993) portable python precommit is flaky

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7993?focusedWorklogId=304639=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304639
 ]

ASF GitHub Bot logged work on BEAM-7993:


Author: ASF GitHub Bot
Created on: 30/Aug/19 22:11
Start Date: 30/Aug/19 22:11
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on pull request #9460: 
[BEAM-7993] Run Portable PreCommit tests sequentially
URL: https://github.com/apache/beam/pull/9460
 
 
   Let's run portable precommit tests sequentially to see if the ImportError is 
caused by parallel execution.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 

[jira] [Work logged] (BEAM-7993) portable python precommit is flaky

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7993?focusedWorklogId=304640=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304640
 ]

ASF GitHub Bot logged work on BEAM-7993:


Author: ASF GitHub Bot
Created on: 30/Aug/19 22:11
Start Date: 30/Aug/19 22:11
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #9460: [BEAM-7993] Run 
Portable PreCommit tests sequentially
URL: https://github.com/apache/beam/pull/9460#issuecomment-526765170
 
 
   Run Seed Job
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304640)
Time Spent: 1h  (was: 50m)

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Affects Versions: 2.15.0
>Reporter: Udi Meiri
>Assignee: Mark Liu
>Priority: Major
>  Labels: currently-failing
> Fix For: 2.16.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> 

[jira] [Resolved] (BEAM-3489) Expose the message id of received messages within PubsubMessage

2019-08-30 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-3489.
-
Fix Version/s: 2.16.0
   Resolution: Fixed

> Expose the message id of received messages within PubsubMessage
> ---
>
> Key: BEAM-3489
> URL: https://issues.apache.org/jira/browse/BEAM-3489
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Luke Cwik
>Assignee: Thinh Ha
>Priority: Minor
>  Labels: newbie, starter
> Fix For: 2.16.0
>
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> This task is about passing forward the message id from the pubsub proto to 
> the java PubsubMessage.
> Add a message id field to PubsubMessage.
> Update the coder for PubsubMessage to encode the message id.
> Update the translation from the Pubsub proto message to the Dataflow message:
> https://github.com/apache/beam/blob/2e275264b21db45787833502e5e42907b05e28b8/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubUnboundedSource.java#L976



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-3489) Expose the message id of received messages within PubsubMessage

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3489?focusedWorklogId=304630=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304630
 ]

ASF GitHub Bot logged work on BEAM-3489:


Author: ASF GitHub Bot
Created on: 30/Aug/19 21:56
Start Date: 30/Aug/19 21:56
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #8370: [BEAM-3489] add 
PubSub messageId in PubsubMessage
URL: https://github.com/apache/beam/pull/8370#issuecomment-526762046
 
 
   Thanks a lot for contributing this and dealing with the additional 
complexity that Dataflow added.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304630)
Time Spent: 8h 20m  (was: 8h 10m)

> Expose the message id of received messages within PubsubMessage
> ---
>
> Key: BEAM-3489
> URL: https://issues.apache.org/jira/browse/BEAM-3489
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Luke Cwik
>Assignee: Thinh Ha
>Priority: Minor
>  Labels: newbie, starter
> Fix For: 2.16.0
>
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> This task is about passing forward the message id from the pubsub proto to 
> the java PubsubMessage.
> Add a message id field to PubsubMessage.
> Update the coder for PubsubMessage to encode the message id.
> Update the translation from the Pubsub proto message to the Dataflow message:
> https://github.com/apache/beam/blob/2e275264b21db45787833502e5e42907b05e28b8/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubUnboundedSource.java#L976



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-3489) Expose the message id of received messages within PubsubMessage

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3489?focusedWorklogId=304629=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304629
 ]

ASF GitHub Bot logged work on BEAM-3489:


Author: ASF GitHub Bot
Created on: 30/Aug/19 21:55
Start Date: 30/Aug/19 21:55
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #8370: [BEAM-3489] 
add PubSub messageId in PubsubMessage
URL: https://github.com/apache/beam/pull/8370
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304629)
Time Spent: 8h 10m  (was: 8h)

> Expose the message id of received messages within PubsubMessage
> ---
>
> Key: BEAM-3489
> URL: https://issues.apache.org/jira/browse/BEAM-3489
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Luke Cwik
>Assignee: Thinh Ha
>Priority: Minor
>  Labels: newbie, starter
> Fix For: 2.16.0
>
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> This task is about passing forward the message id from the pubsub proto to 
> the java PubsubMessage.
> Add a message id field to PubsubMessage.
> Update the coder for PubsubMessage to encode the message id.
> Update the translation from the Pubsub proto message to the Dataflow message:
> https://github.com/apache/beam/blob/2e275264b21db45787833502e5e42907b05e28b8/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubUnboundedSource.java#L976



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=304620=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304620
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 30/Aug/19 21:44
Start Date: 30/Aug/19 21:44
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #9459: [BEAM-5428] Update 
generated portability proto bindings for Go.
URL: https://github.com/apache/beam/pull/9459#issuecomment-526759528
 
 
   R: @mxm 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304620)
Time Spent: 12h 20m  (was: 12h 10m)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Rakesh Kumar
>Priority: Major
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=304615=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304615
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 30/Aug/19 21:44
Start Date: 30/Aug/19 21:44
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #9459: [BEAM-5428] 
Update generated portability proto bindings for Go.
URL: https://github.com/apache/beam/pull/9459
 
 
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)
 | --- | 

[jira] [Work logged] (BEAM-7966) Write portable Flink application jar

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7966?focusedWorklogId=304601=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304601
 ]

ASF GitHub Bot logged work on BEAM-7966:


Author: ASF GitHub Bot
Created on: 30/Aug/19 21:14
Start Date: 30/Aug/19 21:14
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9331: [BEAM-7966] Write 
portable Flink application jar
URL: https://github.com/apache/beam/pull/9331#issuecomment-526752539
 
 
   > With Flink you can bundle multiple entry points into the same jar file
   
   I think the most feasible way to do this would be to point multiple 
pipelines at the same output jar:
   
   ```py
   pipeline1.run().wait_until_finish()
   pipeline2.run().wait_until_finish()
   ```
   
   Then the jar creation code would rebuild the jar, including the new 
pipeline, options, and artifacts. If we went that route, we would need to 
somehow differentiate adding to a jar vs overwriting it. Maybe by adding an 
`--input_executable_path` flag. I also don't know how we could ensure the 
pipelines were created by the same job server (if they weren't, there could be 
ugly version mismatches). However, I can't think of a way around this short of 
larger additions to the SDKs.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304601)
Time Spent: 2h 20m  (was: 2h 10m)

> Write portable Flink application jar
> 
>
> Key: BEAM-7966
> URL: https://issues.apache.org/jira/browse/BEAM-7966
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> *[https://docs.google.com/document/d/1kj_9JWxGWOmSGeZ5hbLVDXSTv-zBrx4kQRqOq85RYD4/edit#heading=h.oes73844vmhl]*



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Comment Edited] (BEAM-7993) portable python precommit is flaky

2019-08-30 Thread Hannah Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919866#comment-16919866
 ] 

Hannah Jiang edited comment on BEAM-7993 at 8/30/19 9:11 PM:
-

I investigated a little more.

We are not parallel running multiple sdist tasks for Python portable precommit 
tasks. Only one sdist task is running and all precommit tasks depend on it , 
which means all precommit tests are sharing the same beam tar ball, so I think 
we do have endpoints_pb2.py, because it doesn't fail with py2. But somehow, it 
is not recognized by py3.

We need to investigate 
 # Is above conclusion correct? [~markflyhigh], can you please help with it?
 # why import failure only happens at Jenkins, but not at local? What is env 
diff? who can help with it?
 # why import doesn't fail some time? If it's a py3 issue, it should happen 
consistently, but it doesn't fail some time, though this chance is small. 
[~tvalentyn], do you have any insights?
 # anything I missed here..


was (Author: hannahjiang):
I investigated a little more.

We are not parallel running multiple sdist tasks for Python portable precommit 
tasks. Only one sdist task is running and all precommit tasks depend on it , 
which means all precommit tests are sharing the same beam tar ball, so I think 
we do have endpoints_pb2.py, because it doesn't fail with py2. But somehow, it 
is not recognized.

We need to investigate 
 # Is above conclusion correct? [~markflyhigh], can you please help with it?
 # why import failure only happens at Jenkins, but not at local? What is env 
diff? who can help with it?
 # why import doesn't fail some time? If it's a py3 issue, it should happen 
consistently, but it doesn't fail some time, though this chance is small. 
[~tvalentyn], do you have any insights?
 # anything I missed here..

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Affects Versions: 2.15.0
>Reporter: Udi Meiri
>Assignee: Mark Liu
>Priority: Major
>  Labels: currently-failing
> Fix For: 2.16.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 

[jira] [Work logged] (BEAM-7760) Interactive Beam Caching PCollections bound to user defined vars in notebook

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7760?focusedWorklogId=304599=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304599
 ]

ASF GitHub Bot logged work on BEAM-7760:


Author: ASF GitHub Bot
Created on: 30/Aug/19 21:10
Start Date: 30/Aug/19 21:10
Worklog Time Spent: 10m 
  Work Description: KevinGG commented on issue #9278: [BEAM-7760] Added 
Interactive Beam module
URL: https://github.com/apache/beam/pull/9278#issuecomment-526751231
 
 
   Based on our offline discussion, we've decided not to make create_pipeline() 
and run_pipeline() API as part of interactive_beam module, but incorporating 
the features into existing beam module. I've updated the design doc: 
https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit?usp=sharing
   
   I've also removed the related code and tests from this PR.
   
   In future beam changes, 
   1. the beam.Pipeline() constructor would by default use 
runner=InteractiveRunner() when 
`apache_beam.runners.interactive.interactive_beam` module has been imported.
   2. the pipeline.run() function would be able to take in runner=? and 
options=? to run existing pipeline with selected runner and options. 'runner' 
option would support both string name and runner object.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304599)
Time Spent: 3h 20m  (was: 3h 10m)

> Interactive Beam Caching PCollections bound to user defined vars in notebook
> 
>
> Key: BEAM-7760
> URL: https://issues.apache.org/jira/browse/BEAM-7760
> Project: Beam
>  Issue Type: New Feature
>  Components: examples-python
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Cache only PCollections bound to user defined variables in a pipeline when 
> running pipeline with interactive runner in jupyter notebooks.
> [Interactive 
> Beam|[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/interactive]]
>  has been caching and using caches of "leaf" PCollections for interactive 
> execution in jupyter notebooks.
> The interactive execution is currently supported so that when appending new 
> transforms to existing pipeline for a new run, executed part of the pipeline 
> doesn't need to be re-executed. 
> A PCollection is "leaf" when it is never used as input in any PTransform in 
> the pipeline.
> The problem with building caches and pipeline to execute around "leaf" is 
> that when a PCollection is consumed by a sink with no output, the pipeline to 
> execute built will miss the subgraph generating and consuming that 
> PCollection.
> An example, "ReadFromPubSub --> WirteToPubSub" will result in an empty 
> pipeline.
> Caching around PCollections bound to user defined variables and replacing 
> transforms with source and sink of caches could resolve the pipeline to 
> execute properly under the interactive execution scenario. Also, cached 
> PCollection now can trace back to user code and can be used for user data 
> visualization if user wants to do it.
> E.g.,
> {code:java}
> // ...
> p = beam.Pipeline(interactive_runner.InteractiveRunner(),
>   options=pipeline_options)
> messages = p | "Read" >> beam.io.ReadFromPubSub(subscription='...')
> messages | "Write" >> beam.io.WriteToPubSub(topic_path)
> result = p.run()
> // ...
> visualize(messages){code}
>  The interactive runner automatically figures out that PCollection
> {code:java}
> messages{code}
> created by
> {code:java}
> p | "Read" >> beam.io.ReadFromPubSub(subscription='...'){code}
> should be cached and reused if the notebook user appends more transforms.
>  And once the pipeline gets executed, the user could use any 
> visualize(PCollection) module to visualize the data statically (batch) or 
> dynamically (stream)



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7984) [python] The coder returned for typehints.List should be IterableCoder

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7984?focusedWorklogId=304584=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304584
 ]

ASF GitHub Bot logged work on BEAM-7984:


Author: ASF GitHub Bot
Created on: 30/Aug/19 20:40
Start Date: 30/Aug/19 20:40
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9344: [BEAM-7984] The coder 
returned for typehints.List should be IterableCoder
URL: https://github.com/apache/beam/pull/9344#issuecomment-526742307
 
 
   These failures could be related to the PR changes: 
https://builds.apache.org/job/beam_PreCommit_Python_Phrase/761/
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304584)
Time Spent: 1h 40m  (was: 1.5h)

> [python] The coder returned for typehints.List should be IterableCoder
> --
>
> Key: BEAM-7984
> URL: https://issues.apache.org/jira/browse/BEAM-7984
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> IterableCoder encodes a list and decodes to list, but 
> typecoders.registry.get_coder(typehints.List[bytes]) returns a 
> FastPrimitiveCoder.  I don't see any reason why this would be advantageous. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-7993) portable python precommit is flaky

2019-08-30 Thread Hannah Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919866#comment-16919866
 ] 

Hannah Jiang commented on BEAM-7993:


I investigated a little more.

We are not parallel running multiple sdist tasks for Python portable precommit 
tasks. Only one sdist task is running and all precommit tasks depend on it , 
which means all precommit tests are sharing the same beam tar ball, so I think 
we do have endpoints_pb2.py, because it doesn't fail with py2. But somehow, it 
is not recognized.

We need to investigate 
 # Is above conclusion correct? [~markflyhigh], can you please help with it?
 # why import failure only happens at Jenkins, but not at local? What is env 
diff? who can help with it?
 # why import doesn't fail some time? If it's a py3 issue, it should happen 
consistently, but it doesn't fail some time, though this chance is small. 
[~tvalentyn], do you have any insights?
 # anything I missed here..

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Affects Versions: 2.15.0
>Reporter: Udi Meiri
>Assignee: Mark Liu
>Priority: Major
>  Labels: currently-failing
> Fix For: 2.16.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> 

[jira] [Work logged] (BEAM-8021) Add Automatic-Module-Name headers for Beam Java modules

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8021?focusedWorklogId=304576=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304576
 ]

ASF GitHub Bot logged work on BEAM-8021:


Author: ASF GitHub Bot
Created on: 30/Aug/19 20:18
Start Date: 30/Aug/19 20:18
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #9417: [BEAM-8021] Add 
Automatic-Module-Name headers to beam's artifacts.
URL: https://github.com/apache/beam/pull/9417#issuecomment-526735607
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304576)
Time Spent: 6h 50m  (was: 6h 40m)

> Add Automatic-Module-Name headers for Beam Java modules 
> 
>
> Key: BEAM-8021
> URL: https://issues.apache.org/jira/browse/BEAM-8021
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Ismaël Mejía
>Assignee: Lukasz Gajowy
>Priority: Minor
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> For compatibility with the Java Platform Module System (JPMS) in Java 9 and 
> later, every JAR should have a module name, even if the library does not 
> itself use modules. As [suggested in the mailing 
> list|https://lists.apache.org/thread.html/956065580ce049481e756482dc3ccfdc994fef3b8cdb37cab3e2d9b1@%3Cdev.beam.apache.org%3E],
>  this is a simple change that we can do and still be backwards compatible.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7966) Write portable Flink application jar

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7966?focusedWorklogId=304575=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304575
 ]

ASF GitHub Bot logged work on BEAM-7966:


Author: ASF GitHub Bot
Created on: 30/Aug/19 20:16
Start Date: 30/Aug/19 20:16
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #9331: [BEAM-7966] Write 
portable Flink application jar
URL: https://github.com/apache/beam/pull/9331#issuecomment-526734851
 
 
   Something else: With Flink you can bundle multiple entry points into the 
same jar file and specify which one to use with optional flags. It may be 
desirable to allow inclusion of multiple pipelines for this tool also, although 
that would require a different workflow. Absent this option, it becomes quite 
convoluted for users that need the flexibility to choose which pipeline to 
launch at submission time.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304575)
Time Spent: 2h 10m  (was: 2h)

> Write portable Flink application jar
> 
>
> Key: BEAM-7966
> URL: https://issues.apache.org/jira/browse/BEAM-7966
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> *[https://docs.google.com/document/d/1kj_9JWxGWOmSGeZ5hbLVDXSTv-zBrx4kQRqOq85RYD4/edit#heading=h.oes73844vmhl]*



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=304573=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304573
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 30/Aug/19 20:12
Start Date: 30/Aug/19 20:12
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on issue #8915: [DO NOT MERGE] 
[BEAM-4046] Remove old project name mappings.
URL: https://github.com/apache/beam/pull/8915#issuecomment-526733853
 
 
   This pull request has been closed due to lack of activity. If you think that 
is incorrect, or the pull request requires review, you can revive the PR at any 
time.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304573)
Time Spent: 41h 40m  (was: 41.5h)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 41h 40m
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-4046) Decouple gradle project names and maven artifact ids

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4046?focusedWorklogId=304574=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304574
 ]

ASF GitHub Bot logged work on BEAM-4046:


Author: ASF GitHub Bot
Created on: 30/Aug/19 20:12
Start Date: 30/Aug/19 20:12
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on pull request #8915: [DO NOT 
MERGE] [BEAM-4046] Remove old project name mappings.
URL: https://github.com/apache/beam/pull/8915
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304574)
Time Spent: 41h 50m  (was: 41h 40m)

> Decouple gradle project names and maven artifact ids
> 
>
> Key: BEAM-4046
> URL: https://issues.apache.org/jira/browse/BEAM-4046
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Michael Luckey
>Priority: Major
>  Time Spent: 41h 50m
>  Remaining Estimate: 0h
>
> In our first draft, we had gradle projects like {{":beam-sdks-java-core"}}. 
> It is clumsy and requires a hacky settings.gradle that is not idiomatic.
> In our second draft, we changed them to names that work well with Gradle, 
> like {{":sdks:java:core"}}. This caused Maven artifact IDs to be wonky.
> In our third draft, we regressed to the first draft to get the Maven artifact 
> ids right.
> These should be able to be decoupled. It seems there are many StackOverflow 
> questions on the subject.
> Since it is unidiomatic and a poor user experience, if it does turn out to be 
> mandatory then it needs to be documented inline everywhere - the 
> settings.gradle should say why it is so bizarre, and each build.gradle should 
> indicate what its project id is.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7966) Write portable Flink application jar

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7966?focusedWorklogId=304572=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304572
 ]

ASF GitHub Bot logged work on BEAM-7966:


Author: ASF GitHub Bot
Created on: 30/Aug/19 20:09
Start Date: 30/Aug/19 20:09
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9331: [BEAM-7966] Write 
portable Flink application jar
URL: https://github.com/apache/beam/pull/9331#issuecomment-526732989
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304572)
Time Spent: 2h  (was: 1h 50m)

> Write portable Flink application jar
> 
>
> Key: BEAM-7966
> URL: https://issues.apache.org/jira/browse/BEAM-7966
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> *[https://docs.google.com/document/d/1kj_9JWxGWOmSGeZ5hbLVDXSTv-zBrx4kQRqOq85RYD4/edit#heading=h.oes73844vmhl]*



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8088) PCollection boundedness should be tracked and propagated

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8088?focusedWorklogId=304569=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304569
 ]

ASF GitHub Bot logged work on BEAM-8088:


Author: ASF GitHub Bot
Created on: 30/Aug/19 20:08
Start Date: 30/Aug/19 20:08
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9426: [BEAM-8088] Track 
PCollection boundedness in python sdk
URL: https://github.com/apache/beam/pull/9426#issuecomment-526732412
 
 
   > Just to clarify, this is to have an unbounded PCollection during the 
expansion of the WriteToPubSub write?
   
   Correct.  
   
   `PubSubIO.Write.expand` looks like this:
   
   ```java
   
   @Override
   public PDone expand(PCollection input) {
 if (getTopicProvider() == null) {
   throw new IllegalStateException("need to set the topic of a 
PubsubIO.Write transform");
 }
   
 switch (input.isBounded()) {
   case BOUNDED:
 input.apply(
 ParDo.of(
 new PubsubBoundedWriter(
 MoreObjects.firstNonNull(getMaxBatchSize(), 
MAX_PUBLISH_BATCH_SIZE),
 MoreObjects.firstNonNull(
 getMaxBatchBytesSize(), 
MAX_PUBLISH_BATCH_BYTE_SIZE_DEFAULT;
 return PDone.in(input.getPipeline());
   case UNBOUNDED:
 return input
 .apply(MapElements.via(getFormatFn()))
 .apply(
 new PubsubUnboundedSink(
 
Optional.ofNullable(getPubsubClientFactory()).orElse(FACTORY),
 NestedValueProvider.of(getTopicProvider(), new 
TopicPathTranslator()),
 getTimestampAttribute(),
 getIdAttribute(),
 100 /* numShards */,
 MoreObjects.firstNonNull(
 getMaxBatchSize(), 
PubsubUnboundedSink.DEFAULT_PUBLISH_BATCH_SIZE),
 MoreObjects.firstNonNull(
 getMaxBatchBytesSize(),
 PubsubUnboundedSink.DEFAULT_PUBLISH_BATCH_BYTES)));
 }
 throw new RuntimeException(); // cases are exhaustive.
   }
   ```
   
   The `ReadFromPubsub` was expanded by Java, but Python was not properly 
handling the unbounded PCollections that it returned, so when `WriteToPubsub` 
was subsequently expanded, it always produced a  `PubsubBoundedWriter`.
   
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304569)
Time Spent: 40m  (was: 0.5h)

> PCollection boundedness should be tracked and propagated
> 
>
> Key: BEAM-8088
> URL: https://issues.apache.org/jira/browse/BEAM-8088
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> As far as I can tell Python does not care about boundedness of PCollections 
> even in streaming mode, but external transforms _do_.  In my ongoing effort 
> to get PubsubIO external transforms working I discovered that I could not 
> generate an unbounded write. 
> My pipeline looks like this:
> {code:python}
> (
> pipe
> | 'PubSubInflow' >> 
> external.pubsub.ReadFromPubSub(subscription=subscription, 
> with_attributes=True)
> | 'PubSubOutflow' >> 
> external.pubsub.WriteToPubSub(topic=OUTPUT_TOPIC, with_attributes=True)
> )
> {code}
> The PCollections returned from the external Read are Unbounded, as expected, 
> but python is responsible for creating the intermediate PCollection, which is 
> always Bounded, and thus external Write is always Bounded. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-8034) Upgrade Flink Runner to 1.8.1 and fix Avro Schema serialization problems

2019-08-30 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919858#comment-16919858
 ] 

Maximilian Michels commented on BEAM-8034:
--

Thanks!

> Upgrade Flink Runner to 1.8.1 and fix Avro Schema serialization problems
> 
>
> Key: BEAM-8034
> URL: https://issues.apache.org/jira/browse/BEAM-8034
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: sunjincheng
>Priority: Major
>
> The Flink Runner should be upgrade to 1.8.1. Users have reported 
> serialization problems related to Avro's Schema:
> {noformat}
> Caused by: java.io.NotSerializableException: org.apache.avro.Schema$Field
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
> at java.util.ArrayList.writeObject(ArrayList.java:766)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1140)
> at 
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
> at 
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
> at 
> org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:576)
> at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:122)
> ... 45 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8088) PCollection boundedness should be tracked and propagated

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8088?focusedWorklogId=304567=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304567
 ]

ASF GitHub Bot logged work on BEAM-8088:


Author: ASF GitHub Bot
Created on: 30/Aug/19 20:00
Start Date: 30/Aug/19 20:00
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9426: [BEAM-8088] Track 
PCollection boundedness in python sdk
URL: https://github.com/apache/beam/pull/9426#issuecomment-526729902
 
 
   Just to clarify, this is to have an unbounded PCollection during the 
expansion of the WriteToPubSub write?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304567)
Time Spent: 0.5h  (was: 20m)

> PCollection boundedness should be tracked and propagated
> 
>
> Key: BEAM-8088
> URL: https://issues.apache.org/jira/browse/BEAM-8088
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As far as I can tell Python does not care about boundedness of PCollections 
> even in streaming mode, but external transforms _do_.  In my ongoing effort 
> to get PubsubIO external transforms working I discovered that I could not 
> generate an unbounded write. 
> My pipeline looks like this:
> {code:python}
> (
> pipe
> | 'PubSubInflow' >> 
> external.pubsub.ReadFromPubSub(subscription=subscription, 
> with_attributes=True)
> | 'PubSubOutflow' >> 
> external.pubsub.WriteToPubSub(topic=OUTPUT_TOPIC, with_attributes=True)
> )
> {code}
> The PCollections returned from the external Read are Unbounded, as expected, 
> but python is responsible for creating the intermediate PCollection, which is 
> always Bounded, and thus external Write is always Bounded. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=304562=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304562
 ]

ASF GitHub Bot logged work on BEAM-8113:


Author: ASF GitHub Bot
Created on: 30/Aug/19 19:50
Start Date: 30/Aug/19 19:50
Worklog Time Spent: 10m 
  Work Description: je-ik commented on pull request #9451: [BEAM-8113] 
Stage files from context classloader
URL: https://github.com/apache/beam/pull/9451#discussion_r319654179
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/ClassLoaderUtils.java
 ##
 @@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink;
+
+import static 
org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage;
+
+import java.net.URLClassLoader;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+import org.apache.beam.sdk.util.common.ReflectHelpers;
+
+class ClassLoaderUtils {
+
+  static List detectFilesToStageFromClass(Class cls) {
 
 Review comment:
   I was thinking about that, but I was not 100% sure about the consequences. I 
will explore the contexts from which this call is made and put the code there, 
if possible. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304562)
Time Spent: 3h  (was: 2h 50m)

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Add also files from 
> {{Thread.currentThread().getContextClassLoader()}}.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=304555=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304555
 ]

ASF GitHub Bot logged work on BEAM-8113:


Author: ASF GitHub Bot
Created on: 30/Aug/19 19:42
Start Date: 30/Aug/19 19:42
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #9451: [BEAM-8113] Stage 
files from context classloader
URL: https://github.com/apache/beam/pull/9451#discussion_r319651751
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/ClassLoaderUtils.java
 ##
 @@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink;
+
+import static 
org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage;
+
+import java.net.URLClassLoader;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+import org.apache.beam.sdk.util.common.ReflectHelpers;
+
+class ClassLoaderUtils {
 
 Review comment:
   Why is this a Flink specific util?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304555)
Time Spent: 2h 40m  (was: 2.5h)

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Add also files from 
> {{Thread.currentThread().getContextClassLoader()}}.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=304556=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304556
 ]

ASF GitHub Bot logged work on BEAM-8113:


Author: ASF GitHub Bot
Created on: 30/Aug/19 19:42
Start Date: 30/Aug/19 19:42
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #9451: [BEAM-8113] Stage 
files from context classloader
URL: https://github.com/apache/beam/pull/9451#discussion_r319651895
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/ClassLoaderUtils.java
 ##
 @@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink;
+
+import static 
org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage;
+
+import java.net.URLClassLoader;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+import org.apache.beam.sdk.util.common.ReflectHelpers;
+
+class ClassLoaderUtils {
+
+  static List detectFilesToStageFromClass(Class cls) {
 
 Review comment:
   Should this go into `PipelineResources` and replace 
`detectFilesToStageFromClass`?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304556)
Time Spent: 2h 50m  (was: 2h 40m)

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Add also files from 
> {{Thread.currentThread().getContextClassLoader()}}.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=304553=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304553
 ]

ASF GitHub Bot logged work on BEAM-8113:


Author: ASF GitHub Bot
Created on: 30/Aug/19 19:39
Start Date: 30/Aug/19 19:39
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9451: [BEAM-8113] Stage files 
from context classloader
URL: https://github.com/apache/beam/pull/9451#issuecomment-526723599
 
 
   @lukecwik Like you, I'm inclined to remove auto-staging entirely because it 
causes too much problems across different classloaders and java versions. As 
@je-ik pointed out, it would be a breaking change. IMHO the changes here do not 
fundamentally alter the current behavior, perhaps just make it execute 
correctly by traversing also the context classloader.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304553)
Time Spent: 2.5h  (was: 2h 20m)

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Add also files from 
> {{Thread.currentThread().getContextClassLoader()}}.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7738) Support PubSubIO to be configured externally for use with other SDKs

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7738?focusedWorklogId=304550=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304550
 ]

ASF GitHub Bot logged work on BEAM-7738:


Author: ASF GitHub Bot
Created on: 30/Aug/19 19:38
Start Date: 30/Aug/19 19:38
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9268: [BEAM-7738] Add 
external transform support to PubsubIO
URL: https://github.com/apache/beam/pull/9268#issuecomment-526723440
 
 
   PubsubIO is officially working on Flink!
   
   Some notes:
   - In order for this to work from python, this PR needs to be merged: 
https://github.com/apache/beam/pull/9426
   - This PR is _not_ based on my pending "user-friendly external transform" 
PR: https://github.com/apache/beam/pull/9098.  There are still some issues that 
I need to work out with @mxm on that one, so I think this should go first and 
I'll update that one afterward.
   
   There's one giant caveat:  this actually does not work unless you have this 
special commit:  
https://github.com/chadrik/beam/commit/d12b99084809ec34fcf0be616e94301d3aca4870.
  It turns out the same is currently true for KafkaIO external transform 
support, which came as quite a surprise.  The Jira issue for this problem is 
here:  https://jira.apache.org/jira/browse/BEAM-7870.  I'd love to see some 
movement on this.  Happy to help where I can!
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304550)
Time Spent: 2.5h  (was: 2h 20m)

> Support PubSubIO to be configured externally for use with other SDKs
> 
>
> Key: BEAM-7738
> URL: https://issues.apache.org/jira/browse/BEAM-7738
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp, runner-flink, sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Labels: portability
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Now that KafkaIO is supported via the external transform API (BEAM-7029) we 
> should add support for PubSub.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=304549=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304549
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 30/Aug/19 19:33
Start Date: 30/Aug/19 19:33
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #9440: [BEAM-5428] Modify 
cache token Proto design to only include tokens in ProcessBundleRequest
URL: https://github.com/apache/beam/pull/9440
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304549)
Time Spent: 12h  (was: 11h 50m)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Rakesh Kumar
>Priority: Major
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7984) [python] The coder returned for typehints.List should be IterableCoder

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7984?focusedWorklogId=304541=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304541
 ]

ASF GitHub Bot logged work on BEAM-7984:


Author: ASF GitHub Bot
Created on: 30/Aug/19 19:28
Start Date: 30/Aug/19 19:28
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9344: [BEAM-7984] The coder 
returned for typehints.List should be IterableCoder
URL: https://github.com/apache/beam/pull/9344#issuecomment-526720649
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304541)
Time Spent: 1.5h  (was: 1h 20m)

> [python] The coder returned for typehints.List should be IterableCoder
> --
>
> Key: BEAM-7984
> URL: https://issues.apache.org/jira/browse/BEAM-7984
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> IterableCoder encodes a list and decodes to list, but 
> typecoders.registry.get_coder(typehints.List[bytes]) returns a 
> FastPrimitiveCoder.  I don't see any reason why this would be advantageous. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7966) Write portable Flink application jar

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7966?focusedWorklogId=304534=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304534
 ]

ASF GitHub Bot logged work on BEAM-7966:


Author: ASF GitHub Bot
Created on: 30/Aug/19 19:21
Start Date: 30/Aug/19 19:21
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9331: [BEAM-7966] Write 
portable Flink application jar
URL: https://github.com/apache/beam/pull/9331#issuecomment-526718781
 
 
   Squashed commits together. I think tests were failing from when master was 
broken.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304534)
Time Spent: 1h 50m  (was: 1h 40m)

> Write portable Flink application jar
> 
>
> Key: BEAM-7966
> URL: https://issues.apache.org/jira/browse/BEAM-7966
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> *[https://docs.google.com/document/d/1kj_9JWxGWOmSGeZ5hbLVDXSTv-zBrx4kQRqOq85RYD4/edit#heading=h.oes73844vmhl]*



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=304532=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304532
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 30/Aug/19 19:18
Start Date: 30/Aug/19 19:18
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9440: [BEAM-5428] Modify cache 
token Proto design to only include tokens in ProcessBundleRequest
URL: https://github.com/apache/beam/pull/9440#issuecomment-526717957
 
 
   Not sure what is the issue, but I get this generated code (excerpt below) 
which does not compile. Everything else is setup correctly and I can run the 
tests. This is not important for this PR, but I just wondered if I'm doing 
anything obvious wrong.
   
   ```diff
   diff --git a/sdks/go/pkg/beam/model/fnexecution_v1/beam_fn_api.pb.go 
b/sdks/go/pkg/beam/model/fnexecution_v1/beam_fn_api.pb.go
   index eff5c3258e..8fae66602f 100644
   --- a/sdks/go/pkg/beam/model/fnexecution_v1/beam_fn_api.pb.go
   +++ b/sdks/go/pkg/beam/model/fnexecution_v1/beam_fn_api.pb.go
   @@ -3,17 +3,18 @@
   
package fnexecution_v1
   
   -import proto "github.com/golang/protobuf/proto"
   -import fmt "fmt"
   -import math "math"
   -import pipeline_v1 
"github.com/apache/beam/sdks/go/pkg/beam/model/pipeline_v1"
   -import _ "github.com/golang/protobuf/protoc-gen-go/descriptor"
   -import timestamp "github.com/golang/protobuf/ptypes/timestamp"
   -import _ "github.com/golang/protobuf/ptypes/wrappers"
   -
import (
   -   context "golang.org/x/net/context"
   +   context "context"
   +   fmt "fmt"
   +   pipeline_v1 
"github.com/apache/beam/sdks/go/pkg/beam/model/pipeline_v1"
   +   proto "github.com/golang/protobuf/proto"
   +   _ "github.com/golang/protobuf/protoc-gen-go/descriptor"
   +   timestamp "github.com/golang/protobuf/ptypes/timestamp"
   +   _ "github.com/golang/protobuf/ptypes/wrappers"
   grpc "google.golang.org/grpc"
   +   codes "google.golang.org/grpc/codes"
   +   status "google.golang.org/grpc/status"
   +   math "math"
)
   
// Reference imports to suppress errors if they are not otherwise used.
   @@ -25,7 +26,7 @@ var _ = math.Inf
// is compatible with the proto package it is being compiled against.
// A compilation error at this line likely means your copy of the
// proto package needs to be updated.
   -const _ = proto.ProtoPackageIsVersion2 // please upgrade the proto package
   +const _ = proto.ProtoPackageIsVersion3 // please upgrade the proto package
   ```
   ...
   
   Also worth mentioning I get this during the code generation:
   
   ```
   beam_fn_api.proto:45:1: warning: Import google/protobuf/wrappers.proto but 
not used.
   beam_fn_api.proto:43:1: warning: Import google/protobuf/descriptor.proto but 
not used.
   ```
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304532)
Time Spent: 11h 50m  (was: 11h 40m)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Rakesh Kumar
>Priority: Major
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=304527=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304527
 ]

ASF GitHub Bot logged work on BEAM-8113:


Author: ASF GitHub Bot
Created on: 30/Aug/19 19:11
Start Date: 30/Aug/19 19:11
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #9451: [BEAM-8113] Stage files 
from context classloader
URL: https://github.com/apache/beam/pull/9451#issuecomment-526715671
 
 
   Fair points, thanks. I thought this will be a "small enough" change, so that 
it doesn't need any tests, but it is actually bigger. Regarding user specified 
classpath - true. But then I'd say we should auto stage nothing. Currently it 
already behaves the sometimes-things-work-sometimes-not-way. I'm fine with 
dropping the autostaging, but that would be a breaking change. The current 
state is not working on JDK9 and later, so we cannot keep it either.
   
   That seems to give a conclusion, that this proposed change is "somewhat 
better than current state", although not perfect. So, if I will add tests (I'm 
a little tempted to put that into @ValidatesRunner, but I'm afraid, that it 
would be much more work, than I previously planned to invest into this :-)), 
will you then be fine with the proposed approach?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304527)
Time Spent: 2h 20m  (was: 2h 10m)

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Add also files from 
> {{Thread.currentThread().getContextClassLoader()}}.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=304526=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304526
 ]

ASF GitHub Bot logged work on BEAM-8113:


Author: ASF GitHub Bot
Created on: 30/Aug/19 19:10
Start Date: 30/Aug/19 19:10
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #9451: [BEAM-8113] Stage files 
from context classloader
URL: https://github.com/apache/beam/pull/9451#issuecomment-526715671
 
 
   Fair points, thanks. I thought this will be a "small enough" change, so that 
it doesn't need any tests, but it is actually bigger. Regarding user specified 
classpath - true. But then I'd say we shouldn't auto stage nothing. Currently 
it already behaves the sometimes-things-work-sometimes-not-way. I'm fine with 
dropping the autostaging, but that would be a breaking change. The current 
state is not working on JDK9 and later, so we cannot keep it either.
   
   That seems to give a conclusion, that this proposed change is "somewhat 
better than current state", although not perfect. So, if I will add tests (I'm 
a little tempted to put that into @ValidatesRunner, but I'm afraid, that it 
would be much more work, than I previously planned to invest into this :-)), 
will you then be fine with the proposed approach?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304526)
Time Spent: 2h 10m  (was: 2h)

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Add also files from 
> {{Thread.currentThread().getContextClassLoader()}}.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=304523=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304523
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 30/Aug/19 19:07
Start Date: 30/Aug/19 19:07
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9440: [BEAM-5428] Modify cache 
token Proto design to only include tokens in ProcessBundleRequest
URL: https://github.com/apache/beam/pull/9440#issuecomment-526714663
 
 
   Run JavaPortabilityApi PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304523)
Time Spent: 11h 40m  (was: 11.5h)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Rakesh Kumar
>Priority: Major
>  Time Spent: 11h 40m
>  Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-8096) Allow runner to configure "subnetwork"

2019-08-30 Thread Jack Whelpton (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919817#comment-16919817
 ] 

Jack Whelpton commented on BEAM-8096:
-

Great, thanks! PR is here:

[https://github.com/apache/beam/pull/9458]

> Allow runner to configure "subnetwork"
> --
>
> Key: BEAM-8096
> URL: https://issues.apache.org/jira/browse/BEAM-8096
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Affects Versions: 2.15.0
>Reporter: Jack Whelpton
>Assignee: Robert Burke
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When running a Dataflow job, the network can be specified using the --network 
> flag; however, there is no support for doing the same for the subnetwork. 
> This would be the go equivalent of the following Java code:
> [https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/options/DataflowPipelineWorkerPoolOptions.html#getSubnetwork--|https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java#L151]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8096) Allow runner to configure "subnetwork"

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8096?focusedWorklogId=304497=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304497
 ]

ASF GitHub Bot logged work on BEAM-8096:


Author: ASF GitHub Bot
Created on: 30/Aug/19 18:25
Start Date: 30/Aug/19 18:25
Worklog Time Spent: 10m 
  Work Description: jackwhelpton commented on issue #9458: [BEAM-8096] 
Allow runner to configure "subnetwork"
URL: https://github.com/apache/beam/pull/9458#issuecomment-526702013
 
 
   R: @danoliveira
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304497)
Time Spent: 20m  (was: 10m)

> Allow runner to configure "subnetwork"
> --
>
> Key: BEAM-8096
> URL: https://issues.apache.org/jira/browse/BEAM-8096
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Affects Versions: 2.15.0
>Reporter: Jack Whelpton
>Assignee: Robert Burke
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When running a Dataflow job, the network can be specified using the --network 
> flag; however, there is no support for doing the same for the subnetwork. 
> This would be the go equivalent of the following Java code:
> [https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/options/DataflowPipelineWorkerPoolOptions.html#getSubnetwork--|https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java#L151]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8096) Allow runner to configure "subnetwork"

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8096?focusedWorklogId=304494=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304494
 ]

ASF GitHub Bot logged work on BEAM-8096:


Author: ASF GitHub Bot
Created on: 30/Aug/19 18:22
Start Date: 30/Aug/19 18:22
Worklog Time Spent: 10m 
  Work Description: jackwhelpton commented on pull request #9458: 
[BEAM-8096] Allow runner to configure "subnetwork"
URL: https://github.com/apache/beam/pull/9458
 
 
   Enables the subnetwork to be configured for the Dataflow runner when using 
the Go SDK, by analogy with the same functionality for Java and Python. The 
flag name matches that for Python:
   
   ```
   go run src.go \
 --runner dataflow \
 ...
 --network {network} \
 --subnetwork regions/{region}/subnetworks/{subnetwork}
 ...
   ```
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 

[jira] [Work logged] (BEAM-4152) Support Go session windowing

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4152?focusedWorklogId=304487=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304487
 ]

ASF GitHub Bot logged work on BEAM-4152:


Author: ASF GitHub Bot
Created on: 30/Aug/19 18:12
Start Date: 30/Aug/19 18:12
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on issue #8111: [BEAM-4152] Kinds 
& URNs for Session Windows
URL: https://github.com/apache/beam/pull/8111#issuecomment-526697757
 
 
   This pull request has been closed due to lack of activity. If you think that 
is incorrect, or the pull request requires review, you can revive the PR at any 
time.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304487)
Time Spent: 4h 10m  (was: 4h)

> Support Go session windowing
> 
>
> Key: BEAM-4152
> URL: https://issues.apache.org/jira/browse/BEAM-4152
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Priority: Major
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Support session windowing and how to handle merging windows.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-4152) Support Go session windowing

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4152?focusedWorklogId=304488=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304488
 ]

ASF GitHub Bot logged work on BEAM-4152:


Author: ASF GitHub Bot
Created on: 30/Aug/19 18:12
Start Date: 30/Aug/19 18:12
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on pull request #8111: [BEAM-4152] 
Kinds & URNs for Session Windows
URL: https://github.com/apache/beam/pull/8111
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304488)
Time Spent: 4h 20m  (was: 4h 10m)

> Support Go session windowing
> 
>
> Key: BEAM-4152
> URL: https://issues.apache.org/jira/browse/BEAM-4152
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Support session windowing and how to handle merging windows.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7984) [python] The coder returned for typehints.List should be IterableCoder

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7984?focusedWorklogId=304462=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304462
 ]

ASF GitHub Bot logged work on BEAM-7984:


Author: ASF GitHub Bot
Created on: 30/Aug/19 18:00
Start Date: 30/Aug/19 18:00
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #9344: [BEAM-7984] The coder 
returned for typehints.List should be IterableCoder
URL: https://github.com/apache/beam/pull/9344#issuecomment-526693811
 
 
   I find the python precommit test rather inscrutable.  It seems that it 
_always_ fails on :sdks:python:setupVirtualenv and I don't understand whether I 
should care or not. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304462)
Time Spent: 1h 20m  (was: 1h 10m)

> [python] The coder returned for typehints.List should be IterableCoder
> --
>
> Key: BEAM-7984
> URL: https://issues.apache.org/jira/browse/BEAM-7984
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> IterableCoder encodes a list and decodes to list, but 
> typecoders.registry.get_coder(typehints.List[bytes]) returns a 
> FastPrimitiveCoder.  I don't see any reason why this would be advantageous. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (BEAM-7665) Support TypeDefinition options in beam.Combine()

2019-08-30 Thread Tianyang Hu (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianyang Hu resolved BEAM-7665.
---
Fix Version/s: 2.14.0
   Resolution: Fixed

> Support TypeDefinition options in beam.Combine()
> 
>
> Key: BEAM-7665
> URL: https://issues.apache.org/jira/browse/BEAM-7665
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Tianyang Hu
>Assignee: Tianyang Hu
>Priority: Major
> Fix For: 2.14.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Today, CombineFn's output type must be either concrete or a generic type that 
> can be inferred from the input type. This doesn't work if the input type is 
> struct\{U, V\} while the output type is V (where U and V are generic types).
> Similar to ParDo, we could add TypeDefinition options support to Combine. 
> This will bind generic types in the CombineFn to the specified concrete types 
> at pipeline construction time.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Comment Edited] (BEAM-8122) Py2 version of getcallargs_forhints always returns generic hint for varargs

2019-08-30 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919768#comment-16919768
 ] 

Udi Meiri edited comment on BEAM-8122 at 8/30/19 5:42 PM:
--

Upcoming https://github.com/apache/beam/pull/9283 will add the relevant 
TODO(BEAM-8122) to the codebase.


was (Author: udim):
Upcoming https://github.com/apache/beam/pull/9283 will add the relevant 
TODO(BEAM-8122) in the codebase.

> Py2 version of getcallargs_forhints always returns generic hint for varargs
> ---
>
> Key: BEAM-8122
> URL: https://issues.apache.org/jira/browse/BEAM-8122
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Priority: Minor
>
> The hint variadic positional arguments (*args) is always Tuple[Any, ...].
> Minor priority since Py2 is being phased out.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-8122) Py2 version of getcallargs_forhints always returns generic hint for varargs

2019-08-30 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919768#comment-16919768
 ] 

Udi Meiri commented on BEAM-8122:
-

Upcoming https://github.com/apache/beam/pull/9283 will add the relevant 
TODO(BEAM-8122) in the codebase.

> Py2 version of getcallargs_forhints always returns generic hint for varargs
> ---
>
> Key: BEAM-8122
> URL: https://issues.apache.org/jira/browse/BEAM-8122
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Priority: Minor
>
> The hint variadic positional arguments (*args) is always Tuple[Any, ...].
> Minor priority since Py2 is being phased out.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-8117) Improve the preparation_before_release script

2019-08-30 Thread Mark Liu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919765#comment-16919765
 ] 

Mark Liu commented on BEAM-8117:


people should select 4096 bits key when executing `gpg --full-generate-key` 
(one of the interactive question).

> Improve the preparation_before_release script
> -
>
> Key: BEAM-8117
> URL: https://issues.apache.org/jira/browse/BEAM-8117
> Project: Beam
>  Issue Type: Sub-task
>  Components: project-management
>Reporter: yifan zou
>Priority: Major
>
> * Setup GPG keys: 
>  * The preparation_before_release.sh interrupted. Git command failed when 
> configuring git signing key.
>  * It required a PMC to add the key in dev@ list, the script doesn’t really 
> help.
>  * Apache requires the key has at least 4096 bits, but script generates the 
> 3072b key by default. There were a few options to select the size of the key, 
> but there was no instruction indicates which option the release manager 
> should choose. 
>  * *Solution*: I follow the Apache official [release signing 
> guide|https://www.apache.org/dev/release-signing.html] to generate the RSA 
> keys then asked a PMC member adding it to the dev and release key list.
>  * Reference: [GPG Cheat Sheet|http://irtfweb.ifa.hawaii.edu/~lockhart/gpg/]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (BEAM-8122) Py2 version of getcallargs_forhints always returns generic hint for varargs

2019-08-30 Thread Udi Meiri (Jira)
Udi Meiri created BEAM-8122:
---

 Summary: Py2 version of getcallargs_forhints always returns 
generic hint for varargs
 Key: BEAM-8122
 URL: https://issues.apache.org/jira/browse/BEAM-8122
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Udi Meiri


The hint variadic positional arguments (*args) is always Tuple[Any, ...].
Minor priority since Py2 is being phased out.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-8116) Beam Release Retrospective Improvements

2019-08-30 Thread yifan zou (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yifan zou updated BEAM-8116:

Summary: Beam Release Retrospective Improvements  (was: Beam 2.15.0 Release 
Retrospective Improvements)

> Beam Release Retrospective Improvements
> ---
>
> Key: BEAM-8116
> URL: https://issues.apache.org/jira/browse/BEAM-8116
> Project: Beam
>  Issue Type: Task
>  Components: project-management, testing, website
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We summarized some pain points during the 2.15 release. Log bugs under this 
> task to track the progress.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-8096) Allow runner to configure "subnetwork"

2019-08-30 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919759#comment-16919759
 ] 

Daniel Oliveira commented on BEAM-8096:
---

The change looks good to me. You're right that there don't seem to be any tests 
for this yet. Feel free to make write the PR and add me or [~lostluck]  as a 
reviewer to it (I believe he'll be back sometime next week).

> Allow runner to configure "subnetwork"
> --
>
> Key: BEAM-8096
> URL: https://issues.apache.org/jira/browse/BEAM-8096
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Affects Versions: 2.15.0
>Reporter: Jack Whelpton
>Assignee: Robert Burke
>Priority: Major
>
> When running a Dataflow job, the network can be specified using the --network 
> flag; however, there is no support for doing the same for the subnetwork. 
> This would be the go equivalent of the following Java code:
> [https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/options/DataflowPipelineWorkerPoolOptions.html#getSubnetwork--|https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java#L151]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work started] (BEAM-8118) Update build_release_candidate script with the beam-site changes.

2019-08-30 Thread yifan zou (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-8118 started by yifan zou.
---
> Update build_release_candidate script with the beam-site changes.
> -
>
> Key: BEAM-8118
> URL: https://issues.apache.org/jira/browse/BEAM-8118
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>
> We moved the beam-site to beam/website, the instructions (at the end) in the 
> release scripts did not update.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-8118) Update build_release_candidate script with the beam-site changes.

2019-08-30 Thread yifan zou (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yifan zou updated BEAM-8118:

Status: Open  (was: Triage Needed)

> Update build_release_candidate script with the beam-site changes.
> -
>
> Key: BEAM-8118
> URL: https://issues.apache.org/jira/browse/BEAM-8118
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>
> We moved the beam-site to beam/website, the instructions (at the end) in the 
> release scripts did not update.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Closed] (BEAM-8120) Update the finalization in the release guide

2019-08-30 Thread yifan zou (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yifan zou closed BEAM-8120.
---
Fix Version/s: Not applicable
   Resolution: Won't Fix

> Update the finalization in the release guide
> 
>
> Key: BEAM-8120
> URL: https://issues.apache.org/jira/browse/BEAM-8120
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
> Fix For: Not applicable
>
>
> * command failure: Deploy Python artifacts to pypi (third step), twine upload 
> . -> twine upload * 
>  * typo: upldaed -> updated (in the section Deploy source release to 
> dist.apache.org)



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-8120) Update the finalization in the release guide

2019-08-30 Thread yifan zou (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919751#comment-16919751
 ] 

yifan zou commented on BEAM-8120:
-

The revised the release guide was reverted. This won't be fixed. See 
https://issues.apache.org/jira/browse/BEAM-8097 for release guide reverse.

> Update the finalization in the release guide
> 
>
> Key: BEAM-8120
> URL: https://issues.apache.org/jira/browse/BEAM-8120
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>
> * command failure: Deploy Python artifacts to pypi (third step), twine upload 
> . -> twine upload * 
>  * typo: upldaed -> updated (in the section Deploy source release to 
> dist.apache.org)



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (BEAM-7354) Starcgen tool not working when no identifiers specified

2019-08-30 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira resolved BEAM-7354.
---
Fix Version/s: Not applicable
   Resolution: Fixed

> Starcgen tool not working when no identifiers specified
> ---
>
> Key: BEAM-7354
> URL: https://issues.apache.org/jira/browse/BEAM-7354
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Stumbled onto this bug, starcgen tool is currently used only with identifiers 
> specified so this was missed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8116) Beam 2.15.0 Release Retrospective Improvements

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8116?focusedWorklogId=304401=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304401
 ]

ASF GitHub Bot logged work on BEAM-8116:


Author: ASF GitHub Bot
Created on: 30/Aug/19 17:16
Start Date: 30/Aug/19 17:16
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #9457: [BEAM-8116] Update 
website instructions in the build_rc script.
URL: https://github.com/apache/beam/pull/9457#issuecomment-526679812
 
 
   +R: @markflyhigh 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304401)
Time Spent: 20m  (was: 10m)

> Beam 2.15.0 Release Retrospective Improvements
> --
>
> Key: BEAM-8116
> URL: https://issues.apache.org/jira/browse/BEAM-8116
> Project: Beam
>  Issue Type: Task
>  Components: project-management, testing, website
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We summarized some pain points during the 2.15 release. Log bugs under this 
> task to track the progress.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8032) JdbcIO.readRows() throws exception when the statementPreparator is not provided for simple Select statement

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8032?focusedWorklogId=304400=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304400
 ]

ASF GitHub Bot logged work on BEAM-8032:


Author: ASF GitHub Bot
Created on: 30/Aug/19 17:15
Start Date: 30/Aug/19 17:15
Worklog Time Spent: 10m 
  Work Description: charithe commented on issue #9425: [BEAM-8032] Fix JDBC 
readRows exception when statement preparator is null
URL: https://github.com/apache/beam/pull/9425#issuecomment-526679679
 
 
   Ah! That's a good idea. Thanks. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304400)
Time Spent: 1h  (was: 50m)

> JdbcIO.readRows() throws exception when the statementPreparator is not 
> provided for simple Select statement
> ---
>
> Key: BEAM-8032
> URL: https://issues.apache.org/jira/browse/BEAM-8032
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jdbc
>Affects Versions: 2.14.0
>Reporter: Kishor Joshi
>Assignee: Charith Ellawala
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> I want to read table data with a query without parameters (select * from 
> table_name). 
> As per my understanding, this should not require "StatementPreperator". 
> However, if I use the newly added "readRows" function, I get an exception 
> that seems to force me to use the "StatementPreperator". 
> Stacktrace below.  
>  
> java.lang.IllegalArgumentException: statementPreparator can not be null
>         at 
> org.apache.beam.vendor.guava.v20_0.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
>         at 
> org.apache.beam.sdk.io.jdbc.JdbcIO$Read.withStatementPreparator(JdbcIO.java:600)
>         at org.apache.beam.sdk.io.jdbc.JdbcIO$ReadRows.expand(JdbcIO.java:499)
>         at org.apache.beam.sdk.io.jdbc.JdbcIO$ReadRows.expand(JdbcIO.java:410)
>         at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
>         at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:471)
>         at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:44)
>         at 
> com.nokia.csf.dfle.transforms.DfleRdbmsSource.expand(DfleRdbmsSource.java:34)
>         at 
> com.nokia.csf.dfle.transforms.DfleRdbmsSource.expand(DfleRdbmsSource.java:10)
>         at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
>         at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:488)
>         at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:56)
>         at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:182)



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8116) Beam 2.15.0 Release Retrospective Improvements

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8116?focusedWorklogId=304399=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304399
 ]

ASF GitHub Bot logged work on BEAM-8116:


Author: ASF GitHub Bot
Created on: 30/Aug/19 17:15
Start Date: 30/Aug/19 17:15
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on pull request #9457: [BEAM-8116] 
Update website instructions in the build_rc script.
URL: https://github.com/apache/beam/pull/9457
 
 
   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 

[jira] [Commented] (BEAM-7993) portable python precommit is flaky

2019-08-30 Thread Mark Liu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919734#comment-16919734
 ] 

Mark Liu commented on BEAM-7993:


I don't have much insight to test itself. But let's make those tests running 
sequentially and see how it goes.

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Affects Versions: 2.15.0
>Reporter: Udi Meiri
>Assignee: Mark Liu
>Priority: Major
>  Labels: currently-failing
> Fix For: 2.16.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:129)
> 11:51:22  at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494)
> 11:51:22  ... 3 more
> {code}
> https://builds.apache.org/job/beam_PreCommit_Portable_Python_Commit/5512/consoleFull



--
This message was sent by Atlassian Jira

[jira] [Resolved] (BEAM-5709) Tests in BeamFnControlServiceTest are flaky.

2019-08-30 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira resolved BEAM-5709.
---
Fix Version/s: (was: Not applicable)
   2.14.0
   Resolution: Fixed

My bad, completely forgot to close this.

> Tests in BeamFnControlServiceTest are flaky.
> 
>
> Key: BEAM-5709
> URL: https://issues.apache.org/jira/browse/BEAM-5709
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Critical
> Fix For: 2.14.0
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/fn/BeamFnControlServiceTest.java
> Tests for BeamFnControlService are currently flaky. The test attempts to 
> verify that onCompleted was called on the mock streams, but that function 
> gets called on a separate thread, so occasionally the function will not have 
> been called yet, despite the server being shut down.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7274) Protobuf Beam Schema support

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7274?focusedWorklogId=304370=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304370
 ]

ASF GitHub Bot logged work on BEAM-7274:


Author: ASF GitHub Bot
Created on: 30/Aug/19 16:30
Start Date: 30/Aug/19 16:30
Worklog Time Spent: 10m 
  Work Description: alexvanboxel commented on pull request #8690: 
[BEAM-7274] Implement the Protobuf schema provider
URL: https://github.com/apache/beam/pull/8690#discussion_r319588147
 
 

 ##
 File path: 
sdks/java/extensions/protobuf/src/main/java/org/apache/beam/sdk/extensions/protobuf/ProtoRow.java
 ##
 @@ -0,0 +1,556 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.protobuf;
+
+import com.google.protobuf.ByteString;
+import com.google.protobuf.Descriptors;
+import com.google.protobuf.DynamicMessage;
+import com.google.protobuf.Message;
+import com.google.protobuf.Timestamp;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.SchemaCoder;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.Instant;
+
+/**
+ * ProtoRow extends the Row and does late materialisation. It hold a reference 
to the original proto
+ * message and has an overlay on each field of the proto message. It doesn't 
have it's own Coder as
+ * it relies on the SchemaCoder to (de)serialize the message over the wire.
+ *
+ * Each row has a FieldOverlay that handles specific field conversions, as 
well has special
+ * overlays for Well Know Types, Repeatables, Maps and Nullables.
+ */
 
 Review comment:
   ProtoRow completely removed. Everything delegated to RowWithGetters.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304370)
Time Spent: 6h  (was: 5h 50m)

> Protobuf Beam Schema support
> 
>
> Key: BEAM-7274
> URL: https://issues.apache.org/jira/browse/BEAM-7274
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Add support for the new Beam Schema to the Protobuf extension.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7972) Portable Python Reshuffle does not work with windowed pcollection

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7972?focusedWorklogId=304369=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304369
 ]

ASF GitHub Bot logged work on BEAM-7972:


Author: ASF GitHub Bot
Created on: 30/Aug/19 16:29
Start Date: 30/Aug/19 16:29
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #9334: [BEAM-7972] 
Always use Global window in reshuffle and then apply wind…
URL: https://github.com/apache/beam/pull/9334#discussion_r319587604
 
 

 ##
 File path: sdks/python/apache_beam/transforms/util.py
 ##
 @@ -612,29 +611,27 @@ def restore_timestamps(element):
 for (value, timestamp) in values]
 
 else:
-  # The linter is confused.
-  # hash(1) is used to force "runtime" selection of _IdentityWindowFn
-  # pylint: disable=abstract-class-instantiated
-  cls = hash(1) and _IdentityWindowFn
-  window_fn = cls(
-  windowing_saved.windowfn.get_window_coder())
-
-  def reify_timestamps(element, timestamp=DoFn.TimestampParam):
+  def reify_timestamps(element,
+   timestamp=DoFn.TimestampParam,
+   window=DoFn.WindowParam):
 key, value = element
-return key, TimestampedValue(value, timestamp)
+# Transport the window as part of the value and restore it later.
+return key, windowed_value.WindowedValue(value, timestamp, [window])
 
-  def restore_timestamps(element, window=DoFn.WindowParam):
-# Pass the current window since _IdentityWindowFn wouldn't know how
-# to generate it.
+  def restore_timestamps(element):
 key, values = element
 return [
 windowed_value.WindowedValue(
-(key, value.value), value.timestamp, [window])
+(key, value.value), value.timestamp, value.windows)
 
 Review comment:
   You could do
   
   ```
   key, windowed_values = element
   return [wv.with_value((key, wv.value)) for wv in windowed_values]
   ```
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304369)
Time Spent: 3.5h  (was: 3h 20m)

> Portable Python Reshuffle does not work with windowed pcollection
> -
>
> Key: BEAM-7972
> URL: https://issues.apache.org/jira/browse/BEAM-7972
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Streaming pipeline gets stuck when using Reshuffle with windowed pcollection.
> The issue happen because of window function gets deserialized on java side 
> which is not possible and hence default to global window function and result 
> into window function mismatch later down the code.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7972) Portable Python Reshuffle does not work with windowed pcollection

2019-08-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7972?focusedWorklogId=304368=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304368
 ]

ASF GitHub Bot logged work on BEAM-7972:


Author: ASF GitHub Bot
Created on: 30/Aug/19 16:29
Start Date: 30/Aug/19 16:29
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #9334: [BEAM-7972] 
Always use Global window in reshuffle and then apply wind…
URL: https://github.com/apache/beam/pull/9334#discussion_r319587701
 
 

 ##
 File path: sdks/python/apache_beam/transforms/util.py
 ##
 @@ -612,29 +611,27 @@ def restore_timestamps(element):
 for (value, timestamp) in values]
 
 else:
-  # The linter is confused.
-  # hash(1) is used to force "runtime" selection of _IdentityWindowFn
-  # pylint: disable=abstract-class-instantiated
-  cls = hash(1) and _IdentityWindowFn
-  window_fn = cls(
-  windowing_saved.windowfn.get_window_coder())
-
-  def reify_timestamps(element, timestamp=DoFn.TimestampParam):
+  def reify_timestamps(element,
+   timestamp=DoFn.TimestampParam,
+   window=DoFn.WindowParam):
 key, value = element
-return key, TimestampedValue(value, timestamp)
+# Transport the window as part of the value and restore it later.
+return key, windowed_value.WindowedValue(value, timestamp, [window])
 
-  def restore_timestamps(element, window=DoFn.WindowParam):
-# Pass the current window since _IdentityWindowFn wouldn't know how
-# to generate it.
+  def restore_timestamps(element):
 key, values = element
 return [
 windowed_value.WindowedValue(
-(key, value.value), value.timestamp, [window])
+(key, value.value), value.timestamp, value.windows)
 for value in values]
 
 ungrouped = pcoll | Map(reify_timestamps)
+
+# TODO(BEAM-8104) Using global window as one of the standard window.
+# This is to mitigate the Java Runner Harness limitation to
 
 Review comment:
   s/Java Runner Harness/Dataflow Java Runner Harness/
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304368)
Time Spent: 3.5h  (was: 3h 20m)

> Portable Python Reshuffle does not work with windowed pcollection
> -
>
> Key: BEAM-7972
> URL: https://issues.apache.org/jira/browse/BEAM-7972
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Streaming pipeline gets stuck when using Reshuffle with windowed pcollection.
> The issue happen because of window function gets deserialized on java side 
> which is not possible and hence default to global window function and result 
> into window function mismatch later down the code.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


  1   2   >