[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-10-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=154120&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154120
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 13/Oct/18 18:40
Start Date: 13/Oct/18 18:40
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #6592: 
[BEAM-4176] Enable Post Commit JAVA PVR tests for Flink
URL: https://github.com/apache/beam/pull/6592#discussion_r224968563
 
 

 ##
 File path: 
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
 ##
 @@ -1498,6 +1498,7 @@ artifactId=${project.name}
  testClassesDirs = project.files(project.project(":beam-sdks-java-core").sourceSets.test.output.classesDirs,
      project.project(":beam-runners-core-java").sourceSets.test.output.classesDirs)
  maxParallelForks config.parallelism
  useJUnit(config.testCategories)
 +dependsOn ':beam-sdks-java-container:docker'
 
 Review comment:
   Is this later going to change to use the process environment?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 154120)
Time Spent: 33h 40m  (was: 33.5h)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
>     URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Ankur Goenka
>Priority: Major
> Attachments: 81VxNWtFtke.png, Screen Shot 2018-08-14 at 4.18.31 
> PM.png, Screen Shot 2018-09-03 at 11.07.38 AM.png
>
>  Time Spent: 33h 40m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5708) Support caching of SDKHarness environments in flink

2018-10-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5708?focusedWorklogId=154119&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154119
 ]

ASF GitHub Bot logged work on BEAM-5708:


Author: ASF GitHub Bot
Created on: 13/Oct/18 18:21
Start Date: 13/Oct/18 18:21
Worklog Time Spent: 10m 
  Work Description: tweise closed pull request #6638: [BEAM-5708] Cache 
environment in portable flink runner
URL: https://github.com/apache/beam/pull/6638
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy b/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
index 290f72399db..0ba980b27e5 100644
--- a/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
+++ b/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
@@ -1484,6 +1484,7 @@ artifactId=${project.name}
   def beamTestPipelineOptions = [
     "--runner=org.apache.beam.runners.reference.testing.TestPortableRunner",
     "--jobServerDriver=${config.jobServerDriver}",
+    "--environmentCacheMillis=1",
   ]
   if (config.jobServerConfig) {
     beamTestPipelineOptions.add("--jobServerConfig=${config.jobServerConfig}")
diff --git a/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/ReferenceCountingFlinkExecutableStageContextFactory.java b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/ReferenceCountingFlinkExecutableStageContextFactory.java
index 988a94826fb..bb2b9dcbe16 100644
--- a/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/ReferenceCountingFlinkExecutableStageContextFactory.java
+++ b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/ReferenceCountingFlinkExecutableStageContextFactory.java
@@ -24,12 +24,17 @@
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.Executors;
 import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicInteger;
+import org.apache.beam.runners.core.construction.PipelineOptionsTranslation;
 import org.apache.beam.runners.core.construction.graph.ExecutableStage;
 import org.apache.beam.runners.fnexecution.control.StageBundleFactory;
 import org.apache.beam.runners.fnexecution.provisioning.JobInfo;
 import org.apache.beam.sdk.fn.function.ThrowingFunction;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.PortablePipelineOptions;
 import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.java.ExecutionEnvironment;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -104,9 +109,30 @@ private void scheduleRelease(JobInfo jobInfo) {
     WrappedContext wrapper = getCache().get(jobInfo.jobId());
     Preconditions.checkState(
         wrapper != null, "Releasing context for unknown job: " + jobInfo.jobId());
-    // Do not release this asynchronously, as the releasing could fail due to the classloader not being
-    // available anymore after the tasks have been removed from the execution engine.
-    release(wrapper);
+
+    PipelineOptions pipelineOptions =
+        PipelineOptionsTranslation.fromProto(jobInfo.pipelineOptions());
+    int environmentCacheTTLMillis =
+        pipelineOptions.as(PortablePipelineOptions.class).getEnvironmentCacheMillis();
+    if (environmentCacheTTLMillis > 0) {
+      if (this.getClass().getClassLoader() != ExecutionEnvironment.class.getClassLoader()) {
+        LOG.warn(
+            "{} is not loaded on parent Flink classloader. "
+                + "Falling back to synchronous environment release for job {}.",
+            this.getClass(),
+            jobInfo.jobId());
+        release(wrapper);
+      } else {
+        // Schedule task to clean the container later.
+        // Ensure that this class is loaded in the parent Flink classloader.
+        getExecutor()
+            .schedule(() -> release(wrapper), environmentCacheTTLMillis, TimeUnit.MILLISECONDS);
+      }
+    } else {
+      // Do not release this asynchronously, as the releasing could fail due to the classloader not
+      // being available anymore after the tasks have been removed from the execution engine.
+      release(wrapper);
+    }
   }
 
   private ConcurrentHashMap getCache() {
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PortablePipelineOptions.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sd

[jira] [Work logged] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-10-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5442?focusedWorklogId=154104&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154104
 ]

ASF GitHub Bot logged work on BEAM-5442:


Author: ASF GitHub Bot
Created on: 13/Oct/18 17:07
Start Date: 13/Oct/18 17:07
Worklog Time Spent: 10m 
  Work Description: tweise closed pull request #6683: [BEAM-5442] Revert 
#6675 "Revert PRs #6557 #6589 #6600"
URL: https://github.com/apache/beam/pull/6683
 
 
   




Issue Time Tracking
---

Worklog Id: (was: 154104)
Time Spent: 9h 50m  (was: 9h 40m)

> PortableRunner swallows custom options for Runner
> -
>
> Key: BEAM-5442
> URL: https://issues.apache.org/jira/browse/BEAM-5442
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability, portability-flink
> Fix For: 2.9.0
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> The PortableRunner doesn't pass custom PipelineOptions to the executing 
> Runner.
> Example: {{--parallelism=4}} won't be forwarded to the FlinkRunner.
> (The option is just removed during proto translation without any warning)
> We should allow some form of customization through the options, even for the 
> PortableRunner. 





[jira] [Work logged] (BEAM-5097) Increment counter for "small words" in go SDK example

2018-10-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5097?focusedWorklogId=154100&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154100
 ]

ASF GitHub Bot logged work on BEAM-5097:


Author: ASF GitHub Bot
Created on: 13/Oct/18 14:07
Start Date: 13/Oct/18 14:07
Worklog Time Spent: 10m 
  Work Description: stale[bot] closed pull request #6157: [BEAM-5097][WIP] 
Add counter to combine example in go sdk
URL: https://github.com/apache/beam/pull/6157
 
 
   


diff --git a/sdks/go/examples/cookbook/combine/combine.go b/sdks/go/examples/cookbook/combine/combine.go
index 7e24aa1fb30..1950a687d24 100644
--- a/sdks/go/examples/cookbook/combine/combine.go
+++ b/sdks/go/examples/cookbook/combine/combine.go
@@ -63,11 +63,16 @@ type extractFn struct {
 	MinLength int `json:"min_length"`
 }
 
+// A global context for simplicity.
+var ctx = context.Background()
+
 func (f *extractFn) ProcessElement(row WordRow, emit func(string, string)) {
+	small_words := beam.NewCounter("example.namespace", "small_words")
 	if len(row.Word) >= f.MinLength {
 		emit(row.Word, row.Corpus)
+	} else {
+		small_words.Inc(ctx, 1)
 	}
-	// TODO(herohde) 7/14/2017: increment counter for "small words"
 }
 
 // TODO(herohde) 7/14/2017: the choice of a string (instead of []string) for the


 




Issue Time Tracking
---

Worklog Id: (was: 154100)
Time Spent: 1h 40m  (was: 1.5h)

> Increment counter for "small words" in go SDK example
> -
>
> Key: BEAM-5097
> URL: https://issues.apache.org/jira/browse/BEAM-5097
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: holdenk
>Assignee: holdenk
>Priority: Trivial
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Increment counter for "small words" in go SDK example





[jira] [Work logged] (BEAM-5097) Increment counter for "small words" in go SDK example

2018-10-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5097?focusedWorklogId=154099&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154099
 ]

ASF GitHub Bot logged work on BEAM-5097:


Author: ASF GitHub Bot
Created on: 13/Oct/18 14:07
Start Date: 13/Oct/18 14:07
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on issue #6157: [BEAM-5097][WIP] 
Add counter to combine example in go sdk
URL: https://github.com/apache/beam/pull/6157#issuecomment-429545012
 
 
   This pull request has been closed due to lack of activity. If you think that 
is incorrect, or the pull request requires review, you can revive the PR at any 
time.
   




Issue Time Tracking
---

Worklog Id: (was: 154099)
Time Spent: 1.5h  (was: 1h 20m)

> Increment counter for "small words" in go SDK example
> -
>
> Key: BEAM-5097
> URL: https://issues.apache.org/jira/browse/BEAM-5097
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: holdenk
>Assignee: holdenk
>Priority: Trivial
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Increment counter for "small words" in go SDK example





[jira] [Work logged] (BEAM-5626) Several IO tests fail in Python 3 with RuntimeError('dictionary changed size during iteration',)}

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5626?focusedWorklogId=154082&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154082
 ]

ASF GitHub Bot logged work on BEAM-5626:


Author: ASF GitHub Bot
Created on: 13/Oct/18 04:25
Start Date: 13/Oct/18 04:25
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on a change in pull request #6587: 
[BEAM-5626] Fix hadoop filesystem test for py3.
URL: https://github.com/apache/beam/pull/6587#discussion_r224949182
 
 

 ##
 File path: sdks/python/apache_beam/io/hadoopfilesystem_test.py
 ##
 @@ -214,6 +214,11 @@ def setUp(self):
   url = self.fs.join(self.tmpdir, filename)
   self.fs.create(url).close()
 
+try:# Python 2
 
 Review comment:
   @tvalentyn I came across this python2->python3 doc from python.org, LINK: 
 python-2-3 Difference. 
   
   Interesting read. The section "Use feature detection instead of version 
detection" talks about relying on sys.version_info. I figured we could weigh 
the point this doc raises before we apply the change everywhere. 
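
For readers unfamiliar with the distinction the comment refers to, a minimal
sketch follows. It is not taken from the PR: the variable names are
hypothetical, and the Python 2 `unicode` builtin is used only as a convenient
example of a name that exists on one major version and not on the other.

```python
import sys

# Version detection: branch on what the interpreter reports about itself.
if sys.version_info[0] >= 3:
    text_type_by_version = str
else:
    text_type_by_version = unicode  # noqa: F821 (only defined on Python 2)

# Feature detection: probe for the name itself and fall back if it is missing.
try:
    text_type_by_feature = unicode  # noqa: F821 (only defined on Python 2)
except NameError:
    text_type_by_feature = str  # Python 3: str is already the text type
```

Both branches pick the same type on a given interpreter. The second form does
not depend on version numbers, which is the trade-off the quoted section
discusses.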




Issue Time Tracking
---

Worklog Id: (was: 154082)
Time Spent: 5h 40m  (was: 5.5h)

> Several IO tests fail in Python 3 with RuntimeError('dictionary changed size 
> during iteration',)}
> -
>
> Key: BEAM-5626
> URL: https://issues.apache.org/jira/browse/BEAM-5626
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Ruoyun Huang
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
>  ERROR: test_delete_dir 
> (apache_beam.io.hadoopfilesystem_test.HadoopFileSystemTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/hadoopfilesystem_test.py",
>  line 506, in test_delete_dir
>  self.fs.delete([url_t1])
>File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/hadoopfilesystem.py",
>  line 370, in delete
>  raise BeamIOError("Delete operation failed", exceptions)
>  apache_beam.io.filesystem.BeamIOError: Delete operation failed with 
> exceptions {'hdfs://test_dir/new_dir1': RuntimeError('dictionary changed size 
> during iteration',   )}
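
The RuntimeError in the traceback above is what Python 3 raises when a dict is
resized while it is being iterated. The sketch below reproduces that class of
failure in isolation. It is an assumption about the general cause, not the
actual Beam or test code; `drop_prefix_unsafe` and `drop_prefix_safe` are
hypothetical helpers.

```python
def drop_prefix_unsafe(entries, prefix):
    """Deletes matching keys while iterating the live key view."""
    # On Python 3, keys() is a view over the dict, so deleting inside the
    # loop raises "dictionary changed size during iteration".
    for key in entries.keys():
        if key.startswith(prefix):
            del entries[key]


def drop_prefix_safe(entries, prefix):
    """Deletes matching keys while iterating a snapshot of the keys."""
    # list(...) copies the keys first, so the dict can shrink safely.
    for key in list(entries.keys()):
        if key.startswith(prefix):
            del entries[key]
```

On Python 2, `keys()` returned a list, so the first form happened to work
there, which is one common way this error only shows up after a port.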





[jira] [Work logged] (BEAM-5626) Several IO tests fail in Python 3 with RuntimeError('dictionary changed size during iteration',)}

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5626?focusedWorklogId=154083&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154083
 ]

ASF GitHub Bot logged work on BEAM-5626:


Author: ASF GitHub Bot
Created on: 13/Oct/18 04:28
Start Date: 13/Oct/18 04:28
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on a change in pull request #6587: 
[BEAM-5626] Fix hadoop filesystem test for py3.
URL: https://github.com/apache/beam/pull/6587#discussion_r224949182
 
 

 ##
 File path: sdks/python/apache_beam/io/hadoopfilesystem_test.py
 ##
 @@ -214,6 +214,11 @@ def setUp(self):
   url = self.fs.join(self.tmpdir, filename)
   self.fs.create(url).close()
 
+try:# Python 2
 
 Review comment:
   @tvalentyn I came across this python2->python3 doc from python.org, LINK: 
 python-2-3 Difference. 
   
   Interesting read. The section "Use feature detection instead of version 
detection" talks about relying on sys.version_info. That point does count 
against our use of it here, though.  
   
   Just FYI.




Issue Time Tracking
---

Worklog Id: (was: 154083)
Time Spent: 5h 50m  (was: 5h 40m)

> Several IO tests fail in Python 3 with RuntimeError('dictionary changed size 
> during iteration',)}
> -
>
> Key: BEAM-5626
> URL: https://issues.apache.org/jira/browse/BEAM-5626
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Ruoyun Huang
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
>  ERROR: test_delete_dir 
> (apache_beam.io.hadoopfilesystem_test.HadoopFileSystemTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/hadoopfilesystem_test.py",
>  line 506, in test_delete_dir
>  self.fs.delete([url_t1])
>File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/hadoopfilesystem.py",
>  line 370, in delete
>  raise BeamIOError("Delete operation failed", exceptions)
>  apache_beam.io.filesystem.BeamIOError: Delete operation failed with 
> exceptions {'hdfs://test_dir/new_dir1': RuntimeError('dictionary changed size 
> during iteration',   )}





[jira] [Work logged] (BEAM-5626) Several IO tests fail in Python 3 with RuntimeError('dictionary changed size during iteration',)}

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5626?focusedWorklogId=154081&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154081
 ]

ASF GitHub Bot logged work on BEAM-5626:


Author: ASF GitHub Bot
Created on: 13/Oct/18 04:24
Start Date: 13/Oct/18 04:24
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on a change in pull request #6587: 
[BEAM-5626] Fix hadoop filesystem test for py3.
URL: https://github.com/apache/beam/pull/6587#discussion_r224949182
 
 

 ##
 File path: sdks/python/apache_beam/io/hadoopfilesystem_test.py
 ##
 @@ -214,6 +214,11 @@ def setUp(self):
   url = self.fs.join(self.tmpdir, filename)
   self.fs.create(url).close()
 
+try:# Python 2
 
 Review comment:
   I came across this python2->python3 doc from python.org:  
 python-2-3 Difference 
   
   Interesting read. The section "Use feature detection instead of version 
detection" talks about relying on sys.version_info. I figured we could weigh 
the point this doc raises before we apply the change everywhere. 




Issue Time Tracking
---

Worklog Id: (was: 154081)
Time Spent: 5.5h  (was: 5h 20m)

> Several IO tests fail in Python 3 with RuntimeError('dictionary changed size 
> during iteration',)}
> -
>
> Key: BEAM-5626
> URL: https://issues.apache.org/jira/browse/BEAM-5626
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Ruoyun Huang
>Priority: Major
> Fix For: 2.8.0
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
>  ERROR: test_delete_dir 
> (apache_beam.io.hadoopfilesystem_test.HadoopFileSystemTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/hadoopfilesystem_test.py",
>  line 506, in test_delete_dir
>  self.fs.delete([url_t1])
>File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/hadoopfilesystem.py",
>  line 370, in delete
>  raise BeamIOError("Delete operation failed", exceptions)
>  apache_beam.io.filesystem.BeamIOError: Delete operation failed with 
> exceptions {'hdfs://test_dir/new_dir1': RuntimeError('dictionary changed size 
> during iteration',   )}





[jira] [Work logged] (BEAM-5636) Java support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5636?focusedWorklogId=154080&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154080
 ]

ASF GitHub Bot logged work on BEAM-5636:


Author: ASF GitHub Bot
Created on: 13/Oct/18 02:44
Start Date: 13/Oct/18 02:44
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #6665: [BEAM-5636] Java 
support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6665#issuecomment-429505015
 
 
   @herohde this PR is ready to merge.




Issue Time Tracking
---

Worklog Id: (was: 154080)
Time Spent: 1h 20m  (was: 1h 10m)

> Java support for custom dataflow worker jar
> ---
>
> Key: BEAM-5636
> URL: https://issues.apache.org/jira/browse/BEAM-5636
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Henning Rohde
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Java jobs. That requires a change to the Java boot 
> code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/java/container/boot.go#L107





[jira] [Commented] (BEAM-5744) Investigate negative numbers represented as 'long' in Python SDK + Direct runner

2018-10-12 Thread Ahmet Altay (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648703#comment-16648703
 ] 

Ahmet Altay commented on BEAM-5744:
---

Should we revert this until we figure out how to fix it?

> Investigate negative numbers represented as 'long' in Python SDK + Direct 
> runner
> 
>
> Key: BEAM-5744
> URL: https://issues.apache.org/jira/browse/BEAM-5744
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.8.0
>
>






[jira] [Commented] (BEAM-5628) Several VcfIO tests fail in Python 3 with TypeError: cannot use a string pattern on a bytes-like object

2018-10-12 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648695#comment-16648695
 ] 

Valentyn Tymofieiev commented on BEAM-5628:
---

cc [~arostami], one of VCF IO authors.

> Several VcfIO tests fail in Python 3 with  TypeError: cannot use a string 
> pattern on a bytes-like object
> 
>
> Key: BEAM-5628
> URL: https://issues.apache.org/jira/browse/BEAM-5628
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Simon
>Priority: Major
>
> ERROR: test_read_after_splitting (apache_beam.io.vcfio_test.VcfSourceTest)
> --
> Traceback (most recent call last):
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio_test.py", line 336, in test_read_after_splitting
>     split_records.extend(source_test_utils.read_from_source(*source_info))
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/source_test_utils.py", line 101, in read_from_source
>     for value in reader:
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio.py", line 264, in read_records
>     for line in record_iterator:
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio.py", line 330, in __next__
>     record = next(self._vcf_reader)
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/site-packages/vcf/parser.py", line 543, in __next__
>     row = self._row_pattern.split(line.rstrip())
> TypeError: cannot use a string pattern on a bytes-like object
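
The TypeError at the bottom of the traceback is what Python 3's `re` module
raises when a pattern compiled from `str` is applied to `bytes`. The sketch
below shows that mismatch and one possible workaround (decoding before
splitting). It is not PyVCF's or Beam's code: `row_pattern` and `split_row`
are hypothetical names, and the UTF-8 encoding is an assumption.

```python
import re

# Hypothetical stand-in for a row-splitting pattern compiled from a str.
row_pattern = re.compile(r'\t')


def split_row(line):
    """Splits a tab-separated line, accepting either bytes or str input."""
    # A str pattern cannot be applied to bytes on Python 3, so decode first.
    if isinstance(line, bytes):
        line = line.decode('utf-8')  # assumed encoding
    return row_pattern.split(line.rstrip())
```

Calling `row_pattern.split(...)` directly on a bytes line reproduces the error
from the report, while `split_row` accepts both input types. Whether the
decode belongs in the caller or in the parsing library is the open question
raised in the comments on this issue.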





[jira] [Comment Edited] (BEAM-5628) Several VcfIO tests fail in Python 3 with TypeError: cannot use a string pattern on a bytes-like object

2018-10-12 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648694#comment-16648694
 ] 

Valentyn Tymofieiev edited comment on BEAM-5628 at 10/13/18 2:20 AM:
-

Looks like VCF IO is using another package (PyVCF). We need to understand 
whether the issue is because we don't use PyVCF correctly (VCF IO issue), or 
because PyVCF itself is not Python3-compatible as it claims to be.


was (Author: tvalentyn):
Looks like VCF IO is using another package (PyVCF). We need to understand 
whether the issue is because we don't use PyVCF correctly (VCF IO issue), or 
because PyVCF itself is not Python3-compatible.

> Several VcfIO tests fail in Python 3 with  TypeError: cannot use a string 
> pattern on a bytes-like object
> 
>
> Key: BEAM-5628
> URL: https://issues.apache.org/jira/browse/BEAM-5628
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Simon
>Priority: Major
>
> ERROR: test_read_after_splitting (apache_beam.io.vcfio_test.VcfSourceTest)
> --
> Traceback (most recent call last):
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio_test.py", line 336, in test_read_after_splitting
>     split_records.extend(source_test_utils.read_from_source(*source_info))
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/source_test_utils.py", line 101, in read_from_source
>     for value in reader:
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio.py", line 264, in read_records
>     for line in record_iterator:
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio.py", line 330, in __next__
>     record = next(self._vcf_reader)
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/site-packages/vcf/parser.py", line 543, in __next__
>     row = self._row_pattern.split(line.rstrip())
> TypeError: cannot use a string pattern on a bytes-like object





[jira] [Commented] (BEAM-5628) Several VcfIO tests fail in Python 3 with TypeError: cannot use a string pattern on a bytes-like object

2018-10-12 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648694#comment-16648694
 ] 

Valentyn Tymofieiev commented on BEAM-5628:
---

Looks like VCF IO is using another package (PyVCF). We need to understand 
whether the issue is because we don't use PyVCF correctly (VCF IO issue), or 
because PyVCF itself is not Python3-compatible.

> Several VcfIO tests fail in Python 3 with  TypeError: cannot use a string 
> pattern on a bytes-like object
> 
>
> Key: BEAM-5628
> URL: https://issues.apache.org/jira/browse/BEAM-5628
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Simon
>Priority: Major
>
> ERROR: test_read_after_splitting (apache_beam.io.vcfio_test.VcfSourceTest)
> --
> Traceback (most recent call last):
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio_test.py", line 336, in test_read_after_splitting
>     split_records.extend(source_test_utils.read_from_source(*source_info))
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/source_test_utils.py", line 101, in read_from_source
>     for value in reader:
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio.py", line 264, in read_records
>     for line in record_iterator:
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/vcfio.py", line 330, in __next__
>     record = next(self._vcf_reader)
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/site-packages/vcf/parser.py", line 543, in __next__
>     row = self._row_pattern.split(line.rstrip())
> TypeError: cannot use a string pattern on a bytes-like object





[jira] [Commented] (BEAM-5623) Several IO tests hang indefinitely during execution on Python 3.

2018-10-12 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648693#comment-16648693
 ] 

Valentyn Tymofieiev commented on BEAM-5623:
---

What exactly happens when the test is hanging?

> Several IO tests hang indefinitely during execution on Python 3.
> 
>
> Key: BEAM-5623
> URL: https://issues.apache.org/jira/browse/BEAM-5623
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> test_read_empty_single_file_no_eol_gzip 
> (apache_beam.io.textio_test.TextSourceTest) 
> Also several tests cases in tfrecordio_test, for example:
> test_process_auto (apache_beam.io.tfrecordio_test.TestReadAllFromTFRecord)





[jira] [Commented] (BEAM-5744) Investigate negative numbers represented as 'long' in Python SDK + Direct runner

2018-10-12 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648688#comment-16648688
 ] 

Valentyn Tymofieiev commented on BEAM-5744:
---

cc [~altay] [~charleschen] [~juta] [~robertwb]

> Investigate negative numbers represented as 'long' in Python SDK + Direct 
> runner
> 
>
> Key: BEAM-5744
> URL: https://issues.apache.org/jira/browse/BEAM-5744
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.8.0
>
>






[jira] [Updated] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-10-12 Thread Thomas Weise (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise updated BEAM-5442:
---
Fix Version/s: 2.9.0

> PortableRunner swallows custom options for Runner
> -
>
> Key: BEAM-5442
> URL: https://issues.apache.org/jira/browse/BEAM-5442
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability, portability-flink
> Fix For: 2.9.0
>
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> The PortableRunner doesn't pass custom PipelineOptions to the executing 
> Runner.
> Example: {{--parallelism=4}} won't be forwarded to the FlinkRunner.
> (The option is just removed during proto translation without any warning)
> We should allow some form of customization through the options, even for the 
> PortableRunner. 





[jira] [Commented] (BEAM-5744) Investigate negative numbers represented as 'long' in Python SDK + Direct runner

2018-10-12 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648681#comment-16648681
 ] 

Valentyn Tymofieiev commented on BEAM-5744:
---

https://github.com/apache/beam/pull/6685 illustrates the failure.

18:13:43 ==
18:13:43 ERROR: test_assert_that_passes_order_does_not_matter_with_negatives 
(apache_beam.testing.util_test.UtilTest)
18:13:43 --
18:13:43 Traceback (most recent call last):
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/testing/util_test.py",
 line 47, in test_assert_that_passes_order_does_not_matter_with_negatives
18:13:43 assert_that(p | Create([1, -2, 3]), equal_to([-2, 1, 3]))
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/pipeline.py",
 line 423, in __exit__
18:13:43 self.run().wait_until_finish()
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/testing/test_pipeline.py",
 line 104, in run
18:13:43 result = super(TestPipeline, self).run(test_runner_api)
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/pipeline.py",
 line 403, in run
18:13:43 self.to_runner_api(), self.runner, self._options).run(False)
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/pipeline.py",
 line 416, in run
18:13:43 return self.runner.run_pipeline(self)
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/direct/direct_runner.py",
 line 138, in run_pipeline
18:13:43 return runner.run_pipeline(pipeline)
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
 line 231, in run_pipeline
18:13:43 return self.run_via_runner_api(pipeline.to_runner_api())
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
 line 234, in run_via_runner_api
18:13:43 return self.run_stages(*self.create_stages(pipeline_proto))
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
 line 1017, in run_stages
18:13:43 pcoll_buffers, safe_coders)
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
 line 1137, in run_stage
18:13:43 self._progress_frequency).process_bundle(data_input, data_output)
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
 line 1393, in process_bundle
18:13:43 result_future = 
self._controller.control_handler.push(process_bundle)
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
 line 1265, in push
18:13:43 response = self.worker.do_instruction(request)
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/worker/sdk_worker.py",
 line 212, in do_instruction
18:13:43 request.instruction_id)
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/worker/sdk_worker.py",
 line 234, in process_bundle
18:13:43 processor.process_bundle(instruction_id)
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/worker/bundle_processor.py",
 line 420, in process_bundle
18:13:43 ].process_encoded(data.data)
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/worker/bundle_processor.py",
 line 125, in process_encoded
18:13:43 self.output(decoded_value)
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/worker/operations.py",
 line 169, in output
18:13:43 cython.cast(Receiver, 
self.receivers[output_index]).receive(windowed_value)
18:13:43   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/worker/operations.py",
 line 89, in receive
18:13:43 cython.cast(Operation, consumer).process(windowe

[jira] [Updated] (BEAM-2953) Timeseries processing extensions using state API

2018-10-12 Thread Reza ardeshir rokni (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reza ardeshir rokni updated BEAM-2953:
--
Summary: Timeseries processing extensions using state API  (was: Create 
more advanced Timeseries processing examples using state API)

> Timeseries processing extensions using state API
> 
>
> Key: BEAM-2953
> URL: https://issues.apache.org/jira/browse/BEAM-2953
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-ideas
>Affects Versions: 2.7.0
>Reporter: Reza ardeshir rokni
>Assignee: Reuven Lax
>Priority: Minor
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> A general set of timeseries transforms that shield the user from some of the 
> common problems of processing timeseries with BEAM (in stream or batch mode).
> BEAM can be used to build out some very interesting pre-processing stages for 
> time series data. Some examples that will be useful:
>  - Downsampling time series based on simple MIN, MAX, COUNT, SUM, LAST, FIRST
>  - Creating a value for each downsampled window even if no value has been 
> emitted for the specific key. 
>  - Loading the value of a downsample with the previous value (used in FX with 
> previous close being brought into current open value)
>  
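The downsampling ideas listed above can be illustrated outside of Beam with a small, plain-Python sketch. This is not the proposed extension and uses no Beam or state API; the function name `downsample`, the row layout and the gap-filling policy (carry the previous window's last value forward) are assumptions made only for illustration.

```python
def downsample(points, window_size, fill_gaps=True):
    """Downsample (timestamp, value) pairs into fixed windows.

    Hypothetical helper, not part of Beam. Timestamps are integers and
    window_size is in the same unit. Each output row carries MIN, MAX,
    COUNT, SUM, FIRST and LAST, plus the previous window's LAST value
    (the "previous close" mentioned in the issue).
    """
    buckets = {}
    # Sort by timestamp only; the sort is stable, so ties keep input order.
    for ts, value in sorted(points, key=lambda p: p[0]):
        start = ts - ts % window_size
        buckets.setdefault(start, []).append(value)
    if not buckets:
        return []

    rows = []
    previous_last = None
    for start in range(min(buckets), max(buckets) + window_size, window_size):
        values = buckets.get(start)
        if values:
            row = {
                "window_start": start,
                "min": min(values),
                "max": max(values),
                "count": len(values),
                "sum": sum(values),
                "first": values[0],
                "last": values[-1],
                "previous_last": previous_last,
                "filled": False,
            }
        elif fill_gaps:
            # No value was emitted in this window: carry the last value forward.
            row = {
                "window_start": start,
                "min": previous_last,
                "max": previous_last,
                "count": 0,
                "sum": 0,
                "first": previous_last,
                "last": previous_last,
                "previous_last": previous_last,
                "filled": True,
            }
        else:
            continue
        rows.append(row)
        previous_last = row["last"]
    return rows
```

Calling `downsample([(0, 1.0), (4, 3.0), (25, 2.0)], 10)` yields three rows, with the middle one (window 10) marked as filled. A streaming implementation would additionally need timers or state to emit the filled windows, which this batch-style sketch does not attempt.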





[jira] [Work logged] (BEAM-5637) Python support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5637?focusedWorklogId=154078&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154078
 ]

ASF GitHub Bot logged work on BEAM-5637:


Author: ASF GitHub Bot
Created on: 13/Oct/18 01:29
Start Date: 13/Oct/18 01:29
Worklog Time Spent: 10m 
  Work Description: HuangLED edited a comment on issue #6680: [BEAM-5637] 
Python support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6680#issuecomment-429468108
 
 
   R:  @herohde 
   cc: @boyuanzz @pabloem 
   
   Addressed.  Also, option definition moved to WorkerOptions per Pablo's 
suggestion. 
   
   Thanks to Boyuan for pointing out the right place for error message. 
   
   




Issue Time Tracking
---

Worklog Id: (was: 154078)
Time Spent: 2.5h  (was: 2h 20m)

> Python support for custom dataflow worker jar
> -
>
> Key: BEAM-5637
> URL: https://issues.apache.org/jira/browse/BEAM-5637
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Python jobs. That requires a change to the Python 
> boot code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/python/container/boot.go#L104





[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154077&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154077
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 13/Oct/18 01:03
Start Date: 13/Oct/18 01:03
Worklog Time Spent: 10m 
  Work Description: mwylde commented on a change in pull request #6637: 
[BEAM-5707] Add a periodic, streaming impulse source for Flink portable 
pipelines
URL: https://github.com/apache/beam/pull/6637#discussion_r224944539
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
 ##
 @@ -406,6 +417,56 @@ private void translateImpulse(
 
context.addDataStream(Iterables.getOnlyElement(pTransform.getOutputsMap().values()),
 source);
   }
 
+  @AutoService(NativeTransforms.IsNativeTransform.class)
+  public static class IsFlinkNativeTransform implements 
NativeTransforms.IsNativeTransform {
+@Override
+public boolean test(RunnerApi.PTransform pTransform) {
+  return 
STREAMING_IMPULSE_TRANSFORM_URL.equals(PTransformTranslation.urnForTransformOrNull(pTransform));
+}
+  }
+
+  private void translateStreamingImpulse(
+  String id, RunnerApi.Pipeline pipeline, StreamingTranslationContext 
context) {
+RunnerApi.PTransform pTransform = 
pipeline.getComponents().getTransformsOrThrow(id);
+
+ObjectMapper objectMapper = new ObjectMapper();
+
+int intervalMillis;
+int messageCount;
+try {
+  JsonNode config = 
objectMapper.readTree(pTransform.getSpec().getPayload().toByteArray());
+  intervalMillis = config.path("interval_ms").asInt(100);
+  messageCount = config.path("message_count").asInt(0);
+} catch (IOException e) {
+throw new RuntimeException("Failed to parse configuration for 
streaming impulse", e);
+}
+
+DataStreamSource<WindowedValue<byte[]>> source =
+context
+.getExecutionEnvironment()
+.addSource(
+new RichParallelSourceFunction<WindowedValue<byte[]>>() {
+  private AtomicBoolean cancelled = new AtomicBoolean(false);
+  private AtomicLong count = new AtomicLong();
+
+  @Override
+  public void run(SourceContext<WindowedValue<byte[]>> ctx) 
throws Exception {
+while (!cancelled.get() && (messageCount == 0 || 
count.getAndIncrement() < messageCount)) {
+  ctx.collect(WindowedValue.valueInGlobalWindow(new byte[] 
{}));
+  Thread.sleep(intervalMillis);
 
 Review comment:
    




Issue Time Tracking
---

Worklog Id: (was: 154077)
Time Spent: 2h 20m  (was: 2h 10m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].
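The behaviour described above (emit an empty byte array at a configurable interval, optionally stopping after a fixed number of messages) can be sketched in plain Python. This is only an illustration of the loop in the quoted diff, not the Flink source itself: the function names are hypothetical, and the `interval_ms` / `message_count` keys and their defaults (100 and 0, where 0 means unbounded) are taken from the diff.

```python
import json
import threading
import time


def parse_impulse_config(payload):
    """Read interval_ms and message_count from a JSON payload given as bytes.

    Hypothetical helper. Missing keys fall back to the defaults used in
    the quoted diff: 100 ms and 0 (0 meaning "no message limit").
    """
    config = json.loads(payload.decode("utf-8")) if payload else {}
    return int(config.get("interval_ms", 100)), int(config.get("message_count", 0))


def periodic_impulse(interval_ms, message_count, cancelled=None, sleep=time.sleep):
    """Yield b"" every interval_ms until cancelled or message_count is reached."""
    if cancelled is None:
        cancelled = threading.Event()
    emitted = 0
    while not cancelled.is_set() and (message_count == 0 or emitted < message_count):
        yield b""
        emitted += 1
        sleep(interval_ms / 1000.0)


if __name__ == "__main__":
    interval_ms, message_count = parse_impulse_config(
        b'{"interval_ms": 10, "message_count": 3}')
    for impulse in periodic_impulse(interval_ms, message_count):
        print(repr(impulse))
```

Passing a `threading.Event` as `cancelled` plays the role of the `cancelled` flag in the diff; setting it from another thread ends the generator after the current sleep.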





[jira] [Commented] (BEAM-5744) Investigate negative numbers represented as 'long' in Python SDK + Direct runner

2018-10-12 Thread Valentyn Tymofieiev (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648659#comment-16648659
 ] 

Valentyn Tymofieiev commented on BEAM-5744:
---

It appears that negative numbers are sometimes represented as `long`, which I 
observed in https://github.com/apache/beam/pull/6602#issuecomment-429468709. 
I opened this issue to investigate whether this is a regression in 2.8.0, 
since such behavior would be incompatible with 
https://github.com/apache/beam/pull/6602 on Python 2, and 
https://github.com/apache/beam/pull/6602 is included in the 2.8.0 release.

> Investigate negative numbers represented as 'long' in Python SDK + Direct 
> runner
> 
>
> Key: BEAM-5744
> URL: https://issues.apache.org/jira/browse/BEAM-5744
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.8.0
>
>






[jira] [Work logged] (BEAM-5621) Several tests fail on Python 3 with TypeError: unorderable types: str() < int()

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5621?focusedWorklogId=154075&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154075
 ]

ASF GitHub Bot logged work on BEAM-5621:


Author: ASF GitHub Bot
Created on: 13/Oct/18 00:54
Start Date: 13/Oct/18 00:54
Worklog Time Spent: 10m 
  Work Description: tvalentyn edited a comment on issue #6602: [BEAM-5621] 
Fix unorderable types in python 3
URL: https://github.com/apache/beam/pull/6602#issuecomment-429497899
 
 
   I was not able to reproduce the issue where negative numbers are represented 
as `long` type I mentioned in 
https://github.com/apache/beam/pull/6602#issuecomment-429468709 using a fresh 
installation of apache-beam==2.7.0, so I instead opened 
https://issues.apache.org/jira/browse/BEAM-5744, to investigate if this is a 
regression in 2.8.0.




Issue Time Tracking
---

Worklog Id: (was: 154075)
Time Spent: 3h  (was: 2h 50m)

> Several tests fail on Python 3 with TypeError: unorderable types: str() < 
> int()
> ---
>
> Key: BEAM-5621
> URL: https://issues.apache.org/jira/browse/BEAM-5621
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Juta Staes
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> ==
> ERROR: test_remove_duplicates 
> (apache_beam.transforms.ptransform_test.PTransformTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 677, in process
> self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 414, in invoke_process
> windowed_value, self.process_method(windowed_value.value))
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1068, in <lambda>
> wrapper = lambda x: [fn(x)]
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
>  line 115, in _equal
> sorted_expected = sorted(expected)
> TypeError: unorderable types: str() < int()
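The TypeError in the traceback above comes from calling sorted() on a list that mixes str and int, which Python 3 refuses to order. The sketch below reproduces the failure and shows one possible workaround, a sort key that groups values by type name first. It is only an illustration and is not claimed to be the fix adopted in the pull request; `sort_mixed` is a hypothetical name. Newer Python 3 versions word the error as "'<' not supported between instances of 'str' and 'int'".

```python
def sort_mixed(values):
    """Order values of mixed types deterministically by (type name, value).

    Hypothetical helper for illustration. Values of different types are
    never compared with each other, so the TypeError cannot occur.
    """
    return sorted(values, key=lambda v: (type(v).__name__, v))


mixed = ["a", 1, "b", 2]

try:
    sorted(mixed)  # Python 3 cannot compare str with int.
except TypeError as err:
    print("plain sorted() failed:", err)

print(sort_mixed(mixed))
```

One caveat of this particular key: numbers of different types (for example int and float) land in separate groups, so the result is not in numeric order across types.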





[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154074&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154074
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 13/Oct/18 00:54
Start Date: 13/Oct/18 00:54
Worklog Time Spent: 10m 
  Work Description: mwylde commented on a change in pull request #6637: 
[BEAM-5707] Add a periodic, streaming impulse source for Flink portable 
pipelines
URL: https://github.com/apache/beam/pull/6637#discussion_r224944203
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
 ##
 @@ -406,6 +417,56 @@ private void translateImpulse(
 
context.addDataStream(Iterables.getOnlyElement(pTransform.getOutputsMap().values()),
 source);
   }
 
+  @AutoService(NativeTransforms.IsNativeTransform.class)
+  public static class IsFlinkNativeTransform implements 
NativeTransforms.IsNativeTransform {
+@Override
+public boolean test(RunnerApi.PTransform pTransform) {
+  return 
STREAMING_IMPULSE_TRANSFORM_URL.equals(PTransformTranslation.urnForTransformOrNull(pTransform));
+}
+  }
+
+  private void translateStreamingImpulse(
+  String id, RunnerApi.Pipeline pipeline, StreamingTranslationContext 
context) {
+RunnerApi.PTransform pTransform = 
pipeline.getComponents().getTransformsOrThrow(id);
+
+ObjectMapper objectMapper = new ObjectMapper();
+
+int intervalMillis;
+int messageCount;
+try {
+  JsonNode config = 
objectMapper.readTree(pTransform.getSpec().getPayload().toByteArray());
+  intervalMillis = config.path("interval_ms").asInt(100);
+  messageCount = config.path("message_count").asInt(0);
+} catch (IOException e) {
+throw new RuntimeException("Failed to parse configuration for 
streaming impulse", e);
+}
+
+DataStreamSource<WindowedValue<byte[]>> source =
+context
+.getExecutionEnvironment()
+.addSource(
+new RichParallelSourceFunction<WindowedValue<byte[]>>() {
+  private AtomicBoolean cancelled = new AtomicBoolean(false);
+  private AtomicLong count = new AtomicLong();
 
 Review comment:
   Good to know that this will never be called concurrently. I'll change this 
to a long.




Issue Time Tracking
---

Worklog Id: (was: 154074)
Time Spent: 2h 10m  (was: 2h)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].





[jira] [Work logged] (BEAM-5621) Several tests fail on Python 3 with TypeError: unorderable types: str() < int()

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5621?focusedWorklogId=154073&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154073
 ]

ASF GitHub Bot logged work on BEAM-5621:


Author: ASF GitHub Bot
Created on: 13/Oct/18 00:53
Start Date: 13/Oct/18 00:53
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #6602: [BEAM-5621] Fix 
unorderable types in python 3
URL: https://github.com/apache/beam/pull/6602#issuecomment-429497899
 
 
   I was not able to reproduce 
https://github.com/apache/beam/pull/6602#issuecomment-429468709 using a fresh 
installation of apache-beam==2.7.0, so I instead opened 
https://issues.apache.org/jira/browse/BEAM-5744, to investigate if this is a 
regression in 2.8.0.




Issue Time Tracking
---

Worklog Id: (was: 154073)
Time Spent: 2h 50m  (was: 2h 40m)

> Several tests fail on Python 3 with TypeError: unorderable types: str() < 
> int()
> ---
>
> Key: BEAM-5621
> URL: https://issues.apache.org/jira/browse/BEAM-5621
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Juta Staes
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> ==
> ERROR: test_remove_duplicates 
> (apache_beam.transforms.ptransform_test.PTransformTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 677, in process
> self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 414, in invoke_process
> windowed_value, self.process_method(windowed_value.value))
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1068, in <lambda>
> wrapper = lambda x: [fn(x)]
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
>  line 115, in _equal
> sorted_expected = sorted(expected)
> TypeError: unorderable types: str() < int()





[jira] [Updated] (BEAM-5744) Investigate negative numbers represented as 'long' in Python SDK + Direct runner

2018-10-12 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-5744:
--
Summary: Investigate negative numbers represented as 'long' in Python SDK + 
Direct runner  (was: Investigate negative numbers investigated as 'long' in 
Python SDK + Direct runner)

> Investigate negative numbers represented as 'long' in Python SDK + Direct 
> runner
> 
>
> Key: BEAM-5744
> URL: https://issues.apache.org/jira/browse/BEAM-5744
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.8.0
>
>






[jira] [Created] (BEAM-5744) Investigate negative numbers investigated as 'long' in Python SDK + Direct runner

2018-10-12 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-5744:
-

 Summary: Investigate negative numbers investigated as 'long' in 
Python SDK + Direct runner
 Key: BEAM-5744
 URL: https://issues.apache.org/jira/browse/BEAM-5744
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev
Assignee: Ahmet Altay
 Fix For: 2.8.0








[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154072&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154072
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 13/Oct/18 00:51
Start Date: 13/Oct/18 00:51
Worklog Time Spent: 10m 
  Work Description: mwylde commented on a change in pull request #6637: 
[BEAM-5707] Add a periodic, streaming impulse source for Flink portable 
pipelines
URL: https://github.com/apache/beam/pull/6637#discussion_r224944097
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
 ##
 @@ -406,6 +417,56 @@ private void translateImpulse(
 
context.addDataStream(Iterables.getOnlyElement(pTransform.getOutputsMap().values()),
 source);
   }
 
+  @AutoService(NativeTransforms.IsNativeTransform.class)
+  public static class IsFlinkNativeTransform implements 
NativeTransforms.IsNativeTransform {
+@Override
+public boolean test(RunnerApi.PTransform pTransform) {
+  return 
STREAMING_IMPULSE_TRANSFORM_URL.equals(PTransformTranslation.urnForTransformOrNull(pTransform));
+}
+  }
+
+  private void translateStreamingImpulse(
+  String id, RunnerApi.Pipeline pipeline, StreamingTranslationContext 
context) {
+RunnerApi.PTransform pTransform = 
pipeline.getComponents().getTransformsOrThrow(id);
+
+ObjectMapper objectMapper = new ObjectMapper();
+
+int intervalMillis;
+int messageCount;
+try {
+  JsonNode config = 
objectMapper.readTree(pTransform.getSpec().getPayload().toByteArray());
+  intervalMillis = config.path("interval_ms").asInt(100);
+  messageCount = config.path("message_count").asInt(0);
+} catch (IOException e) {
+throw new RuntimeException("Failed to parse configuration for 
streaming impulse", e);
+}
+
+DataStreamSource<WindowedValue<byte[]>> source =
+context
+.getExecutionEnvironment()
+.addSource(
+new RichParallelSourceFunction<WindowedValue<byte[]>>() {
 
 Review comment:
    I've moved it to 
org.apache.beam.runners.flink.translation.wrappers.streaming.io




Issue Time Tracking
---

Worklog Id: (was: 154072)
Time Spent: 2h  (was: 1h 50m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5744) Investigate negative numbers investigated as 'long' in Python SDK + Direct runner

2018-10-12 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev reassigned BEAM-5744:
-

Assignee: Valentyn Tymofieiev  (was: Ahmet Altay)

> Investigate negative numbers investigated as 'long' in Python SDK + Direct 
> runner
> -
>
> Key: BEAM-5744
> URL: https://issues.apache.org/jira/browse/BEAM-5744
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.8.0
>
>






[jira] [Work logged] (BEAM-5621) Several tests fail on Python 3 with TypeError: unorderable types: str() < int()

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5621?focusedWorklogId=154071&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154071
 ]

ASF GitHub Bot logged work on BEAM-5621:


Author: ASF GitHub Bot
Created on: 13/Oct/18 00:49
Start Date: 13/Oct/18 00:49
Worklog Time Spent: 10m 
  Work Description: tvalentyn edited a comment on issue #6602: [BEAM-5621] 
Fix unorderable types in python 3
URL: https://github.com/apache/beam/pull/6602#issuecomment-429478105
 
 
   Unfortunately I think 
https://github.com/apache/beam/pull/6602#issuecomment-429468709 may be a very 
unintuitive change, so we need to roll it back and either fix the underlying 
issue with typing of negative numbers or proceed with a different solution 
here. We would need to cherry-pick the change into the release branch, so I'll 
mark BEAM-5621 as release blocker until cherry-pick is in.




Issue Time Tracking
---

Worklog Id: (was: 154071)
Time Spent: 2h 40m  (was: 2.5h)

> Several tests fail on Python 3 with TypeError: unorderable types: str() < 
> int()
> ---
>
> Key: BEAM-5621
> URL: https://issues.apache.org/jira/browse/BEAM-5621
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Juta Staes
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> ==
> ERROR: test_remove_duplicates 
> (apache_beam.transforms.ptransform_test.PTransformTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 677, in process
> self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 414, in invoke_process
> windowed_value, self.process_method(windowed_value.value))
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1068, in <lambda>
> wrapper = lambda x: [fn(x)]
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
>  line 115, in _equal
> sorted_expected = sorted(expected)
> TypeError: unorderable types: str() < int()



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5615) Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword argument for this function

2018-10-12 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-5615:
--
Affects Version/s: (was: 2.8.0)

> Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword 
> argument for this function
> -
>
> Key: BEAM-5615
> URL: https://issues.apache.org/jira/browse/BEAM-5615
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-harness
>Reporter: Valentyn Tymofieiev
>Assignee: Juta Staes
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> ERROR: test_top (apache_beam.transforms.combiners_test.CombineTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners_test.py",
>  line 89, in test_top
> names)  # Note parameter passed to comparator.
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
>  line 111, in __or__
> return self.pipeline.apply(ptransform, self)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 467, in apply
> label or transform.label)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 477, in apply
> return self.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 513, in apply
> pvalueish_result = self.runner.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 193, in apply
> return m(transform, input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 199, in apply_PTransform
> return transform.expand(input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 759, in expand
> return self._fn(pcoll, *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 185, in Of
> TopCombineFn(n, compare, key, reverse), *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
>  line 111, in __or__
> return self.pipeline.apply(ptransform, self)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 513, in apply
> pvalueish_result = self.runner.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 193, in apply
> return m(transform, input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 199, in apply_PTransform
> return transform.expand(input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1251, in expand
> default_value = combine_fn.apply([], *self.args, **self.kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 623, in apply
> *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 362, in extract_output
> self._sort_buffer(buffer, lt)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 295, in _sort_buffer
> key=self._key_fn)
> TypeError: 'cmp' is an invalid keyword argument for this function
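
For context: list.sort() and sorted() accept only key and reverse in Python 3, so passing cmp raises the TypeError above. A minimal sketch of the usual migration, using functools.cmp_to_key from the standard library, is shown below. The function name sort_buffer and its signature are assumptions for illustration and do not reproduce the code in combiners.py.

```python
from functools import cmp_to_key


def sort_buffer(buffer, compare=None, key=None):
    """Sort a list in place with an optional old-style comparator and key."""
    if compare is None:
        buffer.sort(key=key)
        return
    # cmp_to_key adapts a cmp(a, b) -> negative/zero/positive function
    # into an object usable as a sort key.
    wrap = cmp_to_key(compare)
    project = key if key is not None else (lambda x: x)
    buffer.sort(key=lambda x: wrap(project(x)))
```

The comparator is applied to the projected value, which mirrors how Python 2 combined cmp and key.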





[jira] [Work logged] (BEAM-5714) RedisIO emit error of EXEC without MULTI

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5714?focusedWorklogId=154070&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154070
 ]

ASF GitHub Bot logged work on BEAM-5714:


Author: ASF GitHub Bot
Created on: 13/Oct/18 00:47
Start Date: 13/Oct/18 00:47
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #6651: [BEAM-5714] Fix 
RedisIO EXEC without MULTI error
URL: https://github.com/apache/beam/pull/6651#issuecomment-429497451
 
 
   cc: @vvarma might have an opinion




Issue Time Tracking
---

Worklog Id: (was: 154070)
Time Spent: 40m  (was: 0.5h)

> RedisIO emit error of EXEC without MULTI
> 
>
> Key: BEAM-5714
> URL: https://issues.apache.org/jira/browse/BEAM-5714
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-redis
>Affects Versions: 2.7.0
>Reporter: K.K. POON
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> RedisIO has EXEC without MULTI error after SET a batch of records.
>  
> By looking at the source code, I guess there is missing `pipeline.multi();` 
> after exec() the last batch.
> [https://github.com/apache/beam/blob/master/sdks/java/io/redis/src/main/java/org/apache/beam/sdk/io/redis/RedisIO.java#L555]
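
The reporter's guess can be illustrated with a toy model. The sketch below is not the Jedis API and not RedisIO's code; ToyTransaction and write_in_batches are hypothetical names that only mimic the MULTI/EXEC state machine, assuming the writer flushes every batch_size records.

```python
class ToyTransaction:
    """Toy stand-in for a Redis pipeline (hypothetical, not Jedis)."""

    def __init__(self):
        self.in_multi = False
        self.queued = []
        self.committed = []

    def multi(self):
        self.in_multi = True

    def set(self, key, value):
        self.queued.append((key, value))

    def exec(self):
        if not self.in_multi:
            # Same error text the Redis server returns in this situation.
            raise RuntimeError("ERR EXEC without MULTI")
        self.committed.extend(self.queued)
        self.queued = []
        self.in_multi = False


def write_in_batches(tx, records, batch_size):
    """Write records, committing a transaction every batch_size records."""
    tx.multi()
    pending = 0
    for key, value in records:
        tx.set(key, value)
        pending += 1
        if pending >= batch_size:
            tx.exec()
            tx.multi()  # reopen; omitting this reproduces the reported error
            pending = 0
    if tx.in_multi:
        tx.exec()  # flush the final, possibly empty, batch
```

Removing the tx.multi() call after tx.exec() makes the next exec() raise, which matches the behaviour described in the issue.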





[jira] [Work logged] (BEAM-5707) Add a portable Flink streaming synthetic source for testing

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5707?focusedWorklogId=154069&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154069
 ]

ASF GitHub Bot logged work on BEAM-5707:


Author: ASF GitHub Bot
Created on: 13/Oct/18 00:42
Start Date: 13/Oct/18 00:42
Worklog Time Spent: 10m 
  Work Description: mwylde commented on a change in pull request #6637: 
[BEAM-5707] Add a periodic, streaming impulse source for Flink portable 
pipelines
URL: https://github.com/apache/beam/pull/6637#discussion_r224943673
 
 

 ##
 File path: 
runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java
 ##
 @@ -63,6 +64,8 @@
   public static final String GROUP_BY_KEY_TRANSFORM_URN =
   getUrn(StandardPTransforms.Primitives.GROUP_BY_KEY);
   public static final String IMPULSE_TRANSFORM_URN = 
getUrn(StandardPTransforms.Primitives.IMPULSE);
+  public static final String STREAMING_IMPULSE_TRANSFORM_URL = 
"flink:transform:streaming_impulse:v1";
 
 Review comment:
    moved to FlinkStreamingPortablePipelineTranslator




Issue Time Tracking
---

Worklog Id: (was: 154069)
Time Spent: 1h 50m  (was: 1h 40m)

> Add a portable Flink streaming synthetic source for testing
> ---
>
> Key: BEAM-5707
> URL: https://issues.apache.org/jira/browse/BEAM-5707
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Micah Wylde
>Assignee: Aljoscha Krettek
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently there are no built-in streaming sources for portable pipelines. 
> This makes it hard to test streaming functionality in the Python SDK.
> It would be very useful to add a periodic impulse source that (with some 
> configurable frequency) outputs an empty byte array, which can then be 
> transformed as desired inside the python pipeline. More context in this 
> [mailing list 
> discussion|https://lists.apache.org/thread.html/b44a648ab1d0cb200d8bfe4b280e9dad6368209c4725609cbfbbe410@%3Cdev.beam.apache.org%3E].
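
The behaviour requested above (emit an empty byte array at a configurable frequency) can be sketched in plain Python. This is only an illustration of the idea, not the Flink source added in the pull request; periodic_impulse, max_count, and the injectable sleep parameter are assumptions made so the sketch can be tested.

```python
import itertools
import time


def periodic_impulse(interval_seconds, max_count=None, sleep=time.sleep):
    """Yield an empty byte string, then wait interval_seconds, repeatedly.

    max_count bounds the stream for testing; None means unbounded.
    """
    ticks = itertools.count() if max_count is None else range(max_count)
    for _ in ticks:
        yield b""
        sleep(interval_seconds)
```

A downstream step would then map each empty element to whatever synthetic payload the test needs.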





[jira] [Reopened] (BEAM-5539) Beam Dependency Update Request: google-cloud-pubsub

2018-10-12 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri reopened BEAM-5539:
-

PR was rolled back

> Beam Dependency Update Request: google-cloud-pubsub
> ---
>
> Key: BEAM-5539
> URL: https://issues.apache.org/jira/browse/BEAM-5539
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Udi Meiri
>Priority: Major
> Fix For: Not applicable
>
>
>  - 2018-10-01 19:17:59.633423 
> -
> Please consider upgrading the dependency google-cloud-pubsub. 
> The current version is 0.26.0. The latest version is 0.38.0 
> cc: [~markflyhigh], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2018-10-08 12:11:22.339342 
> -
> Please consider upgrading the dependency google-cloud-pubsub. 
> The current version is 0.35.4. The latest version is 0.38.0 
> cc: [~markflyhigh], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=154064&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154064
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 12/Oct/18 23:58
Start Date: 12/Oct/18 23:58
Worklog Time Spent: 10m 
  Work Description: pabloem closed pull request #6205: [BEAM-4374] 
Implementing a subset of the new metrics framework in python.
URL: https://github.com/apache/beam/pull/6205
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/model/fn-execution/src/main/proto/beam_fn_api.proto 
b/model/fn-execution/src/main/proto/beam_fn_api.proto
index a0795a7c285..915686de6b3 100644
--- a/model/fn-execution/src/main/proto/beam_fn_api.proto
+++ b/model/fn-execution/src/main/proto/beam_fn_api.proto
@@ -40,6 +40,7 @@ option java_outer_classname = "BeamFnApi";
 
 import "beam_runner_api.proto";
 import "endpoints.proto";
+import "google/protobuf/descriptor.proto";
 import "google/protobuf/timestamp.proto";
 import "google/protobuf/wrappers.proto";
 
@@ -250,11 +251,16 @@ message ProcessBundleRequest {
 message ProcessBundleResponse {
   // (Optional) If metrics reporting is supported by the SDK, this represents
   // the final metrics to record for this bundle.
+  // DEPRECATED
   Metrics metrics = 1;
 
   // (Optional) Specifies that the bundle has been split since the last
   // ProcessBundleProgressResponse was sent.
   BundleSplit split = 2;
+
+  // (Required) The list of metrics or other MonitoredState
+  // collected while processing this bundle.
+  repeated MonitoringInfo monitoring_infos = 3;
 }
 
 // A request to report progress information for a given bundle.
@@ -275,9 +281,9 @@ message MonitoringInfo {
   // Sub types like field formats - int64, double, string.
   // Aggregation methods - SUM, LATEST, TOP-N, BOTTOM-N, DISTRIBUTION
   // valid values are:
-  // beam:metrics:[SumInt64|LatestInt64|Top-NInt64|Bottom-NInt64|
-  // SumDouble|LatestDouble|Top-NDouble|Bottom-NDouble|DistributionInt64|
-  // DistributionDouble|MonitoringDataTable]
+  // beam:metrics:[sum_int_64|latest_int_64|top_n_int_64|bottom_n_int_64|
+  // sum_double|latest_double|top_n_double|bottom_n_double|
+  // distribution_int_64|distribution_double|monitoring_data_table
   string type = 2;
 
   // The Metric or monitored state.
@@ -302,6 +308,45 @@ message MonitoringInfo {
   // Some systems such as Stackdriver will be able to aggregate the metrics
   // using a subset of the provided labels
   map<string, string> labels = 5;
+
+  // The walltime of the most recent update.
+  // Useful for aggregation for Latest types such as LatestInt64.
+  google.protobuf.Timestamp timestamp = 6;
+}
+
+message MonitoringInfoUrns {
+  enum Enum {
+USER_COUNTER_URN_PREFIX = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+"beam:metric:user"];
+
+ELEMENT_COUNT = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+"beam:metric:element_count:v1"];
+
+START_BUNDLE_MSECS = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+  "beam:metric:pardo_execution_time:start_bundle_msecs:v1"];
+
+PROCESS_BUNDLE_MSECS = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+"beam:metric:pardo_execution_time:process_bundle_msecs:v1"];
+
+FINISH_BUNDLE_MSECS = 4 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+"beam:metric:pardo_execution_time:finish_bundle_msecs:v1"];
+
+TOTAL_MSECS = 5 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+"beam:metric:ptransform_execution_time:total_msecs:v1"];
+  }
+}
+
+message MonitoringInfoTypeUrns {
+  enum Enum {
+SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+"beam:metrics:sum_int_64"];
+
+DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+"beam:metrics:distribution_int_64"];
+
+LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+  "beam:metrics:latest_int_64"];
+  }
 }
 
 message Metric {
@@ -525,12 +570,16 @@ message Metrics {
 }
 
 message ProcessBundleProgressResponse {
-  // (Required)
+  // DEPRECATED (Required)
   Metrics metrics = 1;
 
   // (Optional) Specifies that the bundle has been split since the last
   // ProcessBundleProgressResponse was sent.
   BundleSplit split = 2;
+
+  // (Required) The list of metrics or other MonitoredState
+  // collected while processing this bundle.
+  repeated MonitoringInfo monitoring_infos = 3;
 }
 
 message ProcessBundleSplitRequest {
@@ -795,7 +844,6 @@ m

[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=154063&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154063
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 12/Oct/18 23:58
Start Date: 12/Oct/18 23:58
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #6205: [BEAM-4374] 
Implementing a subset of the new metrics framework in python.
URL: https://github.com/apache/beam/pull/6205#issuecomment-429492504
 
 
   Okay, as this looks good, I'll go ahead and merge.




Issue Time Tracking
---

Worklog Id: (was: 154063)
Time Spent: 8h 20m  (was: 8h 10m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata





[jira] [Work logged] (BEAM-5636) Java support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5636?focusedWorklogId=154062&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154062
 ]

ASF GitHub Bot logged work on BEAM-5636:


Author: ASF GitHub Bot
Created on: 12/Oct/18 23:54
Start Date: 12/Oct/18 23:54
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #6665: [BEAM-5636] Java 
support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6665#issuecomment-429492075
 
 
   @boyuanzz Btw, please also update the doc describing this change to include 
Java.




Issue Time Tracking
---

Worklog Id: (was: 154062)
Time Spent: 1h 10m  (was: 1h)

> Java support for custom dataflow worker jar
> ---
>
> Key: BEAM-5636
> URL: https://issues.apache.org/jira/browse/BEAM-5636
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Henning Rohde
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Java jobs. That requires a change to the Java boot 
> code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/java/container/boot.go#L107





[jira] [Updated] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-10-12 Thread Thomas Weise (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise updated BEAM-5442:
---
Fix Version/s: (was: 2.8.0)

> PortableRunner swallows custom options for Runner
> -
>
> Key: BEAM-5442
> URL: https://issues.apache.org/jira/browse/BEAM-5442
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability, portability-flink
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> The PortableRunner doesn't pass custom PipelineOptions to the executing 
> Runner.
> Example: {{--parallelism=4}} won't be forwarded to the FlinkRunner.
> (The option is just removed during proto translation without any warning)
> We should allow some form of customization through the options, even for the 
> PortableRunner. 
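
One way to avoid silently dropping options such as --parallelism=4 is to keep the arguments the SDK does not recognise and hand them on. The sketch below uses argparse.parse_known_args from the standard library; the function name split_options and the two declared flags are illustrative assumptions and do not reflect how Beam's PipelineOptions are actually translated.

```python
import argparse


def split_options(argv):
    """Separate options known to the SDK from ones meant for the runner."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--runner", default="PortableRunner")
    parser.add_argument("--job_endpoint", default=None)
    known, unknown = parser.parse_known_args(argv)
    if unknown:
        # Surface the leftovers instead of discarding them without warning.
        print("forwarding unrecognized options to the runner:", unknown)
    return vars(known), unknown
```

Calling split_options(["--runner=PortableRunner", "--parallelism=4"]) returns the unrecognised flag in the second element so it can be forwarded rather than removed.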





[jira] [Reopened] (BEAM-5442) PortableRunner swallows custom options for Runner

2018-10-12 Thread Thomas Weise (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise reopened BEAM-5442:


> PortableRunner swallows custom options for Runner
> -
>
> Key: BEAM-5442
> URL: https://issues.apache.org/jira/browse/BEAM-5442
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability, portability-flink
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> The PortableRunner doesn't pass custom PipelineOptions to the executing 
> Runner.
> Example: {{--parallelism=4}} won't be forwarded to the FlinkRunner.
> (The option is just removed during proto translation without any warning)
> We should allow some form of customization through the options, even for the 
> PortableRunner. 





[jira] [Work logged] (BEAM-5636) Java support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5636?focusedWorklogId=154061&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154061
 ]

ASF GitHub Bot logged work on BEAM-5636:


Author: ASF GitHub Bot
Created on: 12/Oct/18 23:40
Start Date: 12/Oct/18 23:40
Worklog Time Spent: 10m 
  Work Description: herohde commented on a change in pull request #6665: 
[BEAM-5636] Java support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6665#discussion_r224939002
 
 

 ##
 File path: sdks/java/container/boot.go
 ##
 @@ -103,7 +103,17 @@ func main() {
filepath.Join(jarsDir, "slf4j-jdk14.jar"),
filepath.Join(jarsDir, "beam-sdks-java-harness.jar"),
}
+
+   var hasWorkerExperiment = strings.Contains(options, 
"use_staged_dataflow_worker_jar")
for _, md := range artifacts {
+   if hasWorkerExperiment {
+   if strings.HasPrefix(md.Name, 
"beam-runners-google-cloud-dataflow-java-fn-api-worker") {
+   continue
+   }
+   if strings.HasPrefix(md.Name, "dataflow-worker.jar") {
 
 Review comment:
   Small comment: this can be == instead.




Issue Time Tracking
---

Worklog Id: (was: 154061)
Time Spent: 1h  (was: 50m)

> Java support for custom dataflow worker jar
> ---
>
> Key: BEAM-5636
> URL: https://issues.apache.org/jira/browse/BEAM-5636
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Henning Rohde
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Java jobs. That requires a change to the Java boot 
> code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/java/container/boot.go#L107





[jira] [Work logged] (BEAM-3587) User reports TextIO failure in FlinkRunner on master

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3587?focusedWorklogId=154059&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154059
 ]

ASF GitHub Bot logged work on BEAM-3587:


Author: ASF GitHub Bot
Created on: 12/Oct/18 23:36
Start Date: 12/Oct/18 23:36
Worklog Time Spent: 10m 
  Work Description: swegner closed pull request #384: [BEAM-3587] Add a 
note to Gradle shadowJar for merge service files
URL: https://github.com/apache/beam-site/pull/384
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/src/documentation/runners/flink.md 
b/src/documentation/runners/flink.md
index 6dc6e7b69d..df64ba46f9 100644
--- a/src/documentation/runners/flink.md
+++ b/src/documentation/runners/flink.md
@@ -51,7 +51,7 @@ For more information, the [Flink 
Documentation](https://ci.apache.org/projects/f
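 ```java
 <dependency>
   <groupId>org.apache.beam</groupId>
-  <artifactId>beam-runners-flink_2.10</artifactId>
+  <artifactId>beam-runners-flink_2.11</artifactId>
   <version>{{ site.release_latest }}</version>
   <scope>runtime</scope>
 </dependency>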
@@ -81,6 +81,62 @@ $ mvn exec:java 
-Dexec.mainClass=org.apache.beam.examples.WordCount \
 If you have a Flink `JobManager` running on your local machine you can give 
`localhost:6123` for
 `flinkMaster`.
 
+Behind the hood, to create your shaded jar (containing your pipeline and the 
Flink runner dependencies), you have to use the `maven-shade-plugin`:
+
+```java
+<dependency>
+    <groupId>org.apache.beam</groupId>
+    <artifactId>beam-runners-flink_2.10</artifactId>
+    <version>{{ site.release_latest }}</version>
+</dependency>
+```
+
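+```java
+<plugin>
+    <groupId>org.apache.maven.plugins</groupId>
+    <artifactId>maven-shade-plugin</artifactId>
+    <version>${maven-shade-plugin.version}</version>
+    <configuration>
+        <createDependencyReducedPom>false</createDependencyReducedPom>
+        <filters>
+            <filter>
+                <artifact>*:*</artifact>
+                <excludes>
+                    <exclude>META-INF/*.SF</exclude>
+                    <exclude>META-INF/*.DSA</exclude>
+                    <exclude>META-INF/*.RSA</exclude>
+                </excludes>
+            </filter>
+        </filters>
+    </configuration>
+    <executions>
+        <execution>
+            <phase>package</phase>
+            <goals>
+                <goal>shade</goal>
+            </goals>
+            <configuration>
+                <shadedArtifactAttached>true</shadedArtifactAttached>
+                <shadedClassifierName>shaded</shadedClassifierName>
+                <transformers>
+
+                </transformers>
+            </configuration>
+        </execution>
+    </executions>
+</plugin>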
+```
+
+Then, Maven build will create the shaded jar.
+
+If you prefer to use Gradle, you can achieve the same using `shadowJar`:
+
+```java
+shadowJar {
+mergeServiceFiles()
+}
+```
+
 ## Pipeline options for the Flink Runner
 
 When executing your pipeline with the Flink Runner, you can set these pipeline 
options.
diff --git a/src/documentation/runners/spark.md 
b/src/documentation/runners/spark.md
index 1502f242c0..4b4479e0e3 100644
--- a/src/documentation/runners/spark.md
+++ b/src/documentation/runners/spark.md
@@ -37,7 +37,7 @@ You can add a dependency on the latest version of the Spark 
runner by adding to
 
 ### Deploying Spark with your application
 
-In some cases, such as running in local mode/Standalone, your (self-contained) 
application would be required to pack Spark by explicitly adding the following 
dependencies in your pom.xml:
+Most of the time (running in local mode/Standalone or using `spark-submit`), 
your (self-contained) application would be required to pack Spark by explicitly 
adding the following dependencies in your pom.xml:
 ```java
 <dependency>
   <groupId>org.apache.spark</groupId>
@@ -94,6 +94,17 @@ After running mvn package, run ls 
target and you shoul
 beam-examples-1.0.0-shaded.jar
 ```
 
+If you are using gradle, you have to use `shadowJar` to create the shaded jar 
enabling `mergeServiceFiles()`:
+```java
+shadowJar {
+transform(AppendingTransformer) {
+resource = 'reference.conf'
+}
+relocate 'com.google.protobuf', 'shaded.protobuf'
+mergeServiceFiles()
+}
+```
+
 To run against a Standalone cluster simply run:
 ```
 spark-submit --class com.beam.examples.BeamPipeline --master spark://HOST:PORT 
target/beam-examples-1.0.0-shaded.jar --runner=SparkRunner


 




Issue Time Tracking
---

Worklog Id: (was: 154059)
Time Spent: 1h 20m  (was: 1h 10m)

> User reports TextIO failure in FlinkRun

[jira] [Work logged] (BEAM-3587) User reports TextIO failure in FlinkRunner on master

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3587?focusedWorklogId=154058&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154058
 ]

ASF GitHub Bot logged work on BEAM-3587:


Author: ASF GitHub Bot
Created on: 12/Oct/18 23:36
Start Date: 12/Oct/18 23:36
Worklog Time Spent: 10m 
  Work Description: swegner commented on issue #384: [BEAM-3587] Add a note 
to Gradle shadowJar for merge service files
URL: https://github.com/apache/beam-site/pull/384#issuecomment-429489867
 
 
   I've migrated the changes for this pull request onto the migrated website 
sources in the `apache/beam` repository: 
https://github.com/swegner/beam/tree/migrated-pr-384
   
   To pull the migrated changes into your local git client, run:
   
   ```
   git remote add swegner g...@github.com:swegner/beam.git && git fetch swegner
   git checkout -B bigqueryio swegner/migrated-pr-384
   ```
   
   You can then push the changes to your own branch and [open a new pull 
request](https://github.com/apache/beam/compare?expand=1) against the 
apache/beam repository.




Issue Time Tracking
---

Worklog Id: (was: 154058)
Time Spent: 1h 10m  (was: 1h)

> User reports TextIO failure in FlinkRunner on master
> 
>
> Key: BEAM-3587
> URL: https://issues.apache.org/jira/browse/BEAM-3587
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Kenneth Knowles
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
> Fix For: Not applicable
>
> Attachments: screen1.png, screen2.png
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Reported here: 
> [https://lists.apache.org/thread.html/47b16c94032392782505415e010970fd2a9480891c55c2f7b5de92bd@%3Cuser.beam.apache.org%3E]
> "I'm trying to run a pipeline containing just a TextIO.read() step on a Flink 
> cluster, using the latest Beam git revision (ff37337). The job fails to start 
> with the Exception:
>   {{java.lang.UnsupportedOperationException: The transform  is currently not 
> supported.}}
> It does work with Beam 2.2.0 though. All code, logs, and reproduction steps 
> [https://github.com/pelletier/beam-flink-example]"
> My initial thoughts: I have a guess that this has to do with switching to 
> running from a portable pipeline representation, and it looks like there's a 
> non-composite transform with an empty URN and it threw a bad error message. 
> We can try to root cause but may also mitigate short-term by removing the 
> round-trip through pipeline proto for now.
> What is curious is that the ValidatesRunner and WordCountIT are working - 
> they only run on a local Flink, yet this seems to be a translation issue that 
> would occur for local or distributed runs.
> We need to certainly run this repro on the RC if we don't totally get to the 
> bottom of it quickly.





[jira] [Work logged] (BEAM-1081) annotations should support custom messages and classes

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-1081?focusedWorklogId=154060&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154060
 ]

ASF GitHub Bot logged work on BEAM-1081:


Author: ASF GitHub Bot
Created on: 12/Oct/18 23:36
Start Date: 12/Oct/18 23:36
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #6670: [BEAM-1081] 
Annotations custom message support and classes tests.
URL: https://github.com/apache/beam/pull/6670#issuecomment-429489882
 
 
   @jglezt it looks like there are test issues, could you look at those?




Issue Time Tracking
---

Worklog Id: (was: 154060)
Time Spent: 50m  (was: 40m)

> annotations should support custom messages and classes
> --
>
> Key: BEAM-1081
> URL: https://issues.apache.org/jira/browse/BEAM-1081
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Priority: Minor
>  Labels: newbie, starter
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Update 
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/utils/annotations.py
>  to add 2 new features:
> 1. ability to customize message
> 2. ability to tag classes (not only functions)
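
The two requested features can be sketched with a single decorator. This is a hedged illustration only: the name deprecated and its parameters since and custom_message are assumptions and are not taken from annotations.py or from the pull request.

```python
import functools
import warnings


def deprecated(since, custom_message=None):
    """Mark a function or class as deprecated, with an optional custom message."""

    def decorate(obj):
        message = custom_message or "%s is deprecated since %s." % (
            obj.__name__, since)

        if isinstance(obj, type):
            # Classes: warn when an instance is created.
            original_init = obj.__init__

            @functools.wraps(original_init)
            def init(self, *args, **kwargs):
                warnings.warn(message, DeprecationWarning, stacklevel=2)
                original_init(self, *args, **kwargs)

            obj.__init__ = init
            return obj

        # Functions: warn on every call.
        @functools.wraps(obj)
        def wrapper(*args, **kwargs):
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return obj(*args, **kwargs)

        return wrapper

    return decorate
```

Patching __init__ keeps the class object itself intact, so isinstance checks and subclassing continue to work on a tagged class.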





[jira] [Work logged] (BEAM-5636) Java support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5636?focusedWorklogId=154057&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154057
 ]

ASF GitHub Bot logged work on BEAM-5636:


Author: ASF GitHub Bot
Created on: 12/Oct/18 23:35
Start Date: 12/Oct/18 23:35
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on a change in pull request #6665: 
[BEAM-5636] Java support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6665#discussion_r224938513
 
 

 ##
 File path: sdks/java/container/boot.go
 ##
 @@ -103,7 +103,17 @@ func main() {
filepath.Join(jarsDir, "slf4j-jdk14.jar"),
filepath.Join(jarsDir, "beam-sdks-java-harness.jar"),
}
+
+  var has_worker_experiment = strings.Contains(options, 
"use_staged_dataflow_worker_jar")
 
 Review comment:
   Fixed, thanks~




Issue Time Tracking
---

Worklog Id: (was: 154057)
Time Spent: 50m  (was: 40m)

> Java support for custom dataflow worker jar
> ---
>
> Key: BEAM-5636
>     URL: https://issues.apache.org/jira/browse/BEAM-5636
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Henning Rohde
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Java jobs. That requires a change to the Java boot 
> code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/java/container/boot.go#L107





[jira] [Work logged] (BEAM-5636) Java support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5636?focusedWorklogId=154055&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154055
 ]

ASF GitHub Bot logged work on BEAM-5636:


Author: ASF GitHub Bot
Created on: 12/Oct/18 23:17
Start Date: 12/Oct/18 23:17
Worklog Time Spent: 10m 
  Work Description: herohde commented on a change in pull request #6665: 
[BEAM-5636] Java support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6665#discussion_r224936531
 
 

 ##
 File path: sdks/java/container/boot.go
 ##
 @@ -103,7 +103,17 @@ func main() {
filepath.Join(jarsDir, "slf4j-jdk14.jar"),
filepath.Join(jarsDir, "beam-sdks-java-harness.jar"),
}
+
+  var has_worker_experiment = strings.Contains(options, "use_staged_dataflow_worker_jar")
 
 Review comment:
   nit: Go uses camelCase for local variables. You also need to run
"gofmt -w ." to fix the indentation.
 




Issue Time Tracking
---

Worklog Id: (was: 154055)
Time Spent: 40m  (was: 0.5h)

> Java support for custom dataflow worker jar
> ---
>
> Key: BEAM-5636
> URL: https://issues.apache.org/jira/browse/BEAM-5636
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Henning Rohde
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Java jobs. That requires a change to the Java boot 
> code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/java/container/boot.go#L107





[jira] [Commented] (BEAM-5735) Contributor Guide Improvements

2018-10-12 Thread Scott Wegner (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648557#comment-16648557
 ] 

Scott Wegner commented on BEAM-5735:


I made some small boilerplate changes while we were discussing these in person. 
They are not nearly ready to check in, but in case they are useful: 
https://github.com/swegner/beam/commit/06aee6b3903020e04b7ce4000ba71be525e04721

> Contributor Guide Improvements
> --
>
> Key: BEAM-5735
> URL: https://issues.apache.org/jira/browse/BEAM-5735
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Scott Wegner
>Assignee: Alan Myrvold
>Priority: Major
>
> This is a wish-list for improvements to the Beam contributor guide.
> Many thanks to [~rohdesam] for the feedback which helped shape this list.





[jira] [Work logged] (BEAM-5708) Support caching of SDKHarness environments in flink

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5708?focusedWorklogId=154054=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154054
 ]

ASF GitHub Bot logged work on BEAM-5708:


Author: ASF GitHub Bot
Created on: 12/Oct/18 23:07
Start Date: 12/Oct/18 23:07
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6638: [BEAM-5708] Cache 
environment in portable flink runner
URL: https://github.com/apache/beam/pull/6638#issuecomment-429485973
 
 
   @angoenka did you test the fallback case?




Issue Time Tracking
---

Worklog Id: (was: 154054)
Time Spent: 1h 50m  (was: 1h 40m)

> Support caching of SDKHarness environments in flink
> ---
>
> Key: BEAM-5708
> URL: https://issues.apache.org/jira/browse/BEAM-5708
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Cache and reuse environment to improve performance.





[jira] [Assigned] (BEAM-5740) Refactor permissions section into bullet-points

2018-10-12 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner reassigned BEAM-5740:
--

Assignee: Scott Wegner

> Refactor permissions section into bullet-points
> ---
>
> Key: BEAM-5740
> URL: https://issues.apache.org/jira/browse/BEAM-5740
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>
> The permissions section has good content, but it's not easily browseable if 
> you're looking for a specific thing (e.g. Slack permissions). We should 
> refactor it into bullet points.
> For permissions that require reaching out via email/Slack, we should link to 
> some previous example. It lowers the barrier to entry if a new contributor 
> can copy/paste some existing template.





[jira] [Assigned] (BEAM-5743) Move "works in progress" out of getting started guide

2018-10-12 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner reassigned BEAM-5743:
--

Assignee: Scott Wegner

> Move "works in progress" out of getting started guide
> -
>
> Key: BEAM-5743
> URL: https://issues.apache.org/jira/browse/BEAM-5743
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>
> These aren't relevant to a new contributor, and it makes the contributor 
> guide less focused. They should be moved to a separate page.





[jira] [Assigned] (BEAM-5741) Move "Contact Us" to a top-level link

2018-10-12 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner reassigned BEAM-5741:
--

Assignee: (was: Melissa Pashniak)

> Move "Contact Us" to a top-level link
> -
>
> Key: BEAM-5741
> URL: https://issues.apache.org/jira/browse/BEAM-5741
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Priority: Major
>
> It should be very easy to figure out how to get in touch with the
> community. "Contact Us" should be a top-level link on the page.
> The page can also be improved with:
> * Some basic text on how to use subscribe / unsubscribe links
> * Recommendations on how to use the various communication channels (Slack 
> for quick questions, dev@ for longer conversations; all decisions should 
> make it back to dev@)





[jira] [Updated] (BEAM-5740) Refactor permissions section into bullet-points

2018-10-12 Thread Scott Wegner (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner updated BEAM-5740:
---
Description: 
The permissions section has good content, but it's not easily browseable if 
you're looking for a specific thing (e.g. Slack permissions). We should 
refactor it into bullet points.

For permissions that require reaching out via email/Slack, we should link to 
some previous example. It lowers the barrier to entry if a new contributor can 
copy/paste some existing template.

  was:The permissions section has good content, but it's not easily browseable 
if you're looking for a specific thing (e.g. Slack permissions). We should 
refactor it into bullet points.


> Refactor permissions section into bullet-points
> ---
>
> Key: BEAM-5740
> URL: https://issues.apache.org/jira/browse/BEAM-5740
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Priority: Major
>
> The permissions section has good content, but it's not easily browseable if 
> you're looking for a specific thing (e.g. Slack permissions). We should 
> refactor it into bullet points.
> For permissions that require reaching out via email/Slack, we should link to 
> some previous example. It lowers the barrier to entry if a new contributor 
> can copy/paste some existing template.





[jira] [Created] (BEAM-5743) Move "works in progress" out of getting started guide

2018-10-12 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-5743:
--

 Summary: Move "works in progress" out of getting started guide
 Key: BEAM-5743
 URL: https://issues.apache.org/jira/browse/BEAM-5743
 Project: Beam
  Issue Type: Sub-task
  Components: website
Reporter: Scott Wegner


These aren't relevant to a new contributor, and it makes the contributor guide 
less focused. They should be moved to a separate page.





[jira] [Created] (BEAM-5742) Add expectations for code reviews

2018-10-12 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-5742:
--

 Summary: Add expectations for code reviews
 Key: BEAM-5742
 URL: https://issues.apache.org/jira/browse/BEAM-5742
 Project: Beam
  Issue Type: Sub-task
  Components: website
Reporter: Scott Wegner


There should be a dedicated section about how we do code reviews, and 
expectations for contributors / reviewers. Things like:
* PRs should have a linked JIRA issue
* Code changes should also include tests
* Small changes are easier to review
* How to find a reviewer, and when you can expect a reviewer to engage
* How automatic test runs work (PreCommits run on each commit; build / test 
should be passing before asking for a reviewer)
* How to re-run tests, and how/when to run Post-Commits





[jira] [Created] (BEAM-5741) Move "Contact Us" to a top-level link

2018-10-12 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-5741:
--

 Summary: Move "Contact Us" to a top-level link
 Key: BEAM-5741
 URL: https://issues.apache.org/jira/browse/BEAM-5741
 Project: Beam
  Issue Type: Sub-task
  Components: website
Reporter: Scott Wegner
Assignee: Melissa Pashniak


It should be very easy to figure out how to get in touch with the community. 
"Contact Us" should be a top-level link on the page.

The page can also be improved with:
* Some basic text on how to use subscribe / unsubscribe links
* Recommendations on how to use the various communication channels (Slack for 
quick questions, dev@ for longer conversations; all decisions should make 
it back to dev@)





[jira] [Created] (BEAM-5740) Refactor permissions section into bullet-points

2018-10-12 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-5740:
--

 Summary: Refactor permissions section into bullet-points
 Key: BEAM-5740
 URL: https://issues.apache.org/jira/browse/BEAM-5740
 Project: Beam
  Issue Type: Sub-task
  Components: website
Reporter: Scott Wegner


The permissions section has good content, but it's not easily browseable if 
you're looking for a specific thing (e.g. Slack permissions). We should 
refactor it into bullet points.





[jira] [Created] (BEAM-5739) Contributor Story: "Submitting your first PR"

2018-10-12 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-5739:
--

 Summary: Contributor Story: "Submitting your first PR"
 Key: BEAM-5739
 URL: https://issues.apache.org/jira/browse/BEAM-5739
 Project: Beam
  Issue Type: Sub-task
  Components: website
Reporter: Scott Wegner


We should write the user story for "Submitting your first PR", with 
prescriptive steps on getting started. It should include:

* Forking the repo and setting up the dev environment
* How to build/test
* Choosing an IDE
* language / SDK-specific tips + website
* "When will my changes go live?"





[jira] [Created] (BEAM-5738) Refactor Contributor Guide into "Stories"

2018-10-12 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-5738:
--

 Summary: Refactor Contributor Guide into "Stories"
 Key: BEAM-5738
 URL: https://issues.apache.org/jira/browse/BEAM-5738
 Project: Beam
  Issue Type: Sub-task
  Components: website
Reporter: Scott Wegner


The Contributor Guide has become a dumping ground for topics that are maybe 
relevant to contributors. It's long and unorganized and it doesn't really tell 
a contributor how to get started.

We should refactor the contributor guide into "stories", like "How to submit 
your first PR", "How to contribute to the website", "How to propose a new 
feature", etc.





[jira] [Created] (BEAM-5737) Create a contributor FAQ page

2018-10-12 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-5737:
--

 Summary: Create a contributor FAQ page
 Key: BEAM-5737
 URL: https://issues.apache.org/jira/browse/BEAM-5737
 Project: Beam
  Issue Type: Sub-task
  Components: website
Reporter: Scott Wegner


This should be an easy "dumping ground" place to add documentation, pretty much 
about anything. It should be low-barrier to add new content or update existing. 
High-quality or frequently-used topics can be promoted to top level pages.

It probably makes sense to put this on the wiki. But it should be linked very 
prominently from the top of the Contributor Guide





[jira] [Work logged] (BEAM-5636) Java support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5636?focusedWorklogId=154050=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154050
 ]

ASF GitHub Bot logged work on BEAM-5636:


Author: ASF GitHub Bot
Created on: 12/Oct/18 22:27
Start Date: 12/Oct/18 22:27
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #6665: [BEAM-5636] Java 
support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6665#issuecomment-429479809
 
 
   Re: @herohde Please take another look at this. Thanks~




Issue Time Tracking
---

Worklog Id: (was: 154050)
Time Spent: 0.5h  (was: 20m)

> Java support for custom dataflow worker jar
> ---
>
> Key: BEAM-5636
> URL: https://issues.apache.org/jira/browse/BEAM-5636
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Henning Rohde
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Java jobs. That requires a change to the Java boot 
> code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/java/container/boot.go#L107





[jira] [Reopened] (BEAM-5615) Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword argument for this function

2018-10-12 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev reopened BEAM-5615:
---

> Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword 
> argument for this function
> -
>
> Key: BEAM-5615
> URL: https://issues.apache.org/jira/browse/BEAM-5615
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-harness
>Affects Versions: 2.8.0
>Reporter: Valentyn Tymofieiev
>Assignee: Juta Staes
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> ERROR: test_top (apache_beam.transforms.combiners_test.CombineTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners_test.py",
>  line 89, in test_top
> names)  # Note parameter passed to comparator.
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
>  line 111, in __or__
> return self.pipeline.apply(ptransform, self)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 467, in apply
> label or transform.label)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 477, in apply
> return self.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 513, in apply
> pvalueish_result = self.runner.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 193, in apply
> return m(transform, input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 199, in apply_PTransform
> return transform.expand(input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 759, in expand
> return self._fn(pcoll, *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 185, in Of
> TopCombineFn(n, compare, key, reverse), *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
>  line 111, in __or__
> return self.pipeline.apply(ptransform, self)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 513, in apply
> pvalueish_result = self.runner.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 193, in apply
> return m(transform, input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 199, in apply_PTransform
> return transform.expand(input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1251, in expand
> default_value = combine_fn.apply([], *self.args, **self.kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 623, in apply
> *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 362, in extract_output
> self._sort_buffer(buffer, lt)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 295, in _sort_buffer
> key=self._key_fn)
> TypeError: 'cmp' is an invalid keyword argument for this function
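
For context on the error above: Python 3 removed the cmp= argument from
list.sort() and sorted(). The following is a minimal sketch of one common
workaround, wrapping the comparator with functools.cmp_to_key. The helper
name sort_with_comparator and the sample data are hypothetical and are not
taken from combiners.py; this is not necessarily the fix adopted in Beam.

```python
import functools


def sort_with_comparator(items, compare, key_fn=None):
    """Sort items with a two-argument comparator on Python 3.

    Hypothetical helper: mimics Python 2's list.sort(cmp=..., key=...),
    where the comparator is applied to the extracted keys.
    """
    as_key = functools.cmp_to_key(compare)
    if key_fn is None:
        return sorted(items, key=as_key)
    return sorted(items, key=lambda item: as_key(key_fn(item)))


def by_length(a, b):
    # Negative, zero, or positive, like Python 2's built-in cmp().
    return (len(a) > len(b)) - (len(a) < len(b))


names = ["pear", "fig", "banana"]
by_len = sort_with_comparator(names, by_length)

pairs = [("x", "pear"), ("y", "fig")]
by_second = sort_with_comparator(pairs, by_length, key_fn=lambda p: p[1])
```

The same wrapper runs unchanged on Python 2.7, so a single code path can
serve both interpreters.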





[jira] [Updated] (BEAM-5615) Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword argument for this function

2018-10-12 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-5615:
--
Affects Version/s: 2.8.0

> Several tests fail on Python 3 with TypeError: 'cmp' is an invalid keyword 
> argument for this function
> -
>
> Key: BEAM-5615
> URL: https://issues.apache.org/jira/browse/BEAM-5615
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-harness
>Affects Versions: 2.8.0
>Reporter: Valentyn Tymofieiev
>Assignee: Juta Staes
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> ERROR: test_top (apache_beam.transforms.combiners_test.CombineTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners_test.py",
>  line 89, in test_top
> names)  # Note parameter passed to comparator.
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
>  line 111, in __or__
> return self.pipeline.apply(ptransform, self)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 467, in apply
> label or transform.label)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 477, in apply
> return self.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 513, in apply
> pvalueish_result = self.runner.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 193, in apply
> return m(transform, input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 199, in apply_PTransform
> return transform.expand(input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 759, in expand
> return self._fn(pcoll, *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 185, in Of
> TopCombineFn(n, compare, key, reverse), *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pvalue.py",
>  line 111, in __or__
> return self.pipeline.apply(ptransform, self)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/pipeline.py",
>  line 513, in apply
> pvalueish_result = self.runner.apply(transform, pvalueish)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 193, in apply
> return m(transform, input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/runner.py",
>  line 199, in apply_PTransform
> return transform.expand(input)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1251, in expand
> default_value = combine_fn.apply([], *self.args, **self.kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 623, in apply
> *args, **kwargs)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 362, in extract_output
> self._sort_buffer(buffer, lt)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/combiners.py",
>  line 295, in _sort_buffer
> key=self._key_fn)
> TypeError: 'cmp' is an invalid keyword argument for this function





[jira] [Work logged] (BEAM-5621) Several tests fail on Python 3 with TypeError: unorderable types: str() < int()

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5621?focusedWorklogId=154049=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154049
 ]

ASF GitHub Bot logged work on BEAM-5621:


Author: ASF GitHub Bot
Created on: 12/Oct/18 22:17
Start Date: 12/Oct/18 22:17
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #6602: [BEAM-5621] Fix 
unorderable types in python 3
URL: https://github.com/apache/beam/pull/6602#issuecomment-429478105
 
 
   Unfortunately I think 
https://github.com/apache/beam/pull/6602#issuecomment-429468709 may be a very 
unintuitive change, so we need to roll it back and either fix the underlying 
issue with typing of negative numbers or proceed with a different solution 
here. We would need to cherry-pick the change into the release branch, so I'll 
mark BEAM-5615 as release blocker until cherry-pick is in.




Issue Time Tracking
---

Worklog Id: (was: 154049)
Time Spent: 2.5h  (was: 2h 20m)

> Several tests fail on Python 3 with TypeError: unorderable types: str() < 
> int()
> ---
>
> Key: BEAM-5621
> URL: https://issues.apache.org/jira/browse/BEAM-5621
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Juta Staes
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> ==
> ERROR: test_remove_duplicates 
> (apache_beam.transforms.ptransform_test.PTransformTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 677, in process
> self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 414, in invoke_process
> windowed_value, self.process_method(windowed_value.value))
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1068, in 
> wrapper = lambda x: [fn(x)]
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
>  line 115, in _equal
> sorted_expected = sorted(expected)
> TypeError: unorderable types: str() < int()
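
For context on the error above: Python 3 refuses to order values of
unrelated types, so sorted() on a list mixing str and int raises TypeError.
The sketch below shows one possible way to get a deterministic order for
such a list by sorting on a (type name, value) key. The helper name
sort_mixed is hypothetical; this is an illustration only and not
necessarily the approach taken in pull request #6602.

```python
def sort_mixed(values):
    """Sort values of mixed types deterministically on Python 3.

    Hypothetical helper: groups values by type name first, then sorts
    within each group, so str and int are never compared directly.
    """
    return sorted(values, key=lambda v: (type(v).__name__, v))


mixed = [3, "b", 1, "a"]

# A plain sort fails on Python 3 because str and int cannot be compared.
try:
    sorted(mixed)
    plain_sort_failed = False
except TypeError:
    plain_sort_failed = True

ordered = sort_mixed(mixed)
```

Grouping by type name is only one convention; it makes two lists with the
same elements compare equal after sorting, which is all an equality
assertion in a test needs.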





[jira] [Work logged] (BEAM-5653) Dataflow FnApi Worker overrides some of Coders due to coder ID generation collision.

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5653?focusedWorklogId=154048=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154048
 ]

ASF GitHub Bot logged work on BEAM-5653:


Author: ASF GitHub Bot
Created on: 12/Oct/18 22:16
Start Date: 12/Oct/18 22:16
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #6649: [BEAM-5653] Fix 
overriding coders due to duplicate coderId generation
URL: https://github.com/apache/beam/pull/6649#issuecomment-429477876
 
 
   And it's green!




Issue Time Tracking
---

Worklog Id: (was: 154048)
Time Spent: 4h 10m  (was: 4h)
Remaining Estimate: 67h 50m  (was: 68h)

> Dataflow FnApi Worker overrides some of Coders due to coder ID generation 
> collision.
> 
>
> Key: BEAM-5653
> URL: https://issues.apache.org/jira/browse/BEAM-5653
> Project: Beam
>  Issue Type: Test
>  Components: java-fn-execution
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Blocker
> Fix For: 2.8.0
>
>   Original Estimate: 72h
>  Time Spent: 4h 10m
>  Remaining Estimate: 67h 50m
>
> Due to one of the latest refactorings, there is a bug in the Java FnApi 
> Worker: it overrides Coders in the ProcessBundleDescriptor sent to the SDK 
> Harness, which causes jobs to fail.
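
To illustrate the general failure mode described above (not the actual
worker code): when entries are registered in a map under generated IDs, a
colliding ID silently replaces an existing entry. The sketch below shows a
collision-safe ID generator. All names here (unique_id, the coders dict,
the coder labels) are hypothetical and chosen only for illustration.

```python
def unique_id(prefix, taken):
    """Return an ID based on prefix that is not already a key in taken.

    Hypothetical helper: appends a numeric suffix until the ID is free,
    so registering a new entry never overrides an existing one.
    """
    candidate = prefix
    suffix = 0
    while candidate in taken:
        suffix += 1
        candidate = "{}_{}".format(prefix, suffix)
    return candidate


# An existing registry that already uses the ID a naive generator would pick.
coders = {"coder_1": "first coder"}

new_id = unique_id("coder_1", coders)
coders[new_id] = "second coder"
```

Both entries survive because the second one is stored under a fresh key
instead of replacing the first.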





[jira] [Created] (BEAM-5736) List prerequisite knowledge with links to external docs

2018-10-12 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-5736:
--

 Summary: List prerequisite knowledge with links to external docs
 Key: BEAM-5736
 URL: https://issues.apache.org/jira/browse/BEAM-5736
 Project: Beam
  Issue Type: Sub-task
  Components: website
Reporter: Scott Wegner
Assignee: Scott Wegner


Our contributor guide makes some assumptions about prior knowledge. We should 
be explicit about what we expect contributors to already know, and give links 
to places where they can learn more if necessary.

Examples:
* Git & GitHub workflows
* Beam (understand what it does, what it's for, how it fits into ecosystem)
* JDK 8 installed
* Windows / Mac / Linux dev environment (be explicit about what we support)





[jira] [Created] (BEAM-5735) Contributor Guide Improvements

2018-10-12 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-5735:
--

 Summary: Contributor Guide Improvements
 Key: BEAM-5735
 URL: https://issues.apache.org/jira/browse/BEAM-5735
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: Scott Wegner
Assignee: Alan Myrvold


This is a wish-list for improvements to the Beam contributor guide.

Many thanks to [~rohdesam] for the feedback which helped shape this list.





[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=154047=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154047
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 12/Oct/18 22:10
Start Date: 12/Oct/18 22:10
Worklog Time Spent: 10m 
  Work Description: ajamato commented on issue #6205: [BEAM-4374] 
Implementing a subset of the new metrics framework in python.
URL: https://github.com/apache/beam/pull/6205#issuecomment-429476625
 
 
   Squashed all the commits.
   FYI, I imported this PR and the internal Google tests are also passing.
   
   @robertwb, happy to iterate more on your suggestions, but we would like to 
submit this PR and finish up this iteration.




Issue Time Tracking
---

Worklog Id: (was: 154047)
Time Spent: 8h 10m  (was: 8h)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> https://s.apache.org/beam-fn-api-metrics
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata





[jira] [Reopened] (BEAM-5709) Tests in BeamFnControlServiceTest are flaky.

2018-10-12 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reopened BEAM-5709:
---

> Tests in BeamFnControlServiceTest are flaky.
> 
>
> Key: BEAM-5709
> URL: https://issues.apache.org/jira/browse/BEAM-5709
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/fn/BeamFnControlServiceTest.java
> Tests for BeamFnControlService are currently flaky. The test attempts to 
> verify that onCompleted was called on the mock streams, but that function 
> gets called on a separate thread, so occasionally the function will not have 
> been called yet, despite the server being shut down.
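
The race described above can be sketched in a few lines. This is a minimal illustration, not the Beam test itself: `FakeStream` and `shut_down` are hypothetical stand-ins for the mocked stream and the server shutdown, and the 5-second timeout is an arbitrary assumption. It shows the general remedy of waiting for the callback (with a timeout) rather than asserting immediately after shutdown.

```python
import threading


class FakeStream:
    """Hypothetical stand-in for a mocked stream observer."""

    def __init__(self):
        self.completed = threading.Event()

    def on_completed(self):
        # Called from another thread, so the caller cannot assume it
        # has already run by the time shutdown returns.
        self.completed.set()


def shut_down(stream):
    """Simulates a server that completes the stream on a separate thread."""
    worker = threading.Thread(target=stream.on_completed)
    worker.start()
    return worker


stream = FakeStream()
worker = shut_down(stream)
# Wait with a timeout instead of checking the flag right after shutdown.
completed_in_time = stream.completed.wait(timeout=5.0)
worker.join()
```

Checking `stream.completed.is_set()` directly after `shut_down` returns would be the flaky variant; the bounded wait makes the check deterministic as long as the callback eventually runs.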





[jira] [Commented] (BEAM-5729) Create ability to read/write database implementing database/sql contract

2018-10-12 Thread Adrian Witas (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648516#comment-16648516
 ] 

Adrian Witas commented on BEAM-5729:


In my case I need a few dictionaries as side inputs, fetched from 
database/sql. Scalability is not a must-have for my case, but the missing 
functionality is discouraging.

> Create ability to read/write database implementing database/sql  contract
> -
>
> Key: BEAM-5729
> URL: https://issues.apache.org/jira/browse/BEAM-5729
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Affects Versions: 2.7.0
>Reporter: Adrian Witas
>Priority: Major
>






[jira] [Work logged] (BEAM-1251) Python 3 Support

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-1251?focusedWorklogId=154046=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154046
 ]

ASF GitHub Bot logged work on BEAM-1251:


Author: ASF GitHub Bot
Created on: 12/Oct/18 22:00
Start Date: 12/Oct/18 22:00
Worklog Time Spent: 10m 
  Work Description: swegner commented on issue #6679: [BEAM-1251] Add a 
link to Python 3 Conversion Quick Start Guide to the list of ongoing efforts on 
Beam site.
URL: https://github.com/apache/beam/pull/6679#issuecomment-429474553
 
 
   Yup! 
   
   If you pop open the "All checks have passed" bar, you'll see a link to the 
"Website_Stage_GCS" job 
[results](https://builds.apache.org/job/beam_PreCommit_Website_Stage_GCS_Commit/79/),
 which contains a link to your staged changes for review: 
http://apache-beam-website-pull-requests.storage.googleapis.com/6679/index.html
   
   (I'm brainstorming a way to make that link more prominent on GitHub; let me 
know if you have ideas)




Issue Time Tracking
---

Worklog Id: (was: 154046)
Time Spent: 22h 50m  (was: 22h 40m)

> Python 3 Support
> 
>
> Key: BEAM-1251
> URL: https://issues.apache.org/jira/browse/BEAM-1251
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Eyad Sibai
>Assignee: Robbe
>Priority: Major
>  Time Spent: 22h 50m
>  Remaining Estimate: 0h
>
> I have been trying to use Google Datalab with Python 3. As far as I can see, 
> several packages that Google Datalab depends on do not support Python 3 yet. 
> This is one of them.
> https://github.com/GoogleCloudPlatform/DataflowPythonSDK/issues/6





[jira] [Commented] (BEAM-5729) Create ability to read/write database implementing database/sql contract

2018-10-12 Thread Robert Burke (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648512#comment-16648512
 ] 

Robert Burke commented on BEAM-5729:


Once we have a way of defining Splittable DoFns for portable runners, it'll be 
easier to write a scalable implementation of this. 

See https://issues.apache.org/jira/browse/BEAM-3301

> Create ability to read/write database implementing database/sql  contract
> -
>
> Key: BEAM-5729
> URL: https://issues.apache.org/jira/browse/BEAM-5729
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Affects Versions: 2.7.0
>Reporter: Adrian Witas
>Priority: Major
>






[jira] [Work logged] (BEAM-5734) RedisIO: finishBundle calls Jedis.exec without checking if there are operations in the pipeline

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5734?focusedWorklogId=154045=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154045
 ]

ASF GitHub Bot logged work on BEAM-5734:


Author: ASF GitHub Bot
Created on: 12/Oct/18 21:56
Start Date: 12/Oct/18 21:56
Worklog Time Spent: 10m 
  Work Description: casidiablo commented on a change in pull request #6682: 
[BEAM-5734] RedisIO: only call Jedis.exec() on finishBundle if there is 
something to send
URL: https://github.com/apache/beam/pull/6682#discussion_r224925261
 
 

 ##
 File path: 
sdks/java/io/redis/src/test/java/org/apache/beam/sdk/io/redis/RedisIOTest.java
 ##
 @@ -86,7 +86,7 @@ public void testBulkRead() throws Exception {
   @Test
   public void testWriteReadUsingDefaultAppendMethod() throws Exception {
 ArrayList<KV<String, String>> data = new ArrayList<>();
-for (int i = 0; i < 100; i++) {
+for (int i = 0; i < 8000; i++) {
 
 Review comment:
   If the test had used this value instead, unit tests would have detected the 
issue.
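
The guard named in the pull request title (only execute on finishBundle if there is something to send) can be sketched without a Redis client. Everything here is a hypothetical illustration: `FakeTransaction` imitates a MULTI/EXEC transaction that rejects an EXEC with no open MULTI, and `BatchingWriter` with its batch size of 1000 is an assumed shape, not the RedisIO implementation.

```python
class FakeTransaction:
    """Hypothetical stand-in for a Redis MULTI/EXEC transaction."""

    def __init__(self):
        self.open = False
        self.queued = []
        self.committed = []

    def multi(self):
        self.open = True

    def set(self, key, value):
        self.queued.append((key, value))

    def exec(self):
        # Mirrors the server-side rule behind "EXEC without MULTI".
        if not self.open:
            raise RuntimeError("EXEC without MULTI")
        self.committed.extend(self.queued)
        self.queued = []
        self.open = False


class BatchingWriter:
    """Assumed writer shape: flush every batch_size elements and at bundle end."""

    def __init__(self, transaction, batch_size):
        self.transaction = transaction
        self.batch_size = batch_size
        self.pending = 0

    def process(self, key, value):
        if self.pending == 0:
            self.transaction.multi()
        self.transaction.set(key, value)
        self.pending += 1
        if self.pending >= self.batch_size:
            self._flush()

    def finish_bundle(self):
        # The guard: skip EXEC when the last batch was already flushed.
        if self.pending > 0:
            self._flush()

    def _flush(self):
        self.transaction.exec()
        self.pending = 0


def write_all(count, batch_size):
    tx = FakeTransaction()
    writer = BatchingWriter(tx, batch_size)
    for i in range(count):
        writer.process("key %d" % i, "value %d" % i)
    writer.finish_bundle()
    return len(tx.committed)
```

In this sketch, an element count that is an exact multiple of the batch size leaves nothing pending at bundle end, which is the case where an unguarded `exec()` would raise.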




Issue Time Tracking
---

Worklog Id: (was: 154045)
Time Spent: 20m  (was: 10m)

> RedisIO: finishBundle calls Jedis.exec without checking if there are 
> operations in the pipeline
> ---
>
> Key: BEAM-5734
>     URL: https://issues.apache.org/jira/browse/BEAM-5734
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-redis
>Reporter: Cristian
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> It throws:
>  
> {code:java}
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> redis.clients.jedis.exceptions.JedisDataException: EXEC without MULTI
> at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:332)
> at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302)
> at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:197)
> at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64)
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
> at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
> at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)
> at 
> org.apache.beam.sdk.io.redis.RedisIOTest.testWriteReadUsingDefaultAppendMethod(RedisIOTest.java:100)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
> at org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.

[jira] [Assigned] (BEAM-5729) Create ability to read/write database implementing database/sql contract

2018-10-12 Thread Robert Burke (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Burke reassigned BEAM-5729:
--

Assignee: (was: Robert Burke)

> Create ability to read/write database implementing database/sql  contract
> -
>
> Key: BEAM-5729
> URL: https://issues.apache.org/jira/browse/BEAM-5729
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Affects Versions: 2.7.0
>Reporter: Adrian Witas
>Priority: Major
>






[jira] [Assigned] (BEAM-5728) Unable to store nil value in BigQuery go sdk

2018-10-12 Thread Robert Burke (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Burke reassigned BEAM-5728:
--

Assignee: (was: Robert Burke)

> Unable to store nil value in BigQuery go sdk
> 
>
> Key: BEAM-5728
> URL: https://issues.apache.org/jira/browse/BEAM-5728
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Affects Versions: 2.7.0
> Environment: any
>Reporter: Adrian Witas
>Priority: Major
>
> Any struct type that uses a pointer type in any field is flagged as invalid: 
> "Invalid underlying type: XXX"
> With a large BigQuery schema (200+ columns) and a large table size, the 
> ability to ingest and enrich data from a file system (GS, local) where many 
> records have partial details with nulls is critical, since it reduces cost: 
> BigQuery does not charge for NULL, but does for 0 and "empty" messages.





[jira] [Work logged] (BEAM-5734) RedisIO: finishBundle calls Jedis.exec without checking if there are operations in the pipeline

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5734?focusedWorklogId=154044=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154044
 ]

ASF GitHub Bot logged work on BEAM-5734:


Author: ASF GitHub Bot
Created on: 12/Oct/18 21:55
Start Date: 12/Oct/18 21:55
Worklog Time Spent: 10m 
  Work Description: casidiablo opened a new pull request #6682: [BEAM-5734] 
RedisIO: only call Jedis.exec() on finishBundle if there is something to send
URL: https://github.com/apache/beam/pull/6682
 
 
   This fixes a bug in the RedisIO.write sink. The `finishBundle()` calls 
Jedis' `pipeline.exec()` method without checking if there is actually something 
to flush.
   
   That results in this exception being thrown:
   
   ```
   org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
redis.clients.jedis.exceptions.JedisDataException: EXEC without MULTI
at 
org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:332)
at 
org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302)
at 
org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:197)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)
at 
org.apache.beam.sdk.io.redis.RedisIOTest.testWriteReadUsingDefaultAppendMethod(RedisIOTest.java:100)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source

[jira] [Commented] (BEAM-5728) Unable to store nil value in BigQuery go sdk

2018-10-12 Thread Robert Burke (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648509#comment-16648509
 ] 

Robert Burke commented on BEAM-5728:


Part of this has to do with not having a coder registry for handling user 
types, and handling interface{} fields. Technically, a runner could require 
encoding of types between any two ParDos, so user types must be codeable.

https://issues.apache.org/jira/browse/BEAM-3306 

 

That said, there's a workaround: make your schema type pretend to be a 
Protocol Buffer, which will bypass that analysis, at the expense of needing to 
specify your own encoding and decoding for the type in the Marshal and 
Unmarshal methods.

You can see how to do that, demonstrated in create_test.go for testProto.

[https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/create_test.go#L84]
 

> Unable to store nil value in BigQuery go sdk
> 
>
> Key: BEAM-5728
> URL: https://issues.apache.org/jira/browse/BEAM-5728
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Affects Versions: 2.7.0
> Environment: any
>Reporter: Adrian Witas
>Assignee: Robert Burke
>Priority: Major
>
> Any struct type that uses a pointer type in any field is flagged as invalid: 
> "Invalid underlying type: XXX"
> With a large BigQuery schema (200+ columns) and a large table size, the 
> ability to ingest and enrich data from a file system (GS, local) where many 
> records have partial details with nulls is critical, since it reduces cost: 
> BigQuery does not charge for NULL, but does for 0 and "empty" messages.





[jira] [Created] (BEAM-5734) RedisIO: finishBundle calls Jedis.exec without checking if there are operations in the pipeline

2018-10-12 Thread Cristian (JIRA)
Cristian created BEAM-5734:
--

 Summary: RedisIO: finishBundle calls Jedis.exec without checking 
if there are operations in the pipeline
 Key: BEAM-5734
 URL: https://issues.apache.org/jira/browse/BEAM-5734
 Project: Beam
  Issue Type: Bug
  Components: io-java-redis
Reporter: Cristian
Assignee: Jean-Baptiste Onofré


It throws:

 
{code:java}
org.apache.beam.sdk.Pipeline$PipelineExecutionException: redis.clients.jedis.exceptions.JedisDataException: EXEC without MULTI
at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:332)
at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:197)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)
at org.apache.beam.sdk.io.redis.RedisIOTest.testWriteReadUsingDefaultAppendMethod(RedisIOTest.java:100)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
at org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155

[jira] [Updated] (BEAM-5734) RedisIO: finishBundle calls Jedis.exec without checking if there are operations in the pipeline

2018-10-12 Thread Cristian (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cristian updated BEAM-5734:
---
Description: 
It throws:

 
{code:java}
org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
redis.clients.jedis.exceptions.JedisDataException: EXEC without MULTI
at 
org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:332)
at 
org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:197)
at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)
at 
org.apache.beam.sdk.io.redis.RedisIOTest.testWriteReadUsingDefaultAppendMethod(RedisIOTest.java:100)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
at org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:106)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:117)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:155)
at 
org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:137)
at 
org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)
at 
org.gradle.internal.concurrent.ExecutorPolicy

[jira] [Work logged] (BEAM-5637) Python support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5637?focusedWorklogId=154043=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154043
 ]

ASF GitHub Bot logged work on BEAM-5637:


Author: ASF GitHub Bot
Created on: 12/Oct/18 21:52
Start Date: 12/Oct/18 21:52
Worklog Time Spent: 10m 
  Work Description: HuangLED edited a comment on issue #6680: [BEAM-5637] 
Python support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6680#issuecomment-429468108
 
 
   R:  @herohde 
   cc: @boyuanzz @pabloem 
   
   Addressed.  Also, option definition moved to WorkerOptions. 
   
   Thanks to Boyuan for pointing out the right place for error message. 
   
   




Issue Time Tracking
---

Worklog Id: (was: 154043)
Time Spent: 2h 20m  (was: 2h 10m)

> Python support for custom dataflow worker jar
> -
>
> Key: BEAM-5637
> URL: https://issues.apache.org/jira/browse/BEAM-5637
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Python jobs. That requires a change to the Python 
> boot code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/python/container/boot.go#L104





[jira] [Work logged] (BEAM-5621) Several tests fail on Python 3 with TypeError: unorderable types: str() < int()

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5621?focusedWorklogId=154042=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154042
 ]

ASF GitHub Bot logged work on BEAM-5621:


Author: ASF GitHub Bot
Created on: 12/Oct/18 21:47
Start Date: 12/Oct/18 21:47
Worklog Time Spent: 10m 
  Work Description: tvalentyn edited a comment on issue #6602: [BEAM-5621] 
Fix unorderable types in python 3
URL: https://github.com/apache/beam/pull/6602#issuecomment-429468709
 
 
   Another somewhat related observation:
   The following snippet fails on Python 2 after this PR (in the Direct 
Runner), but will pass in Python 3, where there is no distinction between 
`int` and `long`.
   
   ```
   p = TestPipeline(options=PipelineOptions(pipeline_args))
   input_data = p | beam.Create([1, -2])   # This becomes [1, -2L]! (Unrelated to this PR.)
   expected_result = [-2, 1]  
   assert_that(input_data, equal_to(expected_result)) 
   ```
   ```
   apache_beam.testing.util.BeamAssertException: Failed assert: [-2, 1] == [1, 
-2L] [while running 'assert_that/Match']
   ```
   Do we know why negatives are represented as longs?




Issue Time Tracking
---

Worklog Id: (was: 154042)
Time Spent: 2h 20m  (was: 2h 10m)

> Several tests fail on Python 3 with TypeError: unorderable types: str() < 
> int()
> ---
>
> Key: BEAM-5621
> URL: https://issues.apache.org/jira/browse/BEAM-5621
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Juta Staes
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> ==
> ERROR: test_remove_duplicates 
> (apache_beam.transforms.ptransform_test.PTransformTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 677, in process
> self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 414, in invoke_process
> windowed_value, self.process_method(windowed_value.value))
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1068, in 
> wrapper = lambda x: [fn(x)]
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
>  line 115, in _equal
> sorted_expected = sorted(expected)
> TypeError: unorderable types: str() < int()
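
The failure in the traceback above comes from calling `sorted` on a list that mixes `str` and `int`, which Python 3 refuses to order. The following is a minimal sketch of one way to compare two collections regardless of order without relying on ordering; `equal_unordered` is a hypothetical helper written for illustration and is not presented as the change made in the pull request.

```python
def equal_unordered(expected, actual):
    """Returns True if both iterables hold the same elements, ignoring order."""
    try:
        # Fast path: works whenever all elements are mutually orderable.
        return sorted(expected) == sorted(actual)
    except TypeError:
        # Python 3 cannot order e.g. str and int, so fall back to
        # removing matches one at a time using equality only.
        remaining = list(actual)
        for item in expected:
            try:
                remaining.remove(item)
            except ValueError:
                return False
        return not remaining
```

The fallback is quadratic in the list length, which is usually acceptable for the small expected-value lists used in test assertions.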





[jira] [Created] (BEAM-5733) Pushdown filter to table scan

2018-10-12 Thread Rui Wang (JIRA)
Rui Wang created BEAM-5733:
--

 Summary: Pushdown filter to table scan
 Key: BEAM-5733
 URL: https://issues.apache.org/jira/browse/BEAM-5733
 Project: Beam
  Issue Type: Sub-task
  Components: dsl-sql
Reporter: Rui Wang
Assignee: Rui Wang








[jira] [Work logged] (BEAM-5653) Dataflow FnApi Worker overrides some of Coders due to coder ID generation collision.

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5653?focusedWorklogId=154039=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154039
 ]

ASF GitHub Bot logged work on BEAM-5653:


Author: ASF GitHub Bot
Created on: 12/Oct/18 21:39
Start Date: 12/Oct/18 21:39
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #6649: [BEAM-5653] Fix 
overriding coders due to duplicate coderId generation
URL: https://github.com/apache/beam/pull/6649#issuecomment-429470182
 
 
   I merged the purported fix for that, so you should have better luck.




Issue Time Tracking
---

Worklog Id: (was: 154039)
Time Spent: 4h  (was: 3h 50m)
Remaining Estimate: 68h  (was: 68h 10m)

> Dataflow FnApi Worker overrides some of Coders due to coder ID generation 
> collision.
> 
>
> Key: BEAM-5653
> URL: https://issues.apache.org/jira/browse/BEAM-5653
> Project: Beam
>  Issue Type: Test
>  Components: java-fn-execution
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Blocker
> Fix For: 2.8.0
>
>   Original Estimate: 72h
>  Time Spent: 4h
>  Remaining Estimate: 68h
>
> Due to one of the latest refactorings, there is a bug in the Java FnApi Worker: it 
> overrides Coders in the ProcessBundleDescriptor sent to the SDK Harness, which causes 
> jobs to fail.
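
The description does not say how the colliding IDs were produced, so the following is only a generic illustration of the failure mode: if new IDs are generated without checking the IDs already registered, a later registration silently overwrites an earlier entry. `register_coder` and the `generated_coder_N` naming are hypothetical and are not the worker's actual code; the coder values are plain strings standing in for real coders.

```python
def register_coder(coders, coder, prefix="generated_coder_"):
    """Store coder under a fresh ID that is not already a key in coders."""
    index = len(coders)
    coder_id = f"{prefix}{index}"
    # Skip over IDs that are already taken instead of overwriting them.
    while coder_id in coders:
        index += 1
        coder_id = f"{prefix}{index}"
    coders[coder_id] = coder
    return coder_id


# A naive "prefix + len(coders)" scheme would reuse generated_coder_1 here.
coders = {"generated_coder_1": "VarIntCoder"}
new_id = register_coder(coders, "StringUtf8Coder")
```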





[jira] [Resolved] (BEAM-5709) Tests in BeamFnControlServiceTest are flaky.

2018-10-12 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-5709.
---
   Resolution: Fixed
Fix Version/s: Not applicable

> Tests in BeamFnControlServiceTest are flaky.
> 
>
> Key: BEAM-5709
> URL: https://issues.apache.org/jira/browse/BEAM-5709
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/test/java/org/apache/beam/runners/dataflow/worker/fn/BeamFnControlServiceTest.java
> Tests for BeamFnControlService are currently flaky. The test attempts to 
> verify that onCompleted was called on the mock streams, but that function 
> gets called on a separate thread, so occasionally the function will not have 
> been called yet, despite the server being shut down.
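
The race described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the Java test itself: `RecordingStream` and `shut_down_async` are hypothetical stand-ins for the mock stream and for the server invoking the callback on another thread.

```python
import threading


class RecordingStream:
    """Stand-in for a mock stream that records whether it was completed."""

    def __init__(self):
        self.completed = threading.Event()

    def on_completed(self):
        self.completed.set()


def shut_down_async(stream):
    """Invoke on_completed on a separate thread, as the description says the server does."""
    worker = threading.Thread(target=stream.on_completed)
    worker.start()
    return worker


stream = RecordingStream()
worker = shut_down_async(stream)
# Checking stream.completed.is_set() right here would be racy.
# Waiting with a timeout makes the check deterministic.
completed_in_time = stream.completed.wait(timeout=5.0)
worker.join()
```

The point is that the assertion waits for the callback, bounded by a timeout, instead of assuming it has already run once shutdown returns.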





[jira] [Work logged] (BEAM-5621) Several tests fail on Python 3 with TypeError: unorderable types: str() < int()

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5621?focusedWorklogId=154037=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154037
 ]

ASF GitHub Bot logged work on BEAM-5621:


Author: ASF GitHub Bot
Created on: 12/Oct/18 21:33
Start Date: 12/Oct/18 21:33
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #6602: [BEAM-5621] Fix 
unorderable types in python 3
URL: https://github.com/apache/beam/pull/6602#issuecomment-429468709
 
 
   Another somewhat related observation:
   The following snippet fails on Python 2 after this PR (in the Direct Runner), but 
passes on Python 3, where there is no distinction between `int` and `long`.
   
   ```
   p = TestPipeline(options=PipelineOptions(pipeline_args))
   input_data = p | beam.Create([1, -2])   # This becomes a [1, -2L]! (Unrelated to this PR.)
   expected_result = [-2, 1]  
   assert_that(input_data, equal_to(expected_result)) 
   ```
   Do we know why negatives are represented as longs?




Issue Time Tracking
---

Worklog Id: (was: 154037)
Time Spent: 2h 10m  (was: 2h)

> Several tests fail on Python 3 with TypeError: unorderable types: str() < 
> int()
> ---
>
> Key: BEAM-5621
> URL: https://issues.apache.org/jira/browse/BEAM-5621
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Juta Staes
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> ==
> ERROR: test_remove_duplicates 
> (apache_beam.transforms.ptransform_test.PTransformTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 677, in process
> self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 414, in invoke_process
> windowed_value, self.process_method(windowed_value.value))
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1068, in 
> wrapper = lambda x: [fn(x)]
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
>  line 115, in _equal
> sorted_expected = sorted(expected)
> TypeError: unorderable types: str() < int()





[jira] [Work logged] (BEAM-5637) Python support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5637?focusedWorklogId=154036=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154036
 ]

ASF GitHub Bot logged work on BEAM-5637:


Author: ASF GitHub Bot
Created on: 12/Oct/18 21:30
Start Date: 12/Oct/18 21:30
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #6680: [BEAM-5637] Python 
support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6680#issuecomment-429468108
 
 
   R: @boyuanzz @herohde @pabloem 
   
   Addressed.  Also, option definition moved to WorkerOptions. 
   
   Thanks to Boyuan for pointing out the right place for error message. 
   
   




Issue Time Tracking
---

Worklog Id: (was: 154036)
Time Spent: 2h 10m  (was: 2h)

> Python support for custom dataflow worker jar
> -
>
> Key: BEAM-5637
> URL: https://issues.apache.org/jira/browse/BEAM-5637
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Python jobs. That requires a change to the Python 
> boot code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/python/container/boot.go#L104





[jira] [Created] (BEAM-5732) expose runner mode to user through samza pipeline option

2018-10-12 Thread Hai (JIRA)
Hai created BEAM-5732:
-

 Summary: expose runner mode to user through samza pipeline option
 Key: BEAM-5732
 URL: https://issues.apache.org/jira/browse/BEAM-5732
 Project: Beam
  Issue Type: Improvement
  Components: runner-samza
Reporter: Hai
Assignee: Xinyu Liu


We should expose the runner mode to users through a Samza pipeline option so that users 
can decide whether to start a Samza job in local mode or remote mode.

This should work consistently in both the Java runner and the portable runner.





[jira] [Work logged] (BEAM-5636) Java support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5636?focusedWorklogId=154028=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154028
 ]

ASF GitHub Bot logged work on BEAM-5636:


Author: ASF GitHub Bot
Created on: 12/Oct/18 21:00
Start Date: 12/Oct/18 21:00
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on a change in pull request #6665: 
[BEAM-5636] Java support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6665#discussion_r224913769
 
 

 ##
 File path: sdks/java/container/boot.go
 ##
 @@ -104,6 +104,9 @@ func main() {
filepath.Join(jarsDir, "beam-sdks-java-harness.jar"),
}
for _, md := range artifacts {
+   if strings.HasPrefix(md.Name, "beam-runners-google-cloud-dataflow-java-fn-api-worker") {
 
 Review comment:
   The purpose here is that if there is a Java worker jar in the artifacts, that jar 
should not be included in the SDK harness classpath, so it seems we don't 
need to check the experiment. WDYT?
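
The logic under discussion is a filter over staged artifact names. It is sketched here in Python rather than Go purely for illustration. `build_classpath` and the sample file names in the usage are assumptions; only the prefix string is taken from the diff above.

```python
import os

# Prefix taken from the diff under review; everything else is illustrative.
WORKER_JAR_PREFIX = "beam-runners-google-cloud-dataflow-java-fn-api-worker"


def build_classpath(jars_dir, artifact_names, base_jars):
    """Join base jars and staged artifacts into a classpath, skipping the worker jar."""
    entries = [os.path.join(jars_dir, name) for name in base_jars]
    for name in artifact_names:
        if name.startswith(WORKER_JAR_PREFIX):
            # The worker jar belongs to the runner, not to the SDK harness.
            continue
        entries.append(os.path.join(jars_dir, name))
    return os.pathsep.join(entries)
```

Whether this check should additionally be gated on an experiment is the open question in the review thread; the sketch applies it unconditionally.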




Issue Time Tracking
---

Worklog Id: (was: 154028)
Time Spent: 20m  (was: 10m)

> Java support for custom dataflow worker jar
> ---
>
> Key: BEAM-5636
> URL: https://issues.apache.org/jira/browse/BEAM-5636
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Henning Rohde
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Java jobs. That requires a change to the Java boot 
> code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/java/container/boot.go#L107





[jira] [Work logged] (BEAM-5637) Python support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5637?focusedWorklogId=154027=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154027
 ]

ASF GitHub Bot logged work on BEAM-5637:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:59
Start Date: 12/Oct/18 20:59
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #6680: [BEAM-5637] Python 
support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6680#issuecomment-429460451
 
 
   Run Python PostCommit




Issue Time Tracking
---

Worklog Id: (was: 154027)
Time Spent: 2h  (was: 1h 50m)

> Python support for custom dataflow worker jar
> -
>
> Key: BEAM-5637
> URL: https://issues.apache.org/jira/browse/BEAM-5637
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Python jobs. That requires a change to the Python 
> boot code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/python/container/boot.go#L104





[jira] [Work logged] (BEAM-5653) Dataflow FnApi Worker overrides some of Coders due to coder ID generation collision.

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5653?focusedWorklogId=154025=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154025
 ]

ASF GitHub Bot logged work on BEAM-5653:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:49
Start Date: 12/Oct/18 20:49
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #6649: [BEAM-5653] Fix 
overriding coders due to duplicate coderId generation
URL: https://github.com/apache/beam/pull/6649#issuecomment-429457986
 
 
   Precommits fail due to a flaky test; see 
[BEAM-5709](https://issues.apache.org/jira/browse/BEAM-5709).
   Successful precommit run: 
https://builds.apache.org/job/beam_PreCommit_Java_Phrase/318/
   Failing precommit run: 
https://builds.apache.org/job/beam_PreCommit_Java_Phrase/319/
   
   I accidentally started the precommit twice in a row.
   
   Can we merge this? I feel that would be safer than trying to get a green 
precommit.




Issue Time Tracking
---

Worklog Id: (was: 154025)
Time Spent: 3h 50m  (was: 3h 40m)
Remaining Estimate: 68h 10m  (was: 68h 20m)

> Dataflow FnApi Worker overrides some of Coders due to coder ID generation 
> collision.
> 
>
> Key: BEAM-5653
> URL: https://issues.apache.org/jira/browse/BEAM-5653
> Project: Beam
>  Issue Type: Test
>  Components: java-fn-execution
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Blocker
> Fix For: 2.8.0
>
>   Original Estimate: 72h
>  Time Spent: 3h 50m
>  Remaining Estimate: 68h 10m
>
> Due to one of the latest refactorings, there is a bug in the Java FnApi Worker: it 
> overrides Coders in the ProcessBundleDescriptor sent to the SDK Harness, which causes 
> jobs to fail.





[jira] [Work logged] (BEAM-5637) Python support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5637?focusedWorklogId=154024=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154024
 ]

ASF GitHub Bot logged work on BEAM-5637:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:47
Start Date: 12/Oct/18 20:47
Worklog Time Spent: 10m 
  Work Description: HuangLED opened a new pull request #6680: [BEAM-5637] 
Python support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6680
 
 
   Python support for a custom worker jar (as a staged file). 
   
   Tested positive and negative cases by starting actual jobs.
   
   PreCommit passes locally. 
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [X ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   




Issue Time Tracking
---

Worklog Id: (was: 154024)
Time Spent: 1h 50m  (was: 1h 40m)

> Python support for custom dataflow worker jar
> -
>
> Key: BEAM-5637
> URL: https://issues.apache.org/jira/browse/BEAM-5637
> Project: Bea

[jira] [Work logged] (BEAM-5636) Java support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5636?focusedWorklogId=154022=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154022
 ]

ASF GitHub Bot logged work on BEAM-5636:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:46
Start Date: 12/Oct/18 20:46
Worklog Time Spent: 10m 
  Work Description: herohde commented on a change in pull request #6665: 
[BEAM-5636] Java support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6665#discussion_r224910454
 
 

 ##
 File path: sdks/java/container/boot.go
 ##
 @@ -104,6 +104,9 @@ func main() {
filepath.Join(jarsDir, "beam-sdks-java-harness.jar"),
}
for _, md := range artifacts {
+   if strings.HasPrefix(md.Name, "beam-runners-google-cloud-dataflow-java-fn-api-worker") {
 
 Review comment:
   We should only make this check if the experiment is set. Also, the name will 
change to "dataflow-worker.jar" when the artifact bug is fixed.




Issue Time Tracking
---

Worklog Id: (was: 154022)
Time Spent: 10m
Remaining Estimate: 0h

> Java support for custom dataflow worker jar
> ---
>
> Key: BEAM-5636
>     URL: https://issues.apache.org/jira/browse/BEAM-5636
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Henning Rohde
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Java jobs. That requires a change to the Java boot 
> code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/java/container/boot.go#L107





[jira] [Work logged] (BEAM-5637) Python support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5637?focusedWorklogId=154021=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154021
 ]

ASF GitHub Bot logged work on BEAM-5637:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:44
Start Date: 12/Oct/18 20:44
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on a change in pull request #6667: 
[BEAM-5637] Python support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6667#discussion_r224910055
 
 

 ##
 File path: sdks/python/apache_beam/options/pipeline_options.py
 ##
 @@ -674,7 +674,12 @@ def _add_argparse_args(cls, parser):
  'job submission, the files will be staged in the staging area '
  '(--staging_location option) and the workers will install them in '
  'same order they were specified on the command line.'))
-
+parser.add_argument(
+'--dataflow_worker_jar',
+dest='dataflow_worker_jar',
+type=str,
+help='Dataflow worker jar.'
+)
 
 Review comment:
   Thanks! The issue is addressed, but the review status in this PR was lost due to my 
sub-optimal git operations. 
   
   Opening another PR. 
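
For reference, the option in the hunk above can be exercised with plain `argparse`, outside of Beam. This is a sketch: the parser here is a standalone stand-in for the options class, and the jar path and `--other_flag` are made up.

```python
import argparse

# Standalone stand-in for the Beam options parser; only the argument
# definition mirrors the hunk above.
parser = argparse.ArgumentParser()
parser.add_argument(
    '--dataflow_worker_jar',
    dest='dataflow_worker_jar',
    type=str,
    default=None,
    help='Dataflow worker jar.')

# parse_known_args leaves flags it does not recognize in the second value.
known, unknown = parser.parse_known_args(
    ['--dataflow_worker_jar', '/tmp/dataflow-worker.jar', '--other_flag'])
```

With no flag given, `dataflow_worker_jar` stays `None`, so downstream code can test it for truthiness before staging anything.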
   
   




Issue Time Tracking
---

Worklog Id: (was: 154021)
Time Spent: 1h 40m  (was: 1.5h)

> Python support for custom dataflow worker jar
> -
>
> Key: BEAM-5637
> URL: https://issues.apache.org/jira/browse/BEAM-5637
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Python jobs. That requires a change to the Python 
> boot code: 
> https://github.com/apache/beam/blob/66d7c865b7267f388ee60752891a9141fad43774/sdks/python/container/boot.go#L104





[jira] [Work logged] (BEAM-5621) Several tests fail on Python 3 with TypeError: unorderable types: str() < int()

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5621?focusedWorklogId=154019=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154019
 ]

ASF GitHub Bot logged work on BEAM-5621:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:42
Start Date: 12/Oct/18 20:42
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #6602: [BEAM-5621] Fix 
unorderable types in python 3
URL: https://github.com/apache/beam/pull/6602#issuecomment-429456279
 
 
   This is clearly a backward-incompatible change; however, I think this is the 
right behavior. In that sense I do not think it is a regression. However, we 
should clearly highlight this in our release notes, blog post, etc.
   
   @tvalentyn Could you create a JIRA, mark it for 2.8.0, explain the change in 
behaviour, and mark it as fixed?




Issue Time Tracking
---

Worklog Id: (was: 154019)
Time Spent: 2h  (was: 1h 50m)

> Several tests fail on Python 3 with TypeError: unorderable types: str() < 
> int()
> ---
>
> Key: BEAM-5621
> URL: https://issues.apache.org/jira/browse/BEAM-5621
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Juta Staes
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> ==
> ERROR: test_remove_duplicates 
> (apache_beam.transforms.ptransform_test.PTransformTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 677, in process
> self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py",
>  line 414, in invoke_process
> windowed_value, self.process_method(windowed_value.value))
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py",
>  line 1068, in 
> wrapper = lambda x: [fn(x)]
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py",
>  line 115, in _equal
> sorted_expected = sorted(expected)
> TypeError: unorderable types: str() < int()





[jira] [Work logged] (BEAM-5637) Python support for custom dataflow worker jar

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5637?focusedWorklogId=154020=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154020
 ]

ASF GitHub Bot logged work on BEAM-5637:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:42
Start Date: 12/Oct/18 20:42
Worklog Time Spent: 10m 
  Work Description: HuangLED closed pull request #6667: [BEAM-5637] Python 
support for custom dataflow worker jar
URL: https://github.com/apache/beam/pull/6667
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/python/apache_beam/options/pipeline_options.py 
b/sdks/python/apache_beam/options/pipeline_options.py
index a172535b100..2c061e0ec52 100644
--- a/sdks/python/apache_beam/options/pipeline_options.py
+++ b/sdks/python/apache_beam/options/pipeline_options.py
@@ -674,7 +674,12 @@ def _add_argparse_args(cls, parser):
  'job submission, the files will be staged in the staging area '
  '(--staging_location option) and the workers will install them in '
  'same order they were specified on the command line.'))
-
+parser.add_argument(
+'--dataflow_worker_jar',
+dest='dataflow_worker_jar',
+type=str,
+help='Dataflow worker jar.'
+)
 
 class PortableOptions(PipelineOptions):
 
diff --git a/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py 
b/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
index 1acd3488524..5be60bd701b 100644
--- a/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
+++ b/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
@@ -381,6 +381,12 @@ def run_pipeline(self, pipeline):
 self.dataflow_client = apiclient.DataflowApplicationClient(
 pipeline._options)
 
+if setup_options.dataflow_worker_jar:
+  experiments = ["use_staged_dataflow_worker_jar"]
+  if debug_options.experiments is not None:
+experiments = list(set(experiments + debug_options.experiments))
+  debug_options.experiments = experiments
+
 # Create the job description and send a request to the service. The result
 # can be None if there is no need to send a request to the service (e.g.
 # template creation). If a request was sent and failed then the call will
diff --git a/sdks/python/apache_beam/runners/portability/stager.py 
b/sdks/python/apache_beam/runners/portability/stager.py
index ef7401ac6aa..e336fd3f9b9 100644
--- a/sdks/python/apache_beam/runners/portability/stager.py
+++ b/sdks/python/apache_beam/runners/portability/stager.py
@@ -123,8 +123,7 @@ def stage_job_resources(self,
 
 Returns:
   A list of file names (no paths) for the resources staged. All the
-  files
-  are assumed to be staged at staging_location.
+  files are assumed to be staged at staging_location.
 
 Raises:
   RuntimeError: If files specified are not found or error encountered
@@ -256,6 +255,13 @@ def stage_job_resources(self,
 'The file "%s" cannot be found. Its location was specified by '
 'the --sdk_location command-line option.' % sdk_path)
 
+if hasattr(setup_options, 'dataflow_worker_jar') and \
+setup_options.dataflow_worker_jar:
+  jar_staged_filename = 'dataflow-worker.jar'
+  staged_path = FileSystems.join(staging_location, jar_staged_filename)
+  self.stage_artifact(setup_options.dataflow_worker_jar, staged_path)
+  resources.append(jar_staged_filename)
+
 # Delete all temp files created while staging job resources.
 shutil.rmtree(temp_dir)
 retrieval_token = self.commit_manifest()
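
The hunk in dataflow_runner.py merges the new experiment into any existing ones through `set`, which removes duplicates but does not preserve the order in which experiments were given. A sketch of an order-preserving variant follows, under the assumption that experiments are a list of strings or `None`. `with_experiment` is a hypothetical helper and is not part of the PR.

```python
def with_experiment(existing, name):
    """Return a new list containing existing experiments plus name, without duplicates."""
    merged = list(existing or [])
    if name not in merged:
        merged.append(name)
    return merged


experiments = with_experiment(
    ['beam_fn_api', 'shuffle_mode=service'],
    'use_staged_dataflow_worker_jar')
```

Calling it again with the same name leaves the list unchanged, so it is safe to apply more than once.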


 




Issue Time Tracking
---

Worklog Id: (was: 154020)
Time Spent: 1.5h  (was: 1h 20m)

> Python support for custom dataflow worker jar
> -
>
> Key: BEAM-5637
> URL: https://issues.apache.org/jira/browse/BEAM-5637
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Assignee: Ruoyun Huang
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> One of the slightly subtle aspects is that we would need to ignore one of the 
> staged jars for portable Pyth

[jira] [Work logged] (BEAM-5653) Dataflow FnApi Worker overrides some of Coders due to coder ID generation collision.

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5653?focusedWorklogId=154016=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154016
 ]

ASF GitHub Bot logged work on BEAM-5653:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:41
Start Date: 12/Oct/18 20:41
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #6649: [BEAM-5653] Fix 
overriding coders due to duplicate coderId generation
URL: https://github.com/apache/beam/pull/6649#issuecomment-429456093
 
 
   run java precommit




Issue Time Tracking
---

Worklog Id: (was: 154016)
Time Spent: 3h 20m  (was: 3h 10m)
Remaining Estimate: 68h 40m  (was: 68h 50m)

> Dataflow FnApi Worker overrides some of Coders due to coder ID generation 
> collision.
> 
>
> Key: BEAM-5653
> URL: https://issues.apache.org/jira/browse/BEAM-5653
> Project: Beam
>  Issue Type: Test
>  Components: java-fn-execution
> Reporter: Mikhail Gryzykhin
> Assignee: Mikhail Gryzykhin
> Priority: Blocker
> Fix For: 2.8.0
>
>   Original Estimate: 72h
>  Time Spent: 3h 20m
>  Remaining Estimate: 68h 40m
>
> Due to one of the latest refactorings, we got a bug in the Java FnApi Worker: 
> it overrides Coders in the ProcessBundleDescriptor sent to the SDK Harness, 
> which causes jobs to fail.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5653) Dataflow FnApi Worker overrides some of Coders due to coder ID generation collision.

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5653?focusedWorklogId=154018&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154018
 ]

ASF GitHub Bot logged work on BEAM-5653:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:41
Start Date: 12/Oct/18 20:41
Worklog Time Spent: 10m 
  Work Description: Ardagan removed a comment on issue #6649: [BEAM-5653] 
Fix overriding coders due to duplicate coderId generation
URL: https://github.com/apache/beam/pull/6649#issuecomment-429396509
 
 
   run java precommit




Issue Time Tracking
---

Worklog Id: (was: 154018)
Time Spent: 3h 40m  (was: 3.5h)
Remaining Estimate: 68h 20m  (was: 68.5h)

> Dataflow FnApi Worker overrides some of Coders due to coder ID generation 
> collision.
> 
>
> Key: BEAM-5653
> URL: https://issues.apache.org/jira/browse/BEAM-5653
> Project: Beam
>  Issue Type: Test
>  Components: java-fn-execution
> Reporter: Mikhail Gryzykhin
> Assignee: Mikhail Gryzykhin
> Priority: Blocker
> Fix For: 2.8.0
>
>   Original Estimate: 72h
>  Time Spent: 3h 40m
>  Remaining Estimate: 68h 20m
>
> Due to one of the latest refactorings, we got a bug in the Java FnApi Worker: 
> it overrides Coders in the ProcessBundleDescriptor sent to the SDK Harness, 
> which causes jobs to fail.





[jira] [Work logged] (BEAM-5653) Dataflow FnApi Worker overrides some of Coders due to coder ID generation collision.

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5653?focusedWorklogId=154017&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154017
 ]

ASF GitHub Bot logged work on BEAM-5653:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:41
Start Date: 12/Oct/18 20:41
Worklog Time Spent: 10m 
  Work Description: Ardagan removed a comment on issue #6649: [BEAM-5653] 
Fix overriding coders due to duplicate coderId generation
URL: https://github.com/apache/beam/pull/6649#issuecomment-429456093
 
 
   run java precommit




Issue Time Tracking
---

Worklog Id: (was: 154017)
Time Spent: 3.5h  (was: 3h 20m)
Remaining Estimate: 68.5h  (was: 68h 40m)

> Dataflow FnApi Worker overrides some of Coders due to coder ID generation 
> collision.
> 
>
> Key: BEAM-5653
> URL: https://issues.apache.org/jira/browse/BEAM-5653
> Project: Beam
>  Issue Type: Test
>  Components: java-fn-execution
> Reporter: Mikhail Gryzykhin
> Assignee: Mikhail Gryzykhin
> Priority: Blocker
> Fix For: 2.8.0
>
>   Original Estimate: 72h
>  Time Spent: 3.5h
>  Remaining Estimate: 68.5h
>
> Due to one of the latest refactorings, we got a bug in the Java FnApi Worker: 
> it overrides Coders in the ProcessBundleDescriptor sent to the SDK Harness, 
> which causes jobs to fail.





[jira] [Work logged] (BEAM-1251) Python 3 Support

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-1251?focusedWorklogId=154015&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154015
 ]

ASF GitHub Bot logged work on BEAM-1251:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:37
Start Date: 12/Oct/18 20:37
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #6679: [BEAM-1251] Add a 
link to Python 3 Conversion Quick Start Guide to the list of ongoing efforts on 
Beam site.
URL: https://github.com/apache/beam/pull/6679#issuecomment-429455025
 
 
   Hey @swegner, is this the correct way to change the Beam site? Thanks. 




Issue Time Tracking
---

Worklog Id: (was: 154015)
Time Spent: 22h 40m  (was: 22.5h)

> Python 3 Support
> 
>
> Key: BEAM-1251
> URL: https://issues.apache.org/jira/browse/BEAM-1251
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
> Reporter: Eyad Sibai
> Assignee: Robbe
> Priority: Major
>  Time Spent: 22h 40m
>  Remaining Estimate: 0h
>
> I have been trying to use Google Datalab with Python 3. As far as I can see, 
> several packages that Google Datalab depends on do not support Python 3 yet. 
> This is one of them.
> https://github.com/GoogleCloudPlatform/DataflowPythonSDK/issues/6





[jira] [Work logged] (BEAM-1251) Python 3 Support

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-1251?focusedWorklogId=154014&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154014
 ]

ASF GitHub Bot logged work on BEAM-1251:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:34
Start Date: 12/Oct/18 20:34
Worklog Time Spent: 10m 
  Work Description: tvalentyn opened a new pull request #6679: [BEAM-1251] 
Add a link to Python 3 Conversion Quick Start Guide to the list of ongoing 
efforts on Beam site.
URL: https://github.com/apache/beam/pull/6679
 
 
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/) | --- | --- | --- | --- | --- | ---
   Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/) [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/) | --- | --- | ---
   
   
   
   
   




Issue Time Tracking
---

Worklog Id: (was: 154014)
Time Spent: 22.5h  (was: 22h 20m)

> Python 3 Support
> 
>
> Key: BEAM-1251
> URL: https://issues.apache.org/jira/browse/BEAM-1251
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
> Reporter: Eyad Sibai
> Assignee: Robbe
>Prio

[jira] [Work logged] (BEAM-1251) Python 3 Support

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-1251?focusedWorklogId=154013&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154013
 ]

ASF GitHub Bot logged work on BEAM-1251:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:32
Start Date: 12/Oct/18 20:32
Worklog Time Spent: 10m 
  Work Description: tvalentyn opened a new pull request #6678: [BEAM-1251] 
Add a link to Python 3 Conversion Quick Start Guide to the list of ongoing 
efforts on Beam site.
URL: https://github.com/apache/beam/pull/6678
 
 
   
   
   
   
   
   
   
   




Issue Time Tracking
---

Worklog Id: (was: 154013)
Time Spent: 22h 20m  (was: 22h 10m)

> Python 3 Support
> 
>
> Key: BEAM-1251
> URL: https://issues.apache.org/jira/browse/BEAM-1251
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
> Reporter: Eyad Sibai
> Assignee: Robbe
>Prio

[jira] [Work logged] (BEAM-5621) Several tests fail on Python 3 with TypeError: unorderable types: str() < int()

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5621?focusedWorklogId=154011&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154011
 ]

ASF GitHub Bot logged work on BEAM-5621:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:26
Start Date: 12/Oct/18 20:26
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on issue #6602: [BEAM-5621] 
Fix unorderable types in python 3
URL: https://github.com/apache/beam/pull/6602#issuecomment-429452493
 
 
   CC: @Juta @robertwb @aaltay 




Issue Time Tracking
---

Worklog Id: (was: 154011)
Time Spent: 1h 50m  (was: 1h 40m)

> Several tests fail on Python 3 with TypeError: unorderable types: str() < 
> int()
> ---
>
> Key: BEAM-5621
> URL: https://issues.apache.org/jira/browse/BEAM-5621
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: Juta Staes
> Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> ==
> ERROR: test_remove_duplicates (apache_beam.transforms.ptransform_test.PTransformTest)
> --
> Traceback (most recent call last):
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py", line 677, in process
>     self.do_fn_invoker.invoke_process(windowed_value)
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py", line 414, in invoke_process
>     windowed_value, self.process_method(windowed_value.value))
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py", line 1068, in <lambda>
>     wrapper = lambda x: [fn(x)]
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py", line 115, in _equal
>     sorted_expected = sorted(expected)
> TypeError: unorderable types: str() < int()
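
The traceback above comes down to Python 3 refusing to order values of unrelated types. A minimal sketch of that behavior, independent of Beam (the helper name `try_sort` is made up for illustration and is not part of any Beam API):

```python
def try_sort(values):
    """Return the sorted list, or the error text if the values cannot be ordered."""
    try:
        return sorted(values)
    except TypeError as exc:
        # Python 3 raises here for mixed types such as str and int;
        # the exact message wording varies between Python 3 versions.
        return "TypeError: %s" % exc


same_type = try_sort([3, 1, 2])   # orderable: all ints
mixed_type = try_sort(['a', 1])   # not orderable on Python 3
print(same_type)
print(mixed_type)
```

On Python 2 the mixed list would sort without error, which is presumably why the tests only fail on Python 3.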





[jira] [Work logged] (BEAM-5621) Several tests fail on Python 3 with TypeError: unorderable types: str() < int()

2018-10-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5621?focusedWorklogId=154010&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-154010
 ]

ASF GitHub Bot logged work on BEAM-5621:


Author: ASF GitHub Bot
Created on: 12/Oct/18 20:26
Start Date: 12/Oct/18 20:26
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on issue #6602: [BEAM-5621] 
Fix unorderable types in python 3
URL: https://github.com/apache/beam/pull/6602#issuecomment-429452434
 
 
   Because we now sort by type, we may see different behavior when mixing 
different string types.  For example, previously 
`assert_that(equal_to(['a', u'b', b'c'], ['a', 'b', 'c']))` worked, but now it 
may not, because the sorting order now depends on the exact type (i.e. the 
sorting may produce `[u'b', 'a', b'c']`) even for orderable types.  Should we 
consider this a regression?
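
To illustrate the concern, here is a sketch of a type-aware sort on Python 3. It assumes the general approach of ordering by type name first and then by value; it is not the code from the pull request, and `type_aware_key` and `sort_by_type` are hypothetical names.

```python
def type_aware_key(value):
    # Hypothetical key: group values by type name, then order within each type.
    # Values of different types are never compared with each other directly.
    return (type(value).__name__, value)


def sort_by_type(values):
    return sorted(values, key=type_aware_key)


# Mixed str/int input no longer raises TypeError on Python 3.
mixed = sort_by_type(['b', 1, 'a', 2])

# Side effect: bytes and str land in different groups, so two lists that
# look alike can end up in a different order after sorting.
actual = sort_by_type(['a', u'b', b'c'])
expected = sort_by_type(['a', 'b', 'c'])
print(mixed, actual, expected)
```

Under this assumed key, `actual` puts the bytes value first while `expected` stays in alphabetical order, so a position-by-position comparison of the two sorted lists would fail.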




Issue Time Tracking
---

Worklog Id: (was: 154010)
Time Spent: 1h 40m  (was: 1.5h)

> Several tests fail on Python 3 with TypeError: unorderable types: str() < 
> int()
> ---
>
> Key: BEAM-5621
> URL: https://issues.apache.org/jira/browse/BEAM-5621
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
> Reporter: Valentyn Tymofieiev
> Assignee: Juta Staes
> Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> ==
> ERROR: test_remove_duplicates (apache_beam.transforms.ptransform_test.PTransformTest)
> --
> Traceback (most recent call last):
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py", line 677, in process
>     self.do_fn_invoker.invoke_process(windowed_value)
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/runners/common.py", line 414, in invoke_process
>     windowed_value, self.process_method(windowed_value.value))
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/transforms/core.py", line 1068, in <lambda>
>     wrapper = lambda x: [fn(x)]
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/testing/util.py", line 115, in _equal
>     sorted_expected = sorted(expected)
> TypeError: unorderable types: str() < int()




