[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407367=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407367
 ]

ASF GitHub Bot logged work on BEAM-9542:


Author: ASF GitHub Bot
Created on: 21/Mar/20 03:34
Start Date: 21/Mar/20 03:34
Worklog Time Spent: 10m 
  Work Description: suztomo commented on issue #11168: [BEAM-9542] Limit 
and clarify the effect of "force" in Java build
URL: https://github.com/apache/beam/pull/11168#issuecomment-601987472
 
 
   Because the effect of `force` is only for tests. The compileJava 
configuration still fetches the latest google-api-client, which is 1.30.8, 
which has a bug of android annotation dependency.
   
   ```
   17:21:41 Execution failed for task 
':sdks:java:testing:test-utils:compileJava'.
   17:21:41 > Could not resolve all files for configuration 
':sdks:java:testing:test-utils:compileClasspath'.
   17:21:41> Could not find androidx.annotation:annotation:1.1.0.
   17:21:41  Searched in the following locations:
   17:21:41- 
file:/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_PR/src/sdks/java/testing/test-utils/offline-repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
file:/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_PR/src/sdks/java/testing/test-utils/offline-repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://repo.maven.apache.org/maven2/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://repo.maven.apache.org/maven2/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
file:/home/jenkins/.m2/repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
file:/home/jenkins/.m2/repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://jcenter.bintray.com/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://jcenter.bintray.com/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://oss.sonatype.org/content/repositories/staging/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://oss.sonatype.org/content/repositories/staging/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://repository.apache.org/snapshots/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://repository.apache.org/snapshots/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://repository.apache.org/content/repositories/releases/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://repository.apache.org/content/repositories/releases/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41  Required by:
   17:21:41  project :sdks:java:testing:test-utils > project 
:sdks:java:extensions:google-cloud-platform-core > 
com.google.api-client:google-api-client:1.30.8
   ```
   
   Maybe I'll need to upgrade google-api-client to the latest version that has 
the fix.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407367)
Time Spent: 4h 40m  (was: 4.5h)

> Where the BeamModulePlugin's force is needed?
> -
>
> Key: BEAM-9542
> URL: https://issues.apache.org/jira/browse/BEAM-9542
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735
> {noformat}
> > Task :sdks:java:core:compileTestJava FAILED
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> Note: Some input files use or override a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 2 errors
> <-> 36% EXECUTING [19m 37s]
> {noformat}
> 

[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407366=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407366
 ]

ASF GitHub Bot logged work on BEAM-9542:


Author: ASF GitHub Bot
Created on: 21/Mar/20 03:34
Start Date: 21/Mar/20 03:34
Worklog Time Spent: 10m 
  Work Description: suztomo commented on issue #11168: [BEAM-9542] Limit 
and clarify the effect of "force" in Java build
URL: https://github.com/apache/beam/pull/11168#issuecomment-601987472
 
 
   Because the effect of `force` is only for tests. The compileJava 
configuration still fetches the latest google-api-client, which is 1.30.8, 
which has a bug of android annotation dependency.
   
   ```
   17:21:41 Execution failed for task 
':sdks:java:testing:test-utils:compileJava'.
   17:21:41 > Could not resolve all files for configuration 
':sdks:java:testing:test-utils:compileClasspath'.
   17:21:41> Could not find androidx.annotation:annotation:1.1.0.
   17:21:41  Searched in the following locations:
   17:21:41- 
file:/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_PR/src/sdks/java/testing/test-utils/offline-repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
file:/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_PR/src/sdks/java/testing/test-utils/offline-repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://repo.maven.apache.org/maven2/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://repo.maven.apache.org/maven2/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
file:/home/jenkins/.m2/repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
file:/home/jenkins/.m2/repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://jcenter.bintray.com/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://jcenter.bintray.com/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://oss.sonatype.org/content/repositories/staging/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://oss.sonatype.org/content/repositories/staging/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://repository.apache.org/snapshots/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://repository.apache.org/snapshots/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://repository.apache.org/content/repositories/releases/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://repository.apache.org/content/repositories/releases/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41  Required by:
   17:21:41  project :sdks:java:testing:test-utils > project 
:sdks:java:extensions:google-cloud-platform-core > 
com.google.api-client:google-api-client:1.30.8
   ```
   
   Maybe I'll need to update google-api-client.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407366)
Time Spent: 4.5h  (was: 4h 20m)

> Where the BeamModulePlugin's force is needed?
> -
>
> Key: BEAM-9542
> URL: https://issues.apache.org/jira/browse/BEAM-9542
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735
> {noformat}
> > Task :sdks:java:core:compileTestJava FAILED
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> Note: Some input files use or override a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 2 errors
> <-> 36% EXECUTING [19m 37s]
> {noformat}
> Memo for my Mac:
> {noformat}
> 

[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407365=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407365
 ]

ASF GitHub Bot logged work on BEAM-9542:


Author: ASF GitHub Bot
Created on: 21/Mar/20 03:33
Start Date: 21/Mar/20 03:33
Worklog Time Spent: 10m 
  Work Description: suztomo commented on issue #11168: [BEAM-9542] Limit 
and clarify the effect of "force" in Java build
URL: https://github.com/apache/beam/pull/11168#issuecomment-601987472
 
 
   Because the effect of `force` is only for tests. The compileJava 
configuration still fetches the latest google-cloud-api, which is 1.30.8, which 
has a bug of android annotation dependency.
   
   ```
   17:21:41 Execution failed for task 
':sdks:java:testing:test-utils:compileJava'.
   17:21:41 > Could not resolve all files for configuration 
':sdks:java:testing:test-utils:compileClasspath'.
   17:21:41> Could not find androidx.annotation:annotation:1.1.0.
   17:21:41  Searched in the following locations:
   17:21:41- 
file:/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_PR/src/sdks/java/testing/test-utils/offline-repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
file:/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_PR/src/sdks/java/testing/test-utils/offline-repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://repo.maven.apache.org/maven2/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://repo.maven.apache.org/maven2/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
file:/home/jenkins/.m2/repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
file:/home/jenkins/.m2/repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://jcenter.bintray.com/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://jcenter.bintray.com/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://oss.sonatype.org/content/repositories/staging/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://oss.sonatype.org/content/repositories/staging/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://repository.apache.org/snapshots/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://repository.apache.org/snapshots/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41- 
https://repository.apache.org/content/repositories/releases/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom
   17:21:41- 
https://repository.apache.org/content/repositories/releases/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar
   17:21:41  Required by:
   17:21:41  project :sdks:java:testing:test-utils > project 
:sdks:java:extensions:google-cloud-platform-core > 
com.google.api-client:google-api-client:1.30.8
   ```
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407365)
Time Spent: 4h 20m  (was: 4h 10m)

> Where the BeamModulePlugin's force is needed?
> -
>
> Key: BEAM-9542
> URL: https://issues.apache.org/jira/browse/BEAM-9542
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735
> {noformat}
> > Task :sdks:java:core:compileTestJava FAILED
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> Note: Some input files use or override a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 2 errors
> <-> 36% EXECUTING [19m 37s]
> {noformat}
> Memo for my Mac:
> {noformat}
> suztomo-macbookpro44% ./gradlew -p sdks/java check -x 
> 

[jira] [Work logged] (BEAM-8078) streaming_wordcount_debugging.py is missing a test

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8078?focusedWorklogId=407351=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407351
 ]

ASF GitHub Bot logged work on BEAM-8078:


Author: ASF GitHub Bot
Created on: 21/Mar/20 01:56
Start Date: 21/Mar/20 01:56
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #10914: [BEAM-8078] 
streaming_wordcount_debugging.py is missing a test
URL: https://github.com/apache/beam/pull/10914#issuecomment-601976783
 
 
   will merge once I get postcommit to run this test and verify it passes
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407351)
Time Spent: 4h 20m  (was: 4h 10m)

> streaming_wordcount_debugging.py is missing a test
> --
>
> Key: BEAM-8078
> URL: https://issues.apache.org/jira/browse/BEAM-8078
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Aleksey Vysotin
>Priority: Minor
>  Labels: beginner, easy, newbie, starter
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> It's example code and should have a basic_test (like the other wordcount 
> variants in [1]) to at least verify that it runs in the latest Beam release.
> [1] 
> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8078) streaming_wordcount_debugging.py is missing a test

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8078?focusedWorklogId=407348=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407348
 ]

ASF GitHub Bot logged work on BEAM-8078:


Author: ASF GitHub Bot
Created on: 21/Mar/20 01:54
Start Date: 21/Mar/20 01:54
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #10914: [BEAM-8078] 
streaming_wordcount_debugging.py is missing a test
URL: https://github.com/apache/beam/pull/10914#issuecomment-601976548
 
 
   run python 3.7 postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407348)
Time Spent: 4h 10m  (was: 4h)

> streaming_wordcount_debugging.py is missing a test
> --
>
> Key: BEAM-8078
> URL: https://issues.apache.org/jira/browse/BEAM-8078
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Aleksey Vysotin
>Priority: Minor
>  Labels: beginner, easy, newbie, starter
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> It's example code and should have a basic_test (like the other wordcount 
> variants in [1]) to at least verify that it runs in the latest Beam release.
> [1] 
> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=407342=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407342
 ]

ASF GitHub Bot logged work on BEAM-9468:


Author: ASF GitHub Bot
Created on: 21/Mar/20 01:06
Start Date: 21/Mar/20 01:06
Worklog Time Spent: 10m 
  Work Description: jaketf commented on pull request #11151: [BEAM-9468]  
Hl7v2 io
URL: https://github.com/apache/beam/pull/11151#discussion_r395946114
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/healthcare/HL7v2IO.java
 ##
 @@ -0,0 +1,620 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.healthcare;
+
+import com.google.api.services.healthcare.v1alpha2.model.IngestMessageResponse;
+import com.google.api.services.healthcare.v1alpha2.model.Message;
+import com.google.auto.value.AutoValue;
+import java.io.IOException;
+import java.text.ParseException;
+import java.util.Collections;
+import java.util.List;
+import org.apache.beam.sdk.io.gcp.datastore.AdaptiveThrottler;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
+import org.apache.beam.sdk.metrics.Counter;
+import org.apache.beam.sdk.metrics.Metrics;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.util.Sleeper;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionTuple;
+import org.apache.beam.sdk.values.PDone;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.sdk.values.TupleTagList;
+import 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Throwables;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link HL7v2IO} provides an API for reading from and writing to https://cloud.google.com/healthcare/docs/concepts/hl7v2;>Google Cloud 
Healthcare HL7v2 API.
+ * 
+ *
+ * Read HL7v2 Messages are fetched from the HL7v2 store based on the {@link 
PCollection} of of
+ * message IDs {@link String}s produced by the {@link 
AutoValue_HL7v2IO_Read#getMessageIDTransform}
+ * as {@link PCollectionTuple}*** containing an {@link HL7v2IO.Read#OUT} tag 
for successfully
+ * fetched messages and a {@link HL7v2IO.Read#DEAD_LETTER} tag for message IDs 
that could not be
+ * fetched.
+ *
+ * HL7v2 stores can be read in several ways: - Unbounded: based on the 
Pub/Sub Notification
+ * Channel {@link HL7v2IO#readNotificationSubscription(String)} - Bounded: 
based on reading an
+ * entire HL7v2 store (or stores) {@link HL7v2IO#readHL7v2Store(String)} - 
Bounded: based on reading
+ * an HL7v2 store with a filter
+ *
+ * Note, due to the flexibility of this Read transform, this must output a 
dead letter queue.
+ * This handles the scenario where the the PTransform that populates a 
PCollection of message IDs
+ * contains message IDs that do not exist in the HL7v2 stores.
+ *
+ * Example:
+ *
+ * {@code
+ * PipelineOptions options = ...;
+ * Pipeline pipeline = Pipeline.create(options)
+ *
+ *
+ * PCollectionTuple messages = pipeline.apply(
+ * new HLv2IO.readNotifications(options.getNotificationSubscription())
+ *
+ * // Write errors to your favorite dead letter  queue (e.g. Pub/Sub, GCS, 
BigQuery)
+ * messages.get(PubsubNotificationToHL7v2Message.DEAD_LETTER)
+ *.apply("WriteToDeadLetterQueue", ...);
+ *
+ * PCollection fetchedMessages = 
fetchResults.get(PubsubNotificationToHL7v2Message.OUT)
+ *.apply("ExtractFetchedMessage",
+ *MapElements
+ *.into(TypeDescriptor.of(Message.class))
+ *.via(FailsafeElement::getPayload));
+ *
+ * // Go about your happy path transformations.
+ * PCollection out = fetchedMessages.apply("ProcessFetchedMessages", 
...);
+ *
+ * // Write using the Message.Ingest method of the HL7v2 REST API.
+ * out.apply(HL7v2IO.ingestMessages(options.getOutputHL7v2Store()));
+ *
+ * pipeline.run();
+ *
+ * }***
+ * 
+ */
+public class HL7v2IO {
+  // TODO add metrics 

[jira] [Work logged] (BEAM-8019) Support cross-language transforms for DataflowRunner

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8019?focusedWorklogId=407340=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407340
 ]

ASF GitHub Bot logged work on BEAM-8019:


Author: ASF GitHub Bot
Created on: 21/Mar/20 00:50
Start Date: 21/Mar/20 00:50
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11185: [BEAM-8019] 
Some generalizations to support cross-language transforms.
URL: https://github.com/apache/beam/pull/11185#discussion_r395944333
 
 

 ##
 File path: sdks/python/apache_beam/pipeline.py
 ##
 @@ -1127,30 +1133,79 @@ def transform_to_runner_api(transform,  # type: 
Optional[ptransform.PTransform]
   def from_runner_api(proto,  # type: beam_runner_api_pb2.PTransform
   context  # type: PipelineContext
  ):
+side_input_tags = []
+if common_urns.primitives.PAR_DO.urn == proto.spec.urn:
+  # Preserving side input tags.
+  from apache_beam.utils import proto_utils
+  from apache_beam.portability.api import beam_runner_api_pb2
+  payload = (
+  proto_utils.parse_Bytes(
+  proto.spec.payload, beam_runner_api_pb2.ParDoPayload))
+  for tag, si in payload.side_inputs.items():
+side_input_tags.append(tag)
+
 # type: (...) -> AppliedPTransform
-def is_side_input(tag):
-  # type: (str) -> bool
+def is_python_side_input(tag):
   # As per named_inputs() above.
-  return tag.startswith('side')
+  return re.match(SIDE_INPUT_REGEX, tag)
+
+all_input_tags = [tag for tag, id in proto.inputs.items()]
+
+# All side inputs have to be available in input tags
+python_indexed_side_inputs = False
+for side_tag in side_input_tags:
+  if side_tag not in all_input_tags:
+raise Exception(
+'Side input tag %s is not available in list of input tags %r' %
+(side_tag, all_input_tags))
+
+  # We process Python and external side inputs differently. We fail early
+  # here if we cannot decide which way to go.
+  if is_python_side_input(side_tag):
+python_indexed_side_inputs = True
+  else:
+if python_indexed_side_inputs:
+  raise Exception(
+  'Cannot process side inputs due to inconsistent sideinput '
+  'naming. If using an external transform consider re-naming side '
+  'inputs to not match Python indexed format %s' %
+  SIDE_INPUT_REGEX)
 
 main_inputs = [
 context.pcollections.get_by_id(id) for tag,
-id in proto.inputs.items() if not is_side_input(tag)
+id in proto.inputs.items() if tag not in side_input_tags
 ]
 
-# Ordering is important here.
-indexed_side_inputs = [
-(get_sideinput_index(tag), context.pcollections.get_by_id(id)) for tag,
-id in proto.inputs.items() if is_side_input(tag)
-]
-side_inputs = [si for _, si in sorted(indexed_side_inputs)]
+if python_indexed_side_inputs:
+  # Ordering is important here.
 
 Review comment:
   This all seems rather fragile. Would it be possible to just make side_inputs 
a dict everywhere in the internal SDK representation? (Or is there 
introspection with the legacy worker code that would make this hard?)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407340)
Time Spent: 8h 40m  (was: 8.5h)

> Support cross-language transforms for DataflowRunner
> 
>
> Key: BEAM-8019
> URL: https://issues.apache.org/jira/browse/BEAM-8019
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> This is to capture the Beam changes needed for this task.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8019) Support cross-language transforms for DataflowRunner

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8019?focusedWorklogId=407339=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407339
 ]

ASF GitHub Bot logged work on BEAM-8019:


Author: ASF GitHub Bot
Created on: 21/Mar/20 00:50
Start Date: 21/Mar/20 00:50
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11185: [BEAM-8019] 
Some generalizations to support cross-language transforms.
URL: https://github.com/apache/beam/pull/11185#discussion_r395944033
 
 

 ##
 File path: sdks/python/apache_beam/coders/coders.py
 ##
 @@ -1383,22 +1377,67 @@ def from_runner_api_parameter(payload, components, 
context):
 write_state_threshold=int(payload))
 
 
-class RunnerAPICoderHolder(Coder):
-  """A `Coder` that holds a runner API `Coder` proto.
+class ElementTypeHolder(typehints.TypeConstraint):
+  """A dummy element type for external coders that cannot be parsed in 
Python"""
 
-  This is used for coders for which corresponding objects cannot be
-  initialized in Python SDK. For example, coders for remote SDKs that may
-  be available in Python SDK transform graph when expanding a cross-language
-  transform.
-  """
-  def __init__(self, proto):
-self._proto = proto
+  def __init__(self, coder, context):
+self.coder = coder
+self.context = context
 
-  def proto(self):
-return self._proto
 
-  def to_runner_api(self, context):
-return self._proto
+class ExternalCoder(Coder):
 
-  def to_type_hint(self):
-return Any
+  coder_count = 0
+
+  def __init__(self, element_type_holder):
+self.element_type_holder = element_type_holder
+
+  def as_cloud_object(self, coders_context=None):
+if not coders_context:
+  raise Exception(
+  'coders_context must be specified to correctly encode external 
coders')
+coder_id = coders_context.get_by_proto(
+self.element_type_holder.coder, deduplicate=True)
+
+coder_proto = self.element_type_holder.coder
+
+
+kind_str = 'kind:external' + str(ExternalCoder.coder_count)
+ExternalCoder.coder_count = ExternalCoder.coder_count + 1
+component_encodings = []
+if coder_proto.spec.urn == 'beam:coder:kv:v1':
+  kind_str = 'kind:pair'
+  for component_coder_id in coder_proto.component_coder_ids:
+component_encodings.append({
+'@type': 'kind:external' + str(ExternalCoder.coder_count),
+'pipeline_proto_coder_id': component_coder_id
+})
+ExternalCoder.coder_count = ExternalCoder.coder_count + 1
+
+value = {
+# This is a placeholder type. Dataflow will get the actual coder from
+# pipeline proto using the pipeline_proto_coder_id property.
+'@type': kind_str,
+'pipeline_proto_coder_id': coder_id
+}
+if component_encodings:
+  value['is_pair_like'] = True
+  value['component_encodings'] = component_encodings
+
+return value
+
+  @staticmethod
+  def from_type_hint(typehint, unused_registry):
+if isinstance(typehint, ElementTypeHolder):
+  return ExternalCoder(typehint)
+else:
+  raise ValueError((
+  'Expected an instance of ElementTypeHolder'
+  ', but got a %s' % typehint))
+
 
 Review comment:
   `yapf -ir path/to/files/...`
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407339)
Time Spent: 8.5h  (was: 8h 20m)

> Support cross-language transforms for DataflowRunner
> 
>
> Key: BEAM-8019
> URL: https://issues.apache.org/jira/browse/BEAM-8019
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> This is to capture the Beam changes needed for this task.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9446) FlinkRunner discards parallelism and execution_mode_for_batch pipeline options

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9446?focusedWorklogId=407335=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407335
 ]

ASF GitHub Bot logged work on BEAM-9446:


Author: ASF GitHub Bot
Created on: 21/Mar/20 00:08
Start Date: 21/Mar/20 00:08
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #11189: [BEAM-9446] Retain 
unknown arguments when using uber jar job server.
URL: https://github.com/apache/beam/pull/11189#issuecomment-601960942
 
 
   Run PortableJar_Flink PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407335)
Time Spent: 3h 20m  (was: 3h 10m)

> FlinkRunner discards parallelism and execution_mode_for_batch pipeline options
> --
>
> Key: BEAM-9446
> URL: https://issues.apache.org/jira/browse/BEAM-9446
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
> Fix For: 2.21.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> I need these options for TFX, but they're being discarded (I believe they are 
> normally supplied by the job server).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407332=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407332
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 21/Mar/20 00:03
Start Date: 21/Mar/20 00:03
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395933544
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -229,121 +330,148 @@ message MonitoringInfo {
 NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }];
 NAME = 6 [(label_props) = { name: "NAME" }];
   }
-  // A set of key+value labels which define the scope of the metric.
+
+  // A set of key and value labels which define the scope of the metric. For
+  // well known URNs, the set of required labels is provided by the associated
+  // MonitoringInfoSpec.
+  //
   // Either a well defined entity id for matching the enum names in
   // the MonitoringInfoLabels enum or any arbitrary label
   // set by a custom metric or user metric.
+  //
   // A monitoring system is expected to be able to aggregate the metrics
   // together for all updates having the same URN and labels. Some systems such
   // as Stackdriver will be able to aggregate the metrics using a subset of the
   // provided labels
-  map labels = 5;
-
-  // The walltime of the most recent update.
-  // Useful for aggregation for latest types such as LatestInt64.
-  google.protobuf.Timestamp timestamp = 6;
+  map labels = 4;
 }
 
+// A set of well known URNs that specify the encoding and aggregation method.
 message MonitoringInfoTypeUrns {
   enum Enum {
+// Represents an integer counter where values are summed across bundles.
+//
+// Encoding: 
+//   - value: beam:coder:varint:v1
 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) =
-"beam:metrics:sum_int_64"];
-
-DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
- "beam:metrics:distribution_int_64"];
-
-LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
-   "beam:metrics:latest_int_64"];
-
-// iterable is encoded with a beam:coder:double:v1 coder for each
-// element.
-LATEST_DOUBLES_TYPE = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) =
- "beam:metrics:latest_doubles"];
-  }
-}
-
-message Metric {
-  // (Required) The data for this metric.
-  oneof data {
-CounterData counter_data = 1;
-DistributionData distribution_data = 2;
-ExtremaData extrema_data = 3;
-  }
-}
-
-// Data associated with a Counter or Gauge metric.
-// This is designed to be compatible with metric collection
-// systems such as DropWizard.
-message CounterData {
-  oneof value {
-int64 int64_value = 1;
-double double_value = 2;
-string string_value = 3;
-  }
-}
-
-// Extrema messages are used for calculating
-// Top-N/Bottom-N metrics.
-message ExtremaData {
-  oneof extrema {
-IntExtremaData int_extrema_data = 1;
-DoubleExtremaData double_extrema_data = 2;
-  }
-}
-
-message IntExtremaData {
-  repeated int64 int_values = 1;
-}
+"beam:metrics:sum_int64:v1"];
+
+// Represents a double counter where values are summed across bundles.
+//
+// Encoding: 
+//   value: beam:coder:double:v1
+SUM_DOUBLE_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+"beam:metrics:sum_double:v1"];
+
+// Represents a distribution of an integer value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+//
+// Encoding: 
+//   - count: beam:coder:varint:v1
+//   - sum:   beam:coder:varint:v1
+//   - min:   beam:coder:varint:v1
+//   - max:   beam:coder:varint:v1
+DISTRIBUTION_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+ "beam:metrics:distribution_int64:v1"];
+
+// Represents a distribution of a double value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+//
+// Encoding: 
+//   - count: beam:coder:varint:v1
+//   - sum:   beam:coder:double:v1
+//   - min:   beam:coder:double:v1
+//   - max:   

[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407334=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407334
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 21/Mar/20 00:03
Start Date: 21/Mar/20 00:03
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395936249
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -52,38 +55,157 @@ message Annotation {
   string value = 2;
 }
 
-// Populated MonitoringInfoSpecs for specific URNs.
-// Indicating the required fields to be set.
-// SDKs and RunnerHarnesses can load these instances into memory and write a
-// validator or code generator to assist with populating and validating
-// MonitoringInfo protos.
+// A set of well known MonitoringInfo specifications.
 message MonitoringInfoSpecs {
   enum Enum {
-// TODO(BEAM-6926): Add the PTRANSFORM name as a required label after
-// upgrading the python SDK.
-USER_COUNTER = 0 [(monitoring_info_spec) = {
-  urn: "beam:metric:user",
-  type_urn: "beam:metrics:sum_int_64",
+// Represents an integer counter where values are summed across bundles.
+USER_SUM_INT64 = 0 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:v1",
+  type: "beam:metrics:sum_int64:v1",
   required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"],
   annotations: [{
 key: "description",
-value: "URN utilized to report user numeric counters."
+value: "URN utilized to report user metric."
   }]
 }];
 
-ELEMENT_COUNT = 1 [(monitoring_info_spec) = {
+// Represents a double counter where values are summed across bundles.
+USER_SUM_DOUBLE = 1 [(monitoring_info_spec) = {
+  urn: "beam:metric:user:v1",
 
 Review comment:
   Should it be legal to have two counters with the same URN but different 
types. (This seems to fly agains the idea of a URN being a Unique identifier.) 
   
   Seeing this explosion of types, however, makes it feel like we should not be 
manually be enumerating them (or at least I'm struggling to see the value in 
that over just saying that user counters may have any type).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407334)
Time Spent: 21.5h  (was: 21h 20m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 21.5h
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407333=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407333
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 21/Mar/20 00:03
Start Date: 21/Mar/20 00:03
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395935001
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -229,101 +215,127 @@ message MonitoringInfo {
 NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }];
 NAME = 6 [(label_props) = { name: "NAME" }];
   }
+
   // A set of key+value labels which define the scope of the metric.
   // Either a well defined entity id for matching the enum names in
   // the MonitoringInfoLabels enum or any arbitrary label
   // set by a custom metric or user metric.
+  //
   // A monitoring system is expected to be able to aggregate the metrics
   // together for all updates having the same URN and labels. Some systems such
   // as Stackdriver will be able to aggregate the metrics using a subset of the
   // provided labels
-  map labels = 5;
-
-  // The walltime of the most recent update.
-  // Useful for aggregation for latest types such as LatestInt64.
-  google.protobuf.Timestamp timestamp = 6;
+  map labels = 4;
 }
 
 message MonitoringInfoTypeUrns {
   enum Enum {
+// Represents an integer counter where values are summed across bundles.
+//
+// Encoding: 
+//   - value: beam:coder:varint:v1
 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) =
 
 Review comment:
   Rather than manually write out the cross product, how about we define 
`{sum,min,max,top_n,bottom_n,distribtuion,latest}_{int64,double,string}` types 
as having known semantics and encoding? 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407333)
Time Spent: 21h 20m  (was: 21h 10m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 21h 20m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9446) FlinkRunner discards parallelism and execution_mode_for_batch pipeline options

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9446?focusedWorklogId=407331=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407331
 ]

ASF GitHub Bot logged work on BEAM-9446:


Author: ASF GitHub Bot
Created on: 20/Mar/20 23:50
Start Date: 20/Mar/20 23:50
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #11189: [BEAM-9446] 
Retain unknown arguments when using uber jar job server.
URL: https://github.com/apache/beam/pull/11189
 
 
   See #11052 for context.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 

[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407330=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407330
 ]

ASF GitHub Bot logged work on BEAM-3301:


Author: ASF GitHub Bot
Created on: 20/Mar/20 23:48
Start Date: 20/Mar/20 23:48
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #11188: [BEAM-3301] Adding 
restriction trackers and validation.
URL: https://github.com/apache/beam/pull/11188#issuecomment-601956825
 
 
   Run Go PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407330)
Time Spent: 8.5h  (was: 8h 20m)

> Go SplittableDoFn support
> -
>
> Key: BEAM-3301
> URL: https://issues.apache.org/jira/browse/BEAM-3301
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> SDFs will be the only way to add streaming and liquid sharded IO for Go.
> Design doc: https://s.apache.org/splittable-do-fn



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407329=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407329
 ]

ASF GitHub Bot logged work on BEAM-3301:


Author: ASF GitHub Bot
Created on: 20/Mar/20 23:48
Start Date: 20/Mar/20 23:48
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11188: [BEAM-3301] 
Adding restriction trackers and validation.
URL: https://github.com/apache/beam/pull/11188
 
 
   Adding RTrackers as an interface, and adding them to the SDF validation.
   
   I think this is the last real code involved in SDF validation, assuming I'm 
not forgetting anything. I might do a second pass on the error messages because 
they seem inconsistent with the old error messages, but the next major task is 
going to be working on the SDF exec code and doing some testing with the Flink 
runner.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 

[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=407326=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407326
 ]

ASF GitHub Bot logged work on BEAM-8280:


Author: ASF GitHub Bot
Created on: 20/Mar/20 23:41
Start Date: 20/Mar/20 23:41
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #10717: [BEAM-8280] 
Enable type hint annotations
URL: https://github.com/apache/beam/pull/10717
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407326)
Time Spent: 10h 10m  (was: 10h)

> re-enable IOTypeHints.from_callable
> ---
>
> Key: BEAM-8280
> URL: https://issues.apache.org/jira/browse/BEAM-8280
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> See https://issues.apache.org/jira/browse/BEAM-8279



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7923) Interactive Beam

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7923?focusedWorklogId=407327=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407327
 ]

ASF GitHub Bot logged work on BEAM-7923:


Author: ASF GitHub Bot
Created on: 20/Mar/20 23:41
Start Date: 20/Mar/20 23:41
Worklog Time Spent: 10m 
  Work Description: davidyan74 commented on pull request #11174: 
[BEAM-7923] Pop failed transform when error is raised
URL: https://github.com/apache/beam/pull/11174#discussion_r395933604
 
 

 ##
 File path: sdks/python/apache_beam/pipeline.py
 ##
 @@ -307,58 +307,61 @@ def _replace_if_needed(self, original_transform_node):
   elif len(inputs) == 0:
 input_node = pvalue.PBegin(self.pipeline)
 
-  # We have to add the new AppliedTransform to the stack before 
expand()
-  # and pop it out later to make sure that parts get added correctly.
-  self.pipeline.transforms_stack.append(replacement_transform_node)
-
-  # Keeping the same label for the replaced node but recursively
-  # removing labels of child transforms of original transform since 
they
-  # will be replaced during the expand below. This is needed in case
-  # the replacement contains children that have labels that conflicts
-  # with labels of the children of the original.
-  self.pipeline._remove_labels_recursively(original_transform_node)
-
-  new_output = replacement_transform.expand(input_node)
-  assert isinstance(
-  new_output, (dict, pvalue.PValue, pvalue.DoOutputsTuple))
-
-  if isinstance(new_output, pvalue.PValue):
-new_output.element_type = None
-self.pipeline._infer_result_type(
-replacement_transform, inputs, new_output)
-
-  if isinstance(new_output, dict):
-for new_tag, new_pcoll in new_output.items():
-  replacement_transform_node.add_output(new_pcoll, new_tag)
-  elif isinstance(new_output, pvalue.DoOutputsTuple):
-replacement_transform_node.add_output(
-new_output, new_output._main_tag)
-  else:
-replacement_transform_node.add_output(new_output, new_output.tag)
-
-  # Recording updated outputs. This cannot be done in the same visitor
-  # since if we dynamically update output type here, we'll run into
-  # errors when visiting child nodes.
-  #
-  # NOTE: When replacing multiple outputs, the replacement PCollection
-  # tags must have a matching tag in the original transform.
-  if isinstance(new_output, pvalue.PValue):
-if not new_output.producer:
-  new_output.producer = replacement_transform_node
-output_map[original_transform_node.outputs[new_output.tag]] = \
-new_output
-  elif isinstance(new_output, (pvalue.DoOutputsTuple, tuple)):
-for pcoll in new_output:
-  if not pcoll.producer:
-pcoll.producer = replacement_transform_node
-  output_map[original_transform_node.outputs[pcoll.tag]] = pcoll
-  elif isinstance(new_output, dict):
-for tag, pcoll in new_output.items():
-  if not pcoll.producer:
-pcoll.producer = replacement_transform_node
-  output_map[original_transform_node.outputs[tag]] = pcoll
-
-  self.pipeline.transforms_stack.pop()
+  try:
+# We have to add the new AppliedTransform to the stack before
+# expand() and pop it out later to make sure that parts get added
+# correctly.
+self.pipeline.transforms_stack.append(replacement_transform_node)
 
 Review comment:
   This is a python newbie question. Does list.append() ever throw an 
exception? If so, should we move this out of the try block so that we don't 
pop() if list.append() fails?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407327)
Time Spent: 9h 20m  (was: 9h 10m)

> Interactive Beam
> 
>
> Key: BEAM-7923
> URL: https://issues.apache.org/jira/browse/BEAM-7923
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> This is the top level ticket for all efforts leveraging [interactive 
> 

[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407325=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407325
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 23:40
Start Date: 20/Mar/20 23:40
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395933271
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -229,101 +215,127 @@ message MonitoringInfo {
 NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }];
 NAME = 6 [(label_props) = { name: "NAME" }];
   }
+
   // A set of key+value labels which define the scope of the metric.
   // Either a well defined entity id for matching the enum names in
   // the MonitoringInfoLabels enum or any arbitrary label
   // set by a custom metric or user metric.
+  //
   // A monitoring system is expected to be able to aggregate the metrics
   // together for all updates having the same URN and labels. Some systems such
   // as Stackdriver will be able to aggregate the metrics using a subset of the
   // provided labels
-  map labels = 5;
-
-  // The walltime of the most recent update.
-  // Useful for aggregation for latest types such as LatestInt64.
-  google.protobuf.Timestamp timestamp = 6;
+  map labels = 4;
 }
 
 message MonitoringInfoTypeUrns {
   enum Enum {
+// Represents an integer counter where values are summed across bundles.
+//
+// Encoding: 
+//   - value: beam:coder:varint:v1
 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) =
-"beam:metrics:sum_int_64"];
-
-DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
- "beam:metrics:distribution_int_64"];
-
-LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
-   "beam:metrics:latest_int_64"];
+"beam:metrics:sum_int64:v1"];
+
+// Represents a double counter where values are summed across bundles.
+//
+// Encoding: 
+//   value: beam:coder:double:v1
+SUM_DOUBLE_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+"beam:metrics:sum_int64:v1"];
+
+// Represents a distribution of an integer value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+//
+// Encoding: 
+//   - count: beam:coder:varint:v1
+//   - sum:   beam:coder:varint:v1
+//   - min:   beam:coder:varint:v1
+//   - max:   beam:coder:varint:v1
+DISTRIBUTION_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+ "beam:metrics:distribution_int64:v1"];
+
+// Represents a distribution of a double value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+//
+// Encoding: 
+//   - count: beam:coder:varint:v1
+//   - sum:   beam:coder:double:v1
+//   - min:   beam:coder:double:v1
+//   - max:   beam:coder:double:v1
+DISTRIBUTION_DOUBLE_TYPE = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) 
=
+ "beam:metrics:distribution_int64:v1"];
+
+// Represents the latest seen integer value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+//
+// Encoding: 
+//   - timestamp: beam:coder:varint:v1 (milliseconds since epoch)
+//   - value: beam:coder:varint:v1
+LATEST_INT64_TYPE = 4 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+   "beam:metrics:latest_int64:v1"];
+
+// Represents the latest seen integer value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+//
+// Encoding: 
+//   - timestamp: beam:coder:varint:v1 (milliseconds since epoch)
+//   - value: beam:coder:double:v1
+LATEST_DOUBLE_TYPE = 5 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+   "beam:metrics:latest_int64:v1"];
+
+// Represents the largest set of integer values seen across bundles.
+//
+// Encoding: ...
+//   - valueX: beam:coder:varint:v1
+TOP_N_INT64_TYPE = 6 

[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407323=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407323
 ]

ASF GitHub Bot logged work on BEAM-9537:


Author: ASF GitHub Bot
Created on: 20/Mar/20 23:31
Start Date: 20/Mar/20 23:31
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11153: [BEAM-9537] 
Adding a new module for FnApiRunner
URL: https://github.com/apache/beam/pull/11153
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407323)
Time Spent: 2h 40m  (was: 2.5h)

> Refactor FnApiRunner into its own package
> -
>
> Key: BEAM-9537
> URL: https://issues.apache.org/jira/browse/BEAM-9537
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Minor
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9398) Python type hints: AbstractDoFnWrapper does not wrap setup

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9398?focusedWorklogId=407322=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407322
 ]

ASF GitHub Bot logged work on BEAM-9398:


Author: ASF GitHub Bot
Created on: 20/Mar/20 23:26
Start Date: 20/Mar/20 23:26
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #0: [BEAM-9398] 
runtime_type_check: support setup
URL: https://github.com/apache/beam/pull/0#issuecomment-601952041
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407322)
Time Spent: 0.5h  (was: 20m)

> Python type hints: AbstractDoFnWrapper does not wrap setup
> --
>
> Key: BEAM-9398
> URL: https://issues.apache.org/jira/browse/BEAM-9398
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> And possibly other methods.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9558) Make end of data channel explicit

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9558?focusedWorklogId=407321=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407321
 ]

ASF GitHub Bot logged work on BEAM-9558:


Author: ASF GitHub Bot
Created on: 20/Mar/20 23:23
Start Date: 20/Mar/20 23:23
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11173: [BEAM-9558] 
Add explicit end bit for data channel.
URL: https://github.com/apache/beam/pull/11173
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407321)
Time Spent: 1h  (was: 50m)

> Make end of data channel explicit
> -
>
> Key: BEAM-9558
> URL: https://issues.apache.org/jira/browse/BEAM-9558
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently the end of a data channel is implicitly marked by sending an empty 
> data block. The protocol would be simplified by making this explicit, and it 
> would also prevent data loss bugs that might occur by accidentally sending an 
> empty block. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7923) Interactive Beam

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7923?focusedWorklogId=407316=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407316
 ]

ASF GitHub Bot logged work on BEAM-7923:


Author: ASF GitHub Bot
Created on: 20/Mar/20 23:14
Start Date: 20/Mar/20 23:14
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #11174: [BEAM-7923] Pop 
failed transform when error is raised
URL: https://github.com/apache/beam/pull/11174#issuecomment-601949155
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407316)
Time Spent: 9h 10m  (was: 9h)

> Interactive Beam
> 
>
> Key: BEAM-7923
> URL: https://issues.apache.org/jira/browse/BEAM-7923
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> This is the top level ticket for all efforts leveraging [interactive 
> Beam|[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/interactive]]
> As the development goes, blocking tickets will be added to this one.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9340) Properly populate pipeline proto requirements.

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9340?focusedWorklogId=407315=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407315
 ]

ASF GitHub Bot logged work on BEAM-9340:


Author: ASF GitHub Bot
Created on: 20/Mar/20 23:06
Start Date: 20/Mar/20 23:06
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #11165: [BEAM-9340] 
Populate requirements for Java.
URL: https://github.com/apache/beam/pull/11165#issuecomment-601947065
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407315)
Time Spent: 2h 20m  (was: 2h 10m)

> Properly populate pipeline proto requirements.
> --
>
> Key: BEAM-9340
> URL: https://issues.apache.org/jira/browse/BEAM-9340
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-go, sdk-java-core, sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-9512) Anonymous structs have name collision in schema

2020-03-20 Thread Andrew Pilloud (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Pilloud reassigned BEAM-9512:


Assignee: Andrew Pilloud

> Anonymous structs have name collision in schema
> ---
>
> Key: BEAM-9512
> URL: https://issues.apache.org/jira/browse/BEAM-9512
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Labels: zetasql-compliance
>
> {code:java}
> Mar 16, 2020 12:57:42 PM 
> cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl 
> executeQuery
> INFO: Processing Sql statement: SELECT STRUCT(ARRAY INT64>>[(11, 12), (21, 22)])
> Mar 16, 2020 12:57:42 PM 
> com.google.zetasql.io.grpc.internal.SerializingExecutor run
> SEVERE: Exception while executing runnable 
> com.google.zetasql.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed@c73b08a
> java.lang.IllegalArgumentException: Duplicate field  added to schema
>   at org.apache.beam.sdk.schemas.Schema.(Schema.java:228)
>   at org.apache.beam.sdk.schemas.Schema.fromFields(Schema.java:966)
>   at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:503)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toSchema(CalciteUtils.java:194)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toFieldType(CalciteUtils.java:251)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toFieldType(CalciteUtils.java:246)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toField(CalciteUtils.java:239)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toField(CalciteUtils.java:235)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.Iterator.forEachRemaining(Iterator.java:116)
>   at 
> java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
>   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>   at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>   at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>   at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toSchema(CalciteUtils.java:194)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toFieldType(CalciteUtils.java:251)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toField(CalciteUtils.java:239)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toField(CalciteUtils.java:235)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at java.util.Iterator.forEachRemaining(Iterator.java:116)
>   at 
> java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
>   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>   at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>   at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>   at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toSchema(CalciteUtils.java:194)
>   at 
> cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl.executeQuery(ExecuteQueryServiceServer.java:243)
>   at 
> com.google.zetasql.testing.SqlComplianceServiceGrpc$MethodHandlers.invoke(SqlComplianceServiceGrpc.java:423)
>   at 
> com.google.zetasql.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
>   at 
> com.google.zetasql.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
>   at 
> com.google.zetasql.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:711)
>   at 
> com.google.zetasql.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>   at 
> com.google.zetasql.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
>  {code}



--
This message was sent 

[jira] [Work logged] (BEAM-9340) Properly populate pipeline proto requirements.

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9340?focusedWorklogId=407311=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407311
 ]

ASF GitHub Bot logged work on BEAM-9340:


Author: ASF GitHub Bot
Created on: 20/Mar/20 23:02
Start Date: 20/Mar/20 23:02
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #11165: [BEAM-9340] 
Populate requirements for Java.
URL: https://github.com/apache/beam/pull/11165#issuecomment-601946285
 
 
   This passes locally. 
org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInFinishBundleStateful
 seems unrelated. Retrying again. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407311)
Time Spent: 2h  (was: 1h 50m)

> Properly populate pipeline proto requirements.
> --
>
> Key: BEAM-9340
> URL: https://issues.apache.org/jira/browse/BEAM-9340
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-go, sdk-java-core, sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9340) Properly populate pipeline proto requirements.

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9340?focusedWorklogId=407312=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407312
 ]

ASF GitHub Bot logged work on BEAM-9340:


Author: ASF GitHub Bot
Created on: 20/Mar/20 23:02
Start Date: 20/Mar/20 23:02
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #11165: [BEAM-9340] 
Populate requirements for Java.
URL: https://github.com/apache/beam/pull/11165#issuecomment-601946329
 
 
   Run Java PreCommit
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407312)
Time Spent: 2h 10m  (was: 2h)

> Properly populate pipeline proto requirements.
> --
>
> Key: BEAM-9340
> URL: https://issues.apache.org/jira/browse/BEAM-9340
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-go, sdk-java-core, sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9564) Remove insecure ssl options from MongoDBIO

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9564?focusedWorklogId=407309=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407309
 ]

ASF GitHub Bot logged work on BEAM-9564:


Author: ASF GitHub Bot
Created on: 20/Mar/20 22:57
Start Date: 20/Mar/20 22:57
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #11186: [BEAM-9564] 
Remove insecure ssl options from MongoDBIO
URL: https://github.com/apache/beam/pull/11186
 
 
   These changes are not backwards compatible but this is intended to solve the 
potential security issues and also because MongoDBIO does not have strong 
backwards compatibility yet (aka it is still tagged as `@Experimental`).
   
   R: @alexvanboxel
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407309)
Remaining Estimate: 0h
Time Spent: 10m

> Remove insecure ssl options from MongoDBIO
> --
>
> Key: BEAM-9564
> URL: https://issues.apache.org/jira/browse/BEAM-9564
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-mongodb
>Affects Versions: 2.21.0
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Critical
>  Labels: backward-incompatible
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The option MongoDBIO.withIgnoreSSLCertificate  and 
> withSSLInvalidHostNameAllowedslInvalidHostNameAllowed() are insecure by 
> design. We should not encourage users to be able to use them so better to 
> remove these options.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9564) Remove insecure ssl options from MongoDBIO

2020-03-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9564:
---
Description: The option MongoDBIO.withIgnoreSSLCertificate  and 
withSSLInvalidHostNameAllowedslInvalidHostNameAllowed() are insecure by design. 
We should not encourage users to be able to use them so better to remove these 
options.  (was: The option MongoDBIO.withIgnoreSSLCertificate is insecure by 
design. We should not encourage this behavior so better to remove this option.

I thought this was used only for tests but it does not seem to be the case so 
there is not even a valid case to keep it.)

> Remove insecure ssl options from MongoDBIO
> --
>
> Key: BEAM-9564
> URL: https://issues.apache.org/jira/browse/BEAM-9564
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-mongodb
>Affects Versions: 2.21.0
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Critical
>  Labels: backward-incompatible
>
> The option MongoDBIO.withIgnoreSSLCertificate  and 
> withSSLInvalidHostNameAllowedslInvalidHostNameAllowed() are insecure by 
> design. We should not encourage users to be able to use them so better to 
> remove these options.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9564) Remove insecure ssl options from MongoDBIO

2020-03-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9564:
---
Summary: Remove insecure ssl options from MongoDBIO  (was: Remove insecure 
withIgnoreSSLCertificate method from MongoDBIO)

> Remove insecure ssl options from MongoDBIO
> --
>
> Key: BEAM-9564
> URL: https://issues.apache.org/jira/browse/BEAM-9564
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-mongodb
>Affects Versions: 2.21.0
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Critical
>  Labels: backward-incompatible
>
> The option MongoDBIO.withIgnoreSSLCertificate is insecure by design. We 
> should not encourage this behavior so better to remove this option.
> I thought this was used only for tests but it does not seem to be the case so 
> there is not even a valid case to keep it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9564) Remove insecure withIgnoreSSLCertificate method from MongoDBIO

2020-03-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9564:
---
Summary: Remove insecure withIgnoreSSLCertificate method from MongoDBIO  
(was: Remove withIgnoreSSLCertificate from MongoDBIO)

> Remove insecure withIgnoreSSLCertificate method from MongoDBIO
> --
>
> Key: BEAM-9564
> URL: https://issues.apache.org/jira/browse/BEAM-9564
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-mongodb
>Affects Versions: 2.21.0
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Critical
>  Labels: backward-incompatible
>
> The option MongoDBIO.withIgnoreSSLCertificate is insecure by design. We 
> should not encourage this behavior so better to remove this option.
> I thought this was used only for tests but it does not seem to be the case so 
> there is not even a valid case to keep it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407302=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407302
 ]

ASF GitHub Bot logged work on BEAM-3301:


Author: ASF GitHub Bot
Created on: 20/Mar/20 22:25
Start Date: 20/Mar/20 22:25
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11179: [BEAM-3301] 
Bugfix in DoFn validation.
URL: https://github.com/apache/beam/pull/11179
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407302)
Time Spent: 8h 10m  (was: 8h)

> Go SplittableDoFn support
> -
>
> Key: BEAM-3301
> URL: https://issues.apache.org/jira/browse/BEAM-3301
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> SDFs will be the only way to add streaming and liquid sharded IO for Go.
> Design doc: https://s.apache.org/splittable-do-fn



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9430) Migrate from ProcessContext#updateWatermark to WatermarkEstimators

2020-03-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9430:
---
Labels: backward-incompatible  (was: )

> Migrate from ProcessContext#updateWatermark to WatermarkEstimators
> --
>
> Key: BEAM-9430
> URL: https://issues.apache.org/jira/browse/BEAM-9430
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
>  Labels: backward-incompatible
> Fix For: 2.21.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Current discussion underway in 
> [https://lists.apache.org/thread.html/r5d974b6a58bc04ff4c02682fda4ef68608121f1bf23a86e9d592ca6e%40%3Cdev.beam.apache.org%3E]
>  
> Proposed API: [https://github.com/apache/beam/pull/10992]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9564) Remove withIgnoreSSLCertificate from MongoDBIO

2020-03-20 Thread Jira
Ismaël Mejía created BEAM-9564:
--

 Summary: Remove withIgnoreSSLCertificate from MongoDBIO
 Key: BEAM-9564
 URL: https://issues.apache.org/jira/browse/BEAM-9564
 Project: Beam
  Issue Type: Improvement
  Components: io-java-mongodb
Affects Versions: 2.21.0
Reporter: Ismaël Mejía
Assignee: Ismaël Mejía


The option MongoDBIO.withIgnoreSSLCertificate is insecure by design. We should 
not encourage this behavior so better to remove this option.

I thought this was used only for tests but it does not seem to be the case so 
there is not even a valid case to keep it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9564) Remove withIgnoreSSLCertificate from MongoDBIO

2020-03-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9564:
---
Labels: backward-incompatible  (was: )

> Remove withIgnoreSSLCertificate from MongoDBIO
> --
>
> Key: BEAM-9564
> URL: https://issues.apache.org/jira/browse/BEAM-9564
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-mongodb
>Affects Versions: 2.21.0
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Critical
>  Labels: backward-incompatible
>
> The option MongoDBIO.withIgnoreSSLCertificate is insecure by design. We 
> should not encourage this behavior so better to remove this option.
> I thought this was used only for tests but it does not seem to be the case so 
> there is not even a valid case to keep it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9564) Remove withIgnoreSSLCertificate from MongoDBIO

2020-03-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9564:
---
Status: Open  (was: Triage Needed)

> Remove withIgnoreSSLCertificate from MongoDBIO
> --
>
> Key: BEAM-9564
> URL: https://issues.apache.org/jira/browse/BEAM-9564
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-mongodb
>Affects Versions: 2.21.0
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Critical
>
> The option MongoDBIO.withIgnoreSSLCertificate is insecure by design. We 
> should not encourage this behavior so better to remove this option.
> I thought this was used only for tests but it does not seem to be the case so 
> there is not even a valid case to keep it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9549) Flaky portableWordCountBatch and portableWordCountStreaming tests

2020-03-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9549:
---
Status: Open  (was: Triage Needed)

> Flaky portableWordCountBatch and portableWordCountStreaming tests
> -
>
> Key: BEAM-9549
> URL: https://issues.apache.org/jira/browse/BEAM-9549
> Project: Beam
>  Issue Type: Test
>  Components: test-failures
>Reporter: Ning Kang
>Assignee: Ankur Goenka
>Priority: Major
> Attachments: Sr5cNnx8sAW.png
>
>
> The tests :sdks:python:test-suites:portable:py2:portableWordCountBatch and 
> :sdks:python:test-suites:portable:py2:portableWordCountStreaming are flaky, 
> sometimes throws grpc errrors.
> Stacktrace
> !Sr5cNnx8sAW.png|width=2049,height=1001!
> In text:
> {code:java}
> INFO:root:Using Python SDK docker image: 
> apache/beam_python2.7_sdk:2.21.0.dev. If the image is not available at local, 
> we will try to pull from 
> hub.docker.comINFO:apache_beam.runners.portability.fn_api_runner_transforms:
>   
> INFO:apache_beam.utils.subprocess_server:Starting service 
> with ['docker' 'run' '-v' '/usr/bin/docker:/bin/docker' '-v' 
> '/var/run/docker.sock:/var/run/docker.sock' '--network=host' 
> 'apache/beam_flink1.9_job_server:latest' '--job-host' 'localhost' 
> '--job-port' '58753' '--artifact-port' '60175' '--expansion-port' 
> '33067']INFO:apache_beam.utils.subprocess_server:[main] INFO 
> org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - 
> ArtifactStagingService started on 
> localhost:60175INFO:apache_beam.utils.subprocess_server:[main] INFO 
> org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - Java 
> ExpansionService started on 
> localhost:33067INFO:apache_beam.utils.subprocess_server:[main] INFO 
> org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - 
> JobService started on localhost:58753ERROR:grpc._common:Exception 
> deserializing message!Traceback (most recent call last):  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_common.py",
>  line 84, in _transformreturn transformer(message)DecodeError: Error 
> parsing messageTraceback (most recent call last):  File 
> "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
> "__main__", fname, loader, pkg_name)  File "/usr/lib/python2.7/runpy.py", 
> line 72, in _run_codeexec code in run_globals  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/lib/python2.7/site-packages/apache_beam/examples/wordcount.py",
>  line 142, in run()  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/lib/python2.7/site-packages/apache_beam/examples/wordcount.py",
>  line 121, in runresult = p.run()  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/pipeline.py",
>  line 495, in runself._options).run(False)  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/pipeline.py",
>  line 508, in runreturn self.runner.run_pipeline(self, self._options)  
> File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/runners/portability/portable_runner.py",
>  line 401, in run_pipelinejob_service_handle.submit(proto_pipeline)  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/runners/portability/portable_runner.py",
>  line 102, in submitprepare_response = self.prepare(proto_pipeline)  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/runners/portability/portable_runner.py",
>  line 179, in preparetimeout=self.timeout)  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py",
>  line 826, in __call__return _end_unary_response_blocking(state, call, 
> False, None)  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py",
>  line 729, in _end_unary_response_blockingraise 
> _InactiveRpcError(state)grpc._channel._InactiveRpcError: <_InactiveRpcError 
> of 

[jira] [Updated] (BEAM-9563) TFRecordIO inefficient read from sideinput causing pipeline to be slow - fix

2020-03-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9563:
---
Status: Open  (was: Triage Needed)

> TFRecordIO inefficient read from sideinput causing pipeline to be slow - fix
> 
>
> Key: BEAM-9563
> URL: https://issues.apache.org/jira/browse/BEAM-9563
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: Not applicable
>Reporter: Piotr Szuberski
>Assignee: Piotr Szuberski
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The class ToListCombineFn in the previous task has public access level but 
> can be private.
> sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java:420
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9444) Shall we use GCP Libraries BOM to specify Google-related library versions?

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9444?focusedWorklogId=407297=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407297
 ]

ASF GitHub Bot logged work on BEAM-9444:


Author: ASF GitHub Bot
Created on: 20/Mar/20 22:09
Start Date: 20/Mar/20 22:09
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #11156: [BEAM-9444] 
Use GCP Libraries BOM for Google Cloud Dependencies
URL: https://github.com/apache/beam/pull/11156#discussion_r395911140
 
 

 ##
 File path: 
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
 ##
 @@ -444,45 +439,46 @@ class BeamModulePlugin implements Plugin {
 commons_lang3   : 
"org.apache.commons:commons-lang3:3.9",
 commons_math3   : 
"org.apache.commons:commons-math3:3.6.1",
 error_prone_annotations : 
"com.google.errorprone:error_prone_annotations:2.0.15",
-gax : 
"com.google.api:gax:$gax_version",
-gax_grpc: 
"com.google.api:gax-grpc:$gax_version",
+gax : "com.google.api:gax",
+gax_grpc: 
"com.google.api:gax-grpc",
 google_api_client   : 
"com.google.api-client:google-api-client:$google_clients_version",
 google_api_client_jackson2  : 
"com.google.api-client:google-api-client-jackson2:$google_clients_version",
 google_api_client_java6 : 
"com.google.api-client:google-api-client-java6:$google_clients_version",
-google_api_common   : 
"com.google.api:api-common:1.8.1",
+google_api_common   : 
"com.google.api:api-common",
 
 Review comment:
   A question for my learning. Previously this file had specific dependency 
versions. For any released Beam version we could check the file in the release 
branch and get a list of dependencies and their versions.
   
   After this change, how can we do the same thing? Would it happen through a 
generated BOM file in the release branch?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407297)
Time Spent: 7h 20m  (was: 7h 10m)

> Shall we use GCP Libraries BOM to specify Google-related library versions?
> --
>
> Key: BEAM-9444
> URL: https://issues.apache.org/jira/browse/BEAM-9444
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
> Attachments: Screen Shot 2020-03-13 at 13.33.01.png, Screen Shot 
> 2020-03-17 at 16.01.16.png
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Shall we use GCP Libraries BOM to specify Google-related library versions?
>   
>  I've been working on Beam's dependency upgrades in the past few months. I 
> think it's time to consider a long-term solution to keep the libraries 
> up-to-date with small maintenance effort. To achieve that, I propose Beam to 
> use GCP Libraries BOM to set the Google-related library versions, rather than 
> trying to make changes in each of ~30 Google libraries.
>   
> h1. Background
> A BOM is pom.xml that provides dependencyManagement to importing projects.
>   
>  GCP Libraries BOM is a BOM that includes many Google Cloud related libraries 
> + gRPC + protobuf. We (Google Cloud Java Diamond Dependency team) maintain 
> the BOM so that the set of the libraries are compatible with each other.
>   
> h1. Implementation
> Notes for obstacles.
> h2. BeamModulePlugin's "force" does not take BOM into account (thus fails)
> {{forcedModules}} via version resolution strategy is playing bad. This causes
> {noformat}
> A problem occurred evaluating project ':sdks:java:extensions:sql'. 
> Could not resolve all dependencies for configuration 
> ':sdks:java:extensions:sql:fmppTemplates'.
> Invalid format: 'com.google.cloud:google-cloud-core'. Group, name and version 
> cannot be empty. Correct example: 'org.gradle:gradle-core:1.0'{noformat}
> !Screen Shot 2020-03-13 at 13.33.01.png|width=489,height=287! 
>   
> h2. :sdks:java:maven-archetypes:examples needs the version of 
> google-http-client
> The task requires the version for the library:
> {code:java}
> 'google-http-client.version': 
> 

[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407296=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407296
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 22:09
Start Date: 20/Mar/20 22:09
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395908901
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -229,101 +215,127 @@ message MonitoringInfo {
 NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }];
 NAME = 6 [(label_props) = { name: "NAME" }];
   }
+
   // A set of key+value labels which define the scope of the metric.
   // Either a well defined entity id for matching the enum names in
   // the MonitoringInfoLabels enum or any arbitrary label
   // set by a custom metric or user metric.
+  //
   // A monitoring system is expected to be able to aggregate the metrics
   // together for all updates having the same URN and labels. Some systems such
   // as Stackdriver will be able to aggregate the metrics using a subset of the
   // provided labels
-  map labels = 5;
-
-  // The walltime of the most recent update.
-  // Useful for aggregation for latest types such as LatestInt64.
-  google.protobuf.Timestamp timestamp = 6;
+  map labels = 4;
 }
 
 message MonitoringInfoTypeUrns {
   enum Enum {
+// Represents an integer counter where values are summed across bundles.
+//
+// Encoding: 
+//   - value: beam:coder:varint:v1
 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) =
-"beam:metrics:sum_int_64"];
-
-DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
- "beam:metrics:distribution_int_64"];
-
-LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
-   "beam:metrics:latest_int_64"];
+"beam:metrics:sum_int64:v1"];
+
+// Represents a double counter where values are summed across bundles.
+//
+// Encoding: 
+//   value: beam:coder:double:v1
+SUM_DOUBLE_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+"beam:metrics:sum_int64:v1"];
+
+// Represents a distribution of an integer value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+//
+// Encoding: 
+//   - count: beam:coder:varint:v1
+//   - sum:   beam:coder:varint:v1
+//   - min:   beam:coder:varint:v1
+//   - max:   beam:coder:varint:v1
+DISTRIBUTION_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+ "beam:metrics:distribution_int64:v1"];
+
+// Represents a distribution of a double value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+//
+// Encoding: 
+//   - count: beam:coder:varint:v1
+//   - sum:   beam:coder:double:v1
+//   - min:   beam:coder:double:v1
+//   - max:   beam:coder:double:v1
+DISTRIBUTION_DOUBLE_TYPE = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) 
=
+ "beam:metrics:distribution_int64:v1"];
+
+// Represents the latest seen integer value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+//
+// Encoding: 
+//   - timestamp: beam:coder:varint:v1 (milliseconds since epoch)
+//   - value: beam:coder:varint:v1
+LATEST_INT64_TYPE = 4 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+   "beam:metrics:latest_int64:v1"];
+
+// Represents the latest seen integer value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+//
+// Encoding: 
+//   - timestamp: beam:coder:varint:v1 (milliseconds since epoch)
+//   - value: beam:coder:double:v1
+LATEST_DOUBLE_TYPE = 5 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+   "beam:metrics:latest_int64:v1"];
+
+// Represents the largest set of integer values seen across bundles.
+//
+// Encoding: ...
+//   - valueX: beam:coder:varint:v1
+TOP_N_INT64_TYPE = 6 

[jira] [Work logged] (BEAM-8910) Use AVRO instead of JSON in BigQuery bounded source.

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8910?focusedWorklogId=407295=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407295
 ]

ASF GitHub Bot logged work on BEAM-8910:


Author: ASF GitHub Bot
Created on: 20/Mar/20 22:08
Start Date: 20/Mar/20 22:08
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11086: [BEAM-8910] 
Make custom BQ source read from Avro
URL: https://github.com/apache/beam/pull/11086#discussion_r395910855
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigquery_read_it_test.py
 ##
 @@ -236,11 +251,12 @@ def create_table(cls, table_name):
 cls.bigquery_client.insert_rows(
 cls.project, cls.dataset_id, table_name, table_data)
 
-  def get_expected_data(self):
+  def get_expected_data(self, native=True):
+byts = b'\xab\xac'
 expected_row = {
 'float': 0.33,
 'numeric': Decimal('10'),
-'bytes': base64.b64encode(b'\xab\xac'),
+'bytes': base64.b64encode(byts) if native else byts,
 
 Review comment:
   The behavior will be different for different transforms. Users will need to 
explicitly change the transform in their code. We can make them aware of the 
new transform, and its typing differences in release notes, and possibly in 
Pydoc for the new transform as well. Thoughts? 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407295)
Time Spent: 2h 50m  (was: 2h 40m)

> Use AVRO instead of JSON in BigQuery bounded source.
> 
>
> Key: BEAM-8910
> URL: https://issues.apache.org/jira/browse/BEAM-8910
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Kamil Wasilewski
>Assignee: Pablo Estrada
>Priority: Minor
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> The proposed BigQuery bounded source in Python SDK (see PR: 
> [https://github.com/apache/beam/pull/9772)] uses a BigQuery export job to 
> take a snapshot of the table and read from each produced JSON file. A 
> performance improvement can be gain by switching to AVRO instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9558) Make end of data channel explicit

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9558?focusedWorklogId=407294=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407294
 ]

ASF GitHub Bot logged work on BEAM-9558:


Author: ASF GitHub Bot
Created on: 20/Mar/20 22:06
Start Date: 20/Mar/20 22:06
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #11173: [BEAM-9558] Add 
explicit end bit for data channel.
URL: https://github.com/apache/beam/pull/11173#issuecomment-601930536
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407294)
Time Spent: 50m  (was: 40m)

> Make end of data channel explicit
> -
>
> Key: BEAM-9558
> URL: https://issues.apache.org/jira/browse/BEAM-9558
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently the end of a data channel is implicitly marked by sending an empty 
> data block. The protocol would be simplified by making this explicit, and it 
> would also prevent data loss bugs that might occur by accidentally sending an 
> empty block. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7923) Interactive Beam

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7923?focusedWorklogId=407292=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407292
 ]

ASF GitHub Bot logged work on BEAM-7923:


Author: ASF GitHub Bot
Created on: 20/Mar/20 22:04
Start Date: 20/Mar/20 22:04
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #11174: [BEAM-7923] Pop 
failed transform when error is raised
URL: https://github.com/apache/beam/pull/11174#issuecomment-601929637
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407292)
Time Spent: 9h  (was: 8h 50m)

> Interactive Beam
> 
>
> Key: BEAM-7923
> URL: https://issues.apache.org/jira/browse/BEAM-7923
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> This is the top level ticket for all efforts leveraging [interactive 
> Beam|[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/interactive]]
> As the development goes, blocking tickets will be added to this one.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407290=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407290
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 22:01
Start Date: 20/Mar/20 22:01
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395908901
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -229,101 +215,127 @@ message MonitoringInfo {
 NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }];
 NAME = 6 [(label_props) = { name: "NAME" }];
   }
+
   // A set of key+value labels which define the scope of the metric.
   // Either a well defined entity id for matching the enum names in
   // the MonitoringInfoLabels enum or any arbitrary label
   // set by a custom metric or user metric.
+  //
   // A monitoring system is expected to be able to aggregate the metrics
   // together for all updates having the same URN and labels. Some systems such
   // as Stackdriver will be able to aggregate the metrics using a subset of the
   // provided labels
-  map labels = 5;
-
-  // The walltime of the most recent update.
-  // Useful for aggregation for latest types such as LatestInt64.
-  google.protobuf.Timestamp timestamp = 6;
+  map labels = 4;
 }
 
 message MonitoringInfoTypeUrns {
   enum Enum {
+// Represents an integer counter where values are summed across bundles.
+//
+// Encoding: 
+//   - value: beam:coder:varint:v1
 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) =
-"beam:metrics:sum_int_64"];
-
-DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
- "beam:metrics:distribution_int_64"];
-
-LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
-   "beam:metrics:latest_int_64"];
+"beam:metrics:sum_int64:v1"];
+
+// Represents a double counter where values are summed across bundles.
+//
+// Encoding: 
+//   value: beam:coder:double:v1
+SUM_DOUBLE_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+"beam:metrics:sum_int64:v1"];
+
+// Represents a distribution of an integer value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+//
+// Encoding: 
+//   - count: beam:coder:varint:v1
+//   - sum:   beam:coder:varint:v1
+//   - min:   beam:coder:varint:v1
+//   - max:   beam:coder:varint:v1
+DISTRIBUTION_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+ "beam:metrics:distribution_int64:v1"];
+
+// Represents a distribution of a double value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+//
+// Encoding: 
+//   - count: beam:coder:varint:v1
+//   - sum:   beam:coder:double:v1
+//   - min:   beam:coder:double:v1
+//   - max:   beam:coder:double:v1
+DISTRIBUTION_DOUBLE_TYPE = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) 
=
+ "beam:metrics:distribution_int64:v1"];
+
+// Represents the latest seen integer value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+//
+// Encoding: 
+//   - timestamp: beam:coder:varint:v1 (milliseconds since epoch)
+//   - value: beam:coder:varint:v1
+LATEST_INT64_TYPE = 4 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+   "beam:metrics:latest_int64:v1"];
+
+// Represents the latest seen integer value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+//
+// Encoding: 
+//   - timestamp: beam:coder:varint:v1 (milliseconds since epoch)
+//   - value: beam:coder:double:v1
+LATEST_DOUBLE_TYPE = 5 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+   "beam:metrics:latest_int64:v1"];
+
+// Represents the largest set of integer values seen across bundles.
+//
+// Encoding: ...
+//   - valueX: beam:coder:varint:v1
+TOP_N_INT64_TYPE = 6 

[jira] [Work logged] (BEAM-9420) Configurable timeout for Kafka setupInitialOffset()

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9420?focusedWorklogId=407289=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407289
 ]

ASF GitHub Bot logged work on BEAM-9420:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:59
Start Date: 20/Mar/20 21:59
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #11099: [BEAM-9420] 
Configurable timeout for blocking kafka API call(s)
URL: https://github.com/apache/beam/pull/11099#issuecomment-601928283
 
 
   @aromanenko-dev maybe since he has been maintaing KafkaIo for a while.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407289)
Time Spent: 1h  (was: 50m)

> Configurable timeout for Kafka setupInitialOffset()
> ---
>
> Key: BEAM-9420
> URL: https://issues.apache.org/jira/browse/BEAM-9420
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kafka
>Affects Versions: 2.19.0
>Reporter: Jozef Vilcek
>Assignee: Jozef Vilcek
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> If bootstrap brokers does contain an unhealthy server, it can break the start 
> of a whole Beam job. During the start, `KafkaUnboundedReader` is waiting for  
> `setupInitialOffset()`. Wait timeout is either a double time of `request. 
> timeout.ms` or some default constant. In both cases, it might not be enough 
> time for kafka-client to initiate fallback and retry metadata discovery via 
> another broker from given bootstrap list.
> The client should be able to specify timeout for `setupInitialOffset()` 
> explicitly as a setting to KafkaIO read.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8019) Support cross-language transforms for DataflowRunner

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8019?focusedWorklogId=407287=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407287
 ]

ASF GitHub Bot logged work on BEAM-8019:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:56
Start Date: 20/Mar/20 21:56
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #11185: [BEAM-8019] Some 
generalizations to support cross-language transforms.
URL: https://github.com/apache/beam/pull/11185#issuecomment-601927190
 
 
   cc: @robertwb @lukecwik 
   
   Still addressing some unit test failures but sharing for any early comments.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407287)
Time Spent: 8h 20m  (was: 8h 10m)

> Support cross-language transforms for DataflowRunner
> 
>
> Key: BEAM-8019
> URL: https://issues.apache.org/jira/browse/BEAM-8019
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> This is to capture the Beam changes needed for this task.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8019) Support cross-language transforms for DataflowRunner

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8019?focusedWorklogId=407285=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407285
 ]

ASF GitHub Bot logged work on BEAM-8019:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:55
Start Date: 20/Mar/20 21:55
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #11185: 
[BEAM-8019] Some generalizations to support cross-language transforms.
URL: https://github.com/apache/beam/pull/11185
 
 
   These are needed for for runners that need to build a Python object graph 
from a runner API proto with external transforms (for example, Dataflow).
   
   Some generalizations to support cross-language transforms.
   
   Testing - cross-language test suite [1] works for Dataflow with these 
changes (will be enabled separately).
   
   WIP: not all tests pass yet
   
   [1] 
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/validate_runner_xlang_test.py#L51
   
   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 

[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407283=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407283
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:50
Start Date: 20/Mar/20 21:50
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395905193
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -229,101 +215,127 @@ message MonitoringInfo {
 NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }];
 NAME = 6 [(label_props) = { name: "NAME" }];
   }
+
   // A set of key+value labels which define the scope of the metric.
   // Either a well defined entity id for matching the enum names in
   // the MonitoringInfoLabels enum or any arbitrary label
   // set by a custom metric or user metric.
+  //
   // A monitoring system is expected to be able to aggregate the metrics
   // together for all updates having the same URN and labels. Some systems such
   // as Stackdriver will be able to aggregate the metrics using a subset of the
   // provided labels
-  map labels = 5;
-
-  // The walltime of the most recent update.
-  // Useful for aggregation for latest types such as LatestInt64.
-  google.protobuf.Timestamp timestamp = 6;
+  map labels = 4;
 }
 
 message MonitoringInfoTypeUrns {
 
 Review comment:
   Done
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407283)
Time Spent: 20.5h  (was: 20h 20m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 20.5h
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407277=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407277
 ]

ASF GitHub Bot logged work on BEAM-9537:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:45
Start Date: 20/Mar/20 21:45
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #11153: [BEAM-9537] Adding a 
new module for FnApiRunner
URL: https://github.com/apache/beam/pull/11153#issuecomment-601923625
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407277)
Time Spent: 2h 20m  (was: 2h 10m)

> Refactor FnApiRunner into its own package
> -
>
> Key: BEAM-9537
> URL: https://issues.apache.org/jira/browse/BEAM-9537
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Minor
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=407276=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407276
 ]

ASF GitHub Bot logged work on BEAM-8280:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:44
Start Date: 20/Mar/20 21:44
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #10717: [BEAM-8280] Enable type 
hint annotations
URL: https://github.com/apache/beam/pull/10717#issuecomment-601923382
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407276)
Time Spent: 10h  (was: 9h 50m)

> re-enable IOTypeHints.from_callable
> ---
>
> Key: BEAM-8280
> URL: https://issues.apache.org/jira/browse/BEAM-8280
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> See https://issues.apache.org/jira/browse/BEAM-8279



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9340) Properly populate pipeline proto requirements.

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9340?focusedWorklogId=407272=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407272
 ]

ASF GitHub Bot logged work on BEAM-9340:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:37
Start Date: 20/Mar/20 21:37
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #11165: [BEAM-9340] 
Populate requirements for Java.
URL: https://github.com/apache/beam/pull/11165#issuecomment-601921071
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407272)
Time Spent: 1h 50m  (was: 1h 40m)

> Properly populate pipeline proto requirements.
> --
>
> Key: BEAM-9340
> URL: https://issues.apache.org/jira/browse/BEAM-9340
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-go, sdk-java-core, sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9545) MVP: DataframeTransform

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9545?focusedWorklogId=407271=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407271
 ]

ASF GitHub Bot logged work on BEAM-9545:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:37
Start Date: 20/Mar/20 21:37
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #10760: [BEAM-9545] 
Dataframe transforms
URL: https://github.com/apache/beam/pull/10760#discussion_r395900936
 
 

 ##
 File path: sdks/python/apache_beam/dataframe/transforms.py
 ##
 @@ -0,0 +1,246 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import absolute_import
+
+import pandas as pd
+
+import apache_beam as beam
+from apache_beam import transforms
+from apache_beam.dataframe import expressions
+from apache_beam.dataframe import frame_base
+from apache_beam.dataframe import frames  # pylint: disable=unused-import
+
+
+class DataframeTransform(transforms.PTransform):
+  """A PTransform for applying function that takes and returns dataframes
+  to one or more PCollections.
+
+  For example, if pcoll is a PCollection of dataframes, one could write::
+
+  pcoll | DataframeTransform(lambda df: df.group_by('key').sum(), 
proxy=...)
+
+  To pass multiple PCollections, pass a tuple of PCollections wich will be
+  passed to the callable as positional arguments, or a dictionary of
+  PCollections, in which case they will be passed as keyword arguments.
+  """
+  def __init__(self, func, proxy):
+self._func = func
+self._proxy = proxy
+
+  def expand(self, input_pcolls):
+def wrap_as_dict(values):
+  if isinstance(values, dict):
+return values
+  elif isinstance(values, tuple):
+return dict(enumerate(values))
+  else:
+return {None: values}
+
+# TODO: Infer the proxy from the input schema.
+def proxy(key):
+  if key is None:
+return self._proxy
+  else:
+return self._proxy[key]
+
+# The input can be a dictionary, tuple, or plain PCollection.
+# Wrap as a dict for homogeneity.
+# TODO: Possibly inject batching here.
+input_dict = wrap_as_dict(input_pcolls)
+placeholders = {
+key: frame_base.DeferredFrame.wrap(
+expressions.PlaceholderExpression(proxy(key)))
+for key in input_dict.keys()
+}
+
+# The calling convention of the user-supplied func varies according to the
+# type of the input.
+if isinstance(input_pcolls, dict):
+  result_frames = self._func(**placeholders)
+elif isinstance(input_pcolls, tuple):
+  result_frames = self._func(
+  *(value for _, value in sorted(placeholders.items(
+else:
+  result_frames = self._func(placeholders[None])
+
+# Likewise the output may be a dict, tuple, or raw (deferred) Dataframe.
+result_dict = wrap_as_dict(result_frames)
+
+result_pcolls = self._apply_deferred_ops(
+{placeholders[key]._expr: pcoll
+ for key, pcoll in input_dict.items()},
+{key: df._expr
+ for key, df in result_dict.items()})
+
+# Convert the result back into a set of PCollections.
+if isinstance(result_frames, dict):
+  return result_pcolls
+elif isinstance(result_frames, tuple):
+  return tuple((value for _, value in sorted(result_pcolls.items(
+else:
+  return result_pcolls[None]
+
+  def _apply_deferred_ops(
+  self,
+  inputs,  # type: Dict[PlaceholderExpr, PCollection]
+  outputs,  # type: Dict[Any, Expression]
+  ):  # -> Dict[Any, PCollection]
+"""Construct a Beam graph that evaluates a set of expressions on a set of
+input PCollections.
+
+:param inputs: A mapping of placeholder expressions to PCollections.
+:param outputs: A mapping of keys to expressions defined in terms of the
+placeholders of inputs.
+
+Returns a dictionary whose keys are those of outputs, and whose values are
+PCollections corresponding to the values of outputs evaluated at the
+values of inputs.
+
+Logically, `_apply_deferred_ops({x: a, y: b}, {f: F(x, y), g: G(x, y)})`
+returns `{f: F(a, 

[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407270=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407270
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:35
Start Date: 20/Mar/20 21:35
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395900617
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -194,33 +188,25 @@ extend google.protobuf.EnumValueOptions {
 }
 
 message MonitoringInfo {
-  // The name defining the metric or monitored state.
+  // The name defining the semantic meaning of the metric or monitored state.
+  //
+  // See MonitoringInfoSpecs.Enum for the set of well known metrics/monitored
+  // state.
   string urn = 1;
 
-  // This is specified as a URN that implies:
-  // A message class: (Distribution, Counter, Extrema, MonitoringDataTable).
-  // Sub types like field formats - int64, double, string.
-  // Aggregation methods - SUM, LATEST, TOP-N, BOTTOM-N, DISTRIBUTION
-  // valid values are:
-  // beam:metrics:[sum_int_64|latest_int_64|top_n_int_64|bottom_n_int_64|
-  // sum_double|latest_double|top_n_double|bottom_n_double|
-  // distribution_int_64|distribution_double|monitoring_data_table|
-  // latest_doubles
+  // This is specified as a URN that implies the encoding and aggregation
+  // method. See MonitoringInfoTypeUrns.Enum for the set of well known types.
   string type = 2;
 
-  // The Metric or monitored state.
-  oneof data {
-MonitoringTableData monitoring_table_data = 3;
-Metric metric = 4;
-bytes payload = 7;
-  }
+  // The monitored state encoded as per the specification defined by the type.
+  bytes payload = 3;
 
 Review comment:
   Yup
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407270)
Time Spent: 20h 20m  (was: 20h 10m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 20h 20m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407269=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407269
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:35
Start Date: 20/Mar/20 21:35
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395900505
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -229,101 +215,127 @@ message MonitoringInfo {
 NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }];
 NAME = 6 [(label_props) = { name: "NAME" }];
   }
+
   // A set of key+value labels which define the scope of the metric.
   // Either a well defined entity id for matching the enum names in
   // the MonitoringInfoLabels enum or any arbitrary label
   // set by a custom metric or user metric.
+  //
   // A monitoring system is expected to be able to aggregate the metrics
   // together for all updates having the same URN and labels. Some systems such
   // as Stackdriver will be able to aggregate the metrics using a subset of the
   // provided labels
-  map labels = 5;
-
-  // The walltime of the most recent update.
-  // Useful for aggregation for latest types such as LatestInt64.
-  google.protobuf.Timestamp timestamp = 6;
+  map labels = 4;
 }
 
 message MonitoringInfoTypeUrns {
   enum Enum {
+// Represents an integer counter where values are summed across bundles.
+//
+// Encoding: 
+//   - value: beam:coder:varint:v1
 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) =
-"beam:metrics:sum_int_64"];
-
-DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
- "beam:metrics:distribution_int_64"];
-
-LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
-   "beam:metrics:latest_int_64"];
+"beam:metrics:sum_int64:v1"];
+
+// Represents a double counter where values are summed across bundles.
+//
+// Encoding: 
+//   value: beam:coder:double:v1
+SUM_DOUBLE_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+"beam:metrics:sum_int64:v1"];
+
+// Represents a distribution of an integer value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+//
+// Encoding: 
+//   - count: beam:coder:varint:v1
+//   - sum:   beam:coder:varint:v1
+//   - min:   beam:coder:varint:v1
+//   - max:   beam:coder:varint:v1
+DISTRIBUTION_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+ "beam:metrics:distribution_int64:v1"];
+
+// Represents a distribution of a double value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+//
+// Encoding: 
+//   - count: beam:coder:varint:v1
+//   - sum:   beam:coder:double:v1
+//   - min:   beam:coder:double:v1
+//   - max:   beam:coder:double:v1
+DISTRIBUTION_DOUBLE_TYPE = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) 
=
+ "beam:metrics:distribution_int64:v1"];
+
+// Represents the latest seen integer value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+//
+// Encoding: 
+//   - timestamp: beam:coder:varint:v1 (milliseconds since epoch)
+//   - value: beam:coder:varint:v1
+LATEST_INT64_TYPE = 4 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+   "beam:metrics:latest_int64:v1"];
+
+// Represents the latest seen integer value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+//
+// Encoding: 
+//   - timestamp: beam:coder:varint:v1 (milliseconds since epoch)
+//   - value: beam:coder:double:v1
+LATEST_DOUBLE_TYPE = 5 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+   "beam:metrics:latest_int64:v1"];
+
+// Represents the largest set of integer values seen across bundles.
+//
+// Encoding: ...
+//   - valueX: beam:coder:varint:v1
+TOP_N_INT64_TYPE = 6 

[jira] [Work logged] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?focusedWorklogId=407268=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407268
 ]

ASF GitHub Bot logged work on BEAM-8561:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:27
Start Date: 20/Mar/20 21:27
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10290: [BEAM-8561] Add 
ThriftIO to support IO for Thrift files
URL: https://github.com/apache/beam/pull/10290#issuecomment-601918008
 
 
   We let it like this because we had a regression, if we want to generate and 
compile the class we need to have thrift installed in all Beam workers (and as 
a requirement for everyone building Beam) which I think is clearly overklll 
just for a test.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407268)
Time Spent: 17.5h  (was: 17h 20m)

> Add ThriftIO to Support IO for Thrift Files
> ---
>
> Key: BEAM-8561
> URL: https://issues.apache.org/jira/browse/BEAM-8561
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-files
>Reporter: Chris Larsen
>Assignee: Chris Larsen
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 17.5h
>  Remaining Estimate: 0h
>
> Similar to AvroIO it would be very useful to support reading and writing 
> to/from Thrift files with a native connector. 
> Functionality would include:
>  # read() - Reading from one or more Thrift files.
>  # write() - Writing to one or more Thrift files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407267=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407267
 ]

ASF GitHub Bot logged work on BEAM-9537:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:26
Start Date: 20/Mar/20 21:26
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11153: [BEAM-9537] 
Adding a new module for FnApiRunner
URL: https://github.com/apache/beam/pull/11153#discussion_r395897316
 
 

 ##
 File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
 ##
 @@ -269,8 +269,8 @@ def visit_transform(self, transform_node):
   pcoll.element_type, transform_node.full_label)
   key_type, value_type = pcoll.element_type.tuple_types
   if transform_node.outputs:
-from apache_beam.runners.portability.fn_api_runner_transforms 
import \
-  only_element
+from apache_beam.runners.portability.fn_api_runner.transforms \
 
 Review comment:
   I meant that utilities like only_element should probably be in a more common 
place rather than imported elsewhere from here. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407267)
Time Spent: 2h 10m  (was: 2h)

> Refactor FnApiRunner into its own package
> -
>
> Key: BEAM-9537
> URL: https://issues.apache.org/jira/browse/BEAM-9537
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Minor
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407265=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407265
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:16
Start Date: 20/Mar/20 21:16
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395893496
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -229,101 +215,127 @@ message MonitoringInfo {
 NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }];
 NAME = 6 [(label_props) = { name: "NAME" }];
   }
+
   // A set of key+value labels which define the scope of the metric.
   // Either a well defined entity id for matching the enum names in
   // the MonitoringInfoLabels enum or any arbitrary label
   // set by a custom metric or user metric.
+  //
   // A monitoring system is expected to be able to aggregate the metrics
   // together for all updates having the same URN and labels. Some systems such
   // as Stackdriver will be able to aggregate the metrics using a subset of the
   // provided labels
-  map labels = 5;
-
-  // The walltime of the most recent update.
-  // Useful for aggregation for latest types such as LatestInt64.
-  google.protobuf.Timestamp timestamp = 6;
+  map labels = 4;
 }
 
 message MonitoringInfoTypeUrns {
   enum Enum {
+// Represents an integer counter where values are summed across bundles.
+//
+// Encoding: 
+//   - value: beam:coder:varint:v1
 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) =
-"beam:metrics:sum_int_64"];
-
-DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
- "beam:metrics:distribution_int_64"];
-
-LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
-   "beam:metrics:latest_int_64"];
+"beam:metrics:sum_int64:v1"];
+
+// Represents a double counter where values are summed across bundles.
+//
+// Encoding: 
+//   value: beam:coder:double:v1
+SUM_DOUBLE_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+"beam:metrics:sum_int64:v1"];
+
+// Represents a distribution of an integer value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+//
+// Encoding: 
+//   - count: beam:coder:varint:v1
+//   - sum:   beam:coder:varint:v1
+//   - min:   beam:coder:varint:v1
+//   - max:   beam:coder:varint:v1
+DISTRIBUTION_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+ "beam:metrics:distribution_int64:v1"];
+
+// Represents a distribution of a double value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+//
+// Encoding: 
+//   - count: beam:coder:varint:v1
+//   - sum:   beam:coder:double:v1
+//   - min:   beam:coder:double:v1
+//   - max:   beam:coder:double:v1
+DISTRIBUTION_DOUBLE_TYPE = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) 
=
+ "beam:metrics:distribution_int64:v1"];
+
+// Represents the latest seen integer value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+//
+// Encoding: 
+//   - timestamp: beam:coder:varint:v1 (milliseconds since epoch)
+//   - value: beam:coder:varint:v1
+LATEST_INT64_TYPE = 4 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+   "beam:metrics:latest_int64:v1"];
+
+// Represents the latest seen integer value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+//
+// Encoding: 
+//   - timestamp: beam:coder:varint:v1 (milliseconds since epoch)
+//   - value: beam:coder:double:v1
+LATEST_DOUBLE_TYPE = 5 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+   "beam:metrics:latest_int64:v1"];
+
+// Represents the largest set of integer values seen across bundles.
+//
+// Encoding: ...
+//   - valueX: beam:coder:varint:v1
+TOP_N_INT64_TYPE = 6 

[jira] [Work logged] (BEAM-8078) streaming_wordcount_debugging.py is missing a test

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8078?focusedWorklogId=407264=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407264
 ]

ASF GitHub Bot logged work on BEAM-8078:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:14
Start Date: 20/Mar/20 21:14
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10914: [BEAM-8078] 
streaming_wordcount_debugging.py is missing a test
URL: https://github.com/apache/beam/pull/10914#issuecomment-601913507
 
 
   @Tesio - unfortunately, due to an ongoing issue, jenkins only starts test 
when a committer requests it. This should be resolved but in the meantime, if 
you need to trigger tests feel free to ask on the dev@ list for someone to do 
it.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407264)
Time Spent: 4h  (was: 3h 50m)

> streaming_wordcount_debugging.py is missing a test
> --
>
> Key: BEAM-8078
> URL: https://issues.apache.org/jira/browse/BEAM-8078
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Aleksey Vysotin
>Priority: Minor
>  Labels: beginner, easy, newbie, starter
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> It's example code and should have a basic_test (like the other wordcount 
> variants in [1]) to at least verify that it runs in the latest Beam release.
> [1] 
> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407263=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407263
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:13
Start Date: 20/Mar/20 21:13
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395892463
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -229,101 +215,127 @@ message MonitoringInfo {
 NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }];
 NAME = 6 [(label_props) = { name: "NAME" }];
   }
+
   // A set of key+value labels which define the scope of the metric.
   // Either a well defined entity id for matching the enum names in
   // the MonitoringInfoLabels enum or any arbitrary label
   // set by a custom metric or user metric.
+  //
   // A monitoring system is expected to be able to aggregate the metrics
   // together for all updates having the same URN and labels. Some systems such
   // as Stackdriver will be able to aggregate the metrics using a subset of the
   // provided labels
-  map labels = 5;
-
-  // The walltime of the most recent update.
-  // Useful for aggregation for latest types such as LatestInt64.
-  google.protobuf.Timestamp timestamp = 6;
+  map labels = 4;
 }
 
 message MonitoringInfoTypeUrns {
 
 Review comment:
   Let's add some comments to make it clear the type is referring to what is 
collected in each MonitoringInfo update, and how they should be aggregated 
together
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407263)
Time Spent: 19h 50m  (was: 19h 40m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 19h 50m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8078) streaming_wordcount_debugging.py is missing a test

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8078?focusedWorklogId=407262=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407262
 ]

ASF GitHub Bot logged work on BEAM-8078:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:13
Start Date: 20/Mar/20 21:13
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10914: [BEAM-8078] 
streaming_wordcount_debugging.py is missing a test
URL: https://github.com/apache/beam/pull/10914#issuecomment-601913195
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407262)
Time Spent: 3h 50m  (was: 3h 40m)

> streaming_wordcount_debugging.py is missing a test
> --
>
> Key: BEAM-8078
> URL: https://issues.apache.org/jira/browse/BEAM-8078
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Aleksey Vysotin
>Priority: Minor
>  Labels: beginner, easy, newbie, starter
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> It's example code and should have a basic_test (like the other wordcount 
> variants in [1]) to at least verify that it runs in the latest Beam release.
> [1] 
> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407260=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407260
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:07
Start Date: 20/Mar/20 21:07
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395888786
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -194,33 +188,25 @@ extend google.protobuf.EnumValueOptions {
 }
 
 message MonitoringInfo {
-  // The name defining the metric or monitored state.
+  // The name defining the semantic meaning of the metric or monitored state.
+  //
+  // See MonitoringInfoSpecs.Enum for the set of well known metrics/monitored
+  // state.
   string urn = 1;
 
-  // This is specified as a URN that implies:
-  // A message class: (Distribution, Counter, Extrema, MonitoringDataTable).
-  // Sub types like field formats - int64, double, string.
-  // Aggregation methods - SUM, LATEST, TOP-N, BOTTOM-N, DISTRIBUTION
-  // valid values are:
-  // beam:metrics:[sum_int_64|latest_int_64|top_n_int_64|bottom_n_int_64|
-  // sum_double|latest_double|top_n_double|bottom_n_double|
-  // distribution_int_64|distribution_double|monitoring_data_table|
-  // latest_doubles
+  // This is specified as a URN that implies the encoding and aggregation
+  // method. See MonitoringInfoTypeUrns.Enum for the set of well known types.
   string type = 2;
 
-  // The Metric or monitored state.
-  oneof data {
-MonitoringTableData monitoring_table_data = 3;
-Metric metric = 4;
-bytes payload = 7;
-  }
+  // The monitored state encoded as per the specification defined by the type.
+  bytes payload = 3;
 
 Review comment:
   Probably you are already planning on doing this. But having helper functions 
to easily encode/decode these would be great.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407260)
Time Spent: 19h 40m  (was: 19.5h)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 19h 40m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407258=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407258
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:07
Start Date: 20/Mar/20 21:07
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395888153
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -229,101 +215,127 @@ message MonitoringInfo {
 NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }];
 NAME = 6 [(label_props) = { name: "NAME" }];
   }
+
   // A set of key+value labels which define the scope of the metric.
   // Either a well defined entity id for matching the enum names in
   // the MonitoringInfoLabels enum or any arbitrary label
   // set by a custom metric or user metric.
+  //
   // A monitoring system is expected to be able to aggregate the metrics
   // together for all updates having the same URN and labels. Some systems such
   // as Stackdriver will be able to aggregate the metrics using a subset of the
   // provided labels
-  map labels = 5;
-
-  // The walltime of the most recent update.
-  // Useful for aggregation for latest types such as LatestInt64.
-  google.protobuf.Timestamp timestamp = 6;
+  map labels = 4;
 }
 
 message MonitoringInfoTypeUrns {
   enum Enum {
+// Represents an integer counter where values are summed across bundles.
+//
+// Encoding: 
+//   - value: beam:coder:varint:v1
 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) =
-"beam:metrics:sum_int_64"];
-
-DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
- "beam:metrics:distribution_int_64"];
-
-LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
-   "beam:metrics:latest_int_64"];
+"beam:metrics:sum_int64:v1"];
+
+// Represents a double counter where values are summed across bundles.
+//
+// Encoding: 
+//   value: beam:coder:double:v1
+SUM_DOUBLE_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+"beam:metrics:sum_int64:v1"];
+
+// Represents a distribution of an integer value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+//
+// Encoding: 
+//   - count: beam:coder:varint:v1
+//   - sum:   beam:coder:varint:v1
+//   - min:   beam:coder:varint:v1
+//   - max:   beam:coder:varint:v1
+DISTRIBUTION_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+ "beam:metrics:distribution_int64:v1"];
+
+// Represents a distribution of a double value where:
+//   - count: represents the number of values seen across all bundles
+//   - sum: represents the total of the value across all bundles
+//   - min: represents the smallest value seen across all bundles
+//   - max: represents the largest value seen across all bundles
+//
+// Encoding: 
+//   - count: beam:coder:varint:v1
+//   - sum:   beam:coder:double:v1
+//   - min:   beam:coder:double:v1
+//   - max:   beam:coder:double:v1
+DISTRIBUTION_DOUBLE_TYPE = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) 
=
+ "beam:metrics:distribution_int64:v1"];
+
+// Represents the latest seen integer value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+//
+// Encoding: 
+//   - timestamp: beam:coder:varint:v1 (milliseconds since epoch)
+//   - value: beam:coder:varint:v1
+LATEST_INT64_TYPE = 4 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+   "beam:metrics:latest_int64:v1"];
+
+// Represents the latest seen integer value. The timestamp is used to
+// provide an "ordering" over multiple values to determine which is the
+// latest.
+//
+// Encoding: 
+//   - timestamp: beam:coder:varint:v1 (milliseconds since epoch)
+//   - value: beam:coder:double:v1
+LATEST_DOUBLE_TYPE = 5 [(org.apache.beam.model.pipeline.v1.beam_urn) =
+   "beam:metrics:latest_int64:v1"];
+
+// Represents the largest set of integer values seen across bundles.
+//
+// Encoding: ...
+//   - valueX: beam:coder:varint:v1
+TOP_N_INT64_TYPE = 6 

[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407256=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407256
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:07
Start Date: 20/Mar/20 21:07
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395888595
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -229,101 +215,127 @@ message MonitoringInfo {
 NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }];
 NAME = 6 [(label_props) = { name: "NAME" }];
   }
+
   // A set of key+value labels which define the scope of the metric.
   // Either a well defined entity id for matching the enum names in
   // the MonitoringInfoLabels enum or any arbitrary label
   // set by a custom metric or user metric.
+  //
   // A monitoring system is expected to be able to aggregate the metrics
   // together for all updates having the same URN and labels. Some systems such
   // as Stackdriver will be able to aggregate the metrics using a subset of the
   // provided labels
-  map labels = 5;
-
-  // The walltime of the most recent update.
-  // Useful for aggregation for latest types such as LatestInt64.
-  google.protobuf.Timestamp timestamp = 6;
+  map labels = 4;
 }
 
 message MonitoringInfoTypeUrns {
   enum Enum {
+// Represents an integer counter where values are summed across bundles.
+//
+// Encoding: 
+//   - value: beam:coder:varint:v1
 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) =
 
 Review comment:
   I think there are only a few of these types being used now. SUM_INT64_TYPE 
and DISTRIBUTION_INT64_TYPE. I hope we can make it very simple to add new ones 
of these with minimal changes (Adding MonitoringInfoSpec and reusing existing 
framework/libraries in the SDK, runners can mostly pass them through to a 
service to aggregate across multiple workers)
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407256)
Time Spent: 19.5h  (was: 19h 20m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 19.5h
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407259=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407259
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:07
Start Date: 20/Mar/20 21:07
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395889104
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -194,33 +188,25 @@ extend google.protobuf.EnumValueOptions {
 }
 
 message MonitoringInfo {
-  // The name defining the metric or monitored state.
+  // The name defining the semantic meaning of the metric or monitored state.
+  //
+  // See MonitoringInfoSpecs.Enum for the set of well known metrics/monitored
+  // state.
   string urn = 1;
 
-  // This is specified as a URN that implies:
-  // A message class: (Distribution, Counter, Extrema, MonitoringDataTable).
-  // Sub types like field formats - int64, double, string.
-  // Aggregation methods - SUM, LATEST, TOP-N, BOTTOM-N, DISTRIBUTION
-  // valid values are:
-  // beam:metrics:[sum_int_64|latest_int_64|top_n_int_64|bottom_n_int_64|
-  // sum_double|latest_double|top_n_double|bottom_n_double|
-  // distribution_int_64|distribution_double|monitoring_data_table|
-  // latest_doubles
+  // This is specified as a URN that implies the encoding and aggregation
+  // method. See MonitoringInfoTypeUrns.Enum for the set of well known types.
   string type = 2;
 
-  // The Metric or monitored state.
-  oneof data {
-MonitoringTableData monitoring_table_data = 3;
-Metric metric = 4;
-bytes payload = 7;
-  }
+  // The monitored state encoded as per the specification defined by the type.
+  bytes payload = 3;
 
 Review comment:
   My biggest concern is losing the ability to print debug strings, which are 
helpful when people are trying to learn how these are populated. But maybe we 
can just add a few obvious places to dump debug logs, debug files, etc with the 
MonitoringInfors parses properly.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407259)
Time Spent: 19h 40m  (was: 19.5h)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 19h 40m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7923) Interactive Beam

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7923?focusedWorklogId=407255=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407255
 ]

ASF GitHub Bot logged work on BEAM-7923:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:07
Start Date: 20/Mar/20 21:07
Worklog Time Spent: 10m 
  Work Description: KevinGG commented on pull request #11174: [BEAM-7923] 
Pop failed transform in CombineGlobally
URL: https://github.com/apache/beam/pull/11174#discussion_r395890026
 
 

 ##
 File path: sdks/python/apache_beam/transforms/core.py
 ##
 @@ -1811,6 +1811,8 @@ def add_input_types(transform):
   return view
 else:
   if pcoll.windowing.windowfn != GlobalWindows():
+# Remove the broken transform when running into value error.
+pcoll.pipeline.transforms_stack.pop()
 
 Review comment:
   Yes, agree with it! I'll make the change.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407255)
Time Spent: 8h 50m  (was: 8h 40m)

> Interactive Beam
> 
>
> Key: BEAM-7923
> URL: https://issues.apache.org/jira/browse/BEAM-7923
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> This is the top level ticket for all efforts leveraging [interactive 
> Beam|[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/interactive]]
> As the development goes, blocking tickets will be added to this one.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407257=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407257
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:07
Start Date: 20/Mar/20 21:07
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#discussion_r395887788
 
 

 ##
 File path: model/pipeline/src/main/proto/metrics.proto
 ##
 @@ -139,7 +137,7 @@ message MonitoringInfoSpecs {
 
 USER_DISTRIBUTION_COUNTER = 6 [(monitoring_info_spec) = {
   urn: "beam:metric:user_distribution",
-  type_urn: "beam:metrics:distribution_int_64",
+  type_urn: "beam:metrics:distribution_int64",
 
 Review comment:
   Add a :v1 here
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407257)
Time Spent: 19.5h  (was: 19h 20m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 19.5h
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8078) streaming_wordcount_debugging.py is missing a test

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8078?focusedWorklogId=407253=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407253
 ]

ASF GitHub Bot logged work on BEAM-8078:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:06
Start Date: 20/Mar/20 21:06
Worklog Time Spent: 10m 
  Work Description: Tesio commented on issue #10914: [BEAM-8078] 
streaming_wordcount_debugging.py is missing a test
URL: https://github.com/apache/beam/pull/10914#issuecomment-601910749
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407253)
Time Spent: 3h 40m  (was: 3.5h)

> streaming_wordcount_debugging.py is missing a test
> --
>
> Key: BEAM-8078
> URL: https://issues.apache.org/jira/browse/BEAM-8078
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Aleksey Vysotin
>Priority: Minor
>  Labels: beginner, easy, newbie, starter
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> It's example code and should have a basic_test (like the other wordcount 
> variants in [1]) to at least verify that it runs in the latest Beam release.
> [1] 
> https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9558) Make end of data channel explicit

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9558?focusedWorklogId=407252=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407252
 ]

ASF GitHub Bot logged work on BEAM-9558:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:06
Start Date: 20/Mar/20 21:06
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #11173: [BEAM-9558] Add 
explicit end bit for data channel.
URL: https://github.com/apache/beam/pull/11173#issuecomment-601910742
 
 
   Rebased on #11177 and updated the proto for timers.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407252)
Time Spent: 40m  (was: 0.5h)

> Make end of data channel explicit
> -
>
> Key: BEAM-9558
> URL: https://issues.apache.org/jira/browse/BEAM-9558
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently the end of a data channel is implicitly marked by sending an empty 
> data block. The protocol would be simplified by making this explicit, and it 
> would also prevent data loss bugs that might occur by accidentally sending an 
> empty block. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407251=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407251
 ]

ASF GitHub Bot logged work on BEAM-9537:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:05
Start Date: 20/Mar/20 21:05
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11153: [BEAM-9537] 
Adding a new module for FnApiRunner
URL: https://github.com/apache/beam/pull/11153#discussion_r395889275
 
 

 ##
 File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
 ##
 @@ -269,8 +269,8 @@ def visit_transform(self, transform_node):
   pcoll.element_type, transform_node.full_label)
   key_type, value_type = pcoll.element_type.tuple_types
   if transform_node.outputs:
-from apache_beam.runners.portability.fn_api_runner_transforms 
import \
-  only_element
+from apache_beam.runners.portability.fn_api_runner.transforms \
 
 Review comment:
   You mean the newly named `translations` module may need to exist in 
`apache_beam/utils`?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407251)
Time Spent: 2h  (was: 1h 50m)

> Refactor FnApiRunner into its own package
> -
>
> Key: BEAM-9537
> URL: https://issues.apache.org/jira/browse/BEAM-9537
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Minor
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407249=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407249
 ]

ASF GitHub Bot logged work on BEAM-9537:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:04
Start Date: 20/Mar/20 21:04
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11153: [BEAM-9537] 
Adding a new module for FnApiRunner
URL: https://github.com/apache/beam/pull/11153#discussion_r395888936
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner.py
 ##
 @@ -78,12 +78,12 @@
 from apache_beam.runners import pipeline_context
 from apache_beam.runners import runner
 from apache_beam.runners.portability import artifact_service
-from apache_beam.runners.portability import fn_api_runner_transforms
 from apache_beam.runners.portability import portable_metrics
-from apache_beam.runners.portability.fn_api_runner_transforms import 
create_buffer_id
-from apache_beam.runners.portability.fn_api_runner_transforms import 
only_element
-from apache_beam.runners.portability.fn_api_runner_transforms import 
split_buffer_id
-from apache_beam.runners.portability.fn_api_runner_transforms import 
unique_name
+from apache_beam.runners.portability.fn_api_runner import transforms
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407249)
Time Spent: 1h 40m  (was: 1.5h)

> Refactor FnApiRunner into its own package
> -
>
> Key: BEAM-9537
> URL: https://issues.apache.org/jira/browse/BEAM-9537
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Minor
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407250=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407250
 ]

ASF GitHub Bot logged work on BEAM-9537:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:04
Start Date: 20/Mar/20 21:04
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #11153: [BEAM-9537] 
Adding a new module for FnApiRunner
URL: https://github.com/apache/beam/pull/11153#discussion_r395888936
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner.py
 ##
 @@ -78,12 +78,12 @@
 from apache_beam.runners import pipeline_context
 from apache_beam.runners import runner
 from apache_beam.runners.portability import artifact_service
-from apache_beam.runners.portability import fn_api_runner_transforms
 from apache_beam.runners.portability import portable_metrics
-from apache_beam.runners.portability.fn_api_runner_transforms import 
create_buffer_id
-from apache_beam.runners.portability.fn_api_runner_transforms import 
only_element
-from apache_beam.runners.portability.fn_api_runner_transforms import 
split_buffer_id
-from apache_beam.runners.portability.fn_api_runner_transforms import 
unique_name
+from apache_beam.runners.portability.fn_api_runner import transforms
 
 Review comment:
   Done. Went with translations. But I also like optimizations... Maybe I'll 
rename if it becomes a little bit more optimizations than translations down the 
road.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407250)
Time Spent: 1h 50m  (was: 1h 40m)

> Refactor FnApiRunner into its own package
> -
>
> Key: BEAM-9537
> URL: https://issues.apache.org/jira/browse/BEAM-9537
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407237=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407237
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 21:00
Start Date: 20/Mar/20 21:00
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11184: [WIP][BEAM-4374] 
Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#issuecomment-601897399
 
 
   CC: @ajamato @robertwb 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407237)
Time Spent: 19h 20m  (was: 19h 10m)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 19h 20m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407235=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407235
 ]

ASF GitHub Bot logged work on BEAM-9537:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:58
Start Date: 20/Mar/20 20:58
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11153: [BEAM-9537] 
Adding a new module for FnApiRunner
URL: https://github.com/apache/beam/pull/11153#discussion_r395885958
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner.py
 ##
 @@ -78,12 +78,12 @@
 from apache_beam.runners import pipeline_context
 from apache_beam.runners import runner
 from apache_beam.runners.portability import artifact_service
-from apache_beam.runners.portability import fn_api_runner_transforms
 from apache_beam.runners.portability import portable_metrics
-from apache_beam.runners.portability.fn_api_runner_transforms import 
create_buffer_id
-from apache_beam.runners.portability.fn_api_runner_transforms import 
only_element
-from apache_beam.runners.portability.fn_api_runner_transforms import 
split_buffer_id
-from apache_beam.runners.portability.fn_api_runner_transforms import 
unique_name
+from apache_beam.runners.portability.fn_api_runner import transforms
 
 Review comment:
   Seeing this unqualified makes me realize that it's ambiguous with the notion 
of a PTransform (and the apache_beam.transforms package). Maybe we should call 
it `optimizations` or `translations`? 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407235)
Time Spent: 1.5h  (was: 1h 20m)

> Refactor FnApiRunner into its own package
> -
>
> Key: BEAM-9537
> URL: https://issues.apache.org/jira/browse/BEAM-9537
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407236=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407236
 ]

ASF GitHub Bot logged work on BEAM-9537:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:58
Start Date: 20/Mar/20 20:58
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11153: [BEAM-9537] 
Adding a new module for FnApiRunner
URL: https://github.com/apache/beam/pull/11153#discussion_r395884481
 
 

 ##
 File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
 ##
 @@ -269,8 +269,8 @@ def visit_transform(self, transform_node):
   pcoll.element_type, transform_node.full_label)
   key_type, value_type = pcoll.element_type.tuple_types
   if transform_node.outputs:
-from apache_beam.runners.portability.fn_api_runner_transforms 
import \
-  only_element
+from apache_beam.runners.portability.fn_api_runner.transforms \
 
 Review comment:
   You don't have to do this in this PR, but this should probably be in 
apache_beam/utils. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407236)
Time Spent: 1.5h  (was: 1h 20m)

> Refactor FnApiRunner into its own package
> -
>
> Key: BEAM-9537
> URL: https://issues.apache.org/jira/browse/BEAM-9537
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7923) Interactive Beam

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7923?focusedWorklogId=407231=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407231
 ]

ASF GitHub Bot logged work on BEAM-7923:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:49
Start Date: 20/Mar/20 20:49
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11174: [BEAM-7923] 
Pop failed transform in CombineGlobally
URL: https://github.com/apache/beam/pull/11174#discussion_r395883196
 
 

 ##
 File path: sdks/python/apache_beam/transforms/core.py
 ##
 @@ -1811,6 +1811,8 @@ def add_input_types(transform):
   return view
 else:
   if pcoll.windowing.windowfn != GlobalWindows():
+# Remove the broken transform when running into value error.
+pcoll.pipeline.transforms_stack.pop()
 
 Review comment:
   E.g. put this 
https://github.com/apache/beam/blob/release-2.19.0/sdks/python/apache_beam/pipeline.py#L330
 in a finally clause of a try block that starts where it's pushed. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407231)
Time Spent: 8h 40m  (was: 8.5h)

> Interactive Beam
> 
>
> Key: BEAM-7923
> URL: https://issues.apache.org/jira/browse/BEAM-7923
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> This is the top level ticket for all efforts leveraging [interactive 
> Beam|[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/interactive]]
> As the development goes, blocking tickets will be added to this one.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7923) Interactive Beam

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7923?focusedWorklogId=407229=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407229
 ]

ASF GitHub Bot logged work on BEAM-7923:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:47
Start Date: 20/Mar/20 20:47
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11174: [BEAM-7923] 
Pop failed transform in CombineGlobally
URL: https://github.com/apache/beam/pull/11174#discussion_r395882293
 
 

 ##
 File path: sdks/python/apache_beam/transforms/core.py
 ##
 @@ -1811,6 +1811,8 @@ def add_input_types(transform):
   return view
 else:
   if pcoll.windowing.windowfn != GlobalWindows():
+# Remove the broken transform when running into value error.
+pcoll.pipeline.transforms_stack.pop()
 
 Review comment:
   This is not the right place to pop this (internal) stack. Instead, we should 
popping from the stack in a finally clause of a try block that pushes to the 
stack. (Alternatively, we could manage the stack with a Python context, but 
that might be overkill.)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407229)
Time Spent: 8.5h  (was: 8h 20m)

> Interactive Beam
> 
>
> Key: BEAM-7923
> URL: https://issues.apache.org/jira/browse/BEAM-7923
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> This is the top level ticket for all efforts leveraging [interactive 
> Beam|[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/interactive]]
> As the development goes, blocking tickets will be added to this one.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407225=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407225
 ]

ASF GitHub Bot logged work on BEAM-9542:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:35
Start Date: 20/Mar/20 20:35
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #11168: [BEAM-9542] Limit and 
clarify the effect of "force" in Java build
URL: https://github.com/apache/beam/pull/11168#issuecomment-601899334
 
 
   Run Spark ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407225)
Time Spent: 4h  (was: 3h 50m)

> Where the BeamModulePlugin's force is needed?
> -
>
> Key: BEAM-9542
> URL: https://issues.apache.org/jira/browse/BEAM-9542
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735
> {noformat}
> > Task :sdks:java:core:compileTestJava FAILED
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> Note: Some input files use or override a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 2 errors
> <-> 36% EXECUTING [19m 37s]
> {noformat}
> Memo for my Mac:
> {noformat}
> suztomo-macbookpro44% ./gradlew -p sdks/java check -x 
> extensions:sql:zetasql:check -x harness:test -x io:jdbc:test  -x 
> io:kafka:test -x io:solr:test -x core:test
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407223=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407223
 ]

ASF GitHub Bot logged work on BEAM-9542:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:35
Start Date: 20/Mar/20 20:35
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #11168: [BEAM-9542] Limit and 
clarify the effect of "force" in Java build
URL: https://github.com/apache/beam/pull/11168#issuecomment-601899190
 
 
   Run Java HadoopFormatIO Performance Test
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407223)
Time Spent: 3h 40m  (was: 3.5h)

> Where the BeamModulePlugin's force is needed?
> -
>
> Key: BEAM-9542
> URL: https://issues.apache.org/jira/browse/BEAM-9542
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735
> {noformat}
> > Task :sdks:java:core:compileTestJava FAILED
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> Note: Some input files use or override a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 2 errors
> <-> 36% EXECUTING [19m 37s]
> {noformat}
> Memo for my Mac:
> {noformat}
> suztomo-macbookpro44% ./gradlew -p sdks/java check -x 
> extensions:sql:zetasql:check -x harness:test -x io:jdbc:test  -x 
> io:kafka:test -x io:solr:test -x core:test
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407224=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407224
 ]

ASF GitHub Bot logged work on BEAM-9542:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:35
Start Date: 20/Mar/20 20:35
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #11168: [BEAM-9542] Limit and 
clarify the effect of "force" in Java build
URL: https://github.com/apache/beam/pull/11168#issuecomment-601899254
 
 
   Run Dataflow ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407224)
Time Spent: 3h 50m  (was: 3h 40m)

> Where the BeamModulePlugin's force is needed?
> -
>
> Key: BEAM-9542
> URL: https://issues.apache.org/jira/browse/BEAM-9542
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735
> {noformat}
> > Task :sdks:java:core:compileTestJava FAILED
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> Note: Some input files use or override a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 2 errors
> <-> 36% EXECUTING [19m 37s]
> {noformat}
> Memo for my Mac:
> {noformat}
> suztomo-macbookpro44% ./gradlew -p sdks/java check -x 
> extensions:sql:zetasql:check -x harness:test -x io:jdbc:test  -x 
> io:kafka:test -x io:solr:test -x core:test
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407226=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407226
 ]

ASF GitHub Bot logged work on BEAM-9542:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:35
Start Date: 20/Mar/20 20:35
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #11168: [BEAM-9542] Limit and 
clarify the effect of "force" in Java build
URL: https://github.com/apache/beam/pull/11168#issuecomment-601899388
 
 
   Run SQL Postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407226)
Time Spent: 4h 10m  (was: 4h)

> Where the BeamModulePlugin's force is needed?
> -
>
> Key: BEAM-9542
> URL: https://issues.apache.org/jira/browse/BEAM-9542
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735
> {noformat}
> > Task :sdks:java:core:compileTestJava FAILED
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> Note: Some input files use or override a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 2 errors
> <-> 36% EXECUTING [19m 37s]
> {noformat}
> Memo for my Mac:
> {noformat}
> suztomo-macbookpro44% ./gradlew -p sdks/java check -x 
> extensions:sql:zetasql:check -x harness:test -x io:jdbc:test  -x 
> io:kafka:test -x io:solr:test -x core:test
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407222=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407222
 ]

ASF GitHub Bot logged work on BEAM-9542:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:34
Start Date: 20/Mar/20 20:34
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #11168: [BEAM-9542] Limit and 
clarify the effect of "force" in Java build
URL: https://github.com/apache/beam/pull/11168#issuecomment-601899121
 
 
   Run Java PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407222)
Time Spent: 3.5h  (was: 3h 20m)

> Where the BeamModulePlugin's force is needed?
> -
>
> Key: BEAM-9542
> URL: https://issues.apache.org/jira/browse/BEAM-9542
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735
> {noformat}
> > Task :sdks:java:core:compileTestJava FAILED
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> Note: Some input files use or override a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 2 errors
> <-> 36% EXECUTING [19m 37s]
> {noformat}
> Memo for my Mac:
> {noformat}
> suztomo-macbookpro44% ./gradlew -p sdks/java check -x 
> extensions:sql:zetasql:check -x harness:test -x io:jdbc:test  -x 
> io:kafka:test -x io:solr:test -x core:test
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407219=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407219
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:30
Start Date: 20/Mar/20 20:30
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11184: 
[WIP][BEAM-4374] Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184
 
 
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | 

[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407220=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407220
 ]

ASF GitHub Bot logged work on BEAM-4374:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:30
Start Date: 20/Mar/20 20:30
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11184: [WIP][BEAM-4374] 
Update protos related to MonitoringInfo.
URL: https://github.com/apache/beam/pull/11184#issuecomment-601897399
 
 
   CC: @ajamato 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407220)
Time Spent: 19h 10m  (was: 19h)

> Update existing metrics in the FN API to use new Metric Schema
> --
>
> Key: BEAM-4374
> URL: https://issues.apache.org/jira/browse/BEAM-4374
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Alex Amato
>Priority: Major
>  Time Spent: 19h 10m
>  Remaining Estimate: 0h
>
> Update existing metrics to use the new proto and cataloging schema defined in:
> [_https://s.apache.org/beam-fn-api-metrics_]
>  * Check in new protos
>  * Define catalog file for metrics
>  * Port existing metrics to use this new format, based on catalog 
> names+metadata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407217=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407217
 ]

ASF GitHub Bot logged work on BEAM-9542:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:22
Start Date: 20/Mar/20 20:22
Worklog Time Spent: 10m 
  Work Description: suztomo commented on issue #11168: [BEAM-9542] Limit 
and clarify the effect of "force" in Java build
URL: https://github.com/apache/beam/pull/11168#issuecomment-601894622
 
 
   Yes, please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407217)
Time Spent: 3h 20m  (was: 3h 10m)

> Where the BeamModulePlugin's force is needed?
> -
>
> Key: BEAM-9542
> URL: https://issues.apache.org/jira/browse/BEAM-9542
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735
> {noformat}
> > Task :sdks:java:core:compileTestJava FAILED
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> Note: Some input files use or override a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 2 errors
> <-> 36% EXECUTING [19m 37s]
> {noformat}
> Memo for my Mac:
> {noformat}
> suztomo-macbookpro44% ./gradlew -p sdks/java check -x 
> extensions:sql:zetasql:check -x harness:test -x io:jdbc:test  -x 
> io:kafka:test -x io:solr:test -x core:test
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407216=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407216
 ]

ASF GitHub Bot logged work on BEAM-9542:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:21
Start Date: 20/Mar/20 20:21
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #11168: [BEAM-9542] Limit and 
clarify the effect of "force" in Java build
URL: https://github.com/apache/beam/pull/11168#issuecomment-601894197
 
 
   @suztomo do you need to rerun the checks that were not triggered? 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407216)
Time Spent: 3h 10m  (was: 3h)

> Where the BeamModulePlugin's force is needed?
> -
>
> Key: BEAM-9542
> URL: https://issues.apache.org/jira/browse/BEAM-9542
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735
> {noformat}
> > Task :sdks:java:core:compileTestJava FAILED
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21:
>  error: cannot find symbol
> import static org.junit.Assert.assertThrows;
> ^
>   symbol:   static assertThrows
>   location: class
> Note: Some input files use or override a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 2 errors
> <-> 36% EXECUTING [19m 37s]
> {noformat}
> Memo for my Mac:
> {noformat}
> suztomo-macbookpro44% ./gradlew -p sdks/java check -x 
> extensions:sql:zetasql:check -x harness:test -x io:jdbc:test  -x 
> io:kafka:test -x io:solr:test -x core:test
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=407213=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407213
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:08
Start Date: 20/Mar/20 20:08
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #11067: [BEAM-9136]Add 
licenses for dependencies
URL: https://github.com/apache/beam/pull/11067#issuecomment-601889129
 
 
   @tvalentyn , I fixed all your comments about Python. PTAL when you have time.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407213)
Time Spent: 2h 50m  (was: 2h 40m)

> Add LICENSES and NOTICES to docker images
> -
>
> Key: BEAM-9136
> URL: https://issues.apache.org/jira/browse/BEAM-9136
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Scan dependencies and add licenses and notices of the dependencies to SDK 
> docker images.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=407212=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407212
 ]

ASF GitHub Bot logged work on BEAM-9136:


Author: ASF GitHub Bot
Created on: 20/Mar/20 20:07
Start Date: 20/Mar/20 20:07
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #11067: 
[BEAM-9136]Add licenses for dependencies
URL: https://github.com/apache/beam/pull/11067#discussion_r395865763
 
 

 ##
 File path: licenses/scripts/pull_licenses_py.py
 ##
 @@ -0,0 +1,157 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+"""A script to pull licenses for Python.
+"""
+import json
+import os
+import shutil
+import subprocess
+import sys
+import yaml
+
+from tenacity import retry
+from tenacity import stop_after_attempt
+
+
+def run_bash_command(command):
+  process = subprocess.Popen(command.split(),
+ stdout=subprocess.PIPE,
+ stderr=subprocess.PIPE)
+  result, error = process.communicate()
+  if error:
+raise RuntimeError('Error occurred when running a bash command.',
+   'command: ', command, 'error message: ',
+   error.decode('utf-8'))
+  return result.decode('utf-8')
+
+
+try:
+  import wget
+except:
+  command = 'pip install wget --no-cache-dir'
+  run_bash_command(command)
+  import wget
+
+
+def install_pip_licenses():
+  command = 'pip install pip-licenses --no-cache-dir'
+  run_bash_command(command)
+
+
+def run_pip_licenses():
+  command = 'pip-licenses --with-license-file --format=json'
+  dependencies = run_bash_command(command)
+  return json.loads(dependencies)
+
+
+@retry(stop=stop_after_attempt(3))
+def copy_license_files(dep):
+  source_license_file = dep['LicenseFile']
+  if source_license_file.lower() == 'unknown':
+return False
+  name = dep['Name']
+  dest_dir = '/'.join([license_dir, name.lower()])
+  try:
+if not os.path.isdir(dest_dir):
+  os.mkdir(dest_dir)
+shutil.copy(source_license_file, dest_dir + '/LICENSE')
+return True
+  except Exception as e:
+print(e)
+return False
+
+
+@retry(stop=stop_after_attempt(3))
+def pull_from_url(dep, configs):
+  '''
+  :param dep: name of a dependency
+  :param configs: a dict from dep_urls_py.yaml
+  :return: boolean
+
+  It downloads files form urls to a temp directory first in order to avoid
+  to deal with any temp files. It helps keep clean final directory.
+  '''
+  if dep in configs.keys():
+config = configs[dep]
+dest_dir = '/'.join([license_dir, dep])
+cur_temp_dir = 'temp_license_' + dep
+os.mkdir(cur_temp_dir)
+try:
+  is_file_available = False
+  # license is required, but not all dependencies have license.
+  # In case we have to skip, print out a message.
+  if config['license'] != 'skip':
+wget.download(config['license'], cur_temp_dir + '/LICENSE')
+is_file_available = True
+  # notice is optional.
+  if 'notice' in config:
+wget.download(config['notice'], cur_temp_dir + '/NOTICE')
+is_file_available = True
+  # copy from temp dir to final dir only when either file is abailable.
+  if is_file_available:
+if os.path.isdir(dest_dir):
+  shutil.rmtree(dest_dir)
+shutil.copytree(cur_temp_dir, dest_dir)
+  result = True
+except Exception as e:
+  print('Error occurred when pull license from url.', 'dependency =',
+dep, 'url =', config, 'error = ', e.decode('utf-8'))
+  result = False
+finally:
+  shutil.rmtree(cur_temp_dir)
+  return result
+  else:
+return False
+
+
+if __name__ == "__main__":
+  cur_dir = os.getcwd()
+  if cur_dir.split('/')[-1] != 'beam':
+raise RuntimeError('This script should run from ~/beam directory.')
+  license_dir = os.getcwd() + '/licenses/python'
+  no_licenses = []
+
+  with open('licenses/scripts/dep_urls_py.yaml') as file:
+dep_config = yaml.load(file)
+
+  install_pip_licenses()
 
 Review comment:
   The script is running at a virtual environment with [apache-beam, test, gcp, 
docs, 

[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407195=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407195
 ]

ASF GitHub Bot logged work on BEAM-3301:


Author: ASF GitHub Bot
Created on: 20/Mar/20 19:32
Start Date: 20/Mar/20 19:32
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11179: [BEAM-3301] 
Bugfix in DoFn validation.
URL: https://github.com/apache/beam/pull/11179#discussion_r395850295
 
 

 ##
 File path: sdks/go/pkg/beam/core/graph/fn.go
 ##
 @@ -446,23 +444,16 @@ func validateMainInputs(fn *Fn, method *funcx.Fn, 
methodName string, numMainIn m
return err
}
 
-   // Check that the first numMainIn inputs are not side inputs (Iters or
-   // ReIters). We aren't able to catch singleton side inputs here since
-   // they're indistinguishable from main inputs.
-   mainInputs := method.Param[pos : pos+int(numMainIn)]
-   for i, p := range mainInputs {
-   if p.Kind != funcx.FnValue {
-   err := errors.Errorf("expected main input parameter but 
found "+
-   "side input parameter in position %v",
-   pos+i)
-   err = errors.SetTopLevelMsgf(err,
-   "Method %v in DoFn %v should have all main 
inputs before side inputs, "+
-   "but a side input (as Iter or ReIter) 
appears as parameter %v when a "+
-   "main input was expected.",
-   methodName, fn.Name(), pos+i)
-   err = errors.WithContextf(err, "method %v", methodName)
-   return err
-   }
+   // Check that the first input is not an Iter or ReIter (those aren't 
valid
+   // as the first main input).
+   first := method.Param[pos].Kind
+   if first != funcx.FnValue {
+   err := errors.New("first main input parameter must be value 
type")
 
 Review comment:
   I'll just add it in real quick while squashing the commits.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407195)
Time Spent: 8h  (was: 7h 50m)

> Go SplittableDoFn support
> -
>
> Key: BEAM-3301
> URL: https://issues.apache.org/jira/browse/BEAM-3301
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> SDFs will be the only way to add streaming and liquid sharded IO for Go.
> Design doc: https://s.apache.org/splittable-do-fn



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9340) Properly populate pipeline proto requirements.

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9340?focusedWorklogId=407190=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407190
 ]

ASF GitHub Bot logged work on BEAM-9340:


Author: ASF GitHub Bot
Created on: 20/Mar/20 19:22
Start Date: 20/Mar/20 19:22
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #11165: [BEAM-9340] 
Populate requirements for Java.
URL: https://github.com/apache/beam/pull/11165#issuecomment-601872470
 
 
   I rebased on top of your PR.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407190)
Time Spent: 1h 40m  (was: 1.5h)

> Properly populate pipeline proto requirements.
> --
>
> Key: BEAM-9340
> URL: https://issues.apache.org/jira/browse/BEAM-9340
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-go, sdk-java-core, sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9340) Properly populate pipeline proto requirements.

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9340?focusedWorklogId=407189=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407189
 ]

ASF GitHub Bot logged work on BEAM-9340:


Author: ASF GitHub Bot
Created on: 20/Mar/20 19:22
Start Date: 20/Mar/20 19:22
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11165: [BEAM-9340] 
Populate requirements for Java.
URL: https://github.com/apache/beam/pull/11165#discussion_r395845584
 
 

 ##
 File path: 
runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java
 ##
 @@ -124,6 +124,15 @@
   public static final String 
SPLITTABLE_PROCESS_SIZED_ELEMENTS_AND_RESTRICTIONS_URN =
   "beam:transform:sdf_process_sized_element_and_restrictions:v1";
 
+  public static final String REQUIRES_STATEFUL_PROCESSING_URN =
+  getUrn(RunnerApi.StandardRequirements.Enum.REQUIRES_STATEFUL_PROCESSING);
 
 Review comment:
   That's unfortunate, but I see the pattern. Have to be vigilant to prevent 
bugs. (Unlikely that these'll be used in switch statements, but consistency is 
good.)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407189)
Time Spent: 1.5h  (was: 1h 20m)

> Properly populate pipeline proto requirements.
> --
>
> Key: BEAM-9340
> URL: https://issues.apache.org/jira/browse/BEAM-9340
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-go, sdk-java-core, sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407186=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407186
 ]

ASF GitHub Bot logged work on BEAM-3301:


Author: ASF GitHub Bot
Created on: 20/Mar/20 19:12
Start Date: 20/Mar/20 19:12
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #11179: [BEAM-3301] 
Bugfix in DoFn validation.
URL: https://github.com/apache/beam/pull/11179#discussion_r395840270
 
 

 ##
 File path: sdks/go/pkg/beam/core/graph/fn.go
 ##
 @@ -446,23 +444,16 @@ func validateMainInputs(fn *Fn, method *funcx.Fn, 
methodName string, numMainIn m
return err
}
 
-   // Check that the first numMainIn inputs are not side inputs (Iters or
-   // ReIters). We aren't able to catch singleton side inputs here since
-   // they're indistinguishable from main inputs.
-   mainInputs := method.Param[pos : pos+int(numMainIn)]
-   for i, p := range mainInputs {
-   if p.Kind != funcx.FnValue {
-   err := errors.Errorf("expected main input parameter but 
found "+
-   "side input parameter in position %v",
-   pos+i)
-   err = errors.SetTopLevelMsgf(err,
-   "Method %v in DoFn %v should have all main 
inputs before side inputs, "+
-   "but a side input (as Iter or ReIter) 
appears as parameter %v when a "+
-   "main input was expected.",
-   methodName, fn.Name(), pos+i)
-   err = errors.WithContextf(err, "method %v", methodName)
-   return err
-   }
+   // Check that the first input is not an Iter or ReIter (those aren't 
valid
+   // as the first main input).
+   first := method.Param[pos].Kind
+   if first != funcx.FnValue {
+   err := errors.New("first main input parameter must be value 
type")
 
 Review comment:
   ...must be a value type..
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407186)
Time Spent: 7h 50m  (was: 7h 40m)

> Go SplittableDoFn support
> -
>
> Key: BEAM-3301
> URL: https://issues.apache.org/jira/browse/BEAM-3301
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> SDFs will be the only way to add streaming and liquid sharded IO for Go.
> Design doc: https://s.apache.org/splittable-do-fn



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407185=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407185
 ]

ASF GitHub Bot logged work on BEAM-3301:


Author: ASF GitHub Bot
Created on: 20/Mar/20 19:06
Start Date: 20/Mar/20 19:06
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #11179: [BEAM-3301] 
Bugfix in DoFn validation.
URL: https://github.com/apache/beam/pull/11179#discussion_r395838378
 
 

 ##
 File path: sdks/go/pkg/beam/pcollection.go
 ##
 @@ -60,6 +60,12 @@ func (p PCollection) Type() FullType {
return p.n.Type()
 }
 
+// OutputsKV returns whether the output of this PCollection are single value
+// elements or KV pairs.
+func (p PCollection) OutputsKV() bool {
 
 Review comment:
   That's my usual guideline. If I use it once, keep it in place; twice, copy 
it; three times, helper function.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407185)
Time Spent: 7h 40m  (was: 7.5h)

> Go SplittableDoFn support
> -
>
> Key: BEAM-3301
> URL: https://issues.apache.org/jira/browse/BEAM-3301
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> SDFs will be the only way to add streaming and liquid sharded IO for Go.
> Design doc: https://s.apache.org/splittable-do-fn



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=407183=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407183
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 20/Mar/20 19:02
Start Date: 20/Mar/20 19:02
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #11183: [BEAM-8889]add 
experiment flag use_grpc_for_gcs
URL: https://github.com/apache/beam/pull/11183#issuecomment-601864385
 
 
   Retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407183)
Remaining Estimate: 148h 20m  (was: 148.5h)
Time Spent: 19h 40m  (was: 19.5h)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: Major
>  Labels: gcs
>   Original Estimate: 168h
>  Time Spent: 19h 40m
>  Remaining Estimate: 148h 20m
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class to access Google Cloud Storage on Apache Beam. Current 
> implementation directly creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel by itself to read and write GCS data rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability which eventually 
> creates channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create read and write channel, which is expected 
> flexible because it can easily pick up the new change; e.g. new channel 
> implementation using new protocol without code change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=407184=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407184
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 20/Mar/20 19:02
Start Date: 20/Mar/20 19:02
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #11183: [BEAM-8889]add 
experiment flag use_grpc_for_gcs
URL: https://github.com/apache/beam/pull/11183#issuecomment-601864474
 
 
   LGTM. Thanks.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407184)
Remaining Estimate: 148h 10m  (was: 148h 20m)
Time Spent: 19h 50m  (was: 19h 40m)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: Major
>  Labels: gcs
>   Original Estimate: 168h
>  Time Spent: 19h 50m
>  Remaining Estimate: 148h 10m
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class to access Google Cloud Storage on Apache Beam. Current 
> implementation directly creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel by itself to read and write GCS data rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability which eventually 
> creates channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create read and write channel, which is expected 
> flexible because it can easily pick up the new change; e.g. new channel 
> implementation using new protocol without code change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9563) TFRecordIO inefficient read from sideinput causing pipeline to be slow - fix

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9563?focusedWorklogId=407182=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407182
 ]

ASF GitHub Bot logged work on BEAM-9563:


Author: ASF GitHub Bot
Created on: 20/Mar/20 19:02
Start Date: 20/Mar/20 19:02
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #11180: [BEAM-9563] Change 
ToListCombineFn access level to private
URL: https://github.com/apache/beam/pull/11180#issuecomment-601864250
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407182)
Time Spent: 0.5h  (was: 20m)

> TFRecordIO inefficient read from sideinput causing pipeline to be slow - fix
> 
>
> Key: BEAM-9563
> URL: https://issues.apache.org/jira/browse/BEAM-9563
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: Not applicable
>Reporter: Piotr Szuberski
>Assignee: Piotr Szuberski
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The class ToListCombineFn in the previous task has public access level but 
> can be private.
> sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java:420
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9383) Staging Dataflow artifacts from environment

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9383?focusedWorklogId=407181=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407181
 ]

ASF GitHub Bot logged work on BEAM-9383:


Author: ASF GitHub Bot
Created on: 20/Mar/20 19:01
Start Date: 20/Mar/20 19:01
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #11039: [BEAM-9383] 
Staging Dataflow artifacts from environment
URL: https://github.com/apache/beam/pull/11039#issuecomment-601864064
 
 
   Retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407181)
Time Spent: 2h  (was: 1h 50m)

> Staging Dataflow artifacts from environment
> ---
>
> Key: BEAM-9383
> URL: https://issues.apache.org/jira/browse/BEAM-9383
> Project: Beam
>  Issue Type: Sub-task
>  Components: java-fn-execution
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Staging Dataflow artifacts from environment



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407180=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407180
 ]

ASF GitHub Bot logged work on BEAM-3301:


Author: ASF GitHub Bot
Created on: 20/Mar/20 19:01
Start Date: 20/Mar/20 19:01
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11179: [BEAM-3301] 
Bugfix in DoFn validation.
URL: https://github.com/apache/beam/pull/11179#discussion_r395836071
 
 

 ##
 File path: sdks/go/pkg/beam/pcollection.go
 ##
 @@ -60,6 +60,12 @@ func (p PCollection) Type() FullType {
return p.n.Type()
 }
 
+// OutputsKV returns whether the output of this PCollection are single value
+// elements or KV pairs.
+func (p PCollection) OutputsKV() bool {
 
 Review comment:
   I was originally picturing this as a helper function for callers of NewDoFn. 
It seems easy for future callers to make a mistake and only check if the 
PCollection is a KV and forget to check for CoGBK, so I thought a helper method 
would be useful in the future.
   
   1. I missed that pardo.go is in the same package as pcollection.go. I'm also 
leaning to not expanding the user surface if it's not necessary.
   
   2 & 3. Yeah I was unsure about the name, since it's not technically checking 
for KVs, I just couldn't think of anything better. IsKeyed sounds good though.
   
   4. That's the other part I was debating. My goal was to make it easy to 
avoid the mistake in the future, but thinking about it... It seems unlikely 
that anyone else would even be using this code, and I would expect that if they 
were they were an advanced user doing something tricky.
   
   I think I'll go with just putting the conditional in pardo.go and adding a 
comment. We can always turn it into a helper later if it does get used in 
multiple places.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407180)
Time Spent: 7.5h  (was: 7h 20m)

> Go SplittableDoFn support
> -
>
> Key: BEAM-3301
> URL: https://issues.apache.org/jira/browse/BEAM-3301
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> SDFs will be the only way to add streaming and liquid sharded IO for Go.
> Design doc: https://s.apache.org/splittable-do-fn



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9339) Declare capabilities in SDK environments

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9339?focusedWorklogId=407176=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407176
 ]

ASF GitHub Bot logged work on BEAM-9339:


Author: ASF GitHub Bot
Created on: 20/Mar/20 18:55
Start Date: 20/Mar/20 18:55
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11162: [BEAM-9339, 
BEAM-2939] Drop splittable field from ParDoPayload, add splittable dofn 
requirement to Python SDK.
URL: https://github.com/apache/beam/pull/11162
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407176)
Time Spent: 7h 10m  (was: 7h)

> Declare capabilities in SDK environments
> 
>
> Key: BEAM-9339
> URL: https://issues.apache.org/jira/browse/BEAM-9339
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9558) Make end of data channel explicit

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9558?focusedWorklogId=407169=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407169
 ]

ASF GitHub Bot logged work on BEAM-9558:


Author: ASF GitHub Bot
Created on: 20/Mar/20 18:48
Start Date: 20/Mar/20 18:48
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #11173: [BEAM-9558] Add 
explicit end bit for data channel.
URL: https://github.com/apache/beam/pull/11173#issuecomment-601858195
 
 
   So I spent quite a bit of time refactoring things to do the separate "end 
all streams for this bundle" proposal that we discussed, and it turns out that 
gets quite messy. (One of the things that makes this less nice is, as things 
stand, there may be multiple data stream endpoints for a single bundle. The 
need to preserve ordering between the data and end markers across threads other 
than those handling the grpc request, and the fact that buffering streams need 
to flush so closing them all needs to happen anyway, reduces the potential 
wins.) Combined with the fact that it gets even uglier to try to support both 
modes simultaneously during any transition period, I'm just going to go for 
this simpler, original fix for now. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407169)
Time Spent: 0.5h  (was: 20m)

> Make end of data channel explicit
> -
>
> Key: BEAM-9558
> URL: https://issues.apache.org/jira/browse/BEAM-9558
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently the end of a data channel is implicitly marked by sending an empty 
> data block. The protocol would be simplified by making this explicit, and it 
> would also prevent data loss bugs that might occur by accidentally sending an 
> empty block. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=407167=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407167
 ]

ASF GitHub Bot logged work on BEAM-8280:


Author: ASF GitHub Bot
Created on: 20/Mar/20 18:42
Start Date: 20/Mar/20 18:42
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #11070: [BEAM-8280] Blog 
post: Python typing changes
URL: https://github.com/apache/beam/pull/11070#discussion_r395826290
 
 

 ##
 File path: website/src/_posts/2020-03-06-python-typing.md
 ##
 @@ -0,0 +1,117 @@
+---
+layout: post
+title:  "Python SDK Typing Changes"
+date:   2020-03-06 00:00:01 -0800
+excerpt_separator: 
+categories: blog python typing
+authors:
+  - chadrik
+  - udim
+
+---
+
+
+TODO excerpt
+
+
+
+Python supports type annotations on functions (PEP 484). Static type checkers,
+such as mypy, are used to verify adherence to these types.
+For example:
+```py
+def f(v: int) -> int:
+  return v[0]
+```
+Running mypy on the above code will give the error:
+`Value of type "int" is not indexable`.
+
+We've recently made changes to Beam in 2 areas:
+
+Adding type hints throughout Beam. TODO expand
+
+Second, we've added support for Python 3 type annotations. This allows SDK
+users to specify a DoFn's type hints in one place. 
+We've also expanded Beam's support of `typing` module types.
+
+For more background see: 
+[Ensuring Python Type 
Safety](https://beam.apache.org/documentation/sdks/python-type-safety/).
+
+# Beam Is Typed
+
+TODO
 
 Review comment:
   So, no immediate rush. We are also adjusting to WFH
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407167)
Time Spent: 9h 50m  (was: 9h 40m)

> re-enable IOTypeHints.from_callable
> ---
>
> Key: BEAM-8280
> URL: https://issues.apache.org/jira/browse/BEAM-8280
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> See https://issues.apache.org/jira/browse/BEAM-8279



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9562) Remove timer from PCollection and treat timers as Elements

2020-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9562?focusedWorklogId=407164=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407164
 ]

ASF GitHub Bot logged work on BEAM-9562:


Author: ASF GitHub Bot
Created on: 20/Mar/20 18:37
Start Date: 20/Mar/20 18:37
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #11177: [BEAM-9562] 
Add Timer to Elements proto representation.
URL: https://github.com/apache/beam/pull/11177
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 407164)
Time Spent: 1h  (was: 50m)

> Remove timer from PCollection and treat timers as Elements 
> ---
>
> Key: BEAM-9562
> URL: https://issues.apache.org/jira/browse/BEAM-9562
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


  1   2   3   >