[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?
[ https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407367=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407367 ] ASF GitHub Bot logged work on BEAM-9542: Author: ASF GitHub Bot Created on: 21/Mar/20 03:34 Start Date: 21/Mar/20 03:34 Worklog Time Spent: 10m Work Description: suztomo commented on issue #11168: [BEAM-9542] Limit and clarify the effect of "force" in Java build URL: https://github.com/apache/beam/pull/11168#issuecomment-601987472 Because the effect of `force` is only for tests. The compileJava configuration still fetches the latest google-api-client, which is 1.30.8, which has a bug of android annotation dependency. ``` 17:21:41 Execution failed for task ':sdks:java:testing:test-utils:compileJava'. 17:21:41 > Could not resolve all files for configuration ':sdks:java:testing:test-utils:compileClasspath'. 17:21:41> Could not find androidx.annotation:annotation:1.1.0. 17:21:41 Searched in the following locations: 17:21:41- file:/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_PR/src/sdks/java/testing/test-utils/offline-repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- file:/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_PR/src/sdks/java/testing/test-utils/offline-repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://repo.maven.apache.org/maven2/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://repo.maven.apache.org/maven2/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- file:/home/jenkins/.m2/repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- file:/home/jenkins/.m2/repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://jcenter.bintray.com/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://jcenter.bintray.com/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://oss.sonatype.org/content/repositories/staging/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://oss.sonatype.org/content/repositories/staging/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://repository.apache.org/snapshots/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://repository.apache.org/snapshots/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://repository.apache.org/content/repositories/releases/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://repository.apache.org/content/repositories/releases/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41 Required by: 17:21:41 project :sdks:java:testing:test-utils > project :sdks:java:extensions:google-cloud-platform-core > com.google.api-client:google-api-client:1.30.8 ``` Maybe I'll need to upgrade google-api-client to the latest version that has the fix. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407367) Time Spent: 4h 40m (was: 4.5h) > Where the BeamModulePlugin's force is needed? > - > > Key: BEAM-9542 > URL: https://issues.apache.org/jira/browse/BEAM-9542 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Tomo Suzuki >Assignee: Tomo Suzuki >Priority: Major > Time Spent: 4h 40m > Remaining Estimate: 0h > > Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735 > {noformat} > > Task :sdks:java:core:compileTestJava FAILED > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > Note: Some input files use or override a deprecated API. > Note: Recompile with -Xlint:deprecation for details. > 2 errors > <-> 36% EXECUTING [19m 37s] > {noformat} >
[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?
[ https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407366=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407366 ] ASF GitHub Bot logged work on BEAM-9542: Author: ASF GitHub Bot Created on: 21/Mar/20 03:34 Start Date: 21/Mar/20 03:34 Worklog Time Spent: 10m Work Description: suztomo commented on issue #11168: [BEAM-9542] Limit and clarify the effect of "force" in Java build URL: https://github.com/apache/beam/pull/11168#issuecomment-601987472 Because the effect of `force` is only for tests. The compileJava configuration still fetches the latest google-api-client, which is 1.30.8, which has a bug of android annotation dependency. ``` 17:21:41 Execution failed for task ':sdks:java:testing:test-utils:compileJava'. 17:21:41 > Could not resolve all files for configuration ':sdks:java:testing:test-utils:compileClasspath'. 17:21:41> Could not find androidx.annotation:annotation:1.1.0. 17:21:41 Searched in the following locations: 17:21:41- file:/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_PR/src/sdks/java/testing/test-utils/offline-repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- file:/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_PR/src/sdks/java/testing/test-utils/offline-repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://repo.maven.apache.org/maven2/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://repo.maven.apache.org/maven2/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- file:/home/jenkins/.m2/repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- file:/home/jenkins/.m2/repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://jcenter.bintray.com/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://jcenter.bintray.com/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://oss.sonatype.org/content/repositories/staging/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://oss.sonatype.org/content/repositories/staging/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://repository.apache.org/snapshots/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://repository.apache.org/snapshots/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://repository.apache.org/content/repositories/releases/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://repository.apache.org/content/repositories/releases/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41 Required by: 17:21:41 project :sdks:java:testing:test-utils > project :sdks:java:extensions:google-cloud-platform-core > com.google.api-client:google-api-client:1.30.8 ``` Maybe I'll need to update google-api-client. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407366) Time Spent: 4.5h (was: 4h 20m) > Where the BeamModulePlugin's force is needed? > - > > Key: BEAM-9542 > URL: https://issues.apache.org/jira/browse/BEAM-9542 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Tomo Suzuki >Assignee: Tomo Suzuki >Priority: Major > Time Spent: 4.5h > Remaining Estimate: 0h > > Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735 > {noformat} > > Task :sdks:java:core:compileTestJava FAILED > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > Note: Some input files use or override a deprecated API. > Note: Recompile with -Xlint:deprecation for details. > 2 errors > <-> 36% EXECUTING [19m 37s] > {noformat} > Memo for my Mac: > {noformat} >
[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?
[ https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407365=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407365 ] ASF GitHub Bot logged work on BEAM-9542: Author: ASF GitHub Bot Created on: 21/Mar/20 03:33 Start Date: 21/Mar/20 03:33 Worklog Time Spent: 10m Work Description: suztomo commented on issue #11168: [BEAM-9542] Limit and clarify the effect of "force" in Java build URL: https://github.com/apache/beam/pull/11168#issuecomment-601987472 Because the effect of `force` is only for tests. The compileJava configuration still fetches the latest google-cloud-api, which is 1.30.8, which has a bug of android annotation dependency. ``` 17:21:41 Execution failed for task ':sdks:java:testing:test-utils:compileJava'. 17:21:41 > Could not resolve all files for configuration ':sdks:java:testing:test-utils:compileClasspath'. 17:21:41> Could not find androidx.annotation:annotation:1.1.0. 17:21:41 Searched in the following locations: 17:21:41- file:/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_PR/src/sdks/java/testing/test-utils/offline-repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- file:/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_PR/src/sdks/java/testing/test-utils/offline-repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://repo.maven.apache.org/maven2/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://repo.maven.apache.org/maven2/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- file:/home/jenkins/.m2/repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- file:/home/jenkins/.m2/repository/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://jcenter.bintray.com/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://jcenter.bintray.com/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://oss.sonatype.org/content/repositories/staging/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://oss.sonatype.org/content/repositories/staging/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://repository.apache.org/snapshots/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://repository.apache.org/snapshots/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41- https://repository.apache.org/content/repositories/releases/androidx/annotation/annotation/1.1.0/annotation-1.1.0.pom 17:21:41- https://repository.apache.org/content/repositories/releases/androidx/annotation/annotation/1.1.0/annotation-1.1.0.jar 17:21:41 Required by: 17:21:41 project :sdks:java:testing:test-utils > project :sdks:java:extensions:google-cloud-platform-core > com.google.api-client:google-api-client:1.30.8 ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407365) Time Spent: 4h 20m (was: 4h 10m) > Where the BeamModulePlugin's force is needed? > - > > Key: BEAM-9542 > URL: https://issues.apache.org/jira/browse/BEAM-9542 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Tomo Suzuki >Assignee: Tomo Suzuki >Priority: Major > Time Spent: 4h 20m > Remaining Estimate: 0h > > Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735 > {noformat} > > Task :sdks:java:core:compileTestJava FAILED > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > Note: Some input files use or override a deprecated API. > Note: Recompile with -Xlint:deprecation for details. > 2 errors > <-> 36% EXECUTING [19m 37s] > {noformat} > Memo for my Mac: > {noformat} > suztomo-macbookpro44% ./gradlew -p sdks/java check -x >
[jira] [Work logged] (BEAM-8078) streaming_wordcount_debugging.py is missing a test
[ https://issues.apache.org/jira/browse/BEAM-8078?focusedWorklogId=407351=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407351 ] ASF GitHub Bot logged work on BEAM-8078: Author: ASF GitHub Bot Created on: 21/Mar/20 01:56 Start Date: 21/Mar/20 01:56 Worklog Time Spent: 10m Work Description: udim commented on issue #10914: [BEAM-8078] streaming_wordcount_debugging.py is missing a test URL: https://github.com/apache/beam/pull/10914#issuecomment-601976783 will merge once I get postcommit to run this test and verify it passes This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407351) Time Spent: 4h 20m (was: 4h 10m) > streaming_wordcount_debugging.py is missing a test > -- > > Key: BEAM-8078 > URL: https://issues.apache.org/jira/browse/BEAM-8078 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Udi Meiri >Assignee: Aleksey Vysotin >Priority: Minor > Labels: beginner, easy, newbie, starter > Time Spent: 4h 20m > Remaining Estimate: 0h > > It's example code and should have a basic_test (like the other wordcount > variants in [1]) to at least verify that it runs in the latest Beam release. > [1] > https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8078) streaming_wordcount_debugging.py is missing a test
[ https://issues.apache.org/jira/browse/BEAM-8078?focusedWorklogId=407348=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407348 ] ASF GitHub Bot logged work on BEAM-8078: Author: ASF GitHub Bot Created on: 21/Mar/20 01:54 Start Date: 21/Mar/20 01:54 Worklog Time Spent: 10m Work Description: udim commented on issue #10914: [BEAM-8078] streaming_wordcount_debugging.py is missing a test URL: https://github.com/apache/beam/pull/10914#issuecomment-601976548 run python 3.7 postcommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407348) Time Spent: 4h 10m (was: 4h) > streaming_wordcount_debugging.py is missing a test > -- > > Key: BEAM-8078 > URL: https://issues.apache.org/jira/browse/BEAM-8078 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Udi Meiri >Assignee: Aleksey Vysotin >Priority: Minor > Labels: beginner, easy, newbie, starter > Time Spent: 4h 10m > Remaining Estimate: 0h > > It's example code and should have a basic_test (like the other wordcount > variants in [1]) to at least verify that it runs in the latest Beam release. > [1] > https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9468) Add Google Cloud Healthcare API IO Connectors
[ https://issues.apache.org/jira/browse/BEAM-9468?focusedWorklogId=407342=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407342 ] ASF GitHub Bot logged work on BEAM-9468: Author: ASF GitHub Bot Created on: 21/Mar/20 01:06 Start Date: 21/Mar/20 01:06 Worklog Time Spent: 10m Work Description: jaketf commented on pull request #11151: [BEAM-9468] Hl7v2 io URL: https://github.com/apache/beam/pull/11151#discussion_r395946114 ## File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/healthcare/HL7v2IO.java ## @@ -0,0 +1,620 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.io.gcp.healthcare; + +import com.google.api.services.healthcare.v1alpha2.model.IngestMessageResponse; +import com.google.api.services.healthcare.v1alpha2.model.Message; +import com.google.auto.value.AutoValue; +import java.io.IOException; +import java.text.ParseException; +import java.util.Collections; +import java.util.List; +import org.apache.beam.sdk.io.gcp.datastore.AdaptiveThrottler; +import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO; +import org.apache.beam.sdk.metrics.Counter; +import org.apache.beam.sdk.metrics.Metrics; +import org.apache.beam.sdk.transforms.Create; +import org.apache.beam.sdk.transforms.DoFn; +import org.apache.beam.sdk.transforms.PTransform; +import org.apache.beam.sdk.transforms.ParDo; +import org.apache.beam.sdk.util.Sleeper; +import org.apache.beam.sdk.values.PBegin; +import org.apache.beam.sdk.values.PCollection; +import org.apache.beam.sdk.values.PCollectionTuple; +import org.apache.beam.sdk.values.PDone; +import org.apache.beam.sdk.values.TupleTag; +import org.apache.beam.sdk.values.TupleTagList; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Throwables; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * {@link HL7v2IO} provides an API for reading from and writing to https://cloud.google.com/healthcare/docs/concepts/hl7v2;>Google Cloud Healthcare HL7v2 API. + * + * + * Read HL7v2 Messages are fetched from the HL7v2 store based on the {@link PCollection} of of + * message IDs {@link String}s produced by the {@link AutoValue_HL7v2IO_Read#getMessageIDTransform} + * as {@link PCollectionTuple}*** containing an {@link HL7v2IO.Read#OUT} tag for successfully + * fetched messages and a {@link HL7v2IO.Read#DEAD_LETTER} tag for message IDs that could not be + * fetched. + * + * HL7v2 stores can be read in several ways: - Unbounded: based on the Pub/Sub Notification + * Channel {@link HL7v2IO#readNotificationSubscription(String)} - Bounded: based on reading an + * entire HL7v2 store (or stores) {@link HL7v2IO#readHL7v2Store(String)} - Bounded: based on reading + * an HL7v2 store with a filter + * + * Note, due to the flexibility of this Read transform, this must output a dead letter queue. + * This handles the scenario where the the PTransform that populates a PCollection of message IDs + * contains message IDs that do not exist in the HL7v2 stores. + * + * Example: + * + * {@code + * PipelineOptions options = ...; + * Pipeline pipeline = Pipeline.create(options) + * + * + * PCollectionTuple messages = pipeline.apply( + * new HLv2IO.readNotifications(options.getNotificationSubscription()) + * + * // Write errors to your favorite dead letter queue (e.g. Pub/Sub, GCS, BigQuery) + * messages.get(PubsubNotificationToHL7v2Message.DEAD_LETTER) + *.apply("WriteToDeadLetterQueue", ...); + * + * PCollection fetchedMessages = fetchResults.get(PubsubNotificationToHL7v2Message.OUT) + *.apply("ExtractFetchedMessage", + *MapElements + *.into(TypeDescriptor.of(Message.class)) + *.via(FailsafeElement::getPayload)); + * + * // Go about your happy path transformations. + * PCollection out = fetchedMessages.apply("ProcessFetchedMessages", ...); + * + * // Write using the Message.Ingest method of the HL7v2 REST API. + * out.apply(HL7v2IO.ingestMessages(options.getOutputHL7v2Store())); + * + * pipeline.run(); + * + * }*** + * + */ +public class HL7v2IO { + // TODO add metrics
[jira] [Work logged] (BEAM-8019) Support cross-language transforms for DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-8019?focusedWorklogId=407340=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407340 ] ASF GitHub Bot logged work on BEAM-8019: Author: ASF GitHub Bot Created on: 21/Mar/20 00:50 Start Date: 21/Mar/20 00:50 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #11185: [BEAM-8019] Some generalizations to support cross-language transforms. URL: https://github.com/apache/beam/pull/11185#discussion_r395944333 ## File path: sdks/python/apache_beam/pipeline.py ## @@ -1127,30 +1133,79 @@ def transform_to_runner_api(transform, # type: Optional[ptransform.PTransform] def from_runner_api(proto, # type: beam_runner_api_pb2.PTransform context # type: PipelineContext ): +side_input_tags = [] +if common_urns.primitives.PAR_DO.urn == proto.spec.urn: + # Preserving side input tags. + from apache_beam.utils import proto_utils + from apache_beam.portability.api import beam_runner_api_pb2 + payload = ( + proto_utils.parse_Bytes( + proto.spec.payload, beam_runner_api_pb2.ParDoPayload)) + for tag, si in payload.side_inputs.items(): +side_input_tags.append(tag) + # type: (...) -> AppliedPTransform -def is_side_input(tag): - # type: (str) -> bool +def is_python_side_input(tag): # As per named_inputs() above. - return tag.startswith('side') + return re.match(SIDE_INPUT_REGEX, tag) + +all_input_tags = [tag for tag, id in proto.inputs.items()] + +# All side inputs have to be available in input tags +python_indexed_side_inputs = False +for side_tag in side_input_tags: + if side_tag not in all_input_tags: +raise Exception( +'Side input tag %s is not available in list of input tags %r' % +(side_tag, all_input_tags)) + + # We process Python and external side inputs differently. We fail early + # here if we cannot decide which way to go. + if is_python_side_input(side_tag): +python_indexed_side_inputs = True + else: +if python_indexed_side_inputs: + raise Exception( + 'Cannot process side inputs due to inconsistent sideinput ' + 'naming. If using an external transform consider re-naming side ' + 'inputs to not match Python indexed format %s' % + SIDE_INPUT_REGEX) main_inputs = [ context.pcollections.get_by_id(id) for tag, -id in proto.inputs.items() if not is_side_input(tag) +id in proto.inputs.items() if tag not in side_input_tags ] -# Ordering is important here. -indexed_side_inputs = [ -(get_sideinput_index(tag), context.pcollections.get_by_id(id)) for tag, -id in proto.inputs.items() if is_side_input(tag) -] -side_inputs = [si for _, si in sorted(indexed_side_inputs)] +if python_indexed_side_inputs: + # Ordering is important here. Review comment: This all seems rather fragile. Would it be possible to just make side_inputs a dict everywhere in the internal SDK representation? (Or is there introspection with the legacy worker code that would make this hard?) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407340) Time Spent: 8h 40m (was: 8.5h) > Support cross-language transforms for DataflowRunner > > > Key: BEAM-8019 > URL: https://issues.apache.org/jira/browse/BEAM-8019 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Chamikara Madhusanka Jayalath >Assignee: Chamikara Madhusanka Jayalath >Priority: Major > Time Spent: 8h 40m > Remaining Estimate: 0h > > This is to capture the Beam changes needed for this task. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8019) Support cross-language transforms for DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-8019?focusedWorklogId=407339=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407339 ] ASF GitHub Bot logged work on BEAM-8019: Author: ASF GitHub Bot Created on: 21/Mar/20 00:50 Start Date: 21/Mar/20 00:50 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #11185: [BEAM-8019] Some generalizations to support cross-language transforms. URL: https://github.com/apache/beam/pull/11185#discussion_r395944033 ## File path: sdks/python/apache_beam/coders/coders.py ## @@ -1383,22 +1377,67 @@ def from_runner_api_parameter(payload, components, context): write_state_threshold=int(payload)) -class RunnerAPICoderHolder(Coder): - """A `Coder` that holds a runner API `Coder` proto. +class ElementTypeHolder(typehints.TypeConstraint): + """A dummy element type for external coders that cannot be parsed in Python""" - This is used for coders for which corresponding objects cannot be - initialized in Python SDK. For example, coders for remote SDKs that may - be available in Python SDK transform graph when expanding a cross-language - transform. - """ - def __init__(self, proto): -self._proto = proto + def __init__(self, coder, context): +self.coder = coder +self.context = context - def proto(self): -return self._proto - def to_runner_api(self, context): -return self._proto +class ExternalCoder(Coder): - def to_type_hint(self): -return Any + coder_count = 0 + + def __init__(self, element_type_holder): +self.element_type_holder = element_type_holder + + def as_cloud_object(self, coders_context=None): +if not coders_context: + raise Exception( + 'coders_context must be specified to correctly encode external coders') +coder_id = coders_context.get_by_proto( +self.element_type_holder.coder, deduplicate=True) + +coder_proto = self.element_type_holder.coder + + +kind_str = 'kind:external' + str(ExternalCoder.coder_count) +ExternalCoder.coder_count = ExternalCoder.coder_count + 1 +component_encodings = [] +if coder_proto.spec.urn == 'beam:coder:kv:v1': + kind_str = 'kind:pair' + for component_coder_id in coder_proto.component_coder_ids: +component_encodings.append({ +'@type': 'kind:external' + str(ExternalCoder.coder_count), +'pipeline_proto_coder_id': component_coder_id +}) +ExternalCoder.coder_count = ExternalCoder.coder_count + 1 + +value = { +# This is a placeholder type. Dataflow will get the actual coder from +# pipeline proto using the pipeline_proto_coder_id property. +'@type': kind_str, +'pipeline_proto_coder_id': coder_id +} +if component_encodings: + value['is_pair_like'] = True + value['component_encodings'] = component_encodings + +return value + + @staticmethod + def from_type_hint(typehint, unused_registry): +if isinstance(typehint, ElementTypeHolder): + return ExternalCoder(typehint) +else: + raise ValueError(( + 'Expected an instance of ElementTypeHolder' + ', but got a %s' % typehint)) + Review comment: `yapf -ir path/to/files/...` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407339) Time Spent: 8.5h (was: 8h 20m) > Support cross-language transforms for DataflowRunner > > > Key: BEAM-8019 > URL: https://issues.apache.org/jira/browse/BEAM-8019 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Chamikara Madhusanka Jayalath >Assignee: Chamikara Madhusanka Jayalath >Priority: Major > Time Spent: 8.5h > Remaining Estimate: 0h > > This is to capture the Beam changes needed for this task. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9446) FlinkRunner discards parallelism and execution_mode_for_batch pipeline options
[ https://issues.apache.org/jira/browse/BEAM-9446?focusedWorklogId=407335=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407335 ] ASF GitHub Bot logged work on BEAM-9446: Author: ASF GitHub Bot Created on: 21/Mar/20 00:08 Start Date: 21/Mar/20 00:08 Worklog Time Spent: 10m Work Description: ibzib commented on issue #11189: [BEAM-9446] Retain unknown arguments when using uber jar job server. URL: https://github.com/apache/beam/pull/11189#issuecomment-601960942 Run PortableJar_Flink PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407335) Time Spent: 3h 20m (was: 3h 10m) > FlinkRunner discards parallelism and execution_mode_for_batch pipeline options > -- > > Key: BEAM-9446 > URL: https://issues.apache.org/jira/browse/BEAM-9446 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Fix For: 2.21.0 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > I need these options for TFX, but they're being discarded (I believe they are > normally supplied by the job server). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407332=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407332 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 21/Mar/20 00:03 Start Date: 21/Mar/20 00:03 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395933544 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -229,121 +330,148 @@ message MonitoringInfo { NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }]; NAME = 6 [(label_props) = { name: "NAME" }]; } - // A set of key+value labels which define the scope of the metric. + + // A set of key and value labels which define the scope of the metric. For + // well known URNs, the set of required labels is provided by the associated + // MonitoringInfoSpec. + // // Either a well defined entity id for matching the enum names in // the MonitoringInfoLabels enum or any arbitrary label // set by a custom metric or user metric. + // // A monitoring system is expected to be able to aggregate the metrics // together for all updates having the same URN and labels. Some systems such // as Stackdriver will be able to aggregate the metrics using a subset of the // provided labels - map labels = 5; - - // The walltime of the most recent update. - // Useful for aggregation for latest types such as LatestInt64. - google.protobuf.Timestamp timestamp = 6; + map labels = 4; } +// A set of well known URNs that specify the encoding and aggregation method. message MonitoringInfoTypeUrns { enum Enum { +// Represents an integer counter where values are summed across bundles. +// +// Encoding: +// - value: beam:coder:varint:v1 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) = -"beam:metrics:sum_int_64"]; - -DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:distribution_int_64"]; - -LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:latest_int_64"]; - -// iterable is encoded with a beam:coder:double:v1 coder for each -// element. -LATEST_DOUBLES_TYPE = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:latest_doubles"]; - } -} - -message Metric { - // (Required) The data for this metric. - oneof data { -CounterData counter_data = 1; -DistributionData distribution_data = 2; -ExtremaData extrema_data = 3; - } -} - -// Data associated with a Counter or Gauge metric. -// This is designed to be compatible with metric collection -// systems such as DropWizard. -message CounterData { - oneof value { -int64 int64_value = 1; -double double_value = 2; -string string_value = 3; - } -} - -// Extrema messages are used for calculating -// Top-N/Bottom-N metrics. -message ExtremaData { - oneof extrema { -IntExtremaData int_extrema_data = 1; -DoubleExtremaData double_extrema_data = 2; - } -} - -message IntExtremaData { - repeated int64 int_values = 1; -} +"beam:metrics:sum_int64:v1"]; + +// Represents a double counter where values are summed across bundles. +// +// Encoding: +// value: beam:coder:double:v1 +SUM_DOUBLE_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) = +"beam:metrics:sum_double:v1"]; + +// Represents a distribution of an integer value where: +// - count: represents the number of values seen across all bundles +// - sum: represents the total of the value across all bundles +// - min: represents the smallest value seen across all bundles +// - max: represents the largest value seen across all bundles +// +// Encoding: +// - count: beam:coder:varint:v1 +// - sum: beam:coder:varint:v1 +// - min: beam:coder:varint:v1 +// - max: beam:coder:varint:v1 +DISTRIBUTION_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:distribution_int64:v1"]; + +// Represents a distribution of a double value where: +// - count: represents the number of values seen across all bundles +// - sum: represents the total of the value across all bundles +// - min: represents the smallest value seen across all bundles +// - max: represents the largest value seen across all bundles +// +// Encoding: +// - count: beam:coder:varint:v1 +// - sum: beam:coder:double:v1 +// - min: beam:coder:double:v1 +// - max:
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407334=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407334 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 21/Mar/20 00:03 Start Date: 21/Mar/20 00:03 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395936249 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -52,38 +55,157 @@ message Annotation { string value = 2; } -// Populated MonitoringInfoSpecs for specific URNs. -// Indicating the required fields to be set. -// SDKs and RunnerHarnesses can load these instances into memory and write a -// validator or code generator to assist with populating and validating -// MonitoringInfo protos. +// A set of well known MonitoringInfo specifications. message MonitoringInfoSpecs { enum Enum { -// TODO(BEAM-6926): Add the PTRANSFORM name as a required label after -// upgrading the python SDK. -USER_COUNTER = 0 [(monitoring_info_spec) = { - urn: "beam:metric:user", - type_urn: "beam:metrics:sum_int_64", +// Represents an integer counter where values are summed across bundles. +USER_SUM_INT64 = 0 [(monitoring_info_spec) = { + urn: "beam:metric:user:v1", + type: "beam:metrics:sum_int64:v1", required_labels: ["PTRANSFORM", "NAMESPACE", "NAME"], annotations: [{ key: "description", -value: "URN utilized to report user numeric counters." +value: "URN utilized to report user metric." }] }]; -ELEMENT_COUNT = 1 [(monitoring_info_spec) = { +// Represents a double counter where values are summed across bundles. +USER_SUM_DOUBLE = 1 [(monitoring_info_spec) = { + urn: "beam:metric:user:v1", Review comment: Should it be legal to have two counters with the same URN but different types. (This seems to fly agains the idea of a URN being a Unique identifier.) Seeing this explosion of types, however, makes it feel like we should not be manually be enumerating them (or at least I'm struggling to see the value in that over just saying that user counters may have any type). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407334) Time Spent: 21.5h (was: 21h 20m) > Update existing metrics in the FN API to use new Metric Schema > -- > > Key: BEAM-4374 > URL: https://issues.apache.org/jira/browse/BEAM-4374 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Alex Amato >Priority: Major > Time Spent: 21.5h > Remaining Estimate: 0h > > Update existing metrics to use the new proto and cataloging schema defined in: > [_https://s.apache.org/beam-fn-api-metrics_] > * Check in new protos > * Define catalog file for metrics > * Port existing metrics to use this new format, based on catalog > names+metadata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407333=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407333 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 21/Mar/20 00:03 Start Date: 21/Mar/20 00:03 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395935001 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -229,101 +215,127 @@ message MonitoringInfo { NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }]; NAME = 6 [(label_props) = { name: "NAME" }]; } + // A set of key+value labels which define the scope of the metric. // Either a well defined entity id for matching the enum names in // the MonitoringInfoLabels enum or any arbitrary label // set by a custom metric or user metric. + // // A monitoring system is expected to be able to aggregate the metrics // together for all updates having the same URN and labels. Some systems such // as Stackdriver will be able to aggregate the metrics using a subset of the // provided labels - map labels = 5; - - // The walltime of the most recent update. - // Useful for aggregation for latest types such as LatestInt64. - google.protobuf.Timestamp timestamp = 6; + map labels = 4; } message MonitoringInfoTypeUrns { enum Enum { +// Represents an integer counter where values are summed across bundles. +// +// Encoding: +// - value: beam:coder:varint:v1 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) = Review comment: Rather than manually write out the cross product, how about we define `{sum,min,max,top_n,bottom_n,distribtuion,latest}_{int64,double,string}` types as having known semantics and encoding? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407333) Time Spent: 21h 20m (was: 21h 10m) > Update existing metrics in the FN API to use new Metric Schema > -- > > Key: BEAM-4374 > URL: https://issues.apache.org/jira/browse/BEAM-4374 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Alex Amato >Priority: Major > Time Spent: 21h 20m > Remaining Estimate: 0h > > Update existing metrics to use the new proto and cataloging schema defined in: > [_https://s.apache.org/beam-fn-api-metrics_] > * Check in new protos > * Define catalog file for metrics > * Port existing metrics to use this new format, based on catalog > names+metadata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9446) FlinkRunner discards parallelism and execution_mode_for_batch pipeline options
[ https://issues.apache.org/jira/browse/BEAM-9446?focusedWorklogId=407331=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407331 ] ASF GitHub Bot logged work on BEAM-9446: Author: ASF GitHub Bot Created on: 20/Mar/20 23:50 Start Date: 20/Mar/20 23:50 Worklog Time Spent: 10m Work Description: ibzib commented on pull request #11189: [BEAM-9446] Retain unknown arguments when using uber jar job server. URL: https://github.com/apache/beam/pull/11189 See #11052 for context. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407330=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407330 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 23:48 Start Date: 20/Mar/20 23:48 Worklog Time Spent: 10m Work Description: youngoli commented on issue #11188: [BEAM-3301] Adding restriction trackers and validation. URL: https://github.com/apache/beam/pull/11188#issuecomment-601956825 Run Go PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407330) Time Spent: 8.5h (was: 8h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 8.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407329=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407329 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 23:48 Start Date: 20/Mar/20 23:48 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11188: [BEAM-3301] Adding restriction trackers and validation. URL: https://github.com/apache/beam/pull/11188 Adding RTrackers as an interface, and adding them to the SDF validation. I think this is the last real code involved in SDF validation, assuming I'm not forgetting anything. I might do a second pass on the error messages because they seem inconsistent with the old error messages, but the next major task is going to be working on the SDF exec code and doing some testing with the Flink runner. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [x] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable
[ https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=407326=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407326 ] ASF GitHub Bot logged work on BEAM-8280: Author: ASF GitHub Bot Created on: 20/Mar/20 23:41 Start Date: 20/Mar/20 23:41 Worklog Time Spent: 10m Work Description: udim commented on pull request #10717: [BEAM-8280] Enable type hint annotations URL: https://github.com/apache/beam/pull/10717 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407326) Time Spent: 10h 10m (was: 10h) > re-enable IOTypeHints.from_callable > --- > > Key: BEAM-8280 > URL: https://issues.apache.org/jira/browse/BEAM-8280 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Time Spent: 10h 10m > Remaining Estimate: 0h > > See https://issues.apache.org/jira/browse/BEAM-8279 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7923) Interactive Beam
[ https://issues.apache.org/jira/browse/BEAM-7923?focusedWorklogId=407327=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407327 ] ASF GitHub Bot logged work on BEAM-7923: Author: ASF GitHub Bot Created on: 20/Mar/20 23:41 Start Date: 20/Mar/20 23:41 Worklog Time Spent: 10m Work Description: davidyan74 commented on pull request #11174: [BEAM-7923] Pop failed transform when error is raised URL: https://github.com/apache/beam/pull/11174#discussion_r395933604 ## File path: sdks/python/apache_beam/pipeline.py ## @@ -307,58 +307,61 @@ def _replace_if_needed(self, original_transform_node): elif len(inputs) == 0: input_node = pvalue.PBegin(self.pipeline) - # We have to add the new AppliedTransform to the stack before expand() - # and pop it out later to make sure that parts get added correctly. - self.pipeline.transforms_stack.append(replacement_transform_node) - - # Keeping the same label for the replaced node but recursively - # removing labels of child transforms of original transform since they - # will be replaced during the expand below. This is needed in case - # the replacement contains children that have labels that conflicts - # with labels of the children of the original. - self.pipeline._remove_labels_recursively(original_transform_node) - - new_output = replacement_transform.expand(input_node) - assert isinstance( - new_output, (dict, pvalue.PValue, pvalue.DoOutputsTuple)) - - if isinstance(new_output, pvalue.PValue): -new_output.element_type = None -self.pipeline._infer_result_type( -replacement_transform, inputs, new_output) - - if isinstance(new_output, dict): -for new_tag, new_pcoll in new_output.items(): - replacement_transform_node.add_output(new_pcoll, new_tag) - elif isinstance(new_output, pvalue.DoOutputsTuple): -replacement_transform_node.add_output( -new_output, new_output._main_tag) - else: -replacement_transform_node.add_output(new_output, new_output.tag) - - # Recording updated outputs. This cannot be done in the same visitor - # since if we dynamically update output type here, we'll run into - # errors when visiting child nodes. - # - # NOTE: When replacing multiple outputs, the replacement PCollection - # tags must have a matching tag in the original transform. - if isinstance(new_output, pvalue.PValue): -if not new_output.producer: - new_output.producer = replacement_transform_node -output_map[original_transform_node.outputs[new_output.tag]] = \ -new_output - elif isinstance(new_output, (pvalue.DoOutputsTuple, tuple)): -for pcoll in new_output: - if not pcoll.producer: -pcoll.producer = replacement_transform_node - output_map[original_transform_node.outputs[pcoll.tag]] = pcoll - elif isinstance(new_output, dict): -for tag, pcoll in new_output.items(): - if not pcoll.producer: -pcoll.producer = replacement_transform_node - output_map[original_transform_node.outputs[tag]] = pcoll - - self.pipeline.transforms_stack.pop() + try: +# We have to add the new AppliedTransform to the stack before +# expand() and pop it out later to make sure that parts get added +# correctly. +self.pipeline.transforms_stack.append(replacement_transform_node) Review comment: This is a python newbie question. Does list.append() ever throw an exception? If so, should we move this out of the try block so that we don't pop() if list.append() fails? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407327) Time Spent: 9h 20m (was: 9h 10m) > Interactive Beam > > > Key: BEAM-7923 > URL: https://issues.apache.org/jira/browse/BEAM-7923 > Project: Beam > Issue Type: New Feature > Components: runner-py-interactive >Reporter: Ning Kang >Assignee: Ning Kang >Priority: Major > Time Spent: 9h 20m > Remaining Estimate: 0h > > This is the top level ticket for all efforts leveraging [interactive >
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407325=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407325 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 23:40 Start Date: 20/Mar/20 23:40 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395933271 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -229,101 +215,127 @@ message MonitoringInfo { NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }]; NAME = 6 [(label_props) = { name: "NAME" }]; } + // A set of key+value labels which define the scope of the metric. // Either a well defined entity id for matching the enum names in // the MonitoringInfoLabels enum or any arbitrary label // set by a custom metric or user metric. + // // A monitoring system is expected to be able to aggregate the metrics // together for all updates having the same URN and labels. Some systems such // as Stackdriver will be able to aggregate the metrics using a subset of the // provided labels - map labels = 5; - - // The walltime of the most recent update. - // Useful for aggregation for latest types such as LatestInt64. - google.protobuf.Timestamp timestamp = 6; + map labels = 4; } message MonitoringInfoTypeUrns { enum Enum { +// Represents an integer counter where values are summed across bundles. +// +// Encoding: +// - value: beam:coder:varint:v1 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) = -"beam:metrics:sum_int_64"]; - -DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:distribution_int_64"]; - -LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:latest_int_64"]; +"beam:metrics:sum_int64:v1"]; + +// Represents a double counter where values are summed across bundles. +// +// Encoding: +// value: beam:coder:double:v1 +SUM_DOUBLE_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) = +"beam:metrics:sum_int64:v1"]; + +// Represents a distribution of an integer value where: +// - count: represents the number of values seen across all bundles +// - sum: represents the total of the value across all bundles +// - min: represents the smallest value seen across all bundles +// - max: represents the largest value seen across all bundles +// +// Encoding: +// - count: beam:coder:varint:v1 +// - sum: beam:coder:varint:v1 +// - min: beam:coder:varint:v1 +// - max: beam:coder:varint:v1 +DISTRIBUTION_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:distribution_int64:v1"]; + +// Represents a distribution of a double value where: +// - count: represents the number of values seen across all bundles +// - sum: represents the total of the value across all bundles +// - min: represents the smallest value seen across all bundles +// - max: represents the largest value seen across all bundles +// +// Encoding: +// - count: beam:coder:varint:v1 +// - sum: beam:coder:double:v1 +// - min: beam:coder:double:v1 +// - max: beam:coder:double:v1 +DISTRIBUTION_DOUBLE_TYPE = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:distribution_int64:v1"]; + +// Represents the latest seen integer value. The timestamp is used to +// provide an "ordering" over multiple values to determine which is the +// latest. +// +// Encoding: +// - timestamp: beam:coder:varint:v1 (milliseconds since epoch) +// - value: beam:coder:varint:v1 +LATEST_INT64_TYPE = 4 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:latest_int64:v1"]; + +// Represents the latest seen integer value. The timestamp is used to +// provide an "ordering" over multiple values to determine which is the +// latest. +// +// Encoding: +// - timestamp: beam:coder:varint:v1 (milliseconds since epoch) +// - value: beam:coder:double:v1 +LATEST_DOUBLE_TYPE = 5 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:latest_int64:v1"]; + +// Represents the largest set of integer values seen across bundles. +// +// Encoding: ... +// - valueX: beam:coder:varint:v1 +TOP_N_INT64_TYPE = 6
[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package
[ https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407323=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407323 ] ASF GitHub Bot logged work on BEAM-9537: Author: ASF GitHub Bot Created on: 20/Mar/20 23:31 Start Date: 20/Mar/20 23:31 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11153: [BEAM-9537] Adding a new module for FnApiRunner URL: https://github.com/apache/beam/pull/11153 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407323) Time Spent: 2h 40m (was: 2.5h) > Refactor FnApiRunner into its own package > - > > Key: BEAM-9537 > URL: https://issues.apache.org/jira/browse/BEAM-9537 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Minor > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9398) Python type hints: AbstractDoFnWrapper does not wrap setup
[ https://issues.apache.org/jira/browse/BEAM-9398?focusedWorklogId=407322=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407322 ] ASF GitHub Bot logged work on BEAM-9398: Author: ASF GitHub Bot Created on: 20/Mar/20 23:26 Start Date: 20/Mar/20 23:26 Worklog Time Spent: 10m Work Description: robertwb commented on issue #0: [BEAM-9398] runtime_type_check: support setup URL: https://github.com/apache/beam/pull/0#issuecomment-601952041 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407322) Time Spent: 0.5h (was: 20m) > Python type hints: AbstractDoFnWrapper does not wrap setup > -- > > Key: BEAM-9398 > URL: https://issues.apache.org/jira/browse/BEAM-9398 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > And possibly other methods. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9558) Make end of data channel explicit
[ https://issues.apache.org/jira/browse/BEAM-9558?focusedWorklogId=407321=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407321 ] ASF GitHub Bot logged work on BEAM-9558: Author: ASF GitHub Bot Created on: 20/Mar/20 23:23 Start Date: 20/Mar/20 23:23 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #11173: [BEAM-9558] Add explicit end bit for data channel. URL: https://github.com/apache/beam/pull/11173 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407321) Time Spent: 1h (was: 50m) > Make end of data channel explicit > - > > Key: BEAM-9558 > URL: https://issues.apache.org/jira/browse/BEAM-9558 > Project: Beam > Issue Type: Improvement > Components: beam-model >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Fix For: 2.21.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Currently the end of a data channel is implicitly marked by sending an empty > data block. The protocol would be simplified by making this explicit, and it > would also prevent data loss bugs that might occur by accidentally sending an > empty block. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7923) Interactive Beam
[ https://issues.apache.org/jira/browse/BEAM-7923?focusedWorklogId=407316=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407316 ] ASF GitHub Bot logged work on BEAM-7923: Author: ASF GitHub Bot Created on: 20/Mar/20 23:14 Start Date: 20/Mar/20 23:14 Worklog Time Spent: 10m Work Description: aaltay commented on issue #11174: [BEAM-7923] Pop failed transform when error is raised URL: https://github.com/apache/beam/pull/11174#issuecomment-601949155 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407316) Time Spent: 9h 10m (was: 9h) > Interactive Beam > > > Key: BEAM-7923 > URL: https://issues.apache.org/jira/browse/BEAM-7923 > Project: Beam > Issue Type: New Feature > Components: runner-py-interactive >Reporter: Ning Kang >Assignee: Ning Kang >Priority: Major > Time Spent: 9h 10m > Remaining Estimate: 0h > > This is the top level ticket for all efforts leveraging [interactive > Beam|[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/interactive]] > As the development goes, blocking tickets will be added to this one. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9340) Properly populate pipeline proto requirements.
[ https://issues.apache.org/jira/browse/BEAM-9340?focusedWorklogId=407315=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407315 ] ASF GitHub Bot logged work on BEAM-9340: Author: ASF GitHub Bot Created on: 20/Mar/20 23:06 Start Date: 20/Mar/20 23:06 Worklog Time Spent: 10m Work Description: robertwb commented on issue #11165: [BEAM-9340] Populate requirements for Java. URL: https://github.com/apache/beam/pull/11165#issuecomment-601947065 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407315) Time Spent: 2h 20m (was: 2h 10m) > Properly populate pipeline proto requirements. > -- > > Key: BEAM-9340 > URL: https://issues.apache.org/jira/browse/BEAM-9340 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Fix For: 2.20.0 > > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-9512) Anonymous structs have name collision in schema
[ https://issues.apache.org/jira/browse/BEAM-9512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Pilloud reassigned BEAM-9512: Assignee: Andrew Pilloud > Anonymous structs have name collision in schema > --- > > Key: BEAM-9512 > URL: https://issues.apache.org/jira/browse/BEAM-9512 > Project: Beam > Issue Type: Bug > Components: dsl-sql-zetasql >Reporter: Andrew Pilloud >Assignee: Andrew Pilloud >Priority: Major > Labels: zetasql-compliance > > {code:java} > Mar 16, 2020 12:57:42 PM > cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl > executeQuery > INFO: Processing Sql statement: SELECT STRUCT(ARRAY INT64>>[(11, 12), (21, 22)]) > Mar 16, 2020 12:57:42 PM > com.google.zetasql.io.grpc.internal.SerializingExecutor run > SEVERE: Exception while executing runnable > com.google.zetasql.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed@c73b08a > java.lang.IllegalArgumentException: Duplicate field added to schema > at org.apache.beam.sdk.schemas.Schema.(Schema.java:228) > at org.apache.beam.sdk.schemas.Schema.fromFields(Schema.java:966) > at > java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:503) > at > org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toSchema(CalciteUtils.java:194) > at > org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toFieldType(CalciteUtils.java:251) > at > org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toFieldType(CalciteUtils.java:246) > at > org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toField(CalciteUtils.java:239) > at > org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toField(CalciteUtils.java:235) > at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > at java.util.Iterator.forEachRemaining(Iterator.java:116) > at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) > at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) > at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) > at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > at > java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) > at > org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toSchema(CalciteUtils.java:194) > at > org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toFieldType(CalciteUtils.java:251) > at > org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toField(CalciteUtils.java:239) > at > org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toField(CalciteUtils.java:235) > at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > at java.util.Iterator.forEachRemaining(Iterator.java:116) > at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) > at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) > at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) > at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > at > java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) > at > org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils.toSchema(CalciteUtils.java:194) > at > cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl.executeQuery(ExecuteQueryServiceServer.java:243) > at > com.google.zetasql.testing.SqlComplianceServiceGrpc$MethodHandlers.invoke(SqlComplianceServiceGrpc.java:423) > at > com.google.zetasql.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171) > at > com.google.zetasql.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283) > at > com.google.zetasql.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:711) > at > com.google.zetasql.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > com.google.zetasql.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent
[jira] [Work logged] (BEAM-9340) Properly populate pipeline proto requirements.
[ https://issues.apache.org/jira/browse/BEAM-9340?focusedWorklogId=407311=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407311 ] ASF GitHub Bot logged work on BEAM-9340: Author: ASF GitHub Bot Created on: 20/Mar/20 23:02 Start Date: 20/Mar/20 23:02 Worklog Time Spent: 10m Work Description: robertwb commented on issue #11165: [BEAM-9340] Populate requirements for Java. URL: https://github.com/apache/beam/pull/11165#issuecomment-601946285 This passes locally. org.apache.beam.sdk.transforms.ParDoLifecycleTest.testTeardownCalledAfterExceptionInFinishBundleStateful seems unrelated. Retrying again. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407311) Time Spent: 2h (was: 1h 50m) > Properly populate pipeline proto requirements. > -- > > Key: BEAM-9340 > URL: https://issues.apache.org/jira/browse/BEAM-9340 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Fix For: 2.20.0 > > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9340) Properly populate pipeline proto requirements.
[ https://issues.apache.org/jira/browse/BEAM-9340?focusedWorklogId=407312=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407312 ] ASF GitHub Bot logged work on BEAM-9340: Author: ASF GitHub Bot Created on: 20/Mar/20 23:02 Start Date: 20/Mar/20 23:02 Worklog Time Spent: 10m Work Description: robertwb commented on issue #11165: [BEAM-9340] Populate requirements for Java. URL: https://github.com/apache/beam/pull/11165#issuecomment-601946329 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407312) Time Spent: 2h 10m (was: 2h) > Properly populate pipeline proto requirements. > -- > > Key: BEAM-9340 > URL: https://issues.apache.org/jira/browse/BEAM-9340 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Fix For: 2.20.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9564) Remove insecure ssl options from MongoDBIO
[ https://issues.apache.org/jira/browse/BEAM-9564?focusedWorklogId=407309=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407309 ] ASF GitHub Bot logged work on BEAM-9564: Author: ASF GitHub Bot Created on: 20/Mar/20 22:57 Start Date: 20/Mar/20 22:57 Worklog Time Spent: 10m Work Description: iemejia commented on pull request #11186: [BEAM-9564] Remove insecure ssl options from MongoDBIO URL: https://github.com/apache/beam/pull/11186 These changes are not backwards compatible but this is intended to solve the potential security issues and also because MongoDBIO does not have strong backwards compatibility yet (aka it is still tagged as `@Experimental`). R: @alexvanboxel This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407309) Remaining Estimate: 0h Time Spent: 10m > Remove insecure ssl options from MongoDBIO > -- > > Key: BEAM-9564 > URL: https://issues.apache.org/jira/browse/BEAM-9564 > Project: Beam > Issue Type: Improvement > Components: io-java-mongodb >Affects Versions: 2.21.0 >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Critical > Labels: backward-incompatible > Time Spent: 10m > Remaining Estimate: 0h > > The option MongoDBIO.withIgnoreSSLCertificate and > withSSLInvalidHostNameAllowedslInvalidHostNameAllowed() are insecure by > design. We should not encourage users to be able to use them so better to > remove these options. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9564) Remove insecure ssl options from MongoDBIO
[ https://issues.apache.org/jira/browse/BEAM-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-9564: --- Description: The option MongoDBIO.withIgnoreSSLCertificate and withSSLInvalidHostNameAllowedslInvalidHostNameAllowed() are insecure by design. We should not encourage users to be able to use them so better to remove these options. (was: The option MongoDBIO.withIgnoreSSLCertificate is insecure by design. We should not encourage this behavior so better to remove this option. I thought this was used only for tests but it does not seem to be the case so there is not even a valid case to keep it.) > Remove insecure ssl options from MongoDBIO > -- > > Key: BEAM-9564 > URL: https://issues.apache.org/jira/browse/BEAM-9564 > Project: Beam > Issue Type: Improvement > Components: io-java-mongodb >Affects Versions: 2.21.0 >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Critical > Labels: backward-incompatible > > The option MongoDBIO.withIgnoreSSLCertificate and > withSSLInvalidHostNameAllowedslInvalidHostNameAllowed() are insecure by > design. We should not encourage users to be able to use them so better to > remove these options. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9564) Remove insecure ssl options from MongoDBIO
[ https://issues.apache.org/jira/browse/BEAM-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-9564: --- Summary: Remove insecure ssl options from MongoDBIO (was: Remove insecure withIgnoreSSLCertificate method from MongoDBIO) > Remove insecure ssl options from MongoDBIO > -- > > Key: BEAM-9564 > URL: https://issues.apache.org/jira/browse/BEAM-9564 > Project: Beam > Issue Type: Improvement > Components: io-java-mongodb >Affects Versions: 2.21.0 >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Critical > Labels: backward-incompatible > > The option MongoDBIO.withIgnoreSSLCertificate is insecure by design. We > should not encourage this behavior so better to remove this option. > I thought this was used only for tests but it does not seem to be the case so > there is not even a valid case to keep it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9564) Remove insecure withIgnoreSSLCertificate method from MongoDBIO
[ https://issues.apache.org/jira/browse/BEAM-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-9564: --- Summary: Remove insecure withIgnoreSSLCertificate method from MongoDBIO (was: Remove withIgnoreSSLCertificate from MongoDBIO) > Remove insecure withIgnoreSSLCertificate method from MongoDBIO > -- > > Key: BEAM-9564 > URL: https://issues.apache.org/jira/browse/BEAM-9564 > Project: Beam > Issue Type: Improvement > Components: io-java-mongodb >Affects Versions: 2.21.0 >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Critical > Labels: backward-incompatible > > The option MongoDBIO.withIgnoreSSLCertificate is insecure by design. We > should not encourage this behavior so better to remove this option. > I thought this was used only for tests but it does not seem to be the case so > there is not even a valid case to keep it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407302=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407302 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 22:25 Start Date: 20/Mar/20 22:25 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11179: [BEAM-3301] Bugfix in DoFn validation. URL: https://github.com/apache/beam/pull/11179 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407302) Time Spent: 8h 10m (was: 8h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 8h 10m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9430) Migrate from ProcessContext#updateWatermark to WatermarkEstimators
[ https://issues.apache.org/jira/browse/BEAM-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-9430: --- Labels: backward-incompatible (was: ) > Migrate from ProcessContext#updateWatermark to WatermarkEstimators > -- > > Key: BEAM-9430 > URL: https://issues.apache.org/jira/browse/BEAM-9430 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Labels: backward-incompatible > Fix For: 2.21.0 > > Time Spent: 3h 40m > Remaining Estimate: 0h > > Current discussion underway in > [https://lists.apache.org/thread.html/r5d974b6a58bc04ff4c02682fda4ef68608121f1bf23a86e9d592ca6e%40%3Cdev.beam.apache.org%3E] > > Proposed API: [https://github.com/apache/beam/pull/10992] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9564) Remove withIgnoreSSLCertificate from MongoDBIO
Ismaël Mejía created BEAM-9564: -- Summary: Remove withIgnoreSSLCertificate from MongoDBIO Key: BEAM-9564 URL: https://issues.apache.org/jira/browse/BEAM-9564 Project: Beam Issue Type: Improvement Components: io-java-mongodb Affects Versions: 2.21.0 Reporter: Ismaël Mejía Assignee: Ismaël Mejía The option MongoDBIO.withIgnoreSSLCertificate is insecure by design. We should not encourage this behavior so better to remove this option. I thought this was used only for tests but it does not seem to be the case so there is not even a valid case to keep it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9564) Remove withIgnoreSSLCertificate from MongoDBIO
[ https://issues.apache.org/jira/browse/BEAM-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-9564: --- Labels: backward-incompatible (was: ) > Remove withIgnoreSSLCertificate from MongoDBIO > -- > > Key: BEAM-9564 > URL: https://issues.apache.org/jira/browse/BEAM-9564 > Project: Beam > Issue Type: Improvement > Components: io-java-mongodb >Affects Versions: 2.21.0 >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Critical > Labels: backward-incompatible > > The option MongoDBIO.withIgnoreSSLCertificate is insecure by design. We > should not encourage this behavior so better to remove this option. > I thought this was used only for tests but it does not seem to be the case so > there is not even a valid case to keep it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9564) Remove withIgnoreSSLCertificate from MongoDBIO
[ https://issues.apache.org/jira/browse/BEAM-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-9564: --- Status: Open (was: Triage Needed) > Remove withIgnoreSSLCertificate from MongoDBIO > -- > > Key: BEAM-9564 > URL: https://issues.apache.org/jira/browse/BEAM-9564 > Project: Beam > Issue Type: Improvement > Components: io-java-mongodb >Affects Versions: 2.21.0 >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Critical > > The option MongoDBIO.withIgnoreSSLCertificate is insecure by design. We > should not encourage this behavior so better to remove this option. > I thought this was used only for tests but it does not seem to be the case so > there is not even a valid case to keep it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9549) Flaky portableWordCountBatch and portableWordCountStreaming tests
[ https://issues.apache.org/jira/browse/BEAM-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-9549: --- Status: Open (was: Triage Needed) > Flaky portableWordCountBatch and portableWordCountStreaming tests > - > > Key: BEAM-9549 > URL: https://issues.apache.org/jira/browse/BEAM-9549 > Project: Beam > Issue Type: Test > Components: test-failures >Reporter: Ning Kang >Assignee: Ankur Goenka >Priority: Major > Attachments: Sr5cNnx8sAW.png > > > The tests :sdks:python:test-suites:portable:py2:portableWordCountBatch and > :sdks:python:test-suites:portable:py2:portableWordCountStreaming are flaky, > sometimes throws grpc errrors. > Stacktrace > !Sr5cNnx8sAW.png|width=2049,height=1001! > In text: > {code:java} > INFO:root:Using Python SDK docker image: > apache/beam_python2.7_sdk:2.21.0.dev. If the image is not available at local, > we will try to pull from > hub.docker.comINFO:apache_beam.runners.portability.fn_api_runner_transforms: > > INFO:apache_beam.utils.subprocess_server:Starting service > with ['docker' 'run' '-v' '/usr/bin/docker:/bin/docker' '-v' > '/var/run/docker.sock:/var/run/docker.sock' '--network=host' > 'apache/beam_flink1.9_job_server:latest' '--job-host' 'localhost' > '--job-port' '58753' '--artifact-port' '60175' '--expansion-port' > '33067']INFO:apache_beam.utils.subprocess_server:[main] INFO > org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - > ArtifactStagingService started on > localhost:60175INFO:apache_beam.utils.subprocess_server:[main] INFO > org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - Java > ExpansionService started on > localhost:33067INFO:apache_beam.utils.subprocess_server:[main] INFO > org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - > JobService started on localhost:58753ERROR:grpc._common:Exception > deserializing message!Traceback (most recent call last): File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_common.py", > line 84, in _transformreturn transformer(message)DecodeError: Error > parsing messageTraceback (most recent call last): File > "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main > "__main__", fname, loader, pkg_name) File "/usr/lib/python2.7/runpy.py", > line 72, in _run_codeexec code in run_globals File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/lib/python2.7/site-packages/apache_beam/examples/wordcount.py", > line 142, in run() File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/lib/python2.7/site-packages/apache_beam/examples/wordcount.py", > line 121, in runresult = p.run() File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/pipeline.py", > line 495, in runself._options).run(False) File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/pipeline.py", > line 508, in runreturn self.runner.run_pipeline(self, self._options) > File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/runners/portability/portable_runner.py", > line 401, in run_pipelinejob_service_handle.submit(proto_pipeline) File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/runners/portability/portable_runner.py", > line 102, in submitprepare_response = self.prepare(proto_pipeline) File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/runners/portability/portable_runner.py", > line 179, in preparetimeout=self.timeout) File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py", > line 826, in __call__return _end_unary_response_blocking(state, call, > False, None) File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py", > line 729, in _end_unary_response_blockingraise > _InactiveRpcError(state)grpc._channel._InactiveRpcError: <_InactiveRpcError > of
[jira] [Updated] (BEAM-9563) TFRecordIO inefficient read from sideinput causing pipeline to be slow - fix
[ https://issues.apache.org/jira/browse/BEAM-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-9563: --- Status: Open (was: Triage Needed) > TFRecordIO inefficient read from sideinput causing pipeline to be slow - fix > > > Key: BEAM-9563 > URL: https://issues.apache.org/jira/browse/BEAM-9563 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: Not applicable >Reporter: Piotr Szuberski >Assignee: Piotr Szuberski >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > The class ToListCombineFn in the previous task has public access level but > can be private. > sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java:420 > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9444) Shall we use GCP Libraries BOM to specify Google-related library versions?
[ https://issues.apache.org/jira/browse/BEAM-9444?focusedWorklogId=407297=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407297 ] ASF GitHub Bot logged work on BEAM-9444: Author: ASF GitHub Bot Created on: 20/Mar/20 22:09 Start Date: 20/Mar/20 22:09 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #11156: [BEAM-9444] Use GCP Libraries BOM for Google Cloud Dependencies URL: https://github.com/apache/beam/pull/11156#discussion_r395911140 ## File path: buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy ## @@ -444,45 +439,46 @@ class BeamModulePlugin implements Plugin { commons_lang3 : "org.apache.commons:commons-lang3:3.9", commons_math3 : "org.apache.commons:commons-math3:3.6.1", error_prone_annotations : "com.google.errorprone:error_prone_annotations:2.0.15", -gax : "com.google.api:gax:$gax_version", -gax_grpc: "com.google.api:gax-grpc:$gax_version", +gax : "com.google.api:gax", +gax_grpc: "com.google.api:gax-grpc", google_api_client : "com.google.api-client:google-api-client:$google_clients_version", google_api_client_jackson2 : "com.google.api-client:google-api-client-jackson2:$google_clients_version", google_api_client_java6 : "com.google.api-client:google-api-client-java6:$google_clients_version", -google_api_common : "com.google.api:api-common:1.8.1", +google_api_common : "com.google.api:api-common", Review comment: A question for my learning. Previously this file had specific dependency versions. For any released Beam version we could check the file in the release branch and get a list of dependencies and their versions. After this change, how can we do the same thing? Would it happen through a generated BOM file in the release branch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407297) Time Spent: 7h 20m (was: 7h 10m) > Shall we use GCP Libraries BOM to specify Google-related library versions? > -- > > Key: BEAM-9444 > URL: https://issues.apache.org/jira/browse/BEAM-9444 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Tomo Suzuki >Assignee: Tomo Suzuki >Priority: Major > Attachments: Screen Shot 2020-03-13 at 13.33.01.png, Screen Shot > 2020-03-17 at 16.01.16.png > > Time Spent: 7h 20m > Remaining Estimate: 0h > > Shall we use GCP Libraries BOM to specify Google-related library versions? > > I've been working on Beam's dependency upgrades in the past few months. I > think it's time to consider a long-term solution to keep the libraries > up-to-date with small maintenance effort. To achieve that, I propose Beam to > use GCP Libraries BOM to set the Google-related library versions, rather than > trying to make changes in each of ~30 Google libraries. > > h1. Background > A BOM is pom.xml that provides dependencyManagement to importing projects. > > GCP Libraries BOM is a BOM that includes many Google Cloud related libraries > + gRPC + protobuf. We (Google Cloud Java Diamond Dependency team) maintain > the BOM so that the set of the libraries are compatible with each other. > > h1. Implementation > Notes for obstacles. > h2. BeamModulePlugin's "force" does not take BOM into account (thus fails) > {{forcedModules}} via version resolution strategy is playing bad. This causes > {noformat} > A problem occurred evaluating project ':sdks:java:extensions:sql'. > Could not resolve all dependencies for configuration > ':sdks:java:extensions:sql:fmppTemplates'. > Invalid format: 'com.google.cloud:google-cloud-core'. Group, name and version > cannot be empty. Correct example: 'org.gradle:gradle-core:1.0'{noformat} > !Screen Shot 2020-03-13 at 13.33.01.png|width=489,height=287! > > h2. :sdks:java:maven-archetypes:examples needs the version of > google-http-client > The task requires the version for the library: > {code:java} > 'google-http-client.version': >
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407296=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407296 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 22:09 Start Date: 20/Mar/20 22:09 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395908901 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -229,101 +215,127 @@ message MonitoringInfo { NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }]; NAME = 6 [(label_props) = { name: "NAME" }]; } + // A set of key+value labels which define the scope of the metric. // Either a well defined entity id for matching the enum names in // the MonitoringInfoLabels enum or any arbitrary label // set by a custom metric or user metric. + // // A monitoring system is expected to be able to aggregate the metrics // together for all updates having the same URN and labels. Some systems such // as Stackdriver will be able to aggregate the metrics using a subset of the // provided labels - map labels = 5; - - // The walltime of the most recent update. - // Useful for aggregation for latest types such as LatestInt64. - google.protobuf.Timestamp timestamp = 6; + map labels = 4; } message MonitoringInfoTypeUrns { enum Enum { +// Represents an integer counter where values are summed across bundles. +// +// Encoding: +// - value: beam:coder:varint:v1 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) = -"beam:metrics:sum_int_64"]; - -DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:distribution_int_64"]; - -LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:latest_int_64"]; +"beam:metrics:sum_int64:v1"]; + +// Represents a double counter where values are summed across bundles. +// +// Encoding: +// value: beam:coder:double:v1 +SUM_DOUBLE_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) = +"beam:metrics:sum_int64:v1"]; + +// Represents a distribution of an integer value where: +// - count: represents the number of values seen across all bundles +// - sum: represents the total of the value across all bundles +// - min: represents the smallest value seen across all bundles +// - max: represents the largest value seen across all bundles +// +// Encoding: +// - count: beam:coder:varint:v1 +// - sum: beam:coder:varint:v1 +// - min: beam:coder:varint:v1 +// - max: beam:coder:varint:v1 +DISTRIBUTION_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:distribution_int64:v1"]; + +// Represents a distribution of a double value where: +// - count: represents the number of values seen across all bundles +// - sum: represents the total of the value across all bundles +// - min: represents the smallest value seen across all bundles +// - max: represents the largest value seen across all bundles +// +// Encoding: +// - count: beam:coder:varint:v1 +// - sum: beam:coder:double:v1 +// - min: beam:coder:double:v1 +// - max: beam:coder:double:v1 +DISTRIBUTION_DOUBLE_TYPE = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:distribution_int64:v1"]; + +// Represents the latest seen integer value. The timestamp is used to +// provide an "ordering" over multiple values to determine which is the +// latest. +// +// Encoding: +// - timestamp: beam:coder:varint:v1 (milliseconds since epoch) +// - value: beam:coder:varint:v1 +LATEST_INT64_TYPE = 4 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:latest_int64:v1"]; + +// Represents the latest seen integer value. The timestamp is used to +// provide an "ordering" over multiple values to determine which is the +// latest. +// +// Encoding: +// - timestamp: beam:coder:varint:v1 (milliseconds since epoch) +// - value: beam:coder:double:v1 +LATEST_DOUBLE_TYPE = 5 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:latest_int64:v1"]; + +// Represents the largest set of integer values seen across bundles. +// +// Encoding: ... +// - valueX: beam:coder:varint:v1 +TOP_N_INT64_TYPE = 6
[jira] [Work logged] (BEAM-8910) Use AVRO instead of JSON in BigQuery bounded source.
[ https://issues.apache.org/jira/browse/BEAM-8910?focusedWorklogId=407295=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407295 ] ASF GitHub Bot logged work on BEAM-8910: Author: ASF GitHub Bot Created on: 20/Mar/20 22:08 Start Date: 20/Mar/20 22:08 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11086: [BEAM-8910] Make custom BQ source read from Avro URL: https://github.com/apache/beam/pull/11086#discussion_r395910855 ## File path: sdks/python/apache_beam/io/gcp/bigquery_read_it_test.py ## @@ -236,11 +251,12 @@ def create_table(cls, table_name): cls.bigquery_client.insert_rows( cls.project, cls.dataset_id, table_name, table_data) - def get_expected_data(self): + def get_expected_data(self, native=True): +byts = b'\xab\xac' expected_row = { 'float': 0.33, 'numeric': Decimal('10'), -'bytes': base64.b64encode(b'\xab\xac'), +'bytes': base64.b64encode(byts) if native else byts, Review comment: The behavior will be different for different transforms. Users will need to explicitly change the transform in their code. We can make them aware of the new transform, and its typing differences in release notes, and possibly in Pydoc for the new transform as well. Thoughts? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407295) Time Spent: 2h 50m (was: 2h 40m) > Use AVRO instead of JSON in BigQuery bounded source. > > > Key: BEAM-8910 > URL: https://issues.apache.org/jira/browse/BEAM-8910 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Kamil Wasilewski >Assignee: Pablo Estrada >Priority: Minor > Time Spent: 2h 50m > Remaining Estimate: 0h > > The proposed BigQuery bounded source in Python SDK (see PR: > [https://github.com/apache/beam/pull/9772)] uses a BigQuery export job to > take a snapshot of the table and read from each produced JSON file. A > performance improvement can be gain by switching to AVRO instead. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9558) Make end of data channel explicit
[ https://issues.apache.org/jira/browse/BEAM-9558?focusedWorklogId=407294=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407294 ] ASF GitHub Bot logged work on BEAM-9558: Author: ASF GitHub Bot Created on: 20/Mar/20 22:06 Start Date: 20/Mar/20 22:06 Worklog Time Spent: 10m Work Description: robertwb commented on issue #11173: [BEAM-9558] Add explicit end bit for data channel. URL: https://github.com/apache/beam/pull/11173#issuecomment-601930536 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407294) Time Spent: 50m (was: 40m) > Make end of data channel explicit > - > > Key: BEAM-9558 > URL: https://issues.apache.org/jira/browse/BEAM-9558 > Project: Beam > Issue Type: Improvement > Components: beam-model >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Fix For: 2.21.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Currently the end of a data channel is implicitly marked by sending an empty > data block. The protocol would be simplified by making this explicit, and it > would also prevent data loss bugs that might occur by accidentally sending an > empty block. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7923) Interactive Beam
[ https://issues.apache.org/jira/browse/BEAM-7923?focusedWorklogId=407292=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407292 ] ASF GitHub Bot logged work on BEAM-7923: Author: ASF GitHub Bot Created on: 20/Mar/20 22:04 Start Date: 20/Mar/20 22:04 Worklog Time Spent: 10m Work Description: robertwb commented on issue #11174: [BEAM-7923] Pop failed transform when error is raised URL: https://github.com/apache/beam/pull/11174#issuecomment-601929637 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407292) Time Spent: 9h (was: 8h 50m) > Interactive Beam > > > Key: BEAM-7923 > URL: https://issues.apache.org/jira/browse/BEAM-7923 > Project: Beam > Issue Type: New Feature > Components: runner-py-interactive >Reporter: Ning Kang >Assignee: Ning Kang >Priority: Major > Time Spent: 9h > Remaining Estimate: 0h > > This is the top level ticket for all efforts leveraging [interactive > Beam|[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/interactive]] > As the development goes, blocking tickets will be added to this one. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407290=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407290 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 22:01 Start Date: 20/Mar/20 22:01 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395908901 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -229,101 +215,127 @@ message MonitoringInfo { NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }]; NAME = 6 [(label_props) = { name: "NAME" }]; } + // A set of key+value labels which define the scope of the metric. // Either a well defined entity id for matching the enum names in // the MonitoringInfoLabels enum or any arbitrary label // set by a custom metric or user metric. + // // A monitoring system is expected to be able to aggregate the metrics // together for all updates having the same URN and labels. Some systems such // as Stackdriver will be able to aggregate the metrics using a subset of the // provided labels - map labels = 5; - - // The walltime of the most recent update. - // Useful for aggregation for latest types such as LatestInt64. - google.protobuf.Timestamp timestamp = 6; + map labels = 4; } message MonitoringInfoTypeUrns { enum Enum { +// Represents an integer counter where values are summed across bundles. +// +// Encoding: +// - value: beam:coder:varint:v1 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) = -"beam:metrics:sum_int_64"]; - -DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:distribution_int_64"]; - -LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:latest_int_64"]; +"beam:metrics:sum_int64:v1"]; + +// Represents a double counter where values are summed across bundles. +// +// Encoding: +// value: beam:coder:double:v1 +SUM_DOUBLE_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) = +"beam:metrics:sum_int64:v1"]; + +// Represents a distribution of an integer value where: +// - count: represents the number of values seen across all bundles +// - sum: represents the total of the value across all bundles +// - min: represents the smallest value seen across all bundles +// - max: represents the largest value seen across all bundles +// +// Encoding: +// - count: beam:coder:varint:v1 +// - sum: beam:coder:varint:v1 +// - min: beam:coder:varint:v1 +// - max: beam:coder:varint:v1 +DISTRIBUTION_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:distribution_int64:v1"]; + +// Represents a distribution of a double value where: +// - count: represents the number of values seen across all bundles +// - sum: represents the total of the value across all bundles +// - min: represents the smallest value seen across all bundles +// - max: represents the largest value seen across all bundles +// +// Encoding: +// - count: beam:coder:varint:v1 +// - sum: beam:coder:double:v1 +// - min: beam:coder:double:v1 +// - max: beam:coder:double:v1 +DISTRIBUTION_DOUBLE_TYPE = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:distribution_int64:v1"]; + +// Represents the latest seen integer value. The timestamp is used to +// provide an "ordering" over multiple values to determine which is the +// latest. +// +// Encoding: +// - timestamp: beam:coder:varint:v1 (milliseconds since epoch) +// - value: beam:coder:varint:v1 +LATEST_INT64_TYPE = 4 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:latest_int64:v1"]; + +// Represents the latest seen integer value. The timestamp is used to +// provide an "ordering" over multiple values to determine which is the +// latest. +// +// Encoding: +// - timestamp: beam:coder:varint:v1 (milliseconds since epoch) +// - value: beam:coder:double:v1 +LATEST_DOUBLE_TYPE = 5 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:latest_int64:v1"]; + +// Represents the largest set of integer values seen across bundles. +// +// Encoding: ... +// - valueX: beam:coder:varint:v1 +TOP_N_INT64_TYPE = 6
[jira] [Work logged] (BEAM-9420) Configurable timeout for Kafka setupInitialOffset()
[ https://issues.apache.org/jira/browse/BEAM-9420?focusedWorklogId=407289=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407289 ] ASF GitHub Bot logged work on BEAM-9420: Author: ASF GitHub Bot Created on: 20/Mar/20 21:59 Start Date: 20/Mar/20 21:59 Worklog Time Spent: 10m Work Description: iemejia commented on issue #11099: [BEAM-9420] Configurable timeout for blocking kafka API call(s) URL: https://github.com/apache/beam/pull/11099#issuecomment-601928283 @aromanenko-dev maybe since he has been maintaing KafkaIo for a while. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407289) Time Spent: 1h (was: 50m) > Configurable timeout for Kafka setupInitialOffset() > --- > > Key: BEAM-9420 > URL: https://issues.apache.org/jira/browse/BEAM-9420 > Project: Beam > Issue Type: Bug > Components: io-java-kafka >Affects Versions: 2.19.0 >Reporter: Jozef Vilcek >Assignee: Jozef Vilcek >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > If bootstrap brokers does contain an unhealthy server, it can break the start > of a whole Beam job. During the start, `KafkaUnboundedReader` is waiting for > `setupInitialOffset()`. Wait timeout is either a double time of `request. > timeout.ms` or some default constant. In both cases, it might not be enough > time for kafka-client to initiate fallback and retry metadata discovery via > another broker from given bootstrap list. > The client should be able to specify timeout for `setupInitialOffset()` > explicitly as a setting to KafkaIO read. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8019) Support cross-language transforms for DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-8019?focusedWorklogId=407287=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407287 ] ASF GitHub Bot logged work on BEAM-8019: Author: ASF GitHub Bot Created on: 20/Mar/20 21:56 Start Date: 20/Mar/20 21:56 Worklog Time Spent: 10m Work Description: chamikaramj commented on issue #11185: [BEAM-8019] Some generalizations to support cross-language transforms. URL: https://github.com/apache/beam/pull/11185#issuecomment-601927190 cc: @robertwb @lukecwik Still addressing some unit test failures but sharing for any early comments. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407287) Time Spent: 8h 20m (was: 8h 10m) > Support cross-language transforms for DataflowRunner > > > Key: BEAM-8019 > URL: https://issues.apache.org/jira/browse/BEAM-8019 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Chamikara Madhusanka Jayalath >Assignee: Chamikara Madhusanka Jayalath >Priority: Major > Time Spent: 8h 20m > Remaining Estimate: 0h > > This is to capture the Beam changes needed for this task. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8019) Support cross-language transforms for DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-8019?focusedWorklogId=407285=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407285 ] ASF GitHub Bot logged work on BEAM-8019: Author: ASF GitHub Bot Created on: 20/Mar/20 21:55 Start Date: 20/Mar/20 21:55 Worklog Time Spent: 10m Work Description: chamikaramj commented on pull request #11185: [BEAM-8019] Some generalizations to support cross-language transforms. URL: https://github.com/apache/beam/pull/11185 These are needed for for runners that need to build a Python object graph from a runner API proto with external transforms (for example, Dataflow). Some generalizations to support cross-language transforms. Testing - cross-language test suite [1] works for Dataflow with these changes (will be enabled separately). WIP: not all tests pass yet [1] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/validate_runner_xlang_test.py#L51 **Please** add a meaningful description for your change here Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407283=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407283 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 21:50 Start Date: 20/Mar/20 21:50 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395905193 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -229,101 +215,127 @@ message MonitoringInfo { NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }]; NAME = 6 [(label_props) = { name: "NAME" }]; } + // A set of key+value labels which define the scope of the metric. // Either a well defined entity id for matching the enum names in // the MonitoringInfoLabels enum or any arbitrary label // set by a custom metric or user metric. + // // A monitoring system is expected to be able to aggregate the metrics // together for all updates having the same URN and labels. Some systems such // as Stackdriver will be able to aggregate the metrics using a subset of the // provided labels - map labels = 5; - - // The walltime of the most recent update. - // Useful for aggregation for latest types such as LatestInt64. - google.protobuf.Timestamp timestamp = 6; + map labels = 4; } message MonitoringInfoTypeUrns { Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407283) Time Spent: 20.5h (was: 20h 20m) > Update existing metrics in the FN API to use new Metric Schema > -- > > Key: BEAM-4374 > URL: https://issues.apache.org/jira/browse/BEAM-4374 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Alex Amato >Priority: Major > Time Spent: 20.5h > Remaining Estimate: 0h > > Update existing metrics to use the new proto and cataloging schema defined in: > [_https://s.apache.org/beam-fn-api-metrics_] > * Check in new protos > * Define catalog file for metrics > * Port existing metrics to use this new format, based on catalog > names+metadata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package
[ https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407277=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407277 ] ASF GitHub Bot logged work on BEAM-9537: Author: ASF GitHub Bot Created on: 20/Mar/20 21:45 Start Date: 20/Mar/20 21:45 Worklog Time Spent: 10m Work Description: pabloem commented on issue #11153: [BEAM-9537] Adding a new module for FnApiRunner URL: https://github.com/apache/beam/pull/11153#issuecomment-601923625 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407277) Time Spent: 2h 20m (was: 2h 10m) > Refactor FnApiRunner into its own package > - > > Key: BEAM-9537 > URL: https://issues.apache.org/jira/browse/BEAM-9537 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Minor > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable
[ https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=407276=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407276 ] ASF GitHub Bot logged work on BEAM-8280: Author: ASF GitHub Bot Created on: 20/Mar/20 21:44 Start Date: 20/Mar/20 21:44 Worklog Time Spent: 10m Work Description: udim commented on issue #10717: [BEAM-8280] Enable type hint annotations URL: https://github.com/apache/beam/pull/10717#issuecomment-601923382 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407276) Time Spent: 10h (was: 9h 50m) > re-enable IOTypeHints.from_callable > --- > > Key: BEAM-8280 > URL: https://issues.apache.org/jira/browse/BEAM-8280 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Time Spent: 10h > Remaining Estimate: 0h > > See https://issues.apache.org/jira/browse/BEAM-8279 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9340) Properly populate pipeline proto requirements.
[ https://issues.apache.org/jira/browse/BEAM-9340?focusedWorklogId=407272=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407272 ] ASF GitHub Bot logged work on BEAM-9340: Author: ASF GitHub Bot Created on: 20/Mar/20 21:37 Start Date: 20/Mar/20 21:37 Worklog Time Spent: 10m Work Description: robertwb commented on issue #11165: [BEAM-9340] Populate requirements for Java. URL: https://github.com/apache/beam/pull/11165#issuecomment-601921071 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407272) Time Spent: 1h 50m (was: 1h 40m) > Properly populate pipeline proto requirements. > -- > > Key: BEAM-9340 > URL: https://issues.apache.org/jira/browse/BEAM-9340 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Fix For: 2.20.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9545) MVP: DataframeTransform
[ https://issues.apache.org/jira/browse/BEAM-9545?focusedWorklogId=407271=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407271 ] ASF GitHub Bot logged work on BEAM-9545: Author: ASF GitHub Bot Created on: 20/Mar/20 21:37 Start Date: 20/Mar/20 21:37 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #10760: [BEAM-9545] Dataframe transforms URL: https://github.com/apache/beam/pull/10760#discussion_r395900936 ## File path: sdks/python/apache_beam/dataframe/transforms.py ## @@ -0,0 +1,246 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +#http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import absolute_import + +import pandas as pd + +import apache_beam as beam +from apache_beam import transforms +from apache_beam.dataframe import expressions +from apache_beam.dataframe import frame_base +from apache_beam.dataframe import frames # pylint: disable=unused-import + + +class DataframeTransform(transforms.PTransform): + """A PTransform for applying function that takes and returns dataframes + to one or more PCollections. + + For example, if pcoll is a PCollection of dataframes, one could write:: + + pcoll | DataframeTransform(lambda df: df.group_by('key').sum(), proxy=...) + + To pass multiple PCollections, pass a tuple of PCollections wich will be + passed to the callable as positional arguments, or a dictionary of + PCollections, in which case they will be passed as keyword arguments. + """ + def __init__(self, func, proxy): +self._func = func +self._proxy = proxy + + def expand(self, input_pcolls): +def wrap_as_dict(values): + if isinstance(values, dict): +return values + elif isinstance(values, tuple): +return dict(enumerate(values)) + else: +return {None: values} + +# TODO: Infer the proxy from the input schema. +def proxy(key): + if key is None: +return self._proxy + else: +return self._proxy[key] + +# The input can be a dictionary, tuple, or plain PCollection. +# Wrap as a dict for homogeneity. +# TODO: Possibly inject batching here. +input_dict = wrap_as_dict(input_pcolls) +placeholders = { +key: frame_base.DeferredFrame.wrap( +expressions.PlaceholderExpression(proxy(key))) +for key in input_dict.keys() +} + +# The calling convention of the user-supplied func varies according to the +# type of the input. +if isinstance(input_pcolls, dict): + result_frames = self._func(**placeholders) +elif isinstance(input_pcolls, tuple): + result_frames = self._func( + *(value for _, value in sorted(placeholders.items( +else: + result_frames = self._func(placeholders[None]) + +# Likewise the output may be a dict, tuple, or raw (deferred) Dataframe. +result_dict = wrap_as_dict(result_frames) + +result_pcolls = self._apply_deferred_ops( +{placeholders[key]._expr: pcoll + for key, pcoll in input_dict.items()}, +{key: df._expr + for key, df in result_dict.items()}) + +# Convert the result back into a set of PCollections. +if isinstance(result_frames, dict): + return result_pcolls +elif isinstance(result_frames, tuple): + return tuple((value for _, value in sorted(result_pcolls.items( +else: + return result_pcolls[None] + + def _apply_deferred_ops( + self, + inputs, # type: Dict[PlaceholderExpr, PCollection] + outputs, # type: Dict[Any, Expression] + ): # -> Dict[Any, PCollection] +"""Construct a Beam graph that evaluates a set of expressions on a set of +input PCollections. + +:param inputs: A mapping of placeholder expressions to PCollections. +:param outputs: A mapping of keys to expressions defined in terms of the +placeholders of inputs. + +Returns a dictionary whose keys are those of outputs, and whose values are +PCollections corresponding to the values of outputs evaluated at the +values of inputs. + +Logically, `_apply_deferred_ops({x: a, y: b}, {f: F(x, y), g: G(x, y)})` +returns `{f: F(a,
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407270=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407270 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 21:35 Start Date: 20/Mar/20 21:35 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395900617 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -194,33 +188,25 @@ extend google.protobuf.EnumValueOptions { } message MonitoringInfo { - // The name defining the metric or monitored state. + // The name defining the semantic meaning of the metric or monitored state. + // + // See MonitoringInfoSpecs.Enum for the set of well known metrics/monitored + // state. string urn = 1; - // This is specified as a URN that implies: - // A message class: (Distribution, Counter, Extrema, MonitoringDataTable). - // Sub types like field formats - int64, double, string. - // Aggregation methods - SUM, LATEST, TOP-N, BOTTOM-N, DISTRIBUTION - // valid values are: - // beam:metrics:[sum_int_64|latest_int_64|top_n_int_64|bottom_n_int_64| - // sum_double|latest_double|top_n_double|bottom_n_double| - // distribution_int_64|distribution_double|monitoring_data_table| - // latest_doubles + // This is specified as a URN that implies the encoding and aggregation + // method. See MonitoringInfoTypeUrns.Enum for the set of well known types. string type = 2; - // The Metric or monitored state. - oneof data { -MonitoringTableData monitoring_table_data = 3; -Metric metric = 4; -bytes payload = 7; - } + // The monitored state encoded as per the specification defined by the type. + bytes payload = 3; Review comment: Yup This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407270) Time Spent: 20h 20m (was: 20h 10m) > Update existing metrics in the FN API to use new Metric Schema > -- > > Key: BEAM-4374 > URL: https://issues.apache.org/jira/browse/BEAM-4374 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Alex Amato >Priority: Major > Time Spent: 20h 20m > Remaining Estimate: 0h > > Update existing metrics to use the new proto and cataloging schema defined in: > [_https://s.apache.org/beam-fn-api-metrics_] > * Check in new protos > * Define catalog file for metrics > * Port existing metrics to use this new format, based on catalog > names+metadata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407269=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407269 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 21:35 Start Date: 20/Mar/20 21:35 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395900505 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -229,101 +215,127 @@ message MonitoringInfo { NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }]; NAME = 6 [(label_props) = { name: "NAME" }]; } + // A set of key+value labels which define the scope of the metric. // Either a well defined entity id for matching the enum names in // the MonitoringInfoLabels enum or any arbitrary label // set by a custom metric or user metric. + // // A monitoring system is expected to be able to aggregate the metrics // together for all updates having the same URN and labels. Some systems such // as Stackdriver will be able to aggregate the metrics using a subset of the // provided labels - map labels = 5; - - // The walltime of the most recent update. - // Useful for aggregation for latest types such as LatestInt64. - google.protobuf.Timestamp timestamp = 6; + map labels = 4; } message MonitoringInfoTypeUrns { enum Enum { +// Represents an integer counter where values are summed across bundles. +// +// Encoding: +// - value: beam:coder:varint:v1 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) = -"beam:metrics:sum_int_64"]; - -DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:distribution_int_64"]; - -LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:latest_int_64"]; +"beam:metrics:sum_int64:v1"]; + +// Represents a double counter where values are summed across bundles. +// +// Encoding: +// value: beam:coder:double:v1 +SUM_DOUBLE_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) = +"beam:metrics:sum_int64:v1"]; + +// Represents a distribution of an integer value where: +// - count: represents the number of values seen across all bundles +// - sum: represents the total of the value across all bundles +// - min: represents the smallest value seen across all bundles +// - max: represents the largest value seen across all bundles +// +// Encoding: +// - count: beam:coder:varint:v1 +// - sum: beam:coder:varint:v1 +// - min: beam:coder:varint:v1 +// - max: beam:coder:varint:v1 +DISTRIBUTION_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:distribution_int64:v1"]; + +// Represents a distribution of a double value where: +// - count: represents the number of values seen across all bundles +// - sum: represents the total of the value across all bundles +// - min: represents the smallest value seen across all bundles +// - max: represents the largest value seen across all bundles +// +// Encoding: +// - count: beam:coder:varint:v1 +// - sum: beam:coder:double:v1 +// - min: beam:coder:double:v1 +// - max: beam:coder:double:v1 +DISTRIBUTION_DOUBLE_TYPE = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:distribution_int64:v1"]; + +// Represents the latest seen integer value. The timestamp is used to +// provide an "ordering" over multiple values to determine which is the +// latest. +// +// Encoding: +// - timestamp: beam:coder:varint:v1 (milliseconds since epoch) +// - value: beam:coder:varint:v1 +LATEST_INT64_TYPE = 4 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:latest_int64:v1"]; + +// Represents the latest seen integer value. The timestamp is used to +// provide an "ordering" over multiple values to determine which is the +// latest. +// +// Encoding: +// - timestamp: beam:coder:varint:v1 (milliseconds since epoch) +// - value: beam:coder:double:v1 +LATEST_DOUBLE_TYPE = 5 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:latest_int64:v1"]; + +// Represents the largest set of integer values seen across bundles. +// +// Encoding: ... +// - valueX: beam:coder:varint:v1 +TOP_N_INT64_TYPE = 6
[jira] [Work logged] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files
[ https://issues.apache.org/jira/browse/BEAM-8561?focusedWorklogId=407268=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407268 ] ASF GitHub Bot logged work on BEAM-8561: Author: ASF GitHub Bot Created on: 20/Mar/20 21:27 Start Date: 20/Mar/20 21:27 Worklog Time Spent: 10m Work Description: iemejia commented on issue #10290: [BEAM-8561] Add ThriftIO to support IO for Thrift files URL: https://github.com/apache/beam/pull/10290#issuecomment-601918008 We let it like this because we had a regression, if we want to generate and compile the class we need to have thrift installed in all Beam workers (and as a requirement for everyone building Beam) which I think is clearly overklll just for a test. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407268) Time Spent: 17.5h (was: 17h 20m) > Add ThriftIO to Support IO for Thrift Files > --- > > Key: BEAM-8561 > URL: https://issues.apache.org/jira/browse/BEAM-8561 > Project: Beam > Issue Type: New Feature > Components: io-java-files >Reporter: Chris Larsen >Assignee: Chris Larsen >Priority: Major > Fix For: 2.20.0 > > Time Spent: 17.5h > Remaining Estimate: 0h > > Similar to AvroIO it would be very useful to support reading and writing > to/from Thrift files with a native connector. > Functionality would include: > # read() - Reading from one or more Thrift files. > # write() - Writing to one or more Thrift files. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package
[ https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407267=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407267 ] ASF GitHub Bot logged work on BEAM-9537: Author: ASF GitHub Bot Created on: 20/Mar/20 21:26 Start Date: 20/Mar/20 21:26 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #11153: [BEAM-9537] Adding a new module for FnApiRunner URL: https://github.com/apache/beam/pull/11153#discussion_r395897316 ## File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py ## @@ -269,8 +269,8 @@ def visit_transform(self, transform_node): pcoll.element_type, transform_node.full_label) key_type, value_type = pcoll.element_type.tuple_types if transform_node.outputs: -from apache_beam.runners.portability.fn_api_runner_transforms import \ - only_element +from apache_beam.runners.portability.fn_api_runner.transforms \ Review comment: I meant that utilities like only_element should probably be in a more common place rather than imported elsewhere from here. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407267) Time Spent: 2h 10m (was: 2h) > Refactor FnApiRunner into its own package > - > > Key: BEAM-9537 > URL: https://issues.apache.org/jira/browse/BEAM-9537 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Minor > Time Spent: 2h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407265=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407265 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 21:16 Start Date: 20/Mar/20 21:16 Worklog Time Spent: 10m Work Description: ajamato commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395893496 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -229,101 +215,127 @@ message MonitoringInfo { NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }]; NAME = 6 [(label_props) = { name: "NAME" }]; } + // A set of key+value labels which define the scope of the metric. // Either a well defined entity id for matching the enum names in // the MonitoringInfoLabels enum or any arbitrary label // set by a custom metric or user metric. + // // A monitoring system is expected to be able to aggregate the metrics // together for all updates having the same URN and labels. Some systems such // as Stackdriver will be able to aggregate the metrics using a subset of the // provided labels - map labels = 5; - - // The walltime of the most recent update. - // Useful for aggregation for latest types such as LatestInt64. - google.protobuf.Timestamp timestamp = 6; + map labels = 4; } message MonitoringInfoTypeUrns { enum Enum { +// Represents an integer counter where values are summed across bundles. +// +// Encoding: +// - value: beam:coder:varint:v1 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) = -"beam:metrics:sum_int_64"]; - -DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:distribution_int_64"]; - -LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:latest_int_64"]; +"beam:metrics:sum_int64:v1"]; + +// Represents a double counter where values are summed across bundles. +// +// Encoding: +// value: beam:coder:double:v1 +SUM_DOUBLE_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) = +"beam:metrics:sum_int64:v1"]; + +// Represents a distribution of an integer value where: +// - count: represents the number of values seen across all bundles +// - sum: represents the total of the value across all bundles +// - min: represents the smallest value seen across all bundles +// - max: represents the largest value seen across all bundles +// +// Encoding: +// - count: beam:coder:varint:v1 +// - sum: beam:coder:varint:v1 +// - min: beam:coder:varint:v1 +// - max: beam:coder:varint:v1 +DISTRIBUTION_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:distribution_int64:v1"]; + +// Represents a distribution of a double value where: +// - count: represents the number of values seen across all bundles +// - sum: represents the total of the value across all bundles +// - min: represents the smallest value seen across all bundles +// - max: represents the largest value seen across all bundles +// +// Encoding: +// - count: beam:coder:varint:v1 +// - sum: beam:coder:double:v1 +// - min: beam:coder:double:v1 +// - max: beam:coder:double:v1 +DISTRIBUTION_DOUBLE_TYPE = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:distribution_int64:v1"]; + +// Represents the latest seen integer value. The timestamp is used to +// provide an "ordering" over multiple values to determine which is the +// latest. +// +// Encoding: +// - timestamp: beam:coder:varint:v1 (milliseconds since epoch) +// - value: beam:coder:varint:v1 +LATEST_INT64_TYPE = 4 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:latest_int64:v1"]; + +// Represents the latest seen integer value. The timestamp is used to +// provide an "ordering" over multiple values to determine which is the +// latest. +// +// Encoding: +// - timestamp: beam:coder:varint:v1 (milliseconds since epoch) +// - value: beam:coder:double:v1 +LATEST_DOUBLE_TYPE = 5 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:latest_int64:v1"]; + +// Represents the largest set of integer values seen across bundles. +// +// Encoding: ... +// - valueX: beam:coder:varint:v1 +TOP_N_INT64_TYPE = 6
[jira] [Work logged] (BEAM-8078) streaming_wordcount_debugging.py is missing a test
[ https://issues.apache.org/jira/browse/BEAM-8078?focusedWorklogId=407264=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407264 ] ASF GitHub Bot logged work on BEAM-8078: Author: ASF GitHub Bot Created on: 20/Mar/20 21:14 Start Date: 20/Mar/20 21:14 Worklog Time Spent: 10m Work Description: aaltay commented on issue #10914: [BEAM-8078] streaming_wordcount_debugging.py is missing a test URL: https://github.com/apache/beam/pull/10914#issuecomment-601913507 @Tesio - unfortunately, due to an ongoing issue, jenkins only starts test when a committer requests it. This should be resolved but in the meantime, if you need to trigger tests feel free to ask on the dev@ list for someone to do it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407264) Time Spent: 4h (was: 3h 50m) > streaming_wordcount_debugging.py is missing a test > -- > > Key: BEAM-8078 > URL: https://issues.apache.org/jira/browse/BEAM-8078 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Udi Meiri >Assignee: Aleksey Vysotin >Priority: Minor > Labels: beginner, easy, newbie, starter > Time Spent: 4h > Remaining Estimate: 0h > > It's example code and should have a basic_test (like the other wordcount > variants in [1]) to at least verify that it runs in the latest Beam release. > [1] > https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407263=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407263 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 21:13 Start Date: 20/Mar/20 21:13 Worklog Time Spent: 10m Work Description: ajamato commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395892463 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -229,101 +215,127 @@ message MonitoringInfo { NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }]; NAME = 6 [(label_props) = { name: "NAME" }]; } + // A set of key+value labels which define the scope of the metric. // Either a well defined entity id for matching the enum names in // the MonitoringInfoLabels enum or any arbitrary label // set by a custom metric or user metric. + // // A monitoring system is expected to be able to aggregate the metrics // together for all updates having the same URN and labels. Some systems such // as Stackdriver will be able to aggregate the metrics using a subset of the // provided labels - map labels = 5; - - // The walltime of the most recent update. - // Useful for aggregation for latest types such as LatestInt64. - google.protobuf.Timestamp timestamp = 6; + map labels = 4; } message MonitoringInfoTypeUrns { Review comment: Let's add some comments to make it clear the type is referring to what is collected in each MonitoringInfo update, and how they should be aggregated together This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407263) Time Spent: 19h 50m (was: 19h 40m) > Update existing metrics in the FN API to use new Metric Schema > -- > > Key: BEAM-4374 > URL: https://issues.apache.org/jira/browse/BEAM-4374 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Alex Amato >Priority: Major > Time Spent: 19h 50m > Remaining Estimate: 0h > > Update existing metrics to use the new proto and cataloging schema defined in: > [_https://s.apache.org/beam-fn-api-metrics_] > * Check in new protos > * Define catalog file for metrics > * Port existing metrics to use this new format, based on catalog > names+metadata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8078) streaming_wordcount_debugging.py is missing a test
[ https://issues.apache.org/jira/browse/BEAM-8078?focusedWorklogId=407262=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407262 ] ASF GitHub Bot logged work on BEAM-8078: Author: ASF GitHub Bot Created on: 20/Mar/20 21:13 Start Date: 20/Mar/20 21:13 Worklog Time Spent: 10m Work Description: aaltay commented on issue #10914: [BEAM-8078] streaming_wordcount_debugging.py is missing a test URL: https://github.com/apache/beam/pull/10914#issuecomment-601913195 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407262) Time Spent: 3h 50m (was: 3h 40m) > streaming_wordcount_debugging.py is missing a test > -- > > Key: BEAM-8078 > URL: https://issues.apache.org/jira/browse/BEAM-8078 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Udi Meiri >Assignee: Aleksey Vysotin >Priority: Minor > Labels: beginner, easy, newbie, starter > Time Spent: 3h 50m > Remaining Estimate: 0h > > It's example code and should have a basic_test (like the other wordcount > variants in [1]) to at least verify that it runs in the latest Beam release. > [1] > https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407260=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407260 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 21:07 Start Date: 20/Mar/20 21:07 Worklog Time Spent: 10m Work Description: ajamato commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395888786 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -194,33 +188,25 @@ extend google.protobuf.EnumValueOptions { } message MonitoringInfo { - // The name defining the metric or monitored state. + // The name defining the semantic meaning of the metric or monitored state. + // + // See MonitoringInfoSpecs.Enum for the set of well known metrics/monitored + // state. string urn = 1; - // This is specified as a URN that implies: - // A message class: (Distribution, Counter, Extrema, MonitoringDataTable). - // Sub types like field formats - int64, double, string. - // Aggregation methods - SUM, LATEST, TOP-N, BOTTOM-N, DISTRIBUTION - // valid values are: - // beam:metrics:[sum_int_64|latest_int_64|top_n_int_64|bottom_n_int_64| - // sum_double|latest_double|top_n_double|bottom_n_double| - // distribution_int_64|distribution_double|monitoring_data_table| - // latest_doubles + // This is specified as a URN that implies the encoding and aggregation + // method. See MonitoringInfoTypeUrns.Enum for the set of well known types. string type = 2; - // The Metric or monitored state. - oneof data { -MonitoringTableData monitoring_table_data = 3; -Metric metric = 4; -bytes payload = 7; - } + // The monitored state encoded as per the specification defined by the type. + bytes payload = 3; Review comment: Probably you are already planning on doing this. But having helper functions to easily encode/decode these would be great. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407260) Time Spent: 19h 40m (was: 19.5h) > Update existing metrics in the FN API to use new Metric Schema > -- > > Key: BEAM-4374 > URL: https://issues.apache.org/jira/browse/BEAM-4374 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Alex Amato >Priority: Major > Time Spent: 19h 40m > Remaining Estimate: 0h > > Update existing metrics to use the new proto and cataloging schema defined in: > [_https://s.apache.org/beam-fn-api-metrics_] > * Check in new protos > * Define catalog file for metrics > * Port existing metrics to use this new format, based on catalog > names+metadata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407258=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407258 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 21:07 Start Date: 20/Mar/20 21:07 Worklog Time Spent: 10m Work Description: ajamato commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395888153 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -229,101 +215,127 @@ message MonitoringInfo { NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }]; NAME = 6 [(label_props) = { name: "NAME" }]; } + // A set of key+value labels which define the scope of the metric. // Either a well defined entity id for matching the enum names in // the MonitoringInfoLabels enum or any arbitrary label // set by a custom metric or user metric. + // // A monitoring system is expected to be able to aggregate the metrics // together for all updates having the same URN and labels. Some systems such // as Stackdriver will be able to aggregate the metrics using a subset of the // provided labels - map labels = 5; - - // The walltime of the most recent update. - // Useful for aggregation for latest types such as LatestInt64. - google.protobuf.Timestamp timestamp = 6; + map labels = 4; } message MonitoringInfoTypeUrns { enum Enum { +// Represents an integer counter where values are summed across bundles. +// +// Encoding: +// - value: beam:coder:varint:v1 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) = -"beam:metrics:sum_int_64"]; - -DISTRIBUTION_INT64_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:distribution_int_64"]; - -LATEST_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) = - "beam:metrics:latest_int_64"]; +"beam:metrics:sum_int64:v1"]; + +// Represents a double counter where values are summed across bundles. +// +// Encoding: +// value: beam:coder:double:v1 +SUM_DOUBLE_TYPE = 1 [(org.apache.beam.model.pipeline.v1.beam_urn) = +"beam:metrics:sum_int64:v1"]; + +// Represents a distribution of an integer value where: +// - count: represents the number of values seen across all bundles +// - sum: represents the total of the value across all bundles +// - min: represents the smallest value seen across all bundles +// - max: represents the largest value seen across all bundles +// +// Encoding: +// - count: beam:coder:varint:v1 +// - sum: beam:coder:varint:v1 +// - min: beam:coder:varint:v1 +// - max: beam:coder:varint:v1 +DISTRIBUTION_INT64_TYPE = 2 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:distribution_int64:v1"]; + +// Represents a distribution of a double value where: +// - count: represents the number of values seen across all bundles +// - sum: represents the total of the value across all bundles +// - min: represents the smallest value seen across all bundles +// - max: represents the largest value seen across all bundles +// +// Encoding: +// - count: beam:coder:varint:v1 +// - sum: beam:coder:double:v1 +// - min: beam:coder:double:v1 +// - max: beam:coder:double:v1 +DISTRIBUTION_DOUBLE_TYPE = 3 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:distribution_int64:v1"]; + +// Represents the latest seen integer value. The timestamp is used to +// provide an "ordering" over multiple values to determine which is the +// latest. +// +// Encoding: +// - timestamp: beam:coder:varint:v1 (milliseconds since epoch) +// - value: beam:coder:varint:v1 +LATEST_INT64_TYPE = 4 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:latest_int64:v1"]; + +// Represents the latest seen integer value. The timestamp is used to +// provide an "ordering" over multiple values to determine which is the +// latest. +// +// Encoding: +// - timestamp: beam:coder:varint:v1 (milliseconds since epoch) +// - value: beam:coder:double:v1 +LATEST_DOUBLE_TYPE = 5 [(org.apache.beam.model.pipeline.v1.beam_urn) = + "beam:metrics:latest_int64:v1"]; + +// Represents the largest set of integer values seen across bundles. +// +// Encoding: ... +// - valueX: beam:coder:varint:v1 +TOP_N_INT64_TYPE = 6
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407256=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407256 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 21:07 Start Date: 20/Mar/20 21:07 Worklog Time Spent: 10m Work Description: ajamato commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395888595 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -229,101 +215,127 @@ message MonitoringInfo { NAMESPACE = 5 [(label_props) = { name: "NAMESPACE" }]; NAME = 6 [(label_props) = { name: "NAME" }]; } + // A set of key+value labels which define the scope of the metric. // Either a well defined entity id for matching the enum names in // the MonitoringInfoLabels enum or any arbitrary label // set by a custom metric or user metric. + // // A monitoring system is expected to be able to aggregate the metrics // together for all updates having the same URN and labels. Some systems such // as Stackdriver will be able to aggregate the metrics using a subset of the // provided labels - map labels = 5; - - // The walltime of the most recent update. - // Useful for aggregation for latest types such as LatestInt64. - google.protobuf.Timestamp timestamp = 6; + map labels = 4; } message MonitoringInfoTypeUrns { enum Enum { +// Represents an integer counter where values are summed across bundles. +// +// Encoding: +// - value: beam:coder:varint:v1 SUM_INT64_TYPE = 0 [(org.apache.beam.model.pipeline.v1.beam_urn) = Review comment: I think there are only a few of these types being used now. SUM_INT64_TYPE and DISTRIBUTION_INT64_TYPE. I hope we can make it very simple to add new ones of these with minimal changes (Adding MonitoringInfoSpec and reusing existing framework/libraries in the SDK, runners can mostly pass them through to a service to aggregate across multiple workers) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407256) Time Spent: 19.5h (was: 19h 20m) > Update existing metrics in the FN API to use new Metric Schema > -- > > Key: BEAM-4374 > URL: https://issues.apache.org/jira/browse/BEAM-4374 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Alex Amato >Priority: Major > Time Spent: 19.5h > Remaining Estimate: 0h > > Update existing metrics to use the new proto and cataloging schema defined in: > [_https://s.apache.org/beam-fn-api-metrics_] > * Check in new protos > * Define catalog file for metrics > * Port existing metrics to use this new format, based on catalog > names+metadata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407259=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407259 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 21:07 Start Date: 20/Mar/20 21:07 Worklog Time Spent: 10m Work Description: ajamato commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395889104 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -194,33 +188,25 @@ extend google.protobuf.EnumValueOptions { } message MonitoringInfo { - // The name defining the metric or monitored state. + // The name defining the semantic meaning of the metric or monitored state. + // + // See MonitoringInfoSpecs.Enum for the set of well known metrics/monitored + // state. string urn = 1; - // This is specified as a URN that implies: - // A message class: (Distribution, Counter, Extrema, MonitoringDataTable). - // Sub types like field formats - int64, double, string. - // Aggregation methods - SUM, LATEST, TOP-N, BOTTOM-N, DISTRIBUTION - // valid values are: - // beam:metrics:[sum_int_64|latest_int_64|top_n_int_64|bottom_n_int_64| - // sum_double|latest_double|top_n_double|bottom_n_double| - // distribution_int_64|distribution_double|monitoring_data_table| - // latest_doubles + // This is specified as a URN that implies the encoding and aggregation + // method. See MonitoringInfoTypeUrns.Enum for the set of well known types. string type = 2; - // The Metric or monitored state. - oneof data { -MonitoringTableData monitoring_table_data = 3; -Metric metric = 4; -bytes payload = 7; - } + // The monitored state encoded as per the specification defined by the type. + bytes payload = 3; Review comment: My biggest concern is losing the ability to print debug strings, which are helpful when people are trying to learn how these are populated. But maybe we can just add a few obvious places to dump debug logs, debug files, etc with the MonitoringInfors parses properly. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407259) Time Spent: 19h 40m (was: 19.5h) > Update existing metrics in the FN API to use new Metric Schema > -- > > Key: BEAM-4374 > URL: https://issues.apache.org/jira/browse/BEAM-4374 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Alex Amato >Priority: Major > Time Spent: 19h 40m > Remaining Estimate: 0h > > Update existing metrics to use the new proto and cataloging schema defined in: > [_https://s.apache.org/beam-fn-api-metrics_] > * Check in new protos > * Define catalog file for metrics > * Port existing metrics to use this new format, based on catalog > names+metadata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7923) Interactive Beam
[ https://issues.apache.org/jira/browse/BEAM-7923?focusedWorklogId=407255=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407255 ] ASF GitHub Bot logged work on BEAM-7923: Author: ASF GitHub Bot Created on: 20/Mar/20 21:07 Start Date: 20/Mar/20 21:07 Worklog Time Spent: 10m Work Description: KevinGG commented on pull request #11174: [BEAM-7923] Pop failed transform in CombineGlobally URL: https://github.com/apache/beam/pull/11174#discussion_r395890026 ## File path: sdks/python/apache_beam/transforms/core.py ## @@ -1811,6 +1811,8 @@ def add_input_types(transform): return view else: if pcoll.windowing.windowfn != GlobalWindows(): +# Remove the broken transform when running into value error. +pcoll.pipeline.transforms_stack.pop() Review comment: Yes, agree with it! I'll make the change. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407255) Time Spent: 8h 50m (was: 8h 40m) > Interactive Beam > > > Key: BEAM-7923 > URL: https://issues.apache.org/jira/browse/BEAM-7923 > Project: Beam > Issue Type: New Feature > Components: runner-py-interactive >Reporter: Ning Kang >Assignee: Ning Kang >Priority: Major > Time Spent: 8h 50m > Remaining Estimate: 0h > > This is the top level ticket for all efforts leveraging [interactive > Beam|[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/interactive]] > As the development goes, blocking tickets will be added to this one. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407257=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407257 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 21:07 Start Date: 20/Mar/20 21:07 Worklog Time Spent: 10m Work Description: ajamato commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#discussion_r395887788 ## File path: model/pipeline/src/main/proto/metrics.proto ## @@ -139,7 +137,7 @@ message MonitoringInfoSpecs { USER_DISTRIBUTION_COUNTER = 6 [(monitoring_info_spec) = { urn: "beam:metric:user_distribution", - type_urn: "beam:metrics:distribution_int_64", + type_urn: "beam:metrics:distribution_int64", Review comment: Add a :v1 here This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407257) Time Spent: 19.5h (was: 19h 20m) > Update existing metrics in the FN API to use new Metric Schema > -- > > Key: BEAM-4374 > URL: https://issues.apache.org/jira/browse/BEAM-4374 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Alex Amato >Priority: Major > Time Spent: 19.5h > Remaining Estimate: 0h > > Update existing metrics to use the new proto and cataloging schema defined in: > [_https://s.apache.org/beam-fn-api-metrics_] > * Check in new protos > * Define catalog file for metrics > * Port existing metrics to use this new format, based on catalog > names+metadata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8078) streaming_wordcount_debugging.py is missing a test
[ https://issues.apache.org/jira/browse/BEAM-8078?focusedWorklogId=407253=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407253 ] ASF GitHub Bot logged work on BEAM-8078: Author: ASF GitHub Bot Created on: 20/Mar/20 21:06 Start Date: 20/Mar/20 21:06 Worklog Time Spent: 10m Work Description: Tesio commented on issue #10914: [BEAM-8078] streaming_wordcount_debugging.py is missing a test URL: https://github.com/apache/beam/pull/10914#issuecomment-601910749 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407253) Time Spent: 3h 40m (was: 3.5h) > streaming_wordcount_debugging.py is missing a test > -- > > Key: BEAM-8078 > URL: https://issues.apache.org/jira/browse/BEAM-8078 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Udi Meiri >Assignee: Aleksey Vysotin >Priority: Minor > Labels: beginner, easy, newbie, starter > Time Spent: 3h 40m > Remaining Estimate: 0h > > It's example code and should have a basic_test (like the other wordcount > variants in [1]) to at least verify that it runs in the latest Beam release. > [1] > https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9558) Make end of data channel explicit
[ https://issues.apache.org/jira/browse/BEAM-9558?focusedWorklogId=407252=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407252 ] ASF GitHub Bot logged work on BEAM-9558: Author: ASF GitHub Bot Created on: 20/Mar/20 21:06 Start Date: 20/Mar/20 21:06 Worklog Time Spent: 10m Work Description: robertwb commented on issue #11173: [BEAM-9558] Add explicit end bit for data channel. URL: https://github.com/apache/beam/pull/11173#issuecomment-601910742 Rebased on #11177 and updated the proto for timers. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407252) Time Spent: 40m (was: 0.5h) > Make end of data channel explicit > - > > Key: BEAM-9558 > URL: https://issues.apache.org/jira/browse/BEAM-9558 > Project: Beam > Issue Type: Improvement > Components: beam-model >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Fix For: 2.21.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Currently the end of a data channel is implicitly marked by sending an empty > data block. The protocol would be simplified by making this explicit, and it > would also prevent data loss bugs that might occur by accidentally sending an > empty block. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package
[ https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407251=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407251 ] ASF GitHub Bot logged work on BEAM-9537: Author: ASF GitHub Bot Created on: 20/Mar/20 21:05 Start Date: 20/Mar/20 21:05 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11153: [BEAM-9537] Adding a new module for FnApiRunner URL: https://github.com/apache/beam/pull/11153#discussion_r395889275 ## File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py ## @@ -269,8 +269,8 @@ def visit_transform(self, transform_node): pcoll.element_type, transform_node.full_label) key_type, value_type = pcoll.element_type.tuple_types if transform_node.outputs: -from apache_beam.runners.portability.fn_api_runner_transforms import \ - only_element +from apache_beam.runners.portability.fn_api_runner.transforms \ Review comment: You mean the newly named `translations` module may need to exist in `apache_beam/utils`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407251) Time Spent: 2h (was: 1h 50m) > Refactor FnApiRunner into its own package > - > > Key: BEAM-9537 > URL: https://issues.apache.org/jira/browse/BEAM-9537 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Minor > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package
[ https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407249=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407249 ] ASF GitHub Bot logged work on BEAM-9537: Author: ASF GitHub Bot Created on: 20/Mar/20 21:04 Start Date: 20/Mar/20 21:04 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11153: [BEAM-9537] Adding a new module for FnApiRunner URL: https://github.com/apache/beam/pull/11153#discussion_r395888936 ## File path: sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner.py ## @@ -78,12 +78,12 @@ from apache_beam.runners import pipeline_context from apache_beam.runners import runner from apache_beam.runners.portability import artifact_service -from apache_beam.runners.portability import fn_api_runner_transforms from apache_beam.runners.portability import portable_metrics -from apache_beam.runners.portability.fn_api_runner_transforms import create_buffer_id -from apache_beam.runners.portability.fn_api_runner_transforms import only_element -from apache_beam.runners.portability.fn_api_runner_transforms import split_buffer_id -from apache_beam.runners.portability.fn_api_runner_transforms import unique_name +from apache_beam.runners.portability.fn_api_runner import transforms Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407249) Time Spent: 1h 40m (was: 1.5h) > Refactor FnApiRunner into its own package > - > > Key: BEAM-9537 > URL: https://issues.apache.org/jira/browse/BEAM-9537 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Minor > Time Spent: 1h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package
[ https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407250=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407250 ] ASF GitHub Bot logged work on BEAM-9537: Author: ASF GitHub Bot Created on: 20/Mar/20 21:04 Start Date: 20/Mar/20 21:04 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #11153: [BEAM-9537] Adding a new module for FnApiRunner URL: https://github.com/apache/beam/pull/11153#discussion_r395888936 ## File path: sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner.py ## @@ -78,12 +78,12 @@ from apache_beam.runners import pipeline_context from apache_beam.runners import runner from apache_beam.runners.portability import artifact_service -from apache_beam.runners.portability import fn_api_runner_transforms from apache_beam.runners.portability import portable_metrics -from apache_beam.runners.portability.fn_api_runner_transforms import create_buffer_id -from apache_beam.runners.portability.fn_api_runner_transforms import only_element -from apache_beam.runners.portability.fn_api_runner_transforms import split_buffer_id -from apache_beam.runners.portability.fn_api_runner_transforms import unique_name +from apache_beam.runners.portability.fn_api_runner import transforms Review comment: Done. Went with translations. But I also like optimizations... Maybe I'll rename if it becomes a little bit more optimizations than translations down the road. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407250) Time Spent: 1h 50m (was: 1h 40m) > Refactor FnApiRunner into its own package > - > > Key: BEAM-9537 > URL: https://issues.apache.org/jira/browse/BEAM-9537 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Minor > Time Spent: 1h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407237=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407237 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 21:00 Start Date: 20/Mar/20 21:00 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#issuecomment-601897399 CC: @ajamato @robertwb This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407237) Time Spent: 19h 20m (was: 19h 10m) > Update existing metrics in the FN API to use new Metric Schema > -- > > Key: BEAM-4374 > URL: https://issues.apache.org/jira/browse/BEAM-4374 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Alex Amato >Priority: Major > Time Spent: 19h 20m > Remaining Estimate: 0h > > Update existing metrics to use the new proto and cataloging schema defined in: > [_https://s.apache.org/beam-fn-api-metrics_] > * Check in new protos > * Define catalog file for metrics > * Port existing metrics to use this new format, based on catalog > names+metadata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package
[ https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407235=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407235 ] ASF GitHub Bot logged work on BEAM-9537: Author: ASF GitHub Bot Created on: 20/Mar/20 20:58 Start Date: 20/Mar/20 20:58 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #11153: [BEAM-9537] Adding a new module for FnApiRunner URL: https://github.com/apache/beam/pull/11153#discussion_r395885958 ## File path: sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner.py ## @@ -78,12 +78,12 @@ from apache_beam.runners import pipeline_context from apache_beam.runners import runner from apache_beam.runners.portability import artifact_service -from apache_beam.runners.portability import fn_api_runner_transforms from apache_beam.runners.portability import portable_metrics -from apache_beam.runners.portability.fn_api_runner_transforms import create_buffer_id -from apache_beam.runners.portability.fn_api_runner_transforms import only_element -from apache_beam.runners.portability.fn_api_runner_transforms import split_buffer_id -from apache_beam.runners.portability.fn_api_runner_transforms import unique_name +from apache_beam.runners.portability.fn_api_runner import transforms Review comment: Seeing this unqualified makes me realize that it's ambiguous with the notion of a PTransform (and the apache_beam.transforms package). Maybe we should call it `optimizations` or `translations`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407235) Time Spent: 1.5h (was: 1h 20m) > Refactor FnApiRunner into its own package > - > > Key: BEAM-9537 > URL: https://issues.apache.org/jira/browse/BEAM-9537 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9537) Refactor FnApiRunner into its own package
[ https://issues.apache.org/jira/browse/BEAM-9537?focusedWorklogId=407236=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407236 ] ASF GitHub Bot logged work on BEAM-9537: Author: ASF GitHub Bot Created on: 20/Mar/20 20:58 Start Date: 20/Mar/20 20:58 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #11153: [BEAM-9537] Adding a new module for FnApiRunner URL: https://github.com/apache/beam/pull/11153#discussion_r395884481 ## File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py ## @@ -269,8 +269,8 @@ def visit_transform(self, transform_node): pcoll.element_type, transform_node.full_label) key_type, value_type = pcoll.element_type.tuple_types if transform_node.outputs: -from apache_beam.runners.portability.fn_api_runner_transforms import \ - only_element +from apache_beam.runners.portability.fn_api_runner.transforms \ Review comment: You don't have to do this in this PR, but this should probably be in apache_beam/utils. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407236) Time Spent: 1.5h (was: 1h 20m) > Refactor FnApiRunner into its own package > - > > Key: BEAM-9537 > URL: https://issues.apache.org/jira/browse/BEAM-9537 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7923) Interactive Beam
[ https://issues.apache.org/jira/browse/BEAM-7923?focusedWorklogId=407231=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407231 ] ASF GitHub Bot logged work on BEAM-7923: Author: ASF GitHub Bot Created on: 20/Mar/20 20:49 Start Date: 20/Mar/20 20:49 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #11174: [BEAM-7923] Pop failed transform in CombineGlobally URL: https://github.com/apache/beam/pull/11174#discussion_r395883196 ## File path: sdks/python/apache_beam/transforms/core.py ## @@ -1811,6 +1811,8 @@ def add_input_types(transform): return view else: if pcoll.windowing.windowfn != GlobalWindows(): +# Remove the broken transform when running into value error. +pcoll.pipeline.transforms_stack.pop() Review comment: E.g. put this https://github.com/apache/beam/blob/release-2.19.0/sdks/python/apache_beam/pipeline.py#L330 in a finally clause of a try block that starts where it's pushed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407231) Time Spent: 8h 40m (was: 8.5h) > Interactive Beam > > > Key: BEAM-7923 > URL: https://issues.apache.org/jira/browse/BEAM-7923 > Project: Beam > Issue Type: New Feature > Components: runner-py-interactive >Reporter: Ning Kang >Assignee: Ning Kang >Priority: Major > Time Spent: 8h 40m > Remaining Estimate: 0h > > This is the top level ticket for all efforts leveraging [interactive > Beam|[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/interactive]] > As the development goes, blocking tickets will be added to this one. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-7923) Interactive Beam
[ https://issues.apache.org/jira/browse/BEAM-7923?focusedWorklogId=407229=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407229 ] ASF GitHub Bot logged work on BEAM-7923: Author: ASF GitHub Bot Created on: 20/Mar/20 20:47 Start Date: 20/Mar/20 20:47 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #11174: [BEAM-7923] Pop failed transform in CombineGlobally URL: https://github.com/apache/beam/pull/11174#discussion_r395882293 ## File path: sdks/python/apache_beam/transforms/core.py ## @@ -1811,6 +1811,8 @@ def add_input_types(transform): return view else: if pcoll.windowing.windowfn != GlobalWindows(): +# Remove the broken transform when running into value error. +pcoll.pipeline.transforms_stack.pop() Review comment: This is not the right place to pop this (internal) stack. Instead, we should popping from the stack in a finally clause of a try block that pushes to the stack. (Alternatively, we could manage the stack with a Python context, but that might be overkill.) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407229) Time Spent: 8.5h (was: 8h 20m) > Interactive Beam > > > Key: BEAM-7923 > URL: https://issues.apache.org/jira/browse/BEAM-7923 > Project: Beam > Issue Type: New Feature > Components: runner-py-interactive >Reporter: Ning Kang >Assignee: Ning Kang >Priority: Major > Time Spent: 8.5h > Remaining Estimate: 0h > > This is the top level ticket for all efforts leveraging [interactive > Beam|[https://github.com/apache/beam/tree/master/sdks/python/apache_beam/runners/interactive]] > As the development goes, blocking tickets will be added to this one. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?
[ https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407225=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407225 ] ASF GitHub Bot logged work on BEAM-9542: Author: ASF GitHub Bot Created on: 20/Mar/20 20:35 Start Date: 20/Mar/20 20:35 Worklog Time Spent: 10m Work Description: je-ik commented on issue #11168: [BEAM-9542] Limit and clarify the effect of "force" in Java build URL: https://github.com/apache/beam/pull/11168#issuecomment-601899334 Run Spark ValidatesRunner This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407225) Time Spent: 4h (was: 3h 50m) > Where the BeamModulePlugin's force is needed? > - > > Key: BEAM-9542 > URL: https://issues.apache.org/jira/browse/BEAM-9542 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Tomo Suzuki >Assignee: Tomo Suzuki >Priority: Major > Time Spent: 4h > Remaining Estimate: 0h > > Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735 > {noformat} > > Task :sdks:java:core:compileTestJava FAILED > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > Note: Some input files use or override a deprecated API. > Note: Recompile with -Xlint:deprecation for details. > 2 errors > <-> 36% EXECUTING [19m 37s] > {noformat} > Memo for my Mac: > {noformat} > suztomo-macbookpro44% ./gradlew -p sdks/java check -x > extensions:sql:zetasql:check -x harness:test -x io:jdbc:test -x > io:kafka:test -x io:solr:test -x core:test > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?
[ https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407223=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407223 ] ASF GitHub Bot logged work on BEAM-9542: Author: ASF GitHub Bot Created on: 20/Mar/20 20:35 Start Date: 20/Mar/20 20:35 Worklog Time Spent: 10m Work Description: je-ik commented on issue #11168: [BEAM-9542] Limit and clarify the effect of "force" in Java build URL: https://github.com/apache/beam/pull/11168#issuecomment-601899190 Run Java HadoopFormatIO Performance Test This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407223) Time Spent: 3h 40m (was: 3.5h) > Where the BeamModulePlugin's force is needed? > - > > Key: BEAM-9542 > URL: https://issues.apache.org/jira/browse/BEAM-9542 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Tomo Suzuki >Assignee: Tomo Suzuki >Priority: Major > Time Spent: 3h 40m > Remaining Estimate: 0h > > Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735 > {noformat} > > Task :sdks:java:core:compileTestJava FAILED > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > Note: Some input files use or override a deprecated API. > Note: Recompile with -Xlint:deprecation for details. > 2 errors > <-> 36% EXECUTING [19m 37s] > {noformat} > Memo for my Mac: > {noformat} > suztomo-macbookpro44% ./gradlew -p sdks/java check -x > extensions:sql:zetasql:check -x harness:test -x io:jdbc:test -x > io:kafka:test -x io:solr:test -x core:test > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?
[ https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407224=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407224 ] ASF GitHub Bot logged work on BEAM-9542: Author: ASF GitHub Bot Created on: 20/Mar/20 20:35 Start Date: 20/Mar/20 20:35 Worklog Time Spent: 10m Work Description: je-ik commented on issue #11168: [BEAM-9542] Limit and clarify the effect of "force" in Java build URL: https://github.com/apache/beam/pull/11168#issuecomment-601899254 Run Dataflow ValidatesRunner This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407224) Time Spent: 3h 50m (was: 3h 40m) > Where the BeamModulePlugin's force is needed? > - > > Key: BEAM-9542 > URL: https://issues.apache.org/jira/browse/BEAM-9542 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Tomo Suzuki >Assignee: Tomo Suzuki >Priority: Major > Time Spent: 3h 50m > Remaining Estimate: 0h > > Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735 > {noformat} > > Task :sdks:java:core:compileTestJava FAILED > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > Note: Some input files use or override a deprecated API. > Note: Recompile with -Xlint:deprecation for details. > 2 errors > <-> 36% EXECUTING [19m 37s] > {noformat} > Memo for my Mac: > {noformat} > suztomo-macbookpro44% ./gradlew -p sdks/java check -x > extensions:sql:zetasql:check -x harness:test -x io:jdbc:test -x > io:kafka:test -x io:solr:test -x core:test > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?
[ https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407226=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407226 ] ASF GitHub Bot logged work on BEAM-9542: Author: ASF GitHub Bot Created on: 20/Mar/20 20:35 Start Date: 20/Mar/20 20:35 Worklog Time Spent: 10m Work Description: je-ik commented on issue #11168: [BEAM-9542] Limit and clarify the effect of "force" in Java build URL: https://github.com/apache/beam/pull/11168#issuecomment-601899388 Run SQL Postcommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407226) Time Spent: 4h 10m (was: 4h) > Where the BeamModulePlugin's force is needed? > - > > Key: BEAM-9542 > URL: https://issues.apache.org/jira/browse/BEAM-9542 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Tomo Suzuki >Assignee: Tomo Suzuki >Priority: Major > Time Spent: 4h 10m > Remaining Estimate: 0h > > Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735 > {noformat} > > Task :sdks:java:core:compileTestJava FAILED > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > Note: Some input files use or override a deprecated API. > Note: Recompile with -Xlint:deprecation for details. > 2 errors > <-> 36% EXECUTING [19m 37s] > {noformat} > Memo for my Mac: > {noformat} > suztomo-macbookpro44% ./gradlew -p sdks/java check -x > extensions:sql:zetasql:check -x harness:test -x io:jdbc:test -x > io:kafka:test -x io:solr:test -x core:test > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?
[ https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407222=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407222 ] ASF GitHub Bot logged work on BEAM-9542: Author: ASF GitHub Bot Created on: 20/Mar/20 20:34 Start Date: 20/Mar/20 20:34 Worklog Time Spent: 10m Work Description: je-ik commented on issue #11168: [BEAM-9542] Limit and clarify the effect of "force" in Java build URL: https://github.com/apache/beam/pull/11168#issuecomment-601899121 Run Java PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407222) Time Spent: 3.5h (was: 3h 20m) > Where the BeamModulePlugin's force is needed? > - > > Key: BEAM-9542 > URL: https://issues.apache.org/jira/browse/BEAM-9542 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Tomo Suzuki >Assignee: Tomo Suzuki >Priority: Major > Time Spent: 3.5h > Remaining Estimate: 0h > > Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735 > {noformat} > > Task :sdks:java:core:compileTestJava FAILED > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > Note: Some input files use or override a deprecated API. > Note: Recompile with -Xlint:deprecation for details. > 2 errors > <-> 36% EXECUTING [19m 37s] > {noformat} > Memo for my Mac: > {noformat} > suztomo-macbookpro44% ./gradlew -p sdks/java check -x > extensions:sql:zetasql:check -x harness:test -x io:jdbc:test -x > io:kafka:test -x io:solr:test -x core:test > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407219=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407219 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 20:30 Start Date: 20/Mar/20 20:30 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184 Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] Update `CHANGES.md` with noteworthy changes. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python |
[jira] [Work logged] (BEAM-4374) Update existing metrics in the FN API to use new Metric Schema
[ https://issues.apache.org/jira/browse/BEAM-4374?focusedWorklogId=407220=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407220 ] ASF GitHub Bot logged work on BEAM-4374: Author: ASF GitHub Bot Created on: 20/Mar/20 20:30 Start Date: 20/Mar/20 20:30 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #11184: [WIP][BEAM-4374] Update protos related to MonitoringInfo. URL: https://github.com/apache/beam/pull/11184#issuecomment-601897399 CC: @ajamato This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407220) Time Spent: 19h 10m (was: 19h) > Update existing metrics in the FN API to use new Metric Schema > -- > > Key: BEAM-4374 > URL: https://issues.apache.org/jira/browse/BEAM-4374 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Alex Amato >Priority: Major > Time Spent: 19h 10m > Remaining Estimate: 0h > > Update existing metrics to use the new proto and cataloging schema defined in: > [_https://s.apache.org/beam-fn-api-metrics_] > * Check in new protos > * Define catalog file for metrics > * Port existing metrics to use this new format, based on catalog > names+metadata -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?
[ https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407217=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407217 ] ASF GitHub Bot logged work on BEAM-9542: Author: ASF GitHub Bot Created on: 20/Mar/20 20:22 Start Date: 20/Mar/20 20:22 Worklog Time Spent: 10m Work Description: suztomo commented on issue #11168: [BEAM-9542] Limit and clarify the effect of "force" in Java build URL: https://github.com/apache/beam/pull/11168#issuecomment-601894622 Yes, please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407217) Time Spent: 3h 20m (was: 3h 10m) > Where the BeamModulePlugin's force is needed? > - > > Key: BEAM-9542 > URL: https://issues.apache.org/jira/browse/BEAM-9542 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Tomo Suzuki >Assignee: Tomo Suzuki >Priority: Major > Time Spent: 3h 20m > Remaining Estimate: 0h > > Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735 > {noformat} > > Task :sdks:java:core:compileTestJava FAILED > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > Note: Some input files use or override a deprecated API. > Note: Recompile with -Xlint:deprecation for details. > 2 errors > <-> 36% EXECUTING [19m 37s] > {noformat} > Memo for my Mac: > {noformat} > suztomo-macbookpro44% ./gradlew -p sdks/java check -x > extensions:sql:zetasql:check -x harness:test -x io:jdbc:test -x > io:kafka:test -x io:solr:test -x core:test > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9542) Where the BeamModulePlugin's force is needed?
[ https://issues.apache.org/jira/browse/BEAM-9542?focusedWorklogId=407216=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407216 ] ASF GitHub Bot logged work on BEAM-9542: Author: ASF GitHub Bot Created on: 20/Mar/20 20:21 Start Date: 20/Mar/20 20:21 Worklog Time Spent: 10m Work Description: je-ik commented on issue #11168: [BEAM-9542] Limit and clarify the effect of "force" in Java build URL: https://github.com/apache/beam/pull/11168#issuecomment-601894197 @suztomo do you need to rerun the checks that were not triggered? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407216) Time Spent: 3h 10m (was: 3h) > Where the BeamModulePlugin's force is needed? > - > > Key: BEAM-9542 > URL: https://issues.apache.org/jira/browse/BEAM-9542 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Tomo Suzuki >Assignee: Tomo Suzuki >Priority: Major > Time Spent: 3h 10m > Remaining Estimate: 0h > > Followup of https://github.com/apache/beam/pull/11156#discussion_r394408735 > {noformat} > > Task :sdks:java:core:compileTestJava FAILED > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/DeduplicateTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > /Users/suztomo/beam/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/splittabledofn/WatermarkEstimatorsTest.java:21: > error: cannot find symbol > import static org.junit.Assert.assertThrows; > ^ > symbol: static assertThrows > location: class > Note: Some input files use or override a deprecated API. > Note: Recompile with -Xlint:deprecation for details. > 2 errors > <-> 36% EXECUTING [19m 37s] > {noformat} > Memo for my Mac: > {noformat} > suztomo-macbookpro44% ./gradlew -p sdks/java check -x > extensions:sql:zetasql:check -x harness:test -x io:jdbc:test -x > io:kafka:test -x io:solr:test -x core:test > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images
[ https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=407213=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407213 ] ASF GitHub Bot logged work on BEAM-9136: Author: ASF GitHub Bot Created on: 20/Mar/20 20:08 Start Date: 20/Mar/20 20:08 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on issue #11067: [BEAM-9136]Add licenses for dependencies URL: https://github.com/apache/beam/pull/11067#issuecomment-601889129 @tvalentyn , I fixed all your comments about Python. PTAL when you have time. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407213) Time Spent: 2h 50m (was: 2h 40m) > Add LICENSES and NOTICES to docker images > - > > Key: BEAM-9136 > URL: https://issues.apache.org/jira/browse/BEAM-9136 > Project: Beam > Issue Type: Task > Components: build-system >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Time Spent: 2h 50m > Remaining Estimate: 0h > > Scan dependencies and add licenses and notices of the dependencies to SDK > docker images. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9136) Add LICENSES and NOTICES to docker images
[ https://issues.apache.org/jira/browse/BEAM-9136?focusedWorklogId=407212=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407212 ] ASF GitHub Bot logged work on BEAM-9136: Author: ASF GitHub Bot Created on: 20/Mar/20 20:07 Start Date: 20/Mar/20 20:07 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on pull request #11067: [BEAM-9136]Add licenses for dependencies URL: https://github.com/apache/beam/pull/11067#discussion_r395865763 ## File path: licenses/scripts/pull_licenses_py.py ## @@ -0,0 +1,157 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +#http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +"""A script to pull licenses for Python. +""" +import json +import os +import shutil +import subprocess +import sys +import yaml + +from tenacity import retry +from tenacity import stop_after_attempt + + +def run_bash_command(command): + process = subprocess.Popen(command.split(), + stdout=subprocess.PIPE, + stderr=subprocess.PIPE) + result, error = process.communicate() + if error: +raise RuntimeError('Error occurred when running a bash command.', + 'command: ', command, 'error message: ', + error.decode('utf-8')) + return result.decode('utf-8') + + +try: + import wget +except: + command = 'pip install wget --no-cache-dir' + run_bash_command(command) + import wget + + +def install_pip_licenses(): + command = 'pip install pip-licenses --no-cache-dir' + run_bash_command(command) + + +def run_pip_licenses(): + command = 'pip-licenses --with-license-file --format=json' + dependencies = run_bash_command(command) + return json.loads(dependencies) + + +@retry(stop=stop_after_attempt(3)) +def copy_license_files(dep): + source_license_file = dep['LicenseFile'] + if source_license_file.lower() == 'unknown': +return False + name = dep['Name'] + dest_dir = '/'.join([license_dir, name.lower()]) + try: +if not os.path.isdir(dest_dir): + os.mkdir(dest_dir) +shutil.copy(source_license_file, dest_dir + '/LICENSE') +return True + except Exception as e: +print(e) +return False + + +@retry(stop=stop_after_attempt(3)) +def pull_from_url(dep, configs): + ''' + :param dep: name of a dependency + :param configs: a dict from dep_urls_py.yaml + :return: boolean + + It downloads files form urls to a temp directory first in order to avoid + to deal with any temp files. It helps keep clean final directory. + ''' + if dep in configs.keys(): +config = configs[dep] +dest_dir = '/'.join([license_dir, dep]) +cur_temp_dir = 'temp_license_' + dep +os.mkdir(cur_temp_dir) +try: + is_file_available = False + # license is required, but not all dependencies have license. + # In case we have to skip, print out a message. + if config['license'] != 'skip': +wget.download(config['license'], cur_temp_dir + '/LICENSE') +is_file_available = True + # notice is optional. + if 'notice' in config: +wget.download(config['notice'], cur_temp_dir + '/NOTICE') +is_file_available = True + # copy from temp dir to final dir only when either file is abailable. + if is_file_available: +if os.path.isdir(dest_dir): + shutil.rmtree(dest_dir) +shutil.copytree(cur_temp_dir, dest_dir) + result = True +except Exception as e: + print('Error occurred when pull license from url.', 'dependency =', +dep, 'url =', config, 'error = ', e.decode('utf-8')) + result = False +finally: + shutil.rmtree(cur_temp_dir) + return result + else: +return False + + +if __name__ == "__main__": + cur_dir = os.getcwd() + if cur_dir.split('/')[-1] != 'beam': +raise RuntimeError('This script should run from ~/beam directory.') + license_dir = os.getcwd() + '/licenses/python' + no_licenses = [] + + with open('licenses/scripts/dep_urls_py.yaml') as file: +dep_config = yaml.load(file) + + install_pip_licenses() Review comment: The script is running at a virtual environment with [apache-beam, test, gcp, docs,
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407195=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407195 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 19:32 Start Date: 20/Mar/20 19:32 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11179: [BEAM-3301] Bugfix in DoFn validation. URL: https://github.com/apache/beam/pull/11179#discussion_r395850295 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -446,23 +444,16 @@ func validateMainInputs(fn *Fn, method *funcx.Fn, methodName string, numMainIn m return err } - // Check that the first numMainIn inputs are not side inputs (Iters or - // ReIters). We aren't able to catch singleton side inputs here since - // they're indistinguishable from main inputs. - mainInputs := method.Param[pos : pos+int(numMainIn)] - for i, p := range mainInputs { - if p.Kind != funcx.FnValue { - err := errors.Errorf("expected main input parameter but found "+ - "side input parameter in position %v", - pos+i) - err = errors.SetTopLevelMsgf(err, - "Method %v in DoFn %v should have all main inputs before side inputs, "+ - "but a side input (as Iter or ReIter) appears as parameter %v when a "+ - "main input was expected.", - methodName, fn.Name(), pos+i) - err = errors.WithContextf(err, "method %v", methodName) - return err - } + // Check that the first input is not an Iter or ReIter (those aren't valid + // as the first main input). + first := method.Param[pos].Kind + if first != funcx.FnValue { + err := errors.New("first main input parameter must be value type") Review comment: I'll just add it in real quick while squashing the commits. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407195) Time Spent: 8h (was: 7h 50m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 8h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9340) Properly populate pipeline proto requirements.
[ https://issues.apache.org/jira/browse/BEAM-9340?focusedWorklogId=407190=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407190 ] ASF GitHub Bot logged work on BEAM-9340: Author: ASF GitHub Bot Created on: 20/Mar/20 19:22 Start Date: 20/Mar/20 19:22 Worklog Time Spent: 10m Work Description: robertwb commented on issue #11165: [BEAM-9340] Populate requirements for Java. URL: https://github.com/apache/beam/pull/11165#issuecomment-601872470 I rebased on top of your PR. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407190) Time Spent: 1h 40m (was: 1.5h) > Properly populate pipeline proto requirements. > -- > > Key: BEAM-9340 > URL: https://issues.apache.org/jira/browse/BEAM-9340 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Fix For: 2.20.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9340) Properly populate pipeline proto requirements.
[ https://issues.apache.org/jira/browse/BEAM-9340?focusedWorklogId=407189=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407189 ] ASF GitHub Bot logged work on BEAM-9340: Author: ASF GitHub Bot Created on: 20/Mar/20 19:22 Start Date: 20/Mar/20 19:22 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #11165: [BEAM-9340] Populate requirements for Java. URL: https://github.com/apache/beam/pull/11165#discussion_r395845584 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java ## @@ -124,6 +124,15 @@ public static final String SPLITTABLE_PROCESS_SIZED_ELEMENTS_AND_RESTRICTIONS_URN = "beam:transform:sdf_process_sized_element_and_restrictions:v1"; + public static final String REQUIRES_STATEFUL_PROCESSING_URN = + getUrn(RunnerApi.StandardRequirements.Enum.REQUIRES_STATEFUL_PROCESSING); Review comment: That's unfortunate, but I see the pattern. Have to be vigilant to prevent bugs. (Unlikely that these'll be used in switch statements, but consistency is good.) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407189) Time Spent: 1.5h (was: 1h 20m) > Properly populate pipeline proto requirements. > -- > > Key: BEAM-9340 > URL: https://issues.apache.org/jira/browse/BEAM-9340 > Project: Beam > Issue Type: New Feature > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Fix For: 2.20.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407186=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407186 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 19:12 Start Date: 20/Mar/20 19:12 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11179: [BEAM-3301] Bugfix in DoFn validation. URL: https://github.com/apache/beam/pull/11179#discussion_r395840270 ## File path: sdks/go/pkg/beam/core/graph/fn.go ## @@ -446,23 +444,16 @@ func validateMainInputs(fn *Fn, method *funcx.Fn, methodName string, numMainIn m return err } - // Check that the first numMainIn inputs are not side inputs (Iters or - // ReIters). We aren't able to catch singleton side inputs here since - // they're indistinguishable from main inputs. - mainInputs := method.Param[pos : pos+int(numMainIn)] - for i, p := range mainInputs { - if p.Kind != funcx.FnValue { - err := errors.Errorf("expected main input parameter but found "+ - "side input parameter in position %v", - pos+i) - err = errors.SetTopLevelMsgf(err, - "Method %v in DoFn %v should have all main inputs before side inputs, "+ - "but a side input (as Iter or ReIter) appears as parameter %v when a "+ - "main input was expected.", - methodName, fn.Name(), pos+i) - err = errors.WithContextf(err, "method %v", methodName) - return err - } + // Check that the first input is not an Iter or ReIter (those aren't valid + // as the first main input). + first := method.Param[pos].Kind + if first != funcx.FnValue { + err := errors.New("first main input parameter must be value type") Review comment: ...must be a value type.. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407186) Time Spent: 7h 50m (was: 7h 40m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 7h 50m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407185=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407185 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 19:06 Start Date: 20/Mar/20 19:06 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #11179: [BEAM-3301] Bugfix in DoFn validation. URL: https://github.com/apache/beam/pull/11179#discussion_r395838378 ## File path: sdks/go/pkg/beam/pcollection.go ## @@ -60,6 +60,12 @@ func (p PCollection) Type() FullType { return p.n.Type() } +// OutputsKV returns whether the output of this PCollection are single value +// elements or KV pairs. +func (p PCollection) OutputsKV() bool { Review comment: That's my usual guideline. If I use it once, keep it in place; twice, copy it; three times, helper function. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407185) Time Spent: 7h 40m (was: 7.5h) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 7h 40m > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage
[ https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=407183=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407183 ] ASF GitHub Bot logged work on BEAM-8889: Author: ASF GitHub Bot Created on: 20/Mar/20 19:02 Start Date: 20/Mar/20 19:02 Worklog Time Spent: 10m Work Description: chamikaramj commented on issue #11183: [BEAM-8889]add experiment flag use_grpc_for_gcs URL: https://github.com/apache/beam/pull/11183#issuecomment-601864385 Retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407183) Remaining Estimate: 148h 20m (was: 148.5h) Time Spent: 19h 40m (was: 19.5h) > Make GcsUtil use GoogleCloudStorage > --- > > Key: BEAM-8889 > URL: https://issues.apache.org/jira/browse/BEAM-8889 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Affects Versions: 2.16.0 >Reporter: Esun Kim >Assignee: VASU NORI >Priority: Major > Labels: gcs > Original Estimate: 168h > Time Spent: 19h 40m > Remaining Estimate: 148h 20m > > [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java] > is a primary class to access Google Cloud Storage on Apache Beam. Current > implementation directly creates GoogleCloudStorageReadChannel and > GoogleCloudStorageWriteChannel by itself to read and write GCS data rather > than using > [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java] > which is an abstract class providing basic IO capability which eventually > creates channel objects. This request is about updating GcsUtil to use > GoogleCloudStorage to create read and write channel, which is expected > flexible because it can easily pick up the new change; e.g. new channel > implementation using new protocol without code change. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage
[ https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=407184=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407184 ] ASF GitHub Bot logged work on BEAM-8889: Author: ASF GitHub Bot Created on: 20/Mar/20 19:02 Start Date: 20/Mar/20 19:02 Worklog Time Spent: 10m Work Description: chamikaramj commented on issue #11183: [BEAM-8889]add experiment flag use_grpc_for_gcs URL: https://github.com/apache/beam/pull/11183#issuecomment-601864474 LGTM. Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407184) Remaining Estimate: 148h 10m (was: 148h 20m) Time Spent: 19h 50m (was: 19h 40m) > Make GcsUtil use GoogleCloudStorage > --- > > Key: BEAM-8889 > URL: https://issues.apache.org/jira/browse/BEAM-8889 > Project: Beam > Issue Type: Improvement > Components: io-java-gcp >Affects Versions: 2.16.0 >Reporter: Esun Kim >Assignee: VASU NORI >Priority: Major > Labels: gcs > Original Estimate: 168h > Time Spent: 19h 50m > Remaining Estimate: 148h 10m > > [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java] > is a primary class to access Google Cloud Storage on Apache Beam. Current > implementation directly creates GoogleCloudStorageReadChannel and > GoogleCloudStorageWriteChannel by itself to read and write GCS data rather > than using > [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java] > which is an abstract class providing basic IO capability which eventually > creates channel objects. This request is about updating GcsUtil to use > GoogleCloudStorage to create read and write channel, which is expected > flexible because it can easily pick up the new change; e.g. new channel > implementation using new protocol without code change. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9563) TFRecordIO inefficient read from sideinput causing pipeline to be slow - fix
[ https://issues.apache.org/jira/browse/BEAM-9563?focusedWorklogId=407182=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407182 ] ASF GitHub Bot logged work on BEAM-9563: Author: ASF GitHub Bot Created on: 20/Mar/20 19:02 Start Date: 20/Mar/20 19:02 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #11180: [BEAM-9563] Change ToListCombineFn access level to private URL: https://github.com/apache/beam/pull/11180#issuecomment-601864250 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407182) Time Spent: 0.5h (was: 20m) > TFRecordIO inefficient read from sideinput causing pipeline to be slow - fix > > > Key: BEAM-9563 > URL: https://issues.apache.org/jira/browse/BEAM-9563 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: Not applicable >Reporter: Piotr Szuberski >Assignee: Piotr Szuberski >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > The class ToListCombineFn in the previous task has public access level but > can be private. > sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java:420 > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9383) Staging Dataflow artifacts from environment
[ https://issues.apache.org/jira/browse/BEAM-9383?focusedWorklogId=407181=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407181 ] ASF GitHub Bot logged work on BEAM-9383: Author: ASF GitHub Bot Created on: 20/Mar/20 19:01 Start Date: 20/Mar/20 19:01 Worklog Time Spent: 10m Work Description: chamikaramj commented on issue #11039: [BEAM-9383] Staging Dataflow artifacts from environment URL: https://github.com/apache/beam/pull/11039#issuecomment-601864064 Retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407181) Time Spent: 2h (was: 1h 50m) > Staging Dataflow artifacts from environment > --- > > Key: BEAM-9383 > URL: https://issues.apache.org/jira/browse/BEAM-9383 > Project: Beam > Issue Type: Sub-task > Components: java-fn-execution >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > Staging Dataflow artifacts from environment -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-3301) Go SplittableDoFn support
[ https://issues.apache.org/jira/browse/BEAM-3301?focusedWorklogId=407180=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407180 ] ASF GitHub Bot logged work on BEAM-3301: Author: ASF GitHub Bot Created on: 20/Mar/20 19:01 Start Date: 20/Mar/20 19:01 Worklog Time Spent: 10m Work Description: youngoli commented on pull request #11179: [BEAM-3301] Bugfix in DoFn validation. URL: https://github.com/apache/beam/pull/11179#discussion_r395836071 ## File path: sdks/go/pkg/beam/pcollection.go ## @@ -60,6 +60,12 @@ func (p PCollection) Type() FullType { return p.n.Type() } +// OutputsKV returns whether the output of this PCollection are single value +// elements or KV pairs. +func (p PCollection) OutputsKV() bool { Review comment: I was originally picturing this as a helper function for callers of NewDoFn. It seems easy for future callers to make a mistake and only check if the PCollection is a KV and forget to check for CoGBK, so I thought a helper method would be useful in the future. 1. I missed that pardo.go is in the same package as pcollection.go. I'm also leaning to not expanding the user surface if it's not necessary. 2 & 3. Yeah I was unsure about the name, since it's not technically checking for KVs, I just couldn't think of anything better. IsKeyed sounds good though. 4. That's the other part I was debating. My goal was to make it easy to avoid the mistake in the future, but thinking about it... It seems unlikely that anyone else would even be using this code, and I would expect that if they were they were an advanced user doing something tricky. I think I'll go with just putting the conditional in pardo.go and adding a comment. We can always turn it into a helper later if it does get used in multiple places. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407180) Time Spent: 7.5h (was: 7h 20m) > Go SplittableDoFn support > - > > Key: BEAM-3301 > URL: https://issues.apache.org/jira/browse/BEAM-3301 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Henning Rohde >Assignee: Daniel Oliveira >Priority: Major > Time Spent: 7.5h > Remaining Estimate: 0h > > SDFs will be the only way to add streaming and liquid sharded IO for Go. > Design doc: https://s.apache.org/splittable-do-fn -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9339) Declare capabilities in SDK environments
[ https://issues.apache.org/jira/browse/BEAM-9339?focusedWorklogId=407176=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407176 ] ASF GitHub Bot logged work on BEAM-9339: Author: ASF GitHub Bot Created on: 20/Mar/20 18:55 Start Date: 20/Mar/20 18:55 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #11162: [BEAM-9339, BEAM-2939] Drop splittable field from ParDoPayload, add splittable dofn requirement to Python SDK. URL: https://github.com/apache/beam/pull/11162 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407176) Time Spent: 7h 10m (was: 7h) > Declare capabilities in SDK environments > > > Key: BEAM-9339 > URL: https://issues.apache.org/jira/browse/BEAM-9339 > Project: Beam > Issue Type: New Feature > Components: sdk-go, sdk-java-harness, sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Time Spent: 7h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9558) Make end of data channel explicit
[ https://issues.apache.org/jira/browse/BEAM-9558?focusedWorklogId=407169=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407169 ] ASF GitHub Bot logged work on BEAM-9558: Author: ASF GitHub Bot Created on: 20/Mar/20 18:48 Start Date: 20/Mar/20 18:48 Worklog Time Spent: 10m Work Description: robertwb commented on issue #11173: [BEAM-9558] Add explicit end bit for data channel. URL: https://github.com/apache/beam/pull/11173#issuecomment-601858195 So I spent quite a bit of time refactoring things to do the separate "end all streams for this bundle" proposal that we discussed, and it turns out that gets quite messy. (One of the things that makes this less nice is, as things stand, there may be multiple data stream endpoints for a single bundle. The need to preserve ordering between the data and end markers across threads other than those handling the grpc request, and the fact that buffering streams need to flush so closing them all needs to happen anyway, reduces the potential wins.) Combined with the fact that it gets even uglier to try to support both modes simultaneously during any transition period, I'm just going to go for this simpler, original fix for now. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407169) Time Spent: 0.5h (was: 20m) > Make end of data channel explicit > - > > Key: BEAM-9558 > URL: https://issues.apache.org/jira/browse/BEAM-9558 > Project: Beam > Issue Type: Improvement > Components: beam-model >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Major > Fix For: 2.21.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently the end of a data channel is implicitly marked by sending an empty > data block. The protocol would be simplified by making this explicit, and it > would also prevent data loss bugs that might occur by accidentally sending an > empty block. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable
[ https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=407167=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407167 ] ASF GitHub Bot logged work on BEAM-8280: Author: ASF GitHub Bot Created on: 20/Mar/20 18:42 Start Date: 20/Mar/20 18:42 Worklog Time Spent: 10m Work Description: udim commented on pull request #11070: [BEAM-8280] Blog post: Python typing changes URL: https://github.com/apache/beam/pull/11070#discussion_r395826290 ## File path: website/src/_posts/2020-03-06-python-typing.md ## @@ -0,0 +1,117 @@ +--- +layout: post +title: "Python SDK Typing Changes" +date: 2020-03-06 00:00:01 -0800 +excerpt_separator: +categories: blog python typing +authors: + - chadrik + - udim + +--- + + +TODO excerpt + + + +Python supports type annotations on functions (PEP 484). Static type checkers, +such as mypy, are used to verify adherence to these types. +For example: +```py +def f(v: int) -> int: + return v[0] +``` +Running mypy on the above code will give the error: +`Value of type "int" is not indexable`. + +We've recently made changes to Beam in 2 areas: + +Adding type hints throughout Beam. TODO expand + +Second, we've added support for Python 3 type annotations. This allows SDK +users to specify a DoFn's type hints in one place. +We've also expanded Beam's support of `typing` module types. + +For more background see: +[Ensuring Python Type Safety](https://beam.apache.org/documentation/sdks/python-type-safety/). + +# Beam Is Typed + +TODO Review comment: So, no immediate rush. We are also adjusting to WFH This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407167) Time Spent: 9h 50m (was: 9h 40m) > re-enable IOTypeHints.from_callable > --- > > Key: BEAM-8280 > URL: https://issues.apache.org/jira/browse/BEAM-8280 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Time Spent: 9h 50m > Remaining Estimate: 0h > > See https://issues.apache.org/jira/browse/BEAM-8279 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9562) Remove timer from PCollection and treat timers as Elements
[ https://issues.apache.org/jira/browse/BEAM-9562?focusedWorklogId=407164=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-407164 ] ASF GitHub Bot logged work on BEAM-9562: Author: ASF GitHub Bot Created on: 20/Mar/20 18:37 Start Date: 20/Mar/20 18:37 Worklog Time Spent: 10m Work Description: boyuanzz commented on pull request #11177: [BEAM-9562] Add Timer to Elements proto representation. URL: https://github.com/apache/beam/pull/11177 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 407164) Time Spent: 1h (was: 50m) > Remove timer from PCollection and treat timers as Elements > --- > > Key: BEAM-9562 > URL: https://issues.apache.org/jira/browse/BEAM-9562 > Project: Beam > Issue Type: New Feature > Components: sdk-py-harness >Reporter: Boyuan Zhang >Assignee: Boyuan Zhang >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)