[jira] [Assigned] (BEAM-9289) Improve performance for metrics update of samza runner

2020-02-10 Thread Yixing Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yixing Zhang reassigned BEAM-9289:
--

Assignee: Yixing Zhang

> Improve performance for metrics update of samza runner
> --
>
> Key: BEAM-9289
> URL: https://issues.apache.org/jira/browse/BEAM-9289
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-samza
>Reporter: Yixing Zhang
>Assignee: Yixing Zhang
>Priority: Major
>
> * Reduce metrics update per do function from 4 -> 1.
>  * Query for the metrics for the current step to update.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?focusedWorklogId=384985=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384985
 ]

ASF GitHub Bot logged work on BEAM-8561:


Author: ASF GitHub Bot
Created on: 11/Feb/20 06:08
Start Date: 11/Feb/20 06:08
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #10290: [BEAM-8561] 
Add ThriftIO to support IO for Thrift files
URL: https://github.com/apache/beam/pull/10290#discussion_r377457203
 
 

 ##
 File path: 
sdks/java/io/thrift/src/test/java/org/apache/beam/sdk/io/thrift/ThriftIOTest.java
 ##
 @@ -0,0 +1,233 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.thrift;
+
+import java.io.Serializable;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import 
org.apache.beam.repackaged.core.org.apache.commons.lang3.RandomStringUtils;
+import org.apache.beam.repackaged.core.org.apache.commons.lang3.RandomUtils;
+import org.apache.beam.sdk.coders.StringUtf8Coder;
+import org.apache.beam.sdk.io.FileIO;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.values.PCollection;
+import 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.io.Resources;
+import org.apache.thrift.protocol.TBinaryProtocol;
+import org.apache.thrift.protocol.TCompactProtocol;
+import org.apache.thrift.protocol.TJSONProtocol;
+import org.apache.thrift.protocol.TProtocolFactory;
+import org.apache.thrift.protocol.TSimpleJSONProtocol;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/** Tests for {@link ThriftIO}. */
+@RunWith(JUnit4.class)
+public class ThriftIOTest implements Serializable {
+
+  private static final String RESOURCE_DIR = "ThriftIOTest/";
+
+  private static final String THRIFT_DIR = 
Resources.getResource(RESOURCE_DIR).getPath();
+  private static final String ALL_THRIFT_STRING =
+  Resources.getResource(RESOURCE_DIR).getPath() + "*";
+  private static final TestThriftStruct TEST_THRIFT_STRUCT = new 
TestThriftStruct();
+  private static List testThriftStructs;
+  private final TProtocolFactory tBinaryProtoFactory = new 
TBinaryProtocol.Factory();
+  private final TProtocolFactory tJsonProtocolFactory = new 
TJSONProtocol.Factory();
+  private final TProtocolFactory tSimpleJsonProtocolFactory = new 
TSimpleJSONProtocol.Factory();
+  private final TProtocolFactory tCompactProtocolFactory = new 
TCompactProtocol.Factory();
+  @Rule public transient TestPipeline mainPipeline = TestPipeline.create();
+  @Rule public transient TestPipeline readPipeline = TestPipeline.create();
+  @Rule public transient TestPipeline writePipeline = TestPipeline.create();
+  @Rule public transient TemporaryFolder temporaryFolder = new 
TemporaryFolder();
+
+  @Before
+  public void setUp() throws Exception {
+byte[] bytes = new byte[10];
+ByteBuffer buffer = ByteBuffer.wrap(bytes);
+
+TEST_THRIFT_STRUCT.testByte = 100;
+TEST_THRIFT_STRUCT.testShort = 200;
+TEST_THRIFT_STRUCT.testInt = 2500;
+TEST_THRIFT_STRUCT.testLong = 79303L;
+TEST_THRIFT_STRUCT.testDouble = 25.007;
+TEST_THRIFT_STRUCT.testBool = true;
+TEST_THRIFT_STRUCT.stringIntMap = new HashMap<>();
+TEST_THRIFT_STRUCT.stringIntMap.put("first", (short) 1);
+TEST_THRIFT_STRUCT.stringIntMap.put("second", (short) 2);
+TEST_THRIFT_STRUCT.testBinary = buffer;
+
+testThriftStructs = ImmutableList.copyOf(generateTestObjects(1000L));
+  }
+
+  /** Tests {@link ThriftIO#readFiles(Class)} with {@link TBinaryProtocol}. */
+  @Test
 
 Review comment:
   There doesn't seem to be a test for `read().from(...)` - should there be 
one? (and it's okay if the answer is 'don't need one') : )
 

[jira] [Work logged] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?focusedWorklogId=384986=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384986
 ]

ASF GitHub Bot logged work on BEAM-8561:


Author: ASF GitHub Bot
Created on: 11/Feb/20 06:08
Start Date: 11/Feb/20 06:08
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #10290: [BEAM-8561] 
Add ThriftIO to support IO for Thrift files
URL: https://github.com/apache/beam/pull/10290#discussion_r377457282
 
 

 ##
 File path: 
sdks/java/io/thrift/src/main/java/org/apache/beam/sdk/io/thrift/ThriftIO.java
 ##
 @@ -0,0 +1,289 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.thrift;
+
+import static 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+import static 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import com.google.auto.value.AutoValue;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.nio.channels.Channels;
+import java.nio.channels.WritableByteChannel;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.io.FileIO;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.thrift.TBase;
+import org.apache.thrift.TException;
+import org.apache.thrift.protocol.TProtocol;
+import org.apache.thrift.protocol.TProtocolFactory;
+import org.apache.thrift.transport.TIOStreamTransport;
+import org.apache.thrift.transport.TTransportException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link PTransform}s for reading and writing files containing Thrift encoded 
data.
+ *
+ * Reading Thrift Files
+ *
+ * For simple reading, use {@link ThriftIO#} with the desired file pattern 
to read from.
+ *
+ * For example:
+ *
+ * {@code
+ * PCollection examples = 
pipeline.apply(ThriftIO.read().from("/foo/bar/*"));
 
 Review comment:
   Do we need to pass the class to this transform? Why/why not?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384986)
Time Spent: 10h 10m  (was: 10h)

> Add ThriftIO to Support IO for Thrift Files
> ---
>
> Key: BEAM-8561
> URL: https://issues.apache.org/jira/browse/BEAM-8561
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-files
>Reporter: Chris Larsen
>Assignee: Chris Larsen
>Priority: Major
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Similar to AvroIO it would be very useful to support reading and writing 
> to/from Thrift files with a native connector. 
> Functionality would include:
>  # read() - Reading from one or more Thrift files.
>  # write() - Writing to one or more Thrift files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?focusedWorklogId=384984=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384984
 ]

ASF GitHub Bot logged work on BEAM-8561:


Author: ASF GitHub Bot
Created on: 11/Feb/20 06:08
Start Date: 11/Feb/20 06:08
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #10290: [BEAM-8561] 
Add ThriftIO to support IO for Thrift files
URL: https://github.com/apache/beam/pull/10290#discussion_r377417741
 
 

 ##
 File path: 
sdks/java/io/thrift/src/main/java/org/apache/beam/sdk/io/thrift/ThriftCoder.java
 ##
 @@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.thrift;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.ObjectInputStream;
+import java.io.ObjectOutputStream;
+import java.io.OutputStream;
+import org.apache.beam.sdk.coders.CoderException;
+import org.apache.beam.sdk.coders.CustomCoder;
+
+public class ThriftCoder extends CustomCoder {
+
+  public static  ThriftCoder of() {
+return new ThriftCoder<>();
+  }
+
+  /**
+   * Encodes the given value of type {@code T} onto the given output stream.
+   *
+   * @param value {@link org.apache.thrift.TBase} to encode.
+   * @param outStream stream to output encoded value to.
+   * @throws IOException if writing to the {@code OutputStream} fails for some 
reason
+   * @throws CoderException if the value could not be encoded for some reason
+   */
+  @Override
+  public void encode(T value, OutputStream outStream) throws CoderException, 
IOException {
+ObjectOutputStream oos = new ObjectOutputStream(outStream);
+oos.writeObject(value);
+oos.flush();
+  }
 
 Review comment:
   It seems that the coder simply does Java serialization. Is that right?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384984)
Time Spent: 9h 50m  (was: 9h 40m)

> Add ThriftIO to Support IO for Thrift Files
> ---
>
> Key: BEAM-8561
> URL: https://issues.apache.org/jira/browse/BEAM-8561
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-files
>Reporter: Chris Larsen
>Assignee: Chris Larsen
>Priority: Major
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> Similar to AvroIO it would be very useful to support reading and writing 
> to/from Thrift files with a native connector. 
> Functionality would include:
>  # read() - Reading from one or more Thrift files.
>  # write() - Writing to one or more Thrift files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?focusedWorklogId=384987=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384987
 ]

ASF GitHub Bot logged work on BEAM-8561:


Author: ASF GitHub Bot
Created on: 11/Feb/20 06:08
Start Date: 11/Feb/20 06:08
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #10290: [BEAM-8561] 
Add ThriftIO to support IO for Thrift files
URL: https://github.com/apache/beam/pull/10290#discussion_r377455798
 
 

 ##
 File path: 
sdks/java/io/thrift/src/test/java/org/apache/beam/sdk/io/thrift/TestThriftStruct.java
 ##
 @@ -0,0 +1,1232 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
 
 Review comment:
   Is this file meant to be included in the commit? Or can it be generated from 
the .thrift file - without committing it?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384987)
Time Spent: 10h 20m  (was: 10h 10m)

> Add ThriftIO to Support IO for Thrift Files
> ---
>
> Key: BEAM-8561
> URL: https://issues.apache.org/jira/browse/BEAM-8561
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-files
>Reporter: Chris Larsen
>Assignee: Chris Larsen
>Priority: Major
>  Time Spent: 10h 20m
>  Remaining Estimate: 0h
>
> Similar to AvroIO it would be very useful to support reading and writing 
> to/from Thrift files with a native connector. 
> Functionality would include:
>  # read() - Reading from one or more Thrift files.
>  # write() - Writing to one or more Thrift files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?focusedWorklogId=384988=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384988
 ]

ASF GitHub Bot logged work on BEAM-8561:


Author: ASF GitHub Bot
Created on: 11/Feb/20 06:08
Start Date: 11/Feb/20 06:08
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #10290: [BEAM-8561] 
Add ThriftIO to support IO for Thrift files
URL: https://github.com/apache/beam/pull/10290#discussion_r377456909
 
 

 ##
 File path: 
sdks/java/io/thrift/src/main/java/org/apache/beam/sdk/io/thrift/ThriftCoder.java
 ##
 @@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.thrift;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.ObjectInputStream;
+import java.io.ObjectOutputStream;
+import java.io.OutputStream;
+import org.apache.beam.sdk.coders.CoderException;
+import org.apache.beam.sdk.coders.CustomCoder;
+
+public class ThriftCoder extends CustomCoder {
+
+  public static  ThriftCoder of() {
+return new ThriftCoder<>();
+  }
+
+  /**
+   * Encodes the given value of type {@code T} onto the given output stream.
+   *
+   * @param value {@link org.apache.thrift.TBase} to encode.
+   * @param outStream stream to output encoded value to.
+   * @throws IOException if writing to the {@code OutputStream} fails for some 
reason
+   * @throws CoderException if the value could not be encoded for some reason
+   */
+  @Override
+  public void encode(T value, OutputStream outStream) throws CoderException, 
IOException {
+ObjectOutputStream oos = new ObjectOutputStream(outStream);
+oos.writeObject(value);
+oos.flush();
+  }
 
 Review comment:
   Ah so we only consume thrift-serialized records from the storage system, but 
Java-serialize within the pipeline. That's fine.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384988)
Time Spent: 10h 20m  (was: 10h 10m)

> Add ThriftIO to Support IO for Thrift Files
> ---
>
> Key: BEAM-8561
> URL: https://issues.apache.org/jira/browse/BEAM-8561
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-files
>Reporter: Chris Larsen
>Assignee: Chris Larsen
>Priority: Major
>  Time Spent: 10h 20m
>  Remaining Estimate: 0h
>
> Similar to AvroIO it would be very useful to support reading and writing 
> to/from Thrift files with a native connector. 
> Functionality would include:
>  # read() - Reading from one or more Thrift files.
>  # write() - Writing to one or more Thrift files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8561) Add ThriftIO to Support IO for Thrift Files

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8561?focusedWorklogId=384989=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384989
 ]

ASF GitHub Bot logged work on BEAM-8561:


Author: ASF GitHub Bot
Created on: 11/Feb/20 06:08
Start Date: 11/Feb/20 06:08
Worklog Time Spent: 10m 
  Work Description: pabloem commented on pull request #10290: [BEAM-8561] 
Add ThriftIO to support IO for Thrift files
URL: https://github.com/apache/beam/pull/10290#discussion_r377457540
 
 

 ##
 File path: 
sdks/java/io/thrift/src/main/java/org/apache/beam/sdk/io/thrift/ThriftIO.java
 ##
 @@ -0,0 +1,289 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.thrift;
+
+import static 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+import static 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import com.google.auto.value.AutoValue;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.nio.channels.Channels;
+import java.nio.channels.WritableByteChannel;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.io.FileIO;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.thrift.TBase;
+import org.apache.thrift.TException;
+import org.apache.thrift.protocol.TProtocol;
+import org.apache.thrift.protocol.TProtocolFactory;
+import org.apache.thrift.transport.TIOStreamTransport;
+import org.apache.thrift.transport.TTransportException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link PTransform}s for reading and writing files containing Thrift encoded 
data.
+ *
+ * Reading Thrift Files
+ *
+ * For simple reading, use {@link ThriftIO#} with the desired file pattern 
to read from.
+ *
+ * For example:
+ *
+ * {@code
+ * PCollection examples = 
pipeline.apply(ThriftIO.read().from("/foo/bar/*"));
 
 Review comment:
   In fact, it seems that `read()` is not implemented?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384989)

> Add ThriftIO to Support IO for Thrift Files
> ---
>
> Key: BEAM-8561
> URL: https://issues.apache.org/jira/browse/BEAM-8561
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-files
>Reporter: Chris Larsen
>Assignee: Chris Larsen
>Priority: Major
>  Time Spent: 10h 20m
>  Remaining Estimate: 0h
>
> Similar to AvroIO it would be very useful to support reading and writing 
> to/from Thrift files with a native connector. 
> Functionality would include:
>  # read() - Reading from one or more Thrift files.
>  # write() - Writing to one or more Thrift files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9288) Conscrypt shaded dependency

2020-02-10 Thread Esun Kim (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034131#comment-17034131
 ] 

Esun Kim commented on BEAM-9288:


Gcsio starts depending on gRPC along with conscrypt and we're considering if 
it's feasible to shade it in gcsio. Since Conscrypt is not ready for being 
shaded, however, shading isn't considered as an option for now. I think it 
shouldn't shade it until Conscrypt becomes fully aware of shading. 

> Conscrypt shaded dependency
> ---
>
> Key: BEAM-9288
> URL: https://issues.apache.org/jira/browse/BEAM-9288
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Esun Kim
>Assignee: sunjincheng
>Priority: Major
>
> Conscrypt is not designed to be shaded properly mainly because of so files. I 
> happened to see BEAM-9030 (*1) creating a new vendored gRPC shading Conscrypt 
> (*2) in it. I think this could make a problem when new Conscrypt is brought 
> by new gcsio depending on gRPC-alts (*4) in a dependency chain. (*5) In this 
> case, it may have a conflict when finding proper so files for Conscrypt. 
> *1: https://issues.apache.org/jira/browse/BEAM-9030
> *2:  
> [https://github.com/apache/beam/blob/e24d1e51cbabe27cb3cc381fd95b334db639c45d/buildSrc/src/main/groovy/org/apache/beam/gradle/GrpcVendoring_1_26_0.groovy#L78]
> *3: https://issues.apache.org/jira/browse/BEAM-6136
> *4: [https://mvnrepository.com/artifact/io.grpc/grpc-alts/1.27.0]
> *5: https://issues.apache.org/jira/browse/BEAM-8889
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (BEAM-9288) Conscrypt shaded dependency

2020-02-10 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034076#comment-17034076
 ] 

Luke Cwik edited comment on BEAM-9288 at 2/11/20 3:46 AM:
--

If gcs io is shading conscrypt, then it should fix it's shading. It shouldn't 
rely on a provided conscrypt unless gcs io marks the dependency as provided.

Apache beam should fix it's own usage to not interfere with others usage of it 
and not have users depend on it.


was (Author: lcwik):
If gcs io is shading conscrypt, then it should fix it's shading. It should rely 
on a provided conscrypt unless gcs io marks the dependency as provided.

Apache beam should fix it's own usage to not interfere with others usage of it 
and not have users depend on it.

> Conscrypt shaded dependency
> ---
>
> Key: BEAM-9288
> URL: https://issues.apache.org/jira/browse/BEAM-9288
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Esun Kim
>Assignee: sunjincheng
>Priority: Major
>
> Conscrypt is not designed to be shaded properly mainly because of so files. I 
> happened to see BEAM-9030 (*1) creating a new vendored gRPC shading Conscrypt 
> (*2) in it. I think this could make a problem when new Conscrypt is brought 
> by new gcsio depending on gRPC-alts (*4) in a dependency chain. (*5) In this 
> case, it may have a conflict when finding proper so files for Conscrypt. 
> *1: https://issues.apache.org/jira/browse/BEAM-9030
> *2:  
> [https://github.com/apache/beam/blob/e24d1e51cbabe27cb3cc381fd95b334db639c45d/buildSrc/src/main/groovy/org/apache/beam/gradle/GrpcVendoring_1_26_0.groovy#L78]
> *3: https://issues.apache.org/jira/browse/BEAM-6136
> *4: [https://mvnrepository.com/artifact/io.grpc/grpc-alts/1.27.0]
> *5: https://issues.apache.org/jira/browse/BEAM-8889
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8537) Provide WatermarkEstimatorProvider for different types of WatermarkEstimator

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8537?focusedWorklogId=384947=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384947
 ]

ASF GitHub Bot logged work on BEAM-8537:


Author: ASF GitHub Bot
Created on: 11/Feb/20 02:46
Start Date: 11/Feb/20 02:46
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #10802: [BEAM-8537] Move 
wrappers of RestrictionTracker out of iobase
URL: https://github.com/apache/beam/pull/10802#issuecomment-584457498
 
 
   These failed tests actually passes on my local machine. It's highly possible 
that we have some infra problems with Jenkins.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384947)
Time Spent: 13h 10m  (was: 13h)

> Provide WatermarkEstimatorProvider for different types of WatermarkEstimator
> 
>
> Key: BEAM-8537
> URL: https://issues.apache.org/jira/browse/BEAM-8537
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> This is a follow up for in-progress PR:  
> https://github.com/apache/beam/pull/9794.
> Current implementation in PR9794 provides a default implementation of 
> WatermarkEstimator. For further work, we want to let WatermarkEstimator to be 
> a pure Interface. We'll provide a WatermarkEstimatorProvider to be able to 
> create a custom WatermarkEstimator per windowed value. It should be similar 
> to how we track restriction for SDF: 
> WatermarkEstimator <---> RestrictionTracker 
> WatermarkEstimatorProvider <---> RestrictionTrackerProvider
> WatermarkEstimatorParam <---> RestrictionDoFnParam



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8537) Provide WatermarkEstimatorProvider for different types of WatermarkEstimator

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8537?focusedWorklogId=384941=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384941
 ]

ASF GitHub Bot logged work on BEAM-8537:


Author: ASF GitHub Bot
Created on: 11/Feb/20 02:27
Start Date: 11/Feb/20 02:27
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #10802: [BEAM-8537] Move 
wrappers of RestrictionTracker out of iobase
URL: https://github.com/apache/beam/pull/10802#issuecomment-584453992
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384941)
Time Spent: 13h  (was: 12h 50m)

> Provide WatermarkEstimatorProvider for different types of WatermarkEstimator
> 
>
> Key: BEAM-8537
> URL: https://issues.apache.org/jira/browse/BEAM-8537
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 13h
>  Remaining Estimate: 0h
>
> This is a follow up for in-progress PR:  
> https://github.com/apache/beam/pull/9794.
> Current implementation in PR9794 provides a default implementation of 
> WatermarkEstimator. For further work, we want to let WatermarkEstimator to be 
> a pure Interface. We'll provide a WatermarkEstimatorProvider to be able to 
> create a custom WatermarkEstimator per windowed value. It should be similar 
> to how we track restriction for SDF: 
> WatermarkEstimator <---> RestrictionTracker 
> WatermarkEstimatorProvider <---> RestrictionTrackerProvider
> WatermarkEstimatorParam <---> RestrictionDoFnParam



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9273) Explicitly fail pipeline with @RequiresTimeSortedInput with unsupported runner

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9273?focusedWorklogId=384935=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384935
 ]

ASF GitHub Bot logged work on BEAM-9273:


Author: ASF GitHub Bot
Created on: 11/Feb/20 02:25
Start Date: 11/Feb/20 02:25
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #10816: 
[BEAM-9273] Explicitly disable @RequiresTimeSortedInput on unsupported runners
URL: https://github.com/apache/beam/pull/10816#discussion_r377421041
 
 

 ##
 File path: 
runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/DoFnFeaturesTest.java
 ##
 @@ -0,0 +1,276 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.core.construction;
+
+import static org.apache.beam.sdk.testing.SerializableMatchers.equalTo;
+import static org.hamcrest.MatcherAssert.assertThat;
+
+import java.util.List;
+import org.apache.beam.sdk.io.range.OffsetRange;
+import org.apache.beam.sdk.state.MapState;
+import org.apache.beam.sdk.state.SetState;
+import org.apache.beam.sdk.state.StateSpec;
+import org.apache.beam.sdk.state.StateSpecs;
+import org.apache.beam.sdk.state.TimeDomain;
+import org.apache.beam.sdk.state.TimerSpec;
+import org.apache.beam.sdk.state.TimerSpecs;
+import org.apache.beam.sdk.state.ValueState;
+import org.apache.beam.sdk.state.WatermarkHoldState;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker;
+import org.apache.beam.sdk.transforms.windowing.TimestampCombiner;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.Lists;
+import org.junit.Test;
+
+/** Test suite for {@link DoFnFeatures}. */
+public class DoFnFeaturesTest {
 
 Review comment:
   Ditto `DoFnSignaturesTest` (utility methods tests)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384935)
Time Spent: 3h  (was: 2h 50m)

> Explicitly fail pipeline with @RequiresTimeSortedInput with unsupported runner
> --
>
> Key: BEAM-9273
> URL: https://issues.apache.org/jira/browse/BEAM-9273
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Fail pipeline with @RequiresTimeSortedInput annotation in pipeline 
> translation time when being run with unsupported runner. Currently, 
> unsupported runners are:
>  - apex
>  - portable flink
>  - gearpump
>  - dataflow
>  - jet
>  - samza
>  - spark structured streaming
> These runners should reject the pipeline.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9273) Explicitly fail pipeline with @RequiresTimeSortedInput with unsupported runner

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9273?focusedWorklogId=384933=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384933
 ]

ASF GitHub Bot logged work on BEAM-9273:


Author: ASF GitHub Bot
Created on: 11/Feb/20 02:25
Start Date: 11/Feb/20 02:25
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #10816: 
[BEAM-9273] Explicitly disable @RequiresTimeSortedInput on unsupported runners
URL: https://github.com/apache/beam/pull/10816#discussion_r377421456
 
 

 ##
 File path: 
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/BatchStatefulParDoOverrides.java
 ##
 @@ -176,7 +176,7 @@ private MultiOutputOverrideFactory(boolean isFnApi) {
 public PCollection expand(PCollection> input) {
   DoFn, OutputT> fn = originalParDo.getFn();
   verifyFnIsStateful(fn);
-  DataflowRunner.verifyStateSupported(fn);
+  DataflowRunner.verifyDoFnSupported(fn, false);
 
 Review comment:
   Passing a raw bool into a function call is not very readable. I suggest 
splitting into `verifyDoFnSupportedForStreaming` and 
`verifyDoFnSupportedForBatch`. These can each call the common code.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384933)
Time Spent: 2h 50m  (was: 2h 40m)

> Explicitly fail pipeline with @RequiresTimeSortedInput with unsupported runner
> --
>
> Key: BEAM-9273
> URL: https://issues.apache.org/jira/browse/BEAM-9273
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Fail pipeline with @RequiresTimeSortedInput annotation in pipeline 
> translation time when being run with unsupported runner. Currently, 
> unsupported runners are:
>  - apex
>  - portable flink
>  - gearpump
>  - dataflow
>  - jet
>  - samza
>  - spark structured streaming
> These runners should reject the pipeline.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9273) Explicitly fail pipeline with @RequiresTimeSortedInput with unsupported runner

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9273?focusedWorklogId=384934=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384934
 ]

ASF GitHub Bot logged work on BEAM-9273:


Author: ASF GitHub Bot
Created on: 11/Feb/20 02:25
Start Date: 11/Feb/20 02:25
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #10816: 
[BEAM-9273] Explicitly disable @RequiresTimeSortedInput on unsupported runners
URL: https://github.com/apache/beam/pull/10816#discussion_r377420966
 
 

 ##
 File path: 
runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/DoFnFeatures.java
 ##
 @@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.core.construction;
+
+import org.apache.beam.sdk.state.BagState;
+import org.apache.beam.sdk.state.MapState;
+import org.apache.beam.sdk.state.SetState;
+import org.apache.beam.sdk.state.State;
+import org.apache.beam.sdk.state.ValueState;
+import org.apache.beam.sdk.state.WatermarkHoldState;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.reflect.DoFnSignatures;
+import org.apache.beam.sdk.values.TypeDescriptor;
+
+/**
+ * Features a {@link DoFn} can posses. Each runner might implement a different 
(sub)set of this
 
 Review comment:
   The `DoFnSignature` is precisely the list of features already, no? It is OK 
to have helper methods anyhow.
   
   Style point: for a given class, utility static methods usually go in a 
companion class like `DoFnSignature` (class) & `DoFnSignatures` (utility 
methods)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384934)

> Explicitly fail pipeline with @RequiresTimeSortedInput with unsupported runner
> --
>
> Key: BEAM-9273
> URL: https://issues.apache.org/jira/browse/BEAM-9273
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Fail pipeline with @RequiresTimeSortedInput annotation in pipeline 
> translation time when being run with unsupported runner. Currently, 
> unsupported runners are:
>  - apex
>  - portable flink
>  - gearpump
>  - dataflow
>  - jet
>  - samza
>  - spark structured streaming
> These runners should reject the pipeline.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9273) Explicitly fail pipeline with @RequiresTimeSortedInput with unsupported runner

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9273?focusedWorklogId=384932=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384932
 ]

ASF GitHub Bot logged work on BEAM-9273:


Author: ASF GitHub Bot
Created on: 11/Feb/20 02:25
Start Date: 11/Feb/20 02:25
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #10816: 
[BEAM-9273] Explicitly disable @RequiresTimeSortedInput on unsupported runners
URL: https://github.com/apache/beam/pull/10816#discussion_r377421862
 
 

 ##
 File path: 
runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
 ##
 @@ -1224,10 +1225,10 @@ private static void translateFn(
   Map, Coder> outputCoders,
   DoFnSchemaInformation doFnSchemaInformation,
   Map> sideInputMapping) {
-DoFnSignature signature = DoFnSignatures.getSignature(fn.getClass());
 
-if (signature.usesState() || signature.usesTimers()) {
-  DataflowRunner.verifyStateSupported(fn);
+boolean isStateful = DoFnFeatures.isStateful(fn);
 
 Review comment:
   This is a good example of `DoFnFeatures` being useful: since "stateful 
ParDo" is really a totally separate primitive, it makes sense to just have a 
single boolean indicating it. Eventually moving so the different primitives are 
more clearly split with different URNs would be good.
   
   (no action for you here - just a positive comment)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384932)
Time Spent: 2h 40m  (was: 2.5h)

> Explicitly fail pipeline with @RequiresTimeSortedInput with unsupported runner
> --
>
> Key: BEAM-9273
> URL: https://issues.apache.org/jira/browse/BEAM-9273
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Fail pipeline with @RequiresTimeSortedInput annotation in pipeline 
> translation time when being run with unsupported runner. Currently, 
> unsupported runners are:
>  - apex
>  - portable flink
>  - gearpump
>  - dataflow
>  - jet
>  - samza
>  - spark structured streaming
> These runners should reject the pipeline.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9290) runner_harness_container_image experiment is not honored in python released sdks.

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9290?focusedWorklogId=384929=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384929
 ]

ASF GitHub Bot logged work on BEAM-9290:


Author: ASF GitHub Bot
Created on: 11/Feb/20 02:15
Start Date: 11/Feb/20 02:15
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #10827: [BEAM-9290] Support 
runner_harness_container_image in released python…
URL: https://github.com/apache/beam/pull/10827#issuecomment-584451965
 
 
   Run PythonLint PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384929)
Time Spent: 40m  (was: 0.5h)

> runner_harness_container_image experiment is not honored in python released 
> sdks.
> -
>
> Key: BEAM-9290
> URL: https://issues.apache.org/jira/browse/BEAM-9290
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
>  
> {code:java}
> --experiments=runner_harness_container_image=foo_image{code}
> does not have any affect on the job.
>  
>  
> cc: [~tvalentyn]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9290) runner_harness_container_image experiment is not honored in python released sdks.

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9290?focusedWorklogId=384928=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384928
 ]

ASF GitHub Bot logged work on BEAM-9290:


Author: ASF GitHub Bot
Created on: 11/Feb/20 02:14
Start Date: 11/Feb/20 02:14
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #10827: [BEAM-9290] Support 
runner_harness_container_image in released python…
URL: https://github.com/apache/beam/pull/10827#issuecomment-584451714
 
 
   Run PythonFormatter PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384928)
Time Spent: 0.5h  (was: 20m)

> runner_harness_container_image experiment is not honored in python released 
> sdks.
> -
>
> Key: BEAM-9290
> URL: https://issues.apache.org/jira/browse/BEAM-9290
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
>  
> {code:java}
> --experiments=runner_harness_container_image=foo_image{code}
> does not have any affect on the job.
>  
>  
> cc: [~tvalentyn]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9288) Conscrypt shaded dependency

2020-02-10 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034076#comment-17034076
 ] 

Luke Cwik commented on BEAM-9288:
-

If gcs io is shading conscrypt, then it should fix it's shading. It should rely 
on a provided conscrypt unless gcs io marks the dependency as provided.

Apache beam should fix it's own usage to not interfere with others usage of it 
and not have users depend on it.

> Conscrypt shaded dependency
> ---
>
> Key: BEAM-9288
> URL: https://issues.apache.org/jira/browse/BEAM-9288
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Esun Kim
>Assignee: sunjincheng
>Priority: Major
>
> Conscrypt is not designed to be shaded properly mainly because of so files. I 
> happened to see BEAM-9030 (*1) creating a new vendored gRPC shading Conscrypt 
> (*2) in it. I think this could make a problem when new Conscrypt is brought 
> by new gcsio depending on gRPC-alts (*4) in a dependency chain. (*5) In this 
> case, it may have a conflict when finding proper so files for Conscrypt. 
> *1: https://issues.apache.org/jira/browse/BEAM-9030
> *2:  
> [https://github.com/apache/beam/blob/e24d1e51cbabe27cb3cc381fd95b334db639c45d/buildSrc/src/main/groovy/org/apache/beam/gradle/GrpcVendoring_1_26_0.groovy#L78]
> *3: https://issues.apache.org/jira/browse/BEAM-6136
> *4: [https://mvnrepository.com/artifact/io.grpc/grpc-alts/1.27.0]
> *5: https://issues.apache.org/jira/browse/BEAM-8889
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9290) runner_harness_container_image experiment is not honored in python released sdks.

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9290?focusedWorklogId=384909=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384909
 ]

ASF GitHub Bot logged work on BEAM-9290:


Author: ASF GitHub Bot
Created on: 11/Feb/20 01:29
Start Date: 11/Feb/20 01:29
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #10827: [BEAM-9290] Support 
runner_harness_container_image in released python…
URL: https://github.com/apache/beam/pull/10827#issuecomment-584441121
 
 
   R: @tvalentyn 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384909)
Time Spent: 20m  (was: 10m)

> runner_harness_container_image experiment is not honored in python released 
> sdks.
> -
>
> Key: BEAM-9290
> URL: https://issues.apache.org/jira/browse/BEAM-9290
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
>  
> {code:java}
> --experiments=runner_harness_container_image=foo_image{code}
> does not have any affect on the job.
>  
>  
> cc: [~tvalentyn]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9290) runner_harness_container_image experiment is not honored in python released sdks.

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9290?focusedWorklogId=384908=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384908
 ]

ASF GitHub Bot logged work on BEAM-9290:


Author: ASF GitHub Bot
Created on: 11/Feb/20 01:29
Start Date: 11/Feb/20 01:29
Worklog Time Spent: 10m 
  Work Description: angoenka commented on pull request #10827: [BEAM-9290] 
Support runner_harness_container_image in released python…
URL: https://github.com/apache/beam/pull/10827
 
 
   … sdks.
   
   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 

[jira] [Commented] (BEAM-9288) Conscrypt shaded dependency

2020-02-10 Thread Igor Dvorzhak (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034066#comment-17034066
 ] 

Igor Dvorzhak commented on BEAM-9288:
-

Will it be better to exclude Concrypt from shaded GCS IO jar and rely on 
Conscrypt installed system-wide, if any?

> Conscrypt shaded dependency
> ---
>
> Key: BEAM-9288
> URL: https://issues.apache.org/jira/browse/BEAM-9288
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Esun Kim
>Assignee: sunjincheng
>Priority: Major
>
> Conscrypt is not designed to be shaded properly mainly because of so files. I 
> happened to see BEAM-9030 (*1) creating a new vendored gRPC shading Conscrypt 
> (*2) in it. I think this could make a problem when new Conscrypt is brought 
> by new gcsio depending on gRPC-alts (*4) in a dependency chain. (*5) In this 
> case, it may have a conflict when finding proper so files for Conscrypt. 
> *1: https://issues.apache.org/jira/browse/BEAM-9030
> *2:  
> [https://github.com/apache/beam/blob/e24d1e51cbabe27cb3cc381fd95b334db639c45d/buildSrc/src/main/groovy/org/apache/beam/gradle/GrpcVendoring_1_26_0.groovy#L78]
> *3: https://issues.apache.org/jira/browse/BEAM-6136
> *4: [https://mvnrepository.com/artifact/io.grpc/grpc-alts/1.27.0]
> *5: https://issues.apache.org/jira/browse/BEAM-8889
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9290) runner_harness_container_image experiment is not honored in python released sdks.

2020-02-10 Thread Ankur Goenka (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka updated BEAM-9290:
---
Fix Version/s: 2.20.0

> runner_harness_container_image experiment is not honored in python released 
> sdks.
> -
>
> Key: BEAM-9290
> URL: https://issues.apache.org/jira/browse/BEAM-9290
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Ankur Goenka
>Priority: Major
> Fix For: 2.20.0
>
>
>  
> {code:java}
> --experiments=runner_harness_container_image=foo_image{code}
> does not have any affect on the job.
>  
>  
> cc: [~tvalentyn]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-9290) runner_harness_container_image experiment is not honored in python released sdks.

2020-02-10 Thread Ankur Goenka (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka reassigned BEAM-9290:
--

Assignee: Ankur Goenka

> runner_harness_container_image experiment is not honored in python released 
> sdks.
> -
>
> Key: BEAM-9290
> URL: https://issues.apache.org/jira/browse/BEAM-9290
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.20.0
>
>
>  
> {code:java}
> --experiments=runner_harness_container_image=foo_image{code}
> does not have any affect on the job.
>  
>  
> cc: [~tvalentyn]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9290) runner_harness_container_image experiment is not honored in python released sdks.

2020-02-10 Thread Ankur Goenka (Jira)
Ankur Goenka created BEAM-9290:
--

 Summary: runner_harness_container_image experiment is not honored 
in python released sdks.
 Key: BEAM-9290
 URL: https://issues.apache.org/jira/browse/BEAM-9290
 Project: Beam
  Issue Type: Test
  Components: runner-dataflow, sdk-py-core
Reporter: Ankur Goenka


 
{code:java}
--experiments=runner_harness_container_image=foo_image{code}
does not have any affect on the job.

 

 

cc: [~tvalentyn]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9290) runner_harness_container_image experiment is not honored in python released sdks.

2020-02-10 Thread Ankur Goenka (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka updated BEAM-9290:
---
Issue Type: Bug  (was: Test)

> runner_harness_container_image experiment is not honored in python released 
> sdks.
> -
>
> Key: BEAM-9290
> URL: https://issues.apache.org/jira/browse/BEAM-9290
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Ankur Goenka
>Priority: Major
>
>  
> {code:java}
> --experiments=runner_harness_container_image=foo_image{code}
> does not have any affect on the job.
>  
>  
> cc: [~tvalentyn]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8537) Provide WatermarkEstimatorProvider for different types of WatermarkEstimator

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8537?focusedWorklogId=384906=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384906
 ]

ASF GitHub Bot logged work on BEAM-8537:


Author: ASF GitHub Bot
Created on: 11/Feb/20 01:24
Start Date: 11/Feb/20 01:24
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #10802: [BEAM-8537] 
Move wrappers of RestrictionTracker out of iobase
URL: https://github.com/apache/beam/pull/10802#discussion_r377410314
 
 

 ##
 File path: sdks/python/apache_beam/runners/sdf_utils.py
 ##
 @@ -0,0 +1,176 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# pytype: skip-file
+
+"""Common utility class to help SDK harness to execute an SDF. """
+
+from __future__ import absolute_import
+from __future__ import division
+
+import logging
+import threading
+from builtins import object
+from typing import TYPE_CHECKING
+from typing import Any
+from typing import NamedTuple
+from typing import Optional
+from typing import Tuple
+from apache_beam.utils.windowed_value import WindowedValue
+
+from apache_beam.utils import timestamp
+
+if TYPE_CHECKING:
+  from apache_beam.io.iobase import RestrictionTracker
+  from apache_beam.utils.timestamp import Timestamp
+
+_LOGGER = logging.getLogger(__name__)
+
+SplitResultPrimary = NamedTuple(
+'SplitResultPrimary', [('windowed_value', WindowedValue)])
 
 Review comment:
   Thanks! Like the idea of `primary_value` and `residual_value`.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384906)
Time Spent: 12h 50m  (was: 12h 40m)

> Provide WatermarkEstimatorProvider for different types of WatermarkEstimator
> 
>
> Key: BEAM-8537
> URL: https://issues.apache.org/jira/browse/BEAM-8537
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 12h 50m
>  Remaining Estimate: 0h
>
> This is a follow up for in-progress PR:  
> https://github.com/apache/beam/pull/9794.
> Current implementation in PR9794 provides a default implementation of 
> WatermarkEstimator. For further work, we want to let WatermarkEstimator to be 
> a pure Interface. We'll provide a WatermarkEstimatorProvider to be able to 
> create a custom WatermarkEstimator per windowed value. It should be similar 
> to how we track restriction for SDF: 
> WatermarkEstimator <---> RestrictionTracker 
> WatermarkEstimatorProvider <---> RestrictionTrackerProvider
> WatermarkEstimatorParam <---> RestrictionDoFnParam



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-3221) Model pipeline representation improvements

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3221?focusedWorklogId=384890=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384890
 ]

ASF GitHub Bot logged work on BEAM-3221:


Author: ASF GitHub Bot
Created on: 11/Feb/20 01:00
Start Date: 11/Feb/20 01:00
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10779: [BEAM-3221, 
BEAM-4180] Clarify documentation for StandardTransforms.Primitives, Pipeline, 
and PTransform.
URL: https://github.com/apache/beam/pull/10779#issuecomment-584432912
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384890)
Time Spent: 5.5h  (was: 5h 20m)

> Model pipeline representation improvements
> --
>
> Key: BEAM-3221
> URL: https://issues.apache.org/jira/browse/BEAM-3221
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Collections of various (breaking) tweaks to the Runner API, notably the 
> pipeline representation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=384889=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384889
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 11/Feb/20 00:59
Start Date: 11/Feb/20 00:59
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10826: [BEAM-8335] 
Modify the TestStreamPayload to accept an argument of output_tags and…
URL: https://github.com/apache/beam/pull/10826#discussion_r377403925
 
 

 ##
 File path: model/interactive/src/main/proto/beam_interactive_api.proto
 ##
 @@ -40,18 +40,22 @@ import "google/protobuf/timestamp.proto";
 message TestStreamFileHeader {
   // The PCollection tag this stream is associated with.
   string tag = 1;
+
+  // The file format version. This is used to ensure backwards compatibility
+  // when decoding from file.
+  int32 version = 2;
 }
 
 // A record is a recorded element that a source produced. Its function is to
 // give enough information to create a faithful recreation of the original
 // stream of data.
 message TestStreamFileRecord {
   oneof recorded_event {
-// The recorded element with its event timestamp (when it was produced).
-org.apache.beam.model.pipeline.v1.TestStreamPayload.TimestampedElement 
element = 1;
+// The recorded bundle with its event timestamp (when it was produced).
+org.apache.beam.model.pipeline.v1.TestStreamPayload.Event.AddElements 
element_event = 1;
 
 // Indicating the output watermark of the source changed.
-google.protobuf.Timestamp watermark = 2;
+org.apache.beam.model.pipeline.v1.TestStreamPayload.Event.AdvanceWatermark 
watermark_event = 2;
   }
 
   // The wall-time timestamp of either the new element or watermark change.
 
 Review comment:
   Why do the recorded events not have AdvanceProcessingTime events?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384889)
Time Spent: 55h 40m  (was: 55.5h)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 55h 40m
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to play/pause/stop/step individual elements from the cached 
> records
>  * Add ability to inspect/visualize unbounded PCollections



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=384887=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384887
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 11/Feb/20 00:59
Start Date: 11/Feb/20 00:59
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10826: [BEAM-8335] 
Modify the TestStreamPayload to accept an argument of output_tags and…
URL: https://github.com/apache/beam/pull/10826#discussion_r377404072
 
 

 ##
 File path: model/interactive/src/main/proto/beam_interactive_api.proto
 ##
 @@ -40,18 +40,22 @@ import "google/protobuf/timestamp.proto";
 message TestStreamFileHeader {
   // The PCollection tag this stream is associated with.
   string tag = 1;
+
+  // The file format version. This is used to ensure backwards compatibility
+  // when decoding from file.
+  int32 version = 2;
 
 Review comment:
   Shouldn't proto versioning handle this?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384887)
Time Spent: 55.5h  (was: 55h 20m)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 55.5h
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to play/pause/stop/step individual elements from the cached 
> records
>  * Add ability to inspect/visualize unbounded PCollections



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=384886=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384886
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 11/Feb/20 00:59
Start Date: 11/Feb/20 00:59
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10826: [BEAM-8335] 
Modify the TestStreamPayload to accept an argument of output_tags and…
URL: https://github.com/apache/beam/pull/10826#discussion_r377403463
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -576,7 +581,12 @@ service TestStreamService {
   // A TestStream will request for events using this RPC.
   rpc Events(EventsRequest) returns (stream TestStreamPayload.Event) {}
 }
-message EventsRequest {}
+
+message EventsRequest {
+  // The set of keys to read from. The keys are the specific PCollections to
+  // read.
+  repeated string keys = 1;
 
 Review comment:
   Can you provide more context as to why this is needed?
   
   Is this the pcollection id, or the output name or the ptransform id 
representing the test stream?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384886)
Time Spent: 55h 20m  (was: 55h 10m)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 55h 20m
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to play/pause/stop/step individual elements from the cached 
> records
>  * Add ability to inspect/visualize unbounded PCollections



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=384888=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384888
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 11/Feb/20 00:59
Start Date: 11/Feb/20 00:59
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10826: [BEAM-8335] 
Modify the TestStreamPayload to accept an argument of output_tags and…
URL: https://github.com/apache/beam/pull/10826#discussion_r377402954
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -525,6 +525,11 @@ message TestStreamPayload {
   // used to retrieve events.
   ApiServiceDescriptor endpoint = 3;
 
+  // (Optional) The PCollection tags this TestStream will be outputting to. If
+  // empty, this will assume it will be outputting to the single main
+  // output PCollection.
+  repeated string output_tags = 4;
 
 Review comment:
   Don't think this is worth duplicating information already present on 
existing test stream events.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384888)
Time Spent: 55.5h  (was: 55h 20m)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 55.5h
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to play/pause/stop/step individual elements from the cached 
> records
>  * Add ability to inspect/visualize unbounded PCollections



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9269) Set shorter Commit Deadline and handle with backoff/retry

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9269?focusedWorklogId=384877=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384877
 ]

ASF GitHub Bot logged work on BEAM-9269:


Author: ASF GitHub Bot
Created on: 11/Feb/20 00:36
Start Date: 11/Feb/20 00:36
Worklog Time Spent: 10m 
  Work Description: allenpradeep commented on pull request #10752: 
[BEAM-9269] Add commit deadline for Spanner writes.
URL: https://github.com/apache/beam/pull/10752#discussion_r377394855
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerConfig.java
 ##
 @@ -42,6 +50,10 @@
   private static final String USER_AGENT_PREFIX = "Apache_Beam_Java";
   // A default host name for batch traffic.
   private static final String DEFAULT_HOST = 
"https://batch-spanner.googleapis.com/;;
+  // Deadline for Commit API call.
+  private static final Duration DEFAULT_COMMIT_DEADLINE = 
Duration.standardSeconds(15);
 
 Review comment:
   Can this be made configurable.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384877)
Time Spent: 1.5h  (was: 1h 20m)

> Set shorter Commit Deadline and handle with backoff/retry
> -
>
> Key: BEAM-9269
> URL: https://issues.apache.org/jira/browse/BEAM-9269
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0, 2.17.0, 2.18.0, 2.19.0
>Reporter: Niel Markwick
>Assignee: Niel Markwick
>Priority: Major
>  Labels: google-cloud-spanner
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Default commit deadline in Spanner is 1hr, which can lead to a variety of 
> issues including database overload and session expiry.
> Shorter deadline should be set with backoff/retry when deadline expires, so 
> that the Spanner database does not become overloaded.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9269) Set shorter Commit Deadline and handle with backoff/retry

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9269?focusedWorklogId=384876=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384876
 ]

ASF GitHub Bot logged work on BEAM-9269:


Author: ASF GitHub Bot
Created on: 11/Feb/20 00:36
Start Date: 11/Feb/20 00:36
Worklog Time Spent: 10m 
  Work Description: allenpradeep commented on pull request #10752: 
[BEAM-9269] Add commit deadline for Spanner writes.
URL: https://github.com/apache/beam/pull/10752#discussion_r377395567
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
 ##
 @@ -1285,30 +1373,84 @@ public void processElement(ProcessContext c) throws 
Exception {
   boolean tryIndividual = false;
 
 Review comment:
   we can remove this if this is not being used.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384876)
Time Spent: 1h 20m  (was: 1h 10m)

> Set shorter Commit Deadline and handle with backoff/retry
> -
>
> Key: BEAM-9269
> URL: https://issues.apache.org/jira/browse/BEAM-9269
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0, 2.17.0, 2.18.0, 2.19.0
>Reporter: Niel Markwick
>Assignee: Niel Markwick
>Priority: Major
>  Labels: google-cloud-spanner
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Default commit deadline in Spanner is 1hr, which can lead to a variety of 
> issues including database overload and session expiry.
> Shorter deadline should be set with backoff/retry when deadline expires, so 
> that the Spanner database does not become overloaded.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9269) Set shorter Commit Deadline and handle with backoff/retry

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9269?focusedWorklogId=384878=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384878
 ]

ASF GitHub Bot logged work on BEAM-9269:


Author: ASF GitHub Bot
Created on: 11/Feb/20 00:36
Start Date: 11/Feb/20 00:36
Worklog Time Spent: 10m 
  Work Description: allenpradeep commented on pull request #10752: 
[BEAM-9269] Add commit deadline for Spanner writes.
URL: https://github.com/apache/beam/pull/10752#discussion_r377397609
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
 ##
 @@ -1248,20 +1307,44 @@ public void processElement(ProcessContext c) {
 }
   }
 
-  private static class WriteToSpannerFn extends DoFn, 
Void> {
+  @VisibleForTesting
+  static class WriteToSpannerFn extends DoFn, Void> {
 
 private transient SpannerAccessor spannerAccessor;
 private final SpannerConfig spannerConfig;
 private final FailureMode failureMode;
-private final Counter mutationGroupBatchesCounter =
-Metrics.counter(WriteGrouped.class, "mutation_group_batches");
-private final Counter mutationGroupWriteSuccessCounter =
+
+@VisibleForTesting static Sleeper sleeper = Sleeper.DEFAULT;
+
+private final Counter mutationGroupBatchesReceived =
+Metrics.counter(WriteGrouped.class, "mutation_group_batches_received");
+private final Counter mutationGroupBatchesWriteSuccess =
+Metrics.counter(WriteGrouped.class, 
"mutation_group_batches_write_success");
+private final Counter mutationGroupBatchesWriteFail =
+Metrics.counter(WriteGrouped.class, 
"mutation_group_batches_write_fail");
+
+private final Counter mutationGroupsReceived =
+Metrics.counter(WriteGrouped.class, "mutation_groups_received");
+private final Counter mutationGroupsWriteSuccess =
 Metrics.counter(WriteGrouped.class, "mutation_groups_write_success");
-private final Counter mutationGroupWriteFailCounter =
+private final Counter mutationGroupsWriteFail =
 Metrics.counter(WriteGrouped.class, "mutation_groups_write_fail");
 
+private final Counter spannerWriteSuccess =
+Metrics.counter(WriteGrouped.class, "spanner_write_success");
+private final Counter spannerWriteFail =
+Metrics.counter(WriteGrouped.class, "spanner_write_fail");
+private final Counter spannerWriteTotalLatency =
 
 Review comment:
   Would an average writeLatency(if its possible to calculate) be more 
actionable to the user rather than a total write latency? In the sense, if the 
write latency is high then the user can reduce the workers.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384878)

> Set shorter Commit Deadline and handle with backoff/retry
> -
>
> Key: BEAM-9269
> URL: https://issues.apache.org/jira/browse/BEAM-9269
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0, 2.17.0, 2.18.0, 2.19.0
>Reporter: Niel Markwick
>Assignee: Niel Markwick
>Priority: Major
>  Labels: google-cloud-spanner
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Default commit deadline in Spanner is 1hr, which can lead to a variety of 
> issues including database overload and session expiry.
> Shorter deadline should be set with backoff/retry when deadline expires, so 
> that the Spanner database does not become overloaded.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9289) Improve performance for metrics update of samza runner

2020-02-10 Thread Yixing Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yixing Zhang updated BEAM-9289:
---
Description: 
* Reduce metrics update per do function from 4 -> 1.
 * Query for the metrics for the current step to update.

> Improve performance for metrics update of samza runner
> --
>
> Key: BEAM-9289
> URL: https://issues.apache.org/jira/browse/BEAM-9289
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-samza
>Reporter: Yixing Zhang
>Priority: Major
>
> * Reduce metrics update per do function from 4 -> 1.
>  * Query for the metrics for the current step to update.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9289) Improve performance for metrics update of samza runner

2020-02-10 Thread Yixing Zhang (Jira)
Yixing Zhang created BEAM-9289:
--

 Summary: Improve performance for metrics update of samza runner
 Key: BEAM-9289
 URL: https://issues.apache.org/jira/browse/BEAM-9289
 Project: Beam
  Issue Type: Improvement
  Components: runner-samza
Reporter: Yixing Zhang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8537) Provide WatermarkEstimatorProvider for different types of WatermarkEstimator

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8537?focusedWorklogId=384861=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384861
 ]

ASF GitHub Bot logged work on BEAM-8537:


Author: ASF GitHub Bot
Created on: 11/Feb/20 00:19
Start Date: 11/Feb/20 00:19
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #10802: [BEAM-8537] 
Move wrappers of RestrictionTracker out of iobase
URL: https://github.com/apache/beam/pull/10802#discussion_r377393414
 
 

 ##
 File path: sdks/python/apache_beam/runners/sdf_utils.py
 ##
 @@ -0,0 +1,173 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# pytype: skip-file
+
+"""Common utility class to help SDK harness to execute an SDF. """
+
+from __future__ import absolute_import
+from __future__ import division
+
+import logging
+import threading
+from builtins import object
+from collections import namedtuple
+from typing import TYPE_CHECKING
+from typing import Any
+from typing import Optional
+from typing import Tuple
+
+from apache_beam.utils import timestamp
+
+if TYPE_CHECKING:
+  from apache_beam.io.iobase import RestrictionTracker
+  from apache_beam.utils.timestamp import Timestamp
+
+_LOGGER = logging.getLogger(__name__)
+
+
+SplitResultPrimary = namedtuple(
+'SplitResultPrimary', 'windowed_value')
 
 Review comment:
   Nothing comes to mind immediately as to what fields we'd want to add here 
(though originally even the residual didn't have anything extra). Mostly I like 
the consistency, so I'd lean towards keeping it as is. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384861)
Time Spent: 12h 40m  (was: 12.5h)

> Provide WatermarkEstimatorProvider for different types of WatermarkEstimator
> 
>
> Key: BEAM-8537
> URL: https://issues.apache.org/jira/browse/BEAM-8537
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 12h 40m
>  Remaining Estimate: 0h
>
> This is a follow up for in-progress PR:  
> https://github.com/apache/beam/pull/9794.
> Current implementation in PR9794 provides a default implementation of 
> WatermarkEstimator. For further work, we want to let WatermarkEstimator to be 
> a pure Interface. We'll provide a WatermarkEstimatorProvider to be able to 
> create a custom WatermarkEstimator per windowed value. It should be similar 
> to how we track restriction for SDF: 
> WatermarkEstimator <---> RestrictionTracker 
> WatermarkEstimatorProvider <---> RestrictionTrackerProvider
> WatermarkEstimatorParam <---> RestrictionDoFnParam



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8537) Provide WatermarkEstimatorProvider for different types of WatermarkEstimator

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8537?focusedWorklogId=384860=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384860
 ]

ASF GitHub Bot logged work on BEAM-8537:


Author: ASF GitHub Bot
Created on: 11/Feb/20 00:19
Start Date: 11/Feb/20 00:19
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #10802: [BEAM-8537] 
Move wrappers of RestrictionTracker out of iobase
URL: https://github.com/apache/beam/pull/10802#discussion_r377392592
 
 

 ##
 File path: sdks/python/apache_beam/runners/sdf_utils.py
 ##
 @@ -0,0 +1,173 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# pytype: skip-file
+
+"""Common utility class to help SDK harness to execute an SDF. """
+
+from __future__ import absolute_import
+from __future__ import division
+
+import logging
+import threading
+from builtins import object
+from collections import namedtuple
+from typing import TYPE_CHECKING
+from typing import Any
+from typing import Optional
+from typing import Tuple
+
+from apache_beam.utils import timestamp
+
+if TYPE_CHECKING:
+  from apache_beam.io.iobase import RestrictionTracker
+  from apache_beam.utils.timestamp import Timestamp
+
+_LOGGER = logging.getLogger(__name__)
+
+
+SplitResultPrimary = namedtuple(
+'SplitResultPrimary', 'windowed_value')
+
+SplitResultResidual = namedtuple(
+'SplitResultResidual',
+'windowed_value current_watermark deferred_timestamp')
+
+class ThreadsafeRestrictionTracker(object):
 
 Review comment:
   I agree that adding types would be nice, but is probably out of scope. (It 
would make sense to add types on the base class at the same time.)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384860)
Time Spent: 12.5h  (was: 12h 20m)

> Provide WatermarkEstimatorProvider for different types of WatermarkEstimator
> 
>
> Key: BEAM-8537
> URL: https://issues.apache.org/jira/browse/BEAM-8537
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> This is a follow up for in-progress PR:  
> https://github.com/apache/beam/pull/9794.
> Current implementation in PR9794 provides a default implementation of 
> WatermarkEstimator. For further work, we want to let WatermarkEstimator to be 
> a pure Interface. We'll provide a WatermarkEstimatorProvider to be able to 
> create a custom WatermarkEstimator per windowed value. It should be similar 
> to how we track restriction for SDF: 
> WatermarkEstimator <---> RestrictionTracker 
> WatermarkEstimatorProvider <---> RestrictionTrackerProvider
> WatermarkEstimatorParam <---> RestrictionDoFnParam



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8537) Provide WatermarkEstimatorProvider for different types of WatermarkEstimator

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8537?focusedWorklogId=384858=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384858
 ]

ASF GitHub Bot logged work on BEAM-8537:


Author: ASF GitHub Bot
Created on: 11/Feb/20 00:19
Start Date: 11/Feb/20 00:19
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #10802: [BEAM-8537] 
Move wrappers of RestrictionTracker out of iobase
URL: https://github.com/apache/beam/pull/10802#discussion_r377390532
 
 

 ##
 File path: sdks/python/apache_beam/runners/sdf_utils.py
 ##
 @@ -0,0 +1,176 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# pytype: skip-file
+
+"""Common utility class to help SDK harness to execute an SDF. """
+
+from __future__ import absolute_import
+from __future__ import division
+
+import logging
+import threading
+from builtins import object
+from typing import TYPE_CHECKING
+from typing import Any
+from typing import NamedTuple
+from typing import Optional
+from typing import Tuple
+from apache_beam.utils.windowed_value import WindowedValue
+
+from apache_beam.utils import timestamp
+
+if TYPE_CHECKING:
+  from apache_beam.io.iobase import RestrictionTracker
+  from apache_beam.utils.timestamp import Timestamp
+
+_LOGGER = logging.getLogger(__name__)
+
+SplitResultPrimary = NamedTuple(
+'SplitResultPrimary', [('windowed_value', WindowedValue)])
 
 Review comment:
   Maybe call this field `primary[_value]` and the other `residual[_value]`?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384858)
Time Spent: 12.5h  (was: 12h 20m)

> Provide WatermarkEstimatorProvider for different types of WatermarkEstimator
> 
>
> Key: BEAM-8537
> URL: https://issues.apache.org/jira/browse/BEAM-8537
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> This is a follow up for in-progress PR:  
> https://github.com/apache/beam/pull/9794.
> Current implementation in PR9794 provides a default implementation of 
> WatermarkEstimator. For further work, we want to let WatermarkEstimator to be 
> a pure Interface. We'll provide a WatermarkEstimatorProvider to be able to 
> create a custom WatermarkEstimator per windowed value. It should be similar 
> to how we track restriction for SDF: 
> WatermarkEstimator <---> RestrictionTracker 
> WatermarkEstimatorProvider <---> RestrictionTrackerProvider
> WatermarkEstimatorParam <---> RestrictionDoFnParam



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8537) Provide WatermarkEstimatorProvider for different types of WatermarkEstimator

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8537?focusedWorklogId=384859=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384859
 ]

ASF GitHub Bot logged work on BEAM-8537:


Author: ASF GitHub Bot
Created on: 11/Feb/20 00:19
Start Date: 11/Feb/20 00:19
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #10802: [BEAM-8537] 
Move wrappers of RestrictionTracker out of iobase
URL: https://github.com/apache/beam/pull/10802#discussion_r377389487
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/bundle_processor.py
 ##
 @@ -906,26 +906,25 @@ def delayed_bundle_application(self,
 # type: (...) -> beam_fn_api_pb2.DelayedBundleApplication
 assert op.input_info is not None
 # TODO(SDF): For non-root nodes, need main_input_coder + residual_coder.
-((element_and_restriction, output_watermark),
- deferred_watermark) = deferred_remainder
-if deferred_watermark:
-  assert isinstance(deferred_watermark, timestamp.Duration)
+element_and_restriction = deferred_remainder.windowed_value
 
 Review comment:
   You can still use tuple unpacking here, e.g. 
   
   `(element_and_restriction, deferred_timestamp, current_watermark) = 
deferred_remainder`
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384859)
Time Spent: 12.5h  (was: 12h 20m)

> Provide WatermarkEstimatorProvider for different types of WatermarkEstimator
> 
>
> Key: BEAM-8537
> URL: https://issues.apache.org/jira/browse/BEAM-8537
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> This is a follow up for in-progress PR:  
> https://github.com/apache/beam/pull/9794.
> Current implementation in PR9794 provides a default implementation of 
> WatermarkEstimator. For further work, we want to let WatermarkEstimator to be 
> a pure Interface. We'll provide a WatermarkEstimatorProvider to be able to 
> create a custom WatermarkEstimator per windowed value. It should be similar 
> to how we track restriction for SDF: 
> WatermarkEstimator <---> RestrictionTracker 
> WatermarkEstimatorProvider <---> RestrictionTrackerProvider
> WatermarkEstimatorParam <---> RestrictionDoFnParam



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=384851=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384851
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 10/Feb/20 23:50
Start Date: 10/Feb/20 23:50
Worklog Time Spent: 10m 
  Work Description: davidyan74 commented on issue #10826: [BEAM-8335] 
Modify the TestStreamPayload to accept an argument of output_tags and…
URL: https://github.com/apache/beam/pull/10826#issuecomment-584415517
 
 
   LGTM
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384851)
Time Spent: 55h 10m  (was: 55h)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 55h 10m
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to play/pause/stop/step individual elements from the cached 
> records
>  * Add ability to inspect/visualize unbounded PCollections



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-3221) Model pipeline representation improvements

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3221?focusedWorklogId=384849=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384849
 ]

ASF GitHub Bot logged work on BEAM-3221:


Author: ASF GitHub Bot
Created on: 10/Feb/20 23:45
Start Date: 10/Feb/20 23:45
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10779: [BEAM-3221, 
BEAM-4180] Clarify documentation for StandardTransforms.Primitives, Pipeline, 
and PTransform.
URL: https://github.com/apache/beam/pull/10779#issuecomment-584414243
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384849)
Time Spent: 5h 20m  (was: 5h 10m)

> Model pipeline representation improvements
> --
>
> Key: BEAM-3221
> URL: https://issues.apache.org/jira/browse/BEAM-3221
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Collections of various (breaking) tweaks to the Runner API, notably the 
> pipeline representation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=384846=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384846
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 10/Feb/20 23:44
Start Date: 10/Feb/20 23:44
Worklog Time Spent: 10m 
  Work Description: rohdesamuel commented on issue #10826: [BEAM-8335] 
Modify the TestStreamPayload to accept an argument of output_tags and…
URL: https://github.com/apache/beam/pull/10826#issuecomment-584413798
 
 
   R: @davidyan74 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384846)
Time Spent: 55h  (was: 54h 50m)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 55h
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to play/pause/stop/step individual elements from the cached 
> records
>  * Add ability to inspect/visualize unbounded PCollections



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9160) Update AWS SDK to support Pod Level Identity

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9160?focusedWorklogId=384844=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384844
 ]

ASF GitHub Bot logged work on BEAM-9160:


Author: ASF GitHub Bot
Created on: 10/Feb/20 23:43
Start Date: 10/Feb/20 23:43
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10825: [BEAM-9160] Update 
AWS SDK to support Pod Level Identity
URL: https://github.com/apache/beam/pull/10825#issuecomment-584413619
 
 
   The code changes look fine otherwise.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384844)
Time Spent: 50m  (was: 40m)

> Update AWS SDK to support Pod Level Identity
> 
>
> Key: BEAM-9160
> URL: https://issues.apache.org/jira/browse/BEAM-9160
> Project: Beam
>  Issue Type: Improvement
>  Components: dependencies
>Affects Versions: 2.17.0
>Reporter: Mohamed Noah
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Many organizations have started leveraging pod level identity in Kubernetes. 
> The current version of the AWS SDK packaged with Bean 2.17.0 is out of date 
> and doesn't provide native support to pod level identity access management.
>  
> It is recommended that we introduce support to access AWS resources such as 
> S3 using pod level identity. 
> Current Version of the AWS Java SDK in Beam:
> def aws_java_sdk_version = "1.11.519"
> Proposed AWS Java SDK Version:
> 
>  com.amazonaws
>  aws-java-sdk
>  1.11.710
> 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9160) Update AWS SDK to support Pod Level Identity

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9160?focusedWorklogId=384843=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384843
 ]

ASF GitHub Bot logged work on BEAM-9160:


Author: ASF GitHub Bot
Created on: 10/Feb/20 23:43
Start Date: 10/Feb/20 23:43
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10825: [BEAM-9160] Update 
AWS SDK to support Pod Level Identity
URL: https://github.com/apache/beam/pull/10825#issuecomment-584413587
 
 
   Can we run the linkage checker to perform analysis on whether we increased 
the number of linkage errors?
   
   Example PR of running the linkage checker:
   https://github.com/apache/beam/pull/10769
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384843)
Time Spent: 40m  (was: 0.5h)

> Update AWS SDK to support Pod Level Identity
> 
>
> Key: BEAM-9160
> URL: https://issues.apache.org/jira/browse/BEAM-9160
> Project: Beam
>  Issue Type: Improvement
>  Components: dependencies
>Affects Versions: 2.17.0
>Reporter: Mohamed Noah
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Many organizations have started leveraging pod level identity in Kubernetes. 
> The current version of the AWS SDK packaged with Bean 2.17.0 is out of date 
> and doesn't provide native support to pod level identity access management.
>  
> It is recommended that we introduce support to access AWS resources such as 
> S3 using pod level identity. 
> Current Version of the AWS Java SDK in Beam:
> def aws_java_sdk_version = "1.11.519"
> Proposed AWS Java SDK Version:
> 
>  com.amazonaws
>  aws-java-sdk
>  1.11.710
> 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=384842=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384842
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 10/Feb/20 23:39
Start Date: 10/Feb/20 23:39
Worklog Time Spent: 10m 
  Work Description: rohdesamuel commented on pull request #10826: 
[BEAM-8335] Modify the TestStreamPayload to accept an argument of output_tags 
and…
URL: https://github.com/apache/beam/pull/10826
 
 
   … modify the TestStreamFileRecord to use TestStreamPayload events.
   
   Change-Id: I84ec1dd4698534c26c3a5219669da5f1a127250a
   
   Adds the output_tags in the TestStreamPayload for allowing converting a 
TestStream to/from proto without reading from its events.
   
   Also adds a file format version for the TestStreamFileHeader to document 
which version for backwards compatibility when decoding.
   
   Also adds a list of "keys" to request which PCollection to read from in the 
TestStreamService.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 

[jira] [Commented] (BEAM-9288) Conscrypt shaded dependency

2020-02-10 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034021#comment-17034021
 ] 

Luke Cwik commented on BEAM-9288:
-

I filed [https://github.com/google/conscrypt/issues/811] for "auto" discovery 
of package path.

> Conscrypt shaded dependency
> ---
>
> Key: BEAM-9288
> URL: https://issues.apache.org/jira/browse/BEAM-9288
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Esun Kim
>Assignee: sunjincheng
>Priority: Major
>
> Conscrypt is not designed to be shaded properly mainly because of so files. I 
> happened to see BEAM-9030 (*1) creating a new vendored gRPC shading Conscrypt 
> (*2) in it. I think this could make a problem when new Conscrypt is brought 
> by new gcsio depending on gRPC-alts (*4) in a dependency chain. (*5) In this 
> case, it may have a conflict when finding proper so files for Conscrypt. 
> *1: https://issues.apache.org/jira/browse/BEAM-9030
> *2:  
> [https://github.com/apache/beam/blob/e24d1e51cbabe27cb3cc381fd95b334db639c45d/buildSrc/src/main/groovy/org/apache/beam/gradle/GrpcVendoring_1_26_0.groovy#L78]
> *3: https://issues.apache.org/jira/browse/BEAM-6136
> *4: [https://mvnrepository.com/artifact/io.grpc/grpc-alts/1.27.0]
> *5: https://issues.apache.org/jira/browse/BEAM-8889
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-3221) Model pipeline representation improvements

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3221?focusedWorklogId=384836=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384836
 ]

ASF GitHub Bot logged work on BEAM-3221:


Author: ASF GitHub Bot
Created on: 10/Feb/20 23:32
Start Date: 10/Feb/20 23:32
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10779: [BEAM-3221, 
BEAM-4180] Clarify documentation for StandardTransforms.Primitives, Pipeline, 
and PTransform.
URL: https://github.com/apache/beam/pull/10779#issuecomment-584410627
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384836)
Time Spent: 5h 10m  (was: 5h)

> Model pipeline representation improvements
> --
>
> Key: BEAM-3221
> URL: https://issues.apache.org/jira/browse/BEAM-3221
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Collections of various (breaking) tweaks to the Runner API, notably the 
> pipeline representation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9160) Update AWS SDK to support Pod Level Identity

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9160?focusedWorklogId=384835=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384835
 ]

ASF GitHub Bot logged work on BEAM-9160:


Author: ASF GitHub Bot
Created on: 10/Feb/20 23:30
Start Date: 10/Feb/20 23:30
Worklog Time Spent: 10m 
  Work Description: andeb commented on issue #10825: [BEAM-9160] Update AWS 
SDK to support Pod Level Identity
URL: https://github.com/apache/beam/pull/10825#issuecomment-584409822
 
 
   R: @jbonofre @lukecwik @chamikaramj @timrobertson100
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384835)
Time Spent: 0.5h  (was: 20m)

> Update AWS SDK to support Pod Level Identity
> 
>
> Key: BEAM-9160
> URL: https://issues.apache.org/jira/browse/BEAM-9160
> Project: Beam
>  Issue Type: Improvement
>  Components: dependencies
>Affects Versions: 2.17.0
>Reporter: Mohamed Noah
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Many organizations have started leveraging pod level identity in Kubernetes. 
> The current version of the AWS SDK packaged with Bean 2.17.0 is out of date 
> and doesn't provide native support to pod level identity access management.
>  
> It is recommended that we introduce support to access AWS resources such as 
> S3 using pod level identity. 
> Current Version of the AWS Java SDK in Beam:
> def aws_java_sdk_version = "1.11.519"
> Proposed AWS Java SDK Version:
> 
>  com.amazonaws
>  aws-java-sdk
>  1.11.710
> 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9288) Conscrypt shaded dependency

2020-02-10 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034017#comment-17034017
 ] 

Luke Cwik commented on BEAM-9288:
-

The idea would be to use a solution similar to tcnative where the library path 
can be "parsed" to provide the appropriate package prefix: 
https://github.com/netty/netty-tcnative/blob/9fd38de8a07cc413b83cacf39470f2845a68e26c/openssl-dynamic/src/main/c/jnilib.c#L233

> Conscrypt shaded dependency
> ---
>
> Key: BEAM-9288
> URL: https://issues.apache.org/jira/browse/BEAM-9288
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Esun Kim
>Assignee: sunjincheng
>Priority: Major
>
> Conscrypt is not designed to be shaded properly mainly because of so files. I 
> happened to see BEAM-9030 (*1) creating a new vendored gRPC shading Conscrypt 
> (*2) in it. I think this could make a problem when new Conscrypt is brought 
> by new gcsio depending on gRPC-alts (*4) in a dependency chain. (*5) In this 
> case, it may have a conflict when finding proper so files for Conscrypt. 
> *1: https://issues.apache.org/jira/browse/BEAM-9030
> *2:  
> [https://github.com/apache/beam/blob/e24d1e51cbabe27cb3cc381fd95b334db639c45d/buildSrc/src/main/groovy/org/apache/beam/gradle/GrpcVendoring_1_26_0.groovy#L78]
> *3: https://issues.apache.org/jira/browse/BEAM-6136
> *4: [https://mvnrepository.com/artifact/io.grpc/grpc-alts/1.27.0]
> *5: https://issues.apache.org/jira/browse/BEAM-8889
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9160) Update AWS SDK to support Pod Level Identity

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9160?focusedWorklogId=384832=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384832
 ]

ASF GitHub Bot logged work on BEAM-9160:


Author: ASF GitHub Bot
Created on: 10/Feb/20 23:26
Start Date: 10/Feb/20 23:26
Worklog Time Spent: 10m 
  Work Description: andeb commented on issue #10825: [BEAM-9160] Update AWS 
SDK to support Pod Level Identity
URL: https://github.com/apache/beam/pull/10825#issuecomment-584408814
 
 
   R: @aaltay @kennknowles
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384832)
Time Spent: 20m  (was: 10m)

> Update AWS SDK to support Pod Level Identity
> 
>
> Key: BEAM-9160
> URL: https://issues.apache.org/jira/browse/BEAM-9160
> Project: Beam
>  Issue Type: Improvement
>  Components: dependencies
>Affects Versions: 2.17.0
>Reporter: Mohamed Noah
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Many organizations have started leveraging pod level identity in Kubernetes. 
> The current version of the AWS SDK packaged with Bean 2.17.0 is out of date 
> and doesn't provide native support to pod level identity access management.
>  
> It is recommended that we introduce support to access AWS resources such as 
> S3 using pod level identity. 
> Current Version of the AWS Java SDK in Beam:
> def aws_java_sdk_version = "1.11.519"
> Proposed AWS Java SDK Version:
> 
>  com.amazonaws
>  aws-java-sdk
>  1.11.710
> 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9160) Update AWS SDK to support Pod Level Identity

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9160?focusedWorklogId=384831=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384831
 ]

ASF GitHub Bot logged work on BEAM-9160:


Author: ASF GitHub Bot
Created on: 10/Feb/20 23:25
Start Date: 10/Feb/20 23:25
Worklog Time Spent: 10m 
  Work Description: andeb commented on pull request #10825: [BEAM-9160] 
Update AWS SDK to support Pod Level Identity
URL: https://github.com/apache/beam/pull/10825
 
 
   Upgraded AWS SDK and added support to [pod 
identity](https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/)
 when running in AWS EKS.
   
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)[![Build
 

[jira] [Commented] (BEAM-9252) Problem shading Beam pipeline with Beam 2.20.0-SNAPSHOT

2020-02-10 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034013#comment-17034013
 ] 

Luke Cwik commented on BEAM-9252:
-

It looks like conscrypt may have been shaded incorrectly and is causing 
conflicts with other versions of conscrypt on the classpath due to JNI. See 
BEAM-9288 for additional details.

> Problem shading Beam pipeline with Beam 2.20.0-SNAPSHOT
> ---
>
> Key: BEAM-9252
> URL: https://issues.apache.org/jira/browse/BEAM-9252
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Affects Versions: 2.20.0
>Reporter: Ismaël Mejía
>Priority: Blocker
> Fix For: 2.20.0
>
>
> I was checking today a pipeline against the latest 2.20.0-SNAPSHOT and I 
> found that it works perfectly with version 2.19.0, but it is failing with a  
> shade related exception that refers to grpc 1.26.0:
> {{[ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-shade-plugin:3.2.1:shade (default) on project 
> EventsToIOs: Error creating shaded jar: Problem shading JAR 
> /home/ismael/.m2/repository/org/apache/beam/beam-vendor-grpc-1_26_0/0.1/beam-vendor-grpc-1_26_0-0.1.jar
>  entry org/apache/beam/vendor/grpc/v1p26p0/org/jboss/modules/Main.class: 
> org.apache.maven.plugin.MojoExecutionException: Error in ASM processing class 
> org/apache/beam/vendor/grpc/v1p26p0/org/jboss/modules/Main.class: 65536 -> 
> [Help 1]}}
> {{There is also a warning that is not present in the build against 2.19.0}}
> {{[WARNING] Discovered module-info.class. Shading will break its strong 
> encapsulation.}}
>  
> I wonder if we are not doing something wrong during our vendoring, can 
> someone take a look please.
> This is relatively easy to reproduce with the beam-samples repo, just clone 
> it and run:
> {noformat}
> git clone https://github.com/jbonofre/beam-samples
> mvn clean verify -Pbeam-release-repo -Dbeam.version=2.20.0-SNAPSHOT
> {noformat}
> Available logs of the latest run:
> [https://github.com/jbonofre/beam-samples/runs/427537544?check_suite_focus=true]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9288) Conscrypt shaded dependency

2020-02-10 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034010#comment-17034010
 ] 

Luke Cwik commented on BEAM-9288:
-

cc: [~sunjincheng121]

 

Conscrypt can be shaded properly by building a special version of the so file 
with a JNI_JARJAR_PREFIX defined. It would be best if conscrypt was updated to 
support auto-detection of post relocation path instead of needing everyone to 
build their own version.

> Conscrypt shaded dependency
> ---
>
> Key: BEAM-9288
> URL: https://issues.apache.org/jira/browse/BEAM-9288
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Esun Kim
>Assignee: sunjincheng
>Priority: Major
>
> Conscrypt is not designed to be shaded properly mainly because of so files. I 
> happened to see BEAM-9030 (*1) creating a new vendored gRPC shading Conscrypt 
> (*2) in it. I think this could make a problem when new Conscrypt is brought 
> by new gcsio depending on gRPC-alts (*4) in a dependency chain. (*5) In this 
> case, it may have a conflict when finding proper so files for Conscrypt. 
> *1: https://issues.apache.org/jira/browse/BEAM-9030
> *2:  
> [https://github.com/apache/beam/blob/e24d1e51cbabe27cb3cc381fd95b334db639c45d/buildSrc/src/main/groovy/org/apache/beam/gradle/GrpcVendoring_1_26_0.groovy#L78]
> *3: https://issues.apache.org/jira/browse/BEAM-6136
> *4: [https://mvnrepository.com/artifact/io.grpc/grpc-alts/1.27.0]
> *5: https://issues.apache.org/jira/browse/BEAM-8889
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-9288) Conscrypt shaded dependency

2020-02-10 Thread Chamikara Madhusanka Jayalath (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Madhusanka Jayalath reassigned BEAM-9288:
---

Assignee: sunjincheng

> Conscrypt shaded dependency
> ---
>
> Key: BEAM-9288
> URL: https://issues.apache.org/jira/browse/BEAM-9288
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Esun Kim
>Assignee: sunjincheng
>Priority: Major
>
> Conscrypt is not designed to be shaded properly mainly because of so files. I 
> happened to see BEAM-9030 (*1) creating a new vendored gRPC shading Conscrypt 
> (*2) in it. I think this could make a problem when new Conscrypt is brought 
> by new gcsio depending on gRPC-alts (*4) in a dependency chain. (*5) In this 
> case, it may have a conflict when finding proper so files for Conscrypt. 
> *1: https://issues.apache.org/jira/browse/BEAM-9030
> *2:  
> [https://github.com/apache/beam/blob/e24d1e51cbabe27cb3cc381fd95b334db639c45d/buildSrc/src/main/groovy/org/apache/beam/gradle/GrpcVendoring_1_26_0.groovy#L78]
> *3: https://issues.apache.org/jira/browse/BEAM-6136
> *4: [https://mvnrepository.com/artifact/io.grpc/grpc-alts/1.27.0]
> *5: https://issues.apache.org/jira/browse/BEAM-8889
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (BEAM-9288) Conscrypt shaded dependency

2020-02-10 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034008#comment-17034008
 ] 

Chamikara Madhusanka Jayalath edited comment on BEAM-9288 at 2/10/20 11:17 PM:
---

Is there a way we can upgrade gRPC without shading Conscrypt ?

 

Jincheng, can you look into this ?


was (Author: chamikara):
Is there a way we can upgrade gRPC without shading Conscrypt ?

 

@sunjincheng121 can you look into this ?

> Conscrypt shaded dependency
> ---
>
> Key: BEAM-9288
> URL: https://issues.apache.org/jira/browse/BEAM-9288
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Esun Kim
>Priority: Major
>
> Conscrypt is not designed to be shaded properly mainly because of so files. I 
> happened to see BEAM-9030 (*1) creating a new vendored gRPC shading Conscrypt 
> (*2) in it. I think this could make a problem when new Conscrypt is brought 
> by new gcsio depending on gRPC-alts (*4) in a dependency chain. (*5) In this 
> case, it may have a conflict when finding proper so files for Conscrypt. 
> *1: https://issues.apache.org/jira/browse/BEAM-9030
> *2:  
> [https://github.com/apache/beam/blob/e24d1e51cbabe27cb3cc381fd95b334db639c45d/buildSrc/src/main/groovy/org/apache/beam/gradle/GrpcVendoring_1_26_0.groovy#L78]
> *3: https://issues.apache.org/jira/browse/BEAM-6136
> *4: [https://mvnrepository.com/artifact/io.grpc/grpc-alts/1.27.0]
> *5: https://issues.apache.org/jira/browse/BEAM-8889
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9288) Conscrypt shaded dependency

2020-02-10 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034008#comment-17034008
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-9288:
-

Is there a way we can upgrade gRPC without shading Conscrypt ?

 

@sunjincheng121 can you look into this ?

> Conscrypt shaded dependency
> ---
>
> Key: BEAM-9288
> URL: https://issues.apache.org/jira/browse/BEAM-9288
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Esun Kim
>Priority: Major
>
> Conscrypt is not designed to be shaded properly mainly because of so files. I 
> happened to see BEAM-9030 (*1) creating a new vendored gRPC shading Conscrypt 
> (*2) in it. I think this could make a problem when new Conscrypt is brought 
> by new gcsio depending on gRPC-alts (*4) in a dependency chain. (*5) In this 
> case, it may have a conflict when finding proper so files for Conscrypt. 
> *1: https://issues.apache.org/jira/browse/BEAM-9030
> *2:  
> [https://github.com/apache/beam/blob/e24d1e51cbabe27cb3cc381fd95b334db639c45d/buildSrc/src/main/groovy/org/apache/beam/gradle/GrpcVendoring_1_26_0.groovy#L78]
> *3: https://issues.apache.org/jira/browse/BEAM-6136
> *4: [https://mvnrepository.com/artifact/io.grpc/grpc-alts/1.27.0]
> *5: https://issues.apache.org/jira/browse/BEAM-8889
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=384826=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384826
 ]

ASF GitHub Bot logged work on BEAM-5605:


Author: ASF GitHub Bot
Created on: 10/Feb/20 23:12
Start Date: 10/Feb/20 23:12
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10576: [BEAM-5605] 
Convert all BoundedSources to SplittableDoFns when using beam_fn_api experiment.
URL: https://github.com/apache/beam/pull/10576#discussion_r377372981
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java
 ##
 @@ -94,6 +106,20 @@ private Bounded(@Nullable String name, BoundedSource 
source) {
 public final PCollection expand(PBegin input) {
   source.validate();
 
+  if (ExperimentalOptions.hasExperiment(input.getPipeline().getOptions(), 
"beam_fn_api")) {
 
 Review comment:
   I was able to remove JavaReadViaImpulse. It required fixing up some tests 
that weren't setting the `beam_fn_api` experiment.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384826)
Time Spent: 11h 50m  (was: 11h 40m)

> Support Portable SplittableDoFn for batch
> -
>
> Key: BEAM-5605
> URL: https://issues.apache.org/jira/browse/BEAM-5605
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h
>
> Roll-up item tracking work towards supporting portable SplittableDoFn for 
> batch



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9288) Conscrypt shaded dependency

2020-02-10 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034007#comment-17034007
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-9288:
-

cc: [~lcwik] [~medb]

> Conscrypt shaded dependency
> ---
>
> Key: BEAM-9288
> URL: https://issues.apache.org/jira/browse/BEAM-9288
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Esun Kim
>Priority: Major
>
> Conscrypt is not designed to be shaded properly mainly because of so files. I 
> happened to see BEAM-9030 (*1) creating a new vendored gRPC shading Conscrypt 
> (*2) in it. I think this could make a problem when new Conscrypt is brought 
> by new gcsio depending on gRPC-alts (*4) in a dependency chain. (*5) In this 
> case, it may have a conflict when finding proper so files for Conscrypt. 
> *1: https://issues.apache.org/jira/browse/BEAM-9030
> *2:  
> [https://github.com/apache/beam/blob/e24d1e51cbabe27cb3cc381fd95b334db639c45d/buildSrc/src/main/groovy/org/apache/beam/gradle/GrpcVendoring_1_26_0.groovy#L78]
> *3: https://issues.apache.org/jira/browse/BEAM-6136
> *4: [https://mvnrepository.com/artifact/io.grpc/grpc-alts/1.27.0]
> *5: https://issues.apache.org/jira/browse/BEAM-8889
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-9269) Set shorter Commit Deadline and handle with backoff/retry

2020-02-10 Thread Niel Markwick (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niel Markwick reassigned BEAM-9269:
---

Assignee: Niel Markwick

> Set shorter Commit Deadline and handle with backoff/retry
> -
>
> Key: BEAM-9269
> URL: https://issues.apache.org/jira/browse/BEAM-9269
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0, 2.17.0, 2.18.0, 2.19.0
>Reporter: Niel Markwick
>Assignee: Niel Markwick
>Priority: Major
>  Labels: google-cloud-spanner
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Default commit deadline in Spanner is 1hr, which can lead to a variety of 
> issues including database overload and session expiry.
> Shorter deadline should be set with backoff/retry when deadline expires, so 
> that the Spanner database does not become overloaded.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9268) SpannerIO: Better documentation and warning about creating tables in the pipeline

2020-02-10 Thread Niel Markwick (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niel Markwick resolved BEAM-9268.
-
Resolution: Fixed

> SpannerIO: Better documentation and warning about creating tables in the 
> pipeline
> -
>
> Key: BEAM-9268
> URL: https://issues.apache.org/jira/browse/BEAM-9268
> Project: Beam
>  Issue Type: Improvement
>  Components: io-go-gcp
>Affects Versions: 2.16.0, 2.17.0, 2.18.0, 2.19.0
>Reporter: Niel Markwick
>Assignee: Niel Markwick
>Priority: Major
>  Labels: google-cloud-spanner, perfomance
> Fix For: 2.20.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The javadoc for SpannerIO.Write mentions in passing that the transform needs 
> to know the DB schema for optimal performance. If the schema is created 
> within the pipeline, then there is a race between the schema being created 
> and SpannerIO reading it, leading to a potential performance penalty if 
> SpannerIO does not know about the existence of some tables. 
>  
> Javadoc needs to make this clearer and more explicit, and point the user at 
> the Write.withSchemaReadySignal().
>  
> Pipeline needs to raise (rate limited) warnings if it sees writes being made 
> to tables it does not know about (warnings can refer back to javadocs)
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9268) SpannerIO: Better documentation and warning about creating tables in the pipeline

2020-02-10 Thread Niel Markwick (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niel Markwick updated BEAM-9268:

Fix Version/s: 2.20.0

> SpannerIO: Better documentation and warning about creating tables in the 
> pipeline
> -
>
> Key: BEAM-9268
> URL: https://issues.apache.org/jira/browse/BEAM-9268
> Project: Beam
>  Issue Type: Improvement
>  Components: io-go-gcp
>Affects Versions: 2.16.0, 2.17.0, 2.18.0, 2.19.0
>Reporter: Niel Markwick
>Assignee: Niel Markwick
>Priority: Major
>  Labels: google-cloud-spanner, perfomance
> Fix For: 2.20.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The javadoc for SpannerIO.Write mentions in passing that the transform needs 
> to know the DB schema for optimal performance. If the schema is created 
> within the pipeline, then there is a race between the schema being created 
> and SpannerIO reading it, leading to a potential performance penalty if 
> SpannerIO does not know about the existence of some tables. 
>  
> Javadoc needs to make this clearer and more explicit, and point the user at 
> the Write.withSchemaReadySignal().
>  
> Pipeline needs to raise (rate limited) warnings if it sees writes being made 
> to tables it does not know about (warnings can refer back to javadocs)
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-3221) Model pipeline representation improvements

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3221?focusedWorklogId=384786=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384786
 ]

ASF GitHub Bot logged work on BEAM-3221:


Author: ASF GitHub Bot
Created on: 10/Feb/20 22:46
Start Date: 10/Feb/20 22:46
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10779: [BEAM-3221, 
BEAM-4180] Clarify documentation for StandardTransforms.Primitives, Pipeline, 
and PTransform.
URL: https://github.com/apache/beam/pull/10779#issuecomment-584395599
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384786)
Time Spent: 5h  (was: 4h 50m)

> Model pipeline representation improvements
> --
>
> Key: BEAM-3221
> URL: https://issues.apache.org/jira/browse/BEAM-3221
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Collections of various (breaking) tweaks to the Runner API, notably the 
> pipeline representation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9288) Conscrypt shaded dependency

2020-02-10 Thread Esun Kim (Jira)
Esun Kim created BEAM-9288:
--

 Summary: Conscrypt shaded dependency
 Key: BEAM-9288
 URL: https://issues.apache.org/jira/browse/BEAM-9288
 Project: Beam
  Issue Type: Bug
  Components: build-system
Reporter: Esun Kim


Conscrypt is not designed to be shaded properly mainly because of so files. I 
happened to see BEAM-9030 (*1) creating a new vendored gRPC shading Conscrypt 
(*2) in it. I think this could make a problem when new Conscrypt is brought by 
new gcsio depending on gRPC-alts (*4) in a dependency chain. (*5) In this case, 
it may have a conflict when finding proper so files for Conscrypt. 

*1: https://issues.apache.org/jira/browse/BEAM-9030

*2:  
[https://github.com/apache/beam/blob/e24d1e51cbabe27cb3cc381fd95b334db639c45d/buildSrc/src/main/groovy/org/apache/beam/gradle/GrpcVendoring_1_26_0.groovy#L78]

*3: https://issues.apache.org/jira/browse/BEAM-6136

*4: [https://mvnrepository.com/artifact/io.grpc/grpc-alts/1.27.0]

*5: https://issues.apache.org/jira/browse/BEAM-8889

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=384776=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384776
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 10/Feb/20 22:23
Start Date: 10/Feb/20 22:23
Worklog Time Spent: 10m 
  Work Description: suztomo commented on issue #10769: [BEAM-8889] Upgrades 
gcsio to 2.0.0
URL: https://github.com/apache/beam/pull/10769#issuecomment-584387407
 
 
   @veblush Added diagram to clarify "use Apache Beam and GCS's Cooperative 
Locking together".
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384776)
Remaining Estimate: 154h 20m  (was: 154.5h)
Time Spent: 13h 40m  (was: 13.5h)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: Major
>  Labels: gcs
>   Original Estimate: 168h
>  Time Spent: 13h 40m
>  Remaining Estimate: 154h 20m
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class to access Google Cloud Storage on Apache Beam. Current 
> implementation directly creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel by itself to read and write GCS data rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability which eventually 
> creates channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create read and write channel, which is expected 
> flexible because it can easily pick up the new change; e.g. new channel 
> implementation using new protocol without code change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=384775=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384775
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 10/Feb/20 22:22
Start Date: 10/Feb/20 22:22
Worklog Time Spent: 10m 
  Work Description: suztomo commented on issue #10769: [BEAM-8889] Upgrades 
gcsio to 2.0.0
URL: https://github.com/apache/beam/pull/10769#issuecomment-584308357
 
 
   @veblush I don't approve or disapprove this PR. I don't know how many people 
use Apache Beam and GCS's Cooperative Locking together.
   
   
![image](https://user-images.githubusercontent.com/28604/74195716-df2f2380-4c29-11ea-964b-d4b17b3950ce.png)
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384775)
Remaining Estimate: 154.5h  (was: 154h 40m)
Time Spent: 13.5h  (was: 13h 20m)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: Major
>  Labels: gcs
>   Original Estimate: 168h
>  Time Spent: 13.5h
>  Remaining Estimate: 154.5h
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class to access Google Cloud Storage on Apache Beam. Current 
> implementation directly creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel by itself to read and write GCS data rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability which eventually 
> creates channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create read and write channel, which is expected 
> flexible because it can easily pick up the new change; e.g. new channel 
> implementation using new protocol without code change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9269) Set shorter Commit Deadline and handle with backoff/retry

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9269?focusedWorklogId=384773=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384773
 ]

ASF GitHub Bot logged work on BEAM-9269:


Author: ASF GitHub Bot
Created on: 10/Feb/20 22:04
Start Date: 10/Feb/20 22:04
Worklog Time Spent: 10m 
  Work Description: nithinsujir commented on issue #10752: [BEAM-9269] Add 
commit deadline for Spanner writes.
URL: https://github.com/apache/beam/pull/10752#issuecomment-584380391
 
 
   Looks ok to me.
   
   Allen,
   Can you take a look as well?
   
   
   On Mon, Feb 10, 2020 at 1:43 PM Chamikara Jayalath 
   wrote:
   
   > Thanks Niel.
   >
   > R: @nithinsujir  can you review ?
   >
   > —
   > You are receiving this because you were mentioned.
   > Reply to this email directly, view it on GitHub
   > 
,
   > or unsubscribe
   > 

   > .
   >
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384773)
Time Spent: 1h 10m  (was: 1h)

> Set shorter Commit Deadline and handle with backoff/retry
> -
>
> Key: BEAM-9269
> URL: https://issues.apache.org/jira/browse/BEAM-9269
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0, 2.17.0, 2.18.0, 2.19.0
>Reporter: Niel Markwick
>Priority: Major
>  Labels: google-cloud-spanner
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Default commit deadline in Spanner is 1hr, which can lead to a variety of 
> issues including database overload and session expiry.
> Shorter deadline should be set with backoff/retry when deadline expires, so 
> that the Spanner database does not become overloaded.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9287) Python Validates runner tests for Unified Worker

2020-02-10 Thread Ankur Goenka (Jira)
Ankur Goenka created BEAM-9287:
--

 Summary: Python Validates runner tests for Unified Worker
 Key: BEAM-9287
 URL: https://issues.apache.org/jira/browse/BEAM-9287
 Project: Beam
  Issue Type: Test
  Components: runner-dataflow, testing
Reporter: Ankur Goenka






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=384767=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384767
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:47
Start Date: 10/Feb/20 21:47
Worklog Time Spent: 10m 
  Work Description: veblush commented on issue #10769: [BEAM-8889] Upgrades 
gcsio to 2.0.0
URL: https://github.com/apache/beam/pull/10769#issuecomment-584373451
 
 
   @suztomo I don't think that cooplock is used by Beam.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384767)
Remaining Estimate: 154h 40m  (was: 154h 50m)
Time Spent: 13h 20m  (was: 13h 10m)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: Major
>  Labels: gcs
>   Original Estimate: 168h
>  Time Spent: 13h 20m
>  Remaining Estimate: 154h 40m
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class to access Google Cloud Storage on Apache Beam. Current 
> implementation directly creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel by itself to read and write GCS data rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability which eventually 
> creates channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create read and write channel, which is expected 
> flexible because it can easily pick up the new change; e.g. new channel 
> implementation using new protocol without code change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8932) Expose complete Cloud Pub/Sub messages through PubsubIO API

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8932?focusedWorklogId=384766=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384766
 ]

ASF GitHub Bot logged work on BEAM-8932:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:44
Start Date: 10/Feb/20 21:44
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #10476: 
[BEAM-8932][Cleanup] Move external PubsubIO hooks outside of PubsubIO.
URL: https://github.com/apache/beam/pull/10476#issuecomment-584372372
 
 
   Retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384766)
Time Spent: 14h 50m  (was: 14h 40m)

> Expose complete Cloud Pub/Sub messages through PubsubIO API
> ---
>
> Key: BEAM-8932
> URL: https://issues.apache.org/jira/browse/BEAM-8932
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Daniel Collins
>Assignee: Daniel Collins
>Priority: Major
>  Time Spent: 14h 50m
>  Remaining Estimate: 0h
>
> The PubsubIO API only exposes a subset of the fields in the underlying 
> PubsubMessage protocol buffer. To accomodate future feature changes as well 
> as for greater compatability with code using the Cloud Pub/Sub apis, a method 
> to read and write these protocol messages should be exposed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9269) Set shorter Commit Deadline and handle with backoff/retry

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9269?focusedWorklogId=384764=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384764
 ]

ASF GitHub Bot logged work on BEAM-9269:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:43
Start Date: 10/Feb/20 21:43
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #10752: [BEAM-9269] Add 
commit deadline for Spanner writes.
URL: https://github.com/apache/beam/pull/10752#issuecomment-584371801
 
 
   Thanks Niel.
   
   R: @nithinsujir can you review ?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384764)
Time Spent: 1h  (was: 50m)

> Set shorter Commit Deadline and handle with backoff/retry
> -
>
> Key: BEAM-9269
> URL: https://issues.apache.org/jira/browse/BEAM-9269
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0, 2.17.0, 2.18.0, 2.19.0
>Reporter: Niel Markwick
>Priority: Major
>  Labels: google-cloud-spanner
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Default commit deadline in Spanner is 1hr, which can lead to a variety of 
> issues including database overload and session expiry.
> Shorter deadline should be set with backoff/retry when deadline expires, so 
> that the Spanner database does not become overloaded.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=384760=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384760
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:39
Start Date: 10/Feb/20 21:39
Worklog Time Spent: 10m 
  Work Description: veblush commented on issue #10617: [BEAM-8889] Cleanup 
Beam to GCS connector interfacing code so it uses higher level objects such as 
GoogleCloudStorage.
URL: https://github.com/apache/beam/pull/10617#issuecomment-584370267
 
 
   I don't have a strong opinion over this as far as this option can be 
introduced as an opt-in and later an opt once it's considered reliable enough.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384760)
Remaining Estimate: 154h 50m  (was: 155h)
Time Spent: 13h 10m  (was: 13h)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: Major
>  Labels: gcs
>   Original Estimate: 168h
>  Time Spent: 13h 10m
>  Remaining Estimate: 154h 50m
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class to access Google Cloud Storage on Apache Beam. Current 
> implementation directly creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel by itself to read and write GCS data rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability which eventually 
> creates channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create read and write channel, which is expected 
> flexible because it can easily pick up the new change; e.g. new channel 
> implementation using new protocol without code change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8630) Prototype of BeamSQL Calc using ZetaSQL Expression Evaluator

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8630?focusedWorklogId=384759=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384759
 ]

ASF GitHub Bot logged work on BEAM-8630:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:37
Start Date: 10/Feb/20 21:37
Worklog Time Spent: 10m 
  Work Description: apilloud commented on pull request #10820: [BEAM-8630] 
Validate prepared expression on expand
URL: https://github.com/apache/beam/pull/10820
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384759)
Time Spent: 10h 40m  (was: 10.5h)

> Prototype of BeamSQL Calc using ZetaSQL Expression Evaluator
> 
>
> Key: BEAM-8630
> URL: https://issues.apache.org/jira/browse/BEAM-8630
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Yueyang Qiu
>Assignee: Yueyang Qiu
>Priority: Major
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8758) Beam Dependency Update Request: com.google.cloud:google-cloud-spanner

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8758?focusedWorklogId=384758=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384758
 ]

ASF GitHub Bot logged work on BEAM-8758:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:36
Start Date: 10/Feb/20 21:36
Worklog Time Spent: 10m 
  Work Description: suztomo commented on issue #10765: [BEAM-8758] 
Google-cloud-spanner upgrade to 1.49.1
URL: https://github.com/apache/beam/pull/10765#issuecomment-584368783
 
 
   @lukecwik Thank you for review. Will wait for their responses.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384758)
Time Spent: 4h 10m  (was: 4h)

> Beam Dependency Update Request: com.google.cloud:google-cloud-spanner
> -
>
> Key: BEAM-8758
> URL: https://issues.apache.org/jira/browse/BEAM-8758
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
>  - 2019-11-19 21:05:29.289016 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.46.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:11:08.926875 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.46.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:10:16.400168 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:10:17.656471 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:05:49.080960 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:09:23.346857 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-13 12:09:02.023131 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-20 12:08:38.419575 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.48.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 

[jira] [Work logged] (BEAM-8758) Beam Dependency Update Request: com.google.cloud:google-cloud-spanner

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8758?focusedWorklogId=384753=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384753
 ]

ASF GitHub Bot logged work on BEAM-8758:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:26
Start Date: 10/Feb/20 21:26
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10765: [BEAM-8758] 
Google-cloud-spanner upgrade to 1.49.1
URL: https://github.com/apache/beam/pull/10765#issuecomment-584364663
 
 
   If we don't hear from @mairbek, lets move `SpannerConfig#connectToSpanner` 
as I suggested.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384753)
Time Spent: 3h 50m  (was: 3h 40m)

> Beam Dependency Update Request: com.google.cloud:google-cloud-spanner
> -
>
> Key: BEAM-8758
> URL: https://issues.apache.org/jira/browse/BEAM-8758
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
>  - 2019-11-19 21:05:29.289016 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.46.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:11:08.926875 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.46.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:10:16.400168 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:10:17.656471 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:05:49.080960 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:09:23.346857 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-13 12:09:02.023131 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-20 12:08:38.419575 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.48.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  

[jira] [Work logged] (BEAM-8758) Beam Dependency Update Request: com.google.cloud:google-cloud-spanner

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8758?focusedWorklogId=384755=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384755
 ]

ASF GitHub Bot logged work on BEAM-8758:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:26
Start Date: 10/Feb/20 21:26
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #10765: [BEAM-8758] 
Google-cloud-spanner upgrade to 1.49.1
URL: https://github.com/apache/beam/pull/10765#issuecomment-584364742
 
 
   cc: @nithinsujir
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384755)
Time Spent: 4h  (was: 3h 50m)

> Beam Dependency Update Request: com.google.cloud:google-cloud-spanner
> -
>
> Key: BEAM-8758
> URL: https://issues.apache.org/jira/browse/BEAM-8758
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
>  - 2019-11-19 21:05:29.289016 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.46.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:11:08.926875 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.46.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:10:16.400168 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:10:17.656471 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:05:49.080960 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:09:23.346857 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-13 12:09:02.023131 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-20 12:08:38.419575 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.48.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-27 12:09:44.298346 
> 

[jira] [Work logged] (BEAM-8758) Beam Dependency Update Request: com.google.cloud:google-cloud-spanner

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8758?focusedWorklogId=384752=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384752
 ]

ASF GitHub Bot logged work on BEAM-8758:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:24
Start Date: 10/Feb/20 21:24
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10765: [BEAM-8758] 
Google-cloud-spanner upgrade to 1.49.1
URL: https://github.com/apache/beam/pull/10765#issuecomment-584363065
 
 
   It seems like we should be able to move `SpannerConfig#connectToSpanner` 
into `SpannerAccessor` making a package private `static SpannerAccessor 
create(SpannerConfig)` method so that we don't expose `SpannerAccessor` which 
would reduce the API surface area.
   
   @mairbek Is there a reason to expose the `SpannerConfig#connectToSpanner`?
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384752)
Time Spent: 3h 40m  (was: 3.5h)

> Beam Dependency Update Request: com.google.cloud:google-cloud-spanner
> -
>
> Key: BEAM-8758
> URL: https://issues.apache.org/jira/browse/BEAM-8758
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
>  - 2019-11-19 21:05:29.289016 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.46.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:11:08.926875 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.46.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:10:16.400168 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:10:17.656471 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:05:49.080960 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:09:23.346857 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-13 12:09:02.023131 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-20 12:08:38.419575 
> -
> Please consider upgrading the dependency 
> 

[jira] [Work logged] (BEAM-8758) Beam Dependency Update Request: com.google.cloud:google-cloud-spanner

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8758?focusedWorklogId=384751=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384751
 ]

ASF GitHub Bot logged work on BEAM-8758:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:23
Start Date: 10/Feb/20 21:23
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10765: [BEAM-8758] 
Google-cloud-spanner upgrade to 1.49.1
URL: https://github.com/apache/beam/pull/10765#issuecomment-584363065
 
 
   It seems like we should be able to move `SpannerConfig#connectToSpanner` 
into `SpannerAccessor` making a package private `static SpannerAccessor 
create(SpannerConfig)` method so that we don't expose `SpannerAccessor` which 
would reduce the API surface area.
   
   @mairbek (Googler) Is there a reason to expose the 
`SpannerConfig#connectToSpanner`?
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384751)
Time Spent: 3.5h  (was: 3h 20m)

> Beam Dependency Update Request: com.google.cloud:google-cloud-spanner
> -
>
> Key: BEAM-8758
> URL: https://issues.apache.org/jira/browse/BEAM-8758
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
>  - 2019-11-19 21:05:29.289016 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.46.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:11:08.926875 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.46.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:10:16.400168 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:10:17.656471 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:05:49.080960 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:09:23.346857 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-13 12:09:02.023131 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-20 12:08:38.419575 
> -
> Please consider upgrading the dependency 
> 

[jira] [Work logged] (BEAM-8758) Beam Dependency Update Request: com.google.cloud:google-cloud-spanner

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8758?focusedWorklogId=384750=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384750
 ]

ASF GitHub Bot logged work on BEAM-8758:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:22
Start Date: 10/Feb/20 21:22
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10765: [BEAM-8758] 
Google-cloud-spanner upgrade to 1.49.1
URL: https://github.com/apache/beam/pull/10765#issuecomment-584363065
 
 
   It seems like we should be able to move `SpannerConfig#connectToSpanner` 
into `SpannerAccessor` making a package private `static SpannerAccessor 
create(SpannerConfig)` method so that we don't expose `SpannerAccessor` which 
would reduce the API surface area.
   
   @mairbek Is there a reason to expose the `SpannerConfig#connectToSpanner`?
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384750)
Time Spent: 3h 20m  (was: 3h 10m)

> Beam Dependency Update Request: com.google.cloud:google-cloud-spanner
> -
>
> Key: BEAM-8758
> URL: https://issues.apache.org/jira/browse/BEAM-8758
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
>  - 2019-11-19 21:05:29.289016 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.46.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:11:08.926875 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.46.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:10:16.400168 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:10:17.656471 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:05:49.080960 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:09:23.346857 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-13 12:09:02.023131 
> -
> Please consider upgrading the dependency 
> com.google.cloud:google-cloud-spanner. 
> The current version is 1.6.0. The latest version is 1.47.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-20 12:08:38.419575 
> -
> Please consider upgrading the dependency 
> 

[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=384749=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384749
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:22
Start Date: 10/Feb/20 21:22
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #10769: [BEAM-8889] 
Upgrades gcsio to 2.0.0
URL: https://github.com/apache/beam/pull/10769#issuecomment-584363007
 
 
   Probably call this out in the dev list ?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384749)
Remaining Estimate: 155h  (was: 155h 10m)
Time Spent: 13h  (was: 12h 50m)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: Major
>  Labels: gcs
>   Original Estimate: 168h
>  Time Spent: 13h
>  Remaining Estimate: 155h
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class to access Google Cloud Storage on Apache Beam. Current 
> implementation directly creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel by itself to read and write GCS data rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability which eventually 
> creates channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create read and write channel, which is expected 
> flexible because it can easily pick up the new change; e.g. new channel 
> implementation using new protocol without code change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=384748=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384748
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:21
Start Date: 10/Feb/20 21:21
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #10617: [BEAM-8889] 
Cleanup Beam to GCS connector interfacing code so it uses higher level objects 
such as GoogleCloudStorage.
URL: https://github.com/apache/beam/pull/10617#issuecomment-584362764
 
 
   That makes sense assuming that we can reasonably make the gRPC/DirectPath 
the default later without a regression in performance or features. @vnorigoog 
@veblush WDYT ?
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384748)
Remaining Estimate: 155h 10m  (was: 155h 20m)
Time Spent: 12h 50m  (was: 12h 40m)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: Major
>  Labels: gcs
>   Original Estimate: 168h
>  Time Spent: 12h 50m
>  Remaining Estimate: 155h 10m
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class to access Google Cloud Storage on Apache Beam. Current 
> implementation directly creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel by itself to read and write GCS data rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability which eventually 
> creates channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create read and write channel, which is expected 
> flexible because it can easily pick up the new change; e.g. new channel 
> implementation using new protocol without code change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9229) Adding dependency information to Environment proto

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9229?focusedWorklogId=384746=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384746
 ]

ASF GitHub Bot logged work on BEAM-9229:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:19
Start Date: 10/Feb/20 21:19
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #10733: [BEAM-9229] Adding 
dependency information to Environment proto
URL: https://github.com/apache/beam/pull/10733#issuecomment-584359901
 
 
   > Thanks. Could you also include the changes to the artifact staging and/or 
retrieval service? And the expansion service?
   
   I think it would be better to review them separately. The changes to the 
artifact staging and retrieval service proto require the not-too-small changes 
to the Java / Python / Go source codes to be compiled. We can submit this PR 
without breaking the build but not the PR with the service changes.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384746)
Time Spent: 2h 10m  (was: 2h)

> Adding dependency information to Environment proto
> --
>
> Key: BEAM-9229
> URL: https://issues.apache.org/jira/browse/BEAM-9229
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Adding dependency information to Environment proto.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9229) Adding dependency information to Environment proto

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9229?focusedWorklogId=384743=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384743
 ]

ASF GitHub Bot logged work on BEAM-9229:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:13
Start Date: 10/Feb/20 21:13
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #10733: [BEAM-9229] 
Adding dependency information to Environment proto
URL: https://github.com/apache/beam/pull/10733#discussion_r377319480
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -1087,6 +1087,44 @@ message SideInput {
   FunctionSpec window_mapping_fn = 3;
 }
 
+message StandardArtifacts {
+  enum Types {
+// A URN for artifacts stored in a local directory.
+// payload: ArtifactFilePayload.
+FILE = 0 [(beam_urn) = "beam:artifact:file:v1"];
+// A URN for artifacts embedded in ArtifactInformation proto.
+// payload: raw data bytes.
+EMBEDDED = 1 [(beam_urn) = "beam:artifact:embedded:v1"];
+// A URN for artifacts described by HTTP links.
+// payload: a string for an artifact HTTP URL
+HTTP = 2 [(beam_urn) = "beam:artifact:http:v1"];
+// A URN for artifacts hosted on PYPI.
+// artifact_id: a PYPI project name
+// version_range: a PYPI compatible version string
+// payload: None
+PYPI = 3 [(beam_urn) = "beam:artifact:pypi:v1"];
+// A URN for artifacts hosted on Maven central.
+// artifact_id: [maven group id]:[maven artifact id]
+// version_range: a Maven compatible version string
+// payload: None
+MAVEN= 4 [(beam_urn) = "beam:artifact:maven:v1"];
+  }
+}
+
+message ArtifactFilePayload {
+  // A path to an artifact file on a local system.
+  string local_path = 1;
+  // A generated staged name (no path).
+  string staged_name = 2;
+}
+
+message ArtifactInformation {
+  string urn = 1;
+  bytes payload = 2;
+  string artifact_id = 3;
 
 Review comment:
   We also need artifact_id and version_range in local types (local files and 
embedded files) for possible deduplication. If the field is common for all 
types of artifacts I thought it should be the top-level attribute. And yes it'd 
be nice to use a standard format for the top-level version_range arrtibute but 
I don't have a good idea at this moment.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384743)
Time Spent: 2h  (was: 1h 50m)

> Adding dependency information to Environment proto
> --
>
> Key: BEAM-9229
> URL: https://issues.apache.org/jira/browse/BEAM-9229
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Adding dependency information to Environment proto.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9229) Adding dependency information to Environment proto

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9229?focusedWorklogId=384741=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384741
 ]

ASF GitHub Bot logged work on BEAM-9229:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:04
Start Date: 10/Feb/20 21:04
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #10733: [BEAM-9229] 
Adding dependency information to Environment proto
URL: https://github.com/apache/beam/pull/10733#discussion_r377315362
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -1087,6 +1087,44 @@ message SideInput {
   FunctionSpec window_mapping_fn = 3;
 }
 
+message StandardArtifacts {
+  enum Types {
+// A URN for artifacts stored in a local directory.
+// payload: ArtifactFilePayload.
+FILE = 0 [(beam_urn) = "beam:artifact:file:v1"];
+// A URN for artifacts embedded in ArtifactInformation proto.
+// payload: raw data bytes.
+EMBEDDED = 1 [(beam_urn) = "beam:artifact:embedded:v1"];
+// A URN for artifacts described by HTTP links.
+// payload: a string for an artifact HTTP URL
+HTTP = 2 [(beam_urn) = "beam:artifact:http:v1"];
+// A URN for artifacts hosted on PYPI.
+// artifact_id: a PYPI project name
+// version_range: a PYPI compatible version string
+// payload: None
+PYPI = 3 [(beam_urn) = "beam:artifact:pypi:v1"];
+// A URN for artifacts hosted on Maven central.
+// artifact_id: [maven group id]:[maven artifact id]
+// version_range: a Maven compatible version string
+// payload: None
+MAVEN= 4 [(beam_urn) = "beam:artifact:maven:v1"];
+  }
+}
+
+message ArtifactFilePayload {
+  // A path to an artifact file on a local system.
+  string local_path = 1;
+  // A generated staged name (no path).
+  string staged_name = 2;
+}
+
+message ArtifactInformation {
+  string urn = 1;
+  bytes payload = 2;
+  string artifact_id = 3;
+  string version_range = 4;
+}
 
 Review comment:
   Thanks for a good catch. Added staged_name for embedded payload.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384741)
Time Spent: 1h 50m  (was: 1h 40m)

> Adding dependency information to Environment proto
> --
>
> Key: BEAM-9229
> URL: https://issues.apache.org/jira/browse/BEAM-9229
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Adding dependency information to Environment proto.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8932) Expose complete Cloud Pub/Sub messages through PubsubIO API

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8932?focusedWorklogId=384739=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384739
 ]

ASF GitHub Bot logged work on BEAM-8932:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:03
Start Date: 10/Feb/20 21:03
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #10476: 
[BEAM-8932][Cleanup] Move external PubsubIO hooks outside of PubsubIO.
URL: https://github.com/apache/beam/pull/10476#issuecomment-584352808
 
 
   Retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384739)
Time Spent: 14h 40m  (was: 14.5h)

> Expose complete Cloud Pub/Sub messages through PubsubIO API
> ---
>
> Key: BEAM-8932
> URL: https://issues.apache.org/jira/browse/BEAM-8932
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Daniel Collins
>Assignee: Daniel Collins
>Priority: Major
>  Time Spent: 14h 40m
>  Remaining Estimate: 0h
>
> The PubsubIO API only exposes a subset of the fields in the underlying 
> PubsubMessage protocol buffer. To accomodate future feature changes as well 
> as for greater compatability with code using the Cloud Pub/Sub apis, a method 
> to read and write these protocol messages should be exposed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9229) Adding dependency information to Environment proto

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9229?focusedWorklogId=384738=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384738
 ]

ASF GitHub Bot logged work on BEAM-9229:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:03
Start Date: 10/Feb/20 21:03
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #10733: [BEAM-9229] 
Adding dependency information to Environment proto
URL: https://github.com/apache/beam/pull/10733#discussion_r377314762
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -1087,6 +1087,44 @@ message SideInput {
   FunctionSpec window_mapping_fn = 3;
 }
 
+message StandardArtifacts {
+  enum Types {
+// A URN for artifacts stored in a local directory.
+// payload: ArtifactFilePayload.
+FILE = 0 [(beam_urn) = "beam:artifact:file:v1"];
+// A URN for artifacts embedded in ArtifactInformation proto.
+// payload: raw data bytes.
+EMBEDDED = 1 [(beam_urn) = "beam:artifact:embedded:v1"];
+// A URN for artifacts described by HTTP links.
+// payload: a string for an artifact HTTP URL
+HTTP = 2 [(beam_urn) = "beam:artifact:http:v1"];
+// A URN for artifacts hosted on PYPI.
+// artifact_id: a PYPI project name
+// version_range: a PYPI compatible version string
+// payload: None
+PYPI = 3 [(beam_urn) = "beam:artifact:pypi:v1"];
+// A URN for artifacts hosted on Maven central.
+// artifact_id: [maven group id]:[maven artifact id]
+// version_range: a Maven compatible version string
+// payload: None
+MAVEN= 4 [(beam_urn) = "beam:artifact:maven:v1"];
+  }
+}
+
+message ArtifactFilePayload {
+  // A path to an artifact file on a local system.
+  string local_path = 1;
+  // A generated staged name (no path).
+  string staged_name = 2;
+}
+
+message ArtifactInformation {
+  string urn = 1;
 
 Review comment:
   Looks like `urn` and `payload` combination is mostly common in 
`beam_runner_api.proto` since I couldn't find any other usage of `string type 
=`. I think it would be better to follow the same convention here. WDYT?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384738)
Time Spent: 1h 40m  (was: 1.5h)

> Adding dependency information to Environment proto
> --
>
> Key: BEAM-9229
> URL: https://issues.apache.org/jira/browse/BEAM-9229
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Adding dependency information to Environment proto.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-3221) Model pipeline representation improvements

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3221?focusedWorklogId=384734=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384734
 ]

ASF GitHub Bot logged work on BEAM-3221:


Author: ASF GitHub Bot
Created on: 10/Feb/20 21:00
Start Date: 10/Feb/20 21:00
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10779: [BEAM-3221, 
BEAM-4180] Clarify documentation for StandardTransforms.Primitives, Pipeline, 
and PTransform.
URL: https://github.com/apache/beam/pull/10779#issuecomment-584351541
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384734)
Time Spent: 4h 50m  (was: 4h 40m)

> Model pipeline representation improvements
> --
>
> Key: BEAM-3221
> URL: https://issues.apache.org/jira/browse/BEAM-3221
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Collections of various (breaking) tweaks to the Runner API, notably the 
> pipeline representation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9281) Update commons-csv to version 1.8

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9281?focusedWorklogId=384730=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384730
 ]

ASF GitHub Bot logged work on BEAM-9281:


Author: ASF GitHub Bot
Created on: 10/Feb/20 20:56
Start Date: 10/Feb/20 20:56
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10818: [BEAM-9281] Update 
commons-csv to version 1.8
URL: https://github.com/apache/beam/pull/10818#issuecomment-584214767
 
 
   Run CommunityMetrics PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384730)
Time Spent: 0.5h  (was: 20m)

> Update commons-csv to version 1.8
> -
>
> Key: BEAM-9281
> URL: https://issues.apache.org/jira/browse/BEAM-9281
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Trivial
> Fix For: 2.20.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9229) Adding dependency information to Environment proto

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9229?focusedWorklogId=384726=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384726
 ]

ASF GitHub Bot logged work on BEAM-9229:


Author: ASF GitHub Bot
Created on: 10/Feb/20 20:55
Start Date: 10/Feb/20 20:55
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #10733: [BEAM-9229] 
Adding dependency information to Environment proto
URL: https://github.com/apache/beam/pull/10733#discussion_r377310747
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -1087,6 +1087,44 @@ message SideInput {
   FunctionSpec window_mapping_fn = 3;
 }
 
+message StandardArtifacts {
+  enum Types {
+// A URN for artifacts stored in a local directory.
+// payload: ArtifactFilePayload.
+FILE = 0 [(beam_urn) = "beam:artifact:file:v1"];
+// A URN for artifacts embedded in ArtifactInformation proto.
+// payload: raw data bytes.
+EMBEDDED = 1 [(beam_urn) = "beam:artifact:embedded:v1"];
+// A URN for artifacts described by HTTP links.
+// payload: a string for an artifact HTTP URL
+HTTP = 2 [(beam_urn) = "beam:artifact:http:v1"];
+// A URN for artifacts hosted on PYPI.
+// artifact_id: a PYPI project name
+// version_range: a PYPI compatible version string
+// payload: None
+PYPI = 3 [(beam_urn) = "beam:artifact:pypi:v1"];
+// A URN for artifacts hosted on Maven central.
+// artifact_id: [maven group id]:[maven artifact id]
+// version_range: a Maven compatible version string
+// payload: None
+MAVEN= 4 [(beam_urn) = "beam:artifact:maven:v1"];
+  }
+}
+
+message ArtifactFilePayload {
+  // A path to an artifact file on a local system.
+  string local_path = 1;
 
 Review comment:
   It's assumed to be used when SDK is saving remote artifacts and preparing to 
submit them to the staging service from locally downloaded files. `local_path` 
here is not ubiquitous information. After submitting the artifacts, 
retrieval_token (and staged_name if necessary) should be used instead of 
`local_path`. Is there any specific use-cases you're considering?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384726)
Time Spent: 1.5h  (was: 1h 20m)

> Adding dependency information to Environment proto
> --
>
> Key: BEAM-9229
> URL: https://issues.apache.org/jira/browse/BEAM-9229
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Adding dependency information to Environment proto.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9231) Annotate as Experimental/Internal missing classes in beam-sdks-java-core

2020-02-10 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9231:
---
Labels:   (was: portability)

> Annotate as Experimental/Internal missing classes in beam-sdks-java-core
> 
>
> Key: BEAM-9231
> URL: https://issues.apache.org/jira/browse/BEAM-9231
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> For some extra context I was studying the evolution of our APIs between 
> versions in particular for beam-sdks-java-core and noticed that some parts 
> were not well classified as Experimental in particular classes (and 
> transforms)  for portability and SplittableDoFn.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9231) Annotate as Experimental/Internal missing classes in beam-sdks-java-core

2020-02-10 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía resolved BEAM-9231.

Fix Version/s: 2.20.0
   Resolution: Fixed

> Annotate as Experimental/Internal missing classes in beam-sdks-java-core
> 
>
> Key: BEAM-9231
> URL: https://issues.apache.org/jira/browse/BEAM-9231
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
> Fix For: 2.20.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> For some extra context I was studying the evolution of our APIs between 
> versions in particular for beam-sdks-java-core and noticed that some parts 
> were not well classified as Experimental in particular classes (and 
> transforms)  for portability and SplittableDoFn.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9231) Annotate as Experimental/Internal missing classes in beam-sdks-java-core

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9231?focusedWorklogId=384722=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384722
 ]

ASF GitHub Bot logged work on BEAM-9231:


Author: ASF GitHub Bot
Created on: 10/Feb/20 20:47
Start Date: 10/Feb/20 20:47
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10739: [BEAM-9231] Annotate 
as Experimental/Internal missing classes in beam-sdks-java-core
URL: https://github.com/apache/beam/pull/10739#issuecomment-584346213
 
 
   Thanks for the review Luke! I merged the PR manually to remove @Internal 
from `JvmInitializer`. One final question. What is the reason this class 
belongs in core instead of in harness?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384722)
Time Spent: 7h 10m  (was: 7h)

> Annotate as Experimental/Internal missing classes in beam-sdks-java-core
> 
>
> Key: BEAM-9231
> URL: https://issues.apache.org/jira/browse/BEAM-9231
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Labels: portability
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> For some extra context I was studying the evolution of our APIs between 
> versions in particular for beam-sdks-java-core and noticed that some parts 
> were not well classified as Experimental in particular classes (and 
> transforms)  for portability and SplittableDoFn.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9231) Annotate as Experimental/Internal missing classes in beam-sdks-java-core

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9231?focusedWorklogId=384723=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384723
 ]

ASF GitHub Bot logged work on BEAM-9231:


Author: ASF GitHub Bot
Created on: 10/Feb/20 20:47
Start Date: 10/Feb/20 20:47
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #10739: [BEAM-9231] 
Annotate as Experimental/Internal missing classes in beam-sdks-java-core
URL: https://github.com/apache/beam/pull/10739
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384723)
Time Spent: 7h 20m  (was: 7h 10m)

> Annotate as Experimental/Internal missing classes in beam-sdks-java-core
> 
>
> Key: BEAM-9231
> URL: https://issues.apache.org/jira/browse/BEAM-9231
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Labels: portability
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> For some extra context I was studying the evolution of our APIs between 
> versions in particular for beam-sdks-java-core and noticed that some parts 
> were not well classified as Experimental in particular classes (and 
> transforms)  for portability and SplittableDoFn.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9270) Unable to use ElasticsearchIO for indexpatterns: index-*

2020-02-10 Thread Ludovic Boutros (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033930#comment-17033930
 ] 

Ludovic Boutros commented on BEAM-9270:
---

Hi [~krisnaru], as [~echauchot] already pointed it, there is an ongoing 
complete refactoring of the module which should be able to handle this type of 
usage in a better way.

That said, I did not have any time to test it but would an alias on the indices 
work ? 

> Unable to use ElasticsearchIO for indexpatterns: index-*
> 
>
> Key: BEAM-9270
> URL: https://issues.apache.org/jira/browse/BEAM-9270
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-elasticsearch
>Affects Versions: 2.19.0
>Reporter: Krishnaiah Narukulla
>Priority: Major
>
> ElasticsearchIO input doesnot work with index patterns with wildcard. for 
> example: index-*. I It works well with single index.  It becomes problem if 
> we need to query multiple indices using index pattern.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9229) Adding dependency information to Environment proto

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9229?focusedWorklogId=384716=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384716
 ]

ASF GitHub Bot logged work on BEAM-9229:


Author: ASF GitHub Bot
Created on: 10/Feb/20 20:39
Start Date: 10/Feb/20 20:39
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #10733: [BEAM-9229] 
Adding dependency information to Environment proto
URL: https://github.com/apache/beam/pull/10733#discussion_r377303210
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -1087,6 +1087,44 @@ message SideInput {
   FunctionSpec window_mapping_fn = 3;
 }
 
+message StandardArtifacts {
+  enum Types {
+// A URN for artifacts stored in a local directory.
+// payload: ArtifactFilePayload.
+FILE = 0 [(beam_urn) = "beam:artifact:file:v1"];
+// A URN for artifacts embedded in ArtifactInformation proto.
+// payload: raw data bytes.
+EMBEDDED = 1 [(beam_urn) = "beam:artifact:embedded:v1"];
+// A URN for artifacts described by HTTP links.
+// payload: a string for an artifact HTTP URL
+HTTP = 2 [(beam_urn) = "beam:artifact:http:v1"];
+// A URN for artifacts hosted on PYPI.
+// artifact_id: a PYPI project name
+// version_range: a PYPI compatible version string
+// payload: None
+PYPI = 3 [(beam_urn) = "beam:artifact:pypi:v1"];
+// A URN for artifacts hosted on Maven central.
+// artifact_id: [maven group id]:[maven artifact id]
+// version_range: a Maven compatible version string
+// payload: None
+MAVEN= 4 [(beam_urn) = "beam:artifact:maven:v1"];
+  }
+}
+
+message ArtifactFilePayload {
+  // A path to an artifact file on a local system.
+  string local_path = 1;
+  // A generated staged name (no path).
+  string staged_name = 2;
 
 Review comment:
   stage_name is specific to files ready to be staged (local files and embedded 
files).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384716)
Time Spent: 1h 20m  (was: 1h 10m)

> Adding dependency information to Environment proto
> --
>
> Key: BEAM-9229
> URL: https://issues.apache.org/jira/browse/BEAM-9229
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Adding dependency information to Environment proto.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9229) Adding dependency information to Environment proto

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9229?focusedWorklogId=384715=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384715
 ]

ASF GitHub Bot logged work on BEAM-9229:


Author: ASF GitHub Bot
Created on: 10/Feb/20 20:38
Start Date: 10/Feb/20 20:38
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #10733: [BEAM-9229] 
Adding dependency information to Environment proto
URL: https://github.com/apache/beam/pull/10733#discussion_r377302712
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -1087,6 +1087,44 @@ message SideInput {
   FunctionSpec window_mapping_fn = 3;
 }
 
+message StandardArtifacts {
+  enum Types {
+// A URN for artifacts stored in a local directory.
+// payload: ArtifactFilePayload.
+FILE = 0 [(beam_urn) = "beam:artifact:file:v1"];
+// A URN for artifacts embedded in ArtifactInformation proto.
+// payload: raw data bytes.
+EMBEDDED = 1 [(beam_urn) = "beam:artifact:embedded:v1"];
+// A URN for artifacts described by HTTP links.
+// payload: a string for an artifact HTTP URL
+HTTP = 2 [(beam_urn) = "beam:artifact:http:v1"];
 
 Review comment:
   Make sense. Renamed to 'remote'.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384715)
Time Spent: 1h 10m  (was: 1h)

> Adding dependency information to Environment proto
> --
>
> Key: BEAM-9229
> URL: https://issues.apache.org/jira/browse/BEAM-9229
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Adding dependency information to Environment proto.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9229) Adding dependency information to Environment proto

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9229?focusedWorklogId=384714=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384714
 ]

ASF GitHub Bot logged work on BEAM-9229:


Author: ASF GitHub Bot
Created on: 10/Feb/20 20:37
Start Date: 10/Feb/20 20:37
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #10733: [BEAM-9229] 
Adding dependency information to Environment proto
URL: https://github.com/apache/beam/pull/10733#discussion_r377302377
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -1087,6 +1087,44 @@ message SideInput {
   FunctionSpec window_mapping_fn = 3;
 }
 
+message StandardArtifacts {
+  enum Types {
+// A URN for artifacts stored in a local directory.
+// payload: ArtifactFilePayload.
+FILE = 0 [(beam_urn) = "beam:artifact:file:v1"];
 
 Review comment:
   Thanks!
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384714)
Time Spent: 1h  (was: 50m)

> Adding dependency information to Environment proto
> --
>
> Key: BEAM-9229
> URL: https://issues.apache.org/jira/browse/BEAM-9229
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Adding dependency information to Environment proto.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8537) Provide WatermarkEstimatorProvider for different types of WatermarkEstimator

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8537?focusedWorklogId=384713=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384713
 ]

ASF GitHub Bot logged work on BEAM-8537:


Author: ASF GitHub Bot
Created on: 10/Feb/20 20:37
Start Date: 10/Feb/20 20:37
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #10802: [BEAM-8537] 
Move wrappers of RestrictionTracker out of iobase
URL: https://github.com/apache/beam/pull/10802#discussion_r377302328
 
 

 ##
 File path: sdks/python/apache_beam/runners/sdf_utils.py
 ##
 @@ -0,0 +1,173 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# pytype: skip-file
+
+"""Common utility class to help SDK harness to execute an SDF. """
+
+from __future__ import absolute_import
+from __future__ import division
+
+import logging
+import threading
+from builtins import object
+from collections import namedtuple
+from typing import TYPE_CHECKING
+from typing import Any
+from typing import Optional
+from typing import Tuple
+
+from apache_beam.utils import timestamp
+
+if TYPE_CHECKING:
+  from apache_beam.io.iobase import RestrictionTracker
+  from apache_beam.utils.timestamp import Timestamp
+
+_LOGGER = logging.getLogger(__name__)
+
+
+SplitResultPrimary = namedtuple(
+'SplitResultPrimary', 'windowed_value')
+
+SplitResultResidual = namedtuple(
+'SplitResultResidual',
+'windowed_value current_watermark deferred_timestamp')
 
 Review comment:
   Changed in 
https://github.com/apache/beam/pull/10802/commits/8aa9821439fbb941c83c61e34b52aedc1404dacc
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384713)
Time Spent: 12h 20m  (was: 12h 10m)

> Provide WatermarkEstimatorProvider for different types of WatermarkEstimator
> 
>
> Key: BEAM-8537
> URL: https://issues.apache.org/jira/browse/BEAM-8537
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> This is a follow up for in-progress PR:  
> https://github.com/apache/beam/pull/9794.
> Current implementation in PR9794 provides a default implementation of 
> WatermarkEstimator. For further work, we want to let WatermarkEstimator to be 
> a pure Interface. We'll provide a WatermarkEstimatorProvider to be able to 
> create a custom WatermarkEstimator per windowed value. It should be similar 
> to how we track restriction for SDF: 
> WatermarkEstimator <---> RestrictionTracker 
> WatermarkEstimatorProvider <---> RestrictionTrackerProvider
> WatermarkEstimatorParam <---> RestrictionDoFnParam



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9240) Check for Nullability in typesEqual() method of FieldType class

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9240?focusedWorklogId=384704=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384704
 ]

ASF GitHub Bot logged work on BEAM-9240:


Author: ASF GitHub Bot
Created on: 10/Feb/20 20:25
Start Date: 10/Feb/20 20:25
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #10744: [BEAM-9240]: Check 
for Nullability in typesEqual() method of FieldTyp…
URL: https://github.com/apache/beam/pull/10744#issuecomment-584336724
 
 
   LGTM to treat nullability as a factor of equivalence of fields.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384704)
Time Spent: 1h 20m  (was: 1h 10m)

> Check for Nullability in typesEqual() method of FieldType class
> ---
>
> Key: BEAM-9240
> URL: https://issues.apache.org/jira/browse/BEAM-9240
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Affects Versions: 2.18.0
>Reporter: Rahul Patwari
>Assignee: Rahul Patwari
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> {{If two schemas are created like this:}}
> {{Schema schema1 = Schema.builder().addStringField("col1").build();}}
>  {{Schema schema2 = Schema.builder().addNullableField("col1", 
> FieldType.STRING).build();}}
>  
> {{schema1.typeEquals(schema2) returns "true" even though the schemas differ 
> by Nullability}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9286) Create validation tests for metrics based on MonitoringInfo if applicable

2020-02-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9286?focusedWorklogId=384701=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-384701
 ]

ASF GitHub Bot logged work on BEAM-9286:


Author: ASF GitHub Bot
Created on: 10/Feb/20 20:20
Start Date: 10/Feb/20 20:20
Worklog Time Spent: 10m 
  Work Description: HuangLED commented on issue #10823: [BEAM-9286] Create 
validation runner test for metrics (user counter). 
URL: https://github.com/apache/beam/pull/10823#issuecomment-584334518
 
 
   R: @ardagan, @ajamato 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 384701)
Time Spent: 0.5h  (was: 20m)

> Create validation tests for metrics based on MonitoringInfo if applicable
> -
>
> Key: BEAM-9286
> URL: https://issues.apache.org/jira/browse/BEAM-9286
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Ruoyun Huang
>Assignee: Ruoyun Huang
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Create dedicated validation runner tests for metrics (those based Monitoring 
> Info). 
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


  1   2   3   4   >