[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87429&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87429
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 04/Apr/18 05:20
Start Date: 04/Apr/18 05:20
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #5012: [BEAM-3250] Creating 
a Gradle Jenkins config for Flink PostCommit.
URL: https://github.com/apache/beam/pull/5012#issuecomment-378483382
 
 
   Run Seed Job


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87429)
Time Spent: 0.5h  (was: 20m)

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Alex Amato
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy&q=ValidatesRunner&type=&utf8=%E2%9C%93



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3437) Support schema in PCollections

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3437?focusedWorklogId=87426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87426
 ]

ASF GitHub Bot logged work on BEAM-3437:


Author: ASF GitHub Bot
Created on: 04/Apr/18 04:59
Start Date: 04/Apr/18 04:59
Worklog Time Spent: 10m 
  Work Description: akedin commented on a change in pull request #4964: 
[BEAM-3437] Introduce Schema class, and use it in BeamSQL
URL: https://github.com/apache/beam/pull/4964#discussion_r179025224
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/Column.java
 ##
 @@ -44,7 +46,7 @@ public static Builder builder() {
   @AutoValue.Builder
   public abstract static class Builder {
 public abstract Builder name(String name);
-public abstract Builder coder(Coder coder);
+public abstract Builder typeDescriptor(FieldType fieldType);
 
 Review comment:
   `type(FieldType)`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87426)
Time Spent: 7h  (was: 6h 50m)

> Support schema in PCollections
> --
>
> Key: BEAM-3437
> URL: https://issues.apache.org/jira/browse/BEAM-3437
> Project: Beam
>  Issue Type: Wish
>  Components: beam-model
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> As discussed with some people in the team, it would be great to add schema 
> support in {{PCollections}}. It would allow us:
> 1. To expect a given data type in {{PTransforms}}
> 2. To improve some runners with additional features (I'm thinking of the Spark 
> runner with data frames, for instance).
> A technical draft document has been created: 
> https://docs.google.com/document/d/1tnG2DPHZYbsomvihIpXruUmQ12pHGK0QIvXS1FOTgRc/edit?disco=BhykQIs&ts=5a203b46&usp=comment_email_document
> I also started a PoC on a branch; I will update this Jira with a "discussion" 
> PR.
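To make the idea above concrete, here is a small, purely illustrative sketch (not the Beam API, and not the design from the linked document): a schema attached to a collection lets transforms check field names and types up front instead of treating every element as an opaque object. The `Schema` and `Field` names below are hypothetical.

```python
# Illustrative sketch of schema-aware collections (hypothetical API, not Beam's).
class Field:
    def __init__(self, name, typ):
        self.name = name
        self.typ = typ

class Schema:
    def __init__(self, *fields):
        self.fields = list(fields)

    def validate(self, row):
        """Return True if `row` (a dict) has exactly the schema's fields with the right types."""
        if set(row) != {f.name for f in self.fields}:
            return False
        return all(isinstance(row[f.name], f.typ) for f in self.fields)

user_schema = Schema(Field("name", str), Field("age", int))
print(user_schema.validate({"name": "Ada", "age": 36}))    # True
print(user_schema.validate({"name": "Ada", "age": "36"}))  # False: wrong type
```

With such a schema, a transform (or a runner mapping rows onto Spark data frames) could reject mistyped input before processing rather than failing mid-pipeline.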



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3437) Support schema in PCollections

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3437?focusedWorklogId=87424&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87424
 ]

ASF GitHub Bot logged work on BEAM-3437:


Author: ASF GitHub Bot
Created on: 04/Apr/18 04:59
Start Date: 04/Apr/18 04:59
Worklog Time Spent: 10m 
  Work Description: akedin commented on a change in pull request #4964: 
[BEAM-3437] Introduce Schema class, and use it in BeamSQL
URL: https://github.com/apache/beam/pull/4964#discussion_r179022576
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/coders/BigEndianShortCoder.java
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.coders;
+
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.io.UTFDataFormatException;
+import org.apache.beam.sdk.values.TypeDescriptor;
+
+/**
+ * A {@link BigEndianShortCoder} encodes {@link Short Shorts} in 4 bytes, 
big-endian.
+ */
+public class BigEndianShortCoder extends AtomicCoder<Short> {
+
+  public static BigEndianShortCoder of() {
+return INSTANCE;
+  }
+
+  /////////////////////////////////////////////////////////////////////////////
+
+  private static final BigEndianShortCoder INSTANCE = new BigEndianShortCoder();
+  private static final TypeDescriptor<Short> TYPE_DESCRIPTOR = new TypeDescriptor<Short>() {};
+
+  private BigEndianShortCoder() {}
+
+  @Override
+  public void encode(Short value, OutputStream outStream) throws IOException {
+if (value == null) {
+  throw new CoderException("cannot encode a null Short");
+}
+new DataOutputStream(outStream).writeShort(value);
+  }
+
+  @Override
+  public Short decode(InputStream inStream)
+  throws IOException, CoderException {
+try {
+  return new DataInputStream(inStream).readShort();
+} catch (EOFException | UTFDataFormatException exn) {
+  // These exceptions correspond to decoding problems, so change
+  // what kind of exception they're branded as.
+  throw new CoderException(exn);
+}
+  }
+
+  @Override
+  public void verifyDeterministic() {}
+
+  /**
+   * {@inheritDoc}
+   *
+   * @return {@code true}. This coder is injective.
+   */
+  @Override
+  public boolean consistentWithEquals() {
+return true;
+  }
+
+  /**
+   * {@inheritDoc}
+   *
+   * @return {@code true}, because {@link #getEncodedElementByteSize} runs in 
constant time.
+   */
+  @Override
+  public boolean isRegisterByteSizeObserverCheap(Short value) {
+return true;
+  }
+
+  @Override
+  public TypeDescriptor<Short> getEncodedTypeDescriptor() {
+return TYPE_DESCRIPTOR;
+  }
+
+  /**
+   * {@inheritDoc}
+   *
+   * @return {@code 4}, the size in bytes of an short's big endian encoding.
 
 Review comment:
   `{@code 2}`
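The reviewer's correction can be checked directly: a short is a 16-bit integer, so its big-endian encoding occupies 2 bytes, not 4. A quick check using Python's `struct` module (an illustration, not part of the PR):

```python
import struct

# '>h' is a big-endian signed 16-bit short: exactly 2 bytes on the wire.
encoded = struct.pack(">h", 259)  # 259 == 0x0103
print(len(encoded))                      # 2
print(struct.unpack(">h", encoded)[0])   # 259: round-trips losslessly

# For comparison, a big-endian 32-bit int ('>i') takes 4 bytes,
# which is where the copied "{@code 4}" javadoc likely came from.
print(len(struct.pack(">i", 259)))       # 4
```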


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87424)
Time Spent: 6h 50m  (was: 6h 40m)

> Support schema in PCollections
> --
>
> Key: BEAM-3437
> URL: https://issues.apache.org/jira/browse/BEAM-3437
> Project: Beam
>  Issue Type: Wish
>  Components: beam-model
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> As discussed with some people in the team, it would be great to add schema 
> support in {{PCollections}}. It would allow us:
> 1. To expect a given data type in {{PTransforms}}
> 2. To improve some runners with additional features (I'm thinking of the Spark 
> runner with data frames, for instance).
> A technical draft document has been created: 
> 

[jira] [Work logged] (BEAM-3437) Support schema in PCollections

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3437?focusedWorklogId=87425&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87425
 ]

ASF GitHub Bot logged work on BEAM-3437:


Author: ASF GitHub Bot
Created on: 04/Apr/18 04:59
Start Date: 04/Apr/18 04:59
Worklog Time Spent: 10m 
  Work Description: akedin commented on a change in pull request #4964: 
[BEAM-3437] Introduce Schema class, and use it in BeamSQL
URL: https://github.com/apache/beam/pull/4964#discussion_r179026015
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/impl/parser/BeamSqlParserTest.java
 ##
 @@ -163,13 +164,13 @@ private static Table mockTable(String name, String type, 
String comment, JSONObj
 .columns(ImmutableList.of(
 Column.builder()
 .name("id")
-.coder(INTEGER)
+.typeDescriptor(TypeName.INT32.type())
 .primaryKey(false)
 .comment("id")
 .build(),
 Column.builder()
 .name("name")
-.coder(VARCHAR)
+.typeDescriptor(CalciteUtils.toFieldType(SqlTypeName.VARCHAR))
 
 Review comment:
   I don't think that CalciteUtils should be used here, even if it's temporary. 
I'd rather have our own `SqlType.VARCHAR = 
TypeName.STRING.withMetadata("VARCHAR")`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87425)
Time Spent: 6h 50m  (was: 6h 40m)

> Support schema in PCollections
> --
>
> Key: BEAM-3437
> URL: https://issues.apache.org/jira/browse/BEAM-3437
> Project: Beam
>  Issue Type: Wish
>  Components: beam-model
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> As discussed with some people in the team, it would be great to add schema 
> support in {{PCollections}}. It would allow us:
> 1. To expect a given data type in {{PTransforms}}
> 2. To improve some runners with additional features (I'm thinking of the Spark 
> runner with data frames, for instance).
> A technical draft document has been created: 
> https://docs.google.com/document/d/1tnG2DPHZYbsomvihIpXruUmQ12pHGK0QIvXS1FOTgRc/edit?disco=BhykQIs&ts=5a203b46&usp=comment_email_document
> I also started a PoC on a branch; I will update this Jira with a "discussion" 
> PR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3647) Default Coder/Reading Coder From File

2018-04-03 Thread Kishan Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424999#comment-16424999
 ] 

Kishan Kumar commented on BEAM-3647:


Any updates?

> Default Coder/Reading Coder From File 
> --
>
> Key: BEAM-3647
> URL: https://issues.apache.org/jira/browse/BEAM-3647
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, dsl-sql
>Affects Versions: 2.2.0
>Reporter: Kishan Kumar
>Assignee: Anton Kedin
>Priority: Major
>
> *Requirement*: Run a single template with the same logic on data from different 
> tables (example below).
>  
> *Need*: Either a default coder that treats all fields as strings when reading 
> data, or a dynamic option to read a coder definition from a JSON file on GCS 
> (the location could be passed at runtime via a ValueProvider) and parse the 
> data based on it.
>  
> *Example*: Table 1 has columns (NAME, CLASS, ROLL, SUB_PRICE) and table 2 has 
> columns (NAME, ROLL, SUB, TEST_MARKS).
>  
> Both pipelines simply sort the table by roll number, so if the coder could be 
> read at run time, the same template could be reused for different tables.
>  
> Such support would simplify this kind of work considerably.
>  
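The request above (a runtime-loaded column description with string as the default type) can be sketched roughly as follows. This is a hedged illustration, not a Beam feature: the JSON layout, `load_schema`, and `parse_row` are all hypothetical, and in the actual request the JSON would be fetched from GCS via a ValueProvider rather than passed inline.

```python
import json

# Map declared type names to Python coercions; anything undeclared defaults to str,
# matching the "make all fields string" fallback asked for in the issue.
TYPE_MAP = {"string": str, "int": int, "float": float}

def load_schema(schema_json):
    """Parse a JSON column description into {column_name: coercion}."""
    spec = json.loads(schema_json)
    return {col["name"]: TYPE_MAP.get(col.get("type"), str)
            for col in spec["columns"]}

def parse_row(raw_row, schema):
    """Coerce each raw (string) field according to the runtime schema."""
    return {name: schema[name](value) for name, value in raw_row.items()}

schema = load_schema('{"columns": [{"name": "NAME"},'
                     ' {"name": "ROLL", "type": "int"}]}')
row = parse_row({"NAME": "alice", "ROLL": "7"}, schema)
print(row)  # {'NAME': 'alice', 'ROLL': 7} -- ROLL is an int, so rows sort numerically
```

Because the schema is data rather than code, the same template could be pointed at a different JSON file to handle the second table's (NAME, ROLL, SUB, TEST_MARKS) layout.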



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Dataflow #5274

2018-04-03 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87415&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87415
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 03:46
Start Date: 04/Apr/18 03:46
Worklog Time Spent: 10m 
  Work Description: shoyer commented on issue #4959: [BEAM-3956] Preserve 
stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#issuecomment-378471464
 
 
   OK, so the stacktrace version is definitely prettier (see above), but my new 
tests now fail when using GRPC (e.g., `test_error_traceback_includes_user_code 
(apache_beam.runners.portability.fn_api_runner_test.FnApiRunnerTestWithGrpc)`).
   
   I'm not quite sure why -- the stacktrace can still be found in the error log 
when I run the test: 
   
   
   ```
   $ python setup.py test -s 
apache_beam.runners.portability.fn_api_runner_test.FnApiRunnerTestWithGrpc.test_error_traceback_includes_user_code
   
/usr/local/google/home/shoyer/virtual-envs/beam-dev/local/lib/python2.7/site-packages/setuptools/dist.py:397:
 UserWarning: Normalizing '2.5.0.dev' to '2.5.0.dev0'
 normalized_version,
   running test
   /usr/local/google/home/shoyer/open-source/beam/sdks/python/gen_protos.py:50: 
UserWarning: Installing grpcio-tools is recommended for development.
 warnings.warn('Installing grpcio-tools is recommended for development.')
   Searching for futures<4.0.0,>=3.1.1
   Best match: futures 3.2.0
   Processing futures-3.2.0-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/futures-3.2.0-py2.7.egg
   Searching for typing<3.7.0,>=3.6.0
   Best match: typing 3.6.4
   Processing typing-3.6.4-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/typing-3.6.4-py2.7.egg
   Searching for pyvcf<0.7.0,>=0.6.8
   Best match: PyVCF 0.6.8
   Processing PyVCF-0.6.8-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/PyVCF-0.6.8-py2.7.egg
   Searching for pyyaml<4.0.0,>=3.12
   Best match: PyYAML 3.12
   Processing PyYAML-3.12-py2.7-linux-x86_64.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/PyYAML-3.12-py2.7-linux-x86_64.egg
   Searching for pytz>=2018.3
   Best match: pytz 2018.3
   Processing pytz-2018.3-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/pytz-2018.3-py2.7.egg
   Searching for protobuf<4,>=3.5.0.post1
   Best match: protobuf 3.5.2.post1
   Processing protobuf-3.5.2.post1-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/protobuf-3.5.2.post1-py2.7.egg
   Searching for oauth2client<5,>=2.0.1
   Best match: oauth2client 4.1.2
   Processing oauth2client-4.1.2-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/oauth2client-4.1.2-py2.7.egg
   Searching for mock<3.0.0,>=1.0.1
   Best match: mock 2.0.0
   Processing mock-2.0.0-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/mock-2.0.0-py2.7.egg
   Searching for httplib2<0.10,>=0.8
   Best match: httplib2 0.9.2
   Processing httplib2-0.9.2-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/httplib2-0.9.2-py2.7.egg
   Searching for hdfs<3.0.0,>=2.1.0
   Best match: hdfs 2.1.0
   Processing hdfs-2.1.0-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/hdfs-2.1.0-py2.7.egg
   Searching for grpcio<2,>=1.8
   Best match: grpcio 1.10.1rc1
   Processing grpcio-1.10.1rc1-py2.7-linux-x86_64.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/grpcio-1.10.1rc1-py2.7-linux-x86_64.egg
   Searching for dill==0.2.6
   Best match: dill 0.2.6
   Processing dill-0.2.6-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/dill-0.2.6-py2.7.egg
   Searching for crcmod<2.0,>=1.7
   Best match: crcmod 1.7
   Processing crcmod-1.7-py2.7-linux-x86_64.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/crcmod-1.7-py2.7-linux-x86_64.egg
   Searching for avro<2.0.0,>=1.8.1
   Best match: avro 1.8.2
   Processing avro-1.8.2-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/avro-1.8.2-py2.7.egg
   Searching for rsa>=3.1.4
   Best match: rsa 3.4.2
   Processing rsa-3.4.2-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/rsa-3.4.2-py2.7.egg
   Searching for pyasn1>=0.1.7
   Best match: pyasn1 0.4.2
   Processing pyasn1-0.4.2-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/pyasn1-0.4.2-py2.7.egg
   Searching for pyasn1-modules>=0.0.5
   Best match: pyasn1-modules 0.2.1
   Processing pyasn1_modules-0.2.1-py2.7.egg
   
   Using 

[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87414&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87414
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 03:46
Start Date: 04/Apr/18 03:46
Worklog Time Spent: 10m 
  Work Description: shoyer commented on issue #4959: [BEAM-3956] Preserve 
stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#issuecomment-378471464
 
 
   OK, so the stacktrace version is definitely prettier (see above), but my new 
tests now fail when using GRPC (e.g., `test_error_traceback_includes_user_code 
(apache_beam.runners.portability.fn_api_runner_test.FnApiRunnerTestWithGrpc)`).
   
   I'm not quite sure why -- the stacktrace can still be found in the error log 
when I run the test: 
   
   
   ```
   $ python setup.py test -s 
apache_beam.runners.portability.fn_api_runner_test.FnApiRunnerTestWithGrpc.test_error_traceback_includes_user_code
   
/usr/local/google/home/shoyer/virtual-envs/beam-dev/local/lib/python2.7/site-packages/setuptools/dist.py:397:
 UserWarning: Normalizing '2.5.0.dev' to '2.5.0.dev0'
 normalized_version,
   running test
   /usr/local/google/home/shoyer/open-source/beam/sdks/python/gen_protos.py:50: 
UserWarning: Installing grpcio-tools is recommended for development.
 warnings.warn('Installing grpcio-tools is recommended for development.')
   Searching for futures<4.0.0,>=3.1.1
   Best match: futures 3.2.0
   Processing futures-3.2.0-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/futures-3.2.0-py2.7.egg
   Searching for typing<3.7.0,>=3.6.0
   Best match: typing 3.6.4
   Processing typing-3.6.4-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/typing-3.6.4-py2.7.egg
   Searching for pyvcf<0.7.0,>=0.6.8
   Best match: PyVCF 0.6.8
   Processing PyVCF-0.6.8-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/PyVCF-0.6.8-py2.7.egg
   Searching for pyyaml<4.0.0,>=3.12
   Best match: PyYAML 3.12
   Processing PyYAML-3.12-py2.7-linux-x86_64.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/PyYAML-3.12-py2.7-linux-x86_64.egg
   Searching for pytz>=2018.3
   Best match: pytz 2018.3
   Processing pytz-2018.3-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/pytz-2018.3-py2.7.egg
   Searching for protobuf<4,>=3.5.0.post1
   Best match: protobuf 3.5.2.post1
   Processing protobuf-3.5.2.post1-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/protobuf-3.5.2.post1-py2.7.egg
   Searching for oauth2client<5,>=2.0.1
   Best match: oauth2client 4.1.2
   Processing oauth2client-4.1.2-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/oauth2client-4.1.2-py2.7.egg
   Searching for mock<3.0.0,>=1.0.1
   Best match: mock 2.0.0
   Processing mock-2.0.0-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/mock-2.0.0-py2.7.egg
   Searching for httplib2<0.10,>=0.8
   Best match: httplib2 0.9.2
   Processing httplib2-0.9.2-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/httplib2-0.9.2-py2.7.egg
   Searching for hdfs<3.0.0,>=2.1.0
   Best match: hdfs 2.1.0
   Processing hdfs-2.1.0-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/hdfs-2.1.0-py2.7.egg
   Searching for grpcio<2,>=1.8
   Best match: grpcio 1.10.1rc1
   Processing grpcio-1.10.1rc1-py2.7-linux-x86_64.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/grpcio-1.10.1rc1-py2.7-linux-x86_64.egg
   Searching for dill==0.2.6
   Best match: dill 0.2.6
   Processing dill-0.2.6-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/dill-0.2.6-py2.7.egg
   Searching for crcmod<2.0,>=1.7
   Best match: crcmod 1.7
   Processing crcmod-1.7-py2.7-linux-x86_64.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/crcmod-1.7-py2.7-linux-x86_64.egg
   Searching for avro<2.0.0,>=1.8.1
   Best match: avro 1.8.2
   Processing avro-1.8.2-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/avro-1.8.2-py2.7.egg
   Searching for rsa>=3.1.4
   Best match: rsa 3.4.2
   Processing rsa-3.4.2-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/rsa-3.4.2-py2.7.egg
   Searching for pyasn1>=0.1.7
   Best match: pyasn1 0.4.2
   Processing pyasn1-0.4.2-py2.7.egg
   
   Using 
/usr/local/google/home/shoyer/open-source/beam/sdks/python/.eggs/pyasn1-0.4.2-py2.7.egg
   Searching for pyasn1-modules>=0.0.5
   Best match: pyasn1-modules 0.2.1
   Processing pyasn1_modules-0.2.1-py2.7.egg
   
   Using 

Jenkins build is back to normal : beam_PostCommit_Python_ValidatesRunner_Dataflow #1248

2018-04-03 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87408&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87408
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 02:45
Start Date: 04/Apr/18 02:45
Worklog Time Spent: 10m 
  Work Description: shoyer commented on a change in pull request #4959: 
[BEAM-3956] Preserve stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#discussion_r179014087
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner_test.py
 ##
 @@ -265,7 +265,7 @@ def test_windowing(self):
  | beam.Map(lambda k_vs1: (k_vs1[0], sorted(k_vs1[1]
   assert_that(res, equal_to([('k', [1, 2]), ('k', [100, 101, 102])]))
 
-  def test_errors(self):
+  def test_errors_stage(self):
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87408)
Time Spent: 3.5h  (was: 3h 20m)

> Stacktraces from exceptions in user code should be preserved in the Python SDK
> --
>
> Key: BEAM-3956
> URL: https://issues.apache.org/jira/browse/BEAM-3956
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Stephan Hoyer
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Currently, Beam's Python SDK loses stacktraces for exceptions. It does 
> helpfully add a tag like "[while running StageA]" to exception error 
> messages, but that doesn't include the stacktrace of Python functions being 
> called.
> Including the full stacktraces would make a big difference for the ease of 
> debugging Beam pipelines when things go wrong.
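The failure mode described above, and one general way to avoid it, can be sketched outside of Beam. This is a minimal illustration of the technique (capture the formatted user traceback before re-raising a tagged error), not the implementation in the PR; `run_stage` and `user_fn` are made-up names.

```python
import traceback

def user_fn(x):
    raise ValueError('internal failure!')

def run_stage(fn, x, stage_name):
    """Wrap a user function the way a runner might, tagging errors with the stage name."""
    try:
        return fn(x)
    except Exception as exn:
        # Capturing format_exc() here preserves the user-code frames that would
        # otherwise be lost when a fresh exception is raised.
        tb = traceback.format_exc()
        raise RuntimeError('%s [while running %s]\n%s' % (exn, stage_name, tb))

try:
    run_stage(user_fn, 1, 'StageA')
except RuntimeError as exn:
    message = str(exn)

print('[while running StageA]' in message)  # True: the stage tag survives
print('user_fn' in message)                 # True: so does the user frame
```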



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87407&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87407
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 02:44
Start Date: 04/Apr/18 02:44
Worklog Time Spent: 10m 
  Work Description: shoyer commented on a change in pull request #4959: 
[BEAM-3956] Preserve stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#discussion_r179014079
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/portability/universal_local_runner_test.py
 ##
 @@ -97,6 +97,22 @@ def raise_error(x):
  | 'StageC' >> beam.Map(raise_error)
  | 'StageD' >> beam.Map(lambda x: x))
 
+  def test_errors_traceback(self):
+# TODO: figure out a way for runner to parse and raise the
 
 Review comment:
   No, it doesn't. I replaced this with raising `unittest.SkipTest`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87407)
Time Spent: 3h 20m  (was: 3h 10m)

> Stacktraces from exceptions in user code should be preserved in the Python SDK
> --
>
> Key: BEAM-3956
> URL: https://issues.apache.org/jira/browse/BEAM-3956
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Stephan Hoyer
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Currently, Beam's Python SDK loses stacktraces for exceptions. It does 
> helpfully add a tag like "[while running StageA]" to exception error 
> messages, but that doesn't include the stacktrace of Python functions being 
> called.
> Including the full stacktraces would make a big difference for the ease of 
> debugging Beam pipelines when things go wrong.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3761) Fix Python 3 cmp function

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3761?focusedWorklogId=87406&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87406
 ]

ASF GitHub Bot logged work on BEAM-3761:


Author: ASF GitHub Bot
Created on: 04/Apr/18 02:43
Start Date: 04/Apr/18 02:43
Worklog Time Spent: 10m 
  Work Description: cgarciae commented on issue #4774: [BEAM-3761]Fix 
Python 3 cmp usage
URL: https://github.com/apache/beam/pull/4774#issuecomment-378462493
 
 
   Also getting this error :(
   I am reverting to python 2 for this but hope it gets fixed soon.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87406)
Time Spent: 7h 20m  (was: 7h 10m)

> Fix Python 3 cmp function
> -
>
> Key: BEAM-3761
> URL: https://issues.apache.org/jira/browse/BEAM-3761
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: holdenk
>Priority: Major
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Various functions that existed in Python 2 don't exist in Python 3. This Jira is 
> to fix the use of cmp (which often will involve rewriting __cmp__ as well).
>  
> Note: there are existing PRs for basestring and unicode 
> ([https://github.com/apache/beam/pull/4697] and 
> [https://github.com/apache/beam/pull/4730]).
>  
> Note: once all of the missing names/functions are fixed, we can enable F821 in 
> flake8 for Python 3.
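The two migrations this issue describes have standard Python 3 idioms. A small sketch (generic examples, not Beam code): callers of `sorted(..., cmp=...)` wrap their comparator with `functools.cmp_to_key`, and classes defining `__cmp__` switch to rich comparisons, which `functools.total_ordering` can mostly fill in.

```python
from functools import cmp_to_key, total_ordering

# 1. cmp(a, b) is gone in Python 3; this expression is the usual replacement,
#    and cmp_to_key adapts any such comparator for key-based sorting.
def compare(a, b):
    return (a > b) - (a < b)

print(sorted([3, 1, 2], key=cmp_to_key(compare)))  # [1, 2, 3]

# 2. __cmp__ is ignored in Python 3; define __eq__ and __lt__ and let
#    @total_ordering derive the remaining rich comparisons.
@total_ordering
class Version(object):
    def __init__(self, n):
        self.n = n
    def __eq__(self, other):
        return self.n == other.n
    def __lt__(self, other):
        return self.n < other.n

print(Version(1) < Version(2))        # True
print(max(Version(3), Version(2)).n)  # 3
```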



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_Spark #1547

2018-04-03 Thread Apache Jenkins Server
See 


Changes:

[ccy] Use utcnow() in determining credential refresh time

[chamikara] Revert "[BEAM-2264] Credentials were not being reused between GCS 
calls"

[amyrvold] [BEAM-3989] Delete unused pipeline jobs

[sidhom] [BEAM-3249] Add missing gradle artifact ids

[wcn] Add godoc for exported methods.

--
[...truncated 89.95 KB...]
'apache-beam-testing:bqjob_r50f9b11bfb7cff6_01628e73bd19_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.Waiting on bqjob_r50f9b11bfb7cff6_01628e73bd19_1 ... (0s) 
Current status: RUNNING 
Waiting on 
bqjob_r50f9b11bfb7cff6_01628e73bd19_1 ... (0s) Current status: DONE   
2018-04-04 02:19:34,335 070335a1 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-04 02:19:57,298 070335a1 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-04 02:19:59,448 070335a1 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r5e25412f356e1fe1_01628e741f42_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.Waiting on bqjob_r5e25412f356e1fe1_01628e741f42_1 ... (0s) 
Current status: RUNNING 
 Waiting on 
bqjob_r5e25412f356e1fe1_01628e741f42_1 ... (0s) Current status: DONE   
2018-04-04 02:19:59,449 070335a1 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-04 02:20:18,809 070335a1 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-04 02:20:20,947 070335a1 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r5a8a85589577f295_01628e74739f_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete. Waiting on bqjob_r5a8a85589577f295_01628e74739f_1 ... (0s) Current status: DONE
2018-04-04 02:20:20,947 070335a1 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-04 02:20:49,479 070335a1 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-04 02:20:51,851 070335a1 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 
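The schema error above comes from `--autodetect` re-inferring column types on every load: an epoch timestamp serialized as a JSON number is just a float, so autodetect proposes FLOAT for a column the table already declares as TIMESTAMP. A short sketch (the rows below are made up; only the table name `beam_performance.pkb_results` and the field `timestamp` come from the log):

```python
# Why `bq load --autodetect` flags the TIMESTAMP column as FLOAT:
# autodetect sees only the JSON values, and an epoch-seconds timestamp
# serialized as a number carries no TIMESTAMP hint.
import json

# One invented row of NEWLINE_DELIMITED_JSON input.
row = {"timestamp": 1522808374.335, "metric": "write_time", "value": 103.2}
line = json.dumps(row)

# Round-tripping shows what autodetect has to work with: a plain float.
decoded = json.loads(line)
assert isinstance(decoded["timestamp"], float)

# One fix is to drop --autodetect and pass the table's existing schema
# explicitly, e.g. (hypothetical invocation, file name invented):
#   bq load --source_format=NEWLINE_DELIMITED_JSON \
#       beam_performance.pkb_results results.json timestamp:TIMESTAMP,...
```

With an explicit schema the load either conforms to the declared types or fails fast, instead of attempting an invalid schema update on every retry as in the log above.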

[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87405&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87405
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 02:20
Start Date: 04/Apr/18 02:20
Worklog Time Spent: 10m 
  Work Description: shoyer commented on issue #4959: [BEAM-3956] Preserve 
stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#issuecomment-378453120
 
 
   > Thank you for this change! Do you have an example exception message with 
these changes that includes the stack trace?
   
   Sure, let me give a full example. Suppose we run the following code:
   ```
   import apache_beam as beam
   
   def f(x):
 return g(x)
   def g(x):
 return h(x)
   def h(x):
 raise ValueError('internal failure!')
 
   [1, 2, 3] | beam.Map(f)
   ```
   
   With current beam, I get a long traceback, but it only references internal 
details of beam, not anything about my code (it includes none of my internal 
function calls to `f()`, `g()` or `h()`):
   
   
   ```python-traceback
   ---
    ValueError                                Traceback (most recent call last)
in ()
 8   raise ValueError('internal failure!')
 9 
   ---> 10 [1, 2, 3] | beam.Map(f)
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/transforms/ptransform.pyc
 in __ror__(self, left, label)
   491 _allocate_materialized_pipeline(p)
   492 materialized_result = 
_AddMaterializationTransforms().visit(result)
   --> 493 p.run().wait_until_finish()
   494 _release_materialized_pipeline(p)
   495 return _FinalizeMaterialization().visit(materialized_result)
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/pipeline.pyc
 in run(self, test_runner_api)
   387 if test_runner_api and self._verify_runner_api_compatible():
   388   return Pipeline.from_runner_api(
   --> 389   self.to_runner_api(), self.runner, 
self._options).run(False)
   390 
   391 if self._options.view_as(TypeOptions).runtime_type_check:
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/pipeline.pyc
 in run(self, test_runner_api)
   400   finally:
   401 shutil.rmtree(tmpdir)
   --> 402 return self.runner.run_pipeline(self)
   403 
   404   def __enter__(self):
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/direct/direct_runner.pyc
 in run_pipeline(self, pipeline)
   133   runner = BundleBasedDirectRunner()
   134 
   --> 135 return runner.run_pipeline(pipeline)
   136 
   137 
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in run_pipeline(self, pipeline)
   213 from apache_beam.runners.dataflow.dataflow_runner import 
DataflowRunner
   214 pipeline.visit(DataflowRunner.group_by_key_input_visitor())
   --> 215 return self.run_via_runner_api(pipeline.to_runner_api())
   216 
   217   def run_via_runner_api(self, pipeline_proto):
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in run_via_runner_api(self, pipeline_proto)
   216 
   217   def run_via_runner_api(self, pipeline_proto):
   --> 218 return self.run_stages(*self.create_stages(pipeline_proto))
   219 
   220   def create_stages(self, pipeline_proto):
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in run_stages(self, pipeline_components, stages, safe_coders)
   835 metrics_by_stage[stage.name] = self.run_stage(
   836 controller, pipeline_components, stage,
   --> 837 pcoll_buffers, safe_coders).process_bundle.metrics
   838 finally:
   839   controller.close()
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in run_stage(self, controller, pipeline_components, stage, pcoll_buffers, 
safe_coders)
   936 return BundleManager(
   937 controller, get_buffer, process_bundle_descriptor,
   --> 938 self._progress_frequency).process_bundle(data_input, 
data_output)
   939 
   940   # These classes are used to interact with the worker.
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in process_bundle(self, inputs, expected_outputs)
  1108 process_bundle=beam_fn_api_pb2.ProcessBundleRequest(
  1109 
process_bundle_descriptor_reference=self._bundle_descriptor.id))
   -> 1110   

Jenkins build is back to normal : beam_PerformanceTests_TFRecordIOIT #330

2018-04-03 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PerformanceTests_TextIOIT_HDFS #12

2018-04-03 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_MongoDBIO_IT #7

2018-04-03 Thread Apache Jenkins Server
See 


Changes:

[ccy] Use utcnow() in determining credential refresh time

[chamikara] Revert "[BEAM-2264] Credentials were not being reused between GCS 
calls"

[amyrvold] [BEAM-3989] Delete unused pipeline jobs

[sidhom] [BEAM-3249] Add missing gradle artifact ids

[wcn] Add godoc for exported methods.

--
[...truncated 715.74 KB...]
[INFO] Excluding com.google.api:api-common:jar:1.0.0-rc2 from the shaded jar.
[INFO] Excluding com.google.api:gax:jar:1.3.1 from the shaded jar.
[INFO] Excluding org.threeten:threetenbp:jar:1.3.3 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-core-grpc:jar:1.2.0 from the 
shaded jar.
[INFO] Excluding com.google.protobuf:protobuf-java-util:jar:3.2.0 from the 
shaded jar.
[INFO] Excluding com.google.code.gson:gson:jar:2.7 from the shaded jar.
[INFO] Excluding com.google.apis:google-api-services-pubsub:jar:v1-rev10-1.22.0 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-iam-v1:jar:0.1.18 from the 
shaded jar.
[INFO] Excluding com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-protobuf:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-jackson:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-common-protos:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding io.grpc:grpc-auth:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-netty:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http2:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler-proxy:jar:4.1.8.Final from the shaded 
jar.
[INFO] Excluding io.netty:netty-codec-socks:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-buffer:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-common:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-transport:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-resolver:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-codec:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.grpc:grpc-stub:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-all:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-okhttp:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.squareup.okhttp:okhttp:jar:2.5.0 from the shaded jar.
[INFO] Excluding com.squareup.okio:okio:jar:1.6.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-lite:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-nano:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.protobuf.nano:protobuf-javanano:jar:3.0.0-alpha-5 
from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-core:jar:1.0.2 from the shaded 
jar.
[INFO] Excluding org.json:json:jar:20160810 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-spanner:jar:0.20.0b-beta from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-cloud-spanner-v1:jar:0.1.11b 
from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:proto-google-cloud-spanner-admin-instance-v1:jar:0.1.11 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-cloud-spanner-v1:jar:0.1.11b 
from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:grpc-google-cloud-spanner-admin-database-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding 
com.google.api.grpc:grpc-google-cloud-spanner-admin-instance-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-longrunning-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-longrunning-v1:jar:0.1.11 
from the shaded jar.
[INFO] Excluding com.google.cloud.bigtable:bigtable-protos:jar:1.0.0-pre3 from 
the shaded jar.
[INFO] Excluding com.google.cloud.bigtable:bigtable-client-core:jar:1.0.0 from 
the shaded jar.
[INFO] Excluding commons-logging:commons-logging:jar:1.2 from the shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-appengine:jar:0.7.0 from 
the shaded jar.
[INFO] Excluding io.opencensus:opencensus-contrib-grpc-util:jar:0.7.0 from the 
shaded jar.
[INFO] Excluding io.opencensus:opencensus-api:jar:0.7.0 from the shaded jar.
[INFO] Excluding io.dropwizard.metrics:metrics-core:jar:3.1.2 from the shaded 
jar.
[INFO] Excluding 

Build failed in Jenkins: beam_PerformanceTests_Python #1103

2018-04-03 Thread Apache Jenkins Server
See 


Changes:

[ccy] Use utcnow() in determining credential refresh time

[chamikara] Revert "[BEAM-2264] Credentials were not being reused between GCS 
calls"

[amyrvold] [BEAM-3989] Delete unused pipeline jobs

[sidhom] [BEAM-3249] Add missing gradle artifact ids

[wcn] Add godoc for exported methods.

--
[...truncated 62.69 KB...]
[INFO] 
[INFO] --- maven-resources-plugin:3.0.2:copy-resources (copy-go-cmd-source) @ 
beam-sdks-go ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 6 resources
[INFO] 
[INFO] --- maven-assembly-plugin:3.1.0:single (export-go-pkg-sources) @ 
beam-sdks-go ---
[INFO] Reading assembly descriptor: descriptor.xml
[INFO] Building zip: 

[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) 
@ beam-sdks-go ---
[INFO] 
[INFO] --- mvn-golang-wrapper:2.1.6:get (go-get-imports) @ beam-sdks-go ---
[INFO] Prepared command line : bin/go get google.golang.org/grpc 
golang.org/x/oauth2/google google.golang.org/api/storage/v1 
github.com/spf13/cobra cloud.google.com/go/bigquery 
google.golang.org/api/googleapi google.golang.org/api/dataflow/v1b3
[INFO] 
[INFO] --- mvn-golang-wrapper:2.1.6:build (go-build) @ beam-sdks-go ---
[INFO] Prepared command line : bin/go build -buildmode=default -o 

 github.com/apache/beam/sdks/go/cmd/beamctl
[INFO] The Result file has been successfully created : 

[INFO] 
[INFO] --- mvn-golang-wrapper:2.1.6:build (go-build-linux-amd64) @ beam-sdks-go 
---
[INFO] Prepared command line : bin/go build -buildmode=default -o 

 github.com/apache/beam/sdks/go/cmd/beamctl
[INFO] The Result file has been successfully created : 

[INFO] 
[INFO] --- maven-checkstyle-plugin:3.0.0:check (default) @ beam-sdks-go ---
[INFO] 
[INFO] --- mvn-golang-wrapper:2.1.6:test (go-test) @ beam-sdks-go ---
[INFO] Prepared command line : bin/go test ./...
[INFO] 
[INFO] -Exec.Out-
[INFO] ?github.com/apache/beam/sdks/go/cmd/beamctl  [no test files]
[INFO] ?github.com/apache/beam/sdks/go/cmd/beamctl/cmd  [no test files]
[INFO] ?github.com/apache/beam/sdks/go/cmd/specialize   [no test files]
[INFO] ?github.com/apache/beam/sdks/go/cmd/symtab   [no test files]
[INFO] ok   github.com/apache/beam/sdks/go/pkg/beam 0.038s
[INFO] ok   github.com/apache/beam/sdks/go/pkg/beam/artifact0.158s
[INFO] 
[ERROR] 
[ERROR] -Exec.Err-
[ERROR] # github.com/apache/beam/sdks/go/pkg/beam/util/gcsx
[ERROR] github.com/apache/beam/sdks/go/pkg/beam/util/gcsx/gcs.go:46:37: 
undefined: option.WithoutAuthentication
[ERROR] 
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Beam :: Parent .. SUCCESS [ 21.690 s]
[INFO] Apache Beam :: SDKs :: Java :: Build Tools . SUCCESS [ 11.982 s]
[INFO] Apache Beam :: Model ... SUCCESS [  0.259 s]
[INFO] Apache Beam :: Model :: Pipeline ... SUCCESS [ 32.526 s]
[INFO] Apache Beam :: Model :: Job Management . SUCCESS [ 16.206 s]
[INFO] Apache Beam :: Model :: Fn Execution ... SUCCESS [ 11.754 s]
[INFO] Apache Beam :: SDKs  SUCCESS [  0.581 s]
[INFO] Apache Beam :: SDKs :: Go .. FAILURE [ 42.077 s]
[INFO] Apache Beam :: SDKs :: Go :: Container . SKIPPED
[INFO] Apache Beam :: SDKs :: Java  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Core  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Fn Execution  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Extensions .. SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Google Cloud Platform Core 
SKIPPED
[INFO] Apache Beam :: Runners . SKIPPED
[INFO] Apache Beam :: Runners :: Core Construction Java ... SKIPPED
[INFO] Apache Beam :: Runners :: Core Java  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Harness . SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Container ... SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: IO .. SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: IO :: Amazon Web Services SKIPPED
[INFO] Apache Beam :: 

[jira] [Created] (BEAM-3996) Invalid test util ResourceIdTester#ValidateFailureResolvingIds

2018-04-03 Thread Ankur Goenka (JIRA)
Ankur Goenka created BEAM-3996:
--

 Summary: Invalid test util 
ResourceIdTester#ValidateFailureResolvingIds
 Key: BEAM-3996
 URL: https://issues.apache.org/jira/browse/BEAM-3996
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Ankur Goenka
Assignee: Kenneth Knowles


The test util described here can never fail: the call to fail is wrapped in a 
try-catch block that catches Throwable, so the assertion failure is swallowed. 

https://github.com/apache/beam/blob/a1ef0aac298e10a04a8ee5afea4765374a9c7508/sdks/java/core/src/main/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java#L107



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3996) Invalid test util ResourceIdTester#ValidateFailureResolvingIds

2018-04-03 Thread Ankur Goenka (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka reassigned BEAM-3996:
--

Assignee: Thomas Groh  (was: Kenneth Knowles)

> Invalid test util ResourceIdTester#ValidateFailureResolvingIds
> --
>
> Key: BEAM-3996
> URL: https://issues.apache.org/jira/browse/BEAM-3996
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ankur Goenka
>Assignee: Thomas Groh
>Priority: Major
>
> The test util described here can never fail: the call to fail is wrapped in a 
> try-catch block that catches Throwable, so the assertion failure is swallowed. 
> https://github.com/apache/beam/blob/a1ef0aac298e10a04a8ee5afea4765374a9c7508/sdks/java/core/src/main/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java#L107



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
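The bug pattern behind BEAM-3996 is easy to reproduce in any test framework whose `fail` signals by raising. A minimal Python translation (hypothetical names; the real helper is the Java `ResourceIdTester` linked above, where `fail` raises an AssertionError-style throwable that the surrounding catch swallows):

```python
# Sketch of the BEAM-3996 anti-pattern: fail() works by raising, so a
# catch-everything block around it swallows the failure and the check
# can never actually fail.
def fail(msg):
    raise AssertionError(msg)

def broken_validate(resolve):
    try:
        resolve()
        # Reached when resolve() did NOT raise; should fail the test...
        fail('expected resolve() to raise')
    except Exception:  # ...but this also catches the AssertionError.
        pass

# resolve() succeeds, fail() fires, yet nothing propagates:
broken_validate(lambda: None)

def fixed_validate(resolve):
    try:
        resolve()
    except Exception:
        return  # the expected failure path
    fail('expected resolve() to raise')  # now outside the try block

raised = False
try:
    fixed_validate(lambda: None)
except AssertionError:
    raised = True
assert raised
```

The fix is the same shape in Java: move the `fail` call outside the try block (or rethrow assertion errors) so only the expected exception is caught.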


[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87403&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87403
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:51
Start Date: 04/Apr/18 01:51
Worklog Time Spent: 10m 
  Work Description: shoyer commented on issue #4959: [BEAM-3956] Preserve 
stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#issuecomment-378453995
 
 
   @robertwb Indeed, `six.reraise()` is a definite improvement! It results in 
the original stacktrace being directly included as part of the result 
stacktrace:
   
   
   
   ```python-stacktrace
   ---
    ValueError                                Traceback (most recent call last)
in ()
 8   raise ValueError('internal failure!')
 9 
   ---> 10 [1, 2, 3] | beam.Map(f)
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/transforms/ptransform.pyc
 in __ror__(self, left, label)
   491 _allocate_materialized_pipeline(p)
   492 materialized_result = 
_AddMaterializationTransforms().visit(result)
   --> 493 p.run().wait_until_finish()
   494 _release_materialized_pipeline(p)
   495 return _FinalizeMaterialization().visit(materialized_result)
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/pipeline.pyc
 in run(self, test_runner_api)
   387 if test_runner_api and self._verify_runner_api_compatible():
   388   return Pipeline.from_runner_api(
   --> 389   self.to_runner_api(), self.runner, 
self._options).run(False)
   390 
   391 if self._options.view_as(TypeOptions).runtime_type_check:
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/pipeline.pyc
 in run(self, test_runner_api)
   400   finally:
   401 shutil.rmtree(tmpdir)
   --> 402 return self.runner.run_pipeline(self)
   403 
   404   def __enter__(self):
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/direct/direct_runner.pyc
 in run_pipeline(self, pipeline)
   133   runner = BundleBasedDirectRunner()
   134 
   --> 135 return runner.run_pipeline(pipeline)
   136 
   137 
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in run_pipeline(self, pipeline)
   213 from apache_beam.runners.dataflow.dataflow_runner import 
DataflowRunner
   214 pipeline.visit(DataflowRunner.group_by_key_input_visitor())
   --> 215 return self.run_via_runner_api(pipeline.to_runner_api())
   216 
   217   def run_via_runner_api(self, pipeline_proto):
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in run_via_runner_api(self, pipeline_proto)
   216 
   217   def run_via_runner_api(self, pipeline_proto):
   --> 218 return self.run_stages(*self.create_stages(pipeline_proto))
   219 
   220   def create_stages(self, pipeline_proto):
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in run_stages(self, pipeline_components, stages, safe_coders)
   835 metrics_by_stage[stage.name] = self.run_stage(
   836 controller, pipeline_components, stage,
   --> 837 pcoll_buffers, safe_coders).process_bundle.metrics
   838 finally:
   839   controller.close()
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in run_stage(self, controller, pipeline_components, stage, pcoll_buffers, 
safe_coders)
   936 return BundleManager(
   937 controller, get_buffer, process_bundle_descriptor,
   --> 938 self._progress_frequency).process_bundle(data_input, 
data_output)
   939 
   940   # These classes are used to interact with the worker.
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in process_bundle(self, inputs, expected_outputs)
  1108 process_bundle=beam_fn_api_pb2.ProcessBundleRequest(
  1109 
process_bundle_descriptor_reference=self._bundle_descriptor.id))
   -> 1110 result_future = 
self._controller.control_handler.push(process_bundle)
   
  1112 with ProgressRequester(
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in push(self, request)
  1001 request.instruction_id = 'control_%s' % self._uid_counter
  1002   logging.debug('CONTROL REQUEST %s', request)
   -> 1003   response = 
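The `six.reraise()` approach discussed in the comment above can be sketched with the standard library alone: in Python 3, `exc.with_traceback(tb)` has the same effect of carrying the original traceback through a re-raise. The runner-boundary helper below (`run_user_fn`) is a hypothetical stand-in, not Beam code:

```python
# Re-raising with the captured traceback keeps the user's frames (h here,
# like f/g/h in the example above) visible in the final stacktrace.
import sys
import traceback

def h(x):
    raise ValueError('internal failure!')

def run_user_fn(fn, x):
    """Hypothetical runner boundary that re-raises user errors."""
    try:
        return fn(x)
    except Exception:
        exc_type, exc_value, tb = sys.exc_info()
        # Attach the original traceback instead of raising a fresh
        # exception that forgets the user frames.
        raise exc_value.with_traceback(tb)

try:
    run_user_fn(h, 1)
except ValueError:
    formatted = traceback.format_exc()

# The user's function still appears in the formatted traceback.
assert 'in h' in formatted
assert 'internal failure!' in formatted
```

`six.reraise(exc_type, exc_value, tb)` is the Python 2/3-compatible spelling of the same re-raise, which is why the PR uses it rather than the Python 3-only syntax.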

[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87402&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87402
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:50
Start Date: 04/Apr/18 01:50
Worklog Time Spent: 10m 
  Work Description: shoyer commented on issue #4959: [BEAM-3956] Preserve 
stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#issuecomment-378453995
 
 

[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87401&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87401
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:50
Start Date: 04/Apr/18 01:50
Worklog Time Spent: 10m 
  Work Description: shoyer commented on issue #4959: [BEAM-3956] Preserve 
stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#issuecomment-378453995
 
 

[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87400&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87400
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:49
Start Date: 04/Apr/18 01:49
Worklog Time Spent: 10m 
  Work Description: shoyer commented on issue #4959: [BEAM-3956] Preserve 
stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#issuecomment-378453995
 
 
 in process_bundle(self, inputs, expected_outputs)
  1108 process_bundle=beam_fn_api_pb2.ProcessBundleRequest(
  1109 
process_bundle_descriptor_reference=self._bundle_descriptor.id))
   -> 1110 result_future = 
self._controller.control_handler.push(process_bundle)
   
  1112 with ProgressRequester(
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in push(self, request)
  1001 request.instruction_id = 'control_%s' % self._uid_counter
  1002   logging.debug('CONTROL REQUEST %s', request)
   -> 1003   response = 
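
The `six.reraise()` change discussed above boils down to re-raising the user's exception with the originally captured traceback object, so the user's own frames survive the runner machinery. A minimal Python 3 sketch of the idea (`run_user_fn` is an illustrative name, not Beam's API; `raise value.with_traceback(tb)` is the Python 3 spelling of `six.reraise(tp, value, tb)`):

```python
import sys
import traceback

def run_user_fn(fn, *args):
    """Call user code; if it fails, re-raise with the original traceback."""
    try:
        return fn(*args)
    except Exception:
        exc_type, exc_value, exc_tb = sys.exc_info()
        # Runner bookkeeping could happen here without losing the traceback.
        # Equivalent to six.reraise(exc_type, exc_value, exc_tb):
        raise exc_value.with_traceback(exc_tb)

def h(x):
    raise ValueError('internal failure!')

try:
    run_user_fn(h, 1)
except ValueError:
    tb_text = traceback.format_exc()

# The user's frame (`h`) survives in the formatted traceback.
assert 'in h' in tb_text and 'internal failure!' in tb_text
```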

[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87398&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87398
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:44
Start Date: 04/Apr/18 01:44
Worklog Time Spent: 10m 
  Work Description: shoyer commented on issue #4959: [BEAM-3956] Preserve 
stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#issuecomment-378453120
 
 
   > Thank you for this change! Do you have an example exception message with 
these changes that includes the stack trace?
   
   Sure, let me give a full example. Suppose we run the following code:
   ```
   import apache_beam as beam
   
   def f(x):
 return g(x)
   def g(x):
 return h(x)
   def h(x):
 raise ValueError('internal failure!')
 
   [1, 2, 3] | beam.Map(f)
   ```
   
   With current beam, I get a long traceback, but it only references internal 
details of beam, not anything about my code (it includes none of my internal 
function calls to `f()`, `g()` or `h()`):
   
   
   ```python-traceback
   ---
   ValueError                                Traceback (most recent call last)
in ()
 8   raise ValueError('internal failure!')
 9 
   ---> 10 [1, 2, 3] | beam.Map(f)
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/transforms/ptransform.pyc
 in __ror__(self, left, label)
   491 _allocate_materialized_pipeline(p)
   492 materialized_result = 
_AddMaterializationTransforms().visit(result)
   --> 493 p.run().wait_until_finish()
   494 _release_materialized_pipeline(p)
   495 return _FinalizeMaterialization().visit(materialized_result)
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/pipeline.pyc
 in run(self, test_runner_api)
   387 if test_runner_api and self._verify_runner_api_compatible():
   388   return Pipeline.from_runner_api(
   --> 389   self.to_runner_api(), self.runner, 
self._options).run(False)
   390 
   391 if self._options.view_as(TypeOptions).runtime_type_check:
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/pipeline.pyc
 in run(self, test_runner_api)
   400   finally:
   401 shutil.rmtree(tmpdir)
   --> 402 return self.runner.run_pipeline(self)
   403 
   404   def __enter__(self):
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/direct/direct_runner.pyc
 in run_pipeline(self, pipeline)
   133   runner = BundleBasedDirectRunner()
   134 
   --> 135 return runner.run_pipeline(pipeline)
   136 
   137 
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in run_pipeline(self, pipeline)
   213 from apache_beam.runners.dataflow.dataflow_runner import 
DataflowRunner
   214 pipeline.visit(DataflowRunner.group_by_key_input_visitor())
   --> 215 return self.run_via_runner_api(pipeline.to_runner_api())
   216 
   217   def run_via_runner_api(self, pipeline_proto):
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in run_via_runner_api(self, pipeline_proto)
   216 
   217   def run_via_runner_api(self, pipeline_proto):
   --> 218 return self.run_stages(*self.create_stages(pipeline_proto))
   219 
   220   def create_stages(self, pipeline_proto):
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in run_stages(self, pipeline_components, stages, safe_coders)
   835 metrics_by_stage[stage.name] = self.run_stage(
   836 controller, pipeline_components, stage,
   --> 837 pcoll_buffers, safe_coders).process_bundle.metrics
   838 finally:
   839   controller.close()
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in run_stage(self, controller, pipeline_components, stage, pcoll_buffers, 
safe_coders)
   936 return BundleManager(
   937 controller, get_buffer, process_bundle_descriptor,
   --> 938 self._progress_frequency).process_bundle(data_input, 
data_output)
   939 
   940   # These classes are used to interact with the worker.
   
   
/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.pyc
 in process_bundle(self, inputs, expected_outputs)
  1108 process_bundle=beam_fn_api_pb2.ProcessBundleRequest(
  1109 
process_bundle_descriptor_reference=self._bundle_descriptor.id))
   -> 1110   

Jenkins build is back to normal : beam_PerformanceTests_AvroIOIT_HDFS #6

2018-04-03 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3256) Add archetype testing/generation to existing GradleBuild PreCommit

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3256?focusedWorklogId=87397&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87397
 ]

ASF GitHub Bot logged work on BEAM-3256:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:26
Start Date: 04/Apr/18 01:26
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #5014: BEAM-3256 Add 
archetype testing/generation to existing GradleBuild Pr…
URL: https://github.com/apache/beam/pull/5014#issuecomment-378450181
 
 
   Run Seed Job


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87397)
Time Spent: 0.5h  (was: 20m)

> Add archetype testing/generation to existing GradleBuild PreCommit
> --
>
> Key: BEAM-3256
> URL: https://issues.apache.org/jira/browse/BEAM-3256
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Luke Cwik
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This build currently is not exercising the archetype build and tests 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Java_GradleBuild.groovy
> found here:
> https://github.com/apache/beam/tree/master/sdks/java/maven-archetypes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3965) HDFS read broken in python

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3965?focusedWorklogId=87394&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87394
 ]

ASF GitHub Bot logged work on BEAM-3965:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:18
Start Date: 04/Apr/18 01:18
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #4979: [BEAM-3965] HDFS Read 
fixes
URL: https://github.com/apache/beam/pull/4979#issuecomment-378448917
 
 
   run python postcommit




Issue Time Tracking
---

Worklog Id: (was: 87394)
Time Spent: 2h 40m  (was: 2.5h)

> HDFS read broken in python
> --
>
> Key: BEAM-3965
> URL: https://issues.apache.org/jira/browse/BEAM-3965
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> When running a command like:
> {noformat}
> python setup.py sdist > /dev/null && python -m apache_beam.examples.wordcount 
> --output gs://.../py-wordcount-output \
>   --hdfs_host ... --hdfs_port 50070 --hdfs_user ehudm --runner DataflowRunner 
> --project ... \
>   --temp_location gs://.../temp-hdfs-int --staging_location 
> gs://.../staging-hdfs-int \
>   --sdk_location dist/apache-beam-2.5.0.dev0.tar.gz --input 
> hdfs://kinglear.txt
> {noformat}
> I get:
> {noformat}
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
> "__main__", fname, loader, pkg_name)
>   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
> exec code in run_globals
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/examples/wordcount.py",
>  line 136, in <module>
> run()
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/examples/wordcount.py",
>  line 90, in run
> lines = p | 'read' >> ReadFromText(known_args.input)
>   File "apache_beam/io/textio.py", line 522, in __init__
> skip_header_lines=skip_header_lines)
>   File "apache_beam/io/textio.py", line 117, in __init__
> validate=validate)
>   File "apache_beam/io/filebasedsource.py", line 119, in __init__
> self._validate()
>   File "apache_beam/options/value_provider.py", line 124, in _f
> return fnc(self, *args, **kwargs)
>   File "apache_beam/io/filebasedsource.py", line 176, in _validate
> match_result = FileSystems.match([pattern], limits=[1])[0]
>   File "apache_beam/io/filesystems.py", line 159, in match
> return filesystem.match(patterns, limits)
>   File "apache_beam/io/hadoopfilesystem.py", line 221, in match
> raise BeamIOError('Match operation failed', exceptions)
> apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions 
> {'hdfs://kinglear.txt': KeyError('name',)}
> {noformat}





[jira] [Work logged] (BEAM-3965) HDFS read broken in python

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3965?focusedWorklogId=87392&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87392
 ]

ASF GitHub Bot logged work on BEAM-3965:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:18
Start Date: 04/Apr/18 01:18
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #4979: 
[BEAM-3965] HDFS Read fixes
URL: https://github.com/apache/beam/pull/4979#discussion_r179003650
 
 

 ##
 File path: sdks/python/apache_beam/io/hadoopfilesystem.py
 ##
 @@ -188,13 +194,16 @@ def _match(path_pattern, limit):
   """Find all matching paths to the pattern provided."""
   fs = self._hdfs_client.status(path_pattern, strict=False)
   if fs and fs[_FILE_STATUS_TYPE] == _FILE_STATUS_TYPE_FILE:
-file_statuses = [(fs[_FILE_STATUS_PATH_SUFFIX], fs)][:limit]
+file_statuses = [(path_pattern, fs)][:limit]
   else:
-file_statuses = self._hdfs_client.list(path_pattern,
-   status=True)[:limit]
-  metadata_list = [FileMetadata(file_status[1][_FILE_STATUS_NAME],
-file_status[1][_FILE_STATUS_SIZE])
-   for file_status in file_statuses]
+file_statuses = [(self._join(path_pattern, fs[0]), fs[1])
+ for fs in self._hdfs_client.list(path_pattern,
 
 Review comment:
   No glob patterns are supported at all. If a directory is passed it'll list 
its contents.
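
The match semantics described here (no glob expansion; a plain file path matches itself, a directory path lists its children) can be sketched with a stand-in client. `_FakeHdfsClient` and `match_one` are illustrative names for this sketch, not Beam's actual API:

```python
class _FakeHdfsClient(object):
    """Minimal stand-in for an hdfs client (illustration only)."""
    def __init__(self, files, dirs):
        self._files, self._dirs = files, dirs

    def status(self, path, strict=True):
        # Returns a FileStatus dict for a plain file, None otherwise.
        return self._files.get(path)

    def list(self, path, status=True):
        # Returns (child_name, FileStatus) pairs for a directory.
        return self._dirs.get(path, [])

def match_one(client, path):
    """No glob support: a file matches itself; a directory lists children."""
    fs = client.status(path, strict=False)
    if fs and fs['type'] == 'FILE':
        return [(path, fs)]
    return [('%s/%s' % (path.rstrip('/'), name), st)
            for name, st in client.list(path, status=True)]

client = _FakeHdfsClient(
    files={'/kinglear.txt': {'type': 'FILE', 'length': 10}},
    dirs={'/data': [('a.txt', {'type': 'FILE', 'length': 1})]})

# A file path matches exactly itself; a directory expands to its contents.
assert match_one(client, '/kinglear.txt')[0][0] == '/kinglear.txt'
assert match_one(client, '/data')[0][0] == '/data/a.txt'
```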




Issue Time Tracking
---

Worklog Id: (was: 87392)
Time Spent: 2h 20m  (was: 2h 10m)

> HDFS read broken in python
> --
>
> Key: BEAM-3965
> URL: https://issues.apache.org/jira/browse/BEAM-3965
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (BEAM-3965) HDFS read broken in python

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3965?focusedWorklogId=87390&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87390
 ]

ASF GitHub Bot logged work on BEAM-3965:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:18
Start Date: 04/Apr/18 01:18
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #4979: 
[BEAM-3965] HDFS Read fixes
URL: https://github.com/apache/beam/pull/4979#discussion_r179004608
 
 

 ##
 File path: sdks/python/apache_beam/io/hadoopfilesystem_test.py
 ##
 @@ -517,6 +511,51 @@ def test_delete_error(self):
 self.assertFalse(self.fs.exists(url2))
 
 
+class HadoopFileSystemRuntimeValueProviderTest(unittest.TestCase):
+  """Tests pipeline_options, as passed via RuntimeValueProvider."""
 
 Review comment:
   I updated the comment. This test tests that options from 
RuntimeValueProvider.runtime_options are understood by HadoopFileSystem.




Issue Time Tracking
---

Worklog Id: (was: 87390)
Time Spent: 2h  (was: 1h 50m)

> HDFS read broken in python
> --
>
> Key: BEAM-3965
> URL: https://issues.apache.org/jira/browse/BEAM-3965
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>





[jira] [Work logged] (BEAM-3965) HDFS read broken in python

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3965?focusedWorklogId=87388&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87388
 ]

ASF GitHub Bot logged work on BEAM-3965:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:18
Start Date: 04/Apr/18 01:18
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #4979: 
[BEAM-3965] HDFS Read fixes
URL: https://github.com/apache/beam/pull/4979#discussion_r179003120
 
 

 ##
 File path: sdks/python/apache_beam/io/hadoopfilesystem.py
 ##
 @@ -48,20 +49,19 @@
 _FILE_CHECKSUM_BYTES = 'bytes'
 _FILE_CHECKSUM_LENGTH = 'length'
 # WebHDFS FileStatus property constants.
-_FILE_STATUS_NAME = 'name'
+_FILE_STATUS_LENGTH = 'length'
 
 Review comment:
   There's already a comment stating: "WebHDFS FileStatus property constants".




Issue Time Tracking
---

Worklog Id: (was: 87388)
Time Spent: 1h 50m  (was: 1h 40m)

> HDFS read broken in python
> --
>
> Key: BEAM-3965
> URL: https://issues.apache.org/jira/browse/BEAM-3965
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (BEAM-3965) HDFS read broken in python

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3965?focusedWorklogId=87393&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87393
 ]

ASF GitHub Bot logged work on BEAM-3965:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:18
Start Date: 04/Apr/18 01:18
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #4979: 
[BEAM-3965] HDFS Read fixes
URL: https://github.com/apache/beam/pull/4979#discussion_r179003445
 
 

 ##
 File path: sdks/python/apache_beam/io/hadoopfilesystem.py
 ##
 @@ -106,20 +106,26 @@ def __init__(self, pipeline_options):
 """
 super(HadoopFileSystem, self).__init__(pipeline_options)
 logging.getLogger('hdfs.client').setLevel(logging.WARN)
-
 if pipeline_options is None:
   raise ValueError('pipeline_options is not set')
-hdfs_options = pipeline_options.view_as(HadoopFileSystemOptions)
-if hdfs_options.hdfs_host is None:
+if isinstance(pipeline_options, PipelineOptions):
+  hdfs_options = pipeline_options.view_as(HadoopFileSystemOptions)
+  hdfs_host = hdfs_options.hdfs_host
+  hdfs_port = hdfs_options.hdfs_port
+  hdfs_user = hdfs_options.hdfs_user
+else:
+  hdfs_host = pipeline_options.get('hdfs_host')
+  hdfs_port = pipeline_options.get('hdfs_port')
+  hdfs_user = pipeline_options.get('hdfs_user')
+
+if hdfs_host is None:
 
 Review comment:
   There's some minimal validation here:
   
https://github.com/apache/beam/blob/cf33dba157b6edaeb90f425fdf5dfa820bacc749/sdks/python/apache_beam/options/pipeline_options.py#L419
   
   Not sure I can validate much more than that.




Issue Time Tracking
---

Worklog Id: (was: 87393)
Time Spent: 2.5h  (was: 2h 20m)

> HDFS read broken in python
> --
>
> Key: BEAM-3965
> URL: https://issues.apache.org/jira/browse/BEAM-3965
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>





[jira] [Work logged] (BEAM-3965) HDFS read broken in python

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3965?focusedWorklogId=87387&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87387
 ]

ASF GitHub Bot logged work on BEAM-3965:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:18
Start Date: 04/Apr/18 01:18
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #4979: 
[BEAM-3965] HDFS Read fixes
URL: https://github.com/apache/beam/pull/4979#discussion_r178980173
 
 

 ##
 File path: sdks/python/run_postcommit.sh
 ##
 @@ -41,7 +41,7 @@ pip install virtualenv --user
 pip install tox --user
 
 # Tox runs unit tests in a virtual environment
-${LOCAL_PATH}/tox -e ALL -c sdks/python/tox.ini
+${LOCAL_PATH}/tox -e ALL -c sdks/python/tox.ini -v -v
 
 Review comment:
   Specifying the flag twice increases verbosity.
   I'll remove this change as it's irrelevant to this PR. I was using it to 
debug tox on Jenkins.




Issue Time Tracking
---

Worklog Id: (was: 87387)
Time Spent: 1h 40m  (was: 1.5h)

> HDFS read broken in python
> --
>
> Key: BEAM-3965
> URL: https://issues.apache.org/jira/browse/BEAM-3965
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (BEAM-3965) HDFS read broken in python

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3965?focusedWorklogId=87389&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87389
 ]

ASF GitHub Bot logged work on BEAM-3965:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:18
Start Date: 04/Apr/18 01:18
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #4979: 
[BEAM-3965] HDFS Read fixes
URL: https://github.com/apache/beam/pull/4979#discussion_r178981276
 
 

 ##
 File path: sdks/python/apache_beam/io/hadoopfilesystem.py
 ##
 @@ -188,13 +194,16 @@ def _match(path_pattern, limit):
   """Find all matching paths to the pattern provided."""
   fs = self._hdfs_client.status(path_pattern, strict=False)
   if fs and fs[_FILE_STATUS_TYPE] == _FILE_STATUS_TYPE_FILE:
-file_statuses = [(fs[_FILE_STATUS_PATH_SUFFIX], fs)][:limit]
+file_statuses = [(path_pattern, fs)][:limit]
   else:
-file_statuses = self._hdfs_client.list(path_pattern,
-   status=True)[:limit]
-  metadata_list = [FileMetadata(file_status[1][_FILE_STATUS_NAME],
-file_status[1][_FILE_STATUS_SIZE])
-   for file_status in file_statuses]
+file_statuses = [(self._join(path_pattern, fs[0]), fs[1])
+ for fs in self._hdfs_client.list(path_pattern,
+  status=True)[:limit]]
+  metadata_list = [
+  FileMetadata(
+  '%s:/%s' % (self.scheme(), file_status[0]),
 
 Review comment:
   done
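The fix under discussion turns bare names from a directory listing into full URLs by joining them onto the matched directory and prefixing the scheme. A minimal sketch of that pattern, using the same `'%s:/%s'` formatting as the diff (the helper `full_urls` is illustrative, not Beam's API):

```python
import posixpath

def full_urls(scheme, directory, listing):
    # Each listed entry is a bare name; join it back onto the directory it
    # came from, then prefix the scheme ('%s:/%s' works because the joined
    # path already carries a leading slash).
    return ['%s:/%s' % (scheme, posixpath.join(directory, name))
            for name in listing]

assert full_urls('hdfs', '/data', ['a.txt', 'b.txt']) == [
    'hdfs://data/a.txt', 'hdfs://data/b.txt']
```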




Issue Time Tracking
---

Worklog Id: (was: 87389)

> HDFS read broken in python
> --
>
> Key: BEAM-3965
> URL: https://issues.apache.org/jira/browse/BEAM-3965
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When running a command like:
> {noformat}
> python setup.py sdist > /dev/null && python -m apache_beam.examples.wordcount 
> --output gs://.../py-wordcount-output \
>   --hdfs_host ... --hdfs_port 50070 --hdfs_user ehudm --runner DataflowRunner 
> --project ... \
>   --temp_location gs://.../temp-hdfs-int --staging_location 
> gs://.../staging-hdfs-int \
>   --sdk_location dist/apache-beam-2.5.0.dev0.tar.gz --input 
> hdfs://kinglear.txt
> {noformat}
> I get:
> {noformat}
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
> "__main__", fname, loader, pkg_name)
>   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
> exec code in run_globals
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/examples/wordcount.py",
>  line 136, in <module>
> run()
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/examples/wordcount.py",
>  line 90, in run
> lines = p | 'read' >> ReadFromText(known_args.input)
>   File "apache_beam/io/textio.py", line 522, in __init__
> skip_header_lines=skip_header_lines)
>   File "apache_beam/io/textio.py", line 117, in __init__
> validate=validate)
>   File "apache_beam/io/filebasedsource.py", line 119, in __init__
> self._validate()
>   File "apache_beam/options/value_provider.py", line 124, in _f
> return fnc(self, *args, **kwargs)
>   File "apache_beam/io/filebasedsource.py", line 176, in _validate
> match_result = FileSystems.match([pattern], limits=[1])[0]
>   File "apache_beam/io/filesystems.py", line 159, in match
> return filesystem.match(patterns, limits)
>   File "apache_beam/io/hadoopfilesystem.py", line 221, in match
> raise BeamIOError('Match operation failed', exceptions)
> apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions 
> {'hdfs://kinglear.txt': KeyError('name',)}
> {noformat}





[jira] [Work logged] (BEAM-3965) HDFS read broken in python

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3965?focusedWorklogId=87391=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87391
 ]

ASF GitHub Bot logged work on BEAM-3965:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:18
Start Date: 04/Apr/18 01:18
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #4979: 
[BEAM-3965] HDFS Read fixes
URL: https://github.com/apache/beam/pull/4979#discussion_r179000174
 
 

 ##
 File path: sdks/python/apache_beam/io/filesystems.py
 ##
 @@ -85,7 +86,9 @@ def get_filesystem(path):
   if len(systems) == 0:
 raise ValueError('Unable to get the Filesystem for path %s' % path)
   elif len(systems) == 1:
-return systems[0](pipeline_options=FileSystems._pipeline_options)
+options = (FileSystems._pipeline_options or
+   RuntimeValueProvider.runtime_options)
 
 Review comment:
   RuntimeValueProvider.runtime_options is supported by the Dataflow runner (but 
not the direct runner, hence we still have the class variable _pipeline_options).
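The fallback being reviewed reads naturally as a one-line preference chain. The stub classes below are hypothetical stand-ins for Beam's, kept only to make the `or` fallback runnable:

```python
class FileSystems(object):
    _pipeline_options = None  # set when options are passed in explicitly

class RuntimeValueProvider(object):
    # Populated by the runner (e.g. Dataflow) at execution time.
    runtime_options = {'hdfs_host': 'namenode'}

# Prefer explicitly supplied pipeline options; otherwise fall back to the
# runner-provided runtime options.
options = (FileSystems._pipeline_options or
           RuntimeValueProvider.runtime_options)
assert options == {'hdfs_host': 'namenode'}
```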




Issue Time Tracking
---

Worklog Id: (was: 87391)
Time Spent: 2h 10m  (was: 2h)

> HDFS read broken in python
> --
>
> Key: BEAM-3965
> URL: https://issues.apache.org/jira/browse/BEAM-3965
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When running a command like:
> {noformat}
> python setup.py sdist > /dev/null && python -m apache_beam.examples.wordcount 
> --output gs://.../py-wordcount-output \
>   --hdfs_host ... --hdfs_port 50070 --hdfs_user ehudm --runner DataflowRunner 
> --project ... \
>   --temp_location gs://.../temp-hdfs-int --staging_location 
> gs://.../staging-hdfs-int \
>   --sdk_location dist/apache-beam-2.5.0.dev0.tar.gz --input 
> hdfs://kinglear.txt
> {noformat}
> I get:
> {noformat}
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
> "__main__", fname, loader, pkg_name)
>   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
> exec code in run_globals
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/examples/wordcount.py",
>  line 136, in <module>
> run()
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/examples/wordcount.py",
>  line 90, in run
> lines = p | 'read' >> ReadFromText(known_args.input)
>   File "apache_beam/io/textio.py", line 522, in __init__
> skip_header_lines=skip_header_lines)
>   File "apache_beam/io/textio.py", line 117, in __init__
> validate=validate)
>   File "apache_beam/io/filebasedsource.py", line 119, in __init__
> self._validate()
>   File "apache_beam/options/value_provider.py", line 124, in _f
> return fnc(self, *args, **kwargs)
>   File "apache_beam/io/filebasedsource.py", line 176, in _validate
> match_result = FileSystems.match([pattern], limits=[1])[0]
>   File "apache_beam/io/filesystems.py", line 159, in match
> return filesystem.match(patterns, limits)
>   File "apache_beam/io/hadoopfilesystem.py", line 221, in match
> raise BeamIOError('Match operation failed', exceptions)
> apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions 
> {'hdfs://kinglear.txt': KeyError('name',)}
> {noformat}





[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87386=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87386
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 01:09
Start Date: 04/Apr/18 01:09
Worklog Time Spent: 10m 
  Work Description: shoyer commented on a change in pull request #4959: 
[BEAM-3956] Preserve stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#discussion_r179003884
 
 

 ##
 File path: sdks/python/apache_beam/transforms/ptransform_test.py
 ##
 @@ -2015,8 +2015,8 @@ def test_runtime_type_check_python_type_error(self):
 # Our special type-checking related TypeError shouldn't have been raised.
 # Instead the above pipeline should have triggered a regular Python runtime
 # TypeError.
-self.assertEqual("object of type 'int' has no len() [while running 'Len']",
- e.exception.args[0])
+expected_start = "object of type 'int' has no len() [while running 'Len']"
+self.assertEqual(expected_start, e.exception.args[0][:len(expected_start)])
 
 Review comment:
   Will do. I didn't use `startswith` because I had confused it with `strip` 
(which matches characters, not strings).
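The distinction behind the mix-up is worth a two-line demo: `startswith` tests a prefix substring, while `strip` removes a *set of characters* from both ends.

```python
msg = "object of type 'int' has no len() [while running 'Len']"
expected_start = "object of type 'int' has no len()"

# startswith: does the string begin with this exact substring?
assert msg.startswith(expected_start)

# strip: remove any of these characters from both ends (not a substring match).
assert 'xxhelloxx'.strip('x') == 'hello'
assert 'xhxelxlox'.strip('x') == 'hxelxlo'  # interior characters untouched
```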




Issue Time Tracking
---

Worklog Id: (was: 87386)
Time Spent: 2h 10m  (was: 2h)

> Stacktraces from exceptions in user code should be preserved in the Python SDK
> --
>
> Key: BEAM-3956
> URL: https://issues.apache.org/jira/browse/BEAM-3956
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Stephan Hoyer
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Currently, Beam's Python SDK loses stacktraces for exceptions. It does 
> helpfully add a tag like "[while running StageA]" to exception error 
> messages, but that doesn't include the stacktrace of Python functions being 
> called.
> Including the full stacktraces would make a big difference for the ease of 
> debugging Beam pipelines when things go wrong.





Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #6364

2018-04-03 Thread Apache Jenkins Server
See 




[beam] branch master updated (f12dc8a -> a1ef0aa)

2018-04-03 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from f12dc8a  Merge pull request #5013: Add godoc for exported methods
 add a82333a  Use utcnow() in determining credential refresh time
 new a1ef0aa  Merge pull request #4997 from charlesccychen/fix-oauth2-now

The 1 revision listed above as "new" is entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/python/apache_beam/internal/gcp/auth.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
To stop receiving notification emails like this one, please contact
chamik...@apache.org.


[beam] 01/01: Merge pull request #4997 from charlesccychen/fix-oauth2-now

2018-04-03 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit a1ef0aac298e10a04a8ee5afea4765374a9c7508
Merge: f12dc8a a82333a
Author: Chamikara Jayalath 
AuthorDate: Tue Apr 3 17:51:36 2018 -0700

Merge pull request #4997 from charlesccychen/fix-oauth2-now

Use utcnow() in determining credential refresh time

 sdks/python/apache_beam/internal/gcp/auth.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
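A minimal illustration of why the one-line switch to `utcnow()` matters, assuming (as is typical for OAuth credentials) that token expiry timestamps are naive UTC datetimes:

```python
from datetime import datetime, timedelta

# An expiry stamp expressed in UTC, as OAuth token endpoints report it.
expiry = datetime.utcnow() + timedelta(minutes=30)

# Comparing against utcnow() measures the true time remaining; comparing a
# UTC stamp against local-time now() would be off by the machine's UTC offset.
remaining = expiry - datetime.utcnow()
assert timedelta(0) < remaining <= timedelta(minutes=30)
```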



[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87382=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87382
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 00:48
Start Date: 04/Apr/18 00:48
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #4959: 
[BEAM-3956] Preserve stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#discussion_r179001282
 
 

 ##
 File path: sdks/python/apache_beam/runners/common.py
 ##
 @@ -522,17 +522,20 @@ def _reraise_augmented(self, exn):
 step_annotation = " [while running '%s']" % self.step_name
 # To emulate exception chaining (not available in Python 2).
 original_traceback = sys.exc_info()[2]
+stacktrace_text = '\n' + '\n'.join(traceback.format_exception(
+type(exn), exn, original_traceback))
 try:
   # Attempt to construct the same kind of exception
   # with an augmented message.
-  new_exn = type(exn)(exn.args[0] + step_annotation, *exn.args[1:])
+  new_exn = type(exn)(exn.args[0] + step_annotation + stacktrace_text,
+  *exn.args[1:])
   new_exn._tagged_with_step = True  # Could raise attribute error.
 except:  # pylint: disable=bare-except
   # If anything goes wrong, construct a RuntimeError whose message
   # records the original exception's type and message.
   new_exn = RuntimeError(
   traceback.format_exception_only(type(exn), exn)[-1].strip()
-  + step_annotation)
+  + step_annotation + stacktrace_text)
   new_exn._tagged_with_step = True
 six.raise_from(new_exn, original_traceback)
 
 Review comment:
   Rather than string-ifying the traceback, we could use 
six.reraise(type(new_exn), new_exn, original_traceback).
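The alternative suggested here, re-raising the augmented exception with the original traceback attached rather than string-ifying it, can be sketched in plain Python; `with_traceback` is the Python 3 spelling of what `six.reraise(type, value, tb)` does, and `failing_user_code` is a hypothetical stand-in:

```python
import sys

def failing_user_code():
    return {}['missing']  # stand-in for user code that raises

def reraise_augmented(step_name):
    try:
        failing_user_code()
    except Exception as exn:
        original_traceback = sys.exc_info()[2]
        # Same exception type, augmented message ...
        new_exn = type(exn)("%s [while running '%s']" % (exn.args[0], step_name))
        # ... re-raised with the original traceback, so the frames inside
        # failing_user_code survive for the caller to inspect.
        raise new_exn.with_traceback(original_traceback)

try:
    reraise_augmented('StageA')
except KeyError as exc:
    assert "while running 'StageA'" in str(exc)
```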




Issue Time Tracking
---

Worklog Id: (was: 87382)
Time Spent: 2h  (was: 1h 50m)

> Stacktraces from exceptions in user code should be preserved in the Python SDK
> --
>
> Key: BEAM-3956
> URL: https://issues.apache.org/jira/browse/BEAM-3956
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Stephan Hoyer
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently, Beam's Python SDK loses stacktraces for exceptions. It does 
> helpfully add a tag like "[while running StageA]" to exception error 
> messages, but that doesn't include the stacktrace of Python functions being 
> called.
> Including the full stacktraces would make a big difference for the ease of 
> debugging Beam pipelines when things go wrong.





[jira] [Work logged] (BEAM-3774) Update BigQuery jobs to explicitly specify the region

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3774?focusedWorklogId=87361=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87361
 ]

ASF GitHub Bot logged work on BEAM-3774:


Author: ASF GitHub Bot
Created on: 04/Apr/18 00:03
Start Date: 04/Apr/18 00:03
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #5001: [BEAM-3774] Adds 
support for reading from/writing to more BQ geographical locations
URL: https://github.com/apache/beam/pull/5001#issuecomment-378437356
 
 
   Thanks.




Issue Time Tracking
---

Worklog Id: (was: 87361)
Time Spent: 1h 20m  (was: 1h 10m)

> Update BigQuery jobs to explicitly specify the region
> -
>
> Key: BEAM-3774
> URL: https://issues.apache.org/jira/browse/BEAM-3774
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This is needed to support BQ regions other than US and EU. Region can be 
> obtained by a Dataset.get() request so no need to update the user API.
> Both Python and Java SDKs have to be updated.
>  





[jira] [Work logged] (BEAM-3774) Update BigQuery jobs to explicitly specify the region

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3774?focusedWorklogId=87359=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87359
 ]

ASF GitHub Bot logged work on BEAM-3774:


Author: ASF GitHub Bot
Created on: 04/Apr/18 00:03
Start Date: 04/Apr/18 00:03
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5001: [BEAM-3774] Adds support for reading from/writing to more BQ 
geographical locations
URL: https://github.com/apache/beam/pull/5001#discussion_r178995621
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryQuerySource.java
 ##
 @@ -97,13 +109,17 @@ public long getEstimatedSizeBytes(PipelineOptions 
options) throws Exception {
   protected TableReference getTableToExtract(BigQueryOptions bqOptions)
   throws IOException, InterruptedException {
 // 1. Find the location of the query.
-String location = null;
-List<TableReference> referencedTables =
-dryRunQueryIfNeeded(bqOptions).getQuery().getReferencedTables();
+String location = this.location;
 DatasetService tableService = bqServices.getDatasetService(bqOptions);
-if (referencedTables != null && !referencedTables.isEmpty()) {
-  TableReference queryTable = referencedTables.get(0);
-  location = tableService.getTable(queryTable).getLocation();
+if (location == null) {
+  // If the location was not provided, we try to determine it from the tables
+  // referenced by the query. This will only work for BQ locations US and EU.
+  List<TableReference> referencedTables =
+  dryRunQueryIfNeeded(bqOptions).getQuery().getReferencedTables();
+  if (referencedTables != null && !referencedTables.isEmpty()) {
+TableReference queryTable = referencedTables.get(0);
 
 Review comment:
   Yes. You can run queries over multiple tables/datasets but they have to be 
in the same geographical location.




Issue Time Tracking
---

Worklog Id: (was: 87359)
Time Spent: 1h 10m  (was: 1h)

> Update BigQuery jobs to explicitly specify the region
> -
>
> Key: BEAM-3774
> URL: https://issues.apache.org/jira/browse/BEAM-3774
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> This is needed to support BQ regions other than US and EU. Region can be 
> obtained by a Dataset.get() request so no need to update the user API.
> Both Python and Java SDKs have to be updated.
>  





[jira] [Work logged] (BEAM-3256) Add archetype testing/generation to existing GradleBuild PreCommit

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3256?focusedWorklogId=87356=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87356
 ]

ASF GitHub Bot logged work on BEAM-3256:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:58
Start Date: 03/Apr/18 23:58
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #5014: BEAM-3256 Add 
archetype testing/generation to existing GradleBuild Pr…
URL: https://github.com/apache/beam/pull/5014#issuecomment-378436374
 
 
   Run Seed Job




Issue Time Tracking
---

Worklog Id: (was: 87356)
Time Spent: 20m  (was: 10m)

> Add archetype testing/generation to existing GradleBuild PreCommit
> --
>
> Key: BEAM-3256
> URL: https://issues.apache.org/jira/browse/BEAM-3256
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Luke Cwik
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This build currently is not exercising the archetype build and tests 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Java_GradleBuild.groovy
> found here:
> https://github.com/apache/beam/tree/master/sdks/java/maven-archetypes





[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87352=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87352
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:57
Start Date: 03/Apr/18 23:57
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #4959: 
[BEAM-3956] Preserve stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#discussion_r178990648
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/fn_api_runner_test.py
 ##
 @@ -265,7 +265,7 @@ def test_windowing(self):
  | beam.Map(lambda k_vs1: (k_vs1[0], sorted(k_vs1[1]
   assert_that(res, equal_to([('k', [1, 2]), ('k', [100, 101, 102])]))
 
-  def test_errors(self):
+  def test_errors_stage(self):
 
 Review comment:
   I suggest making test names more descriptive:
   test_exception_message_includes_stage_name,
   test_exception_message_includes_stacktrace.




Issue Time Tracking
---

Worklog Id: (was: 87352)
Time Spent: 1h 50m  (was: 1h 40m)

> Stacktraces from exceptions in user code should be preserved in the Python SDK
> --
>
> Key: BEAM-3956
> URL: https://issues.apache.org/jira/browse/BEAM-3956
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Stephan Hoyer
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently, Beam's Python SDK loses stacktraces for exceptions. It does 
> helpfully add a tag like "[while running StageA]" to exception error 
> messages, but that doesn't include the stacktrace of Python functions being 
> called.
> Including the full stacktraces would make a big difference for the ease of 
> debugging Beam pipelines when things go wrong.





[jira] [Work logged] (BEAM-3256) Add archetype testing/generation to existing GradleBuild PreCommit

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3256?focusedWorklogId=87354=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87354
 ]

ASF GitHub Bot logged work on BEAM-3256:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:57
Start Date: 03/Apr/18 23:57
Worklog Time Spent: 10m 
  Work Description: yifanzou opened a new pull request #5014: BEAM-3256 Add 
archetype testing/generation to existing GradleBuild Pr…
URL: https://github.com/apache/beam/pull/5014
 
 
   …eCommit
   
   DESCRIPTION HERE
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
- [ ] Write a pull request description that is detailed enough to 
understand:
  - [ ] What the pull request does
  - [ ] Why it does it
  - [ ] How it does it
  - [ ] Why this approach
- [ ] Each commit in the pull request should have a meaningful subject line 
and body.
- [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   




Issue Time Tracking
---

Worklog Id: (was: 87354)
Time Spent: 10m
Remaining Estimate: 0h

> Add archetype testing/generation to existing GradleBuild PreCommit
> --
>
> Key: BEAM-3256
> URL: https://issues.apache.org/jira/browse/BEAM-3256
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Luke Cwik
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This build currently is not exercising the archetype build and tests 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Java_GradleBuild.groovy
> found here:
> https://github.com/apache/beam/tree/master/sdks/java/maven-archetypes





[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87351=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87351
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:57
Start Date: 03/Apr/18 23:57
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #4959: 
[BEAM-3956] Preserve stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#discussion_r178990621
 
 

 ##
 File path: sdks/python/apache_beam/transforms/ptransform_test.py
 ##
 @@ -2015,8 +2015,8 @@ def test_runtime_type_check_python_type_error(self):
 # Our special type-checking related TypeError shouldn't have been raised.
 # Instead the above pipeline should have triggered a regular Python runtime
 # TypeError.
-self.assertEqual("object of type 'int' has no len() [while running 'Len']",
- e.exception.args[0])
+expected_start = "object of type 'int' has no len() [while running 'Len']"
+self.assertEqual(expected_start, e.exception.args[0][:len(expected_start)])
 
 Review comment:
   can we use str.startswith() here?




Issue Time Tracking
---

Worklog Id: (was: 87351)
Time Spent: 1h 40m  (was: 1.5h)

> Stacktraces from exceptions in user code should be preserved in the Python SDK
> --
>
> Key: BEAM-3956
> URL: https://issues.apache.org/jira/browse/BEAM-3956
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Stephan Hoyer
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently, Beam's Python SDK loses stacktraces for exceptions. It does 
> helpfully add a tag like "[while running StageA]" to exception error 
> messages, but that doesn't include the stacktrace of Python functions being 
> called.
> Including the full stacktraces would make a big difference for the ease of 
> debugging Beam pipelines when things go wrong.





[jira] [Work logged] (BEAM-3938) Gradle publish task should authenticate when run from jenkins

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3938?focusedWorklogId=87355=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87355
 ]

ASF GitHub Bot logged work on BEAM-3938:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:57
Start Date: 03/Apr/18 23:57
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5004: 
[BEAM-3938] Add publish gradle task
URL: https://github.com/apache/beam/pull/5004#discussion_r178994544
 
 

 ##
 File path: gradle/publish.gradle
 ##
 @@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+subprojects {
+  apply plugin: 'com.bmuschko.nexus'
 
 Review comment:
   We already have a maven-publish task defined here:
   
https://github.com/apache/beam/blob/889520fcd0ba83f1633cd1f08e33446b65c0e874/build_rules.gradle#L336
   
   Is there a reason why you want to migrate to using the nexus plugin?




Issue Time Tracking
---

Worklog Id: (was: 87355)
Time Spent: 20m  (was: 10m)

> Gradle publish task should authenticate when run from jenkins
> -
>
> Key: BEAM-3938
> URL: https://issues.apache.org/jira/browse/BEAM-3938
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> ./gradlew publish should be able to write to 
> [https://repository.apache.org/content/repositories/snapshots] when run from 
> jenkins, as the maven 
> [job_beam_Release_NightlySnapshot.groovy|https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_Release_NightlySnapshot.groovy]
>  does.





[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87353=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87353
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:57
Start Date: 03/Apr/18 23:57
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #4959: 
[BEAM-3956] Preserve stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#discussion_r178990636
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/portability/universal_local_runner_test.py
 ##
 @@ -97,6 +97,22 @@ def raise_error(x):
  | 'StageC' >> beam.Map(raise_error)
  | 'StageD' >> beam.Map(lambda x: x))
 
+  def test_errors_traceback(self):
+# TODO: figure out a way for runner to parse and raise the
 
 Review comment:
   Does the test cover anything that test_errors_stage does not?




Issue Time Tracking
---

Worklog Id: (was: 87353)
Time Spent: 1h 50m  (was: 1h 40m)

> Stacktraces from exceptions in user code should be preserved in the Python SDK
> --
>
> Key: BEAM-3956
> URL: https://issues.apache.org/jira/browse/BEAM-3956
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Stephan Hoyer
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently, Beam's Python SDK loses stacktraces for exceptions. It does 
> helpfully add a tag like "[while running StageA]" to exception error 
> messages, but that doesn't include the stacktrace of Python functions being 
> called.
> Including the full stacktraces would make a big difference for the ease of 
> debugging Beam pipelines when things go wrong.
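The fix discussed in the PR above amounts to re-raising a tagged exception while attaching the original traceback. The sketch below shows the general Python technique; `run_stage` and `user_fn` are illustrative stand-ins, not Beam's actual internals.

```python
import sys
import traceback


def run_stage(fn, value, stage_name):
    """Call a user function, tagging errors with the stage name while
    preserving the original traceback (hypothetical helper, not Beam API)."""
    try:
        return fn(value)
    except Exception as exc:
        tagged = RuntimeError("%s [while running %s]" % (exc, stage_name))
        # Attach the in-flight traceback so the user's frames survive.
        raise tagged.with_traceback(sys.exc_info()[2])


def user_fn(x):
    raise ValueError("bad element")


try:
    run_stage(user_fn, 1, "StageC")
except RuntimeError as exc:
    tb = "".join(traceback.format_tb(exc.__traceback__))
    assert "user_fn" in tb                       # original frame preserved
    assert "[while running StageC]" in str(exc)  # stage tag preserved
```

Without `with_traceback` (or `raise ... from ...`), the re-raised exception would only point at the re-raise site, losing the user-code frames — which is exactly the debugging problem the issue describes.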



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #5273

2018-04-03 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3774) Update BigQuery jobs to explicitly specify the region

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3774?focusedWorklogId=87345&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87345
 ]

ASF GitHub Bot logged work on BEAM-3774:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:46
Start Date: 03/Apr/18 23:46
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #5001: 
[BEAM-3774] Adds support for reading from/writing to more BQ geographical 
locations
URL: https://github.com/apache/beam/pull/5001#discussion_r178992977
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryQuerySource.java
 ##
 @@ -97,13 +109,17 @@ public long getEstimatedSizeBytes(PipelineOptions 
options) throws Exception {
   protected TableReference getTableToExtract(BigQueryOptions bqOptions)
   throws IOException, InterruptedException {
 // 1. Find the location of the query.
-String location = null;
-List<TableReference> referencedTables =
-dryRunQueryIfNeeded(bqOptions).getQuery().getReferencedTables();
+String location = this.location;
 DatasetService tableService = bqServices.getDatasetService(bqOptions);
-if (referencedTables != null && !referencedTables.isEmpty()) {
-  TableReference queryTable = referencedTables.get(0);
-  location = tableService.getTable(queryTable).getLocation();
+if (location == null) {
+  // If location was not provided we try to determine it from the tables 
referenced by the
+  // Query. This will only work for BQ locations US and EU.
+  List<TableReference> referencedTables =
+  dryRunQueryIfNeeded(bqOptions).getQuery().getReferencedTables();
+  if (referencedTables != null && !referencedTables.isEmpty()) {
+TableReference queryTable = referencedTables.get(0);
 
 Review comment:
   Queries can run over multiple tables?
   
   It's assumed that a query running over tables in multiple locations is 
forbidden, right?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87345)
Time Spent: 1h  (was: 50m)

> Update BigQuery jobs to explicitly specify the region
> -
>
> Key: BEAM-3774
> URL: https://issues.apache.org/jira/browse/BEAM-3774
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This is needed to support BQ regions other than US and EU. Region can be 
> obtained by a Dataset.get() request so no need to update the user API.
> Both Python and Java SDKs have to be updated.
>  
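The diff reviewed above prefers an explicitly supplied location and only falls back to the first referenced table's location (a fallback that only works for the multi-regional US/EU locations). The same decision logic can be sketched in Python against a stubbed table service; `resolve_query_location`, `FakeTable`, and the table-key format are stand-ins for illustration, not the real BigQuery client API.

```python
def resolve_query_location(explicit_location, referenced_tables, get_table):
    """Return the location a query job should run in.

    Prefer an explicitly supplied location; otherwise fall back to the
    location of the first table referenced by the query; otherwise None.
    """
    if explicit_location is not None:
        return explicit_location
    if referenced_tables:
        return get_table(referenced_tables[0]).location
    return None


class FakeTable:
    """Minimal stand-in for a table resource carrying only a location."""
    def __init__(self, location):
        self.location = location


tables = {"project.dataset.t1": FakeTable("asia-northeast1")}

# Explicit location wins over the referenced table's location.
assert resolve_query_location("EU", ["project.dataset.t1"], tables.get) == "EU"
# No explicit location: fall back to the first referenced table.
assert resolve_query_location(None, ["project.dataset.t1"], tables.get) == "asia-northeast1"
# Nothing to infer from: caller must handle None.
assert resolve_query_location(None, [], tables.get) is None
```

This mirrors why the PR adds `withQueryLocation`: for regions other than US and EU, the dry-run fallback cannot be relied on, so the user-supplied location must take precedence.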



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=87343&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87343
 ]

ASF GitHub Bot logged work on BEAM-3856:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:45
Start Date: 03/Apr/18 23:45
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #4939: 
[BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub
URL: https://github.com/apache/beam/pull/4939#issuecomment-378434195
 
 
   R: @robertwb 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87343)
Time Spent: 1h 10m  (was: 1h)

> Add prototype support for Go SDK streaming
> --
>
> Key: BEAM-3856
> URL: https://issues.apache.org/jira/browse/BEAM-3856
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Willy Lulciuc
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=87344&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87344
 ]

ASF GitHub Bot logged work on BEAM-3856:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:45
Start Date: 03/Apr/18 23:45
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #4939: 
[BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub
URL: https://github.com/apache/beam/pull/4939#issuecomment-378434195
 
 
   R: @robertwb (please merge)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87344)
Time Spent: 1h 20m  (was: 1h 10m)

> Add prototype support for Go SDK streaming
> --
>
> Key: BEAM-3856
> URL: https://issues.apache.org/jira/browse/BEAM-3856
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Willy Lulciuc
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=87342&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87342
 ]

ASF GitHub Bot logged work on BEAM-3856:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:44
Start Date: 03/Apr/18 23:44
Worklog Time Spent: 10m 
  Work Description: herohde commented on a change in pull request #4939: 
[BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub
URL: https://github.com/apache/beam/pull/4939#discussion_r178992732
 
 

 ##
 File path: sdks/go/examples/streaming_wordcap/wordcap.go
 ##
 @@ -0,0 +1,81 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// streaming_wordcap is a toy streaming pipeline that uses PubSub. It
+// does the following:
+//(1) create a topic and publish a few messages to it
+//(2) start a streaming pipeline that converts the messages to
+//upper case and logs the result.
+//
+// NOTE: it only runs on Dataflow and must be manually cancelled.
+package main
+
+import (
+   "context"
+   "flag"
+   "os"
+   "strings"
+
+   "github.com/apache/beam/sdks/go/pkg/beam"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/util/stringx"
+   "github.com/apache/beam/sdks/go/pkg/beam/io/pubsubio"
+   "github.com/apache/beam/sdks/go/pkg/beam/log"
+   "github.com/apache/beam/sdks/go/pkg/beam/options/gcpopts"
+   "github.com/apache/beam/sdks/go/pkg/beam/util/pubsubx"
+   "github.com/apache/beam/sdks/go/pkg/beam/x/beamx"
+   "github.com/apache/beam/sdks/go/pkg/beam/x/debug"
+)
+
+var (
+   input = flag.String("input", os.ExpandEnv("$USER-wordcap"), "Pubsub 
input topic.")
+)
+
+var (
+   data = []string{
+   "foo",
+   "bar",
+   "baz",
+   }
+)
+
+func main() {
+   flag.Parse()
+   beam.Init()
+
+   ctx := context.Background()
+   project := gcpopts.GetProject(ctx)
+
+   log.Infof(ctx, "Publishing %v messages to: %v", len(data), *input)
+
+   defer pubsubx.CleanupTopic(ctx, project, *input)
+   sub, err := pubsubx.Publish(ctx, project, *input, data...)
+   if err != nil {
+   log.Fatal(ctx, err)
+   }
+
+   log.Infof(ctx, "Running streaming wordcap with subscription: %v", 
sub.ID())
+
+   p := beam.NewPipeline()
+   s := p.Root()
+
+   col := pubsubio.Read(s, project, *input, &pubsubio.ReadOptions{Subscription: sub.ID()})
+   str := beam.ParDo(s, stringx.FromBytes, col)
+   cap := beam.ParDo(s, strings.ToUpper, str)
 
 Review comment:
   It does not. Function registration is just a tiny performance optimization 
to avoid hitting the symbol table to find that function on the worker. It has 
no semantic impact. I'd prefer to not do it for the examples to keep them 
simpler.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87342)
Time Spent: 1h  (was: 50m)

> Add prototype support for Go SDK streaming
> --
>
> Key: BEAM-3856
> URL: https://issues.apache.org/jira/browse/BEAM-3856
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Willy Lulciuc
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3774) Update BigQuery jobs to explicitly specify the region

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3774?focusedWorklogId=87337&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87337
 ]

ASF GitHub Bot logged work on BEAM-3774:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:26
Start Date: 03/Apr/18 23:26
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #5001: [BEAM-3774] Adds 
support for reading from/writing to more BQ geographical locations
URL: https://github.com/apache/beam/pull/5001#issuecomment-378431075
 
 
   Thanks. PTAL.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87337)
Time Spent: 50m  (was: 40m)

> Update BigQuery jobs to explicitly specify the region
> -
>
> Key: BEAM-3774
> URL: https://issues.apache.org/jira/browse/BEAM-3774
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This is needed to support BQ regions other than US and EU. Region can be 
> obtained by a Dataset.get() request so no need to update the user API.
> Both Python and Java SDKs have to be updated.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3774) Update BigQuery jobs to explicitly specify the region

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3774?focusedWorklogId=87335&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87335
 ]

ASF GitHub Bot logged work on BEAM-3774:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:26
Start Date: 03/Apr/18 23:26
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5001: [BEAM-3774] Adds support for reading from/writing to more BQ 
geographical locations
URL: https://github.com/apache/beam/pull/5001#discussion_r178990101
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
 ##
 @@ -170,6 +170,11 @@
  * .fromQuery("SELECT year, mean_temp FROM [samples.weather_stations]"));
  * }
  *
+ * Users can optionally specify a query priority using {@link 
TypedRead#withQueryPriority(
+ * TypedRead.QueryPriority)} and a geographic location where the query will be 
executed using
+ * {@link TypedRead#withQueryLocation(String)}. Query location must be 
specified for jobs that are
+ * not executed in US or EU.
 
 Review comment:
   Done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87335)
Time Spent: 40m  (was: 0.5h)

> Update BigQuery jobs to explicitly specify the region
> -
>
> Key: BEAM-3774
> URL: https://issues.apache.org/jira/browse/BEAM-3774
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> This is needed to support BQ regions other than US and EU. Region can be 
> obtained by a Dataset.get() request so no need to update the user API.
> Both Python and Java SDKs have to be updated.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3774) Update BigQuery jobs to explicitly specify the region

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3774?focusedWorklogId=87336&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87336
 ]

ASF GitHub Bot logged work on BEAM-3774:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:26
Start Date: 03/Apr/18 23:26
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5001: [BEAM-3774] Adds support for reading from/writing to more BQ 
geographical locations
URL: https://github.com/apache/beam/pull/5001#discussion_r178990118
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
 ##
 @@ -687,7 +696,7 @@ public void validate(PipelineOptions options) {
 new JobConfigurationQuery()
 .setQuery(getQuery().get())
 .setFlattenResults(getFlattenResults())
-.setUseLegacySql(getUseLegacySql()));
+.setUseLegacySql(getUseLegacySql()), getQueryLocation());
 
 Review comment:
   Done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87336)

> Update BigQuery jobs to explicitly specify the region
> -
>
> Key: BEAM-3774
> URL: https://issues.apache.org/jira/browse/BEAM-3774
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> This is needed to support BQ regions other than US and EU. Region can be 
> obtained by a Dataset.get() request so no need to update the user API.
> Both Python and Java SDKs have to be updated.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87333&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87333
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:23
Start Date: 03/Apr/18 23:23
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #5012: [BEAM-3250] Creating 
a Gradle Jenkins config for Flink PostCommit.
URL: https://github.com/apache/beam/pull/5012#issuecomment-378430574
 
 
   @ajamato 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87333)
Time Spent: 20m  (was: 10m)

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Alex Amato
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy&q=ValidatesRunner&type=&utf8=%E2%9C%93



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87332
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:23
Start Date: 03/Apr/18 23:23
Worklog Time Spent: 10m 
  Work Description: youngoli opened a new pull request #5012: [BEAM-3250] 
Creating a Gradle Jenkins config for Flink PostCommit.
URL: https://github.com/apache/beam/pull/5012
 
 
   Started with the Flink PostCommit first because a Gradle config for it was 
already written. This is my attempt at creating a simple Gradle config before 
moving onto the more complex PostCommits. It just executes the Gradle 
:runners:flink:validatesRunner task.
   
   NOTE: I still need to create a seed job and test this on Jenkins. I will 
post a comment confirming that this has been tested once I do so.
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [x] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
- [x] Write a pull request description that is detailed enough to 
understand:
  - [x] What the pull request does
  - [x] Why it does it
  - [x] How it does it
  - [x] Why this approach
- [x] Each commit in the pull request should have a meaningful subject line 
and body.
- [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87332)
Time Spent: 10m
Remaining Estimate: 0h

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Alex Amato
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy&q=ValidatesRunner&type=&utf8=%E2%9C%93



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3942) Update performance testing framework to use Gradle.

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3942?focusedWorklogId=87331&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87331
 ]

ASF GitHub Bot logged work on BEAM-3942:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:23
Start Date: 03/Apr/18 23:23
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #5003: [BEAM-3942] Update 
performance testing framework to use Gradle
URL: https://github.com/apache/beam/pull/5003#issuecomment-378430503
 
 
   I had some ideas on how to improve this PR: 
https://github.com/lgajowy/beam/pull/3


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87331)
Time Spent: 1h 20m  (was: 1h 10m)

> Update performance testing framework to use Gradle.
> ---
>
> Key: BEAM-3942
> URL: https://issues.apache.org/jira/browse/BEAM-3942
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Łukasz Gajowy
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This requires performing updates to PerfKitBenchmarker and Beam so that we 
> can execute performance tests using Gradle.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1247

2018-04-03 Thread Apache Jenkins Server
See 


Changes:

[chamikara] Revert "[BEAM-2264] Credentials were not being reused between GCS 
calls"

--
[...truncated 1009.77 KB...]
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "assert_that/Group/GroupByKey.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s11"
}, 
"serialized_fn": 
"%0AD%22B%0A%1Dref_Coder_GlobalWindowCoder_1%12%21%0A%1F%0A%1D%0A%1Bbeam%3Acoder%3Aglobal_window%3Av1jT%0A%25%0A%23%0A%21beam%3Awindowfn%3Aglobal_windows%3Av0.1%10%01%1A%1Dref_Coder_GlobalWindowCoder_1%22%02%3A%00%28%010%018%01H%01",
 
"user_name": "assert_that/Group/GroupByKey"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s13", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": "_merge_tagged_vals_under_key"
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": 
"assert_that/Group/Map(_merge_tagged_vals_under_key).out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s12"
}, 
"serialized_fn": "", 
"user_name": "assert_that/Group/Map(_merge_tagged_vals_under_key)"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s14", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": ""
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{

Jenkins build became unstable: beam_PostCommit_Java_MavenInstall #6363

2018-04-03 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=87330&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87330
 ]

ASF GitHub Bot logged work on BEAM-3856:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:14
Start Date: 03/Apr/18 23:14
Worklog Time Spent: 10m 
  Work Description: lostluck commented on a change in pull request #4939: 
[BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub
URL: https://github.com/apache/beam/pull/4939#discussion_r178988229
 
 

 ##
 File path: sdks/go/examples/streaming_wordcap/wordcap.go
 ##
 @@ -0,0 +1,81 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+// streaming_wordcap is a toy streaming pipeline that uses PubSub. It
+// does the following:
+//(1) create a topic and publish a few messages to it
+//(2) start a streaming pipeline that converts the messages to
+//upper case and logs the result.
+//
+// NOTE: it only runs on Dataflow and must be manually cancelled.
+package main
+
+import (
+   "context"
+   "flag"
+   "os"
+   "strings"
+
+   "github.com/apache/beam/sdks/go/pkg/beam"
+   "github.com/apache/beam/sdks/go/pkg/beam/core/util/stringx"
+   "github.com/apache/beam/sdks/go/pkg/beam/io/pubsubio"
+   "github.com/apache/beam/sdks/go/pkg/beam/log"
+   "github.com/apache/beam/sdks/go/pkg/beam/options/gcpopts"
+   "github.com/apache/beam/sdks/go/pkg/beam/util/pubsubx"
+   "github.com/apache/beam/sdks/go/pkg/beam/x/beamx"
+   "github.com/apache/beam/sdks/go/pkg/beam/x/debug"
+)
+
+var (
+   input = flag.String("input", os.ExpandEnv("$USER-wordcap"), "Pubsub 
input topic.")
+)
+
+var (
+   data = []string{
+   "foo",
+   "bar",
+   "baz",
+   }
+)
+
+func main() {
+   flag.Parse()
+   beam.Init()
+
+   ctx := context.Background()
+   project := gcpopts.GetProject(ctx)
+
+   log.Infof(ctx, "Publishing %v messages to: %v", len(data), *input)
+
+   defer pubsubx.CleanupTopic(ctx, project, *input)
+   sub, err := pubsubx.Publish(ctx, project, *input, data...)
+   if err != nil {
+   log.Fatal(ctx, err)
+   }
+
+   log.Infof(ctx, "Running streaming wordcap with subscription: %v", 
sub.ID())
+
+   p := beam.NewPipeline()
+   s := p.Root()
+
+   col := pubsubio.Read(s, project, *input, &pubsubio.ReadOptions{Subscription: sub.ID()})
+   str := beam.ParDo(s, stringx.FromBytes, col)
+   cap := beam.ParDo(s, strings.ToUpper, str)
 
 Review comment:
   
   In PR 5011, we're registering a ton of functions. Does that need to be done 
here?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87330)
Time Spent: 50m  (was: 40m)

> Add prototype support for Go SDK streaming
> --
>
> Key: BEAM-3856
> URL: https://issues.apache.org/jira/browse/BEAM-3856
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Willy Lulciuc
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3257) Migrate Python Jenkins PreCommits to Gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3257?focusedWorklogId=87329&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87329
 ]

ASF GitHub Bot logged work on BEAM-3257:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:12
Start Date: 03/Apr/18 23:12
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #5010: [BEAM-3257] Add Python 
precommit gradle config
URL: https://github.com/apache/beam/pull/5010#issuecomment-378428368
 
 
   run seed job


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87329)
Time Spent: 2h 20m  (was: 2h 10m)

> Migrate Python Jenkins PreCommits to Gradle
> ---
>
> Key: BEAM-3257
> URL: https://issues.apache.org/jira/browse/BEAM-3257
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Code is here: 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Python_MavenInstall.groovy



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3982) Transform libraries not registering their types

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3982?focusedWorklogId=87327&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87327
 ]

ASF GitHub Bot logged work on BEAM-3982:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:09
Start Date: 03/Apr/18 23:09
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #5011: [BEAM-3982] Register 
Go transform types and functions
URL: https://github.com/apache/beam/pull/5011#issuecomment-378427753
 
 
   R: @tgroh (please merge)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87327)
Time Spent: 0.5h  (was: 20m)

> Transform libraries not registering their types
> ---
>
> Key: BEAM-3982
> URL: https://issues.apache.org/jira/browse/BEAM-3982
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Bill Neubauer
>Assignee: Henning Rohde
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The transform libraries in the SDK have structs-as-DoFns, but the types are 
> not registered. While this works in a direct runner, trying to run examples 
> using those libraries fails because the DoFns aren't serializable.
> It's not clear how to make the libraries intrinsically self-policing. At 
> least to ensure the examples work, these failures can be exposed easily using 
> the flags "--runner=dataflow --dry_run" to force the serialization error.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3257) Migrate Python Jenkins PreCommits to Gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3257?focusedWorklogId=87325&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87325
 ]

ASF GitHub Bot logged work on BEAM-3257:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:02
Start Date: 03/Apr/18 23:02
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #5010: 
[BEAM-3257] Add Python precommit gradle config
URL: https://github.com/apache/beam/pull/5010#discussion_r178985870
 
 

 ##
 File path: sdks/python/build.gradle
 ##
 @@ -25,75 +25,123 @@ apply plugin: "base"
 task test {}
 check.dependsOn test
 
-task setupTest {
+def envdir = "${project.buildDir}/gradleenv"
+
+task setupVirtualenv {
   doLast {
+exec {
+  commandLine 'virtualenv', "${envdir}"
+}
 exec {
   executable 'sh'
-  args '-c', 'which tox || pip install --user --upgrade tox'
+  args '-c', "source ${envdir}/bin/activate && pip install --upgrade tox"
 }
   }
+  outputs.files("$buildDir/.gradleenv/bin/tox")
 }
 
-task sdist {
+task sdist (dependsOn: 'setupVirtualenv') {
 
 Review comment:
   done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87325)

> Migrate Python Jenkins PreCommits to Gradle
> ---
>
> Key: BEAM-3257
> URL: https://issues.apache.org/jira/browse/BEAM-3257
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Code is here: 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Python_MavenInstall.groovy



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3257) Migrate Python Jenkins PreCommits to Gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3257?focusedWorklogId=87324&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87324
 ]

ASF GitHub Bot logged work on BEAM-3257:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:02
Start Date: 03/Apr/18 23:02
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #5010: 
[BEAM-3257] Add Python precommit gradle config
URL: https://github.com/apache/beam/pull/5010#discussion_r178985705
 
 

 ##
 File path: .test-infra/jenkins/job_beam_PreCommit_Python_GradleBuild.groovy
 ##
 @@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import common_job_properties
+
+// This is the Python precommit which runs a Gradle build, and the current set
+// of precommit tests.
+job('beam_PreCommit_Python_GradleBuild') {
+  description('Runs Python PreCommit tests for the current GitHub Pull 
Request.')
+
+  // Execute concurrent builds if necessary.
+  concurrentBuild()
+
+  // Set common parameters.
+  common_job_properties.setTopLevelMainJobProperties(
+delegate,
+'master',
+90)
+
+  // Publish all test results to Jenkins
+  publishers {
+archiveJunit('**/build/test-results/**/*.xml')
 
 Review comment:
   removed


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87324)
Time Spent: 2h  (was: 1h 50m)

> Migrate Python Jenkins PreCommits to Gradle
> ---
>
> Key: BEAM-3257
> URL: https://issues.apache.org/jira/browse/BEAM-3257
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Code is here: 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Python_MavenInstall.groovy



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3257) Migrate Python Jenkins PreCommits to Gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3257?focusedWorklogId=87326&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87326
 ]

ASF GitHub Bot logged work on BEAM-3257:


Author: ASF GitHub Bot
Created on: 03/Apr/18 23:02
Start Date: 03/Apr/18 23:02
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #5010: 
[BEAM-3257] Add Python precommit gradle config
URL: https://github.com/apache/beam/pull/5010#discussion_r178985158
 
 

 ##
 File path: sdks/python/build.gradle
 ##
 @@ -25,75 +25,123 @@ apply plugin: "base"
 task test {}
 check.dependsOn test
 
-task setupTest {
+def envdir = "${project.buildDir}/gradleenv"
+
+task setupVirtualenv {
   doLast {
+exec {
+  commandLine 'virtualenv', "${envdir}"
+}
 exec {
   executable 'sh'
-  args '-c', 'which tox || pip install --user --upgrade tox'
+  args '-c', "source ${envdir}/bin/activate && pip install --upgrade tox"
 }
   }
+  outputs.files("$buildDir/.gradleenv/bin/tox")
 
 Review comment:
   done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87326)
Time Spent: 2h 10m  (was: 2h)

> Migrate Python Jenkins PreCommits to Gradle
> ---
>
> Key: BEAM-3257
> URL: https://issues.apache.org/jira/browse/BEAM-3257
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Code is here: 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Python_MavenInstall.groovy



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Python_Verify #4578

2018-04-03 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3257) Migrate Python Jenkins PreCommits to Gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3257?focusedWorklogId=87320&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87320
 ]

ASF GitHub Bot logged work on BEAM-3257:


Author: ASF GitHub Bot
Created on: 03/Apr/18 22:42
Start Date: 03/Apr/18 22:42
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5010: 
[BEAM-3257] Add Python precommit gradle config
URL: https://github.com/apache/beam/pull/5010#discussion_r178982930
 
 

 ##
 File path: .test-infra/jenkins/job_beam_PreCommit_Python_GradleBuild.groovy
 ##
 @@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import common_job_properties
+
+// This is the Python precommit which runs a Gradle build, and the current set
+// of precommit tests.
+job('beam_PreCommit_Python_GradleBuild') {
+  description('Runs Python PreCommit tests for the current GitHub Pull 
Request.')
+
+  // Execute concurrent builds if necessary.
+  concurrentBuild()
+
+  // Set common parameters.
+  common_job_properties.setTopLevelMainJobProperties(
+delegate,
+'master',
+90)
+
+  // Publish all test results to Jenkins
+  publishers {
+archiveJunit('**/build/test-results/**/*.xml')
 
 Review comment:
   This doesn't seem to make sense. We should be publishing the python tests.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87320)
Time Spent: 1h 50m  (was: 1h 40m)

> Migrate Python Jenkins PreCommits to Gradle
> ---
>
> Key: BEAM-3257
> URL: https://issues.apache.org/jira/browse/BEAM-3257
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Code is here: 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Python_MavenInstall.groovy



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3257) Migrate Python Jenkins PreCommits to Gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3257?focusedWorklogId=87318&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87318
 ]

ASF GitHub Bot logged work on BEAM-3257:


Author: ASF GitHub Bot
Created on: 03/Apr/18 22:42
Start Date: 03/Apr/18 22:42
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5010: 
[BEAM-3257] Add Python precommit gradle config
URL: https://github.com/apache/beam/pull/5010#discussion_r178982403
 
 

 ##
 File path: sdks/python/build.gradle
 ##
 @@ -25,75 +25,123 @@ apply plugin: "base"
 task test {}
 check.dependsOn test
 
-task setupTest {
+def envdir = "${project.buildDir}/gradleenv"
+
+task setupVirtualenv {
   doLast {
+exec {
+  commandLine 'virtualenv', "${envdir}"
+}
 exec {
   executable 'sh'
-  args '-c', 'which tox || pip install --user --upgrade tox'
+  args '-c', "source ${envdir}/bin/activate && pip install --upgrade tox"
 }
   }
+  outputs.files("$buildDir/.gradleenv/bin/tox")
 }
 
-task sdist {
+task sdist (dependsOn: 'setupVirtualenv') {
 
 Review comment:
  nit: remove the space between the task name and arguments since it's a method 
call, like `sdist(dependsOn: 'setupVirtualenv') {` here and below.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87318)
Time Spent: 1.5h  (was: 1h 20m)

> Migrate Python Jenkins PreCommits to Gradle
> ---
>
> Key: BEAM-3257
> URL: https://issues.apache.org/jira/browse/BEAM-3257
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Code is here: 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Python_MavenInstall.groovy



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3257) Migrate Python Jenkins PreCommits to Gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3257?focusedWorklogId=87319&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87319
 ]

ASF GitHub Bot logged work on BEAM-3257:


Author: ASF GitHub Bot
Created on: 03/Apr/18 22:42
Start Date: 03/Apr/18 22:42
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5010: 
[BEAM-3257] Add Python precommit gradle config
URL: https://github.com/apache/beam/pull/5010#discussion_r178981315
 
 

 ##
 File path: sdks/python/build.gradle
 ##
 @@ -25,75 +25,123 @@ apply plugin: "base"
 task test {}
 check.dependsOn test
 
-task setupTest {
+def envdir = "${project.buildDir}/gradleenv"
+
+task setupVirtualenv {
   doLast {
+exec {
+  commandLine 'virtualenv', "${envdir}"
+}
 exec {
   executable 'sh'
-  args '-c', 'which tox || pip install --user --upgrade tox'
+  args '-c', "source ${envdir}/bin/activate && pip install --upgrade tox"
 }
   }
+  outputs.files("$buildDir/.gradleenv/bin/tox")
 
 Review comment:
   re-use the definition of envdir here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87319)
Time Spent: 1h 40m  (was: 1.5h)

> Migrate Python Jenkins PreCommits to Gradle
> ---
>
> Key: BEAM-3257
> URL: https://issues.apache.org/jira/browse/BEAM-3257
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Code is here: 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Python_MavenInstall.groovy



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3982) Transform libraries not registering their types

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3982?focusedWorklogId=87315&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87315
 ]

ASF GitHub Bot logged work on BEAM-3982:


Author: ASF GitHub Bot
Created on: 03/Apr/18 22:30
Start Date: 03/Apr/18 22:30
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #5011: [BEAM-3982] Register 
Go transform types and functions
URL: https://github.com/apache/beam/pull/5011#issuecomment-378419881
 
 
   R: @wcn3 @lostluck 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87315)
Time Spent: 20m  (was: 10m)

> Transform libraries not registering their types
> ---
>
> Key: BEAM-3982
> URL: https://issues.apache.org/jira/browse/BEAM-3982
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Bill Neubauer
>Assignee: Henning Rohde
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The transform libraries in the SDK have structs-as-DoFns, but the types are 
> not registered. While this works in a direct runner, trying to run examples 
> using those libraries fails because the DoFns aren't serializable.
> It's not clear how to make the libraries intrinsically self-policing. At 
> least to ensure the examples work, these failures can be exposed easily using 
> the flags "--runner=dataflow --dry_run" to force the serialization error.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3982) Transform libraries not registering their types

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3982?focusedWorklogId=87313&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87313
 ]

ASF GitHub Bot logged work on BEAM-3982:


Author: ASF GitHub Bot
Created on: 03/Apr/18 22:29
Start Date: 03/Apr/18 22:29
Worklog Time Spent: 10m 
  Work Description: herohde opened a new pull request #5011: [BEAM-3982] 
Register Go transform types and functions
URL: https://github.com/apache/beam/pull/5011
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87313)
Time Spent: 10m
Remaining Estimate: 0h

> Transform libraries not registering their types
> ---
>
> Key: BEAM-3982
> URL: https://issues.apache.org/jira/browse/BEAM-3982
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Bill Neubauer
>Assignee: Henning Rohde
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The transform libraries in the SDK have structs-as-DoFns, but the types are 
> not registered. While this works in a direct runner, trying to run examples 
> using those libraries fails because the DoFns aren't serializable.
> It's not clear how to make the libraries intrinsically self-policing. At 
> least to ensure the examples work, these failures can be exposed easily using 
> the flags "--runner=dataflow --dry_run" to force the serialization error.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
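The Go issue above is about DoFn struct types that work in the direct runner but fail once the pipeline must be serialized for a remote runner, because the types were never registered by name. As a loose cross-language analogy (not the Go SDK's actual registration mechanism), Python's pickle shows the same failure mode: a type reachable by a module-level name round-trips, while a type defined in a local scope cannot be resolved by name and fails to serialize.

```python
import pickle

class Registered:
    """Module-level class: pickle can find it by name and round-trip it."""
    pass

ok = pickle.loads(pickle.dumps(Registered()))

def make_unregistered():
    # A class defined inside a function is not importable by name,
    # so instances of it cannot be pickled for a remote worker.
    class Local:
        pass
    return Local()

try:
    pickle.dumps(make_unregistered())
    failed = False
except (pickle.PicklingError, AttributeError):
    failed = True

assert isinstance(ok, Registered)
assert failed  # serialization only fails when the type isn't findable by name
```

This also mirrors why the issue suggests `--runner=dataflow --dry_run`: the error only surfaces when serialization is actually attempted, not during local execution.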


[jira] [Work logged] (BEAM-3993) rat doesn't pass with gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3993?focusedWorklogId=87314&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87314
 ]

ASF GitHub Bot logged work on BEAM-3993:


Author: ASF GitHub Bot
Created on: 03/Apr/18 22:29
Start Date: 03/Apr/18 22:29
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5002: 
[BEAM-3993] read gitignore and add it in rat exclusions
URL: https://github.com/apache/beam/pull/5002#discussion_r178980602
 
 

 ##
 File path: build.gradle
 ##
 @@ -107,7 +104,6 @@ rat {
 "sdks/python/NOTICE",
 
 Review comment:
   I cleaned up the duplication in 
https://github.com/rmannibucau/incubator-beam/pull/1, please consider merging 
into your branch.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87314)
Time Spent: 0.5h  (was: 20m)

> rat doesn't pass with gradle
> 
>
> Key: BEAM-3993
> URL: https://issues.apache.org/jira/browse/BEAM-3993
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Romain Manni-Bucau
>Assignee: Romain Manni-Bucau
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3990) Dataflow jobs fail with "KeyError: 'location'" when uploading to GCS

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3990?focusedWorklogId=87311&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87311
 ]

ASF GitHub Bot logged work on BEAM-3990:


Author: ASF GitHub Bot
Created on: 03/Apr/18 22:21
Start Date: 03/Apr/18 22:21
Worklog Time Spent: 10m 
  Work Description: chamikaramj closed pull request #5000: [BEAM-3990] 
Revert "[BEAM-2264] Credentials were not being reused between GCS calls"
URL: https://github.com/apache/beam/pull/5000
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/python/apache_beam/io/gcp/gcsio.py 
b/sdks/python/apache_beam/io/gcp/gcsio.py
index c7986cdb672..f687686fd64 100644
--- a/sdks/python/apache_beam/io/gcp/gcsio.py
+++ b/sdks/python/apache_beam/io/gcp/gcsio.py
@@ -146,8 +146,6 @@ class GcsIOError(IOError, retry.PermanentException):
 class GcsIO(object):
   """Google Cloud Storage I/O client."""
 
-  local_state = threading.local()
-
   def __new__(cls, storage_client=None):
 if storage_client:
   # This path is only used for testing.
@@ -157,7 +155,7 @@ def __new__(cls, storage_client=None):
   # creating more than one storage client for each thread, since each
   # initialization requires the relatively expensive step of initializing
   # credentaials.
-  local_state = GcsIO.local_state
+  local_state = threading.local()
   if getattr(local_state, 'gcsio_instance', None) is None:
 credentials = auth.get_service_credentials()
 storage_client = storage.StorageV1(


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87311)
Time Spent: 50m  (was: 40m)

> Dataflow jobs fail with "KeyError: 'location'" when uploading to GCS
> 
>
> Key: BEAM-3990
> URL: https://issues.apache.org/jira/browse/BEAM-3990
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Charles Chen
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Some Dataflow jobs are failing due to following error (in worker logs).
>  
> Error in _start_upload while inserting file 
> gs://cloud-ml-benchmark-output-us-central/df1-cloudml-benchmark-criteo-small-python-033010274088282-presubmit3/033010274088282/temp/df1-cloudml-benchmark-criteo-small-python-033010274088282-presubmit3.1522430898.446147/dax-tmp-2018-03-30_10_28_40-14595186994726940229-S241-1-dc87ef69274882bf/tmp-dc87ef6927488c5a-shard--try-308ae8b3268d12b2-endshard.avro:
>  Traceback (most recent call last): File 
> "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/gcsio.py", line 
> 559, in _start_upload self._client.objects.Insert(self._insert_request, 
> upload=self._upload) File 
> "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/internal/clients/storage/storage_v1_client.py",
>  line 971, in Insert download=download) File 
> "/usr/local/lib/python2.7/dist-packages/apitools/base/py/base_api.py", line 
> 706, in _RunMethod http_request, client=self.client) File 
> "/usr/local/lib/python2.7/dist-packages/apitools/base/py/transfer.py", line 
> 860, in InitializeUpload url = http_response.info['location']
>  KeyError: 'location'
>  
> This seems to be due to [https://github.com/apache/beam/pull/4891.] Possibly 
> storage.StorageV1() cannot be shared across multiple requests without 
> additional fixes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
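The reverted diff above moves `threading.local()` from a class attribute back to a local variable inside `__new__`. The scope of that object is the whole point of the caching: a class-level `threading.local` persists across calls (one cached client per thread), while one constructed inside the function is discarded on every return, so nothing is ever reused. A minimal sketch (hypothetical names, `object()` standing in for an expensive storage client) illustrating the difference:

```python
import threading

class CachedClient:
    # Class-level: one threading.local shared by all calls. Each thread
    # sees its own .instance attribute, and it survives between calls.
    _local = threading.local()

    @classmethod
    def get(cls):
        if getattr(cls._local, "instance", None) is None:
            cls._local.instance = object()  # expensive init happens once per thread
        return cls._local.instance

def get_uncached():
    # A fresh threading.local per call: the cached attribute is lost as
    # soon as the function returns, so the "cache" never hits.
    local_state = threading.local()
    if getattr(local_state, "instance", None) is None:
        local_state.instance = object()
    return local_state.instance

a, b = CachedClient.get(), CachedClient.get()
assert a is b       # class-level local: same client reused within a thread

c, d = get_uncached(), get_uncached()
assert c is not d   # per-call local: a new client every time
```

Note the sketch only shows the scoping trade-off; per BEAM-3990, sharing one `storage.StorageV1()` across requests apparently needed additional fixes beyond the scoping change, which is why the caching PR was reverted rather than patched.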


[beam] 01/01: Merge pull request #5000 from chamikaramj/revert_gcs_auth

2018-04-03 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 889520fcd0ba83f1633cd1f08e33446b65c0e874
Merge: f0c9ff8 2fd7430
Author: Chamikara Jayalath 
AuthorDate: Tue Apr 3 15:21:02 2018 -0700

Merge pull request #5000 from chamikaramj/revert_gcs_auth

[BEAM-3990] Revert "[BEAM-2264] Credentials were not being reused between 
GCS calls"

 sdks/python/apache_beam/io/gcp/gcsio.py | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
chamik...@apache.org.


[beam] branch master updated (f0c9ff8 -> 889520f)

2018-04-03 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from f0c9ff8  [BEAM-3989] Delete unused pipeline jobs
 add 2fd7430  Revert "[BEAM-2264] Credentials were not being reused between 
GCS calls"
 new 889520f  Merge pull request #5000 from chamikaramj/revert_gcs_auth

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/python/apache_beam/io/gcp/gcsio.py | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
chamik...@apache.org.


Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1246

2018-04-03 Thread Apache Jenkins Server
See 


Changes:

[amyrvold] [BEAM-3989] Delete unused pipeline jobs

[sidhom] [BEAM-3249] Add missing gradle artifact ids

--
[...truncated 246.37 KB...]
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "assert_that/Group/GroupByKey.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s10"
}, 
"serialized_fn": 
"%0AD%22B%0A%1Dref_Coder_GlobalWindowCoder_1%12%21%0A%1F%0A%1D%0A%1Bbeam%3Acoder%3Aglobal_window%3Av1jT%0A%25%0A%23%0A%21beam%3Awindowfn%3Aglobal_windows%3Av0.1%10%01%1A%1Dref_Coder_GlobalWindowCoder_1%22%02%3A%00%28%010%018%01H%01",
 
"user_name": "assert_that/Group/GroupByKey"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s12", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": "_merge_tagged_vals_under_key"
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": 
"assert_that/Group/Map(_merge_tagged_vals_under_key).out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s11"
}, 
"serialized_fn": "", 
"user_name": "assert_that/Group/Map(_merge_tagged_vals_under_key)"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s13", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": ""
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 

[jira] [Work logged] (BEAM-3856) Add prototype support for Go SDK streaming

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3856?focusedWorklogId=87308&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87308
 ]

ASF GitHub Bot logged work on BEAM-3856:


Author: ASF GitHub Bot
Created on: 03/Apr/18 22:12
Start Date: 03/Apr/18 22:12
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #4939: 
[BEAM-3856][BEAM-3854] Add prototype of Go streaming on Dataflow with PubSub
URL: https://github.com/apache/beam/pull/4939#issuecomment-378416184
 
 
   R: @lostluck 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87308)
Time Spent: 40m  (was: 0.5h)

> Add prototype support for Go SDK streaming
> --
>
> Key: BEAM-3856
> URL: https://issues.apache.org/jira/browse/BEAM-3856
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Willy Lulciuc
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch go-sdk deleted (was 63cb46c)

2018-04-03 Thread iemejia
This is an automated email from the ASF dual-hosted git repository.

iemejia pushed a change to branch go-sdk
in repository https://gitbox.apache.org/repos/asf/beam.git.


 was 63cb46c  [BEAM-3571] Correct the Go SDK's EventTime encoding

The revisions that were on this branch are still contained in
other references; therefore, this change does not discard any commits
from the repository.

-- 
To stop receiving notification emails like this one, please contact
ieme...@apache.org.


[beam] branch revert-4805-go-sdk deleted (was 56515fa)

2018-04-03 Thread iemejia
This is an automated email from the ASF dual-hosted git repository.

iemejia pushed a change to branch revert-4805-go-sdk
in repository https://gitbox.apache.org/repos/asf/beam.git.


 was 56515fa  Revert "Merge master into go-sdk"

The revisions that were on this branch are still contained in
other references; therefore, this change does not discard any commits
from the repository.

-- 
To stop receiving notification emails like this one, please contact
ieme...@apache.org.


[jira] [Work logged] (BEAM-3437) Support schema in PCollections

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3437?focusedWorklogId=87304&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87304
 ]

ASF GitHub Bot logged work on BEAM-3437:


Author: ASF GitHub Bot
Created on: 03/Apr/18 22:03
Start Date: 03/Apr/18 22:03
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #4964: 
[BEAM-3437] Introduce Schema class, and use it in BeamSQL
URL: https://github.com/apache/beam/pull/4964#discussion_r178975398
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/integrationtest/BeamSqlComparisonOperatorsIntegrationTest.java
 ##
 @@ -281,7 +281,7 @@ public void testIsNullAndIsNotNull() throws Exception {
   }
 
  @Override protected PCollection<Row> getTestPCollection() {
-    RowType type = RowSqlType.builder()
+    Schema type = RowSqlType.builder()
 
 Review comment:
   ack




Issue Time Tracking
---

Worklog Id: (was: 87304)
Time Spent: 6h 40m  (was: 6.5h)

> Support schema in PCollections
> --
>
> Key: BEAM-3437
> URL: https://issues.apache.org/jira/browse/BEAM-3437
> Project: Beam
>  Issue Type: Wish
>  Components: beam-model
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> As discussed with some people in the team, it would be great to add schema 
> support in {{PCollections}}. It will allow us:
> 1. To expect some data type in {{PTransforms}}
> 2. Improve some runners with additional features (I'm thinking about Spark 
> runner with data frames for instance).
> A technical draft document has been created: 
> https://docs.google.com/document/d/1tnG2DPHZYbsomvihIpXruUmQ12pHGK0QIvXS1FOTgRc/edit?disco=BhykQIs=5a203b46=comment_email_document
> I also started a PoC on a branch, I will update this Jira with a "discussion" 
> PR.





[jira] [Work logged] (BEAM-3437) Support schema in PCollections

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3437?focusedWorklogId=87302&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87302
 ]

ASF GitHub Bot logged work on BEAM-3437:


Author: ASF GitHub Bot
Created on: 03/Apr/18 22:02
Start Date: 03/Apr/18 22:02
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #4964: 
[BEAM-3437] Introduce Schema class, and use it in BeamSQL
URL: https://github.com/apache/beam/pull/4964#discussion_r178975243
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/coders/BigEndianShortCoder.java
 ##
 @@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.coders;
+
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.io.UTFDataFormatException;
+import org.apache.beam.sdk.values.TypeDescriptor;
+
+/**
+ * A {@link BigEndianShortCoder} encodes {@link Short Shorts} in 2 bytes, big-endian.
+ */
+public class BigEndianShortCoder extends AtomicCoder<Short> {
+
+  public static BigEndianShortCoder of() {
+return INSTANCE;
+  }
+
+  /
+
+  private static final BigEndianShortCoder INSTANCE = new BigEndianShortCoder();
+  private static final TypeDescriptor<Short> TYPE_DESCRIPTOR = new TypeDescriptor<Short>() {};
+
+  private BigEndianShortCoder() {}
+
+  @Override
+  public void encode(Short value, OutputStream outStream) throws IOException {
+if (value == null) {
+  throw new CoderException("cannot encode a null Short");
+}
+new DataOutputStream(outStream).writeShort(value);
 
 Review comment:
   Ack




Issue Time Tracking
---

Worklog Id: (was: 87302)
Time Spent: 6h 20m  (was: 6h 10m)

> Support schema in PCollections
> --
>
> Key: BEAM-3437
> URL: https://issues.apache.org/jira/browse/BEAM-3437
> Project: Beam
>  Issue Type: Wish
>  Components: beam-model
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> As discussed with some people in the team, it would be great to add schema 
> support in {{PCollections}}. It will allow us:
> 1. To expect some data type in {{PTransforms}}
> 2. Improve some runners with additional features (I'm thinking about Spark 
> runner with data frames for instance).
> A technical draft document has been created: 
> https://docs.google.com/document/d/1tnG2DPHZYbsomvihIpXruUmQ12pHGK0QIvXS1FOTgRc/edit?disco=BhykQIs=5a203b46=comment_email_document
> I also started a PoC on a branch, I will update this Jira with a "discussion" 
> PR.



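The coder reviewed above delegates to DataOutputStream.writeShort, which always emits exactly two bytes, most significant first. A stand-alone sketch (not part of the PR; class name is illustrative) verifying that layout:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class BigEndianShortSketch {
  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    // Same call the coder's encode() makes under the hood.
    new DataOutputStream(bytes).writeShort(0x1234);
    byte[] out = bytes.toByteArray();
    // writeShort produces exactly 2 bytes, high byte first (big-endian).
    System.out.println(out.length);      // 2
    System.out.println(out[0] == 0x12);  // true
    System.out.println(out[1] == 0x34);  // true
  }
}
```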


[jira] [Work logged] (BEAM-3437) Support schema in PCollections

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3437?focusedWorklogId=87301&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87301
 ]

ASF GitHub Bot logged work on BEAM-3437:


Author: ASF GitHub Bot
Created on: 03/Apr/18 22:02
Start Date: 03/Apr/18 22:02
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #4964: 
[BEAM-3437] Introduce Schema class, and use it in BeamSQL
URL: https://github.com/apache/beam/pull/4964#discussion_r178975184
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlArrayTest.java
 ##
 @@ -80,11 +81,11 @@ public void testSelectArrayValue() {
   public void testProjectArrayField() {
PCollection<Row> input = pCollectionOf2Elements();
 
-RowType resultType =
+Schema resultType =
 RowSqlType
 .builder()
 .withIntegerField("f_int")
-.withArrayField("f_stringArr", SqlTypeCoders.VARCHAR)
+.withArrayField("f_stringArr", SqlTypeName.VARCHAR)
 
 Review comment:
   This builder will be removed




Issue Time Tracking
---

Worklog Id: (was: 87301)
Time Spent: 6h 10m  (was: 6h)

> Support schema in PCollections
> --
>
> Key: BEAM-3437
> URL: https://issues.apache.org/jira/browse/BEAM-3437
> Project: Beam
>  Issue Type: Wish
>  Components: beam-model
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> As discussed with some people in the team, it would be great to add schema 
> support in {{PCollections}}. It will allow us:
> 1. To expect some data type in {{PTransforms}}
> 2. Improve some runners with additional features (I'm thinking about Spark 
> runner with data frames for instance).
> A technical draft document has been created: 
> https://docs.google.com/document/d/1tnG2DPHZYbsomvihIpXruUmQ12pHGK0QIvXS1FOTgRc/edit?disco=BhykQIs=5a203b46=comment_email_document
> I also started a PoC on a branch, I will update this Jira with a "discussion" 
> PR.





[jira] [Work logged] (BEAM-3437) Support schema in PCollections

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3437?focusedWorklogId=87303&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87303
 ]

ASF GitHub Bot logged work on BEAM-3437:


Author: ASF GitHub Bot
Created on: 03/Apr/18 22:02
Start Date: 03/Apr/18 22:02
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #4964: 
[BEAM-3437] Introduce Schema class, and use it in BeamSQL
URL: https://github.com/apache/beam/pull/4964#discussion_r178975317
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java
 ##
 @@ -75,191 +83,230 @@ public static Row nullRow(RowType rowType) {
* if type doesn't match.
*/
  public <T> T getValue(String fieldName) {
-return getValue(getRowType().indexOf(fieldName));
+return getValue(getSchema().indexOf(fieldName));
   }
 
   /**
* Get value by field index, {@link ClassCastException} is thrown
-   * if type doesn't match.
+   * if schema doesn't match.
*/
   @Nullable
  public <T> T getValue(int fieldIdx) {
 return (T) getValues().get(fieldIdx);
   }
 
   /**
-   * Get a {@link Byte} value by field name, {@link ClassCastException} is 
thrown
-   * if type doesn't match.
+   * Get a {@link TypeName#BYTE} value by field name, {@link 
IllegalStateException} is thrown
+   * if schema doesn't match.
*/
-  public Byte getByte(String fieldName) {
-return getValue(fieldName);
+  public byte getByte(String fieldName) {
+return getByte(getSchema().indexOf(fieldName));
   }
 
   /**
-   * Get a {@link Short} value by field name, {@link ClassCastException} is 
thrown
-   * if type doesn't match.
+   * Get a {@link TypeName#INT16} value by field name, {@link 
IllegalStateException} is thrown
+   * if schema doesn't match.
*/
-  public Short getShort(String fieldName) {
-return getValue(fieldName);
+  public short getInt16(String fieldName) {
+return getInt16(getSchema().indexOf(fieldName));
   }
 
   /**
-   * Get a {@link Integer} value by field name, {@link ClassCastException} is 
thrown
-   * if type doesn't match.
+   * Get a {@link TypeName#INT32} value by field name, {@link 
IllegalStateException} is thrown
+   * if schema doesn't match.
*/
-  public Integer getInteger(String fieldName) {
-return getValue(fieldName);
+  public int getInt32(String fieldName) {
+return getInt32(getSchema().indexOf(fieldName));
   }
 
   /**
-   * Get a {@link Float} value by field name, {@link ClassCastException} is 
thrown
-   * if type doesn't match.
+   * Get a {@link TypeName#INT64} value by field name, {@link 
IllegalStateException} is thrown
+   * if schema doesn't match.
*/
-  public Float getFloat(String fieldName) {
-return getValue(fieldName);
+  public long getInt64(String fieldName) {
+return getInt64(getSchema().indexOf(fieldName));
   }
 
   /**
-   * Get a {@link Double} value by field name, {@link ClassCastException} is 
thrown
-   * if type doesn't match.
+   * Get a {@link TypeName#DECIMAL} value by field name, {@link 
IllegalStateException} is thrown
+   * if schema doesn't match.
*/
-  public Double getDouble(String fieldName) {
-return getValue(fieldName);
+  public BigDecimal getDecimal(String fieldName) {
+return getDecimal(getSchema().indexOf(fieldName));
   }
 
   /**
-   * Get a {@link Long} value by field name, {@link ClassCastException} is 
thrown
-   * if type doesn't match.
+   * Get a {@link TypeName#FLOAT} value by field name, {@link 
IllegalStateException} is thrown
+   * if schema doesn't match.
*/
-  public Long getLong(String fieldName) {
-return getValue(fieldName);
+  public float getFloat(String fieldName) {
+return getFloat(getSchema().indexOf(fieldName));
   }
 
   /**
-   * Get a {@link String} value by field name, {@link ClassCastException} is 
thrown
-   * if type doesn't match.
+   * Get a {@link TypeName#DOUBLE} value by field name, {@link 
IllegalStateException} is thrown
+   * if schema doesn't match.
+   */
+  public double getDouble(String fieldName) {
+return getDouble(getSchema().indexOf(fieldName));
+  }
+
+  /**
+   * Get a {@link TypeName#STRING} value by field name, {@link 
IllegalStateException} is thrown
+   * if schema doesn't match.
*/
   public String getString(String fieldName) {
-return getValue(fieldName);
+return getString(getSchema().indexOf(fieldName));
   }
 
   /**
-   * Get a {@link Date} value by field name, {@link ClassCastException} is 
thrown
-   * if type doesn't match.
+   * Get a {@link TypeName#DATETIME} value by field name, {@link 
IllegalStateException} is thrown
+   * if schema doesn't match.
*/
-  public Date getDate(String fieldName) {
-return getValue(fieldName);
+  public ReadableDateTime getDateTime(String fieldName) {
+return getDateTime(getSchema().indexOf(fieldName));
   }
 
   /**
-   * Get a {@link 

[jira] [Work logged] (BEAM-3437) Support schema in PCollections

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3437?focusedWorklogId=87300&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87300
 ]

ASF GitHub Bot logged work on BEAM-3437:


Author: ASF GitHub Bot
Created on: 03/Apr/18 22:01
Start Date: 03/Apr/18 22:01
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #4964: 
[BEAM-3437] Introduce Schema class, and use it in BeamSQL
URL: https://github.com/apache/beam/pull/4964#discussion_r178974984
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlArrayTest.java
 ##
 @@ -80,11 +81,11 @@ public void testSelectArrayValue() {
   public void testProjectArrayField() {
PCollection<Row> input = pCollectionOf2Elements();
 
-RowType resultType =
+Schema resultType =
 RowSqlType
 .builder()
 .withIntegerField("f_int")
-.withArrayField("f_stringArr", SqlTypeCoders.VARCHAR)
+.withArrayField("f_stringArr", SqlTypeName.VARCHAR)
 
 Review comment:
  I'll remove RowSqlTypeBuilder in a follow-up PR




Issue Time Tracking
---

Worklog Id: (was: 87300)
Time Spent: 6h  (was: 5h 50m)

> Support schema in PCollections
> --
>
> Key: BEAM-3437
> URL: https://issues.apache.org/jira/browse/BEAM-3437
> Project: Beam
>  Issue Type: Wish
>  Components: beam-model
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> As discussed with some people in the team, it would be great to add schema 
> support in {{PCollections}}. It will allow us:
> 1. To expect some data type in {{PTransforms}}
> 2. Improve some runners with additional features (I'm thinking about Spark 
> runner with data frames for instance).
> A technical draft document has been created: 
> https://docs.google.com/document/d/1tnG2DPHZYbsomvihIpXruUmQ12pHGK0QIvXS1FOTgRc/edit?disco=BhykQIs=5a203b46=comment_email_document
> I also started a PoC on a branch, I will update this Jira with a "discussion" 
> PR.



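The Row.java diff quoted earlier in this thread replaces generic getValue casts with typed getters (getInt32, getString, and friends) that delegate to a name-to-index lookup on the schema. A toy stand-in showing that pattern (the MiniRow class is hypothetical, not Beam's Row API):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Beam's Row: each typed getter resolves the
// field name to an index, then casts, so a schema mismatch surfaces as a
// ClassCastException at the access site.
public class MiniRow {
  private final List<String> fieldNames;
  private final List<Object> values;

  public MiniRow(List<String> fieldNames, List<Object> values) {
    this.fieldNames = fieldNames;
    this.values = values;
  }

  @SuppressWarnings("unchecked")
  public <T> T getValue(int idx) {
    return (T) values.get(idx);
  }

  public int getInt32(String name) {
    return (Integer) values.get(fieldNames.indexOf(name));
  }

  public String getString(String name) {
    return (String) values.get(fieldNames.indexOf(name));
  }

  public static void main(String[] args) {
    MiniRow row = new MiniRow(Arrays.asList("f_int", "f_str"),
                              Arrays.<Object>asList(42, "hello"));
    System.out.println(row.getInt32("f_int"));  // 42
    System.out.println(row.getString("f_str")); // hello
  }
}
```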


[jira] [Work logged] (BEAM-3994) Use typed sinks and sources for FnApiControlClientPoolService

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3994?focusedWorklogId=87297&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87297
 ]

ASF GitHub Bot logged work on BEAM-3994:


Author: ASF GitHub Bot
Created on: 03/Apr/18 21:51
Start Date: 03/Apr/18 21:51
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #5008: 
[BEAM-3994] Use typed client pool sinks and sources
URL: https://github.com/apache/beam/pull/5008#discussion_r178972766
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/QueueControlClientPool.java
 ##
 @@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.control;
+
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.SynchronousQueue;
+
+/** Control client pool backed by a blocking queue. */
+public class QueueControlClientPool<T>
+    implements ControlClientPool<T> {
+
+  private final BlockingQueue<T> queue;
+
+  /** Creates a client pool backed by a {@link SynchronousQueue}. */
+  public static <T> QueueControlClientPool<T> createSynchronous() {
+    return new QueueControlClientPool<>(new SynchronousQueue<>(true));
+  }
+
+  /** Creates a client pool backed by an unbounded {@link LinkedBlockingQueue}. */
+  public static <T> QueueControlClientPool<T> createLinked() {
 
 Review comment:
   `createBuffering` I like




Issue Time Tracking
---

Worklog Id: (was: 87297)
Time Spent: 1.5h  (was: 1h 20m)

> Use typed sinks and sources for FnApiControlClientPoolService
> -
>
> Key: BEAM-3994
> URL: https://issues.apache.org/jira/browse/BEAM-3994
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Ben Sidhom
>Assignee: Ben Sidhom
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We operate with blocking queues directly when managing control clients with 
> the FnApiControlClientPoolService. This makes interactions with the client 
> pool difficult to understand. We should instead make client sources and sinks 
> explicit.



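The createSynchronous versus createLinked/createBuffering distinction debated above comes down to the backing queue's capacity. A minimal sketch of the synchronous handoff between the sink and source sides (class and names are illustrative, not Beam's API):

```java
import java.util.concurrent.SynchronousQueue;

public class ClientHandoffSketch {
  public static void main(String[] args) throws InterruptedException {
    // A fair SynchronousQueue has zero capacity: put() parks until a
    // matching take() arrives, so each sink-side offer pairs with exactly
    // one source-side take; a LinkedBlockingQueue would buffer instead.
    SynchronousQueue<String> queue = new SynchronousQueue<>(true);
    Thread sink = new Thread(() -> {
      try {
        queue.put("client-1"); // blocks until the source side takes it
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    sink.start();
    String client = queue.take(); // the source side
    System.out.println(client);   // client-1
    sink.join();
  }
}
```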


[jira] [Work logged] (BEAM-3994) Use typed sinks and sources for FnApiControlClientPoolService

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3994?focusedWorklogId=87296&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87296
 ]

ASF GitHub Bot logged work on BEAM-3994:


Author: ASF GitHub Bot
Created on: 03/Apr/18 21:51
Start Date: 03/Apr/18 21:51
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #5008: 
[BEAM-3994] Use typed client pool sinks and sources
URL: https://github.com/apache/beam/pull/5008#discussion_r178972737
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/ControlClientPool.java
 ##
 @@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.control;
+
+/** Control client pool that exposes a source and sink of control clients. */
+public interface ControlClientPool<T> {
+
+  /** Source of control clients. */
+  ClientSource<T> getSource();
+
+  /** Sink for control clients. */
+  ClientSink<T> getSink();
+
+  /** A source of control clients. */
+  @FunctionalInterface
+  interface ClientSource<T> {
+    T take() throws Exception;
+  }
+
+  /** A sink for control clients. */
+  @FunctionalInterface
+  interface ClientSink<T> {
 
 Review comment:
   In the intermediate time, move it into `sdks/java/fn-execution`




Issue Time Tracking
---

Worklog Id: (was: 87296)
Time Spent: 1h 20m  (was: 1h 10m)

> Use typed sinks and sources for FnApiControlClientPoolService
> -
>
> Key: BEAM-3994
> URL: https://issues.apache.org/jira/browse/BEAM-3994
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Ben Sidhom
>Assignee: Ben Sidhom
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We operate with blocking queues directly when managing control clients with 
> the FnApiControlClientPoolService. This makes interactions with the client 
> pool difficult to understand. We should instead make client sources and sinks 
> explicit.





[jira] [Work logged] (BEAM-3257) Migrate Python Jenkins PreCommits to Gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3257?focusedWorklogId=87294&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87294
 ]

ASF GitHub Bot logged work on BEAM-3257:


Author: ASF GitHub Bot
Created on: 03/Apr/18 21:44
Start Date: 03/Apr/18 21:44
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #5010: [BEAM-3257] Add Python 
precommit gradle config
URL: https://github.com/apache/beam/pull/5010#issuecomment-378409367
 
 
   run seed job




Issue Time Tracking
---

Worklog Id: (was: 87294)
Time Spent: 1h 20m  (was: 1h 10m)

> Migrate Python Jenkins PreCommits to Gradle
> ---
>
> Key: BEAM-3257
> URL: https://issues.apache.org/jira/browse/BEAM-3257
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Code is here: 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Python_MavenInstall.groovy





[jira] [Work logged] (BEAM-3994) Use typed sinks and sources for FnApiControlClientPoolService

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3994?focusedWorklogId=87291&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87291
 ]

ASF GitHub Bot logged work on BEAM-3994:


Author: ASF GitHub Bot
Created on: 03/Apr/18 21:40
Start Date: 03/Apr/18 21:40
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #5008: 
[BEAM-3994] Use typed client pool sinks and sources
URL: https://github.com/apache/beam/pull/5008#discussion_r178970567
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/ControlClientPool.java
 ##
 @@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.control;
+
+/** Control client pool that exposes a source and sink of control clients. */
+public interface ControlClientPool<T> {
+
+  /** Source of control clients. */
+  ClientSource<T> getSource();
+
+  /** Sink for control clients. */
+  ClientSink<T> getSink();
+
+  /** A source of control clients. */
+  @FunctionalInterface
+  interface ClientSource<T> {
+    T take() throws Exception;
+  }
+
+  /** A sink for control clients. */
+  @FunctionalInterface
+  interface ClientSink<T> {
 
 Review comment:
   Just checked. The problem is `ThrowingConsumer`, which lives under the 
`sdks/java/harness`. Is it safe to add a dependency there or should the class 
be moved elsewhere? This was actually the motivation for 
[BEAM-3966](https://issues.apache.org/jira/browse/BEAM-3966).




Issue Time Tracking
---

Worklog Id: (was: 87291)
Time Spent: 1h 10m  (was: 1h)

> Use typed sinks and sources for FnApiControlClientPoolService
> -
>
> Key: BEAM-3994
> URL: https://issues.apache.org/jira/browse/BEAM-3994
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Ben Sidhom
>Assignee: Ben Sidhom
>Priority: Minor
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We operate with blocking queues directly when managing control clients with 
> the FnApiControlClientPoolService. This makes interactions with the client 
> pool difficult to understand. We should instead make client sources and sinks 
> explicit.





[jira] [Work logged] (BEAM-3257) Migrate Python Jenkins PreCommits to Gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3257?focusedWorklogId=87290&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87290
 ]

ASF GitHub Bot logged work on BEAM-3257:


Author: ASF GitHub Bot
Created on: 03/Apr/18 21:37
Start Date: 03/Apr/18 21:37
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #5010: [BEAM-3257] Add Python 
precommit gradle config
URL: https://github.com/apache/beam/pull/5010#issuecomment-378407647
 
 
   update seed job




Issue Time Tracking
---

Worklog Id: (was: 87290)
Time Spent: 1h 10m  (was: 1h)

> Migrate Python Jenkins PreCommits to Gradle
> ---
>
> Key: BEAM-3257
> URL: https://issues.apache.org/jira/browse/BEAM-3257
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Code is here: 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Python_MavenInstall.groovy





[jira] [Work logged] (BEAM-3994) Use typed sinks and sources for FnApiControlClientPoolService

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3994?focusedWorklogId=87285&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87285
 ]

ASF GitHub Bot logged work on BEAM-3994:


Author: ASF GitHub Bot
Created on: 03/Apr/18 21:27
Start Date: 03/Apr/18 21:27
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #5008: 
[BEAM-3994] Use typed client pool sinks and sources
URL: https://github.com/apache/beam/pull/5008#discussion_r178967451
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/QueueControlClientPool.java
 ##
 @@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.control;
+
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.SynchronousQueue;
+
+/** Control client pool backed by a blocking queue. */
+public class QueueControlClientPool<T>
+    implements ControlClientPool<T> {
+
+  private final BlockingQueue<T> queue;
+
+  /** Creates a client pool backed by a {@link SynchronousQueue}. */
+  public static <T> QueueControlClientPool<T> createSynchronous() {
+    return new QueueControlClientPool<>(new SynchronousQueue<>(true));
+  }
+
+  /** Creates a client pool backed by an unbounded {@link LinkedBlockingQueue}. */
+  public static <T> QueueControlClientPool<T> createLinked() {
 
 Review comment:
   `createBuffering` or `createUnbounded`?




Issue Time Tracking
---

Worklog Id: (was: 87285)
Time Spent: 1h  (was: 50m)

> Use typed sinks and sources for FnApiControlClientPoolService
> -
>
> Key: BEAM-3994
> URL: https://issues.apache.org/jira/browse/BEAM-3994
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Ben Sidhom
>Assignee: Ben Sidhom
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> We operate with blocking queues directly when managing control clients with 
> the FnApiControlClientPoolService. This makes interactions with the client 
> pool difficult to understand. We should instead make client sources and sinks 
> explicit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3994) Use typed sinks and sources for FnApiControlClientPoolService

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3994?focusedWorklogId=87284&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87284
 ]

ASF GitHub Bot logged work on BEAM-3994:


Author: ASF GitHub Bot
Created on: 03/Apr/18 21:26
Start Date: 03/Apr/18 21:26
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #5008: 
[BEAM-3994] Use typed client pool sinks and sources
URL: https://github.com/apache/beam/pull/5008#discussion_r178967269
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/ControlClientPool.java
 ##
 @@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.fnexecution.control;
+
+/** Control client pool that exposes a source and sink of control clients. */
+public interface ControlClientPool<T> {
+
+  /** Source of control clients. */
+  ClientSource<T> getSource();
+
+  /** Sink for control clients. */
+  ClientSink<T> getSink();
+
+  /** A source of control clients. */
+  @FunctionalInterface
+  interface ClientSource<T> {
+    T take() throws Exception;
+  }
+
+  /** A sink for control clients. */
+  @FunctionalInterface
+  interface ClientSink<T> {
 
 Review comment:
   I originally tried that but one of those classes was in a package not 
accessible here. Where should those live, assuming they should be moved?
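
   The typed source/sink split can be sketched without any runner-specific
   classes. This is a minimal illustration of the idea under discussion, not
   the PR's implementation; in particular the `ClientSink` method name `put`
   is an assumption, since the excerpt above is truncated before the sink's
   method declaration.

   ```java
   import java.util.concurrent.BlockingQueue;
   import java.util.concurrent.LinkedBlockingQueue;

   public class TypedPoolSketch {

       @FunctionalInterface
       interface ClientSource<T> {
           T take() throws Exception;
       }

       @FunctionalInterface
       interface ClientSink<T> {
           void put(T client) throws Exception; // assumed method name
       }

       /** Expose only the consuming end of a queue as a typed source. */
       static <T> ClientSource<T> sourceOf(BlockingQueue<T> queue) {
           return queue::take;
       }

       /** Expose only the producing end of a queue as a typed sink. */
       static <T> ClientSink<T> sinkOf(BlockingQueue<T> queue) {
           return queue::put;
       }

       public static void main(String[] args) throws Exception {
           BlockingQueue<String> queue = new LinkedBlockingQueue<>();
           sinkOf(queue).put("worker-1");
           System.out.println(sourceOf(queue).take()); // worker-1
       }
   }
   ```

   Callers then depend only on the one-method view they need ("give me a
   client" or "register a client") rather than on a shared mutable queue.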


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87284)
Time Spent: 50m  (was: 40m)

> Use typed sinks and sources for FnApiControlClientPoolService
> -
>
> Key: BEAM-3994
> URL: https://issues.apache.org/jira/browse/BEAM-3994
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Ben Sidhom
>Assignee: Ben Sidhom
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> We operate with blocking queues directly when managing control clients with 
> the FnApiControlClientPoolService. This makes interactions with the client 
> pool difficult to understand. We should instead make client sources and sinks 
> explicit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=87283&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87283
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 03/Apr/18 21:23
Start Date: 03/Apr/18 21:23
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #4387: [BEAM-2732] Metrics 
rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#issuecomment-378403789
 
 
   @pabloem I looked at the post commit tests. It seems to be failing because 
of a quota issue.
   
   ```
   
   Workflow failed. Causes: Project apache-beam-testing has insufficient 
quota(s) to execute this workflow with 1 instances in region us-central1. Quota 
summary (required/available): 1/1441 instances, 1/0 CPUs, 250/14440 disk GB, 
0/1998 SSD disk GB, 1/63 instance groups, 1/13 managed instance groups, 1/39 
instance templates, 1/293 in-use IP addresses.
   
   Please see https://cloud.google.com/compute/docs/resource-quotas about 
requesting more quota.
   ```
   @alanmyrvold Is it possible to increase the quota for the project?
   
   Also @markflyhigh is currently working on fixing an issue with 
`test_streaming_wordcount_it` (other than the timeout). That would probably fix 
your tests issue as well.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87283)
Time Spent: 5h 50m  (was: 5h 40m)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> e.g logging and metrics keep state separately. State tracking should be 
> unified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3257) Migrate Python Jenkins PreCommits to Gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3257?focusedWorklogId=87280&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87280
 ]

ASF GitHub Bot logged work on BEAM-3257:


Author: ASF GitHub Bot
Created on: 03/Apr/18 21:15
Start Date: 03/Apr/18 21:15
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #5010: [BEAM-3257] Update 
Python Gradle tasks to run in a venv.
URL: https://github.com/apache/beam/pull/5010#issuecomment-378401914
 
 
   R: @lukecwik 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87280)
Time Spent: 1h  (was: 50m)

> Migrate Python Jenkins PreCommits to Gradle
> ---
>
> Key: BEAM-3257
> URL: https://issues.apache.org/jira/browse/BEAM-3257
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Code is here: 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Python_MavenInstall.groovy



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3257) Migrate Python Jenkins PreCommits to Gradle

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3257?focusedWorklogId=87278&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87278
 ]

ASF GitHub Bot logged work on BEAM-3257:


Author: ASF GitHub Bot
Created on: 03/Apr/18 21:11
Start Date: 03/Apr/18 21:11
Worklog Time Spent: 10m 
  Work Description: udim opened a new pull request #5010: [BEAM-3257] 
Update Python Gradle tasks to run in a venv.
URL: https://github.com/apache/beam/pull/5010
 
 
   DESCRIPTION HERE
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
- [ ] Write a pull request description that is detailed enough to 
understand:
  - [ ] What the pull request does
  - [ ] Why it does it
  - [ ] How it does it
  - [ ] Why this approach
- [ ] Each commit in the pull request should have a meaningful subject line 
and body.
- [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87278)
Time Spent: 50m  (was: 40m)

> Migrate Python Jenkins PreCommits to Gradle
> ---
>
> Key: BEAM-3257
> URL: https://issues.apache.org/jira/browse/BEAM-3257
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Code is here: 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Python_MavenInstall.groovy



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-3995) Launch Nexmark suites from gradle and update web page docs

2018-04-03 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-3995:
-

 Summary: Launch Nexmark suites from gradle and update web page docs
 Key: BEAM-3995
 URL: https://issues.apache.org/jira/browse/BEAM-3995
 Project: Beam
  Issue Type: Sub-task
  Components: examples-nexmark, website
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles


Currently our instructions for running Nexmark benchmarks on various runners are
pretty tightly tied to Maven. We need a good story for running them with gradle 
(or just building an executable with gradle and running that standalone).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #5272

2018-04-03 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3989) Maven pipeline jobs consistently failing

2018-04-03 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3989?focusedWorklogId=87275&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87275
 ]

ASF GitHub Bot logged work on BEAM-3989:


Author: ASF GitHub Bot
Created on: 03/Apr/18 20:58
Start Date: 03/Apr/18 20:58
Worklog Time Spent: 10m 
  Work Description: lukecwik closed pull request #5005: [BEAM-3989] Delete 
unused pipeline jobs
URL: https://github.com/apache/beam/pull/5005
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/.test-infra/jenkins/PreCommit_Pipeline.groovy 
b/.test-infra/jenkins/PreCommit_Pipeline.groovy
deleted file mode 100644
index 131c79845ab..000
--- a/.test-infra/jenkins/PreCommit_Pipeline.groovy
+++ /dev/null
@@ -1,129 +0,0 @@
-#!groovy
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-import hudson.model.Result
-
-int NO_BUILD = -1
-
-// These are args for the GitHub Pull Request Builder (ghprb) Plugin. 
Providing these arguments is
-// necessary due to a bug in the ghprb plugin where environment variables are 
not correctly passed
-// to jobs downstream of a Pipeline job.
-// Tracked by https://github.com/jenkinsci/ghprb-plugin/issues/572.
-List ghprbArgs = [
-string(name: 'ghprbGhRepository', value: "${ghprbGhRepository}"),
-string(name: 'ghprbActualCommit', value: "${ghprbActualCommit}"),
-string(name: 'ghprbPullId', value: "${ghprbPullId}")
-]
-
-// This argument is the commit at which to build.
-List commitArg = [string(name: 'sha1', value: 
"origin/pr/${ghprbPullId}/head")]
-
-int javaBuildNum = NO_BUILD
-
-final String JAVA_BUILD_TYPE = "java"
-final String PYTHON_BUILD_TYPE = "python"
-final String ALL_BUILD_TYPE = "all"
-
-def buildTypes = [
-JAVA_BUILD_TYPE,
-PYTHON_BUILD_TYPE,
-ALL_BUILD_TYPE,
-]
-
-String currentBuildType = ALL_BUILD_TYPE
-String commentLower = ghprbCommentBody.toLowerCase()
-
-// Currently if there is nothing selected (e.g. the comment is just "retest 
this please") we select "all" by default.
-// In the future we should provide some mechanism, either via commenting or 
the suite failure message, to enforce
-// selection of one of the build types.
-if (!commentLower.isEmpty()) {
-commentSplit = commentLower.split(' ')
-buildType = commentSplit[commentSplit.length-1]
-if (buildTypes.contains(buildType)) {
-currentBuildType = buildType
-}
-}
-
-// This (and the below) define "Stages" of a pipeline. These stages run 
serially, and inside can
-// have "parallel" blocks which execute several work steps concurrently. This 
work is limited to
-// simple operations -- more complicated operations need to be performed on an 
actual node. In this
-// case we are using the pipeline to trigger downstream builds.
-stage('Build') {
-parallel (
-java: {
-if (currentBuildType == JAVA_BUILD_TYPE || currentBuildType == 
ALL_BUILD_TYPE) {
-def javaBuild = build job: 'beam_Java_Build', parameters: 
commitArg + ghprbArgs
-if (javaBuild.getResult() == Result.SUCCESS.toString()) {
-javaBuildNum = javaBuild.getNumber()
-}
-} else {
-echo 'Skipping Java due to comment selecting non-Java 
execution: ' + ghprbCommentBody
-}
-},
-python_unit: { // Python doesn't have a build phase, so we include 
this here.
-if (currentBuildType == PYTHON_BUILD_TYPE || currentBuildType == 
ALL_BUILD_TYPE) {
-try {
-build job: 'beam_Python_UnitTest', parameters: commitArg + 
ghprbArgs
-} catch (Exception e) {
-echo 'Python build failed: ' + e.toString()
-}
-} else {
-echo 'Skipping Python due to comment selecting non-Python 
execution: ' + ghprbCommentBody
-}
-}
-)
-}
-
-// This 
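
The build-type selection logic in the deleted Groovy pipeline above (last word
of the trigger comment picks `java`, `python`, or `all`, defaulting to `all`)
is compact enough to restate as a self-contained sketch. This is a Java
transliteration for illustration only; the class and method names are not from
the repository.

```java
import java.util.Arrays;
import java.util.List;

public class BuildTypeSelector {

    static final List<String> BUILD_TYPES = Arrays.asList("java", "python", "all");

    /**
     * Mirrors the deleted Groovy logic: the last whitespace-separated word of
     * the trigger comment selects the build type; an empty comment or an
     * unrecognized last word falls back to "all".
     */
    static String select(String ghprbCommentBody) {
        String comment = ghprbCommentBody.toLowerCase();
        if (!comment.isEmpty()) {
            String[] words = comment.split(" ");
            String candidate = words[words.length - 1];
            if (BUILD_TYPES.contains(candidate)) {
                return candidate;
            }
        }
        return "all";
    }

    public static void main(String[] args) {
        System.out.println(select("retest this please"));        // all
        System.out.println(select("run precommit java"));        // java
        System.out.println(select(""));                          // all
    }
}
```

As the deleted comment itself noted, this scheme silently runs everything when
the comment's last word is unrecognized, which is why the file proposed a
future mechanism to enforce explicit selection.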

[beam] branch master updated (f5fc543 -> f0c9ff8)

2018-04-03 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from f5fc543  [BEAM-3249] Add missing gradle artifact ids
 add a25157b  [BEAM-3989] Delete unused pipeline jobs
 new f0c9ff8  [BEAM-3989] Delete unused pipeline jobs

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .test-infra/jenkins/PreCommit_Pipeline.groovy  | 129 -
 .../jenkins/job_beam_PreCommit_Pipeline.groovy |  84 --
 2 files changed, 213 deletions(-)
 delete mode 100644 .test-infra/jenkins/PreCommit_Pipeline.groovy
 delete mode 100644 .test-infra/jenkins/job_beam_PreCommit_Pipeline.groovy

-- 
To stop receiving notification emails like this one, please contact
lc...@apache.org.

