[jira] [Comment Edited] (BEAM-4096) BigQueryIO ValueProvider support for Method and Triggering Frequency

2018-04-16 Thread Jan Peuker (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440364#comment-16440364
 ] 

Jan Peuker edited comment on BEAM-4096 at 4/17/18 5:40 AM:
---

Hi this is Jan, all set up with Jira now.

Small addition here: we also need to change withNumFileShards to a 
ValueProvider; it is a required option right now. The default of 1000 mentioned 
in the JavaDoc is incorrect and tends to cause OutOfMemoryError in 
DataflowRunner. From my current, naive benchmarks, 100 shards seems a more 
sensible suggestion for most cases (shard counts are easy to calculate on 
powers of 10, and common chunk sizes are reached earlier).


was (Author: janpeuker):
Hi this is Jan, all set up with Jira now.

Small addition here: We also need to change withNumFileShards to a 
ValueProvider which is a required option right now. The default 1000 mentioned 
in the JavaDoc is incorrect and tends to cause OutOfMemoryError in 
DataflowRunner. From my current, native, benchmarks it seems a more sensible 
suggestion for most cases seems to have 100 shards (easy to calculate shard on 
powers of 2 and reaches common chunk sizes earlier).

> BigQueryIO ValueProvider support for Method and Triggering Frequency
> 
>
> Key: BEAM-4096
> URL: https://issues.apache.org/jira/browse/BEAM-4096
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Ryan McDowell
>Priority: Minor
> Fix For: 2.5.0
>
>
> Enhance BigQueryIO to accept ValueProviders for:
>  * withMethod(..)
>  * withTriggeringFrequency(..)
>  * withNumFileShards(..)
> It would allow Dataflow templates to accept these parameters at runtime 
> instead of being hardcoded. This opens up the ability to create Dataflow 
> templates which allow users to flip back-and-forth between batch and 
> streaming inserts.
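For context, the ValueProvider pattern is what lets a Dataflow template defer
reading an option until pipeline execution. The following is a minimal,
self-contained Python sketch of that pattern only; it does not use Beam's
actual classes, and the runtime_options mechanism shown is a simplified
stand-in for how a runner supplies options when a templated job starts:

```python
class StaticValueProvider:
    """Wraps a value already known when the template is constructed."""

    def __init__(self, value):
        self._value = value

    def is_accessible(self):
        return True

    def get(self):
        return self._value


class RuntimeValueProvider:
    """Defers resolution until the runner supplies options at execution time."""

    runtime_options = None  # populated by the "runner" when the job starts

    def __init__(self, option_name, default):
        self.option_name = option_name
        self.default = default

    def is_accessible(self):
        return RuntimeValueProvider.runtime_options is not None

    def get(self):
        if not self.is_accessible():
            raise RuntimeError(
                "%s is not available at construction time" % self.option_name)
        return RuntimeValueProvider.runtime_options.get(
            self.option_name, self.default)


# A transform configured with a provider reads it only when processing starts,
# which is what would let one template flip between batch and streaming inserts.
method = RuntimeValueProvider("method", default="FILE_LOADS")
assert not method.is_accessible()  # still a template-time placeholder

RuntimeValueProvider.runtime_options = {"method": "STREAMING_INSERTS"}
assert method.get() == "STREAMING_INSERTS"
```

A withMethod(..) accepting such a provider would call get() during execution
rather than at graph-construction time.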



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3327) Add abstractions to manage Environment Instance lifecycles.

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3327?focusedWorklogId=91623&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91623
 ]

ASF GitHub Bot logged work on BEAM-3327:


Author: ASF GitHub Bot
Created on: 17/Apr/18 05:57
Start Date: 17/Apr/18 05:57
Worklog Time Spent: 10m 
  Work Description: axelmagn commented on issue #5152: [BEAM-3327] Harness 
Manager Interfaces
URL: https://github.com/apache/beam/pull/5152#issuecomment-381854681
 
 
   R: @tgroh 
   CC: @bsidhom 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91623)
Time Spent: 6.5h  (was: 6h 20m)

> Add abstractions to manage Environment Instance lifecycles.
> ---
>
> Key: BEAM-3327
> URL: https://issues.apache.org/jira/browse/BEAM-3327
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-core
>Reporter: Thomas Groh
>Assignee: Axel Magnuson
>Priority: Major
>  Labels: portability
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> This permits remote stage execution for arbitrary environments





[jira] [Work logged] (BEAM-3327) Add abstractions to manage Environment Instance lifecycles.

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3327?focusedWorklogId=91621&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91621
 ]

ASF GitHub Bot logged work on BEAM-3327:


Author: ASF GitHub Bot
Created on: 17/Apr/18 05:57
Start Date: 17/Apr/18 05:57
Worklog Time Spent: 10m 
  Work Description: axelmagn commented on issue #5152: [BEAM-3327] Harness 
Manager Interfaces
URL: https://github.com/apache/beam/pull/5152#issuecomment-381854681
 
 
   R: tgroh@
   cc: bsidhom@




Issue Time Tracking
---

Worklog Id: (was: 91621)
Time Spent: 6h 20m  (was: 6h 10m)

> Add abstractions to manage Environment Instance lifecycles.
> ---
>
> Key: BEAM-3327
> URL: https://issues.apache.org/jira/browse/BEAM-3327
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-core
>Reporter: Thomas Groh
>Assignee: Axel Magnuson
>Priority: Major
>  Labels: portability
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> This permits remote stage execution for arbitrary environments





[jira] [Work logged] (BEAM-3327) Add abstractions to manage Environment Instance lifecycles.

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3327?focusedWorklogId=91620&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91620
 ]

ASF GitHub Bot logged work on BEAM-3327:


Author: ASF GitHub Bot
Created on: 17/Apr/18 05:56
Start Date: 17/Apr/18 05:56
Worklog Time Spent: 10m 
  Work Description: axelmagn opened a new pull request #5152: [BEAM-3327] 
Harness Manager Interfaces
URL: https://github.com/apache/beam/pull/5152
 
 
   These are some interfaces that will be used on the worker to manage the 
lifetimes of remote environments and the related RPC services. The key 
addition is `SdkHarnessManager`, which is responsible for managing these 
resources and can provide a `RemoteEnvironment` to runner operators.
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [x] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
- [x] Write a pull request description that is detailed enough to 
understand:
  - [x] What the pull request does
  - [x] Why it does it
  - [x] How it does it
  - [x] Why this approach
- [x] Each commit in the pull request should have a meaningful subject line 
and body.
- [x] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
- [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   




Issue Time Tracking
---

Worklog Id: (was: 91620)
Time Spent: 6h 10m  (was: 6h)

> Add abstractions to manage Environment Instance lifecycles.
> ---
>
> Key: BEAM-3327
> URL: https://issues.apache.org/jira/browse/BEAM-3327
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-core
>Reporter: Thomas Groh
>Assignee: Axel Magnuson
>Priority: Major
>  Labels: portability
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> This permits remote stage execution for arbitrary environments





[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91388&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91388
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 16:36
Start Date: 16/Apr/18 16:36
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #4387: [BEAM-2732] Metrics 
rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#issuecomment-381668215
 
 
   Retest this please




Issue Time Tracking
---

Worklog Id: (was: 91388)
Time Spent: 7h 20m  (was: 7h 10m)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> e.g logging and metrics keep state separately. State tracking should be 
> unified.
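As an illustration of the unification being proposed, logging and metrics can
consult one shared record of the currently-executing step instead of each
keeping its own bookkeeping. The sketch below is hypothetical Python written
for this summary, not the actual sdk-py-core state-sampler code:

```python
import threading


class StateTracker:
    """One shared, thread-local record of the current execution state, meant
    to be consulted by both logging and metrics."""

    def __init__(self):
        self._local = threading.local()

    def current_state(self):
        return getattr(self._local, "state", None)

    def scoped_state(self, step_name):
        return _ScopedState(self, step_name)


class _ScopedState:
    """Context manager that sets the current step and restores the previous
    one on exit, so nested transforms track correctly."""

    def __init__(self, tracker, step_name):
        self._tracker = tracker
        self._step = step_name

    def __enter__(self):
        self._prev = self._tracker.current_state()
        self._tracker._local.state = self._step
        return self

    def __exit__(self, *exc):
        self._tracker._local.state = self._prev


tracker = StateTracker()
with tracker.scoped_state("ParDo(MyFn)"):
    # Both a log handler and a metrics counter would query the tracker here
    # rather than maintaining separate per-subsystem state.
    assert tracker.current_state() == "ParDo(MyFn)"
assert tracker.current_state() is None
```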





Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Apex_Gradle #106

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[github] Add region to dataflowOptions as well.

[tgroh] Use Explicit PipelineOptions in Native Evaluators

--
[...truncated 23.85 MB...]
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.engine.StreamingContainer 
processHeartbeatResponse
INFO: Undeploy request: [12, 13, 14]
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.engine.StreamingContainer 
undeploy
INFO: Undeploy complete.
Apr 16, 2018 4:44:42 PM com.datatorrent.bufferserver.server.Server$3 run
INFO: Removing ln 
LogicalNode@4c03a164identifier=tcp://localhost:50583/14.output.15, 
upstream=14.output.15, group=stream8/17.data1, partitions=[], 
iterator=com.datatorrent.bufferserver.internal.DataList$DataListIterator@7d7ecffc{da=com.datatorrent.bufferserver.internal.DataList$Block@fa87c85{identifier=14.output.15,
 data=1048576, readingOffset=0, writingOffset=33, 
starting_window=5ad4d2f60001, ending_window=5ad4d2f60001, refCount=2, 
uniqueIdentifier=0, next=null, future=null}}} from dl 
com.datatorrent.bufferserver.internal.DataList@60305b36 {14.output.15}
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.engine.StreamingContainer 
processHeartbeatResponse
INFO: Undeploy request: [4]
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.engine.StreamingContainer 
undeploy
INFO: Undeploy complete.
Apr 16, 2018 4:44:42 PM com.datatorrent.bufferserver.server.Server$3 run
INFO: Removing ln 
LogicalNode@76559460identifier=tcp://localhost:50583/4.output.4, 
upstream=4.output.4, group=stream0/5.input, partitions=[], 
iterator=com.datatorrent.bufferserver.internal.DataList$DataListIterator@16d28135{da=com.datatorrent.bufferserver.internal.DataList$Block@1d58bd3c{identifier=4.output.4,
 data=1048576, readingOffset=0, writingOffset=33, 
starting_window=5ad4d2f60001, ending_window=5ad4d2f60001, refCount=2, 
uniqueIdentifier=0, next=null, future=null}}} from dl 
com.datatorrent.bufferserver.internal.DataList@51d8a646 {4.output.4}
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.engine.StreamingContainer 
processHeartbeatResponse
INFO: Undeploy request: [11]
Apr 16, 2018 4:44:42 PM com.datatorrent.stram.engine.StreamingContainer 
undeploy
INFO: Undeploy complete.
Apr 16, 2018 4:44:42 PM com.datatorrent.bufferserver.server.Server$3 run
INFO: Removing ln 
LogicalNode@57476114identifier=tcp://localhost:50583/11.output.11, 
upstream=11.output.11, group=stream6/12.input, partitions=[], 
iterator=com.datatorrent.bufferserver.internal.DataList$DataListIterator@33d3bea0{da=com.datatorrent.bufferserver.internal.DataList$Block@310a4ac6{identifier=11.output.11,
 data=1048576, readingOffset=0, writingOffset=33, 
starting_window=5ad4d2f60001, ending_window=5ad4d2f60001, refCount=2, 
uniqueIdentifier=0, next=null, future=null}}} from dl 
com.datatorrent.bufferserver.internal.DataList@6963a086 {11.output.11}
Apr 16, 2018 4:44:43 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 4:44:43 PM com.datatorrent.stram.engine.StreamingContainer 
processHeartbeatResponse
INFO: Undeploy request: [1]
Apr 16, 2018 4:44:43 PM com.datatorrent.stram.engine.StreamingContainer 
undeploy
INFO: Undeploy complete.
Apr 16, 2018 4:44:43 PM com.datatorrent.bufferserver.server.Server$3 run
INFO: Removing 

[jira] [Work logged] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4056?focusedWorklogId=91407&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91407
 ]

ASF GitHub Bot logged work on BEAM-4056:


Author: ASF GitHub Bot
Created on: 16/Apr/18 17:18
Start Date: 16/Apr/18 17:18
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on issue #5118: [BEAM-4056] Identify 
side inputs by transform id and local name
URL: https://github.com/apache/beam/pull/5118#issuecomment-381681942
 
 
   Done.




Issue Time Tracking
---

Worklog Id: (was: 91407)
Time Spent: 2h 10m  (was: 2h)

> Identify Side Inputs by PTransform ID and local name
> 
>
> Key: BEAM-4056
> URL: https://issues.apache.org/jira/browse/BEAM-4056
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-core
>Reporter: Ben Sidhom
>Assignee: Ben Sidhom
>Priority: Minor
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> This is necessary in order to correctly identify side inputs during all 
> phases of portable pipeline execution (fusion, translation, and SDK 
> execution).
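Keying a side input by the pair (PTransform id, local name) yields a reference
that stays unambiguous even when two transforms reuse the same local input
name. A hypothetical sketch of that keying scheme (the ids shown are invented
for illustration):

```python
from collections import namedtuple

# A side input is identified by the consuming PTransform's id plus the
# local name of the input within that transform.
SideInputId = namedtuple("SideInputId", ["transform_id", "local_name"])

side_inputs = {
    SideInputId("ptransform-3", "side0"): "view-a",
    # Same local name under a different transform: no collision.
    SideInputId("ptransform-7", "side0"): "view-b",
}

assert side_inputs[SideInputId("ptransform-3", "side0")] == "view-a"
```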





[jira] [Work logged] (BEAM-3310) Push metrics to a backend in an runner agnostic way

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3310?focusedWorklogId=91405&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91405
 ]

ASF GitHub Bot logged work on BEAM-3310:


Author: ASF GitHub Bot
Created on: 16/Apr/18 17:17
Start Date: 16/Apr/18 17:17
Worklog Time Spent: 10m 
  Work Description: swegner commented on issue #4548: [BEAM-3310] Metrics 
pusher
URL: https://github.com/apache/beam/pull/4548#issuecomment-381681741
 
 
   > Can we move forward and merge this PR as it is and consider it as the 
default implementation that java sdks and java runners are free to use (like 
the philosophy of runner-core-java module)?
   
   What do you mean by making it the default? I don't think this replaces 
existing metrics functionality; for example, the existing infrastructure 
supports committed values and querying metric results via the 
`PipelineResult` interface. However, I'm not opposed to this going in as 
additional opt-in functionality.
   
   > @swegner you gave LGTM, so I think we can merge.
   
   The `MetricsPusher` component looks good to me; I haven't reviewed 
Spark/Flink runner integration. Also note that I am not a Beam Committer, so 
final LGTM will have to come from someone who is. (@lukecwik is a committer, 
but is currently on vacation)




Issue Time Tracking
---

Worklog Id: (was: 91405)
Time Spent: 4h 40m  (was: 4.5h)

> Push metrics to a backend in an runner agnostic way
> ---
>
> Key: BEAM-3310
> URL: https://issues.apache.org/jira/browse/BEAM-3310
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> The idea is to avoid relying on the runners to provide access to the metrics 
> (either at the end of the pipeline or while it runs), because runners do not 
> all have the same metrics capabilities (e.g. the Spark runner configures sinks 
> like CSV, Graphite or in-memory sinks using the Spark engine conf). The 
> target is to push the metrics from common runner code so that, no matter the 
> chosen runner, a user can get their metrics out of Beam.
> Here is the link to the discussion thread on the dev ML: 
> https://lists.apache.org/thread.html/01a80d62f2df6b84bfa41f05e15fda900178f882877c294fed8be91e@%3Cdev.beam.apache.org%3E
> And the design doc:
> https://s.apache.org/runner_independent_metrics_extraction
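The design under discussion amounts to a periodic poll-and-push loop living in
runner-common code: read a snapshot of the metrics, hand it to a configurable
sink. A minimal, self-contained sketch of that idea follows; the names are
hypothetical and this is not the actual MetricsPusher implementation:

```python
class InMemorySink:
    """Stand-in for a real backend sink (Graphite, an HTTP endpoint, ...)."""

    def __init__(self):
        self.pushed = []

    def write(self, snapshot):
        # Copy so later counter updates don't mutate already-pushed snapshots.
        self.pushed.append(dict(snapshot))


class MetricsPusher:
    """Polls a metrics source and forwards snapshots to a sink, independent
    of which runner executes the pipeline."""

    def __init__(self, source, sink):
        self._source = source  # callable returning {metric_name: value}
        self._sink = sink

    def push_once(self):
        self._sink.write(self._source())


counters = {"elements": 0}
pusher = MetricsPusher(lambda: counters, InMemorySink())

counters["elements"] = 42
pusher.push_once()  # in a real pusher this would run on a timer thread
```

Because the loop only depends on the source callable and the sink interface,
any runner that can expose a snapshot of its metrics gets pushing for free.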





Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Apex_Gradle #107

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Update Dataflow Development Container Version

--
[...truncated 27.98 MB...]
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 1 sending EndOfStream
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 2 sending EndOfStream
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 3 sending EndOfStream
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 4 sending EndOfStream
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 5 sending EndOfStream
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 6 sending EndOfStream
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 7 sending EndOfStream
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 8 sending EndOfStream
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 9 sending EndOfStream
Apr 16, 2018 5:38:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 12 sending EndOfStream
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 13 sending EndOfStream
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 14 sending EndOfStream
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 15 sending EndOfStream
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 16 sending EndOfStream
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 17 sending EndOfStream
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 18 sending EndOfStream
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 19 sending EndOfStream
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.engine.StreamingContainer 
processHeartbeatResponse
INFO: Undeploy request: [13]
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.engine.StreamingContainer 
undeploy
INFO: Undeploy complete.
Apr 16, 2018 5:38:22 PM com.datatorrent.bufferserver.server.Server$3 run
INFO: Removing ln 
LogicalNode@52b50842identifier=tcp://localhost:41500/13.out.13, 
upstream=13.out.13, group=stream15/14.input, partitions=[], 
iterator=com.datatorrent.bufferserver.internal.DataList$DataListIterator@4f15a734{da=com.datatorrent.bufferserver.internal.DataList$Block@20235a94{identifier=13.out.13,
 data=1048576, readingOffset=0, writingOffset=257, 
starting_window=5ad4df8b0001, ending_window=5ad4df8b0006, refCount=2, 
uniqueIdentifier=0, next=null, future=null}}} from dl 
com.datatorrent.bufferserver.internal.DataList@17ce6873 {13.out.13}
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 5:38:22 PM com.datatorrent.stram.engine.StreamingContainer 
processHeartbeatResponse
INFO: Undeploy request: [16, 17, 18, 19, 15]
Apr 16, 2018 

Build failed in Jenkins: beam_PerformanceTests_AvroIOIT_HDFS #57

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Update Dataflow Development Container Version

[github] Add region to dataflowOptions as well.

[tgroh] Use Explicit PipelineOptions in Native Evaluators

--
[...truncated 404.17 KB...]
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:248)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:235)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:923)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn.processElement(WriteFiles.java:503)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy60.create(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy61.create(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1623)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:109)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:248)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:235)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:923)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn.processElement(WriteFiles.java:503)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn$DoFnInvoker.invokeProcessElement(Unknown
 Source)
at 
org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at 
org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:138)
at 
com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:323)
at 
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
at 
com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
at 
com.google.cloud.dataflow.worker.AssignWindowsParDoFnFactory$AssignWindowsParDoFn.processElement(AssignWindowsParDoFnFactory.java:118)
at 

[jira] [Updated] (BEAM-4033) Move java example precommits to execute within runners similar to how validates runner works

2018-04-16 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4033:
--
Parent Issue: BEAM-4045  (was: BEAM-3249)

> Move java example precommits to execute within runners similar to how 
> validates runner works
> 
>
> Key: BEAM-4033
> URL: https://issues.apache.org/jira/browse/BEAM-4033
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Luke Cwik
>Priority: Minor
>






Jenkins build is back to normal : beam_PerformanceTests_TextIOIT_HDFS #63

2018-04-16 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_Spark #1598

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Update Dataflow Development Container Version

[github] Add region to dataflowOptions as well.

[tgroh] Use Explicit PipelineOptions in Native Evaluators

--
[...truncated 85.70 KB...]
'apache-beam-testing:bqjob_r1daf3b5f171b5784_0162cfadf94f_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.Waiting on bqjob_r1daf3b5f171b5784_0162cfadf94f_1 ... (0s) 
Current status: RUNNING 
 Waiting on 
bqjob_r1daf3b5f171b5784_0162cfadf94f_1 ... (0s) Current status: DONE   
2018-04-16 18:18:30,344 131bb083 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-16 18:18:50,988 131bb083 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-16 18:18:53,773 131bb083 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r1c467a6082083a7f_0162cfae561b_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.Waiting on bqjob_r1c467a6082083a7f_0162cfae561b_1 ... (0s) 
Current status: RUNNING 
 Waiting on 
bqjob_r1c467a6082083a7f_0162cfae561b_1 ... (0s) Current status: DONE   
2018-04-16 18:18:53,774 131bb083 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-16 18:19:20,516 131bb083 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-16 18:19:23,410 131bb083 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r4a6bcbb06170de4e_0162cfaec8a1_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.Waiting on bqjob_r4a6bcbb06170de4e_0162cfaec8a1_1 ... (0s) 
Current status: RUNNING 
 Waiting on 
bqjob_r4a6bcbb06170de4e_0162cfaec8a1_1 ... (0s) Current status: DONE   
2018-04-16 18:19:23,411 131bb083 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-16 18:19:47,020 131bb083 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-16 18:19:50,092 131bb083 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job

[jira] [Work logged] (BEAM-3310) Push metrics to a backend in a runner-agnostic way

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3310?focusedWorklogId=91302=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91302
 ]

ASF GitHub Bot logged work on BEAM-3310:


Author: ASF GitHub Bot
Created on: 16/Apr/18 11:55
Start Date: 16/Apr/18 11:55
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #4548: [BEAM-3310] Metrics 
pusher
URL: https://github.com/apache/beam/pull/4548#issuecomment-381532767
 
 
   Since the discussed missing parts will be dealt with in separate issues, I think 
we can merge this. Anyone else agree, @swegner @lukecwik? I will merge if there is no 
disagreement. In the meantime @echauchot, can you please rebase/squash as many 
commits as makes sense (to avoid having 'Fix rat' and 'clean'-style commits) 
and also get a green run on Jenkins?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91302)
Time Spent: 4.5h  (was: 4h 20m)

> Push metrics to a backend in a runner-agnostic way
> ---
>
> Key: BEAM-3310
> URL: https://issues.apache.org/jira/browse/BEAM-3310
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> The idea is to avoid relying on the runners to provide access to the metrics 
> (either at the end of the pipeline or while it runs), because the runners do not 
> all have the same metrics capabilities (e.g. the Spark runner configures sinks 
> such as CSV, Graphite, or in-memory sinks via the Spark engine configuration). The 
> target is to push the metrics from the common runner code so that, no matter the 
> chosen runner, a user can get their metrics out of Beam.
> Here is the link to the discussion thread on the dev ML: 
> https://lists.apache.org/thread.html/01a80d62f2df6b84bfa41f05e15fda900178f882877c294fed8be91e@%3Cdev.beam.apache.org%3E
> And the design doc:
> https://s.apache.org/runner_independent_metrics_extraction
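The runner-independent pattern described above (common runner code pushing metric snapshots to a pluggable backend) can be sketched as below. This is a minimal illustration only, not the actual Beam MetricsPusher API: the `MetricsSink` interface, the polling period, and the final push at shutdown are all assumptions made for the sketch.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class MetricsPusherSketch {
  /** Pluggable backend; hypothetical name, not the real Beam interface. */
  interface MetricsSink {
    void writeMetrics(Map<String, Long> metrics);
  }

  /** Periodically polls a metrics source and pushes snapshots to the sink,
   *  independently of which runner produced the metrics. */
  static final class MetricsPusher {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();
    private final Supplier<Map<String, Long>> source;
    private final MetricsSink sink;

    MetricsPusher(Supplier<Map<String, Long>> source, MetricsSink sink) {
      this.source = source;
      this.sink = sink;
    }

    void start(long periodMillis) {
      scheduler.scheduleAtFixedRate(
          () -> sink.writeMetrics(source.get()), 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    void stop() throws InterruptedException {
      scheduler.shutdown();
      scheduler.awaitTermination(5, TimeUnit.SECONDS);
      sink.writeMetrics(source.get()); // one final push when the pipeline finishes
    }
  }

  public static void main(String[] args) throws Exception {
    Map<String, Long> counters = new ConcurrentHashMap<>();
    List<Map<String, Long>> pushed = new CopyOnWriteArrayList<>();
    MetricsPusher pusher = new MetricsPusher(() -> Map.copyOf(counters), pushed::add);
    pusher.start(50);
    counters.put("elementsProcessed", 42L); // the pipeline updates counters as it runs
    Thread.sleep(200);
    pusher.stop();
    if (pushed.isEmpty()) throw new AssertionError("expected at least one push");
    Map<String, Long> last = pushed.get(pushed.size() - 1);
    if (!Long.valueOf(42L).equals(last.get("elementsProcessed")))
      throw new AssertionError("final snapshot missing counter");
    System.out.println("pushes=" + pushed.size());
  }
}
```

Because the pusher only depends on a `Supplier` of metric snapshots and a sink interface, any runner that can expose current metric values can reuse it unchanged.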



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4016) Direct runner incorrect lifecycle, @SplitRestriction should execute after @Setup on SplittableDoFn

2018-04-16 Thread Romain Manni-Bucau (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439412#comment-16439412
 ] 

Romain Manni-Bucau commented on BEAM-4016:
--

PS: don't forget the teardown mapping for any instance (or use the same caching 
hack the direct runner has)

> Direct runner incorrect lifecycle, @SplitRestriction should execute after 
> @Setup on SplittableDoFn
> --
>
> Key: BEAM-4016
> URL: https://issues.apache.org/jira/browse/BEAM-4016
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.4.0
>Reporter: Ismaël Mejía
>Assignee: Thomas Groh
>Priority: Major
> Attachments: sdf-splitrestriction-lifeycle-test.patch
>
>
> The method annotated with @SplitRestriction is where we can define the 
> RestrictionTrackers (splits) in advance in an SDF. It makes sense to 
> execute this after the @Setup method, given that connections are usually 
> established at @Setup and can then be used to ask the different data stores 
> about their partitioning strategy. I added a test for this in the 
> SplittableDoFnTest.SDFWithLifecycle test.
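The expected ordering can be illustrated with a tiny stand-alone harness. This is not the Beam SDK API, just a minimal sketch of the lifecycle contract the issue asks for: `setup()` (the @Setup stage) must run before `splitRestriction()` so the latter can use the connection opened at setup time.

```java
class SdfLifecycleSketch {
  /** Stand-in for an SDF; method names mirror the annotations, not the real API. */
  static final class MyFn {
    private boolean connectionOpen = false;

    void setup() {            // @Setup: connections are established here
      connectionOpen = true;
    }

    int splitRestriction() {  // @SplitRestriction: may query the store for partitions
      if (!connectionOpen) {
        throw new IllegalStateException("@SplitRestriction invoked before @Setup");
      }
      return 4; // e.g. partition count reported by the data store
    }

    void teardown() {         // @Teardown: release the connection
      connectionOpen = false;
    }
  }

  public static void main(String[] args) {
    MyFn fn = new MyFn();
    fn.setup();                         // runner must call @Setup first...
    int splits = fn.splitRestriction(); // ...so split logic can use the connection
    fn.teardown();
    if (splits != 4) throw new AssertionError("unexpected split count");
    System.out.println("splits=" + splits);
  }
}
```

Reversing the first two calls would throw, which is exactly the bug the direct runner exhibits when it invokes @SplitRestriction before @Setup.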





[jira] [Created] (BEAM-4088) ExecutorServiceParallelExecutorTest#ensureMetricsThreadDoesntLeak in PR #4965 does not pass in gradle

2018-04-16 Thread Etienne Chauchot (JIRA)
Etienne Chauchot created BEAM-4088:
--

 Summary: 
ExecutorServiceParallelExecutorTest#ensureMetricsThreadDoesntLeak in PR #4965 
does not pass in gradle
 Key: BEAM-4088
 URL: https://issues.apache.org/jira/browse/BEAM-4088
 Project: Beam
  Issue Type: Sub-task
  Components: testing
Reporter: Etienne Chauchot
Assignee: Romain Manni-Bucau


This test always fails when run with Gradle but always passes with Maven





[jira] [Work logged] (BEAM-4049) Improve write throughput of CassandraIO

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4049?focusedWorklogId=91333=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91333
 ]

ASF GitHub Bot logged work on BEAM-4049:


Author: ASF GitHub Bot
Created on: 16/Apr/18 13:19
Start Date: 16/Apr/18 13:19
Worklog Time Spent: 10m 
  Work Description: adejanovski commented on issue #5112: [BEAM-4049] 
Improve CassandraIO write throughput by performing async queries
URL: https://github.com/apache/beam/pull/5112#issuecomment-381596378
 
 
   Thanks @iemejia !




Issue Time Tracking
---

Worklog Id: (was: 91333)
Time Spent: 3.5h  (was: 3h 20m)

> Improve write throughput of CassandraIO
> ---
>
> Key: BEAM-4049
> URL: https://issues.apache.org/jira/browse/BEAM-4049
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-cassandra
>Affects Versions: 2.4.0
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Major
>  Labels: performance
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> The CassandraIO currently uses the mapper to perform writes in a synchronous 
> fashion. 
> This means that writes are serialized, which is a very suboptimal way of 
> writing to Cassandra.
> The IO should use the saveAsync() method instead of save(), and should wait 
> for completion whenever 100 queries are in flight, in order to avoid 
> overwhelming clusters.
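The bounded in-flight pattern described above can be sketched as follows. The real implementation would call the DataStax mapper's saveAsync(); here a CompletableFuture stand-in simulates the async query so the sketch is self-contained, and the 100-query cap and counter are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntFunction;

class AsyncWriteSketch {
  static final int MAX_IN_FLIGHT = 100; // cap in-flight queries to avoid overwhelming the cluster

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(8);
    AtomicInteger written = new AtomicInteger();
    // Stand-in for the mapper's saveAsync(entity): issues the write without blocking.
    IntFunction<CompletableFuture<Void>> saveAsync =
        i -> CompletableFuture.runAsync(written::incrementAndGet, pool);

    List<CompletableFuture<Void>> inFlight = new ArrayList<>();
    for (int i = 0; i < 1000; i++) {
      inFlight.add(saveAsync.apply(i));
      if (inFlight.size() >= MAX_IN_FLIGHT) {
        // Wait for the current batch to complete before issuing more queries.
        CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
        inFlight.clear();
      }
    }
    // Drain whatever is still in flight at the end of the bundle.
    CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
    pool.shutdown();
    if (written.get() != 1000) {
      throw new AssertionError("expected 1000 writes, got " + written.get());
    }
    System.out.println("writes=" + written.get());
  }
}
```

Compared to blocking on each save() call, up to 100 writes proceed concurrently while the loop still provides backpressure against the cluster.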





[jira] [Work logged] (BEAM-2990) support data type MAP

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2990?focusedWorklogId=91401=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91401
 ]

ASF GitHub Bot logged work on BEAM-2990:


Author: ASF GitHub Bot
Created on: 16/Apr/18 17:03
Start Date: 16/Apr/18 17:03
Worklog Time Spent: 10m 
  Work Description: XuMingmin commented on issue #5079: [BEAM-2990] support 
MAP in SQL schema
URL: https://github.com/apache/beam/pull/5079#issuecomment-381677564
 
 
   Thanks, @reuvenlax. Squashing and merging.




Issue Time Tracking
---

Worklog Id: (was: 91401)
Time Spent: 6h  (was: 5h 50m)

> support data type MAP
> -
>
> Key: BEAM-2990
> URL: https://issues.apache.org/jira/browse/BEAM-2990
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Support non-scalar types:
> MAP   collection of keys mapped to values
> ARRAY ordered, contiguous collection that may contain duplicates





Build failed in Jenkins: beam_PerformanceTests_MongoDBIO_IT #58

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Update Dataflow Development Container Version

[github] Add region to dataflowOptions as well.

[tgroh] Use Explicit PipelineOptions in Native Evaluators

--
[...truncated 53.20 KB...]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting for a 
server that matches ReadPreferenceServerSelector{readPreference=primary}. 
Client view of cluster state is {type=UNKNOWN, 
servers=[{address=35.224.18.180:27017, type=UNKNOWN, state=CONNECTING, 
exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, 
caused by {java.net.SocketTimeoutException: connect timed out}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.java:71)
at 
com.mongodb.binding.ClusterBinding.getReadConnectionSource(ClusterBinding.java:63)
at 
com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:89)
at 
com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:84)
at 
com.mongodb.operation.CommandReadOperation.execute(CommandReadOperation.java:55)
at com.mongodb.Mongo.execute(Mongo.java:772)
at com.mongodb.Mongo$2.execute(Mongo.java:759)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:130)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:124)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:114)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource.split(MongoDbIO.java:332)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.splitAndValidate(WorkerCustomSources.java:275)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplitTyped(WorkerCustomSources.java:197)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplitWithApiLimit(WorkerCustomSources.java:181)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplit(WorkerCustomSources.java:160)
at 
com.google.cloud.dataflow.worker.WorkerCustomSourceOperationExecutor.execute(WorkerCustomSourceOperationExecutor.java:75)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:381)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:353)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:284)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:134)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:114)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:101)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting for a 
server that matches ReadPreferenceServerSelector{readPreference=primary}. 
Client view of cluster state is {type=UNKNOWN, 
servers=[{address=35.224.18.180:27017, type=UNKNOWN, state=CONNECTING, 
exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, 
caused by {java.net.SocketTimeoutException: connect timed out}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.(ClusterBinding.java:71)
at 
com.mongodb.binding.ClusterBinding.getReadConnectionSource(ClusterBinding.java:63)
at 
com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:89)
at 

[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91389=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91389
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 16:37
Start Date: 16/Apr/18 16:37
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #4387: [BEAM-2732] Metrics 
rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#issuecomment-381656696
 
 
   Run Python PostCommit.




Issue Time Tracking
---

Worklog Id: (was: 91389)
Time Spent: 7.5h  (was: 7h 20m)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> e.g. logging and metrics keep state separately; state tracking should be 
> unified.





[jira] [Commented] (BEAM-4089) Report Nexmark runs to BQ dashboard for anomaly detection

2018-04-16 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439706#comment-16439706
 ] 

Kenneth Knowles commented on BEAM-4089:
---

Assigned to you just because you have already made the most progress on the runs. 
We can sync up on it.

> Report Nexmark runs to BQ dashboard for anomaly detection
> -
>
> Key: BEAM-4089
> URL: https://issues.apache.org/jira/browse/BEAM-4089
> Project: Beam
>  Issue Type: New Feature
>  Components: examples-nexmark
>Reporter: Kenneth Knowles
>Assignee: Etienne Chauchot
>Priority: Major
>






Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Spark_Gradle #106

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Update Dataflow Development Container Version

--
[...truncated 1.24 MB...]
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$$anonfun$7.apply(JavaStreamingContext.scala:627)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$$anonfun$7.apply(JavaStreamingContext.scala:626)
at scala.Option.getOrElse(Option.scala:121)
at 
org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:828)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$.getOrCreate(JavaStreamingContext.scala:626)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext.getOrCreate(JavaStreamingContext.scala)
at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:169)
at 
org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:123)
at 
org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:83)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:346)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:328)
at 
org.apache.beam.runners.spark.translation.streaming.CreateStreamTest.testFirstElementLate(CreateStreamTest.java:240)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:317)
at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy3.processTestClass(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:108)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 

Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #83

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[github] Add region to dataflowOptions as well.

[tgroh] Use Explicit PipelineOptions in Native Evaluators

--
[...truncated 19.48 MB...]
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/GroupByKey as step s16
Apr 16, 2018 5:22:56 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues as step s17
Apr 16, 2018 5:22:56 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow) as step s18
Apr 16, 2018 5:22:56 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForData/BatchViewOverrides.GroupByKeyAndSortValuesOnly as step s19
Apr 16, 2018 5:22:56 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/ParMultiDo(ToIsmRecordForMapLike) as step s20
Apr 16, 2018 5:22:56 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForSize as step s21
Apr 16, 2018 5:22:56 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/ParDo(ToIsmMetadataRecordForSize) as step s22
Apr 16, 2018 5:22:56 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForKeys as step s23
Apr 16, 2018 5:22:56 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/ParDo(ToIsmMetadataRecordForKey) as step s24
Apr 16, 2018 5:22:56 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/Flatten.PCollections as step s25
Apr 16, 2018 5:22:56 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/CreateDataflowView as step s26
Apr 16, 2018 5:22:56 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Partition 
input as step s27
Apr 16, 2018 5:22:56 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Group by 
partition as step s28
Apr 16, 2018 5:22:56 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Batch 
mutations together as step s29
Apr 16, 2018 5:22:56 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Write 
mutations to Spanner as step s30
Apr 16, 2018 5:22:56 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Staging pipeline description to 
gs://temp-storage-for-end-to-end-tests/spannerwriteit0testwrite-jenkins-0416172246-25558d21/output/results/staging/
Apr 16, 2018 5:22:56 PM org.apache.beam.runners.dataflow.util.PackageUtil 
tryStagePackage
INFO: Uploading <80355 bytes, hash 3ZGb2onzcFAGzUL0iD3mMw> to 
gs://temp-storage-for-end-to-end-tests/spannerwriteit0testwrite-jenkins-0416172246-25558d21/output/results/staging/pipeline-3ZGb2onzcFAGzUL0iD3mMw.pb

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testWrite STANDARD_OUT
Dataflow SDK version: 2.5.0-SNAPSHOT

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testWrite STANDARD_ERROR
Apr 16, 2018 5:22:58 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-04-16_10_22_57-12146727514874577247?project=apache-beam-testing

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testWrite STANDARD_OUT
Submitted job: 2018-04-16_10_22_57-12146727514874577247

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testWrite STANDARD_ERROR
Apr 16, 2018 5:22:58 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To cancel the job using the 'gcloud' tool, 

Build failed in Jenkins: beam_PerformanceTests_HadoopInputFormat #149

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Update Dataflow Development Container Version

[github] Add region to dataflowOptions as well.

[tgroh] Use Explicit PipelineOptions in Native Evaluators

--
[...truncated 148.52 KB...]
[INFO] Excluding com.google.cloud.bigdataoss:gcsio:jar:1.4.5 from the shaded 
jar.
[INFO] Excluding 
com.google.apis:google-api-services-cloudresourcemanager:jar:v1-rev6-1.22.0 
from the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.5.0-SNAPSHOT from 
the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.5.0-SNAPSHOT from the 
shaded jar.
[INFO] Excluding io.grpc:grpc-core:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.errorprone:error_prone_annotations:jar:2.0.15 from 
the shaded jar.
[INFO] Excluding io.grpc:grpc-context:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.instrumentation:instrumentation-api:jar:0.3.0 from 
the shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-bigquery:jar:v2-rev374-1.22.0 from the 
shaded jar.
[INFO] Excluding com.google.api:gax-grpc:jar:0.20.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.api:api-common:jar:1.0.0-rc2 from the shaded jar.
[INFO] Excluding com.google.api:gax:jar:1.3.1 from the shaded jar.
[INFO] Excluding org.threeten:threetenbp:jar:1.3.3 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-core-grpc:jar:1.2.0 from the 
shaded jar.
[INFO] Excluding com.google.apis:google-api-services-pubsub:jar:v1-rev10-1.22.0 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-iam-v1:jar:0.1.18 from the 
shaded jar.
[INFO] Excluding com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-protobuf:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-jackson:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-common-protos:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding io.grpc:grpc-auth:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-netty:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http2:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler-proxy:jar:4.1.8.Final from the shaded 
jar.
[INFO] Excluding io.netty:netty-codec-socks:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.grpc:grpc-stub:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-core:jar:1.0.2 from the shaded 
jar.
[INFO] Excluding org.json:json:jar:20160810 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-spanner:jar:0.20.0b-beta from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-cloud-spanner-v1:jar:0.1.11b 
from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:proto-google-cloud-spanner-admin-instance-v1:jar:0.1.11 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-cloud-spanner-v1:jar:0.1.11b 
from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:grpc-google-cloud-spanner-admin-database-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding 
com.google.api.grpc:grpc-google-cloud-spanner-admin-instance-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-longrunning-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-longrunning-v1:jar:0.1.11 
from the shaded jar.
[INFO] Excluding com.google.cloud.bigtable:bigtable-protos:jar:1.0.0-pre3 from 
the shaded jar.
[INFO] Excluding com.google.cloud.bigtable:bigtable-client-core:jar:1.0.0 from 
the shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-appengine:jar:0.7.0 from 
the shaded jar.
[INFO] Excluding io.opencensus:opencensus-contrib-grpc-util:jar:0.7.0 from the 
shaded jar.
[INFO] Excluding io.opencensus:opencensus-api:jar:0.7.0 from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:proto-google-cloud-spanner-admin-database-v1:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-common-protos:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding io.grpc:grpc-all:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-okhttp:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.squareup.okhttp:okhttp:jar:2.5.0 from the shaded jar.
[INFO] Excluding com.squareup.okio:okio:jar:1.6.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-lite:jar:1.2.0 from the shaded jar.

Build failed in Jenkins: beam_PerformanceTests_Python #1156

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Update Dataflow Development Container Version

[github] Add region to dataflowOptions as well.

[tgroh] Use Explicit PipelineOptions in Native Evaluators

--
[...truncated 71.67 KB...]
[INFO] ok   github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec   
0.038s
[INFO] ?
github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec/optimized [no 
test files]
[INFO] ok   github.com/apache/beam/sdks/go/pkg/beam/core/runtime/graphx 
0.045s
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/core/runtime/graphx/v1  
[no test files]
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/core/runtime/harness
[no test files]
[INFO] ?
github.com/apache/beam/sdks/go/pkg/beam/core/runtime/harness/init   [no 
test files]
[INFO] ?
github.com/apache/beam/sdks/go/pkg/beam/core/runtime/harness/session[no 
test files]
[INFO] ok   github.com/apache/beam/sdks/go/pkg/beam/core/typex  0.025s
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/core/util/dot   [no 
test files]
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/core/util/hooks [no 
test files]
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/core/util/ioutilx   
[no test files]
[INFO] ok   github.com/apache/beam/sdks/go/pkg/beam/core/util/protox
0.042s
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/core/util/reflectx  
[no test files]
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/core/util/symtab
[no test files]
[INFO] ok   github.com/apache/beam/sdks/go/pkg/beam/io/bigqueryio   0.052s
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/io/textio   [no 
test files]
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/io/textio/gcs   [no 
test files]
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/io/textio/local [no 
test files]
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/log [no test files]
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/model   [no test files]
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/model/fnexecution_v1
[no test files]
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/model/jobmanagement_v1  
[no test files]
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/model/pipeline_v1   
[no test files]
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/options/gcpopts [no 
test files]
[INFO] ?github.com/apache/beam/sdks/go/pkg/beam/options/jobopts [no 
test files]
[INFO] ok   github.com/apache/beam/sdks/go/pkg/beam/provision   0.059s
[INFO] 
[ERROR] 
[ERROR] -Exec.Err-
[ERROR] # github.com/apache/beam/sdks/go/pkg/beam/runners/dataflow
[ERROR] 
github.com/apache/beam/sdks/go/pkg/beam/runners/dataflow/dataflow.go:171:12: 
unknown field 'Region' in struct literal of type dataflowOptions
[ERROR] 
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Beam :: Parent .. SUCCESS [  8.932 s]
[INFO] Apache Beam :: SDKs :: Java :: Build Tools . SUCCESS [  8.486 s]
[INFO] Apache Beam :: Model ... SUCCESS [  0.198 s]
[INFO] Apache Beam :: Model :: Pipeline ... SUCCESS [ 24.023 s]
[INFO] Apache Beam :: Model :: Job Management . SUCCESS [ 11.817 s]
[INFO] Apache Beam :: Model :: Fn Execution ... SUCCESS [ 14.075 s]
[INFO] Apache Beam :: SDKs  SUCCESS [  0.383 s]
[INFO] Apache Beam :: SDKs :: Go .. FAILURE [02:21 min]
[INFO] Apache Beam :: SDKs :: Go :: Container . SKIPPED
[INFO] Apache Beam :: SDKs :: Java  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Core  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Fn Execution  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Extensions .. SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Google Cloud Platform Core SKIPPED
[INFO] Apache Beam :: Runners . SKIPPED
[INFO] Apache Beam :: Runners :: Core Construction Java ... SKIPPED
[INFO] Apache Beam :: Runners :: Core Java  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Harness . SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Container ... SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: IO .. SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: IO :: Amazon Web Services SKIPPED
[INFO] Apache Beam :: Runners :: Local Java Core .. SKIPPED
[INFO] Apache Beam :: Runners :: Direct Java .. SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: IO :: AMQP .. SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: IO :: Common  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: IO :: Cassandra . 

[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91426 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 18:26
Start Date: 16/Apr/18 18:26
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #4387: [BEAM-2732] Metrics 
rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#issuecomment-381702920
 
 
   Run Python PostCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91426)
Time Spent: 8h  (was: 7h 50m)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> e.g. logging and metrics keep state separately. State tracking should be 
> unified.
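The unification the issue asks for can be pictured with a small sketch: one thread-local tracker that both metrics and logging consult, so the "current state" is stored exactly once. This is a hypothetical illustration (the `StateTracker` class and its methods are invented here), not the Beam SDK's actual statesampler API:

```python
import threading
from contextlib import contextmanager

class StateTracker:
    """Single source of truth for 'what step is this thread in'.

    Both metrics and logging would read current() instead of each
    keeping their own copy of the state (hypothetical sketch).
    """

    def __init__(self):
        self._local = threading.local()
        self.counters = {}

    def current(self):
        return getattr(self._local, "state", None)

    @contextmanager
    def scoped(self, state):
        # Enter a state for this thread, restoring the previous
        # state on exit so nesting works.
        prev = self.current()
        self._local.state = state
        try:
            yield
        finally:
            self._local.state = prev

    def count(self, name, value=1):
        # Metrics are attributed to whatever state is current.
        key = (self.current(), name)
        self.counters[key] = self.counters.get(key, 0) + value

tracker = StateTracker()
with tracker.scoped("ParDo(MyFn)"):
    tracker.count("elements")
    tracker.count("elements")
print(tracker.counters)  # {('ParDo(MyFn)', 'elements'): 2}
```

With this shape, a logging handler could stamp each record with `tracker.current()` for free, which is the duplication the issue wants removed.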



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4070) Disable cython profiling by default

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ https://issues.apache.org/jira/browse/BEAM-4070?focusedWorklogId=91428&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91428 ]

ASF GitHub Bot logged work on BEAM-4070:


Author: ASF GitHub Bot
Created on: 16/Apr/18 18:29
Start Date: 16/Apr/18 18:29
Worklog Time Spent: 10m 
  Work Description: boyuanzz opened a new pull request #5134: [BEAM-4070]: 
Make cython: profile=False by default
URL: https://github.com/apache/beam/pull/5134
 
 
   Make cython profiling disabled by default
   [BEAM-4070](https://issues.apache.org/jira/browse/BEAM-4070)
   
   R: @aaltay 
   




Issue Time Tracking
---

Worklog Id: (was: 91428)
Time Spent: 10m
Remaining Estimate: 0h

> Disable cython profiling by default
> ---
>
> Key: BEAM-4070
> URL: https://issues.apache.org/jira/browse/BEAM-4070
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Enabling cython profiling adds some overhead.
> http://cython.readthedocs.io/en/latest/src/tutorial/profiling_tutorial.html
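The overhead is easy to reproduce in plain Python: installing even a no-op profile hook forces the interpreter to fire it on every call and return event, which is the same class of cost the Cython `profile=True` directive adds to compiled code. A rough sketch (absolute timings vary by machine; only the relative slowdown matters):

```python
import sys
import timeit

def inner(x):
    return x + 1

def work():
    # Many cheap function calls: the worst case for per-call
    # profiling hooks.
    total = 0
    for _ in range(1000):
        total = inner(total)
    return total

baseline = timeit.timeit(work, number=200)

# A do-nothing profile hook still has to be invoked on every
# call/return event, so the same workload slows down.
sys.setprofile(lambda frame, event, arg: None)
hooked = timeit.timeit(work, number=200)
sys.setprofile(None)

print(hooked > baseline)  # usually True: the hook costs time per call
```

This is why disabling the directive by default helps: code that is not being profiled stops paying the per-call tax.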





Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Spark_Gradle #105

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[github] Add region to dataflowOptions as well.

[tgroh] Use Explicit PipelineOptions in Native Evaluators

--
[...truncated 1.24 MB...]
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$$anonfun$7.apply(JavaStreamingContext.scala:627)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$$anonfun$7.apply(JavaStreamingContext.scala:626)
at scala.Option.getOrElse(Option.scala:121)
at 
org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:828)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$.getOrCreate(JavaStreamingContext.scala:626)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext.getOrCreate(JavaStreamingContext.scala)
at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:169)
at 
org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:123)
at 
org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:83)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:346)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:328)
at 
org.apache.beam.runners.spark.translation.streaming.CreateStreamTest.testFirstElementLate(CreateStreamTest.java:240)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:317)
at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy3.processTestClass(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:108)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at 

[jira] [Created] (BEAM-4089) Report Nexmark runs to BQ dashboard for anomaly detection

2018-04-16 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-4089:
-

 Summary: Report Nexmark runs to BQ dashboard for anomaly detection
 Key: BEAM-4089
 URL: https://issues.apache.org/jira/browse/BEAM-4089
 Project: Beam
  Issue Type: New Feature
  Components: examples-nexmark
Reporter: Kenneth Knowles
Assignee: Etienne Chauchot








Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1368

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Update Dataflow Development Container Version

--
[...truncated 4.35 KB...]
export PS1
fi
basename "$VIRTUAL_ENV"

# Make sure to unalias pydoc if it's already there
alias pydoc 2>/dev/null >/dev/null && unalias pydoc

pydoc () {
python -m pydoc "$@"
}

# This should detect bash and zsh, which have a hash command that must
# be called to get it to forget past commands.  Without forgetting
# past commands the $PATH changes we made may not be respected
if [ -n "${BASH-}" ] || [ -n "${ZSH_VERSION-}" ] ; then
hash -r 2>/dev/null
fi
cd sdks/python
pip install -e .[gcp,test]
Obtaining 
file://
Collecting avro<2.0.0,>=1.8.1 (from apache-beam==2.5.0.dev0)
:339:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  SNIMissingWarning
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Collecting crcmod<2.0,>=1.7 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Collecting dill==0.2.6 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Collecting grpcio<2,>=1.8 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
  Using cached 
https://files.pythonhosted.org/packages/0d/54/b647a6323be6526be27b2c90bb042769f1a7a6e59bd1a5f2eeb795bfece4/grpcio-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting hdfs<3.0.0,>=2.1.0 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer 

Build failed in Jenkins: beam_PostCommit_Python_Verify #4703

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Update Dataflow Development Container Version

--
Started by GitHub push by tgroh
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 96615a1c5dfc2e67c2d408cc4f589952036988e8 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 96615a1c5dfc2e67c2d408cc4f589952036988e8
Commit message: "Merge pull request #5085: Update Dataflow Development 
Container Version"
 > git rev-list --no-walk 0ccdd54e6fc82b484cd25b49d60892a807d4c7e6 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PostCommit_Python_Verify] $ /bin/bash -xe 
/tmp/jenkins8870824114056765258.sh
+ cd src
+ bash sdks/python/run_postcommit.sh

# pip install --user installation location.
LOCAL_PATH=$HOME/.local/bin/

# Remove any tox cache from previous workspace
# TODO(udim): Remove this line and add '-r' to tox invocation instead.
rm -rf sdks/python/target/.tox

# INFRA does not install these packages
pip install --user --upgrade virtualenv tox
/usr/local/lib/python2.7/dist-packages/pip/_vendor/urllib3/util/ssl_.py:339: 
SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  SNIMissingWarning
/usr/local/lib/python2.7/dist-packages/pip/_vendor/urllib3/util/ssl_.py:137: 
InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
/usr/local/lib/python2.7/dist-packages/pip/_vendor/urllib3/util/ssl_.py:137: 
InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Requirement already up-to-date: virtualenv in 
/home/jenkins/.local/lib/python2.7/site-packages (15.2.0)
/usr/local/lib/python2.7/dist-packages/pip/_vendor/urllib3/util/ssl_.py:137: 
InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Requirement already up-to-date: tox in 
/home/jenkins/.local/lib/python2.7/site-packages (3.0.0)
Requirement not upgraded as not directly required: py>=1.4.17 in 
/home/jenkins/.local/lib/python2.7/site-packages (from tox) (1.5.3)
Requirement not upgraded as not directly required: pluggy<1.0,>=0.3.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from tox) (0.6.0)
Requirement not upgraded as not directly required: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from tox) (1.11.0)
cheetah 2.4.4 requires Markdown>=2.0.1, which is not installed.
apache-beam 2.5.0.dev0 requires hdfs<3.0.0,>=2.1.0, which is not installed.
apache-beam 2.5.0.dev0 requires pytz>=2018.3, which is not installed.
apache-beam 2.5.0.dev0 has requirement grpcio<2,>=1.8, but you'll have grpcio 
1.4.0 which is incompatible.
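The grpcio conflict above is just a version-specifier check failing: installed 1.4.0 does not satisfy `<2,>=1.8`. A toy illustration of that check follows; pip itself uses the `packaging` library with full PEP 440 semantics, so this simplified parser is an assumption for illustration only:

```python
def _ver(v):
    # Parse "1.4.0" into (1, 4, 0) for tuple comparison.
    return tuple(int(part) for part in v.split("."))

def satisfies(version, spec):
    # A spec like "<2,>=1.8" holds only if every comma-separated
    # clause holds (simplified: only "<" and ">=" are handled).
    for clause in spec.split(","):
        if clause.startswith(">="):
            ok = _ver(version) >= _ver(clause[2:])
        elif clause.startswith("<"):
            ok = _ver(version) < _ver(clause[1:])
        else:
            raise ValueError("unsupported clause: " + clause)
        if not ok:
            return False
    return True

print(satisfies("1.4.0", "<2,>=1.8"))   # False: fails the >=1.8 clause
print(satisfies("1.11.0", "<2,>=1.8"))  # True: both clauses hold
```

So the cached grpcio-1.11.0 wheel pip downloads later in these logs is exactly the version range the requirement asks for.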

# Tox runs unit tests in a virtual environment
${LOCAL_PATH}/tox -e ALL -c sdks/python/tox.ini
GLOB sdist-make: 

[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91421&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91421 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 17:55
Start Date: 16/Apr/18 17:55
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #4387: [BEAM-2732] Metrics 
rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#issuecomment-381693446
 
 
   Retest this please.




Issue Time Tracking
---

Worklog Id: (was: 91421)
Time Spent: 7h 50m  (was: 7h 40m)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> e.g. logging and metrics keep state separately. State tracking should be 
> unified.





[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91420&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91420 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 17:55
Start Date: 16/Apr/18 17:55
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #4387: [BEAM-2732] Metrics 
rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#issuecomment-381668215
 
 
   Retest this please




Issue Time Tracking
---

Worklog Id: (was: 91420)
Time Spent: 7h 40m  (was: 7.5h)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> e.g. logging and metrics keep state separately. State tracking should be 
> unified.





Jenkins build is back to normal : beam_PerformanceTests_XmlIOIT_HDFS #56

2018-04-16 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PerformanceTests_XmlIOIT #152

2018-04-16 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1366

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[github] Add region to dataflowOptions as well.

[tgroh] Use Explicit PipelineOptions in Native Evaluators

--
[...truncated 4.35 KB...]
export PS1
fi
basename "$VIRTUAL_ENV"

# Make sure to unalias pydoc if it's already there
alias pydoc 2>/dev/null >/dev/null && unalias pydoc

pydoc () {
python -m pydoc "$@"
}

# This should detect bash and zsh, which have a hash command that must
# be called to get it to forget past commands.  Without forgetting
# past commands the $PATH changes we made may not be respected
if [ -n "${BASH-}" ] || [ -n "${ZSH_VERSION-}" ] ; then
hash -r 2>/dev/null
fi
cd sdks/python
pip install -e .[gcp,test]
Obtaining 
file://
Collecting avro<2.0.0,>=1.8.1 (from apache-beam==2.5.0.dev0)
:339:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  SNIMissingWarning
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Collecting crcmod<2.0,>=1.7 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Collecting dill==0.2.6 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Collecting grpcio<2,>=1.8 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
  Using cached 
https://files.pythonhosted.org/packages/0d/54/b647a6323be6526be27b2c90bb042769f1a7a6e59bd1a5f2eeb795bfece4/grpcio-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting hdfs<3.0.0,>=2.1.0 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 

Build failed in Jenkins: beam_PostCommit_Python_Verify #4702

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[github] Add region to dataflowOptions as well.

[tgroh] Use Explicit PipelineOptions in Native Evaluators

--
Started by GitHub push by tgroh
Started by GitHub push by tgroh
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 0ccdd54e6fc82b484cd25b49d60892a807d4c7e6 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 0ccdd54e6fc82b484cd25b49d60892a807d4c7e6
Commit message: "Merge pull request #5125: Use Explicit PipelineOptions in 
Native Evaluators"
 > git rev-list --no-walk 9e3e9c4d0a0dc1574c8956c7f8379b37ba262cb2 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PostCommit_Python_Verify] $ /bin/bash -xe 
/tmp/jenkins3781398443548815187.sh
+ cd src
+ bash sdks/python/run_postcommit.sh

# pip install --user installation location.
LOCAL_PATH=$HOME/.local/bin/

# Remove any tox cache from previous workspace
# TODO(udim): Remove this line and add '-r' to tox invocation instead.
rm -rf sdks/python/target/.tox

# INFRA does not install these packages
pip install --user --upgrade virtualenv tox
/usr/local/lib/python2.7/dist-packages/pip/_vendor/urllib3/util/ssl_.py:339: 
SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  SNIMissingWarning
/usr/local/lib/python2.7/dist-packages/pip/_vendor/urllib3/util/ssl_.py:137: 
InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
/usr/local/lib/python2.7/dist-packages/pip/_vendor/urllib3/util/ssl_.py:137: 
InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Requirement already up-to-date: virtualenv in 
/home/jenkins/.local/lib/python2.7/site-packages (15.2.0)
/usr/local/lib/python2.7/dist-packages/pip/_vendor/urllib3/util/ssl_.py:137: 
InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Requirement already up-to-date: tox in 
/home/jenkins/.local/lib/python2.7/site-packages (3.0.0)
Requirement not upgraded as not directly required: py>=1.4.17 in 
/home/jenkins/.local/lib/python2.7/site-packages (from tox) (1.5.3)
Requirement not upgraded as not directly required: pluggy<1.0,>=0.3.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from tox) (0.6.0)
Requirement not upgraded as not directly required: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from tox) (1.11.0)
cheetah 2.4.4 requires Markdown>=2.0.1, which is not installed.
apache-beam 2.5.0.dev0 requires hdfs<3.0.0,>=2.1.0, which is not installed.
apache-beam 2.5.0.dev0 requires pytz>=2018.3, which is not installed.
apache-beam 2.5.0.dev0 has requirement grpcio<2,>=1.8, but you'll have grpcio 
1.4.0 which is incompatible.

# Tox runs unit tests in a virtual environment

Build failed in Jenkins: beam_PostRelease_NightlySnapshot #196

2018-04-16 Thread Apache Jenkins Server
See 


--
[...truncated 2.87 MB...]
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-compiler-manager/2.8.2/plexus-compiler-manager-2.8.2.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/commons-io/commons-io/2.5/commons-io-2.5.jar
 (209 kB at 5.5 MB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-compiler-javac/2.8.2/plexus-compiler-javac-2.8.2.jar
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/ow2/asm/asm/6.0_BETA/asm-6.0_BETA.jar 
(56 kB at 1.1 MB/s)
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-compiler-manager/2.8.2/plexus-compiler-manager-2.8.2.jar
 (4.7 kB at 87 kB/s)
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-compiler-api/2.8.2/plexus-compiler-api-2.8.2.jar
 (26 kB at 480 kB/s)
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-compiler-javac/2.8.2/plexus-compiler-javac-2.8.2.jar
 (20 kB at 340 kB/s)
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/com/thoughtworks/qdox/qdox/2.0-M7/qdox-2.0-M7.jar
 (315 kB at 5.1 MB/s)
[INFO] Changes detected - recompiling the module!
[WARNING] File encoding has not been set, using platform encoding UTF-8, i.e. 
build is platform dependent!
[INFO] Compiling 31 source files to 
/tmp/groovy-generated-2797868348171501982-tmpdir/word-count-beam/target/classes
[INFO] /tmp/groovy-generated-2797868348171501982-tmpdir/word-count-beam/src/main/java/org/apache/beam/examples/complete/game/utils/WriteToText.java: /tmp/groovy-generated-2797868348171501982-tmpdir/word-count-beam/src/main/java/org/apache/beam/examples/complete/game/utils/WriteToText.java uses unchecked or unsafe operations.
[INFO] /tmp/groovy-generated-2797868348171501982-tmpdir/word-count-beam/src/main/java/org/apache/beam/examples/complete/game/utils/WriteToText.java: Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- exec-maven-plugin:1.6.0:java (default-cli) @ word-count-beam ---
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-toolchain/2.2.1/maven-toolchain-2.2.1.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-toolchain/2.2.1/maven-toolchain-2.2.1.pom
 (3.3 kB at 134 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting-api/2.2.1/maven-reporting-api-2.2.1.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting-api/2.2.1/maven-reporting-api-2.2.1.pom
 (1.9 kB at 81 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting/2.2.1/maven-reporting-2.2.1.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting/2.2.1/maven-reporting-2.2.1.pom
 (1.4 kB at 63 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/doxia/doxia-sink-api/1.1/doxia-sink-api-1.1.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/doxia/doxia-sink-api/1.1/doxia-sink-api-1.1.pom
 (2.0 kB at 85 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/doxia/doxia/1.1/doxia-1.1.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/doxia/doxia/1.1/doxia-1.1.pom
 (15 kB at 632 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/doxia/doxia-logging-api/1.1/doxia-logging-api-1.1.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/apache/maven/doxia/doxia-logging-api/1.1/doxia-logging-api-1.1.pom
 (1.6 kB at 68 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-container-default/1.0-alpha-30/plexus-container-default-1.0-alpha-30.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-container-default/1.0-alpha-30/plexus-container-default-1.0-alpha-30.pom
 (3.5 kB at 151 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-containers/1.0-alpha-30/plexus-containers-1.0-alpha-30.pom
[INFO] Downloaded from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-containers/1.0-alpha-30/plexus-containers-1.0-alpha-30.pom
 (1.9 kB at 82 kB/s)
[INFO] Downloading from central: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/1.4.5/plexus-utils-1.4.5.pom
[INFO] Downloaded from central: 

Jenkins build is back to normal : beam_PerformanceTests_Compressed_TextIOIT_HDFS #56

2018-04-16 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_XmlIOIT #151

2018-04-16 Thread Apache Jenkins Server
See 


--
[...truncated 28.45 KB...]
[INFO] Excluding 
com.google.apis:google-api-services-storage:jar:v1-rev71-1.22.0 from the shaded 
jar.
[INFO] Excluding com.google.auth:google-auth-library-credentials:jar:0.7.1 from 
the shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-oauth2-http:jar:0.7.1 from 
the shaded jar.
[INFO] Excluding com.google.cloud.bigdataoss:util:jar:1.4.5 from the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client-java6:jar:1.22.0 from 
the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client-jackson2:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.oauth-client:google-oauth-client-java6:jar:1.22.0 
from the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.5.0-SNAPSHOT from 
the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.5.0-SNAPSHOT from the 
shaded jar.
[INFO] Excluding io.grpc:grpc-core:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.errorprone:error_prone_annotations:jar:2.0.15 from 
the shaded jar.
[INFO] Excluding io.grpc:grpc-context:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.instrumentation:instrumentation-api:jar:0.3.0 from 
the shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-bigquery:jar:v2-rev374-1.22.0 from the 
shaded jar.
[INFO] Excluding com.google.api:gax-grpc:jar:0.20.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.api:api-common:jar:1.0.0-rc2 from the shaded jar.
[INFO] Excluding com.google.auto.value:auto-value:jar:1.5.3 from the shaded jar.
[INFO] Excluding com.google.api:gax:jar:1.3.1 from the shaded jar.
[INFO] Excluding org.threeten:threetenbp:jar:1.3.3 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-core-grpc:jar:1.2.0 from the 
shaded jar.
[INFO] Excluding com.google.protobuf:protobuf-java-util:jar:3.2.0 from the 
shaded jar.
[INFO] Excluding com.google.code.gson:gson:jar:2.7 from the shaded jar.
[INFO] Excluding com.google.apis:google-api-services-pubsub:jar:v1-rev10-1.22.0 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-iam-v1:jar:0.1.18 from the 
shaded jar.
[INFO] Excluding com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-protobuf:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-jackson:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-common-protos:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding io.grpc:grpc-auth:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-netty:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http2:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler-proxy:jar:4.1.8.Final from the shaded 
jar.
[INFO] Excluding io.netty:netty-codec-socks:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-buffer:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-common:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-transport:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-resolver:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-codec:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.grpc:grpc-stub:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-all:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-okhttp:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.squareup.okhttp:okhttp:jar:2.5.0 from the shaded jar.
[INFO] Excluding com.squareup.okio:okio:jar:1.6.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-lite:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-nano:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.protobuf.nano:protobuf-javanano:jar:3.0.0-alpha-5 
from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-core:jar:1.0.2 from the shaded 
jar.
[INFO] Excluding org.json:json:jar:20160810 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-spanner:jar:0.20.0b-beta from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-cloud-spanner-v1:jar:0.1.11b 
from the shaded jar.
[INFO] Excluding 

[jira] [Commented] (BEAM-4016) Direct runner incorrect lifecycle, @SplitRestriction should execute after @Setup on SplittableDoFn

2018-04-16 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-4016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439429#comment-16439429
 ] 

Ismaël Mejía commented on BEAM-4016:


I tried that workaround, but it ends up calling the setup method more times 
than expected. Any other ideas? Can you take a look, or help me reassign this 
to someone who can? (I have the impression [~tgroh] is not available, and I 
cannot take a serious look into this for now.)

> Direct runner incorrect lifecycle, @SplitRestriction should execute after 
> @Setup on SplittableDoFn
> --
>
> Key: BEAM-4016
> URL: https://issues.apache.org/jira/browse/BEAM-4016
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.4.0
>Reporter: Ismaël Mejía
>Assignee: Thomas Groh
>Priority: Major
> Attachments: sdf-splitrestriction-lifeycle-test.patch
>
>
> The method annotated with @SplitRestriction is where we can define the 
> RestrictionTrackers (splits) in advance in an SDF. It makes sense to execute 
> this after the @Setup method, given that connections are usually established 
> in @Setup and can then be used to query the different data stores about 
> their partitioning strategy. I added a test for this in the 
> SplittableDoFnTest.SDFWithLifecycle test.
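As an illustration of the ordering this issue asks for, here is a small, self-contained sketch. The @Setup/@SplitRestriction annotations and the run() loop below are local stand-ins, not Beam's API; they only demonstrate the contract that setup must run first so a connection opened there is usable when computing splits:

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class LifecycleDemo {
    // Local stand-ins for Beam's lifecycle annotations.
    @Retention(RetentionPolicy.RUNTIME) @interface Setup {}
    @Retention(RetentionPolicy.RUNTIME) @interface SplitRestriction {}

    static class MySplittableDoFn {
        final List<String> calls = new ArrayList<>();
        Object connection;  // e.g. a data-store session opened during setup

        @Setup
        void setup() {
            connection = new Object();
            calls.add("setup");
        }

        @SplitRestriction
        void splitRestriction() {
            // Relies on the connection established in @Setup.
            if (connection == null) {
                throw new IllegalStateException("@Setup was not called first");
            }
            calls.add("splitRestriction");
        }
    }

    // A correct runner invokes the @Setup method before @SplitRestriction.
    static List<String> run(MySplittableDoFn fn) {
        invokeAnnotated(fn, Setup.class);
        invokeAnnotated(fn, SplitRestriction.class);
        return fn.calls;
    }

    static void invokeAnnotated(Object target, Class<? extends Annotation> ann) {
        try {
            for (Method m : target.getClass().getDeclaredMethods()) {
                if (m.isAnnotationPresent(ann)) {
                    m.setAccessible(true);
                    m.invoke(target);
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The attached sdf-splitrestriction-lifeycle-test.patch tests the real SDK; this sketch just makes the expected call order explicit.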



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Python_Verify #4701

2018-04-16 Thread Apache Jenkins Server
See 


--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 9e3e9c4d0a0dc1574c8956c7f8379b37ba262cb2 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 9e3e9c4d0a0dc1574c8956c7f8379b37ba262cb2
Commit message: "This closes #5028"
 > git rev-list --no-walk 9e3e9c4d0a0dc1574c8956c7f8379b37ba262cb2 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PostCommit_Python_Verify] $ /bin/bash -xe 
/tmp/jenkins1068231309557989319.sh
+ cd src
+ bash sdks/python/run_postcommit.sh

# pip install --user installation location.
LOCAL_PATH=$HOME/.local/bin/

# Remove any tox cache from previous workspace
# TODO(udim): Remove this line and add '-r' to tox invocation instead.
rm -rf sdks/python/target/.tox

# INFRA does not install these packages
pip install --user --upgrade virtualenv tox
/usr/local/lib/python2.7/dist-packages/pip/_vendor/urllib3/util/ssl_.py:339: 
SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  SNIMissingWarning
/usr/local/lib/python2.7/dist-packages/pip/_vendor/urllib3/util/ssl_.py:137: 
InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Requirement already up-to-date: virtualenv in 
/home/jenkins/.local/lib/python2.7/site-packages (15.2.0)
Requirement already up-to-date: tox in 
/home/jenkins/.local/lib/python2.7/site-packages (3.0.0)
Requirement not upgraded as not directly required: py>=1.4.17 in 
/home/jenkins/.local/lib/python2.7/site-packages (from tox) (1.5.3)
Requirement not upgraded as not directly required: pluggy<1.0,>=0.3.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from tox) (0.6.0)
Requirement not upgraded as not directly required: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from tox) (1.11.0)
cheetah 2.4.4 requires Markdown>=2.0.1, which is not installed.
apache-beam 2.5.0.dev0 requires hdfs<3.0.0,>=2.1.0, which is not installed.
apache-beam 2.5.0.dev0 requires pytz>=2018.3, which is not installed.
apache-beam 2.5.0.dev0 has requirement grpcio<2,>=1.8, but you'll have grpcio 
1.4.0 which is incompatible.

# Tox runs unit tests in a virtual environment
${LOCAL_PATH}/tox -e ALL -c sdks/python/tox.ini
GLOB sdist-make: 

ERROR: invocation failed (exit code 1), logfile: 

ERROR: actionid: tox
msg: packaging
cmdargs: ['/usr/bin/python', 
local('
 'sdist', '--formats=zip', '--dist-dir', 
local('

/usr/local/lib/python2.7/dist-packages/setuptools/dist.py:397: UserWarning: 
Normalizing '2.5.0.dev' to '2.5.0.dev0'
  normalized_version,
Regenerating common_urns module.
running sdist
:51:
 UserWarning: Installing grpcio-tools is recommended for development.
  warnings.warn('Installing grpcio-tools is recommended for development.')
WARNING:root:Installing grpcio-tools into 

[jira] [Work logged] (BEAM-4049) Improve write throughput of CassandraIO

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4049?focusedWorklogId=91362=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91362
 ]

ASF GitHub Bot logged work on BEAM-4049:


Author: ASF GitHub Bot
Created on: 16/Apr/18 15:16
Start Date: 16/Apr/18 15:16
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5112: [BEAM-4049] Improve 
CassandraIO write throughput by performing async queries
URL: https://github.com/apache/beam/pull/5112#issuecomment-381622622
 
 
   @adejanovski 
   > I'm not sure why the tests are failing in Jenkins now since I just changed 
the comment style on CONCURRENT_ASYNC_QUERIES.
   Maybe it's unrelated to my push?
   
   I don't know if that is the case here, but note for the future that the 
Jenkins build sometimes lands on the misconfigured beam5 machine, so errors 
can be unrelated to the code.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91362)
Time Spent: 3h 50m  (was: 3h 40m)

> Improve write throughput of CassandraIO
> ---
>
> Key: BEAM-4049
> URL: https://issues.apache.org/jira/browse/BEAM-4049
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-cassandra
>Affects Versions: 2.4.0
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Major
>  Labels: performance
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The CassandraIO currently uses the mapper to perform writes in a synchronous 
> fashion. 
> This means that writes are serialized, which is a very suboptimal way of 
> writing to Cassandra.
> The IO should use the saveAsync() method instead of save(), and should wait 
> for completion whenever 100 queries are in flight, in order to avoid 
> overwhelming clusters.
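As a rough sketch of that pattern (not the actual CassandraIO code; the saveAsync function below is a stand-in for the Cassandra mapper's saveAsync(), injected as a plain java.util.function.Function), writes are issued asynchronously and flushed whenever 100 queries are in flight:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class BoundedAsyncWriter<T> {
    // Matches the CONCURRENT_ASYNC_QUERIES constant discussed in the PR.
    static final int CONCURRENT_ASYNC_QUERIES = 100;

    private final Function<T, CompletableFuture<Void>> saveAsync;
    private final List<CompletableFuture<Void>> inFlight = new ArrayList<>();

    BoundedAsyncWriter(Function<T, CompletableFuture<Void>> saveAsync) {
        this.saveAsync = saveAsync;
    }

    void write(T entity) {
        inFlight.add(saveAsync.apply(entity));
        // Once 100 queries are in flight, wait for all of them to complete
        // so the cluster is not overwhelmed.
        if (inFlight.size() >= CONCURRENT_ASYNC_QUERIES) {
            flush();
        }
    }

    void flush() {
        CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
        inFlight.clear();
    }
}
```

In the real IO the futures come from the Cassandra mapper rather than a Function, but the bounded in-flight window is the same idea.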





[jira] [Work logged] (BEAM-3914) 'Unzip' flattens before performing fusion

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3914?focusedWorklogId=91379=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91379
 ]

ASF GitHub Bot logged work on BEAM-3914:


Author: ASF GitHub Bot
Created on: 16/Apr/18 16:10
Start Date: 16/Apr/18 16:10
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #4977: [BEAM-3914] 
Deduplicate Unzipped Flattens after Pipeline Fusion
URL: https://github.com/apache/beam/pull/4977#issuecomment-381660211
 
 
   Looks like this needs a rebase.




Issue Time Tracking
---

Worklog Id: (was: 91379)
Time Spent: 1h 50m  (was: 1h 40m)

> 'Unzip' flattens before performing fusion
> -
>
> Key: BEAM-3914
> URL: https://issues.apache.org/jira/browse/BEAM-3914
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>Priority: Major
>  Labels: portability
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> This consists of duplicating the nodes downstream of a flatten that exist 
> within an environment, and reintroducing the flatten immediately upstream of 
> a runner-executed transform (the flatten itself should be executed within 
> the runner).





[beam] branch master updated (0ccdd54 -> 96615a1)

2018-04-16 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 0ccdd54  Merge pull request #5125: Use Explicit PipelineOptions in 
Native Evaluators
 add 748f9e5  Update Dataflow Development Container Version
 new 96615a1  Merge pull request #5085: Update Dataflow Development 
Container Version

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 runners/google-cloud-dataflow-java/build.gradle | 2 +-
 runners/google-cloud-dataflow-java/pom.xml  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
tg...@apache.org.


[beam] 01/01: Merge pull request #5085: Update Dataflow Development Container Version

2018-04-16 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 96615a1c5dfc2e67c2d408cc4f589952036988e8
Merge: 0ccdd54 748f9e5
Author: Thomas Groh 
AuthorDate: Mon Apr 16 09:16:12 2018 -0700

Merge pull request #5085: Update Dataflow Development Container Version

 runners/google-cloud-dataflow-java/build.gradle | 2 +-
 runners/google-cloud-dataflow-java/pom.xml  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)




Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1367

2018-04-16 Thread Apache Jenkins Server
See 


--
[...truncated 4.30 KB...]
PS1="(`basename \"$VIRTUAL_ENV\"`) $PS1"
fi
export PS1
fi
basename "$VIRTUAL_ENV"

# Make sure to unalias pydoc if it's already there
alias pydoc 2>/dev/null >/dev/null && unalias pydoc

pydoc () {
python -m pydoc "$@"
}

# This should detect bash and zsh, which have a hash command that must
# be called to get it to forget past commands.  Without forgetting
# past commands the $PATH changes we made may not be respected
if [ -n "${BASH-}" ] || [ -n "${ZSH_VERSION-}" ] ; then
hash -r 2>/dev/null
fi
cd sdks/python
pip install -e .[gcp,test]
Obtaining 
file://
Collecting avro<2.0.0,>=1.8.1 (from apache-beam==2.5.0.dev0)
:339:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  SNIMissingWarning
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Collecting crcmod<2.0,>=1.7 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Collecting dill==0.2.6 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Collecting grpcio<2,>=1.8 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
  Using cached 
https://files.pythonhosted.org/packages/0d/54/b647a6323be6526be27b2c90bb042769f1a7a6e59bd1a5f2eeb795bfece4/grpcio-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting hdfs<3.0.0,>=2.1.0 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Collecting httplib2<0.10,>=0.8 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to 

[jira] [Work logged] (BEAM-3792) Python submits portable pipelines to the Flink-served endpoint.

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3792?focusedWorklogId=91381=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91381
 ]

ASF GitHub Bot logged work on BEAM-3792:


Author: ASF GitHub Bot
Created on: 16/Apr/18 16:13
Start Date: 16/Apr/18 16:13
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #4811: [BEAM-3792] Allow 
manual specification of external address for ULR.
URL: https://github.com/apache/beam/pull/4811#issuecomment-381660921
 
 
   Jenkins: retest this please. 
   
   Error warning: no files found matching 'protoc_deps.py' seems unrelated.




Issue Time Tracking
---

Worklog Id: (was: 91381)
Time Spent: 40m  (was: 0.5h)

> Python submits portable pipelines to the Flink-served endpoint.
> ---
>
> Key: BEAM-3792
> URL: https://issues.apache.org/jira/browse/BEAM-3792
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-4049) Improve write throughput of CassandraIO

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4049?focusedWorklogId=91351=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91351
 ]

ASF GitHub Bot logged work on BEAM-4049:


Author: ASF GitHub Bot
Created on: 16/Apr/18 14:40
Start Date: 16/Apr/18 14:40
Worklog Time Spent: 10m 
  Work Description: echauchot commented on issue #5112: [BEAM-4049] Improve 
CassandraIO write throughput by performing async queries
URL: https://github.com/apache/beam/pull/5112#issuecomment-381622622
 
 
   @adejanovski 
   > I'm not sure why the tests are failing in Jenkins now since I just changed 
the comment style on CONCURRENT_ASYNC_QUERIES.
   Maybe it's unrelated to my push?
   I don't know if that is the case here, but note for the future that the 
Jenkins build sometimes lands on the misconfigured beam5 machine, so errors 
can be unrelated to the code.




Issue Time Tracking
---

Worklog Id: (was: 91351)
Time Spent: 3h 40m  (was: 3.5h)

> Improve write throughput of CassandraIO
> ---
>
> Key: BEAM-4049
> URL: https://issues.apache.org/jira/browse/BEAM-4049
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-cassandra
>Affects Versions: 2.4.0
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Major
>  Labels: performance
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> The CassandraIO currently uses the mapper to perform writes in a synchronous 
> fashion. 
> This means that writes are serialized, which is a very suboptimal way of 
> writing to Cassandra.
> The IO should use the saveAsync() method instead of save(), and should wait 
> for completion whenever 100 queries are in flight, in order to avoid 
> overwhelming clusters.





[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91376=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91376
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 16:00
Start Date: 16/Apr/18 16:00
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #4387: [BEAM-2732] Metrics 
rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#issuecomment-381656696
 
 
   Run Python PostCommit.




Issue Time Tracking
---

Worklog Id: (was: 91376)
Time Spent: 7h 10m  (was: 7h)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> e.g. logging and metrics keep state separately; state tracking should be 
> unified.





[jira] [Work logged] (BEAM-3914) 'Unzip' flattens before performing fusion

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3914?focusedWorklogId=91380=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91380
 ]

ASF GitHub Bot logged work on BEAM-3914:


Author: ASF GitHub Bot
Created on: 16/Apr/18 16:12
Start Date: 16/Apr/18 16:12
Worklog Time Spent: 10m 
  Work Description: tgroh commented on issue #4977: [BEAM-3914] Deduplicate 
Unzipped Flattens after Pipeline Fusion
URL: https://github.com/apache/beam/pull/4977#issuecomment-381660805
 
 
   Done. Comment update was the only thing off.
   
   On Mon, Apr 16, 2018 at 9:11 AM Robert Bradshaw 
   wrote:
   
   > Looks like this needs a rebase.
   >
   > —
   > You are receiving this because you authored the thread.
   > Reply to this email directly, view it on GitHub
   > , or mute
   > the thread
   > 

   > .
   >
   




Issue Time Tracking
---

Worklog Id: (was: 91380)
Time Spent: 2h  (was: 1h 50m)

> 'Unzip' flattens before performing fusion
> -
>
> Key: BEAM-3914
> URL: https://issues.apache.org/jira/browse/BEAM-3914
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>Priority: Major
>  Labels: portability
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> This consists of duplicating the nodes downstream of a flatten that exist 
> within an environment, and reintroducing the flatten immediately upstream of 
> a runner-executed transform (the flatten itself should be executed within 
> the runner).





Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1365

2018-04-16 Thread Apache Jenkins Server
See 


--
[...truncated 1.02 KB...]
 > git checkout -f 9e3e9c4d0a0dc1574c8956c7f8379b37ba262cb2
Commit message: "This closes #5028"
 > git rev-list --no-walk 9e3e9c4d0a0dc1574c8956c7f8379b37ba262cb2 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PostCommit_Python_ValidatesRunner_Dataflow] $ /bin/bash -xe 
/tmp/jenkins6376754621892826483.sh
+ cd src
+ bash sdks/python/run_validatesrunner.sh

# pip install --user installation location.
LOCAL_PATH=$HOME/.local/bin/

# INFRA does not install virtualenv
pip install virtualenv --user
Requirement already satisfied: virtualenv in 
/home/jenkins/.local/lib/python2.7/site-packages (15.2.0)
cheetah 2.4.4 requires Markdown>=2.0.1, which is not installed.
apache-beam 2.5.0.dev0 requires hdfs<3.0.0,>=2.1.0, which is not installed.
apache-beam 2.5.0.dev0 requires pytz>=2018.3, which is not installed.
apache-beam 2.5.0.dev0 has requirement grpcio<2,>=1.8, but you'll have grpcio 
1.4.0 which is incompatible.

# Virtualenv for the rest of the script to run setup & e2e tests
${LOCAL_PATH}/virtualenv sdks/python
New python executable in 

Installing setuptools, pip, wheel...done.
. sdks/python/bin/activate
# This file must be used with "source bin/activate" *from bash*
# you cannot run it directly

deactivate () {
    unset -f pydoc >/dev/null 2>&1

    # reset old environment variables
    # ! [ -z ${VAR+_} ] returns true if VAR is declared at all
    if ! [ -z "${_OLD_VIRTUAL_PATH+_}" ] ; then
        PATH="$_OLD_VIRTUAL_PATH"
        export PATH
        unset _OLD_VIRTUAL_PATH
    fi
    if ! [ -z "${_OLD_VIRTUAL_PYTHONHOME+_}" ] ; then
        PYTHONHOME="$_OLD_VIRTUAL_PYTHONHOME"
        export PYTHONHOME
        unset _OLD_VIRTUAL_PYTHONHOME
    fi

    # This should detect bash and zsh, which have a hash command that must
    # be called to get it to forget past commands.  Without forgetting
    # past commands the $PATH changes we made may not be respected
    if [ -n "${BASH-}" ] || [ -n "${ZSH_VERSION-}" ] ; then
        hash -r 2>/dev/null
    fi

    if ! [ -z "${_OLD_VIRTUAL_PS1+_}" ] ; then
        PS1="$_OLD_VIRTUAL_PS1"
        export PS1
        unset _OLD_VIRTUAL_PS1
    fi

    unset VIRTUAL_ENV
    if [ ! "${1-}" = "nondestructive" ] ; then
        # Self destruct!
        unset -f deactivate
    fi
}

# unset irrelevant variables
deactivate nondestructive

VIRTUAL_ENV="
export VIRTUAL_ENV

_OLD_VIRTUAL_PATH="$PATH"
PATH="$VIRTUAL_ENV/bin:$PATH"
export PATH

# unset PYTHONHOME if set
if ! [ -z "${PYTHONHOME+_}" ] ; then
    _OLD_VIRTUAL_PYTHONHOME="$PYTHONHOME"
    unset PYTHONHOME
fi

if [ -z "${VIRTUAL_ENV_DISABLE_PROMPT-}" ] ; then
    _OLD_VIRTUAL_PS1="$PS1"
    if [ "x" != x ] ; then
        PS1="$PS1"
    else
        PS1="(`basename \"$VIRTUAL_ENV\"`) $PS1"
    fi
    export PS1
fi
basename "$VIRTUAL_ENV"

# Make sure to unalias pydoc if it's already there
alias pydoc 2>/dev/null >/dev/null && unalias pydoc

pydoc () {
    python -m pydoc "$@"
}

# This should detect bash and zsh, which have a hash command that must
# be called to get it to forget past commands.  Without forgetting
# past commands the $PATH changes we made may not be respected
if [ -n "${BASH-}" ] || [ -n "${ZSH_VERSION-}" ] ; then
    hash -r 2>/dev/null
fi
cd sdks/python
pip install -e .[gcp,test]
Obtaining 
file://
Collecting avro<2.0.0,>=1.8.1 (from apache-beam==2.5.0.dev0)
:339:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  SNIMissingWarning
:137:
 InsecurePlatformWarning: A true SSLContext object is not 

[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91375&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91375
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 16:00
Start Date: 16/Apr/18 16:00
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #4387: [BEAM-2732] Metrics 
rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#issuecomment-381656611
 
 
   Rebased change.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91375)
Time Spent: 7h  (was: 6h 50m)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> e.g. logging and metrics keep state separately. State tracking should be 
> unified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3425) CassandraIO fails to estimate size: Codec not found for requested operation: [varchar <-> java.lang.Long]

2018-04-16 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-3425:
--

Assignee: Alexander Dejanovski  (was: Alexey Romanenko)

> CassandraIO fails to estimate size: Codec not found for requested operation: 
> [varchar <-> java.lang.Long]
> -
>
> Key: BEAM-3425
> URL: https://issues.apache.org/jira/browse/BEAM-3425
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-cassandra
>Reporter: Eugene Kirpichov
>Assignee: Alexander Dejanovski
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> See exception in 
> https://stackoverflow.com/questions/48090668/how-to-increase-dataflow-read-parallelism-from-cassandra/48131264#48131264
>  .
> The exception comes from 
> https://github.com/apache/beam/blob/master/sdks/java/io/cassandra/src/main/java/org/apache/beam/sdk/io/cassandra/CassandraServiceImpl.java#L279
>  , where I suppose "range_start" and "range_end" are really varchar, but the 
> code expects them to be long.
> Indeed they are varchar: 
> https://github.com/apache/cassandra/blob/4c80eeece37d79f434078224a0504400ae10a20d/src/java/org/apache/cassandra/db/SystemKeyspace.java#L238
>  and have been for at least the past 3 years.
> However really they seem to be storing longs: 
> https://github.com/apache/cassandra/blob/95b43b195e4074533100f863344c182a118a8b6c/src/java/org/apache/cassandra/hadoop/cql3/CqlInputFormat.java#L229
> So I guess all that needs to be fixed is adding a Long.parseLong.
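The one-line fix proposed above can be sketched as follows. This is illustrative only: the helper and its name are hypothetical, and the actual CassandraServiceImpl accessor names may differ.

```java
public class TokenRangeParse {
    // system.size_estimates stores range_start/range_end as varchar,
    // but the values encode signed 64-bit Murmur3 tokens, so they can
    // be parsed with Long.parseLong instead of read with getLong().
    static long parseTokenBound(String varcharValue) {
        return Long.parseLong(varcharValue);
    }

    public static void main(String[] args) {
        // e.g. the minimum Murmur3 token, stored as text in Cassandra
        System.out.println(parseTokenBound("-9223372036854775808"));
    }
}
```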





[jira] [Assigned] (BEAM-3424) CassandraIO uses 1 split if can't estimate size

2018-04-16 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-3424:
--

Assignee: Alexander Dejanovski  (was: Alexey Romanenko)

> CassandraIO uses 1 split if can't estimate size
> ---
>
> Key: BEAM-3424
> URL: https://issues.apache.org/jira/browse/BEAM-3424
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-cassandra
>Reporter: Eugene Kirpichov
>Assignee: Alexander Dejanovski
>Priority: Major
>
> See 
> https://stackoverflow.com/questions/48090668/how-to-increase-dataflow-read-parallelism-from-cassandra?noredirect=1#comment83227824_48090668
>  . When CassandraIO can't estimate size, it falls back to a single split:
> https://github.com/apache/beam/blob/master/sdks/java/io/cassandra/src/main/java/org/apache/beam/sdk/io/cassandra/CassandraServiceImpl.java#L196
> A single split is very poor for performance. We should fall back to a 
> different value. Not sure what a good value would be; probably the largest 
> value that still doesn't introduce too much per-split overhead? E.g. would 
> there be any downside to just changing that number to 100?
> Alternatively/additionally, like in DatastoreIO, CassandraIO could accept 
> requested number of splits as a parameter.
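The fallback being discussed can be sketched like this. All names are illustrative, not the actual CassandraServiceImpl API, and the 100-split default is the value floated above, not a number taken from the code.

```java
public class SplitFallback {
    // Fallback used when size estimation fails; 100 per the discussion above.
    static final int DEFAULT_FALLBACK_SPLITS = 100;

    static int numSplits(long estimatedSizeBytes, long desiredBundleSizeBytes,
                         Integer requestedNumSplits) {
        if (requestedNumSplits != null) {
            // Explicit override requested by the user, as DatastoreIO allows.
            return requestedNumSplits;
        }
        if (estimatedSizeBytes <= 0) {
            // Size unknown: avoid collapsing to a single split.
            return DEFAULT_FALLBACK_SPLITS;
        }
        return (int) Math.max(1, estimatedSizeBytes / desiredBundleSizeBytes);
    }

    public static void main(String[] args) {
        System.out.println(numSplits(0, 64 * 1024 * 1024, null)); // size unknown
    }
}
```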





Build failed in Jenkins: beam_PerformanceTests_Python #1155

2018-04-16 Thread Apache Jenkins Server
See 


--
[...truncated 627.11 KB...]
[INFO] Apache Beam :: SDKs :: Java :: Fn Execution  SUCCESS [  3.251 s]
[INFO] Apache Beam :: SDKs :: Java :: Extensions .. SUCCESS [  0.045 s]
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Google Cloud Platform Core 
SUCCESS [  3.611 s]
[INFO] Apache Beam :: Runners . SUCCESS [  0.067 s]
[INFO] Apache Beam :: Runners :: Core Construction Java ... SUCCESS [  3.991 s]
[INFO] Apache Beam :: Runners :: Core Java  SUCCESS [  4.274 s]
[INFO] Apache Beam :: SDKs :: Java :: Harness . SUCCESS [  6.733 s]
[INFO] Apache Beam :: SDKs :: Java :: Container ... SUCCESS [  9.042 s]
[INFO] Apache Beam :: SDKs :: Java :: IO .. SUCCESS [  0.093 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Amazon Web Services SUCCESS [  
2.817 s]
[INFO] Apache Beam :: Runners :: Local Java Core .. SUCCESS [  1.267 s]
[INFO] Apache Beam :: Runners :: Direct Java .. SUCCESS [  8.819 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: AMQP .. SUCCESS [  2.444 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Common  SUCCESS [  1.671 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Cassandra . SUCCESS [  2.278 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Elasticsearch . SUCCESS [  1.936 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Elasticsearch-Tests SUCCESS [  
0.851 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Elasticsearch-Tests :: Common 
SUCCESS [  0.675 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Elasticsearch-Tests :: 2.x SUCCESS 
[  2.148 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Elasticsearch-Tests :: 5.x SUCCESS 
[  2.513 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: XML ... SUCCESS [  3.008 s]
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Protobuf SUCCESS [  1.695 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Google Cloud Platform SUCCESS [  
4.923 s]
[INFO] Apache Beam :: Runners :: Google Cloud Dataflow  SUCCESS [  8.662 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: File-based-io-tests SUCCESS [  
2.047 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Hadoop Common . SUCCESS [  4.018 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Hadoop File System SUCCESS [  2.854 
s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: JDBC .. SUCCESS [  3.313 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Hadoop Input Format SUCCESS [ 
17.108 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: HBase . SUCCESS [  5.582 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: HCatalog .. SUCCESS [  9.408 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: JMS ... SUCCESS [  2.000 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Kafka . SUCCESS [  2.366 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Kinesis ... SUCCESS [  2.224 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: MongoDB ... SUCCESS [  2.713 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: MQTT .. SUCCESS [  2.055 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Redis . SUCCESS [  1.708 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Solr .. SUCCESS [  4.601 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Tika .. SUCCESS [  3.420 s]
[INFO] Apache Beam :: SDKs :: Java :: Maven Archetypes  SUCCESS [  0.054 s]
[INFO] Apache Beam :: SDKs :: Java :: Maven Archetypes :: Starter SUCCESS [  
6.341 s]
[INFO] Apache Beam :: Examples  SUCCESS [  0.047 s]
[INFO] Apache Beam :: Examples :: Java  SUCCESS [  2.949 s]
[INFO] Apache Beam :: SDKs :: Java :: Maven Archetypes :: Examples SUCCESS [ 
26.420 s]
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Jackson SUCCESS [  1.392 s]
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Join library SUCCESS [  
1.488 s]
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Sketching SUCCESS [  1.983 
s]
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Sorter  SUCCESS [  2.216 s]
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: SQL ... SUCCESS [ 14.444 s]
[INFO] Apache Beam :: SDKs :: Java :: Nexmark . SUCCESS [ 14.548 s]
[INFO] Apache Beam :: SDKs :: Python .. FAILURE [ 10.332 s]
[INFO] Apache Beam :: SDKs :: Python :: Container . SKIPPED
[INFO] Apache Beam :: Runners :: Java Fn Execution  SKIPPED
[INFO] Apache Beam :: Runners :: Java Local Artifact Service SKIPPED
[INFO] Apache Beam :: Runners :: Reference  SKIPPED
[INFO] Apache Beam :: Runners :: Reference :: Java  SKIPPED
[INFO] Apache Beam :: Runners :: Reference :: Job Orchestrator SKIPPED
[INFO] Apache Beam :: Runners :: Flink  SKIPPED
[INFO] Apache Beam :: Runners :: Gearpump . SKIPPED
[INFO] Apache Beam :: Runners :: Spark  SKIPPED
[INFO] Apache Beam :: 

[jira] [Assigned] (BEAM-3485) CassandraIO.read() splitting produces invalid queries

2018-04-16 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-3485:
--

Assignee: Alexander Dejanovski  (was: Alexey Romanenko)

> CassandraIO.read() splitting produces invalid queries
> -
>
> Key: BEAM-3485
> URL: https://issues.apache.org/jira/browse/BEAM-3485
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-cassandra
>Reporter: Eugene Kirpichov
>Assignee: Alexander Dejanovski
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> See 
> [https://stackoverflow.com/questions/48090668/how-to-increase-dataflow-read-parallelism-from-cassandra/48131264?noredirect=1#comment83548442_48131264]
> As the question author points out, the error is likely that token($pk) should 
> be token(pk). This was likely masked by BEAM-3424 and BEAM-3425: the 
> splitting code path was effectively never invoked and has been broken since 
> the first PR, so there are likely other bugs.
> When testing this issue, we must ensure good code coverage in an IT against a 
> real Cassandra instance.
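The suspected bug can be illustrated with a toy predicate builder. The real splitting code in CassandraServiceImpl differs; this only shows why a stray `$` left in the template yields invalid CQL.

```java
public class TokenQuerySketch {
    // Stray '$' left over from template interpolation: produces token($pk),
    // which Cassandra rejects because $pk is not a column name.
    static String brokenPredicate(String pk) {
        return String.format("token($%s) > ? AND token($%s) <= ?", pk, pk);
    }

    // Interpolating the column name directly produces valid CQL: token(pk).
    static String fixedPredicate(String pk) {
        return String.format("token(%s) > ? AND token(%s) <= ?", pk, pk);
    }

    public static void main(String[] args) {
        System.out.println(brokenPredicate("pk"));
        System.out.println(fixedPredicate("pk"));
    }
}
```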





[jira] [Work logged] (BEAM-4049) Improve write throughput of CassandraIO

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4049?focusedWorklogId=91332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91332
 ]

ASF GitHub Bot logged work on BEAM-4049:


Author: ASF GitHub Bot
Created on: 16/Apr/18 13:18
Start Date: 16/Apr/18 13:18
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #5112: [BEAM-4049] Improve 
CassandraIO write throughput by performing async queries
URL: https://github.com/apache/beam/pull/5112#issuecomment-381592757
 
 
   @adejanovski If you are doing the retries feature independently please 
create a new JIRA for it and assign it to yourself. You are now a Contributor 
so you can take JIRAs. Thanks.




Issue Time Tracking
---

Worklog Id: (was: 91332)
Time Spent: 3h 20m  (was: 3h 10m)

> Improve write throughput of CassandraIO
> ---
>
> Key: BEAM-4049
> URL: https://issues.apache.org/jira/browse/BEAM-4049
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-cassandra
>Affects Versions: 2.4.0
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Major
>  Labels: performance
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> The CassandraIO currently uses the mapper to perform writes in a synchronous 
> fashion. 
> This implies that writes are serialized, which is a very suboptimal way of 
> writing to Cassandra.
> The IO should use the saveAsync() method instead of save() and should wait 
> for completion each time 100 queries are in flight, in order to avoid 
> overwhelming clusters.
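The proposal above (async writes, blocking whenever 100 queries are in flight) can be sketched with stand-ins for the DataStax object mapper. saveAsync here is a placeholder that completes immediately; the real mapper's saveAsync returns a future tied to the cluster's response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class BoundedAsyncWrites {
    // Cap on in-flight async writes, per the proposal above.
    static final int MAX_IN_FLIGHT = 100;

    final List<CompletableFuture<Void>> inFlight = new ArrayList<>();

    // Stand-in for mapper.saveAsync(entity); completes immediately here
    // instead of talking to a cluster.
    CompletableFuture<Void> saveAsync(Object entity) {
        return CompletableFuture.completedFuture(null);
    }

    void write(Object entity) {
        inFlight.add(saveAsync(entity));
        if (inFlight.size() >= MAX_IN_FLIGHT) {
            // Wait before issuing more writes, to avoid overwhelming the cluster.
            flush();
        }
    }

    void flush() {
        CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
        inFlight.clear();
    }

    public static void main(String[] args) {
        BoundedAsyncWrites writer = new BoundedAsyncWrites();
        for (int i = 0; i < 250; i++) {
            writer.write("row-" + i);
        }
        writer.flush(); // drain the tail of fewer than MAX_IN_FLIGHT writes
    }
}
```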





[jira] [Assigned] (BEAM-4049) Improve write throughput of CassandraIO

2018-04-16 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-4049:
--

Assignee: Alexander Dejanovski  (was: Jean-Baptiste Onofré)

> Improve write throughput of CassandraIO
> ---
>
> Key: BEAM-4049
> URL: https://issues.apache.org/jira/browse/BEAM-4049
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-cassandra
>Affects Versions: 2.4.0
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Major
>  Labels: performance
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The CassandraIO currently uses the mapper to perform writes in a synchronous 
> fashion. 
> This implies that writes are serialized, which is a very suboptimal way of 
> writing to Cassandra.
> The IO should use the saveAsync() method instead of save() and should wait 
> for completion each time 100 queries are in flight, in order to avoid 
> overwhelming clusters.





Build failed in Jenkins: beam_PerformanceTests_XmlIOIT_HDFS #55

2018-04-16 Thread Apache Jenkins Server
See 


--
[...truncated 1.29 KB...]
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins9184067530796523348.sh
+ gcloud container clusters get-credentials io-datastores --zone=us-central1-a 
--verbosity=debug
DEBUG: Running gcloud.container.clusters.get-credentials with 
Namespace(__calliope_internal_deepest_parser=ArgumentParser(prog='gcloud.container.clusters.get-credentials',
 usage=None, description='Updates a kubeconfig file with appropriate 
credentials to point\nkubectl at a Container Engine Cluster. By default, 
credentials\nare written to HOME/.kube/config. You can provide an 
alternate\npath by setting the KUBECONFIG environment variable.\n\nSee 
[](https://cloud.google.com/container-engine/docs/kubectl) for\nkubectl 
documentation.', version=None, formatter_class=, conflict_handler='error', add_help=False), 
account=None, api_version=None, authority_selector=None, 
authorization_token_file=None, 
calliope_command=, command_path=['gcloud', 'container', 'clusters', 
'get-credentials'], configuration=None, credential_file_override=None, 
document=None, flatten=None, format=None, h=None, help=None, http_timeout=None, 
log_http=None, name='io-datastores', project=None, quiet=None, 
trace_email=None, trace_log=None, trace_token=None, user_output_enabled=None, 
verbosity='debug', version=None, zone='us-central1-a').
Fetching cluster endpoint and auth data.
DEBUG: Saved kubeconfig to /home/jenkins/.kube/config
kubeconfig entry generated for io-datastores.
INFO: Display format "default".
DEBUG: SDK update checks are disabled.
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins5987039330468863500.sh
+ cp /home/jenkins/.kube/config 

[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins4892271180735815115.sh
+ kubectl 
--kubeconfig=
 create namespace filebasedioithdfs-1523872865929
namespace "filebasedioithdfs-1523872865929" created
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins5792711702106109411.sh
++ kubectl config current-context
+ kubectl 
--kubeconfig=
 config set-context gke_apache-beam-testing_us-central1-a_io-datastores 
--namespace=filebasedioithdfs-1523872865929
Context "gke_apache-beam-testing_us-central1-a_io-datastores" modified.
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins5953107014775745853.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins369241192006060471.sh
+ rm -rf .env
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins577970666729839840.sh
+ virtualenv .env --system-site-packages
New python executable in .env/bin/python
Installing setuptools, pip...done.
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins4105648201075090520.sh
+ .env/bin/pip install --upgrade setuptools pip
Downloading/unpacking setuptools from 
https://pypi.python.org/packages/20/d7/04a0b689d3035143e2ff288f4b9ee4bf6ed80585cc121c90bfd85a1a8c2e/setuptools-39.0.1-py2.py3-none-any.whl#md5=ca299c7acd13a72e1171a3697f2b99bc
Downloading/unpacking pip from 
https://pypi.python.org/packages/62/a1/0d452b6901b0157a0134fd27ba89bf95a857fbda64ba52e1ca2cf61d8412/pip-10.0.0-py2.py3-none-any.whl#md5=be3e30acf78a44cd750bf2db0912c701
Installing collected packages: setuptools, pip
  Found existing installation: setuptools 2.2
Uninstalling setuptools:
  Successfully uninstalled setuptools
  Found existing installation: pip 1.5.4
Uninstalling pip:
  Successfully uninstalled pip
Successfully installed setuptools pip
Cleaning up...
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins1961378382924338376.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_XmlIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins4421833054656047899.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14)) (0.1.4)
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15)) (2.9.5)
Requirement already satisfied: setuptools in 

Build failed in Jenkins: beam_PerformanceTests_AvroIOIT_HDFS #56

2018-04-16 Thread Apache Jenkins Server
See 


--
[...truncated 269.23 KB...]
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:248)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:235)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:923)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn.processElement(WriteFiles.java:503)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy60.create(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy61.create(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1623)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:109)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:248)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:235)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:923)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn.processElement(WriteFiles.java:503)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn$DoFnInvoker.invokeProcessElement(Unknown
 Source)
at 
org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
at 
org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:138)
at 
com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:323)
at 
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
at 
com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
at 
com.google.cloud.dataflow.worker.AssignWindowsParDoFnFactory$AssignWindowsParDoFn.processElement(AssignWindowsParDoFnFactory.java:118)
at 
com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
at 
com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
at 

Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Spark_Gradle #104

2018-04-16 Thread Apache Jenkins Server
See 


--
[...truncated 1.24 MB...]
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$$anonfun$7.apply(JavaStreamingContext.scala:627)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$$anonfun$7.apply(JavaStreamingContext.scala:626)
at scala.Option.getOrElse(Option.scala:121)
at 
org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:828)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$.getOrCreate(JavaStreamingContext.scala:626)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext.getOrCreate(JavaStreamingContext.scala)
at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:169)
at 
org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:123)
at 
org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:83)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:346)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:328)
at 
org.apache.beam.runners.spark.translation.streaming.CreateStreamTest.testFirstElementLate(CreateStreamTest.java:240)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:317)
at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy3.processTestClass(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:108)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 

[jira] [Work logged] (BEAM-4049) Improve write throughput of CassandraIO

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4049?focusedWorklogId=91329&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91329
 ]

ASF GitHub Bot logged work on BEAM-4049:


Author: ASF GitHub Bot
Created on: 16/Apr/18 13:07
Start Date: 16/Apr/18 13:07
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #5112: [BEAM-4049] Improve 
CassandraIO write throughput by performing async queries
URL: https://github.com/apache/beam/pull/5112#issuecomment-381592757
 
 
   @adejanovski If you are doing the retries feature independently please 
create a new JIRA for it. Thanks.




Issue Time Tracking
---

Worklog Id: (was: 91329)
Time Spent: 3h  (was: 2h 50m)

> Improve write throughput of CassandraIO
> ---
>
> Key: BEAM-4049
> URL: https://issues.apache.org/jira/browse/BEAM-4049
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-cassandra
>Affects Versions: 2.4.0
>Reporter: Alexander Dejanovski
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Labels: performance
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> The CassandraIO currently uses the mapper to perform writes in a synchronous 
> fashion. 
> This implies that writes are serialized, which is a very suboptimal way of 
> writing to Cassandra.
> The IO should use the saveAsync() method instead of save() and should wait 
> for completion each time 100 queries are in flight, in order to avoid 
> overwhelming clusters.





[jira] [Work logged] (BEAM-4049) Improve write throughput of CassandraIO

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4049?focusedWorklogId=91330&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91330
 ]

ASF GitHub Bot logged work on BEAM-4049:


Author: ASF GitHub Bot
Created on: 16/Apr/18 13:07
Start Date: 16/Apr/18 13:07
Worklog Time Spent: 10m 
  Work Description: aromanenko-dev commented on issue #5112: [BEAM-4049] 
Improve CassandraIO write throughput by performing async queries
URL: https://github.com/apache/beam/pull/5112#issuecomment-381592793
 
 
   @adejanovski Yes, I think it's not related to your changes.




Issue Time Tracking
---

Worklog Id: (was: 91330)
Time Spent: 3h 10m  (was: 3h)

> Improve write throughput of CassandraIO
> ---
>
> Key: BEAM-4049
> URL: https://issues.apache.org/jira/browse/BEAM-4049
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-cassandra
>Affects Versions: 2.4.0
>Reporter: Alexander Dejanovski
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Labels: performance
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The CassandraIO currently uses the mapper to perform writes in a synchronous 
> fashion. 
> This implies that writes are serialized, which is a very suboptimal way of 
> writing to Cassandra.
> The IO should use the saveAsync() method instead of save() and should wait 
> for completion each time 100 queries are in flight, in order to avoid 
> overwhelming clusters.





Build failed in Jenkins: beam_PerformanceTests_Spark #1597

2018-04-16 Thread Apache Jenkins Server
See 


--
[...truncated 91.98 KB...]
'apache-beam-testing:bqjob_r4541cdaff52f81a_0162ce6577ad_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-04-16 12:19:41,300 9ed0bdd9 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-16 12:20:02,562 9ed0bdd9 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-16 12:20:04,878 9ed0bdd9 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: Upload complete.
Waiting on bqjob_r1b5038623d3c0691_0162ce65d4bb_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r1b5038623d3c0691_0162ce65d4bb_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r1b5038623d3c0691_0162ce65d4bb_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-04-16 12:20:04,879 9ed0bdd9 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-16 12:20:22,039 9ed0bdd9 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-16 12:20:24,497 9ed0bdd9 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: Upload complete.
Waiting on bqjob_r14121db797895d9_0162ce6620ca_1 ... (0s) Current status: 
RUNNING 
Waiting on bqjob_r14121db797895d9_0162ce6620ca_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r14121db797895d9_0162ce6620ca_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-04-16 12:20:24,498 9ed0bdd9 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-16 12:20:46,368 9ed0bdd9 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-16 12:20:48,891 9ed0bdd9 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: Upload complete.
Waiting on bqjob_r64251ac2c29597f5_0162ce667fe3_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r64251ac2c29597f5_0162ce667fe3_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r64251ac2c29597f5_0162ce667fe3_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 

[jira] [Work logged] (BEAM-3942) Update performance testing framework to use Gradle.

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3942?focusedWorklogId=91340&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91340
 ]

ASF GitHub Bot logged work on BEAM-3942:


Author: ASF GitHub Bot
Created on: 16/Apr/18 13:39
Start Date: 16/Apr/18 13:39
Worklog Time Spent: 10m 
  Work Description: lgajowy commented on issue #5003: [BEAM-3942] Update 
performance testing framework to use Gradle
URL: https://github.com/apache/beam/pull/5003#issuecomment-381602540
 
 
   Things that changed (see the commit descriptions for more details): 
   
   1. Filesystem support was added (crucial for the recently added HDFS IOITs).
   2. The direct runner dependency was deleted from the file-based-io-tests 
module (it is added either way by the "packageIntegrationTests" task and caused 
compilation problems).
   3. Moved the _integrationTest_ and _packageIntegrationTests_ tasks to 
different closures, to make integration test configuration more flexible and 
avoid cyclic dependencies in some cases. 
   
   Due to some problems I had after recently merging master into this branch, I 
also made sure that all configurations needed by the current Jenkins jobs work. 
I also updated the 
"[errata](https://docs.google.com/document/d/1CJZURnqCabc_GA-qgC3JYKQdhh00T2l5FzFS0DF12wU/edit)"
 document. 
   
   Could you guys take a look again? 




Issue Time Tracking
---

Worklog Id: (was: 91340)
Time Spent: 10h 40m  (was: 10.5h)

> Update performance testing framework to use Gradle.
> ---
>
> Key: BEAM-3942
> URL: https://issues.apache.org/jira/browse/BEAM-3942
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Łukasz Gajowy
>Priority: Major
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> This requires performing updates to PerfKitBenchmarker and Beam so that we 
> can execute performance tests using Gradle.





[beam] branch master updated (96615a1 -> 5f75e14)

2018-04-16 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 96615a1  Merge pull request #5085: Update Dataflow Development 
Container Version
 add 1073baa  Add region to dataflowOptions struct.
 new 5f75e14  Merge pull request #5133 from jasonkuster/patch-8

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/go/pkg/beam/runners/dataflow/dataflow.go | 1 +
 1 file changed, 1 insertion(+)

-- 
To stop receiving notification emails like this one, please contact
tg...@apache.org.


[beam] 01/01: Merge pull request #5133 from jasonkuster/patch-8

2018-04-16 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 5f75e14b007af6ca676c73dab0b9025008ad3816
Merge: 96615a1 1073baa
Author: Thomas Groh 
AuthorDate: Mon Apr 16 11:35:53 2018 -0700

Merge pull request #5133 from jasonkuster/patch-8

Add region to dataflowOptions struct.

 sdks/go/pkg/beam/runners/dataflow/dataflow.go | 1 +
 1 file changed, 1 insertion(+)

-- 
To stop receiving notification emails like this one, please contact
tg...@apache.org.


[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91441
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 19:30
Start Date: 16/Apr/18 19:30
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #4387: 
[BEAM-2732] Metrics rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#discussion_r181856996
 
 

 ##
 File path: sdks/python/apache_beam/runners/direct/executor.py
 ##
 @@ -290,70 +293,87 @@ def __init__(self, transform_evaluator_registry, 
evaluation_context,
 self._retry_count = 0
 self._max_retries_per_bundle = TransformExecutor._MAX_RETRY_PER_BUNDLE
 
-  def call(self):
+  def call(self, state_sampler):
 self._call_count += 1
 assert self._call_count <= (1 + len(self._applied_ptransform.side_inputs))
 metrics_container = MetricsContainer(self._applied_ptransform.full_label)
-scoped_metrics_container = ScopedMetricsContainer(metrics_container)
-
-for side_input in self._applied_ptransform.side_inputs:
-  # Find the projection of main's window onto the side input's window.
-  window_mapping_fn = side_input._view_options().get(
-  'window_mapping_fn', sideinputs._global_window_mapping_fn)
-  main_onto_side_window = window_mapping_fn(self._latest_main_input_window)
-  block_until = main_onto_side_window.end
-
-  if side_input not in self._side_input_values:
-value = self._evaluation_context.get_value_or_block_until_ready(
-side_input, self, block_until)
-if not value:
-  # Monitor task will reschedule this executor once the side input is
-  # available.
-  return
-self._side_input_values[side_input] = value
-side_input_values = [self._side_input_values[side_input]
- for side_input in 
self._applied_ptransform.side_inputs]
-
-while self._retry_count < self._max_retries_per_bundle:
-  try:
-self.attempt_call(metrics_container,
-  scoped_metrics_container,
-  side_input_values)
-break
-  except Exception as e:
-self._retry_count += 1
-logging.error(
-'Exception at bundle %r, due to an exception.\n %s',
-self._input_bundle, traceback.format_exc())
-if self._retry_count == self._max_retries_per_bundle:
-  logging.error('Giving up after %s attempts.',
-self._max_retries_per_bundle)
-  self._completion_callback.handle_exception(self, e)
+start_state = state_sampler.scoped_state(
+self._applied_ptransform.full_label,
+'start',
+metrics_container=metrics_container)
+process_state = state_sampler.scoped_state(
+self._applied_ptransform.full_label,
+'process',
+metrics_container=metrics_container)
+finish_state = state_sampler.scoped_state(
+self._applied_ptransform.full_label,
+'finish',
+metrics_container=metrics_container)
+
+with start_state:
+  for side_input in self._applied_ptransform.side_inputs:
+# Find the projection of main's window onto the side input's window.
+window_mapping_fn = side_input._view_options().get(
+'window_mapping_fn', sideinputs._global_window_mapping_fn)
+main_onto_side_window = window_mapping_fn(
+self._latest_main_input_window)
+block_until = main_onto_side_window.end
+
+if side_input not in self._side_input_values:
+  value = self._evaluation_context.get_value_or_block_until_ready(
+  side_input, self, block_until)
+  if not value:
+# Monitor task will reschedule this executor once the side input is
+# available.
+return
+  self._side_input_values[side_input] = value
+  side_input_values = [
+  self._side_input_values[side_input]
+  for side_input in self._applied_ptransform.side_inputs]
+
+  while self._retry_count < self._max_retries_per_bundle:
+try:
+  self.attempt_call(metrics_container,
+side_input_values,
+process_state,
+finish_state)
+  break
+except Exception as e:
+  self._retry_count += 1
+  logging.error(
+  'Exception at bundle %r, due to an exception.\n %s',
+  self._input_bundle, traceback.format_exc())
+  if self._retry_count == self._max_retries_per_bundle:
+logging.error('Giving up after %s attempts.',
+  self._max_retries_per_bundle)
+self._completion_callback.handle_exception(self, e)
 
 

[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91442&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91442
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 19:30
Start Date: 16/Apr/18 19:30
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #4387: 
[BEAM-2732] Metrics rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#discussion_r181858411
 
 

 ##
 File path: sdks/python/apache_beam/transforms/util.py
 ##
 @@ -221,6 +220,9 @@ def __init__(self,
 self._clock = clock
 self._data = []
 self._ignore_next_timing = False
+
+from apache_beam.metrics import Metrics
 
 Review comment:
   Undo this change. Transforms should be free to use metrics (at least the 
public metrics write API). 




Issue Time Tracking
---

Worklog Id: (was: 91442)
Time Spent: 8h 20m  (was: 8h 10m)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> e.g logging and metrics keep state separately. State tracking should be 
> unified.





[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91443&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91443
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 19:30
Start Date: 16/Apr/18 19:30
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #4387: 
[BEAM-2732] Metrics rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#discussion_r181855332
 
 

 ##
 File path: sdks/python/apache_beam/runners/direct/direct_runner.py
 ##
 @@ -338,6 +337,7 @@ def run_pipeline(self, pipeline):
 from apache_beam.runners.direct.transform_evaluator import \
   TransformEvaluatorRegistry
 from apache_beam.testing.test_stream import TestStream
+from apache_beam.metrics.execution import MetricsEnvironment
 
 Review comment:
   Undo this change? Metrics should not be changed to depend on anything in the 
runners package. (Either that or the lazy import should be made there.)
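The lazy-import pattern suggested in this review — deferring an import into the function body so the defining module carries no module-level dependency — can be illustrated with a stdlib module. `render_report` is a hypothetical name used only for illustration; the apache_beam case works the same way with `MetricsEnvironment`.

```python
def render_report(data):
    # Lazy import: json is loaded only when this function first runs,
    # so the module defining render_report has no module-level
    # dependency on json. This is the same shape as moving the
    # MetricsEnvironment import into run_pipeline() above.
    import json
    return json.dumps(data, sort_keys=True)
```

Because Python caches modules in `sys.modules`, repeated calls pay only a dictionary lookup, not a re-import.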




Issue Time Tracking
---

Worklog Id: (was: 91443)
Time Spent: 8.5h  (was: 8h 20m)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> e.g logging and metrics keep state separately. State tracking should be 
> unified.





[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91444&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91444
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 19:30
Start Date: 16/Apr/18 19:30
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #4387: 
[BEAM-2732] Metrics rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#discussion_r181858771
 
 

 ##
 File path: sdks/python/apache_beam/runners/direct/direct_runner.py
 ##
 @@ -338,6 +337,7 @@ def run_pipeline(self, pipeline):
 from apache_beam.runners.direct.transform_evaluator import \
   TransformEvaluatorRegistry
 from apache_beam.testing.test_stream import TestStream
+from apache_beam.metrics.execution import MetricsEnvironment
 
 Review comment:
   Or perhaps don't import metrics.execution when importing the public (write) 
Metrics API. 




Issue Time Tracking
---

Worklog Id: (was: 91444)
Time Spent: 8h 40m  (was: 8.5h)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> e.g logging and metrics keep state separately. State tracking should be 
> unified.





[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91440&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91440
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 19:30
Start Date: 16/Apr/18 19:30
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #4387: 
[BEAM-2732] Metrics rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#discussion_r181857518
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/operations.py
 ##
 @@ -133,24 +133,25 @@ def __init__(self, name_context, spec, counter_factory, 
state_sampler):
 
 # These are overwritten in the legacy harness.
 self.metrics_container = MetricsContainer(self.name_context.metrics_name())
-self.scoped_metrics_container = ScopedMetricsContainer(
-self.metrics_container)
+self.scoped_metrics_container = ScopedMetricsContainer()
 
 Review comment:
   Can this just be deleted?




Issue Time Tracking
---

Worklog Id: (was: 91440)
Time Spent: 8h 10m  (was: 8h)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> e.g logging and metrics keep state separately. State tracking should be 
> unified.





[jira] [Work logged] (BEAM-4070) Disable cython profiling by default

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4070?focusedWorklogId=91445&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91445
 ]

ASF GitHub Bot logged work on BEAM-4070:


Author: ASF GitHub Bot
Created on: 16/Apr/18 19:31
Start Date: 16/Apr/18 19:31
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #5134: [BEAM-4070]: Make 
cython: profile=False by default
URL: https://github.com/apache/beam/pull/5134#issuecomment-381722016
 
 
   Thank you @boyuanzz, could you look at the failing tests? Could you also run 
a simple benchmark? I am curious whether we get improvements this way.
   
   @robertwb, is there a reason to keep this profiling in? (Based on 
http://cython.readthedocs.io/en/latest/src/tutorial/profiling_tutorial.html I 
do not think so.)




Issue Time Tracking
---

Worklog Id: (was: 91445)
Time Spent: 20m  (was: 10m)

> Disable cython profiling by default
> ---
>
> Key: BEAM-4070
> URL: https://issues.apache.org/jira/browse/BEAM-4070
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Enabling cython profiling adds some overhead.
> http://cython.readthedocs.io/en/latest/src/tutorial/profiling_tutorial.html
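For context, Cython's profiling hooks are toggled by a compiler directive, set either per module or globally at build time. A minimal sketch of what "disable by default" looks like (both forms are Cython's documented mechanisms, shown here purely illustratively — `extensions` is a placeholder for the project's real module list):

```python
# Per module: a directive comment at the very top of the .pyx file.
# cython: profile=False

# Or globally at build time, in setup.py:
#   from Cython.Build import cythonize
#   ext_modules = cythonize(extensions,
#                           compiler_directives={"profile": False})
```

With `profile=False`, Cython stops emitting the per-call bookkeeping that lets cProfile see into compiled functions, which is the overhead this issue targets.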





Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Apex_Gradle #109

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[mingmxu] support MAP in SQL schema

[mingmxu] in MAP, key as primitive, and value can be primitive/array/map/row

[mingmxu] use Collection for ARRAY type, and re-org `verify` code in `Row`

[mingmxu] rebase as file conflict with #5089

[mingmxu] rename CollectionType to CollectionElementType

[github] Add region to dataflowOptions struct.

[sidhom] [BEAM-4056] Identify side inputs by transform id and local name

[sidhom] Add side input assertions to ExecutableStageMatcher

--
[...truncated 27.90 MB...]
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 1 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 2 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 3 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 4 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 5 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 6 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 7 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 8 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 9 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 10 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 13 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 14 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 15 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 16 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 17 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 18 sending EndOfStream
Apr 16, 2018 7:37:06 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 19 sending EndOfStream
Apr 16, 2018 7:37:07 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 7:37:07 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 7:37:07 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 7:37:07 PM com.datatorrent.stram.engine.StreamingContainer 
processHeartbeatResponse
INFO: Undeploy request: [8, 9, 10]
Apr 16, 2018 7:37:07 PM com.datatorrent.stram.engine.StreamingContainer 
undeploy
INFO: Undeploy complete.
Apr 16, 2018 7:37:07 PM com.datatorrent.bufferserver.server.Server$3 run
INFO: Removing ln 
LogicalNode@ad344identifier=tcp://localhost:36177/10.output.11, 
upstream=10.output.11, group=stream9/13.data1, partitions=[], 
iterator=com.datatorrent.bufferserver.internal.DataList$DataListIterator@bd29a39{da=com.datatorrent.bufferserver.internal.DataList$Block@2e1a5826{identifier=10.output.11,
 data=1048576, readingOffset=0, writingOffset=231, 
starting_window=5ad4fb61, ending_window=5ad4fb66, refCount=2, 
uniqueIdentifier=0, next=null, future=null}}} from dl 
com.datatorrent.bufferserver.internal.DataList@4b8a0c49 {10.output.11}
Apr 16, 2018 7:37:07 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.

[jira] [Updated] (BEAM-4091) Typehint annotations don't work with @ptransform_fn annotation

2018-04-16 Thread Chuan Yu Foo (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chuan Yu Foo updated BEAM-4091:
---
Description: 
Typehint annotations don't work with functions annotated with 
{{@ptransform_fn}}, but they do work with the equivalent classes.

The following is a minimal example illustrating this:

{code:python}
@beam.typehints.with_input_types(float)
@beam.typehints.with_output_types(bytes)
@beam.ptransform_fn
def _DoStuffFn(pcoll):
  return pcoll | 'TimesTwo' >> beam.Map(lambda x: x * 2)

@beam.typehints.with_input_types(float)
@beam.typehints.with_output_types(bytes)
class _DoStuffClass(beam.PTransform):

  def expand(self, pcoll):
return pcoll | 'TimesTwo' >> beam.Map(lambda x: x * 2)
{code}

With definitions as above, the class correctly fails the typecheck:

{code:python}
def class_correctly_fails():
  p = beam.Pipeline(options=PipelineOptions(runtime_type_check=True))
  _ = (p
   | 'Create' >> beam.Create([1, 2, 3, 4, 5])
   | 'DoStuff1' >> _DoStuffClass()
   | 'DoStuff2' >> _DoStuffClass()
   | 'Write' >> beam.io.WriteToText('/tmp/output'))
  p.run().wait_until_finish()

# apache_beam.typehints.decorators.TypeCheckError: Input type hint violation at 
DoStuff1: expected , got 
{code}

But the {{ptransform_fn}} incorrectly passes the typecheck:

{code:python}
def ptransform_incorrectly_passes():
  p = beam.Pipeline(options=PipelineOptions(runtime_type_check=True))
  _ = (p
   | 'Create' >> beam.Create([1, 2, 3, 4, 5])
   | 'DoStuff1' >> _DoStuffFn()
   | 'DoStuff2' >> _DoStuffFn()
   | 'Write' >> beam.io.WriteToText('/tmp/output'))
  p.run().wait_until_finish()
# No error
{code}


  was:
Typehint annotations don't work with functions annotated with @ptransform_fn, 
but they do work with the equivalent classes.

The following is a minimal example illustrating this:

{code:python}
@beam.typehints.with_input_types(float)
@beam.typehints.with_output_types(bytes)
@beam.ptransform_fn
def _DoStuffFn(pcoll):
  return pcoll | 'TimesTwo' >> beam.Map(lambda x: x * 2)

@beam.typehints.with_input_types(float)
@beam.typehints.with_output_types(bytes)
class _DoStuffClass(beam.PTransform):

  def expand(self, pcoll):
return pcoll | 'TimesTwo' >> beam.Map(lambda x: x * 2)
{code}

With definitions as above, the class correctly fails the typecheck:

{code:python}
def class_correctly_fails():
  p = beam.Pipeline(options=PipelineOptions(runtime_type_check=True))
  _ = (p
   | 'Create' >> beam.Create([1, 2, 3, 4, 5])
   | 'DoStuff1' >> _DoStuffClass()
   | 'DoStuff2' >> _DoStuffClass()
   | 'Write' >> beam.io.WriteToText('/tmp/output'))
  p.run().wait_until_finish()

# apache_beam.typehints.decorators.TypeCheckError: Input type hint violation at 
DoStuff1: expected , got 
{code}

But the {{ptransform_fn}} incorrectly passes the typecheck:

{code:python}
def ptransform_incorrectly_passes():
  p = beam.Pipeline(options=PipelineOptions(runtime_type_check=True))
  _ = (p
   | 'Create' >> beam.Create([1, 2, 3, 4, 5])
   | 'DoStuff1' >> _DoStuffFn()
   | 'DoStuff2' >> _DoStuffFn()
   | 'Write' >> beam.io.WriteToText('/tmp/output'))
  p.run().wait_until_finish()
# No error
{code}



> Typehint annotations don't work with @ptransform_fn annotation
> --
>
> Key: BEAM-4091
> URL: https://issues.apache.org/jira/browse/BEAM-4091
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.4.0
>Reporter: Chuan Yu Foo
>Assignee: Ahmet Altay
>Priority: Major
>
> Typehint annotations don't work with functions annotated with 
> {{@ptransform_fn}}, but they do work with the equivalent classes.
> The following is a minimal example illustrating this:
> {code:python}
> @beam.typehints.with_input_types(float)
> @beam.typehints.with_output_types(bytes)
> @beam.ptransform_fn
> def _DoStuffFn(pcoll):
>   return pcoll | 'TimesTwo' >> beam.Map(lambda x: x * 2)
> @beam.typehints.with_input_types(float)
> @beam.typehints.with_output_types(bytes)
> class _DoStuffClass(beam.PTransform):
>   def expand(self, pcoll):
> return pcoll | 'TimesTwo' >> beam.Map(lambda x: x * 2)
> {code}
> With definitions as above, the class correctly fails the typecheck:
> {code:python}
> def class_correctly_fails():
>   p = beam.Pipeline(options=PipelineOptions(runtime_type_check=True))
>   _ = (p
>| 'Create' >> beam.Create([1, 2, 3, 4, 5])
>| 'DoStuff1' >> _DoStuffClass()
>| 'DoStuff2' >> _DoStuffClass()
>| 'Write' >> beam.io.WriteToText('/tmp/output'))
>   p.run().wait_until_finish()
> # apache_beam.typehints.decorators.TypeCheckError: Input type hint violation 
> at DoStuff1: expected , got 
> {code}
> But the {{ptransform_fn}} incorrectly passes the typecheck:
> {code:python}
> def 

[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91474&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91474
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 20:38
Start Date: 16/Apr/18 20:38
Worklog Time Spent: 10m 
  Work Description: pabloem commented on a change in pull request #4387: 
[BEAM-2732] Metrics rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#discussion_r181877325
 
 

 ##
 File path: sdks/python/apache_beam/transforms/util.py
 ##
 @@ -221,6 +220,9 @@ def __init__(self,
 self._clock = clock
 self._data = []
 self._ignore_next_timing = False
+
+from apache_beam.metrics import Metrics
 
 Review comment:
   Done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91474)
Time Spent: 9h 10m  (was: 9h)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> e.g logging and metrics keep state separately. State tracking should be 
> unified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91477&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91477
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 20:38
Start Date: 16/Apr/18 20:38
Worklog Time Spent: 10m 
  Work Description: pabloem commented on a change in pull request #4387: 
[BEAM-2732] Metrics rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#discussion_r181877336
 
 

 ##
 File path: sdks/python/apache_beam/runners/direct/direct_runner.py
 ##
 @@ -338,6 +337,7 @@ def run_pipeline(self, pipeline):
 from apache_beam.runners.direct.transform_evaluator import \
   TransformEvaluatorRegistry
 from apache_beam.testing.test_stream import TestStream
+from apache_beam.metrics.execution import MetricsEnvironment
 
 Review comment:
   That is reasonable, and I agree. I've fixed this by using a lazy import in 
the metrics execution module.
   Currently, we're working under the idea that the state sampler is the 
global context provider, so metrics rely on it. I'd say that makes it a module 
that provides a service to other modules that require or handle context 
management, though I agree that looser coupling would be quite desirable. Let 
me know what you think : )
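The lazy-import fix mentioned above can be sketched as follows. This is a minimal, self-contained illustration of the pattern only; the function name is hypothetical and the stdlib `threading` module stands in for the statesampler module, since the point is deferring the import to call time rather than module load time:

```python
def current_state_sampler():
    # Lazy import: if the imported module itself imports this module at load
    # time, importing it here (at call time) breaks the circular dependency.
    import threading  # hypothetical stand-in for the statesampler module
    return threading.current_thread().name

print(current_state_sampler())
```

By the time the function is first called, both modules have finished loading, so the import succeeds even when a top-level import would have failed with a partially initialized module.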


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91477)
Time Spent: 9h 40m  (was: 9.5h)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> e.g logging and metrics keep state separately. State tracking should be 
> unified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91476&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91476
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 20:38
Start Date: 16/Apr/18 20:38
Worklog Time Spent: 10m 
  Work Description: pabloem commented on a change in pull request #4387: 
[BEAM-2732] Metrics rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#discussion_r181877336
 
 

 ##
 File path: sdks/python/apache_beam/runners/direct/direct_runner.py
 ##
 @@ -338,6 +337,7 @@ def run_pipeline(self, pipeline):
 from apache_beam.runners.direct.transform_evaluator import \
   TransformEvaluatorRegistry
 from apache_beam.testing.test_stream import TestStream
+from apache_beam.metrics.execution import MetricsEnvironment
 
 Review comment:
   That is reasonable, and I agree. I've fixed this by using a lazy import in 
the metrics execution module.
   Currently, we're working under the idea that the state sampler is the 
global context provider, so metrics rely on it. I'd say that makes it a module 
that provides a service to other modules that require or handle context 
management, though I agree that looser coupling would be good to have. Let me 
know what you think : )


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91476)
Time Spent: 9.5h  (was: 9h 20m)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> e.g logging and metrics keep state separately. State tracking should be 
> unified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91472&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91472
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 20:38
Start Date: 16/Apr/18 20:38
Worklog Time Spent: 10m 
  Work Description: pabloem commented on a change in pull request #4387: 
[BEAM-2732] Metrics rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#discussion_r181877301
 
 

 ##
 File path: sdks/python/apache_beam/runners/direct/executor.py
 ##
 @@ -290,70 +293,87 @@ def __init__(self, transform_evaluator_registry, 
evaluation_context,
 self._retry_count = 0
 self._max_retries_per_bundle = TransformExecutor._MAX_RETRY_PER_BUNDLE
 
-  def call(self):
+  def call(self, state_sampler):
 self._call_count += 1
 assert self._call_count <= (1 + len(self._applied_ptransform.side_inputs))
 metrics_container = MetricsContainer(self._applied_ptransform.full_label)
-scoped_metrics_container = ScopedMetricsContainer(metrics_container)
-
-for side_input in self._applied_ptransform.side_inputs:
-  # Find the projection of main's window onto the side input's window.
-  window_mapping_fn = side_input._view_options().get(
-  'window_mapping_fn', sideinputs._global_window_mapping_fn)
-  main_onto_side_window = window_mapping_fn(self._latest_main_input_window)
-  block_until = main_onto_side_window.end
-
-  if side_input not in self._side_input_values:
-value = self._evaluation_context.get_value_or_block_until_ready(
-side_input, self, block_until)
-if not value:
-  # Monitor task will reschedule this executor once the side input is
-  # available.
-  return
-self._side_input_values[side_input] = value
-side_input_values = [self._side_input_values[side_input]
- for side_input in 
self._applied_ptransform.side_inputs]
-
-while self._retry_count < self._max_retries_per_bundle:
-  try:
-self.attempt_call(metrics_container,
-  scoped_metrics_container,
-  side_input_values)
-break
-  except Exception as e:
-self._retry_count += 1
-logging.error(
-'Exception at bundle %r, due to an exception.\n %s',
-self._input_bundle, traceback.format_exc())
-if self._retry_count == self._max_retries_per_bundle:
-  logging.error('Giving up after %s attempts.',
-self._max_retries_per_bundle)
-  self._completion_callback.handle_exception(self, e)
+start_state = state_sampler.scoped_state(
+self._applied_ptransform.full_label,
+'start',
+metrics_container=metrics_container)
+process_state = state_sampler.scoped_state(
+self._applied_ptransform.full_label,
+'process',
+metrics_container=metrics_container)
+finish_state = state_sampler.scoped_state(
+self._applied_ptransform.full_label,
+'finish',
+metrics_container=metrics_container)
+
+with start_state:
+  for side_input in self._applied_ptransform.side_inputs:
+# Find the projection of main's window onto the side input's window.
+window_mapping_fn = side_input._view_options().get(
+'window_mapping_fn', sideinputs._global_window_mapping_fn)
+main_onto_side_window = window_mapping_fn(
+self._latest_main_input_window)
+block_until = main_onto_side_window.end
+
+if side_input not in self._side_input_values:
+  value = self._evaluation_context.get_value_or_block_until_ready(
+  side_input, self, block_until)
+  if not value:
+# Monitor task will reschedule this executor once the side input is
+# available.
+return
+  self._side_input_values[side_input] = value
+  side_input_values = [
+  self._side_input_values[side_input]
+  for side_input in self._applied_ptransform.side_inputs]
+
+  while self._retry_count < self._max_retries_per_bundle:
+try:
+  self.attempt_call(metrics_container,
+side_input_values,
+process_state,
+finish_state)
+  break
+except Exception as e:
+  self._retry_count += 1
+  logging.error(
+  'Exception at bundle %r, due to an exception.\n %s',
+  self._input_bundle, traceback.format_exc())
+  if self._retry_count == self._max_retries_per_bundle:
+logging.error('Giving up after %s attempts.',
+  self._max_retries_per_bundle)
+self._completion_callback.handle_exception(self, e)
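The retry loop in the diff above follows a common bounded-retry pattern: retry a bundle up to a fixed cap and, on the final failure, hand the exception to a completion callback instead of raising it. A minimal standalone sketch, with illustrative names rather than Beam's actual API:

```python
import logging

MAX_RETRIES_PER_BUNDLE = 4  # illustrative cap, like _MAX_RETRY_PER_BUNDLE

def call_with_retries(attempt_call, handle_exception):
    """Retry attempt_call up to the cap; on the final failure, pass the
    exception to handle_exception (a completion callback) and give up."""
    retry_count = 0
    while retry_count < MAX_RETRIES_PER_BUNDLE:
        try:
            attempt_call()
            return True
        except Exception as e:
            retry_count += 1
            logging.error('Exception at attempt %d: %s', retry_count, e)
            if retry_count == MAX_RETRIES_PER_BUNDLE:
                handle_exception(e)
    return False
```

Surfacing the last exception through a callback rather than raising keeps the executor thread alive, which matters when the caller is a worker pool that must keep processing other bundles.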
 
 

[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91475&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91475
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 20:38
Start Date: 16/Apr/18 20:38
Worklog Time Spent: 10m 
  Work Description: pabloem commented on a change in pull request #4387: 
[BEAM-2732] Metrics rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#discussion_r181877336
 
 

 ##
 File path: sdks/python/apache_beam/runners/direct/direct_runner.py
 ##
 @@ -338,6 +337,7 @@ def run_pipeline(self, pipeline):
 from apache_beam.runners.direct.transform_evaluator import \
   TransformEvaluatorRegistry
 from apache_beam.testing.test_stream import TestStream
+from apache_beam.metrics.execution import MetricsEnvironment
 
 Review comment:
   That is reasonable, and I agree. I've fixed this by using a lazy import in 
the metrics execution module.
   Currently, we're working under the idea that the state sampler is the 
global context provider, so metrics rely on it. I'd say that makes it a module 
that provides a service to other modules that require or handle context 
management, though I agree that looser coupling would be good to have.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91475)
Time Spent: 9h 20m  (was: 9h 10m)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> e.g logging and metrics keep state separately. State tracking should be 
> unified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91478&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91478
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 20:38
Start Date: 16/Apr/18 20:38
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #4387: [BEAM-2732] Metrics 
rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#issuecomment-381741327
 
 
   Thanks Robert!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91478)
Time Spent: 9h 50m  (was: 9h 40m)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> e.g logging and metrics keep state separately. State tracking should be 
> unified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=91473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91473
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 16/Apr/18 20:38
Start Date: 16/Apr/18 20:38
Worklog Time Spent: 10m 
  Work Description: pabloem commented on a change in pull request #4387: 
[BEAM-2732] Metrics rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#discussion_r181877312
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/operations.py
 ##
 @@ -133,24 +133,25 @@ def __init__(self, name_context, spec, counter_factory, 
state_sampler):
 
 # These are overwritten in the legacy harness.
 self.metrics_container = MetricsContainer(self.name_context.metrics_name())
-self.scoped_metrics_container = ScopedMetricsContainer(
-self.metrics_container)
+self.scoped_metrics_container = ScopedMetricsContainer()
 
 Review comment:
   Right. Done.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91473)
Time Spent: 9h  (was: 8h 50m)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> e.g logging and metrics keep state separately. State tracking should be 
> unified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2990) support data type MAP

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2990?focusedWorklogId=91433&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91433
 ]

ASF GitHub Bot logged work on BEAM-2990:


Author: ASF GitHub Bot
Created on: 16/Apr/18 18:49
Start Date: 16/Apr/18 18:49
Worklog Time Spent: 10m 
  Work Description: XuMingmin closed pull request #5079: [BEAM-2990] 
support MAP in SQL schema
URL: https://github.com/apache/beam/pull/5079
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoder.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoder.java
index f32b6ce5d84..5caa6464556 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoder.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoder.java
@@ -96,9 +96,19 @@ private static long estimatedSizeBytes(FieldType 
typeDescriptor, Object value) {
 List<Object> list = (List<Object>) value;
 long listSizeBytes = 0;
 for (Object elem : list) {
-  listSizeBytes += 
estimatedSizeBytes(typeDescriptor.getComponentType(), elem);
+  listSizeBytes += 
estimatedSizeBytes(typeDescriptor.getCollectionElementType(), elem);
 }
 return 4 + listSizeBytes;
+  case MAP:
+Map<Object, Object> map = (Map<Object, Object>) value;
+long mapSizeBytes = 0;
+for (Map.Entry<Object, Object> elem : map.entrySet()) {
+  mapSizeBytes += 
typeDescriptor.getMapKeyType().equals(TypeName.STRING)
+? ((String) elem.getKey()).length()
+  : ESTIMATED_FIELD_SIZES.get(typeDescriptor.getMapKeyType());
+  mapSizeBytes += estimatedSizeBytes(typeDescriptor.getMapValueType(), 
elem.getValue());
+}
+return 4 + mapSizeBytes;
   case STRING:
 // Not always accurate - String.getBytes().length() would be more 
accurate here, but slower.
 return ((String) value).length();
@@ -121,7 +131,10 @@ public Schema getSchema() {
 
   Coder getCoder(FieldType fieldType) {
 if (TypeName.ARRAY.equals(fieldType.getTypeName())) {
-  return ListCoder.of(getCoder(fieldType.getComponentType()));
+  return ListCoder.of(getCoder(fieldType.getCollectionElementType()));
+} else if (TypeName.MAP.equals(fieldType.getTypeName())) {
+  return MapCoder.of(coderForPrimitiveType(fieldType.getMapKeyType()),
+  getCoder(fieldType.getMapValueType()));
} else if (TypeName.ROW.equals(fieldType.getTypeName())) {
   return RowCoder.of(fieldType.getRowSchema());
 } else {
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java
index 3a7bc346883..436941b3875 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java
@@ -122,8 +122,9 @@ public Builder addBooleanField(String name, boolean 
nullable) {
   return this;
 }
 
-public Builder addArrayField(String name, FieldType componentType) {
-  fields.add(Field.of(name, 
TypeName.ARRAY.type().withComponentType(componentType)));
+public Builder addArrayField(String name, FieldType collectionElementType) 
{
+  fields.add(
+  Field.of(name, 
TypeName.ARRAY.type().withCollectionElementType(collectionElementType)));
   return this;
 }
 
@@ -199,6 +200,7 @@ public int hashCode() {
 DATETIME, // Date and time.
 BOOLEAN,  // Boolean.
 ARRAY,
+MAP,
 ROW;// The field is itself a nested row.
 
 private final FieldType fieldType = FieldType.of(this);
@@ -207,9 +209,13 @@ public int hashCode() {
 BYTE, INT16, INT32, INT64, DECIMAL, FLOAT, DOUBLE);
public static final Set<TypeName> STRING_TYPES = ImmutableSet.of(STRING);
public static final Set<TypeName> DATE_TYPES = ImmutableSet.of(DATETIME);
-public static final Set<TypeName> CONTAINER_TYPES = ImmutableSet.of(ARRAY);
+public static final Set<TypeName> COLLECTION_TYPES = ImmutableSet.of(ARRAY);
+public static final Set<TypeName> MAP_TYPES = ImmutableSet.of(MAP);
 public static final Set<TypeName> COMPOSITE_TYPES = ImmutableSet.of(ROW);
 
+public boolean isPrimitiveType() {
+  return !isCollectionType() && !isMapType() && !isCompositeType();
+}
 public boolean isNumericType() {
   return NUMERIC_TYPES.contains(this);
 }
@@ -219,8 +225,11 @@ public boolean isStringType() {
 public boolean isDateType() {
   return DATE_TYPES.contains(this);
 }
-public boolean isContainerType() {
-  return CONTAINER_TYPES.contains(this);
+public boolean 
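The MAP branch added to `estimatedSizeBytes` in the RowCoder diff above can be sketched in Python. This is a hedged illustration of the logic only, not Beam's API: string keys contribute their length, other primitive keys a fixed per-type estimate, values recurse through a caller-supplied estimator, and a 4-byte length prefix is added, mirroring the `4 + mapSizeBytes` return. The size table here is hypothetical:

```python
# Illustrative fixed-size estimates for primitive key types (not Beam's table).
ESTIMATED_FIELD_SIZES = {'INT32': 4, 'INT64': 8, 'DOUBLE': 8}

def estimated_map_size_bytes(value, key_type, estimate_value_size):
    """Estimate the encoded size of a map: per-key size plus recursively
    estimated value sizes, plus a 4-byte length prefix."""
    map_size_bytes = 0
    for k, v in value.items():
        if key_type == 'STRING':
            map_size_bytes += len(k)  # string keys: use their length
        else:
            map_size_bytes += ESTIMATED_FIELD_SIZES[key_type]
        map_size_bytes += estimate_value_size(v)  # recurse into the value
    return 4 + map_size_bytes
```

As the diff's own comment notes for STRING, using the character count rather than the encoded byte length is a deliberate accuracy/speed trade-off; the estimate only needs to be in the right ballpark.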

Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Spark_Gradle #107

2018-04-16 Thread Apache Jenkins Server
See 


--
[...truncated 1.24 MB...]
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$$anonfun$7.apply(JavaStreamingContext.scala:627)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$$anonfun$7.apply(JavaStreamingContext.scala:626)
at scala.Option.getOrElse(Option.scala:121)
at 
org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:828)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$.getOrCreate(JavaStreamingContext.scala:626)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext.getOrCreate(JavaStreamingContext.scala)
at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:169)
at 
org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:123)
at 
org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:83)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:346)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:328)
at 
org.apache.beam.runners.spark.translation.streaming.CreateStreamTest.testFirstElementLate(CreateStreamTest.java:240)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:317)
at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy3.processTestClass(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:108)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 

Build failed in Jenkins: beam_PostCommit_Python_Verify #4704

2018-04-16 Thread Apache Jenkins Server
See 


--
GitHub pull request #4387 of commit 1c585e8e4ea67a6d96a3a75114fcc65a3d4ec7d2, 
no merge conflicts.
Setting status of 1c585e8e4ea67a6d96a3a75114fcc65a3d4ec7d2 to PENDING with url 
https://builds.apache.org/job/beam_PostCommit_Python_Verify/4704/ and message: 
'Build started sha1 is merged.'
Using context: Jenkins: Python SDK PostCommit Tests
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/4387/*:refs/remotes/origin/pr/4387/*
 > git rev-parse refs/remotes/origin/pr/4387/merge^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/4387/merge^{commit} # timeout=10
Checking out Revision d4356d30e635d62224d29fb0262d05aef2c42a54 
(refs/remotes/origin/pr/4387/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f d4356d30e635d62224d29fb0262d05aef2c42a54
Commit message: "Merge 1c585e8e4ea67a6d96a3a75114fcc65a3d4ec7d2 into 
3b3f944d4b6aad10a20bc466f75da2e9210192ff"
First time build. Skipping changelog.
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PostCommit_Python_Verify] $ /bin/bash -xe 
/tmp/jenkins418061081312666917.sh
+ cd src
+ bash sdks/python/run_postcommit.sh

# pip install --user installation location.
LOCAL_PATH=$HOME/.local/bin/

# Remove any tox cache from previous workspace
# TODO(udim): Remove this line and add '-r' to tox invocation instead.
rm -rf sdks/python/target/.tox

# INFRA does not install these packages
pip install --user --upgrade virtualenv tox
/usr/local/lib/python2.7/dist-packages/pip/_vendor/urllib3/util/ssl_.py:339: 
SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  SNIMissingWarning
/usr/local/lib/python2.7/dist-packages/pip/_vendor/urllib3/util/ssl_.py:137: 
InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
/usr/local/lib/python2.7/dist-packages/pip/_vendor/urllib3/util/ssl_.py:137: 
InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Requirement already up-to-date: virtualenv in 
/home/jenkins/.local/lib/python2.7/site-packages (15.2.0)
/usr/local/lib/python2.7/dist-packages/pip/_vendor/urllib3/util/ssl_.py:137: 
InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Requirement already up-to-date: tox in 
/home/jenkins/.local/lib/python2.7/site-packages (3.0.0)
Requirement not upgraded as not directly required: py>=1.4.17 in 
/home/jenkins/.local/lib/python2.7/site-packages (from tox) (1.5.3)
Requirement not upgraded as not directly required: pluggy<1.0,>=0.3.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from tox) (0.6.0)
Requirement not upgraded as not directly required: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from tox) (1.11.0)
cheetah 2.4.4 requires Markdown>=2.0.1, which is not installed.
apache-beam 2.5.0.dev0 requires hdfs<3.0.0,>=2.1.0, which is not installed.
apache-beam 

[beam] 01/01: Merge pull request #5079 from XuMingmin/BEAM-2990

2018-04-16 Thread mingmxu
This is an automated email from the ASF dual-hosted git repository.

mingmxu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 3b3f944d4b6aad10a20bc466f75da2e9210192ff
Merge: a7819fe eae842f
Author: XuMingmin 
AuthorDate: Mon Apr 16 11:49:34 2018 -0700

Merge pull request #5079 from XuMingmin/BEAM-2990

[BEAM-2990] support MAP in SQL schema

 .../java/org/apache/beam/sdk/coders/RowCoder.java  |  17 ++-
 .../java/org/apache/beam/sdk/schemas/Schema.java   |  58 ++---
 .../apache/beam/sdk/util/RowJsonDeserializer.java  |   2 +-
 .../apache/beam/sdk/util/RowJsonValidation.java|   4 +-
 .../main/java/org/apache/beam/sdk/values/Row.java  |  79 +
 .../org/apache/beam/sdk/coders/RowCoderTest.java   |   8 +-
 .../org/apache/beam/sdk/schemas/SchemaTest.java|   8 +-
 .../beam/sdk/util/RowJsonDeserializerTest.java |   4 +-
 .../java/org/apache/beam/sdk/values/RowTest.java   | 108 -
 .../beam/sdk/extensions/sql/RowSqlTypes.java   |  27 -
 .../sql/impl/interpreter/BeamSqlFnExecutor.java|  12 ++
 .../interpreter/operator/BeamSqlPrimitive.java |   3 +
 .../operator/array/BeamSqlArrayItemExpression.java |   2 +-
 .../BeamSqlMapExpression.java} |  34 +++---
 .../BeamSqlMapItemExpression.java} |  17 +--
 .../interpreter/operator/map/package-info.java |  26 
 .../extensions/sql/impl/utils/CalciteUtils.java|  39 --
 .../beam/sdk/extensions/sql/BeamSqlMapTest.java| 131 +
 18 files changed, 489 insertions(+), 90 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
ming...@apache.org.


[beam] branch master updated (a7819fe -> 3b3f944)

2018-04-16 Thread mingmxu
This is an automated email from the ASF dual-hosted git repository.

mingmxu pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from a7819fe  Merge pull request #5118: Identify side inputs by transform 
id and local name
 add 432979d  support MAP in SQL schema
 add 83aa2e4  in MAP, key as primitive, and value can be 
primitive/array/map/row
 add 4ec9e60  use Collection for ARRAY type, and re-org `verify` code in 
`Row`
 add 1fab0a4  rebase as file conflict with #5089
 add eae842f  rename CollectionType to CollectionElementType
 new 3b3f944  Merge pull request #5079 from XuMingmin/BEAM-2990

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../java/org/apache/beam/sdk/coders/RowCoder.java  |  17 ++-
 .../java/org/apache/beam/sdk/schemas/Schema.java   |  58 ++---
 .../apache/beam/sdk/util/RowJsonDeserializer.java  |   2 +-
 .../apache/beam/sdk/util/RowJsonValidation.java|   4 +-
 .../main/java/org/apache/beam/sdk/values/Row.java  |  79 +
 .../org/apache/beam/sdk/coders/RowCoderTest.java   |   8 +-
 .../org/apache/beam/sdk/schemas/SchemaTest.java|   8 +-
 .../beam/sdk/util/RowJsonDeserializerTest.java |   4 +-
 .../java/org/apache/beam/sdk/values/RowTest.java   | 108 -
 .../beam/sdk/extensions/sql/RowSqlTypes.java   |  27 -
 .../sql/impl/interpreter/BeamSqlFnExecutor.java|  12 ++
 .../interpreter/operator/BeamSqlPrimitive.java |   3 +
 .../operator/array/BeamSqlArrayItemExpression.java |   2 +-
 .../BeamSqlMapExpression.java} |  24 ++--
 .../BeamSqlMapItemExpression.java} |  18 +--
 .../operator/{row => map}/package-info.java|   5 +-
 .../extensions/sql/impl/utils/CalciteUtils.java|  39 --
 .../beam/sdk/extensions/sql/BeamSqlMapTest.java| 131 +
 18 files changed, 457 insertions(+), 92 deletions(-)
 copy 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/operator/{array/BeamSqlArrayExpression.java
 => map/BeamSqlMapExpression.java} (74%)
 copy 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/operator/{collection/BeamSqlCardinalityExpression.java
 => map/BeamSqlMapItemExpression.java} (77%)
 copy 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/operator/{row
 => map}/package-info.java (94%)
 create mode 100644 
sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlMapTest.java

-- 
To stop receiving notification emails like this one, please contact
ming...@apache.org.


Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #85

2018-04-16 Thread Apache Jenkins Server
See 


--
[...truncated 18.45 MB...]
Apr 16, 2018 7:04:11 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues as step s17
Apr 16, 2018 7:04:11 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow) as step s18
Apr 16, 2018 7:04:11 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForData/BatchViewOverrides.GroupByKeyAndSortValuesOnly as step s19
Apr 16, 2018 7:04:11 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/ParMultiDo(ToIsmRecordForMapLike) as step s20
Apr 16, 2018 7:04:11 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForSize as step s21
Apr 16, 2018 7:04:11 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/ParDo(ToIsmMetadataRecordForSize) as step s22
Apr 16, 2018 7:04:11 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForKeys as step s23
Apr 16, 2018 7:04:11 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/ParDo(ToIsmMetadataRecordForKey) as step s24
Apr 16, 2018 7:04:11 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/Flatten.PCollections as step s25
Apr 16, 2018 7:04:11 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/CreateDataflowView as step s26
Apr 16, 2018 7:04:11 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Partition 
input as step s27
Apr 16, 2018 7:04:11 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Group by 
partition as step s28
Apr 16, 2018 7:04:11 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Batch 
mutations together as step s29
Apr 16, 2018 7:04:11 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Write 
mutations to Spanner as step s30
Apr 16, 2018 7:04:11 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Staging pipeline description to 
gs://temp-storage-for-end-to-end-tests/spannerwriteit0testwrite-jenkins-0416190400-8960da7/output/results/staging/
Apr 16, 2018 7:04:11 PM org.apache.beam.runners.dataflow.util.PackageUtil 
tryStagePackage
INFO: Uploading <80355 bytes, hash TxkhH_ERAmot2qUhnmikaw> to 
gs://temp-storage-for-end-to-end-tests/spannerwriteit0testwrite-jenkins-0416190400-8960da7/output/results/staging/pipeline-TxkhH_ERAmot2qUhnmikaw.pb

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testWrite STANDARD_OUT
Dataflow SDK version: 2.5.0-SNAPSHOT

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testWrite STANDARD_ERROR
Apr 16, 2018 7:04:13 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-04-16_12_04_12-6461375284801524899?project=apache-beam-testing

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testWrite STANDARD_OUT
Submitted job: 2018-04-16_12_04_12-6461375284801524899

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testWrite STANDARD_ERROR
Apr 16, 2018 7:04:13 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To cancel the job using the 'gcloud' tool, run:
> gcloud dataflow jobs --project=apache-beam-testing cancel 
--region=us-central1 2018-04-16_12_04_12-6461375284801524899
Apr 16, 2018 7:04:13 PM org.apache.beam.runners.dataflow.TestDataflowRunner 
run
INFO: Running 

Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Spark_Gradle #108

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[mingmxu] support MAP in SQL schema

[mingmxu] in MAP, key as primitive, and value can be primitive/array/map/row

[mingmxu] use Collection for ARRAY type, and re-org `verify` code in `Row`

[mingmxu] rebase as file conflict with #5089

[mingmxu] rename CollectionType to CollectionElementType

[github] Add region to dataflowOptions struct.

[sidhom] [BEAM-4056] Identify side inputs by transform id and local name

[sidhom] Add side input assertions to ExecutableStageMatcher

--
[...truncated 1.24 MB...]
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$$anonfun$7.apply(JavaStreamingContext.scala:627)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$$anonfun$7.apply(JavaStreamingContext.scala:626)
at scala.Option.getOrElse(Option.scala:121)
at 
org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:828)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$.getOrCreate(JavaStreamingContext.scala:626)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext.getOrCreate(JavaStreamingContext.scala)
at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:169)
at 
org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:123)
at 
org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:83)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:346)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:328)
at 
org.apache.beam.runners.spark.translation.streaming.CreateStreamTest.testFirstElementLate(CreateStreamTest.java:240)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:317)
at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy3.processTestClass(Unknown Source)
at 

[jira] [Work logged] (BEAM-4070) Disable cython profiling by default

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4070?focusedWorklogId=91451=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91451
 ]

ASF GitHub Bot logged work on BEAM-4070:


Author: ASF GitHub Bot
Created on: 16/Apr/18 19:53
Start Date: 16/Apr/18 19:53
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #5134: [BEAM-4070]: Make 
cython: profile=False by default
URL: https://github.com/apache/beam/pull/5134#issuecomment-381728118
 
 
   retest this please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91451)
Time Spent: 40m  (was: 0.5h)

> Disable cython profiling by default
> ---
>
> Key: BEAM-4070
> URL: https://issues.apache.org/jira/browse/BEAM-4070
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Enabling cython profiling adds some overhead.
> http://cython.readthedocs.io/en/latest/src/tutorial/profiling_tutorial.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4070) Disable cython profiling by default

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4070?focusedWorklogId=91462=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91462
 ]

ASF GitHub Bot logged work on BEAM-4070:


Author: ASF GitHub Bot
Created on: 16/Apr/18 20:23
Start Date: 16/Apr/18 20:23
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #5134: [BEAM-4070]: Make 
cython: profile=False by default
URL: https://github.com/apache/beam/pull/5134#issuecomment-381736955
 
 
   Ran distribution_counter_microbenchmark:
   ```
   Disable cython profiling:
   Per element update time cost: 1.94871425629e-08
   
   Enable cython profiling:
   Per element update time cost: 2.2584438324e-08
   ```
   
   Ran map_fn_microbenchmark.py
   ```
   Disable cython profiling:
   Fixed cost   0.912458370739
   Per-element  1.03488045028e-06
   R^2  0.950584928612
   
   Enable cython profiling:
   Fixed cost   0.912591088374
   Per-element  1.08467976252e-06
   R^2  0.924557343352
   ```
   
   It seems disabling profiling does improve performance (if only a little). 
But I think it's worthwhile to disable it at least along the hot path, e.g. 
operations.py, windowed_value.py. What do you think? @aaltay @robertwb
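
   For reference, the Cython profiling tutorial linked in the issue shows how the 
module-level directive and the per-function decorator combine in a .pyx file. This 
is a non-executable sketch; the function name is illustrative, not from the PR:

   ```python
   # cython: profile=False
   # ^ module-level directive the PR makes the default: no profiling hooks are
   #   compiled in, so there is no per-call bookkeeping overhead on the hot path.
   cimport cython

   @cython.profile(True)   # selectively re-enable profiling for one function
   def traced_update(x):   # hypothetical function, for illustration only
       return x + 1
   ```

   With the module default off, only functions explicitly decorated with 
`@cython.profile(True)` pay the profiling cost.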


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91462)
Time Spent: 1.5h  (was: 1h 20m)

> Disable cython profiling by default
> ---
>
> Key: BEAM-4070
> URL: https://issues.apache.org/jira/browse/BEAM-4070
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Enabling cython profiling adds some overhead.
> http://cython.readthedocs.io/en/latest/src/tutorial/profiling_tutorial.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4028) Step / Operation naming should rely on a NameContext class

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4028?focusedWorklogId=91471=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91471
 ]

ASF GitHub Bot logged work on BEAM-4028:


Author: ASF GitHub Bot
Created on: 16/Apr/18 20:35
Start Date: 16/Apr/18 20:35
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #5135: [BEAM-4028] 
Transitioning MapTask objects to NameContext
URL: https://github.com/apache/beam/pull/5135#issuecomment-381740269
 
 
   Run Python PostCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91471)
Time Spent: 3.5h  (was: 3h 20m)

> Step / Operation naming should rely on a NameContext class
> --
>
> Key: BEAM-4028
> URL: https://issues.apache.org/jira/browse/BEAM-4028
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Steps can have different names depending on the runner (stage, step, user, 
> system name...). 
> Depending on the needs of different components (operations, logging, metrics, 
> statesampling) these step names are passed around without a specific order.
> Instead, SDK should rely on `NameContext` objects that carry all the naming 
> information for a single step.
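
A minimal sketch of the idea described above — one object carrying every name a step can have, passed around instead of loose strings. This is hypothetical; the actual `NameContext` class in the Beam Python SDK may differ in fields and API.

```python
from typing import NamedTuple, Optional

class NameContext(NamedTuple):
    """Carries all the names a single step can have across components."""
    step_name: str                    # runner-facing step name, e.g. "s2"
    user_name: Optional[str] = None   # name given by the pipeline author
    system_name: Optional[str] = None # internal system name
    stage_name: Optional[str] = None  # fused-stage name, if any

    def logging_name(self) -> str:
        # Prefer the user-visible name for logs; fall back to the step name.
        return self.user_name or self.step_name

ctx = NameContext(step_name="s2", user_name="ParDo(MyFn)")
```

Operations, logging, metrics, and state sampling would then all accept a `NameContext` rather than an unordered handful of name strings.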



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-4090) Fork/Update Primitive Implementations for the ReferenceRunner

2018-04-16 Thread Thomas Groh (JIRA)
Thomas Groh created BEAM-4090:
-

 Summary: Fork/Update Primitive Implementations for the 
ReferenceRunner
 Key: BEAM-4090
 URL: https://issues.apache.org/jira/browse/BEAM-4090
 Project: Beam
  Issue Type: Bug
  Components: runner-direct
Reporter: Thomas Groh
Assignee: Thomas Groh


The primitives that require implementation are:

Impulse, Flatten, GroupByKey

GroupByKey may be implemented by PartitionByKey/GroupByKeyAndWindow 

The primitives that may be implemented as well are:

AssignWindows, for known WindowFns
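
The GroupByKey decomposition mentioned above can be sketched as grouping windowed values by (key, window) pairs. This is a toy illustration of the idea, not the ReferenceRunner's actual implementation; names and the triple representation are assumptions:

```python
from collections import defaultdict

def group_by_key_and_window(windowed_kvs):
    """windowed_kvs: iterable of (key, value, window) triples.

    Returns a dict mapping each (key, window) pair to the list of values
    that arrived for that key in that window.
    """
    groups = defaultdict(list)
    for key, value, window in windowed_kvs:
        groups[(key, window)].append(value)
    return dict(groups)

result = group_by_key_and_window([
    ("a", 1, "w0"), ("a", 2, "w0"), ("a", 3, "w1"), ("b", 4, "w0"),
])
```

A real runner would first partition by key so each (key, window) group lands on one worker, then group within the window as above.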



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1369

2018-04-16 Thread Apache Jenkins Server
See 


Changes:

[mingmxu] support MAP in SQL schema

[mingmxu] in MAP, key as primitive, and value can be primitive/array/map/row

[mingmxu] use Collection for ARRAY type, and re-org `verify` code in `Row`

[mingmxu] rebase as file conflict with #5089

[mingmxu] rename CollectionType to CollectionElementType

[github] Add region to dataflowOptions struct.

[sidhom] [BEAM-4056] Identify side inputs by transform id and local name

[sidhom] Add side input assertions to ExecutableStageMatcher

--
[...truncated 4.41 KB...]

# Make sure to unalias pydoc if it's already there
alias pydoc 2>/dev/null >/dev/null && unalias pydoc

pydoc () {
python -m pydoc "$@"
}

# This should detect bash and zsh, which have a hash command that must
# be called to get it to forget past commands.  Without forgetting
# past commands the $PATH changes we made may not be respected
if [ -n "${BASH-}" ] || [ -n "${ZSH_VERSION-}" ] ; then
hash -r 2>/dev/null
fi
cd sdks/python
pip install -e .[gcp,test]
Obtaining 
file://
Collecting avro<2.0.0,>=1.8.1 (from apache-beam==2.5.0.dev0)
:339:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  SNIMissingWarning
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Collecting crcmod<2.0,>=1.7 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Collecting dill==0.2.6 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
Collecting grpcio<2,>=1.8 (from apache-beam==2.5.0.dev0)
:137:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning
  Using cached 
https://files.pythonhosted.org/packages/0d/54/b647a6323be6526be27b2c90bb042769f1a7a6e59bd1a5f2eeb795bfece4/grpcio-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting hdfs<3.0.0,>=2.1.0 (from 

[jira] [Commented] (BEAM-3669) Linter error in statesampler_fast

2018-04-16 Thread Pablo Estrada (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439860#comment-16439860
 ] 

Pablo Estrada commented on BEAM-3669:
-

Boyuan's investigation shows that this error happens when the output from Cython 
compilation has not been cleaned up before lint is run. We can ensure that lint 
always cleans up before it runs, since users may run it independently.
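
The proposed cleanup step could look like the sketch below: delete generated .c/.so files so pylint only ever sees .py/.pyx sources. It runs against a scratch directory for demonstration; the real target paths (e.g. sdks/python/apache_beam) are assumptions from the build layout.

```python
import pathlib
import tempfile

def clean_cython_output(root: pathlib.Path) -> int:
    """Delete Cython build output (.c/.so) under root; return count removed."""
    removed = 0
    for pattern in ("*.c", "*.so"):
        for path in root.rglob(pattern):
            path.unlink()
            removed += 1
    return removed

# Demonstrate on a scratch tree mimicking the failing module's location.
with tempfile.TemporaryDirectory() as d:
    pkg = pathlib.Path(d) / "apache_beam" / "runners" / "worker"
    pkg.mkdir(parents=True)
    (pkg / "statesampler_fast.c").touch()    # generated: should be removed
    (pkg / "statesampler_fast.so").touch()   # generated: should be removed
    (pkg / "statesampler_fast.pyx").touch()  # source: must survive
    removed = clean_cython_output(pathlib.Path(d))
    survivors = [p.name for p in pkg.iterdir()]
```

Wiring this in before the pylint invocation would make the lint result independent of whether a Cython build ran first.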

> Linter error in statesampler_fast
> -
>
> Key: BEAM-3669
> URL: https://issues.apache.org/jira/browse/BEAM-3669
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Pablo Estrada
>Priority: Major
>
> This is a precommit failure, but I have seen it in 2 unrelated pre-commits. 
> It is possible that this is an issue with the head:
> Link to a job: 
> https://builds.apache.org/job/beam_PreCommit_Python_MavenInstall/2661/org.apache.beam$beam-sdks-python/console
>  
> Running pylint for module apache_beam:
> * Module apache_beam.runners.worker.statesampler_fast
> E:  1, 0: compile() expected string without null bytes (syntax-error)
> * Module apache_beam.coders.stream
> E:  1, 0: compile() expected string without null bytes (syntax-error)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Apex_Gradle #108

2018-04-16 Thread Apache Jenkins Server
See 


--
[...truncated 27.96 MB...]
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 1 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 2 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 3 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 4 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 5 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 6 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 7 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 8 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 9 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 10 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 13 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 14 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 15 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 16 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 17 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 18 sending EndOfStream
Apr 16, 2018 6:57:21 PM com.datatorrent.stram.engine.Node emitEndStream
INFO: 19 sending EndOfStream
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.engine.StreamingContainer 
processHeartbeatResponse
INFO: Undeploy request: [8, 9, 10]
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.engine.StreamingContainer 
undeploy
INFO: Undeploy complete.
Apr 16, 2018 6:57:22 PM com.datatorrent.bufferserver.server.Server$3 run
INFO: Removing ln 
LogicalNode@3a002c76identifier=tcp://localhost:46986/10.output.11, 
upstream=10.output.11, group=stream6/13.data1, partitions=[], 
iterator=com.datatorrent.bufferserver.internal.DataList$DataListIterator@4f57325c{da=com.datatorrent.bufferserver.internal.DataList$Block@37c153bd{identifier=10.output.11,
 data=1048576, readingOffset=0, writingOffset=243, 
starting_window=5ad4f20e0001, ending_window=5ad4f20e0008, refCount=2, 
uniqueIdentifier=0, next=null, future=null}}} from dl 
com.datatorrent.bufferserver.internal.DataList@2c1d56a0 {10.output.11}
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.engine.StreamingContainer 
processHeartbeatResponse
INFO: Undeploy request: [14]
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.Journal write
WARNING: Journal output stream is null. Skipping write to the WAL.
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.engine.StreamingContainer 
undeploy
INFO: Undeploy complete.
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.engine.StreamingContainer 
processHeartbeatResponse
INFO: Undeploy request: [2, 3, 4, 5, 6]
Apr 16, 2018 6:57:22 PM com.datatorrent.stram.engine.StreamingContainer 
undeploy
INFO: Undeploy complete.
Apr 16, 2018 6:57:22 PM 

[jira] [Work logged] (BEAM-4070) Disable cython profiling by default

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4070?focusedWorklogId=91457=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91457
 ]

ASF GitHub Bot logged work on BEAM-4070:


Author: ASF GitHub Bot
Created on: 16/Apr/18 20:09
Start Date: 16/Apr/18 20:09
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #5134: [BEAM-4070]: Make 
cython: profile=False by default
URL: https://github.com/apache/beam/pull/5134#issuecomment-381732896
 
 
   retest this please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91457)
Time Spent: 1h 20m  (was: 1h 10m)

> Disable cython profiling by default
> ---
>
> Key: BEAM-4070
> URL: https://issues.apache.org/jira/browse/BEAM-4070
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Enabling cython profiling adds some overhead.
> http://cython.readthedocs.io/en/latest/src/tutorial/profiling_tutorial.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4070) Disable cython profiling by default

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4070?focusedWorklogId=91455=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91455
 ]

ASF GitHub Bot logged work on BEAM-4070:


Author: ASF GitHub Bot
Created on: 16/Apr/18 20:08
Start Date: 16/Apr/18 20:08
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #5134: [BEAM-4070]: Make 
cython: profile=False by default
URL: https://github.com/apache/beam/pull/5134#issuecomment-381732581
 
 
   retest it please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91455)
Time Spent: 1h  (was: 50m)

> Disable cython profiling by default
> ---
>
> Key: BEAM-4070
> URL: https://issues.apache.org/jira/browse/BEAM-4070
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Enabling cython profiling adds some overhead.
> http://cython.readthedocs.io/en/latest/src/tutorial/profiling_tutorial.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4070) Disable cython profiling by default

2018-04-16 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4070?focusedWorklogId=91454=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91454
 ]

ASF GitHub Bot logged work on BEAM-4070:


Author: ASF GitHub Bot
Created on: 16/Apr/18 20:08
Start Date: 16/Apr/18 20:08
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #5134: [BEAM-4070]: Make 
cython: profile=False by default
URL: https://github.com/apache/beam/pull/5134#issuecomment-381732581
 
 
   Retest please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 91454)
Time Spent: 50m  (was: 40m)

> Disable cython profiling by default
> ---
>
> Key: BEAM-4070
> URL: https://issues.apache.org/jira/browse/BEAM-4070
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Enabling cython profiling adds some overhead.
> http://cython.readthedocs.io/en/latest/src/tutorial/profiling_tutorial.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

