Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1175

2018-03-23 Thread Apache Jenkins Server
See 


--
[...truncated 777.76 KB...]
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": ""
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": "kind:pair", 
  "component_encodings": [
{
  "@type": "kind:bytes"
}, 
{
  "@type": 
"VarIntCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxhiUWeeSXOIA5XIYNmYyFjbSFTkh4A89cR+g==",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "compute/MapToVoidKey0.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s2"
}, 
"serialized_fn": "", 
"user_name": "compute/MapToVoidKey0"
  }
}
  ], 
  "type": "JOB_TYPE_BATCH"
}
root: INFO: Create job: 
root: INFO: Created job with id: [2018-03-23_20_24_57-14294174990347249966]
root: INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-03-23_20_24_57-14294174990347249966?project=apache-beam-testing
root: INFO: Job 2018-03-23_20_24_57-14294174990347249966 is in state 
JOB_STATE_PENDING
root: INFO: 2018-03-24T03:24:57.401Z: JOB_MESSAGE_WARNING: Job 
2018-03-23_20_24_57-14294174990347249966 might autoscale up to 1000 workers.
root: INFO: 2018-03-24T03:24:57.427Z: JOB_MESSAGE_DETAILED: Autoscaling is 
enabled for job 2018-03-23_20_24_57-14294174990347249966. The number of workers 
will be between 1 and 1000.
root: INFO: 2018-03-24T03:24:57.457Z: JOB_MESSAGE_DETAILED: Autoscaling was 
automatically enabled for job 2018-03-23_20_24_57-14294174990347249966.
root: INFO: 2018-03-24T03:25:00.335Z: JOB_MESSAGE_DETAILED: Checking required 
Cloud APIs are enabled.
root: INFO: 2018-03-24T03:25:00.437Z: JOB_MESSAGE_DETAILED: Checking 
permissions granted to controller Service Account.
root: INFO: 2018-03-24T03:25:01.080Z: JOB_MESSAGE_DETAILED: Expanding 
CoGroupByKey operations into optimizable parts.
root: INFO: 2018-03-24T03:25:01.115Z: JOB_MESSAGE_DEBUG: Combiner lifting 
skipped for step assert_that/Group/GroupByKey: GroupByKey not followed by a 
combiner.
root: INFO: 2018-03-24T03:25:01.144Z: JOB_MESSAGE_DETAILED: Expanding 
GroupByKey operations into optimizable parts.
root: INFO: 2018-03-24T03:25:01.170Z: JOB_MESSAGE_DETAILED: Lifting 
ValueCombiningMappingFns into MergeBucketsMappingFns
root: INFO: 2018-03-24T03:25:01.204Z: JOB_MESSAGE_DEBUG: Annotating graph with 
Autotuner information.
root: INFO: 2018-03-24T03:25:01.233Z: JOB_MESSAGE_DETAILED: Fusing adjacent 
ParDo, Read, Write, and Flatten operations
root: INFO: 2018-03-24T03:25:01.263Z: JOB_MESSAGE_DETAILED: Unzipping flatten 
s11 for input s10.out
root: INFO: 2018-03-24T03:25:01.292Z: JOB_MESSAGE_DETAILED: Fusing unzipped 
copy of assert_that/Group/GroupByKey/Reify, through flatten 
assert_that/Group/Flatten, into producer assert_that/Group/pair_with_1
root: INFO: 2018-03-24T03:25:01.317Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Group/GroupByKey/GroupByWindow into 
assert_that/Group/GroupByKey/Read
root: INFO: 2018-03-24T03:25:01.348Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Unkey into assert_that/Group/Map(_merge_tagged_vals_under_key)
root: INFO: 2018-03-24T03:25:01.376Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Match into assert_that/Unkey
root: INFO: 2018-03-24T03:25:01.399Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Group/Map(_merge_tagged_vals_under_key) into 
assert_that/Group/GroupByKey/GroupByWindow
root: INFO: 2018-03-24T03:25:01.422Z: JOB_MESSAGE_DETAILED: Unzipping flatten 
s11-u13 for input s12-reify-value0-c11
root: INFO: 2018-03-24T03:25:01.446Z: JOB_MESSAGE_DETAILED: Fusing unzipped 
copy of assert_that/Group/GroupByKey/Write, through flatten s11-u13, into 
producer assert_that/Group/GroupByKey/Reify
root: INFO: 

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #5211

2018-03-23 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_Verify #4495

2018-03-23 Thread Apache Jenkins Server
See 


--
[...truncated 290.12 KB...]
-from hamcrest.core.core.allof import all_of
-from nose.plugins.attrib import attr
-
 from apache_beam.examples import wordcount
 from apache_beam.examples import wordcount_fnapi
 from apache_beam.testing.pipeline_verifiers import FileChecksumMatcher
 from apache_beam.testing.pipeline_verifiers import PipelineStateMatcher
 from apache_beam.testing.test_pipeline import TestPipeline
 from apache_beam.testing.test_utils import delete_files
+from hamcrest.core.core.allof import all_of
+from nose.plugins.attrib import attr
 
 
 class WordCountIT(unittest.TestCase):
ERROR: 

 Imports are incorrectly sorted.
--- :before 2018-03-23 17:50:29.204663
+++ :after  2018-03-24 03:09:42.687154
@@ -29,15 +29,14 @@
 import unittest
 import uuid
 
-from hamcrest.core.core.allof import all_of
-from nose.plugins.attrib import attr
-
 from apache_beam.examples import streaming_wordcount
 from apache_beam.io.gcp.tests.pubsub_matcher import PubSubMessageMatcher
 from apache_beam.runners.runner import PipelineState
 from apache_beam.testing import test_utils
 from apache_beam.testing.pipeline_verifiers import PipelineStateMatcher
 from apache_beam.testing.test_pipeline import TestPipeline
+from hamcrest.core.core.allof import all_of
+from nose.plugins.attrib import attr
 
 INPUT_TOPIC = 'wc_topic_input'
 OUTPUT_TOPIC = 'wc_topic_output'
ERROR: 

 Imports are incorrectly sorted.
--- :before 2018-01-24 00:22:36.719312
+++ :after  2018-03-24 03:09:43.217807
@@ -21,14 +21,13 @@
 import time
 import unittest
 
-from hamcrest.core.core.allof import all_of
-from nose.plugins.attrib import attr
-
 from apache_beam.examples.cookbook import bigquery_tornadoes
 from apache_beam.io.gcp.tests import utils
 from apache_beam.io.gcp.tests.bigquery_matcher import BigqueryMatcher
 from apache_beam.testing.pipeline_verifiers import PipelineStateMatcher
 from apache_beam.testing.test_pipeline import TestPipeline
+from hamcrest.core.core.allof import all_of
+from nose.plugins.attrib import attr
 
 
 class BigqueryTornadoesIT(unittest.TestCase):
ERROR: 

 Imports are incorrectly sorted.
--- :before 2018-01-24 00:22:36.959311
+++ :after  2018-03-24 03:09:43.361293
@@ -20,12 +20,11 @@
 import logging
 import unittest
 
-from hamcrest.core.assert_that import assert_that as hc_assert_that
-from hamcrest.core.base_matcher import BaseMatcher
-
 from apache_beam.internal import pickler
 from apache_beam.options.pipeline_options import PipelineOptions
 from apache_beam.testing.test_pipeline import TestPipeline
+from hamcrest.core.assert_that import assert_that as hc_assert_that
+from hamcrest.core.base_matcher import BaseMatcher
 
 
 # A simple matcher that is used for testing extra options appending.
ERROR: 

 Imports are incorrectly sorted.
--- :before 2018-01-24 00:22:36.959311
+++ :after  2018-03-24 03:09:43.376873
@@ -25,12 +25,11 @@
 import logging
 import time
 
-from hamcrest.core.base_matcher import BaseMatcher
-
 from apache_beam.io.filesystems import FileSystems
 from apache_beam.runners.runner import PipelineState
 from apache_beam.testing import test_utils as utils
 from apache_beam.utils import retry
+from hamcrest.core.base_matcher import BaseMatcher
 
 __all__ = [
 'PipelineStateMatcher',
ERROR: 
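All four isort failures above follow the same pattern: the `hamcrest` and `nose` imports were placed before the `apache_beam` ones, and the checker wants them sorted alphabetically by module path within the group. A minimal sketch of that ordering rule (plain alphabetical sort, which matches what every diff above does; not the actual isort implementation):

```python
def module_path(import_line):
    # "from a.b.c import d" -> "a.b.c"
    return import_line.split()[1]

imports = [
    "from hamcrest.core.core.allof import all_of",
    "from nose.plugins.attrib import attr",
    "from apache_beam.examples import wordcount",
]

# Sorting by module path puts apache_beam.* first, then hamcrest.*, then
# nose.* -- exactly the rearrangement shown in each diff above.
fixed = sorted(imports, key=module_path)
assert fixed == [
    "from apache_beam.examples import wordcount",
    "from hamcrest.core.core.allof import all_of",
    "from nose.plugins.attrib import attr",
]
```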

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #5210

2018-03-23 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #6285

2018-03-23 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83896&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83896
 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 24/Mar/18 00:48
Start Date: 24/Mar/18 00:48
Worklog Time Spent: 10m 
  Work Description: cclauss commented on issue #4798: [BEAM-3738] Add more 
flake8 tests to run_pylint.sh
URL: https://github.com/apache/beam/pull/4798#issuecomment-375833548
 
 
   retest this please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83896)
Time Spent: 13h  (was: 12h 50m)

> Enable Py3 linting in Jenkins
> -
>
> Key: BEAM-3738
> URL: https://issues.apache.org/jira/browse/BEAM-3738
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 13h
>  Remaining Estimate: 0h
>
> After BEAM-3671 is finished enable linting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83895&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83895
 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 24/Mar/18 00:47
Start Date: 24/Mar/18 00:47
Worklog Time Spent: 10m 
  Work Description: cclauss commented on issue #4798: [BEAM-3738] Add more 
flake8 tests to run_pylint.sh
URL: https://github.com/apache/beam/pull/4798#issuecomment-375833548
 
 
   retest this please




Issue Time Tracking
---

Worklog Id: (was: 83895)
Time Spent: 12h 50m  (was: 12h 40m)

> Enable Py3 linting in Jenkins
> -
>
> Key: BEAM-3738
> URL: https://issues.apache.org/jira/browse/BEAM-3738
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 12h 50m
>  Remaining Estimate: 0h
>
> After BEAM-3671 is finished enable linting.





Build failed in Jenkins: beam_PerformanceTests_Spark #1503

2018-03-23 Thread Apache Jenkins Server
See 


Changes:

[coheigea] Put the String literal first when comparing it to an Object

[jb] [BEAM-3500] "Attach" JDBC connection to the bundle (improve the pooling)

[jb] [BEAM-3500] Test if the user provides both withDataSourceConfiguration()

[jb] [BEAM-3500] Wrap the datasource as a poolable datasource and expose

[jb] [BEAM-3500] Add commons-pool2 dependency

[jb] [BEAM-3500] Only expose max number of connections in the pool to the

[jb] [BEAM-3500] Cleanup pool configuration parameters

[jb] [BEAM-3500] Remove dataSourceFactory

[jb] [BEAM-3500] Remove unecessary check on dataSourceConfiguration

[mariand] Switched AvroIO default codec to snappyCodec().

[wcn] Updating generated files.

[aaltay] [BEAM-3738] Enable py3 lint and cleanup tox.ini. (#4877)

--
[...truncated 92.10 KB...]
'apache-beam-testing:bqjob_r36c5fedf4647a044_016255745b6b_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-03-24 00:41:53,696 61827c41 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.
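The retry loop cannot succeed here, because the error is deterministic: with `--autodetect`, BigQuery infers column types from the JSON values, and a Unix-epoch timestamp serialized as a bare number is inferred as FLOAT, which conflicts with the table's existing TIMESTAMP column. A sketch with a hypothetical result row (the real `pkb_results` payload is truncated above):

```python
import json

# Hypothetical pkb result row; the real payload is not shown in the log.
# The timestamp field is emitted as an epoch float, i.e. a JSON number.
row = json.loads('{"metric": "latency_ms", "timestamp": 1521852113.696}')

# --autodetect sees a JSON number, so it proposes FLOAT for this column,
# producing "Invalid schema update. Field timestamp has changed type from
# TIMESTAMP to FLOAT" against the existing table schema.
assert isinstance(row["timestamp"], float)
```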

2018-03-24 00:42:19,549 61827c41 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-03-24 00:42:22,626 61827c41 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: Upload complete.
Waiting on bqjob_r33739105a75ab995_01625574cb65_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r33739105a75ab995_01625574cb65_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r33739105a75ab995_01625574cb65_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-03-24 00:42:22,626 61827c41 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-03-24 00:42:42,489 61827c41 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-03-24 00:42:44,775 61827c41 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: Upload complete.
Waiting on bqjob_r451b44c0be05a94c_016255752321_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r451b44c0be05a94c_016255752321_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r451b44c0be05a94c_016255752321_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-03-24 00:42:44,775 61827c41 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-03-24 00:43:11,129 61827c41 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-03-24 00:43:13,298 61827c41 

[jira] [Resolved] (BEAM-3881) Failure reading backlog in KinesisIO

2018-03-23 Thread Kevin Peterson (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Peterson resolved BEAM-3881.
--
   Resolution: Fixed
Fix Version/s: 2.4.0

Was fixed in 2.4.

> Failure reading backlog in KinesisIO
> 
>
> Key: BEAM-3881
> URL: https://issues.apache.org/jira/browse/BEAM-3881
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kinesis
>Reporter: Kevin Peterson
>Assignee: Alexey Romanenko
>Priority: Major
> Fix For: 2.4.0
>
>
> I'm getting an error when reading from Kinesis in my pipeline. Using Beam 
> v2.3, running on Google Cloud Dataflow.
> I'm constructing the source via:
> {code:java}
> KinesisIO.Read read = KinesisIO
> .read()
> .withAWSClientsProvider(
> configuration.getAwsAccessKeyId(),
> configuration.getAwsSecretAccessKey(),
> region)
> .withStreamName(configuration.getKinesisStream())
> .withUpToDateThreshold(Duration.standardMinutes(30))
> .withInitialTimestampInStream(configuration.getStartTime());
> {code}
> The exception is:
> {noformat}
> Mar 19, 2018 12:54:41 PM 
> org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
> SEVERE: 2018-03-19T19:54:53.010Z: (2896b8774de760ec): 
> java.lang.RuntimeException: Unknown kinesis failure, when trying to reach 
> kinesis
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:223)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:161)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:150)
> org.apache.beam.sdk.io.kinesis.KinesisReader.getTotalBacklogBytes(KinesisReader.java:200)
> com.google.cloud.dataflow.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:398)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1199)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:137)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:940)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ArithmeticException: Value cannot fit in an int: 
> 153748225435
> org.joda.time.field.FieldUtils.safeToInt(FieldUtils.java:206)
> org.joda.time.field.BaseDurationField.getDifference(BaseDurationField.java:141)
> org.joda.time.base.BaseSingleFieldPeriod.between(BaseSingleFieldPeriod.java:72)
> org.joda.time.Minutes.minutesBetween(Minutes.java:101)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.lambda$getBacklogBytes$3(SimplifiedKinesisClient.java:163)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:205)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:161)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:150)
> org.apache.beam.sdk.io.kinesis.KinesisReader.getTotalBacklogBytes(KinesisReader.java:200)
> com.google.cloud.dataflow.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:398)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1199)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:137)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:940)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745){noformat}






Jenkins build is unstable: beam_PostCommit_Java_MavenInstall #6284

2018-03-23 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1174

2018-03-23 Thread Apache Jenkins Server
See 


Changes:

[wcn] Updating generated files.

--
[...truncated 769.50 KB...]
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": ""
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": "kind:pair", 
  "component_encodings": [
{
  "@type": "kind:bytes"
}, 
{
  "@type": 
"VarIntCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxhiUWeeSXOIA5XIYNmYyFjbSFTkh4A89cR+g==",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "compute/MapToVoidKey0.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s2"
}, 
"serialized_fn": "", 
"user_name": "compute/MapToVoidKey0"
  }
}
  ], 
  "type": "JOB_TYPE_BATCH"
}
root: INFO: Create job: 
root: INFO: Created job with id: [2018-03-23_17_20_23-17322368233364390018]
root: INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-03-23_17_20_23-17322368233364390018?project=apache-beam-testing
root: INFO: Job 2018-03-23_17_20_23-17322368233364390018 is in state 
JOB_STATE_PENDING
root: INFO: 2018-03-24T00:20:23.955Z: JOB_MESSAGE_WARNING: Job 
2018-03-23_17_20_23-17322368233364390018 might autoscale up to 1000 workers.
root: INFO: 2018-03-24T00:20:23.974Z: JOB_MESSAGE_DETAILED: Autoscaling is 
enabled for job 2018-03-23_17_20_23-17322368233364390018. The number of workers 
will be between 1 and 1000.
root: INFO: 2018-03-24T00:20:24.003Z: JOB_MESSAGE_DETAILED: Autoscaling was 
automatically enabled for job 2018-03-23_17_20_23-17322368233364390018.
root: INFO: 2018-03-24T00:20:27.192Z: JOB_MESSAGE_DETAILED: Checking required 
Cloud APIs are enabled.
root: INFO: 2018-03-24T00:20:27.368Z: JOB_MESSAGE_DETAILED: Checking 
permissions granted to controller Service Account.
root: INFO: 2018-03-24T00:20:28.113Z: JOB_MESSAGE_DETAILED: Expanding 
CoGroupByKey operations into optimizable parts.
root: INFO: 2018-03-24T00:20:28.156Z: JOB_MESSAGE_DEBUG: Combiner lifting 
skipped for step assert_that/Group/GroupByKey: GroupByKey not followed by a 
combiner.
root: INFO: 2018-03-24T00:20:28.194Z: JOB_MESSAGE_DETAILED: Expanding 
GroupByKey operations into optimizable parts.
root: INFO: 2018-03-24T00:20:28.237Z: JOB_MESSAGE_DETAILED: Lifting 
ValueCombiningMappingFns into MergeBucketsMappingFns
root: INFO: 2018-03-24T00:20:28.280Z: JOB_MESSAGE_DEBUG: Annotating graph with 
Autotuner information.
root: INFO: 2018-03-24T00:20:28.321Z: JOB_MESSAGE_DETAILED: Fusing adjacent 
ParDo, Read, Write, and Flatten operations
root: INFO: 2018-03-24T00:20:28.353Z: JOB_MESSAGE_DETAILED: Unzipping flatten 
s11 for input s10.out
root: INFO: 2018-03-24T00:20:28.385Z: JOB_MESSAGE_DETAILED: Fusing unzipped 
copy of assert_that/Group/GroupByKey/Reify, through flatten 
assert_that/Group/Flatten, into producer assert_that/Group/pair_with_1
root: INFO: 2018-03-24T00:20:28.421Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Group/GroupByKey/GroupByWindow into 
assert_that/Group/GroupByKey/Read
root: INFO: 2018-03-24T00:20:28.439Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Unkey into assert_that/Group/Map(_merge_tagged_vals_under_key)
root: INFO: 2018-03-24T00:20:28.468Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Match into assert_that/Unkey
root: INFO: 2018-03-24T00:20:28.494Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Group/Map(_merge_tagged_vals_under_key) into 
assert_that/Group/GroupByKey/GroupByWindow
root: INFO: 2018-03-24T00:20:28.532Z: JOB_MESSAGE_DETAILED: Unzipping flatten 
s11-u13 for input s12-reify-value0-c11
root: INFO: 2018-03-24T00:20:28.554Z: JOB_MESSAGE_DETAILED: Fusing unzipped 
copy of assert_that/Group/GroupByKey/Write, through flatten 

Build failed in Jenkins: beam_PerformanceTests_Python #1059

2018-03-23 Thread Apache Jenkins Server
See 


Changes:

[coheigea] Put the String literal first when comparing it to an Object

[jb] [BEAM-3500] "Attach" JDBC connection to the bundle (improve the pooling)

[jb] [BEAM-3500] Test if the user provides both withDataSourceConfiguration()

[jb] [BEAM-3500] Wrap the datasource as a poolable datasource and expose

[jb] [BEAM-3500] Add commons-pool2 dependency

[jb] [BEAM-3500] Only expose max number of connections in the pool to the

[jb] [BEAM-3500] Cleanup pool configuration parameters

[jb] [BEAM-3500] Remove dataSourceFactory

[jb] [BEAM-3500] Remove unecessary check on dataSourceConfiguration

[mariand] Switched AvroIO default codec to snappyCodec().

[wcn] Updating generated files.

[aaltay] [BEAM-3738] Enable py3 lint and cleanup tox.ini. (#4877)

--
[...truncated 59.71 KB...]
[INFO] Copying 212 resources
[INFO] 
[INFO] --- maven-resources-plugin:3.0.2:copy-resources (copy-go-cmd-source) @ 
beam-sdks-go ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 6 resources
[INFO] 
[INFO] --- maven-assembly-plugin:3.1.0:single (export-go-pkg-sources) @ 
beam-sdks-go ---
[INFO] Reading assembly descriptor: descriptor.xml
[INFO] Building zip: 

[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) 
@ beam-sdks-go ---
[INFO] 
[INFO] --- mvn-golang-wrapper:2.1.6:get (go-get-imports) @ beam-sdks-go ---
[INFO] Prepared command line : bin/go get google.golang.org/grpc 
golang.org/x/oauth2/google google.golang.org/api/storage/v1 
github.com/spf13/cobra cloud.google.com/go/bigquery 
google.golang.org/api/googleapi google.golang.org/api/dataflow/v1b3
[INFO] 
[INFO] --- mvn-golang-wrapper:2.1.6:build (go-build) @ beam-sdks-go ---
[INFO] Prepared command line : bin/go build -buildmode=default -o 

 github.com/apache/beam/sdks/go/cmd/beamctl
[INFO] The Result file has been successfuly created : 

[INFO] 
[INFO] --- mvn-golang-wrapper:2.1.6:build (go-build-linux-amd64) @ beam-sdks-go 
---
[INFO] Prepared command line : bin/go build -buildmode=default -o 

 github.com/apache/beam/sdks/go/cmd/beamctl
[INFO] The Result file has been successfuly created : 

[INFO] 
[INFO] --- maven-checkstyle-plugin:3.0.0:check (default) @ beam-sdks-go ---
[INFO] 
[INFO] --- mvn-golang-wrapper:2.1.6:test (go-test) @ beam-sdks-go ---
[INFO] Prepared command line : bin/go test ./...
[INFO] 
[INFO] -Exec.Out-
[INFO] ?github.com/apache/beam/sdks/go/cmd/beamctl  [no test files]
[INFO] ?github.com/apache/beam/sdks/go/cmd/beamctl/cmd  [no test files]
[INFO] ?github.com/apache/beam/sdks/go/cmd/specialize   [no test files]
[INFO] ?github.com/apache/beam/sdks/go/cmd/symtab   [no test files]
[INFO] ok   github.com/apache/beam/sdks/go/pkg/beam 0.096s
[INFO] ok   github.com/apache/beam/sdks/go/pkg/beam/artifact0.161s
[INFO] 
[ERROR] 
[ERROR] -Exec.Err-
[ERROR] # github.com/apache/beam/sdks/go/pkg/beam/util/gcsx
[ERROR] github.com/apache/beam/sdks/go/pkg/beam/util/gcsx/gcs.go:46:37: 
undefined: option.WithoutAuthentication
[ERROR] 
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Beam :: Parent .. SUCCESS [ 10.137 s]
[INFO] Apache Beam :: SDKs :: Java :: Build Tools . SUCCESS [  8.723 s]
[INFO] Apache Beam :: Model ... SUCCESS [  0.172 s]
[INFO] Apache Beam :: Model :: Pipeline ... SUCCESS [ 20.940 s]
[INFO] Apache Beam :: Model :: Job Management . SUCCESS [ 12.831 s]
[INFO] Apache Beam :: Model :: Fn Execution ... SUCCESS [ 13.593 s]
[INFO] Apache Beam :: SDKs  SUCCESS [  0.239 s]
[INFO] Apache Beam :: SDKs :: Go .. FAILURE [ 53.873 s]
[INFO] Apache Beam :: SDKs :: Go :: Container . SKIPPED
[INFO] Apache Beam :: SDKs :: Java  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Core  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Fn Execution  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Extensions .. SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Google Cloud Platform Core 
SKIPPED
[INFO] 

Build failed in Jenkins: beam_PerformanceTests_HadoopInputFormat #55

2018-03-23 Thread Apache Jenkins Server
See 


Changes:

[coheigea] Put the String literal first when comparing it to an Object

[jb] [BEAM-3500] "Attach" JDBC connection to the bundle (improve the pooling)

[jb] [BEAM-3500] Test if the user provides both withDataSourceConfiguration()

[jb] [BEAM-3500] Wrap the datasource as a poolable datasource and expose

[jb] [BEAM-3500] Add commons-pool2 dependency

[jb] [BEAM-3500] Only expose max number of connections in the pool to the

[jb] [BEAM-3500] Cleanup pool configuration parameters

[jb] [BEAM-3500] Remove dataSourceFactory

[jb] [BEAM-3500] Remove unecessary check on dataSourceConfiguration

[mariand] Switched AvroIO default codec to snappyCodec().

[wcn] Updating generated files.

[aaltay] [BEAM-3738] Enable py3 lint and cleanup tox.ini. (#4877)

--
[...truncated 42.56 KB...]
[INFO] Excluding org.apache.commons:commons-compress:jar:1.16.1 from the shaded 
jar.
[INFO] Excluding io.netty:netty-codec:jar:4.1.5.Final from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http:jar:4.1.5.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-transport:jar:4.1.5.Final from the shaded jar.
[INFO] Excluding io.dropwizard.metrics:metrics-core:jar:3.1.0 from the shaded 
jar.
[INFO] Excluding org.objenesis:objenesis:jar:1.0 from the shaded jar.
[INFO] Excluding commons-io:commons-io:jar:2.4 from the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-runners-google-cloud-dataflow-java:jar:2.5.0-SNAPSHOT from 
the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:jar:2.5.0-SNAPSHOT
 from the shaded jar.
[INFO] Excluding com.google.cloud.bigdataoss:gcsio:jar:1.4.5 from the shaded 
jar.
[INFO] Excluding 
com.google.apis:google-api-services-cloudresourcemanager:jar:v1-rev6-1.22.0 
from the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.5.0-SNAPSHOT from 
the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.5.0-SNAPSHOT from the 
shaded jar.
[INFO] Excluding io.grpc:grpc-core:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.errorprone:error_prone_annotations:jar:2.0.15 from 
the shaded jar.
[INFO] Excluding io.grpc:grpc-context:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.instrumentation:instrumentation-api:jar:0.3.0 from 
the shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-bigquery:jar:v2-rev374-1.22.0 from the 
shaded jar.
[INFO] Excluding com.google.api:gax-grpc:jar:0.20.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.api:api-common:jar:1.0.0-rc2 from the shaded jar.
[INFO] Excluding com.google.api:gax:jar:1.3.1 from the shaded jar.
[INFO] Excluding org.threeten:threetenbp:jar:1.3.3 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-core-grpc:jar:1.2.0 from the 
shaded jar.
[INFO] Excluding com.google.protobuf:protobuf-java-util:jar:3.2.0 from the 
shaded jar.
[INFO] Excluding com.google.apis:google-api-services-pubsub:jar:v1-rev10-1.22.0 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-iam-v1:jar:0.1.18 from the 
shaded jar.
[INFO] Excluding com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-protobuf:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-jackson:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-common-protos:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding io.grpc:grpc-auth:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-netty:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http2:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler-proxy:jar:4.1.8.Final from the shaded 
jar.
[INFO] Excluding io.netty:netty-codec-socks:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.grpc:grpc-stub:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-all:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-okhttp:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.squareup.okhttp:okhttp:jar:2.5.0 from the shaded jar.
[INFO] Excluding com.squareup.okio:okio:jar:1.6.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-lite:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-nano:jar:1.2.0 from the shaded jar.
[INFO] Excluding 

[jira] [Commented] (BEAM-3881) Failure reading backlog in KinesisIO

2018-03-23 Thread Kevin Peterson (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412301#comment-16412301
 ] 

Kevin Peterson commented on BEAM-3881:
--

I added a log statement 
[here|https://github.com/apache/beam/blob/master/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/SimplifiedKinesisClient.java#L163],
 and ran the pipeline again.

Looks like the {{CountSince}} input value is nonsensical.
{noformat}
BEAM-3881 --- CountSince: -290308-12-21T19:59:05.225Z, CountTo: 
2018-03-24T00:07:12.818Z
{noformat}
Digging in a bit more, I think this was actually fixed in 
[https://github.com/apache/beam/commit/994c7f3f714d51270f71234a4f5dba752a70766a],
 which changed how the watermark was initialized:
{noformat}
-  private Instant lastWatermark = BoundedWindow.TIMESTAMP_MIN_VALUE;
+  private Instant lastWatermark = 
Instant.now().minus(MAX_KINESIS_STREAM_RETENTION_PERIOD);
{noformat}

That change should be in v2.4, so I'll update the pipeline and close the ticket 
if it works.
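The magnitude lines up with the ArithmeticException quoted below: the span from Beam's minimum timestamp to 2018 is on the order of 1.5e11 minutes, far past Integer.MAX_VALUE, which is exactly what Joda-Time's {{FieldUtils.safeToInt}} rejects. A back-of-the-envelope check (the epoch-millisecond values here are approximate reconstructions for illustration, not taken from the job):

```python
# Approximate epoch milliseconds for Beam's BoundedWindow.TIMESTAMP_MIN_VALUE
# (around the year -290308, matching the CountSince in the log) and for the
# logged CountTo of 2018-03-24T00:07:12.818Z. Both are reconstructed estimates.
TIMESTAMP_MIN_VALUE_MS = -9_223_372_036_854_775
COUNT_TO_MS = 1_521_850_032_818

# Joda's Minutes.minutesBetween computes this difference, then narrows it to int.
minutes_between = (COUNT_TO_MS - TIMESTAMP_MIN_VALUE_MS) // 60_000

print(minutes_between)              # on the order of 1.5e11
print(minutes_between > 2**31 - 1)  # True: safeToInt must throw
```

With the fix, the watermark starts at {{Instant.now().minus(MAX_KINESIS_STREAM_RETENTION_PERIOD)}}, so the difference is at most a few days' worth of minutes and fits an int comfortably.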

> Failure reading backlog in KinesisIO
> 
>
> Key: BEAM-3881
> URL: https://issues.apache.org/jira/browse/BEAM-3881
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kinesis
>Reporter: Kevin Peterson
>Assignee: Alexey Romanenko
>Priority: Major
>
> I'm getting an error when reading from Kinesis in my pipeline. Using Beam 
> v2.3, running on Google Cloud Dataflow.
> I'm constructing the source via:
> {code:java}
> KinesisIO.Read read = KinesisIO
> .read()
> .withAWSClientsProvider(
> configuration.getAwsAccessKeyId(),
> configuration.getAwsSecretAccessKey(),
> region)
> .withStreamName(configuration.getKinesisStream())
> .withUpToDateThreshold(Duration.standardMinutes(30))
> .withInitialTimestampInStream(configuration.getStartTime());
> {code}
> The exception is:
> {noformat}
> Mar 19, 2018 12:54:41 PM 
> org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
> SEVERE: 2018-03-19T19:54:53.010Z: (2896b8774de760ec): 
> java.lang.RuntimeException: Unknown kinesis failure, when trying to reach 
> kinesis
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:223)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:161)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:150)
> org.apache.beam.sdk.io.kinesis.KinesisReader.getTotalBacklogBytes(KinesisReader.java:200)
> com.google.cloud.dataflow.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:398)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1199)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:137)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:940)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ArithmeticException: Value cannot fit in an int: 
> 153748225435
> org.joda.time.field.FieldUtils.safeToInt(FieldUtils.java:206)
> org.joda.time.field.BaseDurationField.getDifference(BaseDurationField.java:141)
> org.joda.time.base.BaseSingleFieldPeriod.between(BaseSingleFieldPeriod.java:72)
> org.joda.time.Minutes.minutesBetween(Minutes.java:101)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.lambda$getBacklogBytes$3(SimplifiedKinesisClient.java:163)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:205)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:161)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:150)
> org.apache.beam.sdk.io.kinesis.KinesisReader.getTotalBacklogBytes(KinesisReader.java:200)
> com.google.cloud.dataflow.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:398)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1199)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:137)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:940)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745){noformat}



--
This message was 

Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Spark #4485

2018-03-23 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_Verify #4494

2018-03-23 Thread Apache Jenkins Server
See 


Changes:

[wcn] Updating generated files.

--
[...truncated 264.89 KB...]
-from hamcrest.core.core.allof import all_of
-from nose.plugins.attrib import attr
-
 from apache_beam.examples import wordcount
 from apache_beam.examples import wordcount_fnapi
 from apache_beam.testing.pipeline_verifiers import FileChecksumMatcher
 from apache_beam.testing.pipeline_verifiers import PipelineStateMatcher
 from apache_beam.testing.test_pipeline import TestPipeline
 from apache_beam.testing.test_utils import delete_files
+from hamcrest.core.core.allof import all_of
+from nose.plugins.attrib import attr
 
 
 class WordCountIT(unittest.TestCase):
ERROR: 

 Imports are incorrectly sorted.
--- 
:before
 2018-03-23 17:50:29.204663
+++ 
:after
  2018-03-23 23:59:26.858301
@@ -29,15 +29,14 @@
 import unittest
 import uuid
 
-from hamcrest.core.core.allof import all_of
-from nose.plugins.attrib import attr
-
 from apache_beam.examples import streaming_wordcount
 from apache_beam.io.gcp.tests.pubsub_matcher import PubSubMessageMatcher
 from apache_beam.runners.runner import PipelineState
 from apache_beam.testing import test_utils
 from apache_beam.testing.pipeline_verifiers import PipelineStateMatcher
 from apache_beam.testing.test_pipeline import TestPipeline
+from hamcrest.core.core.allof import all_of
+from nose.plugins.attrib import attr
 
 INPUT_TOPIC = 'wc_topic_input'
 OUTPUT_TOPIC = 'wc_topic_output'
ERROR: 

 Imports are incorrectly sorted.
--- 
:before
 2018-01-24 00:22:36.719312
+++ 
:after
  2018-03-23 23:59:27.394002
@@ -21,14 +21,13 @@
 import time
 import unittest
 
-from hamcrest.core.core.allof import all_of
-from nose.plugins.attrib import attr
-
 from apache_beam.examples.cookbook import bigquery_tornadoes
 from apache_beam.io.gcp.tests import utils
 from apache_beam.io.gcp.tests.bigquery_matcher import BigqueryMatcher
 from apache_beam.testing.pipeline_verifiers import PipelineStateMatcher
 from apache_beam.testing.test_pipeline import TestPipeline
+from hamcrest.core.core.allof import all_of
+from nose.plugins.attrib import attr
 
 
 class BigqueryTornadoesIT(unittest.TestCase):
ERROR: 

 Imports are incorrectly sorted.
--- 
:before
   2018-01-24 00:22:36.959311
+++ 
:after
2018-03-23 23:59:27.535269
@@ -20,12 +20,11 @@
 import logging
 import unittest
 
-from hamcrest.core.assert_that import assert_that as hc_assert_that
-from hamcrest.core.base_matcher import BaseMatcher
-
 from apache_beam.internal import pickler
 from apache_beam.options.pipeline_options import PipelineOptions
 from apache_beam.testing.test_pipeline import TestPipeline
+from hamcrest.core.assert_that import assert_that as hc_assert_that
+from hamcrest.core.base_matcher import BaseMatcher
 
 
 # A simple matcher that is ued for testing extra options appending.
ERROR: 

 Imports are incorrectly sorted.
--- 
:before
   2018-01-24 00:22:36.959311
+++ 
:after
2018-03-23 23:59:27.549958
@@ -25,12 +25,11 @@
 import logging
 import time
 
-from hamcrest.core.base_matcher import BaseMatcher
-
 from apache_beam.io.filesystems import FileSystems
 from apache_beam.runners.runner import PipelineState
 from apache_beam.testing import test_utils as utils
 from apache_beam.utils import retry
+from hamcrest.core.base_matcher import BaseMatcher
 
 __all__ = [
 
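Each of these failures is the same isort complaint: Beam's configuration appears to keep all third-party imports in one alphabetically sorted block, so the {{hamcrest}} and {{nose}} imports belong after the {{apache_beam}} ones rather than in a separate group. A minimal sketch of the ordering rule, using module names from the diffs above (an illustration, not Beam's actual isort config):

```python
# The "+++ :after" side of each diff puts third-party imports in a single
# alphabetically ordered block; isort passes exactly when that order holds.
third_party_imports = [
    "apache_beam.examples.wordcount",
    "apache_beam.testing.test_pipeline",
    "hamcrest.core.core.allof",
    "nose.plugins.attrib",
]

print(third_party_imports == sorted(third_party_imports))  # True
```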

[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=83882=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83882
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 23/Mar/18 23:59
Start Date: 23/Mar/18 23:59
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r176889665
 
 

 ##
 File path: release/src/main/groovy/mobilegaming-java-dataflow.groovy
 ##
 @@ -0,0 +1,106 @@
+#!groovy
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import MobileGamingJavaUtils
+
+t = new TestScripts(args)
+
+/*
+ * Run the mobile game examples on Dataflow.
+ * https://beam.apache.org/get-started/mobile-gaming-example/
+ */
+
+t.describe 'Run Apache Beam Java SDK Mobile Gaming Examples - Dataflow'
+
+QuickstartArchetype.generate(t)
+
+t.intent 'Running the Mobile-Gaming Code with DataflowRunner'
+
+def runner = "DataflowRunner"
+
+/**
+ *  Run the UserScore example on DataflowRunner
+ * */
+
+t.intent("Running: UserScore example on DataflowRunner")
 
 Review comment:
   How do multiple intents interact with each other?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83882)
Time Spent: 76h 40m  (was: 76.5h)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 76h 40m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=83884=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83884
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 23/Mar/18 23:59
Start Date: 23/Mar/18 23:59
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r176889373
 
 

 ##
 File path: release/src/main/groovy/MoblieGamingJavaUtils.groovy
 ##
 @@ -0,0 +1,148 @@
+#!groovy
+import java.text.SimpleDateFormat
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+class MobileGamingJavaUtils {
+
+  public static final RUNNERS = [DirectRunner: "direct-runner",
+DataflowRunner: "dataflow-runner",
+SparkRunner: "spark-runner",
+ApexRunner: "apex-runner",
+FlinkRunner: "flink-runner"]
+
+  public static final EXECUTION_TIMEOUT = 20
+
+  // Lists used to verify team names generated in the LeaderBoard example.
+  // This list should be kept sync with COLORS in 
org.apache.beam.examples.complete.game.injector.Injector.
+  public static final COLORS = new ArrayList<>(Arrays.asList(
+"Magenta",
+"AliceBlue",
+"Almond",
+"Amaranth",
+"Amber",
+"Amethyst",
+"AndroidGreen",
+"AntiqueBrass",
+"Fuchsia",
+"Ruby",
+"AppleGreen",
+"Apricot",
+"Aqua",
+"ArmyGreen",
+"Asparagus",
+"Auburn",
+"Azure",
+"Banana",
+"Beige",
+"Bisque",
+"BarnRed",
+"BattleshipGrey"))
+
+  private static final USERSCORE_OUTPUT_PREFIX = "java-userscore-result-"
 
 Review comment:
   This should be inlineable




Issue Time Tracking
---

Worklog Id: (was: 83884)
Time Spent: 76h 50m  (was: 76h 40m)



[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=83883=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83883
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 23/Mar/18 23:59
Start Date: 23/Mar/18 23:59
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r176889691
 
 

 ##
 File path: release/src/main/groovy/mobilegaming-java-dataflow.groovy
 ##
 @@ -0,0 +1,106 @@
+#!groovy
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import MobileGamingJavaUtils
+
+t = new TestScripts(args)
+
+/*
+ * Run the mobile game examples on Dataflow.
+ * https://beam.apache.org/get-started/mobile-gaming-example/
+ */
+
+t.describe 'Run Apache Beam Java SDK Mobile Gaming Examples - Dataflow'
+
+QuickstartArchetype.generate(t)
+
+t.intent 'Running the Mobile-Gaming Code with DataflowRunner'
 
 Review comment:
   Some of the `t.intent` calls are formatted inconsistently with the others.




Issue Time Tracking
---

Worklog Id: (was: 83883)
Time Spent: 76h 50m  (was: 76h 40m)



[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=83881=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83881
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 23/Mar/18 23:59
Start Date: 23/Mar/18 23:59
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r176889511
 
 

 ##
 File path: release/src/main/groovy/TestScripts.groovy
 ##
 @@ -95,16 +116,31 @@ class TestScripts {
  }
}
 
-   // Check for expected results in stdout of the last command
+   // Check for expected results in stdout of the last command, if fails, log 
errors then exit.
public void see(String expected) {
  if (!var.lastText.contains(expected)) {
var.startDir.deleteDir()
println "Cannot find ${expected} in ${var.lastText}"
-   _error("Cannot find expected text")
+   error("Cannot find expected text")
  }
  println "Verified $expected"
}
 
+   public boolean seeOneOf(List expected) {
 
 Review comment:
  `seeAnyOf`? This succeeds on one or more matches.
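The distinction under discussion, {{see}} requiring one exact string versus the new method passing when any candidate matches, can be sketched as follows (Python stand-ins; the real implementation is Groovy and these names and signatures are illustrative):

```python
def see(last_text, expected):
    """Mirror of TestScripts.see: fail unless the expected string appears."""
    if expected not in last_text:
        raise AssertionError(f"Cannot find {expected} in {last_text}")

def see_any_of(last_text, candidates):
    """Succeed when at least one candidate appears -- hence the suggested
    name seeAnyOf rather than seeOneOf."""
    return any(c in last_text for c in candidates)

output = "total_score: 231, user: user19_BananaWallaby"
print(see_any_of(output, ["total_score: 231", "total_score: 999"]))  # True
print(see_any_of(output, ["no-such-line"]))                          # False
```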




Issue Time Tracking
---

Worklog Id: (was: 83881)
Time Spent: 76.5h  (was: 76h 20m)



[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=83878=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83878
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 23/Mar/18 23:59
Start Date: 23/Mar/18 23:59
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r176887180
 
 

 ##
 File path: release/src/main/groovy/MoblieGamingJavaUtils.groovy
 ##
 @@ -0,0 +1,148 @@
+#!groovy
+import java.text.SimpleDateFormat
 
 Review comment:
  This ordering seems off, unless there's a traditional style I'm missing.




Issue Time Tracking
---

Worklog Id: (was: 83878)
Time Spent: 76h  (was: 75h 50m)



[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=83879=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83879
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 23/Mar/18 23:59
Start Date: 23/Mar/18 23:59
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r176889381
 
 

 ##
 File path: release/src/main/groovy/MoblieGamingJavaUtils.groovy
 ##
 @@ -0,0 +1,148 @@
+#!groovy
+import java.text.SimpleDateFormat
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+class MobileGamingJavaUtils {
+
+  public static final RUNNERS = [DirectRunner: "direct-runner",
+DataflowRunner: "dataflow-runner",
+SparkRunner: "spark-runner",
+ApexRunner: "apex-runner",
+FlinkRunner: "flink-runner"]
+
+  public static final EXECUTION_TIMEOUT = 20
+
+  // Lists used to verify team names generated in the LeaderBoard example.
+  // This list should be kept sync with COLORS in 
org.apache.beam.examples.complete.game.injector.Injector.
+  public static final COLORS = new ArrayList<>(Arrays.asList(
+"Magenta",
+"AliceBlue",
+"Almond",
+"Amaranth",
+"Amber",
+"Amethyst",
+"AndroidGreen",
+"AntiqueBrass",
+"Fuchsia",
+"Ruby",
+"AppleGreen",
+"Apricot",
+"Aqua",
+"ArmyGreen",
+"Asparagus",
+"Auburn",
+"Azure",
+"Banana",
+"Beige",
+"Bisque",
+"BarnRed",
+"BattleshipGrey"))
+
+  private static final USERSCORE_OUTPUT_PREFIX = "java-userscore-result-"
+
+  private static final HOURLYTEAMSCORE_OUTPUT_PREFIX = 
"java-hourlyteamscore-result-"
 
 Review comment:
   Same.




Issue Time Tracking
---

Worklog Id: (was: 83879)
Time Spent: 76h 10m  (was: 76h)



[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=83880=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83880
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 23/Mar/18 23:59
Start Date: 23/Mar/18 23:59
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r176889656
 
 

 ##
 File path: release/src/main/groovy/mobilegaming-java-dataflow.groovy
 ##
 @@ -0,0 +1,106 @@
+#!groovy
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import MobileGamingJavaUtils
+
+t = new TestScripts(args)
+
+/*
+ * Run the mobile game examples on Dataflow.
+ * https://beam.apache.org/get-started/mobile-gaming-example/
+ */
+
+t.describe 'Run Apache Beam Java SDK Mobile Gaming Examples - Dataflow'
+
+QuickstartArchetype.generate(t)
+
+t.intent 'Running the Mobile-Gaming Code with DataflowRunner'
+
+def runner = "DataflowRunner"
+
+/**
+ *  Run the UserScore example on DataflowRunner
+ * */
+
+t.intent("Running: UserScore example on DataflowRunner")
+t.run(MobileGamingJavaUtils.createExampleExecutionCommand("UserScore", runner, 
t))
+t.run "gsutil cat 
gs://${t.gcsBucket()}/${MobileGamingJavaUtils.getUserScoreOutputName(runner)}* 
| grep user19_BananaWallaby"
+t.see "total_score: 231, user: user19_BananaWallaby"
+t.intent("SUCCEED: UserScore successfully run on DataflowRunner.")
 
 Review comment:
   This doesn't seem like an intent to me




Issue Time Tracking
---

Worklog Id: (was: 83880)
Time Spent: 76h 20m  (was: 76h 10m)



[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=83885=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83885
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 23/Mar/18 23:59
Start Date: 23/Mar/18 23:59
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r176889976
 
 

 ##
 File path: release/src/main/groovy/mobilegaming-java-dataflow.groovy
 ##
 @@ -0,0 +1,106 @@
+#!groovy
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import MobileGamingJavaUtils
+
+t = new TestScripts(args)
+
+/*
+ * Run the mobile game examples on Dataflow.
+ * https://beam.apache.org/get-started/mobile-gaming-example/
+ */
+
+t.describe 'Run Apache Beam Java SDK Mobile Gaming Examples - Dataflow'
+
+QuickstartArchetype.generate(t)
+
+t.intent 'Running the Mobile-Gaming Code with DataflowRunner'
+
+def runner = "DataflowRunner"
+
+/**
+ *  Run the UserScore example on DataflowRunner
+ * */
+
+t.intent("Running: UserScore example on DataflowRunner")
+t.run(MobileGamingJavaUtils.createExampleExecutionCommand("UserScore", runner, 
t))
+t.run "gsutil cat 
gs://${t.gcsBucket()}/${MobileGamingJavaUtils.getUserScoreOutputName(runner)}* 
| grep user19_BananaWallaby"
+t.see "total_score: 231, user: user19_BananaWallaby"
+t.intent("SUCCEED: UserScore successfully run on DataflowRunner.")
+t.run "gsutil rm gs://${t.gcsBucket()}/${MobileGamingJavaUtils.getUserScoreOutputName(runner)}*"
+
+
+/**
+ * Run the HourlyTeamScore example on DataflowRunner
+ * */
+
+t.intent("Running: HourlyTeamScore example on DataflowRunner")
+t.run(MobileGamingJavaUtils.createExampleExecutionCommand("HourlyTeamScore", runner, t))
+t.run "gsutil cat gs://${t.gcsBucket()}/${MobileGamingJavaUtils.getHourlyTeamScoreOutputName(runner)}* | grep AzureBilby"
+t.see "total_score: 2788, team: AzureBilby"
+t.intent("SUCCEED: HourlyTeamScore successfully run on DataflowRunner.")
+t.run "gsutil rm gs://${t.gcsBucket()}/${MobileGamingJavaUtils.getHourlyTeamScoreOutputName(runner)}*"
+
+
+/**
+ * Run the LeaderBoard example on DataflowRunner
+ * */
+
+t.intent("Running: LeaderBoard example on DataflowRunner")
+t.run("bq rm -f -t ${t.bqDataset()}.leaderboard_DataflowRunner_user")
+t.run("bq rm -f -t ${t.bqDataset()}.leaderboard_DataflowRunner_team")
+// It takes a couple of seconds to clean up the tables. This loop makes sure the tables are completely deleted before running the pipeline.
+while({
+  sleep(3000)
+  t.run ("bq query SELECT table_id FROM ${t.bqDataset()}.__TABLES_SUMMARY__")
 
 Review comment:
  I'm not super happy about the mutation of the `TestScripts` object to check state - can't it just return the content of interest?
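The suggestion above - return the content of interest instead of mutating shared state on the `TestScripts` object - can be sketched as a runner that hands back a result object. Everything here is hypothetical: `CommandResult` and `QueryRunner` are illustrative names, and the `run` body is stubbed rather than spawning a real process.

```java
// Hypothetical sketch of the reviewer's suggestion: have run() return the
// command output directly instead of mutating a shared lastText field.
class CommandResult {
    final int exitCode;
    final String stdout;

    CommandResult(int exitCode, String stdout) {
        this.exitCode = exitCode;
        this.stdout = stdout;
    }

    // Callers inspect the returned content; no hidden state to re-check later.
    boolean contains(String expected) {
        return stdout.contains(expected);
    }
}

class QueryRunner {
    // Stand-in for t.run(...): a real script would spawn the process here;
    // stubbed with canned output for illustration.
    static CommandResult run(String command) {
        return new CommandResult(0, "table_id\nleaderboard_DataflowRunner_user");
    }
}
```

The design point is that each call site owns its result, so a polling loop can test the returned text directly rather than inspecting mutated state after the fact.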


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83885)
Time Spent: 77h  (was: 76h 50m)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 77h
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=83886&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83886
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 23/Mar/18 23:59
Start Date: 23/Mar/18 23:59
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r176889556
 
 

 ##
 File path: release/src/main/groovy/TestScripts.groovy
 ##
 @@ -95,16 +116,31 @@ class TestScripts {
  }
}
 
-   // Check for expected results in stdout of the last command
+   // Check for expected results in stdout of the last command; if the check fails, log errors then exit.
public void see(String expected) {
  if (!var.lastText.contains(expected)) {
var.startDir.deleteDir()
println "Cannot find ${expected} in ${var.lastText}"
-   _error("Cannot find expected text")
+   error("Cannot find expected text")
  }
  println "Verified $expected"
}
 
+   public boolean seeOneOf(List<String> expected) {
+ for (String expect: expected) {
+   if(var.lastText.contains(expect)) {
+ println "Verified $expect"
+ return true
+   }
+ }
+ println "Cannot find ${expected} in text"
+ return false
+   }
+
+   public boolean seeExact(String expected) {
 
 Review comment:
   Do we use this anywhere?
   
   Even if we do, it looks more like "seeSubstring"
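The reviewer's point is a naming one: a check built on `String.contains` is substring matching, while a genuinely "exact" check compares the whole string. A minimal illustration with hypothetical names (this is not the PR's code):

```java
// Naming illustration: contains() is substring matching ("seeSubstring"),
// while an exact check compares the full (trimmed) string.
class Matchers {
    static boolean seeSubstring(String lastText, String expected) {
        return lastText.contains(expected);
    }

    static boolean seeExact(String lastText, String expected) {
        return lastText.trim().equals(expected);
    }
}
```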




Issue Time Tracking
---

Worklog Id: (was: 83886)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 77h
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=83887&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83887
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 23/Mar/18 23:59
Start Date: 23/Mar/18 23:59
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r176889355
 
 

 ##
 File path: release/src/main/groovy/MoblieGamingJavaUtils.groovy
 ##
 @@ -0,0 +1,148 @@
+#!groovy
+import java.text.SimpleDateFormat
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+class MobileGamingJavaUtils {
+
+  public static final RUNNERS = [DirectRunner: "direct-runner",
+DataflowRunner: "dataflow-runner",
+SparkRunner: "spark-runner",
+ApexRunner: "apex-runner",
+FlinkRunner: "flink-runner"]
+
+  public static final EXECUTION_TIMEOUT = 20
 
 Review comment:
   Unitless values aren't great
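One common remedy for a unitless constant like `EXECUTION_TIMEOUT = 20` is to make the unit explicit in the type or the name. A sketch in Java using `java.time`; the choice of minutes is an assumption, since the snippet does not state the original unit:

```java
import java.time.Duration;

// The unit lives in the type, so callers cannot misread 20 as seconds.
class Timeouts {
    static final Duration EXECUTION_TIMEOUT = Duration.ofMinutes(20);

    // If a bare number is unavoidable, bake the unit into the name instead.
    static final long EXECUTION_TIMEOUT_MINUTES = EXECUTION_TIMEOUT.toMinutes();
}
```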




Issue Time Tracking
---

Worklog Id: (was: 83887)
Time Spent: 77h 10m  (was: 77h)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 77h 10m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.





[beam] 01/01: Merge pull request #4940: Update generated files

2018-03-23 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 8b72e5af513e5182c403f3b67ff6d16395c281db
Merge: 82c32de 0a2f430
Author: Thomas Groh 
AuthorDate: Fri Mar 23 16:18:54 2018 -0700

Merge pull request #4940: Update generated files

 .../go/pkg/beam/core/runtime/exec/optimized/gen.go |   1 +
 sdks/go/pkg/beam/core/runtime/harness/gen.go   |   2 +-
 sdks/go/pkg/beam/core/util/reflectx/call.go|   1 +
 .../beam/model/fnexecution_v1/beam_fn_api.pb.go| 665 ++---
 .../model/fnexecution_v1/beam_provision_api.pb.go  |   6 +-
 sdks/go/pkg/beam/model/gen.go  |   6 +-
 .../beam/model/pipeline_v1/beam_runner_api.pb.go   | 610 ++-
 sdks/go/pkg/beam/transforms/stats/max.go   |   1 +
 sdks/go/pkg/beam/transforms/stats/min.go   |   1 +
 sdks/go/pkg/beam/transforms/stats/sum.go   |   1 +
 10 files changed, 680 insertions(+), 614 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
tg...@apache.org.


[beam] branch master updated (82c32de -> 8b72e5a)

2018-03-23 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


 from 82c32de  Merge pull request #4904: [BEAM-3874] Switched AvroIO default codec to snappyCodec.
 add 0a2f430  Updating generated files.
 new 8b72e5a  Merge pull request #4940: Update generated files

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../go/pkg/beam/core/runtime/exec/optimized/gen.go |   1 +
 sdks/go/pkg/beam/core/runtime/harness/gen.go   |   2 +-
 sdks/go/pkg/beam/core/util/reflectx/call.go|   1 +
 .../beam/model/fnexecution_v1/beam_fn_api.pb.go| 665 ++---
 .../model/fnexecution_v1/beam_provision_api.pb.go  |   6 +-
 sdks/go/pkg/beam/model/gen.go  |   6 +-
 .../beam/model/pipeline_v1/beam_runner_api.pb.go   | 610 ++-
 sdks/go/pkg/beam/transforms/stats/max.go   |   1 +
 sdks/go/pkg/beam/transforms/stats/min.go   |   1 +
 sdks/go/pkg/beam/transforms/stats/sum.go   |   1 +
 10 files changed, 680 insertions(+), 614 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
tg...@apache.org.


[jira] [Commented] (BEAM-3881) Failure reading backlog in KinesisIO

2018-03-23 Thread Kevin Peterson (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412213#comment-16412213
 ] 

Kevin Peterson commented on BEAM-3881:
--

I think I've narrowed down the conditions under which this occurs: it happens when reading a backlog of messages. Once the pipeline has fully caught up (or is started with a trim horizon of LATEST), I don't see the error anymore.

It happens repeatedly, but not continuously, when reading older messages. It also occurs on pipelines with both large throughput (100k elements/s, 100 workers) and small throughput (100 elements/s, 1 worker). I'll see a few exceptions of this nature over a 2-3 min period, then no exceptions for 15-30 min, and then the pattern repeats.

> Failure reading backlog in KinesisIO
> 
>
> Key: BEAM-3881
> URL: https://issues.apache.org/jira/browse/BEAM-3881
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kinesis
>Reporter: Kevin Peterson
>Assignee: Alexey Romanenko
>Priority: Major
>
> I'm getting an error when reading from Kinesis in my pipeline. Using Beam 
> v2.3, running on Google Cloud Dataflow.
> I'm constructing the source via:
> {code:java}
> KinesisIO.Read read = KinesisIO
> .read()
> .withAWSClientsProvider(
> configuration.getAwsAccessKeyId(),
> configuration.getAwsSecretAccessKey(),
> region)
> .withStreamName(configuration.getKinesisStream())
> .withUpToDateThreshold(Duration.standardMinutes(30))
> .withInitialTimestampInStream(configuration.getStartTime());
> {code}
> The exception is:
> {noformat}
> Mar 19, 2018 12:54:41 PM 
> org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
> SEVERE: 2018-03-19T19:54:53.010Z: (2896b8774de760ec): 
> java.lang.RuntimeException: Unknown kinesis failure, when trying to reach 
> kinesis
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:223)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:161)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:150)
> org.apache.beam.sdk.io.kinesis.KinesisReader.getTotalBacklogBytes(KinesisReader.java:200)
> com.google.cloud.dataflow.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:398)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1199)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:137)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:940)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ArithmeticException: Value cannot fit in an int: 
> 153748225435
> org.joda.time.field.FieldUtils.safeToInt(FieldUtils.java:206)
> org.joda.time.field.BaseDurationField.getDifference(BaseDurationField.java:141)
> org.joda.time.base.BaseSingleFieldPeriod.between(BaseSingleFieldPeriod.java:72)
> org.joda.time.Minutes.minutesBetween(Minutes.java:101)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.lambda$getBacklogBytes$3(SimplifiedKinesisClient.java:163)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:205)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:161)
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:150)
> org.apache.beam.sdk.io.kinesis.KinesisReader.getTotalBacklogBytes(KinesisReader.java:200)
> com.google.cloud.dataflow.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:398)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1199)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:137)
> com.google.cloud.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:940)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745){noformat}





[jira] [Work logged] (BEAM-3355) Make Go SDK runtime harness hooks pluggable

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3355?focusedWorklogId=83855=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83855
 ]

ASF GitHub Bot logged work on BEAM-3355:


Author: ASF GitHub Bot
Created on: 23/Mar/18 22:42
Start Date: 23/Mar/18 22:42
Worklog Time Spent: 10m 
  Work Description: wcn3 commented on issue #4311: [BEAM-3355] Diagnostic 
interfaces
URL: https://github.com/apache/beam/pull/4311#issuecomment-375817227
 
 
   PTAL with regards to the user API surface.
   
   I haven't actually tested these changes; there are still some internal inconsistencies and likely bugs. I just wanted to make sure we're on the same page about the high-level API per our previous discussions.




Issue Time Tracking
---

Worklog Id: (was: 83855)
Time Spent: 4h 20m  (was: 4h 10m)

> Make Go SDK runtime harness hooks pluggable
> ---
>
> Key: BEAM-3355
> URL: https://issues.apache.org/jira/browse/BEAM-3355
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Bill Neubauer
>Priority: Minor
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> We currently hardcode cpu profiling and session recording in the harness. We 
> should make it pluggable instead.





[jira] [Work logged] (BEAM-3776) StateMerging.mergeWatermarks sets a late watermark hold for late merging windows that depend only on the window

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3776?focusedWorklogId=83846&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83846
 ]

ASF GitHub Bot logged work on BEAM-3776:


Author: ASF GitHub Bot
Created on: 23/Mar/18 22:27
Start Date: 23/Mar/18 22:27
Worklog Time Spent: 10m 
  Work Description: scwhittle commented on a change in pull request #4793: 
[BEAM-3776] Fix issue with merging late windows where a watermark hold could be 
added behind the input watermark.
URL: https://github.com/apache/beam/pull/4793#discussion_r176878673
 
 

 ##
 File path: 
runners/core-java/src/test/java/org/apache/beam/runners/core/ReduceFnTester.java
 ##
 @@ -533,28 +534,33 @@ public void advanceSynchronizedProcessingTime(
*/
   @SafeVarargs
  public final void injectElements(TimestampedValue... values) throws Exception {
+injectElements(Arrays.asList(values));
+  }
+
+  public final void injectElements(Iterable values) throws Exception {
 for (TimestampedValue value : values) {
   WindowTracing.trace("TriggerTester.injectElements: {}", value);
 }
 
 Iterable inputs =
-Arrays.asList(values)
-.stream()
-.map(
-input -> {
-  try {
-InputT value = input.getValue();
-Instant timestamp = input.getTimestamp();
-Collection windows =
-windowFn.assignWindows(
-new TestAssignContext(
-windowFn, value, timestamp, GlobalWindow.INSTANCE));
-return WindowedValue.of(value, timestamp, windows, PaneInfo.NO_FIRING);
-  } catch (Exception e) {
-throw new RuntimeException(e);
-  }
-})
-.collect(Collectors.toList());
+Iterables.transform(
 
 Review comment:
   Reworked to keep streams without needing StreamSupport
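For readers following the thread: turning an arbitrary `Iterable` into a `Stream` normally requires `StreamSupport`, which is presumably what the comment refers to avoiding. A minimal sketch of both options, with illustrative names rather than the PR's code:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

class IterableStreams {
    // A generic Iterable has no stream() method; StreamSupport bridges it.
    static List<String> viaStreamSupport(Iterable<Integer> values) {
        return StreamSupport.stream(values.spliterator(), false)
            .map(v -> "ts=" + v)
            .collect(Collectors.toList());
    }

    // Accepting a Collection/List keeps plain stream() available,
    // avoiding StreamSupport entirely.
    static List<String> viaCollection(List<Integer> values) {
        return values.stream()
            .map(v -> "ts=" + v)
            .collect(Collectors.toList());
    }
}
```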




Issue Time Tracking
---

Worklog Id: (was: 83846)
Time Spent: 4h  (was: 3h 50m)

> StateMerging.mergeWatermarks sets a late watermark hold for late merging 
> windows that depend only on the window
> ---
>
> Key: BEAM-3776
> URL: https://issues.apache.org/jira/browse/BEAM-3776
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Affects Versions: 2.1.0, 2.2.0, 2.3.0
>Reporter: Sam Whittle
>Assignee: Sam Whittle
>Priority: Critical
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> WatermarkHold.addElementHold and WatermarkHold.addGarbageCollectionHold take
> care to not add holds that would be before the input watermark.
> However WatermarkHold.onMerge calls StateMerging.mergeWatermarks which, if the
> output time depends only on the window, sets a hold for the end of the window
> regardless of the input watermark.
> Thus if you have a WindowingStrategy such as:
> WindowingStrategy.of(Sessions.withGapDuration(gapDuration))
>  .withMode(AccumulationMode.DISCARDING_FIRED_PANES)
>  .withTrigger(
>  Repeatedly.forever(
>  AfterWatermark.pastEndOfWindow()
>  .withLateFirings(AfterPane.elementCountAtLeast(10))))
>  .withAllowedLateness(allowedLateness))
> and you merge windows that are late, you might end up holding the watermark 
> until the allowedLateness has passed.





[jira] [Work logged] (BEAM-3776) StateMerging.mergeWatermarks sets a late watermark hold for late merging windows that depend only on the window

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3776?focusedWorklogId=83847&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83847
 ]

ASF GitHub Bot logged work on BEAM-3776:


Author: ASF GitHub Bot
Created on: 23/Mar/18 22:27
Start Date: 23/Mar/18 22:27
Worklog Time Spent: 10m 
  Work Description: scwhittle commented on issue #4793: [BEAM-3776] Fix 
issue with merging late windows where a watermark hold could be added behind 
the input watermark.
URL: https://github.com/apache/beam/pull/4793#issuecomment-375814669
 
 
   @tgroh Can you PTAL? Thanks!




Issue Time Tracking
---

Worklog Id: (was: 83847)
Time Spent: 4h 10m  (was: 4h)

> StateMerging.mergeWatermarks sets a late watermark hold for late merging 
> windows that depend only on the window
> ---
>
> Key: BEAM-3776
> URL: https://issues.apache.org/jira/browse/BEAM-3776
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Affects Versions: 2.1.0, 2.2.0, 2.3.0
>Reporter: Sam Whittle
>Assignee: Sam Whittle
>Priority: Critical
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> WatermarkHold.addElementHold and WatermarkHold.addGarbageCollectionHold take
> care to not add holds that would be before the input watermark.
> However WatermarkHold.onMerge calls StateMerging.mergeWatermarks which, if the
> output time depends only on the window, sets a hold for the end of the window
> regardless of the input watermark.
> Thus if you have a WindowingStrategy such as:
> WindowingStrategy.of(Sessions.withGapDuration(gapDuration))
>  .withMode(AccumulationMode.DISCARDING_FIRED_PANES)
>  .withTrigger(
>  Repeatedly.forever(
>  AfterWatermark.pastEndOfWindow()
>  .withLateFirings(AfterPane.elementCountAtLeast(10))))
>  .withAllowedLateness(allowedLateness))
> and you merge windows that are late, you might end up holding the watermark 
> until the allowedLateness has passed.





[jira] [Work logged] (BEAM-3776) StateMerging.mergeWatermarks sets a late watermark hold for late merging windows that depend only on the window

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3776?focusedWorklogId=83844&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83844
 ]

ASF GitHub Bot logged work on BEAM-3776:


Author: ASF GitHub Bot
Created on: 23/Mar/18 22:25
Start Date: 23/Mar/18 22:25
Worklog Time Spent: 10m 
  Work Description: scwhittle commented on a change in pull request #4793: 
[BEAM-3776] Fix issue with merging late windows where a watermark hold could be 
added behind the input watermark.
URL: https://github.com/apache/beam/pull/4793#discussion_r176878350
 
 

 ##
 File path: 
runners/core-java/src/test/java/org/apache/beam/runners/core/ReduceFnRunnerTest.java
 ##
 @@ -873,6 +911,288 @@ public void testWatermarkHoldAndLateData() throws Exception {
 tester.assertHasOnlyGlobalAndFinishedSetsFor();
   }
 
+  // Performs the specified actions and verifies that the watermark hold is set correctly.
+  public void mergingWatermarkHoldTestHelper(List configuration) throws Exception {
+LOG.info("Running config {}",  configuration);
+MetricsContainerImpl container = new MetricsContainerImpl("any");
+MetricsEnvironment.setCurrentContainer(container);
+// Test handling of late data. Specifically, ensure the watermark hold is correct.
+Duration allowedLateness = Duration.standardMinutes(1);
+Duration gapDuration = Duration.millis(10);
+LOG.info("Gap duration {}", gapDuration);
+ReduceFnTester tester =
+ReduceFnTester.nonCombining(
+WindowingStrategy.of(Sessions.withGapDuration(gapDuration))
+.withMode(AccumulationMode.DISCARDING_FIRED_PANES)
+.withTrigger(
+Repeatedly.forever(
+AfterWatermark.pastEndOfWindow()
+.withLateFirings(AfterPane.elementCountAtLeast(1))))
+.withAllowedLateness(allowedLateness));
+tester.setAutoAdvanceOutputWatermark(false);
+
+// Input watermark -> null
+assertEquals(null, tester.getWatermarkHold());
+assertEquals(null, tester.getOutputWatermark());
+
+int maxTs = 0;
+long watermark = 0;
+for (Action action : configuration) {
+  if (action.times != null) {
+LOG.info("Injecting {}", action.times);
+injectElements(tester, action.times);
+int maxLocalTs = Ordering.natural().max(action.times);
+if (maxLocalTs > maxTs) {
+  maxTs = maxLocalTs;
+}
+  }
+  if (action.inputWatermark > watermark) {
+watermark = action.inputWatermark;
+LOG.info("Advancing watermark to {}", new Instant(watermark));
+tester.advanceInputWatermark(new Instant(watermark));
+  }
+  Instant hold = tester.getWatermarkHold();
+  if (hold != null) {
+assertThat(hold, greaterThanOrEqualTo(new Instant(watermark)));
+assertThat(watermark, lessThan(maxTs + gapDuration.getMillis()));
+  }
+}
+if (gapDuration.getMillis() + maxTs > watermark) {
+  watermark = gapDuration.getMillis() + maxTs;
+  tester.advanceInputWatermark(new Instant(watermark));
+}
+LOG.info("Output {}", tester.extractOutput());
+assertThat(tester.getWatermarkHold(), nullValue());
+tester.advanceInputWatermark(new Instant(watermark).plus(allowedLateness));
+assertThat(tester.getWatermarkHold(), nullValue());
+
+// Nothing dropped.
+long droppedElements =
+container
+.getCounter(
+MetricName.named(ReduceFnRunner.class,
+ReduceFnRunner.DROPPED_DUE_TO_CLOSED_WINDOW))
+.getCumulative()
+.longValue();
+assertEquals(0, droppedElements);
+  }
+
+  @Test
+  public void testMergingWatermarkHoldLateNewWindow() throws Exception {
+LinkedList actions = new LinkedList<>();
+actions.add(Action.inputWatermark(40));
+actions.add(Action.times(1));
+mergingWatermarkHoldTestHelper(actions);
+  }
+
+  @Test
+  public void testMergingWatermarkHoldLateNewWindowExtended() throws Exception {
+LinkedList actions = new LinkedList<>();
+actions.add(Action.inputWatermark(40));
+actions.add(Action.times(1));
+actions.add(Action.times(10));
+mergingWatermarkHoldTestHelper(actions);
+  }
+
+  @Test
+  public void testMergingWatermarkHoldLateNewWindowMerged() throws Exception {
+LinkedList actions = new LinkedList<>();
+actions.add(Action.inputWatermark(40));
+actions.add(Action.times(1));
+actions.add(Action.times(14));
+actions.add(Action.times(6));
+mergingWatermarkHoldTestHelper(actions);
+  }
+
+  @Test
+  public void testMergingWatermarkHoldLateNewWindowExtendedPastInputWatermark() throws Exception {
+LinkedList actions = new LinkedList<>();
+actions.add(Action.inputWatermark(40));
+actions.add(Action.times(25));
+

[jira] [Work logged] (BEAM-3776) StateMerging.mergeWatermarks sets a late watermark hold for late merging windows that depend only on the window

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3776?focusedWorklogId=83842&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83842
 ]

ASF GitHub Bot logged work on BEAM-3776:


Author: ASF GitHub Bot
Created on: 23/Mar/18 22:24
Start Date: 23/Mar/18 22:24
Worklog Time Spent: 10m 
  Work Description: scwhittle commented on a change in pull request #4793: 
[BEAM-3776] Fix issue with merging late windows where a watermark hold could be 
added behind the input watermark.
URL: https://github.com/apache/beam/pull/4793#discussion_r176878187
 
 

 ##
 File path: 
runners/core-java/src/test/java/org/apache/beam/runners/core/ReduceFnRunnerTest.java
 ##
 @@ -873,6 +911,288 @@ public void testWatermarkHoldAndLateData() throws Exception {
 tester.assertHasOnlyGlobalAndFinishedSetsFor();
   }
 
+  // Performs the specified actions and verifies that the watermark hold is set correctly.
+  public void mergingWatermarkHoldTestHelper(List configuration) throws Exception {
+LOG.info("Running config {}",  configuration);
+MetricsContainerImpl container = new MetricsContainerImpl("any");
+MetricsEnvironment.setCurrentContainer(container);
+// Test handling of late data. Specifically, ensure the watermark hold is correct.
+Duration allowedLateness = Duration.standardMinutes(1);
+Duration gapDuration = Duration.millis(10);
+LOG.info("Gap duration {}", gapDuration);
+ReduceFnTester tester =
+ReduceFnTester.nonCombining(
+WindowingStrategy.of(Sessions.withGapDuration(gapDuration))
+.withMode(AccumulationMode.DISCARDING_FIRED_PANES)
+.withTrigger(
+Repeatedly.forever(
+AfterWatermark.pastEndOfWindow()
+.withLateFirings(AfterPane.elementCountAtLeast(1))))
+.withAllowedLateness(allowedLateness));
+tester.setAutoAdvanceOutputWatermark(false);
+
+// Input watermark -> null
+assertEquals(null, tester.getWatermarkHold());
+assertEquals(null, tester.getOutputWatermark());
+
+int maxTs = 0;
+long watermark = 0;
+for (Action action : configuration) {
+  if (action.times != null) {
+LOG.info("Injecting {}", action.times);
+injectElements(tester, action.times);
+int maxLocalTs = Ordering.natural().max(action.times);
+if (maxLocalTs > maxTs) {
+  maxTs = maxLocalTs;
+}
+  }
+  if (action.inputWatermark > watermark) {
+watermark = action.inputWatermark;
+LOG.info("Advancing watermark to {}", new Instant(watermark));
+tester.advanceInputWatermark(new Instant(watermark));
+  }
+  Instant hold = tester.getWatermarkHold();
+  if (hold != null) {
+assertThat(hold, greaterThanOrEqualTo(new Instant(watermark)));
+assertThat(watermark, lessThan(maxTs + gapDuration.getMillis()));
+  }
+}
+if (gapDuration.getMillis() + maxTs > watermark) {
+  watermark = gapDuration.getMillis() + maxTs;
+  tester.advanceInputWatermark(new Instant(watermark));
+}
+LOG.info("Output {}", tester.extractOutput());
+assertThat(tester.getWatermarkHold(), nullValue());
+tester.advanceInputWatermark(new Instant(watermark).plus(allowedLateness));
+assertThat(tester.getWatermarkHold(), nullValue());
+
+// Nothing dropped.
+long droppedElements =
+container
+.getCounter(
+MetricName.named(ReduceFnRunner.class,
+ReduceFnRunner.DROPPED_DUE_TO_CLOSED_WINDOW))
+.getCumulative()
+.longValue();
+assertEquals(0, droppedElements);
+  }
+
+  @Test
+  public void testMergingWatermarkHoldLateNewWindow() throws Exception {
+LinkedList actions = new LinkedList<>();
+actions.add(Action.inputWatermark(40));
+actions.add(Action.times(1));
+mergingWatermarkHoldTestHelper(actions);
 
 Review comment:
   Done




Issue Time Tracking
---

Worklog Id: (was: 83842)
Time Spent: 3.5h  (was: 3h 20m)

> StateMerging.mergeWatermarks sets a late watermark hold for late merging 
> windows that depend only on the window
> ---
>
> Key: BEAM-3776
> URL: https://issues.apache.org/jira/browse/BEAM-3776
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Affects Versions: 

[jira] [Updated] (BEAM-3923) Define and add updates needed to run a simple Java transform using Python SDK on top of the reference runner (ULR)

2018-03-23 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath updated BEAM-3923:
-
Component/s: sdk-py-core

> Define and add updates needed to run a simple Java transform using Python SDK 
> on top of the reference runner (ULR)
> --
>
> Key: BEAM-3923
> URL: https://issues.apache.org/jira/browse/BEAM-3923
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core, sdk-java-harness, sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
>
> The task here is to do research and perform the updates necessary to run a 
> simple Java transform that does not require expansion through the Beam 
> portability offering. Initially we'll try to get this working for the ULR and 
> will try to expand into other runners later.
> This will also include coming up with a reasonable mechanism to define a 
> shim in SDK A for a transform that is fully defined in SDK B.
> This should be combined with efforts to introduce mechanisms to expand cross 
> platform transforms (see [1]) to achieve a full cross language transform 
> offering which will, for example, allow IOs defined in SDK B to be used in 
> SDK A.
> [1] [https://s.apache.org/beam-mixed-language-pipelines]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3850) I/O transform for HDF5 files with extension H5

2018-03-23 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-3850:


Assignee: (was: Chamikara Jayalath)

> I/O transform for HDF5 files with extension H5
> --
>
> Key: BEAM-3850
> URL: https://issues.apache.org/jira/browse/BEAM-3850
> Project: Beam
>  Issue Type: New Feature
>  Components: io-ideas, sdk-py-core
> Environment: Python (could be Java if Python does not work)
>Reporter: Eila Arich-Landkof
>Priority: Major
>
> Following the great summit today, I would like to ask for an I/O transform for 
> H5 files in Python. If that is impossible, Java will work as well.
> The HDF5 group is very accessible: [https://support.hdfgroup.org/HDF5/]
> An example H5 file can be found at the Mount Sinai Maayan lab: 
> [https://amp.pharm.mssm.edu/archs4/download.html]
> I am using it as part of Oriel Research genomic and clinical processing 
> workflow.
> I am available for any questions.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3850) I/O transform for HDF5 files with extension H5

2018-03-23 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath reassigned BEAM-3850:


Assignee: Chamikara Jayalath  (was: Eugene Kirpichov)

> I/O transform for HDF5 files with extension H5
> --
>
> Key: BEAM-3850
> URL: https://issues.apache.org/jira/browse/BEAM-3850
> Project: Beam
>  Issue Type: New Feature
>  Components: io-ideas, sdk-py-core
> Environment: Python (could be Java if Python does not work)
>Reporter: Eila Arich-Landkof
>Assignee: Chamikara Jayalath
>Priority: Major
>
> Following the great summit today, I would like to ask for an I/O transform for 
> H5 files in Python. If that is impossible, Java will work as well.
> The HDF5 group is very accessible: [https://support.hdfgroup.org/HDF5/]
> An example H5 file can be found at the Mount Sinai Maayan lab: 
> [https://amp.pharm.mssm.edu/archs4/download.html]
> I am using it as part of Oriel Research genomic and clinical processing 
> workflow.
> I am available for any questions.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-3923) Define and add updates needed to run a simple Java transform using Python SDK on top of the reference runner (ULR)

2018-03-23 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-3923:


 Summary: Define and add updates needed to run a simple Java 
transform using Python SDK on top of the reference runner (ULR)
 Key: BEAM-3923
 URL: https://issues.apache.org/jira/browse/BEAM-3923
 Project: Beam
  Issue Type: New Feature
  Components: sdk-java-core, sdk-java-harness
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath


The task here is to do research and perform the updates necessary to run a simple 
Java transform that does not require expansion through the Beam portability 
offering. Initially we'll try to get this working for the ULR and will try to 
expand into other runners later.

This will also include coming up with a reasonable mechanism to define a shim 
in SDK A for a transform that is fully defined in SDK B.

This should be combined with efforts to introduce mechanisms to expand cross 
platform transforms (see [1]) to achieve a full cross language transform 
offering which will, for example, allow IOs defined in SDK B to be used in SDK 
A.

[1] [https://s.apache.org/beam-mixed-language-pipelines]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1173

2018-03-23 Thread Apache Jenkins Server
See 


--
[...truncated 871.83 KB...]
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": ""
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": "kind:pair", 
  "component_encodings": [
{
  "@type": "kind:bytes"
}, 
{
  "@type": 
"VarIntCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxhiUWeeSXOIA5XIYNmYyFjbSFTkh4A89cR+g==",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "compute/MapToVoidKey0.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s2"
}, 
"serialized_fn": "", 
"user_name": "compute/MapToVoidKey0"
  }
}
  ], 
  "type": "JOB_TYPE_BATCH"
}
root: INFO: Create job: 
root: INFO: Created job with id: [2018-03-23_14_40_33-101295976211962563]
root: INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-03-23_14_40_33-101295976211962563?project=apache-beam-testing
root: INFO: Job 2018-03-23_14_40_33-101295976211962563 is in state 
JOB_STATE_PENDING
root: INFO: 2018-03-23T21:40:33.513Z: JOB_MESSAGE_WARNING: Job 
2018-03-23_14_40_33-101295976211962563 might autoscale up to 250 workers.
root: INFO: 2018-03-23T21:40:33.535Z: JOB_MESSAGE_DETAILED: Autoscaling is 
enabled for job 2018-03-23_14_40_33-101295976211962563. The number of workers 
will be between 1 and 250.
root: INFO: 2018-03-23T21:40:33.562Z: JOB_MESSAGE_DETAILED: Autoscaling was 
automatically enabled for job 2018-03-23_14_40_33-101295976211962563.
root: INFO: 2018-03-23T21:40:36.393Z: JOB_MESSAGE_DETAILED: Checking required 
Cloud APIs are enabled.
root: INFO: 2018-03-23T21:40:36.553Z: JOB_MESSAGE_DETAILED: Checking 
permissions granted to controller Service Account.
root: INFO: 2018-03-23T21:40:37.376Z: JOB_MESSAGE_DETAILED: Expanding 
CoGroupByKey operations into optimizable parts.
root: INFO: 2018-03-23T21:40:37.420Z: JOB_MESSAGE_DEBUG: Combiner lifting 
skipped for step assert_that/Group/GroupByKey: GroupByKey not followed by a 
combiner.
root: INFO: 2018-03-23T21:40:37.452Z: JOB_MESSAGE_DETAILED: Expanding 
GroupByKey operations into optimizable parts.
root: INFO: 2018-03-23T21:40:37.478Z: JOB_MESSAGE_DETAILED: Lifting 
ValueCombiningMappingFns into MergeBucketsMappingFns
root: INFO: 2018-03-23T21:40:37.533Z: JOB_MESSAGE_DEBUG: Annotating graph with 
Autotuner information.
root: INFO: 2018-03-23T21:40:37.579Z: JOB_MESSAGE_DETAILED: Fusing adjacent 
ParDo, Read, Write, and Flatten operations
root: INFO: 2018-03-23T21:40:37.614Z: JOB_MESSAGE_DETAILED: Unzipping flatten 
s11 for input s10.out
root: INFO: 2018-03-23T21:40:37.649Z: JOB_MESSAGE_DETAILED: Fusing unzipped 
copy of assert_that/Group/GroupByKey/Reify, through flatten 
assert_that/Group/Flatten, into producer assert_that/Group/pair_with_1
root: INFO: 2018-03-23T21:40:37.680Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Group/GroupByKey/GroupByWindow into 
assert_that/Group/GroupByKey/Read
root: INFO: 2018-03-23T21:40:37.715Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Unkey into assert_that/Group/Map(_merge_tagged_vals_under_key)
root: INFO: 2018-03-23T21:40:37.747Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Match into assert_that/Unkey
root: INFO: 2018-03-23T21:40:37.776Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Group/Map(_merge_tagged_vals_under_key) into 
assert_that/Group/GroupByKey/GroupByWindow
root: INFO: 2018-03-23T21:40:37.808Z: JOB_MESSAGE_DETAILED: Unzipping flatten 
s11-u13 for input s12-reify-value0-c11
root: INFO: 2018-03-23T21:40:37.845Z: JOB_MESSAGE_DETAILED: Fusing unzipped 
copy of assert_that/Group/GroupByKey/Write, through flatten s11-u13, into 
producer assert_that/Group/GroupByKey/Reify
root: INFO: 

Build failed in Jenkins: beam_PostCommit_Python_Verify #4493

2018-03-23 Thread Apache Jenkins Server
See 


--
[...truncated 290.24 KB...]
-from hamcrest.core.core.allof import all_of
-from nose.plugins.attrib import attr
-
 from apache_beam.examples import wordcount
 from apache_beam.examples import wordcount_fnapi
 from apache_beam.testing.pipeline_verifiers import FileChecksumMatcher
 from apache_beam.testing.pipeline_verifiers import PipelineStateMatcher
 from apache_beam.testing.test_pipeline import TestPipeline
 from apache_beam.testing.test_utils import delete_files
+from hamcrest.core.core.allof import all_of
+from nose.plugins.attrib import attr
 
 
 class WordCountIT(unittest.TestCase):
ERROR: 

 Imports are incorrectly sorted.
--- 
:before
 2018-03-23 17:50:29.204663
+++ 
:after
  2018-03-23 21:45:33.046206
@@ -29,15 +29,14 @@
 import unittest
 import uuid
 
-from hamcrest.core.core.allof import all_of
-from nose.plugins.attrib import attr
-
 from apache_beam.examples import streaming_wordcount
 from apache_beam.io.gcp.tests.pubsub_matcher import PubSubMessageMatcher
 from apache_beam.runners.runner import PipelineState
 from apache_beam.testing import test_utils
 from apache_beam.testing.pipeline_verifiers import PipelineStateMatcher
 from apache_beam.testing.test_pipeline import TestPipeline
+from hamcrest.core.core.allof import all_of
+from nose.plugins.attrib import attr
 
 INPUT_TOPIC = 'wc_topic_input'
 OUTPUT_TOPIC = 'wc_topic_output'
ERROR: 

 Imports are incorrectly sorted.
--- 
:before
 2018-01-24 00:22:36.719312
+++ 
:after
  2018-03-23 21:45:33.577484
@@ -21,14 +21,13 @@
 import time
 import unittest
 
-from hamcrest.core.core.allof import all_of
-from nose.plugins.attrib import attr
-
 from apache_beam.examples.cookbook import bigquery_tornadoes
 from apache_beam.io.gcp.tests import utils
 from apache_beam.io.gcp.tests.bigquery_matcher import BigqueryMatcher
 from apache_beam.testing.pipeline_verifiers import PipelineStateMatcher
 from apache_beam.testing.test_pipeline import TestPipeline
+from hamcrest.core.core.allof import all_of
+from nose.plugins.attrib import attr
 
 
 class BigqueryTornadoesIT(unittest.TestCase):
ERROR: 

 Imports are incorrectly sorted.
--- 
:before
   2018-01-24 00:22:36.959311
+++ 
:after
2018-03-23 21:45:33.719489
@@ -20,12 +20,11 @@
 import logging
 import unittest
 
-from hamcrest.core.assert_that import assert_that as hc_assert_that
-from hamcrest.core.base_matcher import BaseMatcher
-
 from apache_beam.internal import pickler
 from apache_beam.options.pipeline_options import PipelineOptions
 from apache_beam.testing.test_pipeline import TestPipeline
+from hamcrest.core.assert_that import assert_that as hc_assert_that
+from hamcrest.core.base_matcher import BaseMatcher
 
 
 # A simple matcher that is used for testing extra options appending.
ERROR: 

 Imports are incorrectly sorted.
--- 
:before
   2018-01-24 00:22:36.959311
+++ 
:after
2018-03-23 21:45:33.733857
@@ -25,12 +25,11 @@
 import logging
 import time
 
-from hamcrest.core.base_matcher import BaseMatcher
-
 from apache_beam.io.filesystems import FileSystems
 from apache_beam.runners.runner import PipelineState
 from apache_beam.testing import test_utils as utils
 from apache_beam.utils import retry
+from hamcrest.core.base_matcher import BaseMatcher
 
 __all__ = [
 'PipelineStateMatcher',
ERROR: 
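The isort failures above all follow the same pattern: the apache_beam imports and the other third-party imports (hamcrest, nose) belong in a single alphabetized block, rather than in separate groups. As a minimal sketch of the ordering rule these diffs enforce (module names taken from the diffs above), a block of import lines passes when it is already in sorted order:

```python
# The corrected ordering from the diffs above: one alphabetized block,
# with "apache_beam" sorting before "hamcrest" and "nose".
imports = [
    "from apache_beam.testing.test_pipeline import TestPipeline",
    "from hamcrest.core.core.allof import all_of",
    "from nose.plugins.attrib import attr",
]


def is_sorted(lines):
    """Return True if the import lines are already in sorted order."""
    return lines == sorted(lines)


print(is_sorted(imports))  # True: the post-fix ordering passes the check
```

Reversing the list (third-party imports first, as in the pre-fix files) makes the check fail, which is what the lint errors above report.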

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #5209

2018-03-23 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-3453) Allow usage of public Google PubSub topics in Python DirectRunner

2018-03-23 Thread Ahmet Altay (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412111#comment-16412111
 ] 

Ahmet Altay commented on BEAM-3453:
---

Any updates on this one?

> Allow usage of public Google PubSub topics in Python DirectRunner
> -
>
> Key: BEAM-3453
> URL: https://issues.apache.org/jira/browse/BEAM-3453
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Affects Versions: 2.2.0
>Reporter: Charles Chen
>Assignee: Charles Chen
>Priority: Major
>
> Currently, the Beam Python DirectRunner does not allow the usage of data from 
> public Google Cloud PubSub topics.  We should allow this functionality so 
> that users can more easily test Beam Python's streaming functionality.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2393) BoundedSource is not fault-tolerant in FlinkRunner Streaming mode

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2393?focusedWorklogId=83815&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83815
 ]

ASF GitHub Bot logged work on BEAM-2393:


Author: ASF GitHub Bot
Created on: 23/Mar/18 21:23
Start Date: 23/Mar/18 21:23
Worklog Time Spent: 10m 
  Work Description: aljoscha commented on issue #4895: [BEAM-2393] Make 
BoundedSource fault-tolerant in FlinkRunner Streaming mode
URL: https://github.com/apache/beam/pull/4895#issuecomment-375802301
 
 
   Could you please rebase and see if the tests pass locally?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83815)
Time Spent: 20m  (was: 10m)

> BoundedSource is not fault-tolerant in FlinkRunner Streaming mode
> -
>
> Key: BEAM-2393
> URL: https://issues.apache.org/jira/browse/BEAM-2393
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Jingsong Lee
>Assignee: Grzegorz Kołakowski
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{BoundedSourceWrapper}} does not implement snapshot() and restore(); when 
> restarting after a failure, it will send duplicate data.
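The fix amounts to checkpointing reader progress so a restart resumes instead of re-reading from the beginning. A language-neutral sketch of the snapshot/restore idea (plain Python with hypothetical names, not the actual Flink or Beam API):

```python
class CheckpointedSource:
    """Toy source that snapshots and restores its read position, so a
    restart after failure resumes rather than re-emitting earlier data."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # index of the next element to emit

    def snapshot(self):
        # State the runner would persist at a checkpoint barrier.
        return self.pos

    def restore(self, state):
        # Resume from the checkpointed position after a failure.
        self.pos = state

    def read(self, n):
        out = self.data[self.pos:self.pos + n]
        self.pos += len(out)
        return out


src = CheckpointedSource([1, 2, 3, 4])
src.read(2)                                 # emits [1, 2]
ckpt = src.snapshot()

restarted = CheckpointedSource([1, 2, 3, 4])  # fresh instance after failure
restarted.restore(ckpt)
print(restarted.read(2))                    # [3, 4] -- no duplicates
```

Without snapshot()/restore(), the restarted instance would start at position 0 and re-emit [1, 2], which is the duplicate-data bug described above.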



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3800) Set uids on Flink operators

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3800?focusedWorklogId=83813&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83813
 ]

ASF GitHub Bot logged work on BEAM-3800:


Author: ASF GitHub Bot
Created on: 23/Mar/18 21:21
Start Date: 23/Mar/18 21:21
Worklog Time Spent: 10m 
  Work Description: aljoscha commented on issue #4903: [BEAM-3800] Set uids 
on Flink operators
URL: https://github.com/apache/beam/pull/4903#issuecomment-375801726
 
 
   retest this please
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83813)
Time Spent: 20m  (was: 10m)

> Set uids on Flink operators
> ---
>
> Key: BEAM-3800
> URL: https://issues.apache.org/jira/browse/BEAM-3800
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Grzegorz Kołakowski
>Assignee: Grzegorz Kołakowski
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Flink operators should have unique ids assigned, which are, in turn, used for 
> checkpointing stateful operators. Assigning operator ids is highly 
> recommended according to Flink documentation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2831) Pipeline crashes due to Beam encoder breaking Flink memory management

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2831?focusedWorklogId=83811&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83811
 ]

ASF GitHub Bot logged work on BEAM-2831:


Author: ASF GitHub Bot
Created on: 23/Mar/18 21:05
Start Date: 23/Mar/18 21:05
Worklog Time Spent: 10m 
  Work Description: aljoscha commented on issue #4892: [BEAM-2831] Do not 
wrap IOException in SerializableCoder
URL: https://github.com/apache/beam/pull/4892#issuecomment-375798111
 
 
   For me, the change looks good.
   
   @kennknowles is currently not available. @lukecwik, do you have an idea 
whether this could cause unforeseen problems?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83811)
Time Spent: 50m  (was: 40m)

> Pipeline crashes due to Beam encoder breaking Flink memory management
> -
>
> Key: BEAM-2831
> URL: https://issues.apache.org/jira/browse/BEAM-2831
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.0.0, 2.1.0
> Environment: Flink 1.2.1 and 1.3.0, Java HotSpot and OpenJDK 8, macOS 
> 10.12.6 and unknown Linux
>Reporter: Reinier Kip
>Assignee: Aljoscha Krettek
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I’ve been running a Beam pipeline on Flink. Depending on the dataset size and 
> the heap memory configuration of the jobmanager and taskmanager, I may run 
> into an EOFException, which causes the job to fail.
> As [discussed on Flink's 
> mailing list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/EOFException-related-to-memory-segments-during-run-of-Beam-pipeline-on-Flink-td15255.html]
>  (stacktrace enclosed), Flink catches these EOFExceptions and activates disk 
> spillover. Because Beam wraps these exceptions, this mechanism fails, the 
> exception travels up the stack, and the job aborts.
> Hopefully this is enough information and this is something that can be 
> adjusted for in Beam. I'd be glad to provide more information where needed.
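The failure mode described here, a handler that dispatches on exception type being defeated by wrapping, is easy to reproduce in miniature. A plain-Python illustration (hypothetical names; the actual fix is in the Java SerializableCoder, per the linked PR):

```python
def spillover_handler(fn):
    """Mimics Flink catching an EOF-style error to activate disk spillover."""
    try:
        fn()
        return "ok"
    except EOFError:
        return "spillover"  # the recovery path Flink relies on


def raw():
    raise EOFError("out of memory segments")


def wrapped():
    try:
        raw()
    except EOFError as e:
        # Wrapping changes the exception type the handler dispatches on.
        raise RuntimeError(e)


print(spillover_handler(raw))  # "spillover" -- recovery kicks in
# spillover_handler(wrapped) instead propagates RuntimeError past the
# handler, aborting the job: the bug described above.
```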



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2831) Pipeline crashes due to Beam encoder breaking Flink memory management

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2831?focusedWorklogId=83810&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83810
 ]

ASF GitHub Bot logged work on BEAM-2831:


Author: ASF GitHub Bot
Created on: 23/Mar/18 21:04
Start Date: 23/Mar/18 21:04
Worklog Time Spent: 10m 
  Work Description: aljoscha commented on issue #4892: [BEAM-2831] Do not 
wrap IOException in SerializableCoder
URL: https://github.com/apache/beam/pull/4892#issuecomment-375797900
 
 
   retest this please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83810)
Time Spent: 40m  (was: 0.5h)

> Pipeline crashes due to Beam encoder breaking Flink memory management
> -
>
> Key: BEAM-2831
> URL: https://issues.apache.org/jira/browse/BEAM-2831
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.0.0, 2.1.0
> Environment: Flink 1.2.1 and 1.3.0, Java HotSpot and OpenJDK 8, macOS 
> 10.12.6 and unknown Linux
>Reporter: Reinier Kip
>Assignee: Aljoscha Krettek
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I’ve been running a Beam pipeline on Flink. Depending on the dataset size and 
> the heap memory configuration of the jobmanager and taskmanager, I may run 
> into an EOFException, which causes the job to fail.
> As [discussed on Flink's 
> mailing list|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/EOFException-related-to-memory-segments-during-run-of-Beam-pipeline-on-Flink-td15255.html]
>  (stacktrace enclosed), Flink catches these EOFExceptions and activates disk 
> spillover. Because Beam wraps these exceptions, this mechanism fails, the 
> exception travels up the stack, and the job aborts.
> Hopefully this is enough information and this is something that can be 
> adjusted for in Beam. I'd be glad to provide more information where needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=83806&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83806
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 23/Mar/18 20:54
Start Date: 23/Mar/18 20:54
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #4905: [BEAM-3848] Enables 
ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-375795482
 
 
   @timrobertson100 My apologies, I was quite busy with other things. I will take 
a look and give you some feedback at the beginning of next week at the latest. 
Thanks for the contribution and for your interest in improving SolrIO.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83806)
Time Spent: 1h 20m  (was: 1h 10m)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> A busy Solr server is prone to return a RemoteSolrException on writes, which 
> currently fails a complete task (e.g. a partition of a Spark RDD being 
> written to Solr).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3892) Make MetricQueryResults and related classes more json-serialization friendly

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3892?focusedWorklogId=83805&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83805
 ]

ASF GitHub Bot logged work on BEAM-3892:


Author: ASF GitHub Bot
Created on: 23/Mar/18 20:53
Start Date: 23/Mar/18 20:53
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #4918: [BEAM-3892] Make 
MetricQueryResults and related classes more json-serialization friendly
URL: https://github.com/apache/beam/pull/4918#issuecomment-375795235
 
 
   @echauchot can you please rebase so we can get Jenkins happy and merge.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83805)
Time Spent: 1h 50m  (was: 1h 40m)

> Make MetricQueryResults and related classes more json-serialization friendly
> 
>
> Key: BEAM-3892
> URL: https://issues.apache.org/jira/browse/BEAM-3892
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When working on this PR [https://github.com/apache/beam/pull/4548] 
> MetricQueryResults needed to be serialized to be pushed to a metrics sink. As 
> it was, this required a custom serializer that just calls the name(), 
> counter(), committed(), attempted(), ... methods. MetricQueryResults is so 
> close to being serializable with the default serializer (the accessors just 
> need to be renamed to get*) that creating DTO objects whose get* methods 
> merely call the non-get methods seems unnecessary.
> So just rename the public accessors to get* on the experimental API.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3892) Make MetricQueryResults and related classes more json-serialization friendly

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3892?focusedWorklogId=83804&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83804
 ]

ASF GitHub Bot logged work on BEAM-3892:


Author: ASF GitHub Bot
Created on: 23/Mar/18 20:53
Start Date: 23/Mar/18 20:53
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #4918: [BEAM-3892] Make 
MetricQueryResults and related classes more json-serialization friendly
URL: https://github.com/apache/beam/pull/4918#issuecomment-375795214
 
 
   @aviemzur I think metrics is still a moving target, so this probably won't 
happen soon.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83804)
Time Spent: 1h 40m  (was: 1.5h)

> Make MetricQueryResults and related classes more json-serialization friendly
> 
>
> Key: BEAM-3892
> URL: https://issues.apache.org/jira/browse/BEAM-3892
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When working on this PR [https://github.com/apache/beam/pull/4548] 
> MetricQueryResults needed to be serialized to be pushed to a metrics sink. As 
> it was, this required a custom serializer that just calls the name(), 
> counter(), committed(), attempted(), ... methods. MetricQueryResults is so 
> close to being serializable with the default serializer (the accessors just 
> need to be renamed to get*) that creating DTO objects whose get* methods 
> merely call the non-get methods seems unnecessary.
> So just rename the public accessors to get* on the experimental API.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3848) SolrIO: Improve retrying mechanism in client writes

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3848?focusedWorklogId=83798&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83798
 ]

ASF GitHub Bot logged work on BEAM-3848:


Author: ASF GitHub Bot
Created on: 23/Mar/18 20:36
Start Date: 23/Mar/18 20:36
Worklog Time Spent: 10m 
  Work Description: jkff commented on issue #4905: [BEAM-3848] Enables 
ability to retry Solr writes on error (SolrIO)
URL: https://github.com/apache/beam/pull/4905#issuecomment-375790839
 
 
   Thanks! Looks generally reasonable to me, but @iemejia would you mind doing 
the first detailed round here?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83798)
Time Spent: 1h 10m  (was: 1h)

> SolrIO: Improve retrying mechanism in client writes
> ---
>
> Key: BEAM-3848
> URL: https://issues.apache.org/jira/browse/BEAM-3848
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-solr
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Tim Robertson
>Assignee: Tim Robertson
>Priority: Minor
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> A busy Solr server is prone to return a RemoteSolrException on writes, which 
> currently fails a complete task (e.g. a partition of a Spark RDD being 
> written to Solr).
> A good addition would be the ability to provide a retrying mechanism for the 
> batch in flight, rather than failing fast, which will most likely trigger a 
> much larger retry of more writes.
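The retry-the-batch-in-flight idea can be sketched generically (plain Python with hypothetical names; the real change lives in SolrIO's Java write path):

```python
import time


def write_with_retries(batch, send, max_attempts=3, base_delay=0.0):
    """Retry a transient failure on the in-flight batch instead of failing
    the whole task, which would re-trigger far more writes upstream."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(batch)
        except IOError:
            if attempt == max_attempts:
                raise  # give up: treat as a permanent failure
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff


# A flaky sink that fails twice, then succeeds:
calls = {"n": 0}


def flaky_send(batch):
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("server busy")
    return len(batch)


print(write_with_retries(["doc1", "doc2"], flaky_send))  # 2, after 2 retries
```

Only the batch in flight is retried; elements already written successfully are not resent, which is the advantage over failing fast.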



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3706) Update CombinePayload to improved model for Portability

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3706?focusedWorklogId=83796&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83796
 ]

ASF GitHub Bot logged work on BEAM-3706:


Author: ASF GitHub Bot
Created on: 23/Mar/18 20:33
Start Date: 23/Mar/18 20:33
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #4924: 
[BEAM-3706] Removing side inputs from CombinePayload proto.
URL: https://github.com/apache/beam/pull/4924#discussion_r176855962
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTransformTranslators.java
 ##
 @@ -846,12 +846,7 @@ boolean canTranslate(
   WindowingStrategy windowingStrategy =
   (WindowingStrategy) input.getWindowingStrategy();
 
-  boolean hasNoSideInputs;
-  try {
-hasNoSideInputs = 
CombineTranslation.getSideInputs(context.getCurrentTransform()).isEmpty();
-  } catch (IOException e) {
-throw new RuntimeException(e);
-  }
+  boolean hasNoSideInputs = true;
 
   return windowingStrategy.getWindowFn().isNonMerging() || hasNoSideInputs;
 
 Review comment:
   This is always true, remove the implementation


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83796)
Time Spent: 1h 50m  (was: 1h 40m)

> Update CombinePayload to improved model for Portability
> ---
>
> Key: BEAM-3706
> URL: https://issues.apache.org/jira/browse/BEAM-3706
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Labels: portability
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> This will mean changing the proto definition in beam_runner_api, most likely 
> trimming out fields that are no longer necessary and adding any new ones that 
> could be useful. The majority of work will probably be in investigating if 
> some existing fields can actually be removed (SideInputs and Parameters for 
> example).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3516) SpannerWriteGroupFn does not respect mutation limits

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3516?focusedWorklogId=83793&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83793
 ]

ASF GitHub Bot logged work on BEAM-3516:


Author: ASF GitHub Bot
Created on: 23/Mar/18 20:21
Start Date: 23/Mar/18 20:21
Worklog Time Spent: 10m 
  Work Description: NathanHowell commented on issue #4860: [BEAM-3516] 
Spanner BatchFn does not respect mutation limits
URL: https://github.com/apache/beam/pull/4860#issuecomment-375787131
 
 
   @mairbek I've made the requested changes, can you take another look? (an 
unrelated test is failing atm)




Issue Time Tracking
---

Worklog Id: (was: 83793)
Time Spent: 1h 10m  (was: 1h)

> SpannerWriteGroupFn does not respect mutation limits
> 
>
> Key: BEAM-3516
> URL: https://issues.apache.org/jira/browse/BEAM-3516
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.2.0
>Reporter: Ryan Gordon
>Assignee: Chamikara Jayalath
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When using SpannerIO.write(), a large batch or a table 
> with indexes can easily hit the Spanner mutation limit 
> and fail with the following error:
> {quote}Jan 02, 2018 2:42:59 PM 
> org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
> SEVERE: 2018-01-02T22:42:57.873Z: (3e7c871d215e890b): 
> com.google.cloud.spanner.SpannerException: INVALID_ARGUMENT: 
> io.grpc.StatusRuntimeException: INVALID_ARGUMENT: The transaction contains 
> too many mutations. Insert and update operations count with the multiplicity 
> of the number of columns they affect. For example, inserting values into one 
> key column and four non-key columns count as five mutations total for the 
> insert. Delete and delete range operations count as one mutation regardless 
> of the number of columns affected. The total mutation count includes any 
> changes to indexes that the transaction generates. Please reduce the number 
> of writes, or use fewer indexes. (Maximum number: 2)
> links {
>  description: "Cloud Spanner limits documentation."
>  url: "https://cloud.google.com/spanner/docs/limits"
> }
> at 
> com.google.cloud.spanner.SpannerExceptionFactory.newSpannerExceptionPreformatted(SpannerExceptionFactory.java:119)
>  at 
> com.google.cloud.spanner.SpannerExceptionFactory.newSpannerException(SpannerExceptionFactory.java:43)
>  at 
> com.google.cloud.spanner.SpannerExceptionFactory.newSpannerException(SpannerExceptionFactory.java:80)
>  at 
> com.google.cloud.spanner.spi.v1.GrpcSpannerRpc.get(GrpcSpannerRpc.java:404)
>  at 
> com.google.cloud.spanner.spi.v1.GrpcSpannerRpc.commit(GrpcSpannerRpc.java:376)
>  at 
> com.google.cloud.spanner.SpannerImpl$SessionImpl$2.call(SpannerImpl.java:729)
>  at 
> com.google.cloud.spanner.SpannerImpl$SessionImpl$2.call(SpannerImpl.java:726)
>  at com.google.cloud.spanner.SpannerImpl.runWithRetries(SpannerImpl.java:200)
>  at 
> com.google.cloud.spanner.SpannerImpl$SessionImpl.writeAtLeastOnce(SpannerImpl.java:725)
>  at 
> com.google.cloud.spanner.SessionPool$PooledSession.writeAtLeastOnce(SessionPool.java:248)
>  at 
> com.google.cloud.spanner.DatabaseClientImpl.writeAtLeastOnce(DatabaseClientImpl.java:37)
>  at 
> org.apache.beam.sdk.io.gcp.spanner.SpannerWriteGroupFn.flushBatch(SpannerWriteGroupFn.java:108)
>  at 
> org.apache.beam.sdk.io.gcp.spanner.SpannerWriteGroupFn.processElement(SpannerWriteGroupFn.java:79)
> {quote}
>  
> As a workaround we can override the "withBatchSizeBytes" to something much 
> smaller:
> {quote}mutations.apply("Write", SpannerIO
>    .write()
>    // Artificially reduce the max batch size b/c the batcher currently doesn't
>    // take into account the 2 mutation multiplicity limit
>    .withBatchSizeBytes(1024) // 1KB
>    .withProjectId("#PROJECTID#")
>    .withInstanceId("#INSTANCE#")
>    .withDatabaseId("#DATABASE#")
>  );
> {quote}
> While this is not as efficient, it at least allows the write to work consistently.
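The mutation-counting rules quoted in the error message can be sketched as a tiny helper. This is a hedged illustration with hypothetical names, not part of SpannerIO: inserts and updates count once per affected column, while deletes count as one mutation.

```java
// Hedged sketch of Cloud Spanner's mutation-counting rules as described in the
// error above (hypothetical helper, not part of SpannerIO): insert/update
// operations count once per affected column; delete and delete-range
// operations count as a single mutation regardless of columns.
class MutationCountSketch {
    enum Op { INSERT_OR_UPDATE, DELETE }

    /** Mutations one operation contributes toward the per-transaction limit. */
    static int mutationCount(Op op, int columnsAffected) {
        return op == Op.DELETE ? 1 : columnsAffected;
    }
}
```

For the example in the error message, inserting into one key column and four non-key columns gives `mutationCount(Op.INSERT_OR_UPDATE, 5)`, i.e. five mutations, while any delete contributes one.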





[jira] [Work logged] (BEAM-3706) Update CombinePayload to improved model for Portability

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3706?focusedWorklogId=83792&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83792
 ]

ASF GitHub Bot logged work on BEAM-3706:


Author: ASF GitHub Bot
Created on: 23/Mar/18 20:19
Start Date: 23/Mar/18 20:19
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #4924: 
[BEAM-3706] Removing side inputs from CombinePayload proto.
URL: https://github.com/apache/beam/pull/4924#discussion_r176853193
 
 

 ##
 File path: 
runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java
 ##
 @@ -196,10 +196,13 @@ private PTransformTranslation() {}
 transformBuilder.setSpec(spec);
   }
 } else if (KNOWN_PAYLOAD_TRANSLATORS.containsKey(transform.getClass())) {
-  transformBuilder.setSpec(
+  FunctionSpec spec =
   KNOWN_PAYLOAD_TRANSLATORS
   .get(transform.getClass())
-  .translate(appliedPTransform, components));
+  .translate(appliedPTransform, components);
+  if (spec != null) {
 
 Review comment:
   This follows the pattern a few lines higher where `null` function specs are 
used as a signal to represent that they are absent.
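The null-as-absent pattern this comment describes can be sketched in isolation. All names below are hypothetical stand-ins; the real code works with Beam's FunctionSpec and per-transform translators.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal sketch of the null-as-absent pattern discussed above (hypothetical
// names): a registered translator may return null to signal that a transform
// has no payload, and the caller only attaches the spec when one is present.
class NullSpecSketch {
    static final Map<String, Function<String, String>> TRANSLATORS = new HashMap<>();
    static {
        TRANSLATORS.put("combine", t -> null);          // spec absent
        TRANSLATORS.put("pardo", t -> "pardo-payload"); // spec present
    }

    static StringBuilder translate(String kind, StringBuilder transformBuilder) {
        String spec = TRANSLATORS.get(kind).apply(kind);
        if (spec != null) { // null signals "no spec", as in the review comment
            transformBuilder.append(spec);
        }
        return transformBuilder;
    }
}
```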




Issue Time Tracking
---

Worklog Id: (was: 83792)
Time Spent: 1h 40m  (was: 1.5h)

> Update CombinePayload to improved model for Portability
> ---
>
> Key: BEAM-3706
> URL: https://issues.apache.org/jira/browse/BEAM-3706
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Labels: portability
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> This will mean changing the proto definition in beam_runner_api, most likely 
> trimming out fields that are no longer necessary and adding any new ones that 
> could be useful. The majority of work will probably be in investigating if 
> some existing fields can actually be removed (SideInputs and Parameters for 
> example).





[jira] [Work logged] (BEAM-3706) Update CombinePayload to improved model for Portability

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3706?focusedWorklogId=83791&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83791
 ]

ASF GitHub Bot logged work on BEAM-3706:


Author: ASF GitHub Bot
Created on: 23/Mar/18 20:18
Start Date: 23/Mar/18 20:18
Worklog Time Spent: 10m 
  Work Description: aljoscha commented on a change in pull request #4924: 
[BEAM-3706] Removing side inputs from CombinePayload proto.
URL: https://github.com/apache/beam/pull/4924#discussion_r176852910
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTransformTranslators.java
 ##
 @@ -846,12 +846,7 @@ boolean canTranslate(
   WindowingStrategy windowingStrategy =
   (WindowingStrategy) input.getWindowingStrategy();
 
-  boolean hasNoSideInputs;
-  try {
-hasNoSideInputs = 
CombineTranslation.getSideInputs(context.getCurrentTransform()).isEmpty();
-  } catch (IOException e) {
-throw new RuntimeException(e);
-  }
+  boolean hasNoSideInputs = true;
 
   return windowingStrategy.getWindowFn().isNonMerging() || hasNoSideInputs;
 
 Review comment:
   Here we can just delete the `|| hasNoSideInputs` bit, because it's now 
always `true`.




Issue Time Tracking
---

Worklog Id: (was: 83791)
Time Spent: 1.5h  (was: 1h 20m)

> Update CombinePayload to improved model for Portability
> ---
>
> Key: BEAM-3706
> URL: https://issues.apache.org/jira/browse/BEAM-3706
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Labels: portability
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> This will mean changing the proto definition in beam_runner_api, most likely 
> trimming out fields that are no longer necessary and adding any new ones that 
> could be useful. The majority of work will probably be in investigating if 
> some existing fields can actually be removed (SideInputs and Parameters for 
> example).





[jira] [Work logged] (BEAM-3706) Update CombinePayload to improved model for Portability

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3706?focusedWorklogId=83790&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83790
 ]

ASF GitHub Bot logged work on BEAM-3706:


Author: ASF GitHub Bot
Created on: 23/Mar/18 20:11
Start Date: 23/Mar/18 20:11
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #4924: 
[BEAM-3706] Removing side inputs from CombinePayload proto.
URL: https://github.com/apache/beam/pull/4924#discussion_r176851219
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkBatchTransformTranslators.java
 ##
 @@ -396,12 +396,8 @@ public void translateNode(
   // construct a map from side input to WindowingStrategy so that
   // the DoFn runner can map main-input windows to side input windows
   Map sideInputStrategies = 
new HashMap<>();
-  List sideInputs;
-  try {
-sideInputs = 
CombineTranslation.getSideInputs(context.getCurrentTransform());
-  } catch (IOException e) {
-throw new RuntimeException(e);
-  }
+  List<PCollectionView<?>> sideInputs = new ArrayList<>();
 
 Review comment:
   Since we no longer have side inputs in Combine, can you go ahead and remove 
all references to side inputs in this function?




Issue Time Tracking
---

Worklog Id: (was: 83790)
Time Spent: 1h 20m  (was: 1h 10m)

> Update CombinePayload to improved model for Portability
> ---
>
> Key: BEAM-3706
> URL: https://issues.apache.org/jira/browse/BEAM-3706
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Labels: portability
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This will mean changing the proto definition in beam_runner_api, most likely 
> trimming out fields that are no longer necessary and adding any new ones that 
> could be useful. The majority of work will probably be in investigating if 
> some existing fields can actually be removed (SideInputs and Parameters for 
> example).





[jira] [Work logged] (BEAM-3706) Update CombinePayload to improved model for Portability

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3706?focusedWorklogId=83789&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83789
 ]

ASF GitHub Bot logged work on BEAM-3706:


Author: ASF GitHub Bot
Created on: 23/Mar/18 20:11
Start Date: 23/Mar/18 20:11
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #4924: 
[BEAM-3706] Removing side inputs from CombinePayload proto.
URL: https://github.com/apache/beam/pull/4924#discussion_r176851403
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTransformTranslators.java
 ##
 @@ -912,12 +907,7 @@ public void translateNode(
   TypeInformation>> outputTypeInfo =
   context.getTypeInfo(context.getOutput(transform));
 
-  List sideInputs;
-  try {
-sideInputs = 
CombineTranslation.getSideInputs(context.getCurrentTransform());
-  } catch (IOException e) {
-throw new RuntimeException(e);
-  }
+  List<PCollectionView<?>> sideInputs = new ArrayList<>();
 
 Review comment:
   As above, please remove references to side inputs here. If it's not clear 
how to do that, please add a TODO and a new bug to track this removal.




Issue Time Tracking
---

Worklog Id: (was: 83789)
Time Spent: 1h 10m  (was: 1h)

> Update CombinePayload to improved model for Portability
> ---
>
> Key: BEAM-3706
> URL: https://issues.apache.org/jira/browse/BEAM-3706
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Minor
>  Labels: portability
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> This will mean changing the proto definition in beam_runner_api, most likely 
> trimming out fields that are no longer necessary and adding any new ones that 
> could be useful. The majority of work will probably be in investigating if 
> some existing fields can actually be removed (SideInputs and Parameters for 
> example).





Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Spark #4484

2018-03-23 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #6282

2018-03-23 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1172

2018-03-23 Thread Apache Jenkins Server
See 


Changes:

[coheigea] Put the String literal first when comparing it to an Object

[jb] [BEAM-3500] "Attach" JDBC connection to the bundle (improve the pooling)

[jb] [BEAM-3500] Test if the user provides both withDataSourceConfiguration()

[jb] [BEAM-3500] Wrap the datasource as a poolable datasource and expose

[jb] [BEAM-3500] Add commons-pool2 dependency

[jb] [BEAM-3500] Only expose max number of connections in the pool to the

[jb] [BEAM-3500] Cleanup pool configuration parameters

[jb] [BEAM-3500] Remove dataSourceFactory

[jb] [BEAM-3500] Remove unecessary check on dataSourceConfiguration

[mariand] Switched AvroIO default codec to snappyCodec().

[aaltay] [BEAM-3738] Enable py3 lint and cleanup tox.ini. (#4877)
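The first change listed above ("Put the String literal first when comparing it to an Object") is a common null-safety idiom; a minimal illustration with hypothetical names:

```java
// Calling equals on the literal cannot throw, even when the argument is null;
// the reversed form (value.equals("foo")) throws a NullPointerException
// whenever value is null.
class LiteralFirstSketch {
    static boolean isFoo(Object value) {
        return "foo".equals(value); // safe when value == null
    }
}
```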

--
[...truncated 782.52 KB...]
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": ""
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": "kind:pair", 
  "component_encodings": [
{
  "@type": "kind:bytes"
}, 
{
  "@type": 
"VarIntCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxhiUWeeSXOIA5XIYNmYyFjbSFTkh4A89cR+g==",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "compute/MapToVoidKey0.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s2"
}, 
"serialized_fn": "", 
"user_name": "compute/MapToVoidKey0"
  }
}
  ], 
  "type": "JOB_TYPE_BATCH"
}
root: INFO: Create job: 
root: INFO: Created job with id: [2018-03-23_12_53_39-8870822437358831112]
root: INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-03-23_12_53_39-8870822437358831112?project=apache-beam-testing
root: INFO: Job 2018-03-23_12_53_39-8870822437358831112 is in state 
JOB_STATE_PENDING
root: INFO: 2018-03-23T19:53:39.955Z: JOB_MESSAGE_WARNING: Job 
2018-03-23_12_53_39-8870822437358831112 might autoscale up to 250 workers.
root: INFO: 2018-03-23T19:53:39.977Z: JOB_MESSAGE_DETAILED: Autoscaling is 
enabled for job 2018-03-23_12_53_39-8870822437358831112. The number of workers 
will be between 1 and 250.
root: INFO: 2018-03-23T19:53:40.008Z: JOB_MESSAGE_DETAILED: Autoscaling was 
automatically enabled for job 2018-03-23_12_53_39-8870822437358831112.
root: INFO: 2018-03-23T19:53:42.651Z: JOB_MESSAGE_DETAILED: Checking required 
Cloud APIs are enabled.
root: INFO: 2018-03-23T19:53:42.959Z: JOB_MESSAGE_DETAILED: Checking 
permissions granted to controller Service Account.
root: INFO: 2018-03-23T19:53:43.874Z: JOB_MESSAGE_DETAILED: Expanding 
CoGroupByKey operations into optimizable parts.
root: INFO: 2018-03-23T19:53:43.911Z: JOB_MESSAGE_DEBUG: Combiner lifting 
skipped for step assert_that/Group/GroupByKey: GroupByKey not followed by a 
combiner.
root: INFO: 2018-03-23T19:53:43.943Z: JOB_MESSAGE_DETAILED: Expanding 
GroupByKey operations into optimizable parts.
root: INFO: 2018-03-23T19:53:43.971Z: JOB_MESSAGE_DETAILED: Lifting 
ValueCombiningMappingFns into MergeBucketsMappingFns
root: INFO: 2018-03-23T19:53:44.013Z: JOB_MESSAGE_DEBUG: Annotating graph with 
Autotuner information.
root: INFO: 2018-03-23T19:53:44.053Z: JOB_MESSAGE_DETAILED: Fusing adjacent 
ParDo, Read, Write, and Flatten operations
root: INFO: 2018-03-23T19:53:44.088Z: JOB_MESSAGE_DETAILED: Unzipping flatten 
s11 for input s10.out
root: INFO: 2018-03-23T19:53:44.123Z: JOB_MESSAGE_DETAILED: Fusing unzipped 
copy of assert_that/Group/GroupByKey/Reify, through flatten 
assert_that/Group/Flatten, into producer assert_that/Group/pair_with_1
root: INFO: 2018-03-23T19:53:44.166Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Group/GroupByKey/GroupByWindow into 
assert_that/Group/GroupByKey/Read
root: INFO: 2018-03-23T19:53:44.194Z: 

[jira] [Assigned] (BEAM-3919) checkpoint can not work with flink 1.4.1,1.4.2

2018-03-23 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek reassigned BEAM-3919:
--

Assignee: (was: Aljoscha Krettek)

> checkpoint can not work with flink 1.4.1,1.4.2
> --
>
> Key: BEAM-3919
> URL: https://issues.apache.org/jira/browse/BEAM-3919
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.3.0, 2.4.0
>Reporter: eisig
>Priority: Critical
>
> When submitting an application to a Flink cluster (1.4.1, 1.4.2) with 
> checkpointing enabled, the 
> job fails with the exception:
> java.lang.NoSuchMethodError: 
> org.apache.flink.streaming.api.operators.HeapInternalTimerService.snapshotTimersForKeyGroup(Lorg/apache/flink/core/memory/DataOutputViewStreamWrapper;I)V
>  
> It seems that 
> `org.apache.flink.streaming.api.operators.HeapInternalTimerService.snapshotTimersForKeyGroup`
>   was changed in Flink 1.4.1.
>  





[jira] [Resolved] (BEAM-3500) JdbcIO: Improve connection management

2018-03-23 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré resolved BEAM-3500.

   Resolution: Fixed
Fix Version/s: 2.5.0

> JdbcIO: Improve connection management
> -
>
> Key: BEAM-3500
> URL: https://issues.apache.org/jira/browse/BEAM-3500
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-jdbc
>Affects Versions: 2.2.0
>Reporter: Pawel Bartoszek
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> The JdbcIO write DoFn acquires a connection in {{@Setup}} and releases it in 
> {{@Teardown}}, which means the connection might stay open for days 
> in the streaming case. Keeping a single connection open for so long is 
> risky, as it is exposed to database, network, and other issues.
> *Taking the connection from the pool when it is actually needed*
> I suggest that the connection be taken from the connection pool in the 
> {{executeBatch}} method and released when the batch is flushed. This will 
> allow the pool to take care of any returned unhealthy connections.
> *Make JdbcIO accept a data source factory*
>  It would be nice if JdbcIO accepted a DataSourceFactory rather than a DataSource 
> itself, because the sink checks whether the DataSource implements the 
> `Serializable` interface, which makes it impossible to pass a 
> BasicDataSource (used internally by the sink), as it doesn't implement that 
> interface. Something like:
> {code:java}
> interface DataSourceFactory extends Serializable {
>  DataSource create();
> }
> {code}
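The borrow-per-batch proposal above can be sketched without any real JDBC. All names here are hypothetical; a real implementation would use a pooling DataSource (e.g. commons-pool2) rather than this toy pool.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of per-batch connection borrowing (hypothetical names, no real JDBC):
// instead of holding one connection from @Setup to @Teardown, borrow from a
// pool only for the duration of a batch flush, then return it so the pool can
// retire unhealthy connections between batches.
class PerBatchConnectionSketch {
    interface Connection { void execute(String statement); }

    static class Pool {
        private final Deque<Connection> idle = new ArrayDeque<>();
        int borrowed = 0;
        Pool(Connection c) { idle.push(c); }
        Connection borrow() { borrowed++; return idle.pop(); }
        void giveBack(Connection c) { borrowed--; idle.push(c); }
    }

    /** Flushes one batch, holding a pooled connection only while it runs. */
    static void flushBatch(Pool pool, Iterable<String> batch) {
        Connection conn = pool.borrow();
        try {
            for (String stmt : batch) {
                conn.execute(stmt);
            }
        } finally {
            pool.giveBack(conn); // returned even if a statement fails
        }
    }
}
```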





[jira] [Work logged] (BEAM-3500) JdbcIO: Improve connection management

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3500?focusedWorklogId=83777&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83777
 ]

ASF GitHub Bot logged work on BEAM-3500:


Author: ASF GitHub Bot
Created on: 23/Mar/18 19:36
Start Date: 23/Mar/18 19:36
Worklog Time Spent: 10m 
  Work Description: jbonofre commented on issue #4461: [BEAM-3500] "Attach" 
JDBC connection to the bundle and add DataSourceFactory allowing full control 
of the way the DataSource is created
URL: https://github.com/apache/beam/pull/4461#issuecomment-375776291
 
 
   Cool ! Thanks a lot !!




Issue Time Tracking
---

Worklog Id: (was: 83777)
Time Spent: 7.5h  (was: 7h 20m)

> JdbcIO: Improve connection management
> -
>
> Key: BEAM-3500
> URL: https://issues.apache.org/jira/browse/BEAM-3500
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-jdbc
>Affects Versions: 2.2.0
>Reporter: Pawel Bartoszek
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> The JdbcIO write DoFn acquires a connection in {{@Setup}} and releases it in 
> {{@Teardown}}, which means the connection might stay open for days 
> in the streaming case. Keeping a single connection open for so long is 
> risky, as it is exposed to database, network, and other issues.
> *Taking the connection from the pool when it is actually needed*
> I suggest that the connection be taken from the connection pool in the 
> {{executeBatch}} method and released when the batch is flushed. This will 
> allow the pool to take care of any returned unhealthy connections.
> *Make JdbcIO accept a data source factory*
>  It would be nice if JdbcIO accepted a DataSourceFactory rather than a DataSource 
> itself, because the sink checks whether the DataSource implements the 
> `Serializable` interface, which makes it impossible to pass a 
> BasicDataSource (used internally by the sink), as it doesn't implement that 
> interface. Something like:
> {code:java}
> interface DataSourceFactory extends Serializable {
>  DataSource create();
> }
> {code}





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #5208

2018-03-23 Thread Apache Jenkins Server
See 




Jenkins build is unstable: beam_PostCommit_Java_MavenInstall #6281

2018-03-23 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_Verify #4492

2018-03-23 Thread Apache Jenkins Server
See 


Changes:

[coheigea] Put the String literal first when comparing it to an Object

[jb] [BEAM-3500] "Attach" JDBC connection to the bundle (improve the pooling)

[jb] [BEAM-3500] Test if the user provides both withDataSourceConfiguration()

[jb] [BEAM-3500] Wrap the datasource as a poolable datasource and expose

[jb] [BEAM-3500] Add commons-pool2 dependency

[jb] [BEAM-3500] Only expose max number of connections in the pool to the

[jb] [BEAM-3500] Cleanup pool configuration parameters

[jb] [BEAM-3500] Remove dataSourceFactory

[jb] [BEAM-3500] Remove unecessary check on dataSourceConfiguration

[mariand] Switched AvroIO default codec to snappyCodec().

[aaltay] [BEAM-3738] Enable py3 lint and cleanup tox.ini. (#4877)

--
[...truncated 290.28 KB...]
-from hamcrest.core.core.allof import all_of
-from nose.plugins.attrib import attr
-
 from apache_beam.examples import wordcount
 from apache_beam.examples import wordcount_fnapi
 from apache_beam.testing.pipeline_verifiers import FileChecksumMatcher
 from apache_beam.testing.pipeline_verifiers import PipelineStateMatcher
 from apache_beam.testing.test_pipeline import TestPipeline
 from apache_beam.testing.test_utils import delete_files
+from hamcrest.core.core.allof import all_of
+from nose.plugins.attrib import attr
 
 
 class WordCountIT(unittest.TestCase):
ERROR: 

 Imports are incorrectly sorted.
--- 
:before
 2018-03-23 17:50:29.204663
+++ 
:after
  2018-03-23 19:26:43.761648
@@ -29,15 +29,14 @@
 import unittest
 import uuid
 
-from hamcrest.core.core.allof import all_of
-from nose.plugins.attrib import attr
-
 from apache_beam.examples import streaming_wordcount
 from apache_beam.io.gcp.tests.pubsub_matcher import PubSubMessageMatcher
 from apache_beam.runners.runner import PipelineState
 from apache_beam.testing import test_utils
 from apache_beam.testing.pipeline_verifiers import PipelineStateMatcher
 from apache_beam.testing.test_pipeline import TestPipeline
+from hamcrest.core.core.allof import all_of
+from nose.plugins.attrib import attr
 
 INPUT_TOPIC = 'wc_topic_input'
 OUTPUT_TOPIC = 'wc_topic_output'
ERROR: 

 Imports are incorrectly sorted.
--- 
:before
 2018-01-24 00:22:36.719312
+++ 
:after
  2018-03-23 19:26:44.298138
@@ -21,14 +21,13 @@
 import time
 import unittest
 
-from hamcrest.core.core.allof import all_of
-from nose.plugins.attrib import attr
-
 from apache_beam.examples.cookbook import bigquery_tornadoes
 from apache_beam.io.gcp.tests import utils
 from apache_beam.io.gcp.tests.bigquery_matcher import BigqueryMatcher
 from apache_beam.testing.pipeline_verifiers import PipelineStateMatcher
 from apache_beam.testing.test_pipeline import TestPipeline
+from hamcrest.core.core.allof import all_of
+from nose.plugins.attrib import attr
 
 
 class BigqueryTornadoesIT(unittest.TestCase):
ERROR: 

 Imports are incorrectly sorted.
--- 
:before
   2018-01-24 00:22:36.959311
+++ 
:after
2018-03-23 19:26:44.443558
@@ -20,12 +20,11 @@
 import logging
 import unittest
 
-from hamcrest.core.assert_that import assert_that as hc_assert_that
-from hamcrest.core.base_matcher import BaseMatcher
-
 from apache_beam.internal import pickler
 from apache_beam.options.pipeline_options import PipelineOptions
 from apache_beam.testing.test_pipeline import TestPipeline
+from hamcrest.core.assert_that import assert_that as hc_assert_that
+from hamcrest.core.base_matcher import BaseMatcher
 
 
 # A simple matcher that is used for testing extra options appending.
ERROR: 

 Imports are incorrectly sorted.
--- 

[jira] [Commented] (BEAM-3781) Figure out min supported Python 3 version

2018-03-23 Thread holdenk (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16411920#comment-16411920
 ] 

holdenk commented on BEAM-3781:
---

So I'd like to encourage 3.4: PySpark currently supports 3.4, and if we 
want folks to be able to (more) easily port pipelines, it would be good to 
support it. (See [https://github.com/apache/spark/blob/master/python/setup.py])

> Figure out min supported Python 3 version
> -
>
> Key: BEAM-3781
> URL: https://issues.apache.org/jira/browse/BEAM-3781
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Ahmet Altay
>Priority: Minor
>
> We have 3.4.3 installed on Jenkins workers. We could target that as we add 
> support, but in the long run we will need to figure out the supported-version 
> story for Python 3.





[jira] [Work logged] (BEAM-3874) Switch AvroIO sink default codec to Snappy

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3874?focusedWorklogId=83772&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83772
 ]

ASF GitHub Bot logged work on BEAM-3874:


Author: ASF GitHub Bot
Created on: 23/Mar/18 19:09
Start Date: 23/Mar/18 19:09
Worklog Time Spent: 10m 
  Work Description: jkff commented on issue #4904: [BEAM-3874] Switched 
AvroIO default codec to snappyCodec.
URL: https://github.com/apache/beam/pull/4904#issuecomment-375769897
 
 
   Thanks!




Issue Time Tracking
---

Worklog Id: (was: 83772)
Time Spent: 0.5h  (was: 20m)

> Switch AvroIO sink default codec to Snappy
> --
>
> Key: BEAM-3874
> URL: https://issues.apache.org/jira/browse/BEAM-3874
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-avro
>Reporter: Marian Dvorsky
>Assignee: Eugene Kirpichov
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> AvroIO currently uses CodecFactory.deflateCodec(6) as the default codec for 
> writes.
> That compresses well, but is quite expensive.
> The Snappy codec compresses less tightly but much faster, and is typically a 
> better CPU/storage tradeoff except for very long-lived files. 
> We should consider switching the default to Snappy.





[beam] 01/01: Merge pull request #4904: [BEAM-3874] Switched AvroIO default codec to snappyCodec.

2018-03-23 Thread jkff
This is an automated email from the ASF dual-hosted git repository.

jkff pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 82c32deb38b861ad7ac7baae90bcb730230b1904
Merge: 19d8a5a 2b8e2c8
Author: Eugene Kirpichov 
AuthorDate: Fri Mar 23 15:08:56 2018 -0400

Merge pull request #4904: [BEAM-3874] Switched AvroIO default codec to 
snappyCodec.

[BEAM-3874] Switched AvroIO default codec to snappyCodec.

 sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java | 6 +++---
 sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
j...@apache.org.


[jira] [Work logged] (BEAM-3874) Switch AvroIO sink default codec to Snappy

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3874?focusedWorklogId=83771&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83771
 ]

ASF GitHub Bot logged work on BEAM-3874:


Author: ASF GitHub Bot
Created on: 23/Mar/18 19:08
Start Date: 23/Mar/18 19:08
Worklog Time Spent: 10m 
  Work Description: jkff closed pull request #4904: [BEAM-3874] Switched 
AvroIO default codec to snappyCodec.
URL: https://github.com/apache/beam/pull/4904
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
index 39a400e9c9a..478a7159f1e 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
@@ -184,7 +184,7 @@
  * custom file naming policy.
  *
  * By default, {@link AvroIO.Write} produces output files that are 
compressed using the {@link
- * org.apache.avro.file.Codec CodecFactory.deflateCodec(6)}. This default can 
be changed or
+ * org.apache.avro.file.Codec CodecFactory.snappyCodec()}. This default can be 
changed or
  * overridden using {@link AvroIO.Write#withCodec}.
  *
  * Writing specific or generic records
@@ -848,7 +848,7 @@ public 
CreateParseSourceFn(SerializableFunction parseFn, Coder
   @AutoValue
   public abstract static class TypedWrite
   extends PTransform {
-static final CodecFactory DEFAULT_CODEC = CodecFactory.deflateCodec(6);
+static final CodecFactory DEFAULT_CODEC = CodecFactory.snappyCodec();
 static final SerializableAvroCodecFactory DEFAULT_SERIALIZABLE_CODEC =
 new SerializableAvroCodecFactory(DEFAULT_CODEC);
 
@@ -1409,7 +1409,7 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
 
 /**
  * Specifies to use the given {@link CodecFactory} for each generated 
file. By default, {@code
- * CodecFactory.deflateCodec(6)}.
+ * CodecFactory.snappyCodec()}.
  */
 public Sink withCodec(CodecFactory codec) {
   return toBuilder().setCodec(new 
SerializableAvroCodecFactory(codec)).build();
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
index a1bbe61165d..6dda36cd59f 100644
--- a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
+++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
@@ -1055,7 +1055,7 @@ public void testDynamicDestinationsViaSinkWithNumShards() 
throws Exception {
   @Test
   public void testWriteWithDefaultCodec() throws Exception {
 AvroIO.Write write = AvroIO.write(String.class).to("/tmp/foo/baz");
-assertEquals(CodecFactory.deflateCodec(6).toString(), 
write.inner.getCodec().toString());
+assertEquals(CodecFactory.snappyCodec().toString(), 
write.inner.getCodec().toString());
   }
 
   @Test
@@ -1222,7 +1222,7 @@ public void testWriteDisplayData() {
 .withShardNameTemplate("-SS-of-NN-")
 .withSuffix("bar")
 .withNumShards(100)
-.withCodec(CodecFactory.snappyCodec());
+.withCodec(CodecFactory.deflateCodec(6));
 
 DisplayData displayData = DisplayData.from(write);
 
@@ -1237,6 +1237,6 @@ public void testWriteDisplayData() {
 + 
".AvroIOTest$\",\"fields\":[{\"name\":\"intField\",\"type\":\"int\"},"
 + "{\"name\":\"stringField\",\"type\":\"string\"}]}"));
 assertThat(displayData, hasDisplayItem("numShards", 100));
-assertThat(displayData, hasDisplayItem("codec", 
CodecFactory.snappyCodec().toString()));
+assertThat(displayData, hasDisplayItem("codec", 
CodecFactory.deflateCodec(6).toString()));
   }
 }


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83771)
Time Spent: 20m  (was: 10m)

> Switch AvroIO sink default codec to Snappy
> --
>
> Key: BEAM-3874
> URL: https://issues.apache.org/jira/browse/BEAM-3874
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-avro
>Reporter: Marian Dvorsky
>Assignee: Eugene Kirpichov
>Priority: Minor
>  
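For context on the trade-off behind this change: Avro's `deflateCodec(6)` is deflate/zlib at compression level 6, whereas Snappy favors speed over compression ratio. A stdlib-only sketch of that trade-off (zlib level 1 stands in below for a fast, lighter codec, since python-snappy is not assumed to be available):

```python
import zlib

# Old AvroIO default: deflate at level 6 (smaller output, more CPU).
# A fast codec such as Snappy trades ratio for speed; zlib level 1 is used
# here purely as a stand-in for the "fast" end of the spectrum.
payload = b"some highly repetitive record data " * 1000

deflate6 = zlib.compress(payload, 6)
fast = zlib.compress(payload, 1)

# Both round-trip losslessly; only size and speed differ.
assert zlib.decompress(deflate6) == payload
assert zlib.decompress(fast) == payload
assert len(deflate6) < len(payload)
```

Either way, pipelines that care about output size can keep the old behavior via `AvroIO.Write#withCodec(CodecFactory.deflateCodec(6))`, as the updated javadoc notes.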

[beam] branch master updated (19d8a5a -> 82c32de)

2018-03-23 Thread jkff
This is an automated email from the ASF dual-hosted git repository.

jkff pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 19d8a5a  Merge pull request #4461: [BEAM-3500] Minimizes JDBC 
connections open for too long
 add 2b8e2c8  Switched AvroIO default codec to snappyCodec().
 new 82c32de  Merge pull request #4904: [BEAM-3874] Switched AvroIO default 
codec to snappyCodec.

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java | 6 +++---
 sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)



[beam] 01/01: Merge pull request #4461: [BEAM-3500] Minimizes JDBC connections open for too long

2018-03-23 Thread jkff
This is an automated email from the ASF dual-hosted git repository.

jkff pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 19d8a5a88e92119b895e0627d4dd11a87b9fc542
Merge: dbadcab 7dd265d
Author: Eugene Kirpichov 
AuthorDate: Fri Mar 23 15:07:45 2018 -0400

Merge pull request #4461: [BEAM-3500] Minimizes JDBC connections open for 
too long

Uses a pool with capacity 1 connection and with an idle timeout, gets 
connection from pool and returns to pool in Start/FinishBundle rather than 
Setup/Teardown.

 sdks/java/io/jdbc/pom.xml  |  5 ++
 .../java/org/apache/beam/sdk/io/jdbc/JdbcIO.java   | 71 --
 2 files changed, 56 insertions(+), 20 deletions(-)




[beam] branch master updated (dbadcab -> 19d8a5a)

2018-03-23 Thread jkff
This is an automated email from the ASF dual-hosted git repository.

jkff pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from dbadcab  [BEAM-3738] Enable py3 lint and cleanup tox.ini. (#4877)
 add 6583530  [BEAM-3500] "Attach" JDBC connection to the bundle (improve 
the pooling)
 add 5c66b29  [BEAM-3500] Test if the user provides both 
withDataSourceConfiguration() and withDataSourceFactory()
 add efb3b35  [BEAM-3500] Wrap the datasource as a poolable datasource and 
expose pooling configuration
 add 3e01d43  [BEAM-3500] Add commons-pool2 dependency
 add 3e2c4e2  [BEAM-3500] Only expose max number of connections in the pool 
to the users
 add 537cc5f  [BEAM-3500] Cleanup pool configuration parameters
 add a2fa6db  [BEAM-3500] Remove dataSourceFactory
 add 7dd265d  [BEAM-3500] Remove unnecessary check on dataSourceConfiguration
 new 19d8a5a  Merge pull request #4461: [BEAM-3500] Minimizes JDBC 
connections open for too long

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/java/io/jdbc/pom.xml  |  5 ++
 .../java/org/apache/beam/sdk/io/jdbc/JdbcIO.java   | 71 --
 2 files changed, 56 insertions(+), 20 deletions(-)



[jira] [Work logged] (BEAM-3500) JdbcIO: Improve connection management

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3500?focusedWorklogId=83767&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83767
 ]

ASF GitHub Bot logged work on BEAM-3500:


Author: ASF GitHub Bot
Created on: 23/Mar/18 19:07
Start Date: 23/Mar/18 19:07
Worklog Time Spent: 10m 
  Work Description: jkff closed pull request #4461: [BEAM-3500] "Attach" 
JDBC connection to the bundle and add DataSourceFactory allowing full control 
of the way the DataSource is created
URL: https://github.com/apache/beam/pull/4461
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/sdks/java/io/jdbc/pom.xml b/sdks/java/io/jdbc/pom.xml
index 9151cfb80ae..577d4f55a74 100644
--- a/sdks/java/io/jdbc/pom.xml
+++ b/sdks/java/io/jdbc/pom.xml
@@ -279,6 +279,11 @@
   commons-dbcp2
   2.1.1
 
+
+  org.apache.commons
+  commons-pool2
+  2.4.2
+
 
 
   joda-time
diff --git 
a/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java 
b/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java
index f7a66045886..35e4aacd5ac 100644
--- a/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java
+++ b/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java
@@ -52,6 +52,11 @@
 import org.apache.beam.sdk.values.PCollectionView;
 import org.apache.beam.sdk.values.PDone;
 import org.apache.commons.dbcp2.BasicDataSource;
+import org.apache.commons.dbcp2.DataSourceConnectionFactory;
+import org.apache.commons.dbcp2.PoolableConnectionFactory;
+import org.apache.commons.dbcp2.PoolingDataSource;
+import org.apache.commons.pool2.impl.GenericObjectPool;
+import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
 import org.joda.time.Duration;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -235,17 +240,17 @@ public static DataSourceConfiguration create(DataSource 
dataSource) {
   checkArgument(dataSource != null, "dataSource can not be null");
   checkArgument(dataSource instanceof Serializable, "dataSource must be 
Serializable");
   return new AutoValue_JdbcIO_DataSourceConfiguration.Builder()
-  .setDataSource(dataSource)
-  .build();
+  .setDataSource(dataSource)
+  .build();
 }
 
 public static DataSourceConfiguration create(String driverClassName, 
String url) {
   checkArgument(driverClassName != null, "driverClassName can not be 
null");
   checkArgument(url != null, "url can not be null");
   return new AutoValue_JdbcIO_DataSourceConfiguration.Builder()
-  
.setDriverClassName(ValueProvider.StaticValueProvider.of(driverClassName))
-  .setUrl(ValueProvider.StaticValueProvider.of(url))
-  .build();
+  
.setDriverClassName(ValueProvider.StaticValueProvider.of(driverClassName))
+  .setUrl(ValueProvider.StaticValueProvider.of(url))
+  .build();
 }
 
 public static DataSourceConfiguration create(ValueProvider 
driverClassName,
@@ -254,8 +259,7 @@ public static DataSourceConfiguration 
create(ValueProvider driverClassNa
   checkArgument(url != null, "url can not be null");
   return new AutoValue_JdbcIO_DataSourceConfiguration.Builder()
   .setDriverClassName(driverClassName)
-  .setUrl(url)
-  .build();
+  .setUrl(url).build();
 }
 
 public DataSourceConfiguration withUsername(String username) {
@@ -307,9 +311,10 @@ private void populateDisplayData(DisplayData.Builder 
builder) {
   }
 }
 
-DataSource buildDatasource() throws Exception{
+DataSource buildDatasource() throws Exception {
+  DataSource current = null;
   if (getDataSource() != null) {
-return getDataSource();
+current = getDataSource();
   } else {
 BasicDataSource basicDataSource = new BasicDataSource();
 if (getDriverClassName() != null) {
@@ -327,8 +332,25 @@ DataSource buildDatasource() throws Exception{
 if (getConnectionProperties() != null && 
getConnectionProperties().get() != null) {
   
basicDataSource.setConnectionProperties(getConnectionProperties().get());
 }
-return basicDataSource;
+current = basicDataSource;
   }
+
+  // wrapping the datasource as a pooling datasource
+  DataSourceConnectionFactory connectionFactory = new 
DataSourceConnectionFactory(current);
+  PoolableConnectionFactory poolableConnectionFactory =
+  new PoolableConnectionFactory(connectionFactory, null);
+  GenericObjectPoolConfig poolConfig = new GenericObjectPoolConfig();
+  poolConfig.setMaxTotal(1);

[jira] [Work logged] (BEAM-3500) JdbcIO: Improve connection management

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3500?focusedWorklogId=83766&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83766
 ]

ASF GitHub Bot logged work on BEAM-3500:


Author: ASF GitHub Bot
Created on: 23/Mar/18 19:06
Start Date: 23/Mar/18 19:06
Worklog Time Spent: 10m 
  Work Description: jkff commented on issue #4461: [BEAM-3500] "Attach" 
JDBC connection to the bundle and add DataSourceFactory allowing full control 
of the way the DataSource is created
URL: https://github.com/apache/beam/pull/4461#issuecomment-375769221
 
 
   Yes, thanks!




Issue Time Tracking
---

Worklog Id: (was: 83766)
Time Spent: 7h 10m  (was: 7h)

> JdbcIO: Improve connection management
> -
>
> Key: BEAM-3500
> URL: https://issues.apache.org/jira/browse/BEAM-3500
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-jdbc
>Affects Versions: 2.2.0
>Reporter: Pawel Bartoszek
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> JdbcIO's write DoFn acquires a connection in {{@Setup}} and releases it in 
> {{@Teardown}}, which means the connection might stay open for days in the 
> streaming-job case. Keeping a single connection open for so long is risky, 
> as it is exposed to database, network, and similar failures.
> *Taking a connection from the pool only when it is actually needed*
> I suggest that the connection be taken from the connection pool in the 
> {{executeBatch}} method and released when the batch is flushed. This lets 
> the pool take care of any unhealthy connections that are returned.
> *Make JdbcIO accept a data source factory*
> It would also be nice if JdbcIO accepted a DataSourceFactory rather than a 
> DataSource itself: the sink checks that the DataSource implements the 
> `Serializable` interface, which makes it impossible to pass a 
> BasicDataSource (used internally by the sink), since it does not implement 
> that interface. Something like:
> {code:java}
> interface DataSourceFactory extends Serializable{
>  DataSource create();
> }
> {code}
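The suggestion above, borrowing a connection only while a batch is flushed, is what the merged change implements in Java with commons-dbcp2/commons-pool2 and a pool capped at one connection. As a minimal, hedged sketch of the same acquire-per-batch idea using only the Python standard library (sqlite3 stands in for a real JDBC DataSource; names here are illustrative, not Beam's API):

```python
import queue
import sqlite3


class ConnectionPool:
    """Tiny fixed-size pool; capacity 1 mirrors the maxTotal(1) pool
    configured in the merged JdbcIO change."""

    def __init__(self, factory, size=1):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())

    def acquire(self):
        return self._pool.get()

    def release(self, conn):
        self._pool.put(conn)


def make_conn():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (v INTEGER)")
    return conn


def execute_batch(pool, rows):
    # Borrow the connection only for the duration of this batch flush,
    # instead of holding it from @Setup to @Teardown.
    conn = pool.acquire()
    try:
        conn.executemany("INSERT INTO t VALUES (?)", [(r,) for r in rows])
        conn.commit()
    finally:
        pool.release(conn)


pool = ConnectionPool(make_conn, size=1)
execute_batch(pool, [1, 2, 3])

conn = pool.acquire()
assert conn.execute("SELECT COUNT(*) FROM t").fetchone()[0] == 3
pool.release(conn)
```

An unhealthy connection handed back to a real pool would be validated and replaced there, which is the main benefit over holding one connection for the lifetime of the DoFn.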



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #6279

2018-03-23 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83759&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83759
 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:50
Start Date: 23/Mar/18 18:50
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #4798: [BEAM-3738] Add more 
flake8 tests to run_pylint.sh
URL: https://github.com/apache/beam/pull/4798#issuecomment-375765483
 
 
   retest this please




Issue Time Tracking
---

Worklog Id: (was: 83759)
Time Spent: 12h 40m  (was: 12.5h)

> Enable Py3 linting in Jenkins
> -
>
> Key: BEAM-3738
> URL: https://issues.apache.org/jira/browse/BEAM-3738
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 12h 40m
>  Remaining Estimate: 0h
>
> After BEAM-3671 is finished enable linting.





[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83758&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83758
 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:49
Start Date: 23/Mar/18 18:49
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #4877: 
[BEAM-3738] Enable py3 lint and cleanup tox.ini.
URL: https://github.com/apache/beam/pull/4877#discussion_r176833151
 
 

 ##
 File path: sdks/python/tox.ini
 ##
 @@ -17,142 +17,107 @@
 
 [tox]
 # new environments will be excluded by default unless explicitly added to 
envlist.
-# TODO (after BEAM-3671) add lint_py3 back in.
-envlist = py27,py27gcp,py27cython,lint_py2,docs
+envlist = py27,py27-{gcp,cython2,lint},py3-lint,docs
 toxworkdir = {toxinidir}/target/.tox
 
 [pycodestyle]
 # Disable all errors and warnings except for the ones related to blank lines.
 # pylint does not check the number of blank lines.
 select = E3
 
+# Shared environment options.
+[testenv]
+# Set [] options for pip installation of apache-beam tarball.
+extras = test
+# Don't warn that these commands aren't installed.
+whitelist_externals =
+  find
+  time
+
 [testenv:py27]
-# autocomplete_test depends on nose when invoked directly.
-deps =
-  nose==1.3.7
-  grpcio-tools==1.3.5
-whitelist_externals=find
 commands =
   python --version
   pip --version
-  # Clean up all previous python generated files.
-  - find apache_beam -type f -name '*.pyc' -delete
-  pip install -e .[test]
+  {toxinidir}/run_tox_cleanup.sh
   python apache_beam/examples/complete/autocomplete_test.py
   python setup.py test
-passenv = TRAVIS*
+  {toxinidir}/run_tox_cleanup.sh
 
-[testenv:py27cython]
+# This environment will fail in Jenkins if named "py27-cython".
+[testenv:py27-cython2]
 # cython tests are only expected to work in linux (2.x and 3.x)
 # If we want to add other platforms in the future, it should be:
 # `platform = linux2|darwin|...`
 # See https://docs.python.org/2/library/sys.html#sys.platform for platform 
codes
 platform = linux2
-# autocomplete_test depends on nose when invoked directly.
 deps =
-  nose==1.3.7
-  grpcio-tools==1.3.5
-  cython==0.25.2
-whitelist_externals=
-  find
-  time
+  cython==0.26.1
 
 Review comment:
   pip-compile can generate requirements.txt from setup.py, and the docker 
image can pull that in.
   The only problem is that we have a different set of requirements depending 
on the extras specified (such as [test] and [gcp]), or whether cython is used.
   So the solution is probably to use a script to generate a set of 
requirements files, using pip-compile output as an intermediate step. setup.py 
will be used as the single source of truth for version pinning.




Issue Time Tracking
---

Worklog Id: (was: 83758)
Time Spent: 12.5h  (was: 12h 20m)

> Enable Py3 linting in Jenkins
> -
>
> Key: BEAM-3738
> URL: https://issues.apache.org/jira/browse/BEAM-3738
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> After BEAM-3671 is finished enable linting.





[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83757&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83757
 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:43
Start Date: 23/Mar/18 18:43
Worklog Time Spent: 10m 
  Work Description: cclauss commented on issue #4798: [BEAM-3738] Add more 
flake8 tests to run_pylint.sh
URL: https://github.com/apache/beam/pull/4798#issuecomment-375763430
 
 
   Now that #4877 is merged, can we get another review on these changes to 
avoid backslides?  @udim 




Issue Time Tracking
---

Worklog Id: (was: 83757)
Time Spent: 12h 20m  (was: 12h 10m)

> Enable Py3 linting in Jenkins
> -
>
> Key: BEAM-3738
> URL: https://issues.apache.org/jira/browse/BEAM-3738
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> After BEAM-3671 is finished enable linting.





[jira] [Work logged] (BEAM-3545) Fn API metrics in Go SDK harness

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3545?focusedWorklogId=83750&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83750
 ]

ASF GitHub Bot logged work on BEAM-3545:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:37
Start Date: 23/Mar/18 18:37
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #4899: [BEAM-3545] Go SDK 
UserCounters
URL: https://github.com/apache/beam/pull/4899#issuecomment-375761865
 
 
   Added licenses to the new files so the tests could pass.




Issue Time Tracking
---

Worklog Id: (was: 83750)
Time Spent: 2h 40m  (was: 2.5h)

> Fn API metrics in Go SDK harness
> 
>
> Key: BEAM-3545
> URL: https://issues.apache.org/jira/browse/BEAM-3545
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Kenneth Knowles
>Assignee: Robert Burke
>Priority: Major
>  Labels: portability
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83749&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83749
 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:36
Start Date: 23/Mar/18 18:36
Worklog Time Spent: 10m 
  Work Description: aaltay closed pull request #4877: [BEAM-3738] Enable 
py3 lint and cleanup tox.ini.
URL: https://github.com/apache/beam/pull/4877
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/sdks/python/apache_beam/typehints/typehints.py 
b/sdks/python/apache_beam/typehints/typehints.py
index 543641526f5..af9e1fef70b 100644
--- a/sdks/python/apache_beam/typehints/typehints.py
+++ b/sdks/python/apache_beam/typehints/typehints.py
@@ -1127,7 +1127,6 @@ def coerce_to_kv_type(element_type, label=None):
 Union[tuple(t.tuple_types[1] for t in union_types)]]
   else:
 # TODO: Possibly handle other valid types.
-print "element_type", element_type
 raise ValueError(
 "Input to %r must be compatible with KV[Any, Any]. "
 "Found %s." % (label, element_type))
diff --git a/sdks/python/build.gradle b/sdks/python/build.gradle
index 484adb3629e..2c6637a5373 100644
--- a/sdks/python/build.gradle
+++ b/sdks/python/build.gradle
@@ -64,7 +64,7 @@ build.dependsOn buildPython
 task lint (dependsOn: 'setupTest') {
   doLast {
 exec {
-  commandLine 'tox', '-e', 'lint', '-c', 'tox.ini'
+  commandLine 'tox', '-e', '{py27,py3}-lint', '-c', 'tox.ini'
 }
   }
 }
@@ -74,7 +74,7 @@ check.dependsOn lint
 task testCython (dependsOn: ['setupTest', 'testPython', 'testGcp']) {
   doLast {
 exec {
-  commandLine 'tox', '-e', 'py27cython', '-c', 'tox.ini'
+  commandLine 'tox', '-e', 'py27-cython2', '-c', 'tox.ini'
 }
   }
 }
@@ -83,7 +83,7 @@ test.dependsOn testCython
 task testGcp (dependsOn: 'setupTest') {
   doLast {
 exec {
-  commandLine 'tox', '-e', 'py27gcp', '-c', 'tox.ini'
+  commandLine 'tox', '-e', 'py27-gcp', '-c', 'tox.ini'
 }
   }
 }
diff --git a/sdks/python/run_tox_cleanup.sh b/sdks/python/run_tox_cleanup.sh
new file mode 100755
index 000..76b683cf588
--- /dev/null
+++ b/sdks/python/run_tox_cleanup.sh
@@ -0,0 +1,30 @@
+#!/bin/bash
+#
+#Licensed to the Apache Software Foundation (ASF) under one or more
+#contributor license agreements.  See the NOTICE file distributed with
+#this work for additional information regarding copyright ownership.
+#The ASF licenses this file to You under the Apache License, Version 2.0
+#(the "License"); you may not use this file except in compliance with
+#the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+#Unless required by applicable law or agreed to in writing, software
+#distributed under the License is distributed on an "AS IS" BASIS,
+#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#See the License for the specific language governing permissions and
+#limitations under the License.
+#
+
+# This script is used to remove generated files in previous Tox runs so that
+# they are not picked up by tests.
+
+set -e
+
+for dir in apache_beam target/build; do
+if [ -d "${dir}" ]; then
+for ext in pyc c so; do
+find ${dir} -type f -name "*.${ext}" -delete
+done
+fi
+done
diff --git a/sdks/python/setup.py b/sdks/python/setup.py
index 02c02f56c42..ffc4df78413 100644
--- a/sdks/python/setup.py
+++ b/sdks/python/setup.py
@@ -113,11 +113,8 @@ def get_version():
 'futures>=3.1.1,<4.0.0',
 ]
 
-REQUIRED_SETUP_PACKAGES = [
-'nose>=1.0',
-]
-
 REQUIRED_TEST_PACKAGES = [
+'nose>=1.3.7',
 'pyhamcrest>=1.9,<2.0',
 ]
 
@@ -177,6 +174,9 @@ def generate_common_urns():
 + '\n')
 generate_common_urns()
 
+python_requires = '>=2.7'
+if os.environ.get('BEAM_EXPERIMENTAL_PY3') is None:
+  python_requires += ',<3.0'
 
 setuptools.setup(
 name=PACKAGE_NAME,
@@ -202,9 +202,8 @@ def generate_common_urns():
 'apache_beam/utils/counters.py',
 'apache_beam/utils/windowed_value.py',
 ]),
-setup_requires=REQUIRED_SETUP_PACKAGES,
 install_requires=REQUIRED_PACKAGES,
-python_requires='>=2.7,<3.0',
+python_requires=python_requires,
 test_suite='nose.collector',
 tests_require=REQUIRED_TEST_PACKAGES,
 extras_require={
diff --git a/sdks/python/tox.ini b/sdks/python/tox.ini
index 857e7b0e571..add9bbfb2cc 100644
--- a/sdks/python/tox.ini
+++ b/sdks/python/tox.ini
@@ -17,8 +17,7 @@
 
 [tox]
 # new environments will be excluded by default unless explicitly added 

Build failed in Jenkins: beam_PerformanceTests_Spark #1502

2018-03-23 Thread Apache Jenkins Server
See 


Changes:

[szewinho] [BEAM-3060] Fixing mvn dependency issue when runnning filebasedIOIT

[szewinho] Jax-api added to dependencyManagement.

[szewinho] Jaxb-api added to beam root pom.xml

[szewinho] Removed empty line

[szewinho] Updated gradle build to use jaxb-api. Jaxb-api version set to 2.2.12

[szewinho] Scope removed from root pom.xml and jaxb-api dependency set to

[tgroh] Move getId to the top-level pipeline node

[tgroh] Add FusedPipeline#toPipeline

[markliu] [BEAM-3861] Complete streaming wordcount test in Python SDK

--
[...truncated 66.82 KB...]
2018-03-23 18:27:35,838 7f48997d MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-03-23 18:28:00,910 7f48997d MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-03-23 18:28:04,405 7f48997d MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r73999cdf961146dd_0162541e18d1_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: Upload complete.Waiting on bqjob_r73999cdf961146dd_0162541e18d1_1 
... (0s) Current status: RUNNING
  Waiting on 
bqjob_r73999cdf961146dd_0162541e18d1_1 ... (0s) Current status: DONE   
2018-03-23 18:28:04,405 7f48997d MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-03-23 18:28:23,938 7f48997d MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-03-23 18:28:27,477 7f48997d MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r2d994aa20a68ff88_0162541e72c6_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: Upload complete.Waiting on bqjob_r2d994aa20a68ff88_0162541e72c6_1 
... (0s) Current status: RUNNING
  Waiting on 
bqjob_r2d994aa20a68ff88_0162541e72c6_1 ... (0s) Current status: DONE   
2018-03-23 18:28:27,477 7f48997d MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-03-23 18:28:45,539 7f48997d MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-03-23 18:28:48,853 7f48997d MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r9c182458a320c39_0162541ec70d_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: Upload complete.Waiting on bqjob_r9c182458a320c39_0162541ec70d_1 
... (0s) Current status: RUNNING
 Waiting on 
bqjob_r9c182458a320c39_0162541ec70d_1 ... (0s) Current status: DONE   
2018-03-23 18:28:48,853 7f48997d MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-03-23 18:29:10,531 7f48997d MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-03-23 18:29:14,046 7f48997d MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
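The repeated `bq load` failures above stem from `--autodetect`: the numeric epoch values in the results JSON are inferred as FLOAT, while the existing `beam_performance.pkb_results` column is TIMESTAMP, so BigQuery rejects each load as an invalid schema update. A plausible fix (an assumption based on the error message, not something shown in this log) is to drop `--autodetect` and supply an explicit schema that pins the field, e.g. a schema file containing at least:

```json
[
  {"name": "timestamp", "type": "TIMESTAMP", "mode": "NULLABLE"}
]
```

The remaining pkb_results columns are omitted here because they do not appear in the truncated log.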

[beam] branch master updated: [BEAM-3738] Enable py3 lint and cleanup tox.ini. (#4877)

2018-03-23 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new dbadcab  [BEAM-3738] Enable py3 lint and cleanup tox.ini. (#4877)
dbadcab is described below

commit dbadcabd413af4990f439e441eba1c987d318fd0
Author: Udi Meiri (Ehud) 
AuthorDate: Fri Mar 23 11:36:05 2018 -0700

[BEAM-3738] Enable py3 lint and cleanup tox.ini. (#4877)

* Enable py3 lint and cleanup tox.ini.
- Add BEAM_EXPERIMENTAL_PY3 environment variable. Currently only used to
run the Tox py3-lint environment.
- Rename environments to use factors, such that py27-lint and py3-lint
use Python versions 2.7.x and 3.x.x respectively.
- Remove some redundant package requirements such as nose from being
specified in tox.ini if already present in setup.py.
- Use shared environment settings in tox.ini, under "[testenv]".
- Factor out *.{pyc,c,so} cleanup into a separate script.
---
 sdks/python/apache_beam/typehints/typehints.py |  1 -
 sdks/python/build.gradle   |  6 +-
 sdks/python/run_tox_cleanup.sh | 30 
 sdks/python/setup.py   | 11 ++-
 sdks/python/tox.ini| 99 +-
 5 files changed, 70 insertions(+), 77 deletions(-)

diff --git a/sdks/python/apache_beam/typehints/typehints.py 
b/sdks/python/apache_beam/typehints/typehints.py
index 5436415..af9e1fe 100644
--- a/sdks/python/apache_beam/typehints/typehints.py
+++ b/sdks/python/apache_beam/typehints/typehints.py
@@ -1127,7 +1127,6 @@ def coerce_to_kv_type(element_type, label=None):
 Union[tuple(t.tuple_types[1] for t in union_types)]]
   else:
 # TODO: Possibly handle other valid types.
-print "element_type", element_type
 raise ValueError(
 "Input to %r must be compatible with KV[Any, Any]. "
 "Found %s." % (label, element_type))
diff --git a/sdks/python/build.gradle b/sdks/python/build.gradle
index 484adb3..2c6637a 100644
--- a/sdks/python/build.gradle
+++ b/sdks/python/build.gradle
@@ -64,7 +64,7 @@ build.dependsOn buildPython
 task lint (dependsOn: 'setupTest') {
   doLast {
 exec {
-  commandLine 'tox', '-e', 'lint', '-c', 'tox.ini'
+  commandLine 'tox', '-e', '{py27,py3}-lint', '-c', 'tox.ini'
 }
   }
 }
@@ -74,7 +74,7 @@ check.dependsOn lint
 task testCython (dependsOn: ['setupTest', 'testPython', 'testGcp']) {
   doLast {
 exec {
-  commandLine 'tox', '-e', 'py27cython', '-c', 'tox.ini'
+  commandLine 'tox', '-e', 'py27-cython2', '-c', 'tox.ini'
 }
   }
 }
@@ -83,7 +83,7 @@ test.dependsOn testCython
 task testGcp (dependsOn: 'setupTest') {
   doLast {
 exec {
-  commandLine 'tox', '-e', 'py27gcp', '-c', 'tox.ini'
+  commandLine 'tox', '-e', 'py27-gcp', '-c', 'tox.ini'
 }
   }
 }
diff --git a/sdks/python/run_tox_cleanup.sh b/sdks/python/run_tox_cleanup.sh
new file mode 100755
index 0000000..76b683c
--- /dev/null
+++ b/sdks/python/run_tox_cleanup.sh
@@ -0,0 +1,30 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# This script is used to remove generated files in previous Tox runs so that
+# they are not picked up by tests.
+
+set -e
+
+for dir in apache_beam target/build; do
+  if [ -d "${dir}" ]; then
+    for ext in pyc c so; do
+      find ${dir} -type f -name "*.${ext}" -delete
+    done
+  fi
+done
diff --git a/sdks/python/setup.py b/sdks/python/setup.py
index 02c02f5..ffc4df7 100644
--- a/sdks/python/setup.py
+++ b/sdks/python/setup.py
@@ -113,11 +113,8 @@ REQUIRED_PACKAGES = [
 'futures>=3.1.1,<4.0.0',
 ]
 
-REQUIRED_SETUP_PACKAGES = [
-'nose>=1.0',
-]
-
 REQUIRED_TEST_PACKAGES = [
+'nose>=1.3.7',
 'pyhamcrest>=1.9,<2.0',
 ]
 
@@ -177,6 +174,9 @@ def generate_common_urns():
 + '\n')
 generate_common_urns()
 
+python_requires = '>=2.7'
+if os.environ.get('BEAM_EXPERIMENTAL_PY3') is None:
+  python_requires += ',<3.0'
 
 setuptools.setup(
 name=PACKAGE_NAME,
@@ -202,9 +202,8 @@ 
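
The python_requires gating added in the hunk above can be exercised in
isolation. A minimal sketch, assuming a hypothetical helper name
(`compute_python_requires` does not exist in the patch; only the
BEAM_EXPERIMENTAL_PY3 variable and the version bounds come from it):

```python
import os


def compute_python_requires(environ=os.environ):
    """Mirror the setup.py logic: Python 3 installs are allowed only
    when the BEAM_EXPERIMENTAL_PY3 opt-in variable is set."""
    python_requires = '>=2.7'
    if environ.get('BEAM_EXPERIMENTAL_PY3') is None:
        # Default: restrict installation to Python 2.7.x only.
        python_requires += ',<3.0'
    return python_requires
```

setup.py would then pass the result to
`setuptools.setup(python_requires=...)`.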

[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83748&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83748
 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:35
Start Date: 23/Mar/18 18:35
Worklog Time Spent: 10m 
  Work Description: aaltay commented on a change in pull request #4877: 
[BEAM-3738] Enable py3 lint and cleanup tox.ini.
URL: https://github.com/apache/beam/pull/4877#discussion_r176829348
 
 

 ##
 File path: sdks/python/setup.py
 ##
 @@ -113,11 +113,8 @@ def get_version():
 'futures>=3.1.1,<4.0.0',
 ]
 
-REQUIRED_SETUP_PACKAGES = [
-'nose>=1.0',
-]
-
 REQUIRED_TEST_PACKAGES = [
+'nose>=1.3.7',
 
 Review comment:
   Sounds good.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 83748)
Time Spent: 12h  (was: 11h 50m)

> Enable Py3 linting in Jenkins
> -
>
> Key: BEAM-3738
> URL: https://issues.apache.org/jira/browse/BEAM-3738
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> After BEAM-3671 is finished enable linting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1171

2018-03-23 Thread Apache Jenkins Server
See 


Changes:

[szewinho] [BEAM-3060] Fixing mvn dependency issue when running filebasedIOIT

[szewinho] Jax-api added to dependencyManagement.

[szewinho] Jaxb-api added to beam root pom.xml

[szewinho] Removed empty line

[szewinho] Updated gradle build to use jaxb-api. Jaxb-api version set to 2.2.12

[szewinho] Scope removed from root pom.xml and jaxb-api dependency set to

[tgroh] Move getId to the top-level pipeline node

[tgroh] Add FusedPipeline#toPipeline

[markliu] [BEAM-3861] Complete streaming wordcount test in Python SDK

--
[...truncated 781.45 KB...]
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": ""
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": "kind:pair", 
  "component_encodings": [
{
  "@type": "kind:bytes"
}, 
{
  "@type": 
"VarIntCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxhiUWeeSXOIA5XIYNmYyFjbSFTkh4A89cR+g==",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "compute/MapToVoidKey0.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s2"
}, 
"serialized_fn": "", 
"user_name": "compute/MapToVoidKey0"
  }
}
  ], 
  "type": "JOB_TYPE_BATCH"
}
root: INFO: Create job: 
root: INFO: Created job with id: [2018-03-23_11_27_43-12858485123179117769]
root: INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-03-23_11_27_43-12858485123179117769?project=apache-beam-testing
root: INFO: Job 2018-03-23_11_27_43-12858485123179117769 is in state 
JOB_STATE_PENDING
root: INFO: 2018-03-23T18:27:43.884Z: JOB_MESSAGE_WARNING: Job 
2018-03-23_11_27_43-12858485123179117769 might autoscale up to 250 workers.
root: INFO: 2018-03-23T18:27:43.908Z: JOB_MESSAGE_DETAILED: Autoscaling is 
enabled for job 2018-03-23_11_27_43-12858485123179117769. The number of workers 
will be between 1 and 250.
root: INFO: 2018-03-23T18:27:43.919Z: JOB_MESSAGE_DETAILED: Autoscaling was 
automatically enabled for job 2018-03-23_11_27_43-12858485123179117769.
root: INFO: 2018-03-23T18:27:48.943Z: JOB_MESSAGE_DETAILED: Checking required 
Cloud APIs are enabled.
root: INFO: 2018-03-23T18:27:49.104Z: JOB_MESSAGE_DETAILED: Checking 
permissions granted to controller Service Account.
root: INFO: 2018-03-23T18:27:49.959Z: JOB_MESSAGE_DETAILED: Expanding 
CoGroupByKey operations into optimizable parts.
root: INFO: 2018-03-23T18:27:50.008Z: JOB_MESSAGE_DEBUG: Combiner lifting 
skipped for step assert_that/Group/GroupByKey: GroupByKey not followed by a 
combiner.
root: INFO: 2018-03-23T18:27:50.033Z: JOB_MESSAGE_DETAILED: Expanding 
GroupByKey operations into optimizable parts.
root: INFO: 2018-03-23T18:27:50.063Z: JOB_MESSAGE_DETAILED: Lifting 
ValueCombiningMappingFns into MergeBucketsMappingFns
root: INFO: 2018-03-23T18:27:50.120Z: JOB_MESSAGE_DEBUG: Annotating graph with 
Autotuner information.
root: INFO: 2018-03-23T18:27:50.194Z: JOB_MESSAGE_DETAILED: Fusing adjacent 
ParDo, Read, Write, and Flatten operations
root: INFO: 2018-03-23T18:27:50.226Z: JOB_MESSAGE_DETAILED: Unzipping flatten 
s11 for input s10.out
root: INFO: 2018-03-23T18:27:50.303Z: JOB_MESSAGE_DETAILED: Fusing unzipped 
copy of assert_that/Group/GroupByKey/Reify, through flatten 
assert_that/Group/Flatten, into producer assert_that/Group/pair_with_1
root: INFO: 2018-03-23T18:27:50.337Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Group/GroupByKey/GroupByWindow into 
assert_that/Group/GroupByKey/Read
root: INFO: 2018-03-23T18:27:50.607Z: JOB_MESSAGE_DETAILED: Fusing consumer 
assert_that/Unkey into assert_that/Group/Map(_merge_tagged_vals_under_key)
root: INFO: 2018-03-23T18:27:50.631Z: 

[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83747&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83747
 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:35
Start Date: 23/Mar/18 18:35
Worklog Time Spent: 10m 
  Work Description: aaltay commented on a change in pull request #4877: 
[BEAM-3738] Enable py3 lint and cleanup tox.ini.
URL: https://github.com/apache/beam/pull/4877#discussion_r176829272
 
 

 ##
 File path: sdks/python/tox.ini
 ##
 @@ -17,142 +17,107 @@
 
 [tox]
 # new environments will be excluded by default unless explicitly added to 
envlist.
-# TODO (after BEAM-3671) add lint_py3 back in.
-envlist = py27,py27gcp,py27cython,lint_py2,docs
+envlist = py27,py27-{gcp,cython2,lint},py3-lint,docs
 toxworkdir = {toxinidir}/target/.tox
 
 [pycodestyle]
 # Disable all errors and warnings except for the ones related to blank lines.
 # pylint does not check the number of blank lines.
 select = E3
 
+# Shared environment options.
+[testenv]
+# Set [] options for pip installation of apache-beam tarball.
+extras = test
+# Don't warn that these commands aren't installed.
+whitelist_externals =
+  find
+  time
+
 [testenv:py27]
-# autocomplete_test depends on nose when invoked directly.
-deps =
-  nose==1.3.7
-  grpcio-tools==1.3.5
-whitelist_externals=find
 commands =
   python --version
   pip --version
-  # Clean up all previous python generated files.
-  - find apache_beam -type f -name '*.pyc' -delete
-  pip install -e .[test]
+  {toxinidir}/run_tox_cleanup.sh
   python apache_beam/examples/complete/autocomplete_test.py
   python setup.py test
-passenv = TRAVIS*
+  {toxinidir}/run_tox_cleanup.sh
 
-[testenv:py27cython]
+# This environment will fail in Jenkins if named "py27-cython".
+[testenv:py27-cython2]
 # cython tests are only expected to work in linux (2.x and 3.x)
 # If we want to add other platforms in the future, it should be:
 # `platform = linux2|darwin|...`
 # See https://docs.python.org/2/library/sys.html#sys.platform for platform 
codes
 platform = linux2
-# autocomplete_test depends on nose when invoked directly.
 deps =
-  nose==1.3.7
-  grpcio-tools==1.3.5
-  cython==0.25.2
-whitelist_externals=
-  find
-  time
+  cython==0.26.1
 
 Review comment:
   That is fair to not change the version in this PR. We should consider 
upgrading it.
   
   It would be nice to have a single source of truth. Although I do not see how 
pip-compile can update the Dockerfile. It is worth filing a JIRA for that.




Issue Time Tracking
---

Worklog Id: (was: 83747)
Time Spent: 11h 50m  (was: 11h 40m)

> Enable Py3 linting in Jenkins
> -
>
> Key: BEAM-3738
> URL: https://issues.apache.org/jira/browse/BEAM-3738
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h
>
> After BEAM-3671 is finished enable linting.





[beam] 01/01: Put the String literal first when comparing it to an Object

2018-03-23 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 93cf5ad580d3424b06bd1ab595ff3215c3d6c3d3
Merge: b6e95a0 daf82fc
Author: Lukasz Cwik 
AuthorDate: Fri Mar 23 11:33:54 2018 -0700

Put the String literal first when comparing it to an Object
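
The motivation for putting the literal first is null safety: a literal
receiver can never be null, so the comparison cannot throw. A minimal
illustration (the class and method names are invented for this example, not
taken from the Beam sources):

```java
class LiteralFirst {
    // "done".equals(status) is null-safe: it returns false when status
    // is null, whereas status.equals("done") would throw a
    // NullPointerException.
    static boolean isDone(Object status) {
        return "done".equals(status);
    }
}
```

This is the same idiom applied across the twelve files in the diffstat below.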

 .../java/org/apache/beam/examples/complete/TrafficRoutes.java  |  2 +-
 .../java/org/apache/beam/examples/complete/game/UserScore.java |  2 +-
 .../apache/beam/examples/complete/game/injector/Injector.java  |  4 ++--
 .../runners/gearpump/translators/functions/DoFnFunction.java   |  2 +-
 .../streaming/ResumeFromCheckpointStreamingTest.java   |  4 ++--
 .../test/java/org/apache/beam/sdk/transforms/CombineTest.java  |  2 +-
 .../org/apache/beam/sdk/extensions/gcp/options/GcpOptions.java |  2 +-
 .../org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java  |  2 +-
 .../java/org/apache/beam/sdk/io/gcp/spanner/SpannerSchema.java | 10 +-
 .../src/main/java/org/apache/beam/sdk/io/redis/RedisIO.java|  2 +-
 .../main/java/org/apache/beam/sdk/nexmark/queries/Query3.java  |  6 +++---
 .../java/org/apache/beam/sdk/nexmark/queries/Query3Model.java  |  4 ++--
 12 files changed, 21 insertions(+), 21 deletions(-)


-- 
To stop receiving notification emails like this one, please contact
lc...@apache.org.


[beam] branch master updated (b6e95a0 -> 93cf5ad)

2018-03-23 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from b6e95a0  [BEAM-3060] Fixing mvn dependency issue when running 
filebasedIO tests on HDFS
 add daf82fc  Put the String literal first when comparing it to an Object
 new 93cf5ad  Put the String literal first when comparing it to an Object

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../java/org/apache/beam/examples/complete/TrafficRoutes.java  |  2 +-
 .../java/org/apache/beam/examples/complete/game/UserScore.java |  2 +-
 .../apache/beam/examples/complete/game/injector/Injector.java  |  4 ++--
 .../runners/gearpump/translators/functions/DoFnFunction.java   |  2 +-
 .../streaming/ResumeFromCheckpointStreamingTest.java   |  4 ++--
 .../test/java/org/apache/beam/sdk/transforms/CombineTest.java  |  2 +-
 .../org/apache/beam/sdk/extensions/gcp/options/GcpOptions.java |  2 +-
 .../org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java  |  2 +-
 .../java/org/apache/beam/sdk/io/gcp/spanner/SpannerSchema.java | 10 +-
 .../src/main/java/org/apache/beam/sdk/io/redis/RedisIO.java|  2 +-
 .../main/java/org/apache/beam/sdk/nexmark/queries/Query3.java  |  6 +++---
 .../java/org/apache/beam/sdk/nexmark/queries/Query3Model.java  |  4 ++--
 12 files changed, 21 insertions(+), 21 deletions(-)



[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83744&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83744
 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:31
Start Date: 23/Mar/18 18:31
Worklog Time Spent: 10m 
  Work Description: cclauss commented on a change in pull request #4877: 
[BEAM-3738] Enable py3 lint and cleanup tox.ini.
URL: https://github.com/apache/beam/pull/4877#discussion_r176827405
 
 

 ##
 File path: sdks/python/tox.ini
 ##
 @@ -17,142 +17,107 @@
 
 [tox]
 # new environments will be excluded by default unless explicitly added to 
envlist.
-# TODO (after BEAM-3671) add lint_py3 back in.
-envlist = py27,py27gcp,py27cython,lint_py2,docs
+envlist = py27,py27-{gcp,cython2,lint},py3-lint,docs
 toxworkdir = {toxinidir}/target/.tox
 
 [pycodestyle]
 # Disable all errors and warnings except for the ones related to blank lines.
 # pylint does not check the number of blank lines.
 select = E3
 
+# Shared environment options.
+[testenv]
+# Set [] options for pip installation of apache-beam tarball.
+extras = test
+# Don't warn that these commands aren't installed.
+whitelist_externals =
+  find
+  time
+
 [testenv:py27]
-# autocomplete_test depends on nose when invoked directly.
-deps =
-  nose==1.3.7
-  grpcio-tools==1.3.5
-whitelist_externals=find
 commands =
   python --version
   pip --version
-  # Clean up all previous python generated files.
-  - find apache_beam -type f -name '*.pyc' -delete
-  pip install -e .[test]
+  {toxinidir}/run_tox_cleanup.sh
   python apache_beam/examples/complete/autocomplete_test.py
   python setup.py test
-passenv = TRAVIS*
+  {toxinidir}/run_tox_cleanup.sh
 
-[testenv:py27cython]
+# This environment will fail in Jenkins if named "py27-cython".
+[testenv:py27-cython2]
 # cython tests are only expected to work in linux (2.x and 3.x)
 # If we want to add other platforms in the future, it should be:
 # `platform = linux2|darwin|...`
 # See https://docs.python.org/2/library/sys.html#sys.platform for platform 
codes
 platform = linux2
-# autocomplete_test depends on nose when invoked directly.
 deps =
-  nose==1.3.7
-  grpcio-tools==1.3.5
-  cython==0.25.2
-whitelist_externals=
-  find
-  time
+  cython==0.26.1
 
 Review comment:
   +1 on a single __requirements.txt__ file.




Issue Time Tracking
---

Worklog Id: (was: 83744)
Time Spent: 11h 40m  (was: 11.5h)

> Enable Py3 linting in Jenkins
> -
>
> Key: BEAM-3738
> URL: https://issues.apache.org/jira/browse/BEAM-3738
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 11h 40m
>  Remaining Estimate: 0h
>
> After BEAM-3671 is finished enable linting.





[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83742&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83742
 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:29
Start Date: 23/Mar/18 18:29
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #4877: 
[BEAM-3738] Enable py3 lint and cleanup tox.ini.
URL: https://github.com/apache/beam/pull/4877#discussion_r176827577
 
 

 ##
 File path: sdks/python/setup.py
 ##
 @@ -113,11 +113,8 @@ def get_version():
 'futures>=3.1.1,<4.0.0',
 ]
 
-REQUIRED_SETUP_PACKAGES = [
-'nose>=1.0',
-]
-
 REQUIRED_TEST_PACKAGES = [
+'nose>=1.3.7',
 
 Review comment:
   I also wanted to merge requirements. Now only setup.py has the nose 
requirement.




Issue Time Tracking
---

Worklog Id: (was: 83742)
Time Spent: 11h 20m  (was: 11h 10m)

> Enable Py3 linting in Jenkins
> -
>
> Key: BEAM-3738
> URL: https://issues.apache.org/jira/browse/BEAM-3738
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> After BEAM-3671 is finished enable linting.





[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83743&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83743
 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:29
Start Date: 23/Mar/18 18:29
Worklog Time Spent: 10m 
  Work Description: cclauss commented on a change in pull request #4877: 
[BEAM-3738] Enable py3 lint and cleanup tox.ini.
URL: https://github.com/apache/beam/pull/4877#discussion_r175250556
 
 

 ##
 File path: sdks/python/setup.py
 ##
 @@ -113,11 +113,8 @@ def get_version():
 'futures>=3.1.1,<4.0.0',
 ]
 
-REQUIRED_SETUP_PACKAGES = [
-'nose>=1.0',
 
 Review comment:
   Does it make sense to raise __REQUIRED_PIP_VERSION = '7.0.0'__ to something 
a closer to the current 9.0.3 or do we leave that alone for now?  
Alternatively, should the process do __pip install --upgrade pip__?




Issue Time Tracking
---

Worklog Id: (was: 83743)
Time Spent: 11.5h  (was: 11h 20m)

> Enable Py3 linting in Jenkins
> -
>
> Key: BEAM-3738
> URL: https://issues.apache.org/jira/browse/BEAM-3738
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> After BEAM-3671 is finished enable linting.





[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83741&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83741
 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:28
Start Date: 23/Mar/18 18:28
Worklog Time Spent: 10m 
  Work Description: cclauss commented on a change in pull request #4877: 
[BEAM-3738] Enable py3 lint and cleanup tox.ini.
URL: https://github.com/apache/beam/pull/4877#discussion_r176827405
 
 

 ##
 File path: sdks/python/tox.ini
 ##
 @@ -17,142 +17,107 @@
 
 [tox]
 # new environments will be excluded by default unless explicitly added to 
envlist.
-# TODO (after BEAM-3671) add lint_py3 back in.
-envlist = py27,py27gcp,py27cython,lint_py2,docs
+envlist = py27,py27-{gcp,cython2,lint},py3-lint,docs
 toxworkdir = {toxinidir}/target/.tox
 
 [pycodestyle]
 # Disable all errors and warnings except for the ones related to blank lines.
 # pylint does not check the number of blank lines.
 select = E3
 
+# Shared environment options.
+[testenv]
+# Set [] options for pip installation of apache-beam tarball.
+extras = test
+# Don't warn that these commands aren't installed.
+whitelist_externals =
+  find
+  time
+
 [testenv:py27]
-# autocomplete_test depends on nose when invoked directly.
-deps =
-  nose==1.3.7
-  grpcio-tools==1.3.5
-whitelist_externals=find
 commands =
   python --version
   pip --version
-  # Clean up all previous python generated files.
-  - find apache_beam -type f -name '*.pyc' -delete
-  pip install -e .[test]
+  {toxinidir}/run_tox_cleanup.sh
   python apache_beam/examples/complete/autocomplete_test.py
   python setup.py test
-passenv = TRAVIS*
+  {toxinidir}/run_tox_cleanup.sh
 
-[testenv:py27cython]
+# This environment will fail in Jenkins if named "py27-cython".
+[testenv:py27-cython2]
 # cython tests are only expected to work in linux (2.x and 3.x)
 # If we want to add other platforms in the future, it should be:
 # `platform = linux2|darwin|...`
 # See https://docs.python.org/2/library/sys.html#sys.platform for platform 
codes
 platform = linux2
-# autocomplete_test depends on nose when invoked directly.
 deps =
-  nose==1.3.7
-  grpcio-tools==1.3.5
-  cython==0.25.2
-whitelist_externals=
-  find
-  time
+  cython==0.26.1
 
 Review comment:
  +1 on a single __requirements.txt__ file.




Issue Time Tracking
---

Worklog Id: (was: 83741)
Time Spent: 11h 10m  (was: 11h)

> Enable Py3 linting in Jenkins
> -
>
> Key: BEAM-3738
> URL: https://issues.apache.org/jira/browse/BEAM-3738
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> After BEAM-3671 is finished enable linting.





Build failed in Jenkins: beam_PostCommit_Python_Verify #4491

2018-03-23 Thread Apache Jenkins Server
See 


Changes:

[szewinho] [BEAM-3060] Fixing mvn dependency issue when running filebasedIOIT

[szewinho] Jax-api added to dependencyManagement.

[szewinho] Jaxb-api added to beam root pom.xml

[szewinho] Removed empty line

[szewinho] Updated gradle build to use jaxb-api. Jaxb-api version set to 2.2.12

[szewinho] Scope removed from root pom.xml and jaxb-api dependency set to

[tgroh] Move getId to the top-level pipeline node

[tgroh] Add FusedPipeline#toPipeline

[markliu] [BEAM-3861] Complete streaming wordcount test in Python SDK

--
[...truncated 1.12 MB...]
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s15"
}, 
"serialized_fn": "", 
"user_name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey1"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s33", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": ""
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": "kind:pair", 
  "component_encodings": [
{
  "@type": "kind:bytes"
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey2.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s20"
}, 
"serialized_fn": "", 
"user_name": "write/Write/WriteImpl/FinalizeWrite/MapToVoidKey2"
  }
}
  ], 
  "type": "JOB_TYPE_BATCH"
}
root: INFO: Create job: 
root: INFO: Created job with id: [2018-03-23_11_18_11-14760280691419771484]
root: INFO: 

[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83737&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83737
 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:23
Start Date: 23/Mar/18 18:23
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #4877: 
[BEAM-3738] Enable py3 lint and cleanup tox.ini.
URL: https://github.com/apache/beam/pull/4877#discussion_r176825065
 
 

 ##
 File path: sdks/python/tox.ini
 ##
 @@ -17,142 +17,107 @@
 
 [tox]
 # new environments will be excluded by default unless explicitly added to 
envlist.
-# TODO (after BEAM-3671) add lint_py3 back in.
-envlist = py27,py27gcp,py27cython,lint_py2,docs
+envlist = py27,py27-{gcp,cython2,lint},py3-lint,docs
 toxworkdir = {toxinidir}/target/.tox
 
 [pycodestyle]
 # Disable all errors and warnings except for the ones related to blank lines.
 # pylint does not check the number of blank lines.
 select = E3
 
+# Shared environment options.
+[testenv]
+# Set [] options for pip installation of apache-beam tarball.
+extras = test
+# Don't warn that these commands aren't installed.
+whitelist_externals =
+  find
+  time
+
 [testenv:py27]
-# autocomplete_test depends on nose when invoked directly.
-deps =
-  nose==1.3.7
-  grpcio-tools==1.3.5
-whitelist_externals=find
 commands =
   python --version
   pip --version
-  # Clean up all previous python generated files.
-  - find apache_beam -type f -name '*.pyc' -delete
-  pip install -e .[test]
+  {toxinidir}/run_tox_cleanup.sh
   python apache_beam/examples/complete/autocomplete_test.py
   python setup.py test
-passenv = TRAVIS*
+  {toxinidir}/run_tox_cleanup.sh
 
-[testenv:py27cython]
+# This environment will fail in Jenkins if named "py27-cython".
+[testenv:py27-cython2]
 # cython tests are only expected to work in linux (2.x and 3.x)
 # If we want to add other platforms in the future, it should be:
 # `platform = linux2|darwin|...`
 # See https://docs.python.org/2/library/sys.html#sys.platform for platform 
codes
 platform = linux2
-# autocomplete_test depends on nose when invoked directly.
 deps =
-  nose==1.3.7
-  grpcio-tools==1.3.5
-  cython==0.25.2
-whitelist_externals=
-  find
-  time
+  cython==0.26.1
 
 Review comment:
   This is the minimum version required by `setup.py` 
(`REQUIRED_CYTHON_VERSION`). I didn't want to increase the version beyond that 
and risk breaking things in this PR.
   
   Side note: `sdks/python/container/Dockerfile` specifies `0.27.2`. *shrug*
   I wish all these pinned versions were synced to a single requirements.txt 
file. (see [pip-compile](https://pypi.python.org/pypi/pip-tools))




Issue Time Tracking
---

Worklog Id: (was: 83737)
Time Spent: 11h  (was: 10h 50m)

> Enable Py3 linting in Jenkins
> -
>
> Key: BEAM-3738
> URL: https://issues.apache.org/jira/browse/BEAM-3738
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> After BEAM-3671 is finished enable linting.





Build failed in Jenkins: beam_PerformanceTests_Python #1058

2018-03-23 Thread Apache Jenkins Server
See 


Changes:

[szewinho] [BEAM-3060] Fixing mvn dependency issue when running filebasedIOIT

[szewinho] Jax-api added to dependencyManagement.

[szewinho] Jaxb-api added to beam root pom.xml

[szewinho] Removed empty line

[szewinho] Updated gradle build to use jaxb-api. Jaxb-api version set to 2.2.12

[szewinho] Scope removed from root pom.xml and jaxb-api dependency set to

[tgroh] Move getId to the top-level pipeline node

[tgroh] Add FusedPipeline#toPipeline

[markliu] [BEAM-3861] Complete streaming wordcount test in Python SDK

--
[...truncated 2.97 KB...]
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins4010297507203666870.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Collecting absl-py (from -r PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Collecting colorlog[windows]==2.6.0 (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
  Using cached colorlog-2.6.0-py2.py3-none-any.whl
Collecting blinker>=1.3 (from -r PerfKitBenchmarker/requirements.txt (line 18))
Collecting futures>=3.0.3 (from -r PerfKitBenchmarker/requirements.txt (line 
19))
  Using cached futures-3.2.0-py2-none-any.whl
Requirement already satisfied: PyYAML==3.12 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Collecting pint>=0.7 (from -r PerfKitBenchmarker/requirements.txt (line 21))
Collecting numpy==1.13.3 (from -r PerfKitBenchmarker/requirements.txt (line 22))
  Using cached numpy-1.13.3-cp27-cp27mu-manylinux1_x86_64.whl
Requirement already satisfied: functools32 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Collecting contextlib2>=0.5.1 (from -r PerfKitBenchmarker/requirements.txt 
(line 24))
  Using cached contextlib2-0.5.5-py2.py3-none-any.whl
Collecting pywinrm (from -r PerfKitBenchmarker/requirements.txt (line 25))
  Using cached pywinrm-0.3.0-py2.py3-none-any.whl
Requirement already satisfied: six in /usr/local/lib/python2.7/dist-packages 
(from absl-py->-r PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe>=0.23 in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 
/usr/lib/python2.7/dist-packages (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))
Collecting xmltodict (from pywinrm->-r PerfKitBenchmarker/requirements.txt 
(line 25))
  Using cached xmltodict-0.11.0-py2.py3-none-any.whl
Collecting requests-ntlm>=0.3.0 (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
  Using cached requests_ntlm-1.1.0-py2.py3-none-any.whl
Requirement already satisfied: requests>=2.9.1 in 
/usr/local/lib/python2.7/dist-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Collecting ntlm-auth>=1.0.2 (from requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
  Using cached ntlm_auth-1.1.0-py2.py3-none-any.whl
Collecting cryptography>=1.3 (from requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
  Using cached cryptography-2.2.1-cp27-cp27mu-manylinux1_x86_64.whl
Requirement already satisfied: urllib3<1.23,>=1.21.1 in 
/usr/local/lib/python2.7/dist-packages (from requests>=2.9.1->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: idna<2.7,>=2.5 in 
/usr/local/lib/python2.7/dist-packages (from requests>=2.9.1->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in 
/usr/local/lib/python2.7/dist-packages (from requests>=2.9.1->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: certifi>=2017.4.17 in 
/usr/local/lib/python2.7/dist-packages (from requests>=2.9.1->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Collecting cffi>=1.7; platform_python_implementation != "PyPy" (from 
cryptography>=1.3->requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
  Using cached cffi-1.11.5-cp27-cp27mu-manylinux1_x86_64.whl
Requirement already satisfied: enum34; python_version < "3" in 
/usr/local/lib/python2.7/dist-packages (from 
cryptography>=1.3->requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Collecting asn1crypto>=0.21.0 (from 
cryptography>=1.3->requests-ntlm>=0.3.0->pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
  Using cached asn1crypto-0.24.0-py2.py3-none-any.whl

[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83739&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83739 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:23
Start Date: 23/Mar/18 18:23
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #4877: 
[BEAM-3738] Enable py3 lint and cleanup tox.ini.
URL: https://github.com/apache/beam/pull/4877#discussion_r176823318
 
 

 ##
 File path: sdks/python/tox.ini
 ##
 @@ -17,142 +17,107 @@
 
 [tox]
 # new environments will be excluded by default unless explicitly added to 
envlist.
-# TODO (after BEAM-3671) add lint_py3 back in.
-envlist = py27,py27gcp,py27cython,lint_py2,docs
+envlist = py27,py27-{gcp,cython2,lint},py3-lint,docs
 toxworkdir = {toxinidir}/target/.tox
 
 [pycodestyle]
 # Disable all errors and warnings except for the ones related to blank lines.
 # pylint does not check the number of blank lines.
 select = E3
 
+# Shared environment options.
+[testenv]
+# Set [] options for pip installation of apache-beam tarball.
+extras = test
+# Don't warn that these commands aren't installed.
+whitelist_externals =
+  find
+  time
+
 [testenv:py27]
-# autocomplete_test depends on nose when invoked directly.
-deps =
-  nose==1.3.7
-  grpcio-tools==1.3.5
-whitelist_externals=find
 commands =
   python --version
   pip --version
-  # Clean up all previous python generated files.
-  - find apache_beam -type f -name '*.pyc' -delete
-  pip install -e .[test]
+  {toxinidir}/run_tox_cleanup.sh
   python apache_beam/examples/complete/autocomplete_test.py
   python setup.py test
-passenv = TRAVIS*
+  {toxinidir}/run_tox_cleanup.sh
 
-[testenv:py27cython]
+# This environment will fail in Jenkins if named "py27-cython".
 
 Review comment:
   No bleeding clue. Since I don't have ssh access I can't really debug it.
   Honestly I'm glad I got it working.
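
The new `envlist = py27,py27-{gcp,cython2,lint},py3-lint,docs` in this hunk relies on tox's generative environment names. A simplified sketch of the brace expansion tox performs (an illustration only, not tox's actual parser, which also handles nesting and whitespace):

```python
import itertools
import re

def split_top(spec):
    """Split an envlist on commas that are not inside {...} groups."""
    out, buf, depth = [], "", 0
    for ch in spec:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            out.append(buf)
            buf = ""
        else:
            buf += ch
    out.append(buf)
    return out

def expand_envlist(spec):
    """Expand tox-style factors: 'py27-{gcp,lint}' -> py27-gcp, py27-lint."""
    envs = []
    for entry in split_top(spec):
        # Split each entry into literal parts and {a,b,c} alternative groups,
        # then take the cartesian product of the choices.
        parts = re.split(r"(\{[^}]*\})", entry.strip())
        choices = [p[1:-1].split(",") if p.startswith("{") else [p]
                   for p in parts]
        envs.extend("".join(c) for c in itertools.product(*choices))
    return envs

print(expand_envlist("py27,py27-{gcp,cython2,lint},py3-lint,docs"))
# ['py27', 'py27-gcp', 'py27-cython2', 'py27-lint', 'py3-lint', 'docs']
```

Each expanded name then picks up any `[testenv]` settings conditional on its factors, which is what lets the shared section above stay small.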




Issue Time Tracking
---

Worklog Id: (was: 83739)
Time Spent: 11h  (was: 10h 50m)






[jira] [Work logged] (BEAM-3738) Enable Py3 linting in Jenkins

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ https://issues.apache.org/jira/browse/BEAM-3738?focusedWorklogId=83738&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83738 ]

ASF GitHub Bot logged work on BEAM-3738:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:23
Start Date: 23/Mar/18 18:23
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #4877: 
[BEAM-3738] Enable py3 lint and cleanup tox.ini.
URL: https://github.com/apache/beam/pull/4877#discussion_r176823498
 
 

 ##
 File path: sdks/python/setup.py
 ##
 @@ -113,11 +113,8 @@ def get_version():
 'futures>=3.1.1,<4.0.0',
 ]
 
-REQUIRED_SETUP_PACKAGES = [
-'nose>=1.0',
-]
-
 REQUIRED_TEST_PACKAGES = [
+'nose>=1.3.7',
 
 Review comment:
   tox.ini already required 1.3.7. I thought this would be a no-op.
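
With `nose` dropped from `REQUIRED_SETUP_PACKAGES` and pinned in the test list, it is only installed via the `test` extra (`pip install apache-beam[test]`), which the shared `[testenv]` selects with `extras = test`. A minimal sketch of the pattern; `min_version_ok` is a hypothetical helper for illustration, not part of setup.py, and not a full PEP 440 parser:

```python
# Test-only dependencies, exposed through extras_require rather than
# setup_requires, so plain installs never pull them in.
REQUIRED_TEST_PACKAGES = [
    'nose>=1.3.7',
]

EXTRAS = {'test': REQUIRED_TEST_PACKAGES}
# In setup.py this dict would be passed as setup(..., extras_require=EXTRAS).

def min_version_ok(installed, spec):
    """Check a 'pkg>=x.y.z' minimum pin (simple dotted-integer versions only)."""
    _name, _, minver = spec.partition(">=")
    as_tuple = lambda v: tuple(int(p) for p in v.split("."))
    return as_tuple(installed) >= as_tuple(minver)

print(min_version_ok("1.3.7", EXTRAS['test'][0]))  # True: 1.3.7 satisfies >=1.3.7
```

Raising the floor from `nose>=1.0` to `nose>=1.3.7` is indeed a no-op for anyone who was already running under tox, since tox.ini pinned exactly 1.3.7.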




Issue Time Tracking
---

Worklog Id: (was: 83738)
Time Spent: 11h  (was: 10h 50m)






[jira] [Work logged] (BEAM-3913) Support custom runner primitives in Fusion

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ https://issues.apache.org/jira/browse/BEAM-3913?focusedWorklogId=83733&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83733 ]

ASF GitHub Bot logged work on BEAM-3913:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:20
Start Date: 23/Mar/18 18:20
Worklog Time Spent: 10m 
  Work Description: tgroh commented on issue #4938: [BEAM-3913] Allow 
Fusion to Continue with unknown PTransforms
URL: https://github.com/apache/beam/pull/4938#issuecomment-375757466
 
 
   Added a test.
   
   Only runner-executed unknown URNs are allowed right now in the fuser, as we assume that the runner can handle them properly.




Issue Time Tracking
---

Worklog Id: (was: 83733)
Time Spent: 40m  (was: 0.5h)

> Support custom runner primitives in Fusion
> ------------------------------------------
>
> Key: BEAM-3913
> URL: https://issues.apache.org/jira/browse/BEAM-3913
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Effectively, register runner-provided transform URNs in the 
> GreedyPCollectionFusers to understand runner-specific primitives, and don't 
> ever fuse them.
>  
> It's probably appropriate to continue to crash when encountering truly 
> unknown URNs
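
The policy described above can be sketched as a three-way decision. This is an illustrative Python sketch only (Beam's actual fuser is Java, and the URN and function names here are assumptions): composable URNs get fused, runner-declared URNs are left as runner-executed primitives, and truly unknown URNs still crash.

```python
# URNs the fuser knows how to combine into an ExecutableStage
# (values here are illustrative).
KNOWN_COMPOSABLE = {"beam:transform:pardo:v1", "beam:transform:flatten:v1"}

def plan_stage(transform_urn, runner_urns):
    """Decide how the fuser should treat one transform URN."""
    if transform_urn in KNOWN_COMPOSABLE:
        return "fuse"
    if transform_urn in runner_urns:
        # Runner-provided primitive: never fused, executed by the runner.
        return "runner-executed"
    raise ValueError("unknown URN: %s" % transform_urn)

runner_urns = {"my-runner:custom-source:v1"}  # hypothetical runner-declared set
print(plan_stage("beam:transform:pardo:v1", runner_urns))      # fuse
print(plan_stage("my-runner:custom-source:v1", runner_urns))   # runner-executed
```

The key design point is the middle branch: registering runner URNs up front lets fusion continue past them without weakening the crash-on-unknown safety net.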





[jira] [Work logged] (BEAM-3545) Fn API metrics in Go SDK harness

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ https://issues.apache.org/jira/browse/BEAM-3545?focusedWorklogId=83727&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83727 ]

ASF GitHub Bot logged work on BEAM-3545:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:12
Start Date: 23/Mar/18 18:12
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #4899: [BEAM-3545] Go SDK 
UserCounters
URL: https://github.com/apache/beam/pull/4899#issuecomment-375755146
 
 
  R: @wcn3 @herohde 
   PTAL, I missed that counters should be segregated by bundles, which led to 
dramatic over-counting; resolved. Added a rough design summary/rationale to the 
internal metrics package documentation.
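
Segregating counters by bundle means each increment is attributed to the bundle that produced it, so reporting one bundle's totals cannot absorb another's. An illustrative Python sketch of the idea (not the Go SDK's actual metrics package; all names here are assumptions):

```python
from collections import defaultdict

class BundleCounters:
    """Counters keyed by (bundle_id, counter_name), so a counter incremented
    in one bundle never bleeds into another bundle's report."""

    def __init__(self):
        self._counts = defaultdict(int)

    def inc(self, bundle_id, name, n=1):
        self._counts[(bundle_id, name)] += n

    def report(self, bundle_id):
        """Extract and clear the counters for one finished bundle."""
        out = {name: v for (b, name), v in self._counts.items() if b == bundle_id}
        for name in out:
            del self._counts[(bundle_id, name)]
        return out

c = BundleCounters()
c.inc("bundle-1", "elements", 3)
c.inc("bundle-2", "elements", 5)
print(c.report("bundle-1"))  # {'elements': 3} -- bundle-2's count is untouched
```

Without the bundle key, a single global counter would keep accumulating across bundles and each per-bundle report would repeat earlier work, which is the over-counting described above.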




Issue Time Tracking
---

Worklog Id: (was: 83727)
Time Spent: 2.5h  (was: 2h 20m)

> Fn API metrics in Go SDK harness
> --------------------------------
>
> Key: BEAM-3545
> URL: https://issues.apache.org/jira/browse/BEAM-3545
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Kenneth Knowles
>Assignee: Robert Burke
>Priority: Major
>  Labels: portability
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-3060) Add performance tests for commonly used file-based I/O PTransforms

2018-03-23 Thread ASF GitHub Bot (JIRA)

 [ https://issues.apache.org/jira/browse/BEAM-3060?focusedWorklogId=83726&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-83726 ]

ASF GitHub Bot logged work on BEAM-3060:


Author: ASF GitHub Bot
Created on: 23/Mar/18 18:10
Start Date: 23/Mar/18 18:10
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #4861: [BEAM-3060] 
Jenkins configuration allowing to run FilebasedIO tests on HDFS.
URL: https://github.com/apache/beam/pull/4861#issuecomment-375754624
 
 
   Kamil, please let me know when this is ready for review.




Issue Time Tracking
---

Worklog Id: (was: 83726)
Time Spent: 5h 10m  (was: 5h)

> Add performance tests for commonly used file-based I/O PTransforms
> ------------------------------------------------------------------
>
> Key: BEAM-3060
> URL: https://issues.apache.org/jira/browse/BEAM-3060
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Szymon Nieradka
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> We recently added a performance testing framework [1] that can be used to do 
> the following.
> (1) Execute Beam tests using PerfkitBenchmarker
> (2) Manage Kubernetes-based deployments of data stores.
> (3) Easily publish benchmark results. 
> I think it will be useful to add performance tests for commonly used 
> file-based I/O PTransforms using this framework. I suggest looking into the 
> following formats initially.
> (1) AvroIO
> (2) TextIO
> (3) Compressed text using TextIO
> (4) TFRecordIO
> It should be possible to run these tests for various Beam runners (Direct, 
> Dataflow, Flink, Spark, etc.) and file-systems (GCS, local, HDFS, etc.) 
> easily.
> In the initial version, tests can be made manually triggerable for PRs 
> through Jenkins. Later, we could make some of these tests run periodically 
> and publish benchmark results (to BigQuery) through PerfkitBenchmarker.
> [1] https://beam.apache.org/documentation/io/testing/





[jira] [Resolved] (BEAM-3895) Side Inputs should be available on ExecutableStage

2018-03-23 Thread Thomas Groh (JIRA)

 [ https://issues.apache.org/jira/browse/BEAM-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Groh resolved BEAM-3895.
---
   Resolution: Fixed
Fix Version/s: Not applicable

> Side Inputs should be available on ExecutableStage
> --------------------------------------------------
>
> Key: BEAM-3895
> URL: https://issues.apache.org/jira/browse/BEAM-3895
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>Priority: Major
>  Labels: portability
> Fix For: Not applicable
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Doing this ensures that the runner will have the side inputs immediately 
> available.




