[jira] [Commented] (BEAM-5160) Fix failing `ReduceWindow` test

2018-08-17 Thread Vaclav Plajt (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583460#comment-16583460
 ] 

Vaclav Plajt commented on BEAM-5160:


Pull request: [https://github.com/seznam/beam/pull/20]

 

> Fix failing `ReduceWindow` test
> ---
>
> Key: BEAM-5160
> URL: https://issues.apache.org/jira/browse/BEAM-5160
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-euphoria
>Reporter: Vaclav Plajt
>Assignee: Vaclav Plajt
>Priority: Major
>
> `...testkit.ReduceWindowTest` contains one failing, currently ignored test. 
> Fix it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-5161) Enable FindBugs

2018-08-17 Thread Vaclav Plajt (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaclav Plajt resolved BEAM-5161.

   Resolution: Fixed
Fix Version/s: Not applicable

> Enable FindBugs
> ---
>
> Key: BEAM-5161
> URL: https://issues.apache.org/jira/browse/BEAM-5161
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-euphoria
>Reporter: Vaclav Plajt
>Assignee: David Moravek
>Priority: Major
> Fix For: Not applicable
>
>
> Euphoria's `build.gradle` contains `applyJavaNature(enableFindbugs: false)`. 
> Enable FindBugs and resolve all the warnings.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5161) Enable FindBugs

2018-08-17 Thread Vaclav Plajt (JIRA)
Vaclav Plajt created BEAM-5161:
--

 Summary: Enable FindBugs
 Key: BEAM-5161
 URL: https://issues.apache.org/jira/browse/BEAM-5161
 Project: Beam
  Issue Type: Sub-task
  Components: dsl-euphoria
Reporter: Vaclav Plajt
Assignee: David Moravek


Euphoria's `build.gradle` contains `applyJavaNature(enableFindbugs: false)`. 
Enable FindBugs and resolve all the warnings.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PreCommit_Java_Cron #236

2018-08-17 Thread Apache Jenkins Server
See 


--
[...truncated 14.18 MB...]
BeamIOSourceRel(table=[[beam, Bid]])

Aug 17, 2018 6:43:27 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: BEAMPlan>
BeamCalcRel(expr#0..4=[{inputs}], proj#0..1=[{exprs}])
  BeamJoinRel(condition=[AND(=($2, $4), >=($1, $3))], joinType=[inner])
BeamCalcRel(expr#0..2=[{inputs}], auction=[$t0], num=[$t2], 
starttime=[$t1])
  BeamAggregationRel(group=[{0, 1}], num=[COUNT()])
BeamCalcRel(expr#0..4=[{inputs}], expr#5=[5000], expr#6=[1], 
expr#7=[HOP($t3, $t5, $t6)], auction=[$t0], $f1=[$t7])
  BeamIOSourceRel(table=[[beam, Bid]])
BeamCalcRel(expr#0..1=[{inputs}], maxnum=[$t1], starttime=[$t0])
  BeamAggregationRel(group=[{1}], maxnum=[MAX($0)])
BeamCalcRel(expr#0..2=[{inputs}], num=[$t2], starttime=[$t1])
  BeamAggregationRel(group=[{0, 1}], num=[COUNT()])
BeamCalcRel(expr#0..4=[{inputs}], expr#5=[5000], 
expr#6=[1], expr#7=[HOP($t3, $t5, $t6)], auction=[$t0], $f1=[$t7])
  BeamIOSourceRel(table=[[beam, Bid]])


org.apache.beam.sdk.nexmark.queries.sql.SqlQuery3Test > 
testJoinsPeopleWithAuctions STANDARD_ERROR
Aug 17, 2018 6:43:27 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQL:
SELECT `P`.`name`, `P`.`city`, `P`.`state`, `A`.`id`
FROM `beam`.`Auction` AS `A`
INNER JOIN `beam`.`Person` AS `P` ON `A`.`seller` = `P`.`id`
WHERE `A`.`category` = 10 AND (`P`.`state` = 'OR' OR `P`.`state` = 'ID' OR 
`P`.`state` = 'CA')
Aug 17, 2018 6:43:27 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQLPlan>
LogicalProject(name=[$11], city=[$14], state=[$15], id=[$0])
  LogicalFilter(condition=[AND(=($8, 10), OR(=($15, 'OR'), =($15, 'ID'), 
=($15, 'CA')))])
LogicalJoin(condition=[=($7, $10)], joinType=[inner])
  BeamIOSourceRel(table=[[beam, Auction]])
  BeamIOSourceRel(table=[[beam, Person]])

Aug 17, 2018 6:43:27 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: BEAMPlan>
BeamCalcRel(expr#0..17=[{inputs}], name=[$t11], city=[$t14], state=[$t15], 
id=[$t0])
  BeamJoinRel(condition=[=($7, $10)], joinType=[inner])
BeamCalcRel(expr#0..9=[{inputs}], expr#10=[10], expr#11=[=($t8, $t10)], 
proj#0..9=[{exprs}], $condition=[$t11])
  BeamIOSourceRel(table=[[beam, Auction]])
BeamCalcRel(expr#0..7=[{inputs}], expr#8=['OR'], expr#9=[=($t5, $t8)], 
expr#10=['ID'], expr#11=[=($t5, $t10)], expr#12=['CA'], expr#13=[=($t5, $t12)], 
expr#14=[OR($t9, $t11, $t13)], proj#0..7=[{exprs}], $condition=[$t14])
  BeamIOSourceRel(table=[[beam, Person]])


org.apache.beam.sdk.nexmark.queries.sql.SqlQuery7Test > testBids STANDARD_ERROR
Aug 17, 2018 6:43:28 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQL:
SELECT `B`.`auction`, `B`.`price`, `B`.`bidder`, `B`.`dateTime`, `B`.`extra`
FROM (SELECT `B`.`auction`, `B`.`price`, `B`.`bidder`, `B`.`dateTime`, 
`B`.`extra`, TUMBLE_START(`B`.`dateTime`, INTERVAL '10' SECOND) AS `starttime`
FROM `beam`.`Bid` AS `B`
GROUP BY `B`.`auction`, `B`.`price`, `B`.`bidder`, `B`.`dateTime`, 
`B`.`extra`, TUMBLE(`B`.`dateTime`, INTERVAL '10' SECOND)) AS `B`
INNER JOIN (SELECT MAX(`B1`.`price`) AS `maxprice`, 
TUMBLE_START(`B1`.`dateTime`, INTERVAL '10' SECOND) AS `starttime`
FROM `beam`.`Bid` AS `B1`
GROUP BY TUMBLE(`B1`.`dateTime`, INTERVAL '10' SECOND)) AS `B1` ON 
`B`.`starttime` = `B1`.`starttime` AND `B`.`price` = `B1`.`maxprice`
Aug 17, 2018 6:43:28 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQLPlan>
LogicalProject(auction=[$0], price=[$1], bidder=[$2], dateTime=[$3], 
extra=[$4])
  LogicalJoin(condition=[AND(=($5, $7), =($1, $6))], joinType=[inner])
LogicalProject(auction=[$0], price=[$1], bidder=[$2], dateTime=[$3], 
extra=[$4], starttime=[$5])
  LogicalAggregate(group=[{0, 1, 2, 3, 4, 5}])
LogicalProject(auction=[$0], price=[$2], bidder=[$1], 
dateTime=[$3], extra=[$4], $f5=[TUMBLE($3, 1)])
  BeamIOSourceRel(table=[[beam, Bid]])
LogicalProject(maxprice=[$1], starttime=[$0])
  LogicalAggregate(group=[{0}], maxprice=[MAX($1)])
LogicalProject($f0=[TUMBLE($3, 1)], price=[$2])
  BeamIOSourceRel(table=[[beam, Bid]])

Aug 17, 2018 6:43:28 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: BEAMPlan>
BeamCalcRel(expr#0..7=[{inputs}], proj#0..4=[{exprs}])
  BeamJoinRel(condition=[AND(=($5, $7), =($1, $6))], joinType=[inner])
BeamCalcRel(expr#0..5=[{inputs}], proj#0..5=[{exprs}])
  

[jira] [Commented] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread Etienne Chauchot (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583569#comment-16583569
 ] 

Etienne Chauchot commented on BEAM-3026:


You're right [~timrobertson100], 429 is not a retryable code in ES because 
retrying it can amplify the load on the cluster. We need a sufficiently steep 
exponential backoff to avoid that. I'll check whether the backoff in Ravi's PR 
is exponential enough.
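The backoff-steepness concern above can be illustrated with a minimal sketch. This is not Beam's or Elasticsearch's actual retry API; the class and method names are made up for illustration. A capped exponential delay doubles on each attempt until it hits a ceiling:

```java
/**
 * Minimal sketch of a capped exponential backoff, assuming a base delay and a
 * hard cap; NOT the actual Beam or Elasticsearch retry API.
 */
public class BackoffSketch {

  /** Delay before the given (0-based) retry attempt: base * 2^attempt, capped. */
  static long delayMillis(int attempt, long baseMillis, long capMillis) {
    // Clamp the shift so the multiplier cannot overflow a long.
    long multiplier = 1L << Math.min(attempt, 20);
    return Math.min(baseMillis * multiplier, capMillis);
  }

  public static void main(String[] args) {
    // With a 1s base and 30s cap, delays grow 1s, 2s, 4s, ... until capped.
    for (int attempt = 0; attempt < 6; attempt++) {
      System.out.println(
          "attempt " + attempt + " -> " + delayMillis(attempt, 1000, 30_000) + " ms");
    }
  }
}
```

A production backoff would typically also add jitter so that many throttled clients do not retry in lockstep, which is exactly the 429 load-amplification concern.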

> Improve retrying in ElasticSearch client
> 
>
> Key: BEAM-3026
> URL: https://issues.apache.org/jira/browse/BEAM-3026
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-elasticsearch
>Reporter: Tim Robertson
>Assignee: Ravi Pathak
>Priority: Major
> Fix For: 2.7.0
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Currently an overloaded ES server will result in clients failing fast.
> I suggest implementing backoff pauses.  Perhaps something like this:
> {code}
> ElasticsearchIO.ConnectionConfiguration conn = ElasticsearchIO.ConnectionConfiguration
>   .create(new String[]{"http://...:9200"}, "test", "test")
>   .retryWithWaitStrategy(WaitStrategies.exponentialBackoff(1000, TimeUnit.MILLISECONDS))
>   .retryWithStopStrategy(StopStrategies.stopAfterAttempt(10));
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5132) Composite windowing fail with exception: AttributeError: 'NoneType' object has no attribute 'time'

2018-08-17 Thread Subhash (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583582#comment-16583582
 ] 

Subhash commented on BEAM-5132:
---

Also, on a side note: for my use case, I was trying to apply this windowing to 
a ReadFromText input in my Apache Beam pipeline, so that I can window the 
bounded source (it's a really big file: ~10GB), as opposed to an unbounded 
input (like a Pub/Sub queue). Could this be what is causing the problem?

> Composite windowing fail with exception: AttributeError: 'NoneType' object 
> has no attribute 'time'
> --
>
> Key: BEAM-5132
> URL: https://issues.apache.org/jira/browse/BEAM-5132
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.5.0
> Environment: Windows machine
>Reporter: Bui Nguyen Thang
>Priority: Major
>
> Tried to apply a window function to the existing pipeline that was running 
> fine. Got the following error:
> {code:java}
> Traceback (most recent call last):
>   File "E:\soft\ide\PyCharm 2018.1.2\helpers\pydev\pydevd.py", line 1664, in 
> 
> main()
>   File "E:\soft\ide\PyCharm 2018.1.2\helpers\pydev\pydevd.py", line 1658, in 
> main
> globals = debugger.run(setup['file'], None, None, is_module)
>   File "E:\soft\ide\PyCharm 2018.1.2\helpers\pydev\pydevd.py", line 1068, in 
> run
> pydev_imports.execfile(file, globals, locals)  # execute the script
>   File 
> "E:/work/source/ai-data-pipeline-research/metric_pipeline/batch_beam/batch_pipeline_main.py",
>  line 97, in 
> result.wait_until_finish()
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\runners\direct\direct_runner.py",
>  line 421, in wait_until_finish
> self._executor.await_completion()
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\runners\direct\executor.py",
>  line 398, in await_completion
> self._executor.await_completion()
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\runners\direct\executor.py",
>  line 444, in await_completion
> six.reraise(t, v, tb)
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\runners\direct\executor.py",
>  line 341, in call
> finish_state)
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\runners\direct\executor.py",
>  line 378, in attempt_call
> evaluator.process_element(value)
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\runners\direct\transform_evaluator.py",
>  line 574, in process_element
> self.runner.process(element)
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\runners\common.py",
>  line 577, in process
> self._reraise_augmented(exn)
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\runners\common.py",
>  line 618, in _reraise_augmented
> six.reraise(type(new_exn), new_exn, original_traceback)
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\runners\common.py",
>  line 575, in process
> self.do_fn_invoker.invoke_process(windowed_value)
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\runners\common.py",
>  line 353, in invoke_process
> windowed_value, self.process_method(windowed_value.value))
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\runners\common.py",
>  line 651, in process_outputs
> for result in results:
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\transforms\trigger.py",
>  line 942, in process_entire_key
> state, windowed_values, output_watermark):
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\transforms\trigger.py",
>  line 1098, in process_elements
> self.trigger_fn.on_element(value, window, context)
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\transforms\trigger.py",
>  line 488, in on_element
> self.underlying.on_element(element, window, context)
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\transforms\trigger.py",
>  line 535, in on_element
> trigger.on_element(element, window, self._sub_context(context, ix))
>   File 
> "C:\Users\thang\.virtualenvs\metric_pipeline-Mpf2nmv6\lib\site-packages\apache_beam\transforms\trigger.py",
>  line 286, in on_element
> '', TimeDomain.REAL_TIME, 

[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=135618=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135618
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 17/Aug/18 08:28
Start Date: 17/Aug/18 08:28
Worklog Time Spent: 10m 
  Work Description: vectorijk removed a comment on issue #5926: [BEAM-4723] 
[SQL] Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-413691262
 
 
   run java precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135618)
Time Spent: 3.5h  (was: 3h 20m)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Datetime*Expression only supports the timestamp type for its first operand 
> now. We should let it accept all Datetime_Types.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=135619=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135619
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 17/Aug/18 08:29
Start Date: 17/Aug/18 08:29
Worklog Time Spent: 10m 
  Work Description: vectorijk commented on issue #5926: [BEAM-4723] [SQL] 
Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-413795824
 
 
   Run Java PreCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135619)
Time Spent: 3h 40m  (was: 3.5h)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Datetime*Expression only supports the timestamp type for its first operand 
> now. We should let it accept all Datetime_Types.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Java_GradleBuild #1277

2018-08-17 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_Verify #5768

2018-08-17 Thread Apache Jenkins Server
See 


--
[...truncated 1.05 MB...]
test_compatibility (apache_beam.typehints.typehints_test.IterableHintTestCase) 
... ok
test_getitem_invalid_composite_type_param 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_repr (apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_tuple_compatibility 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_must_be_iterable 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_invalid_composite_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_invalid_simple_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_valid_composite_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_valid_simple_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_enforce_kv_type_constraint 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_param_must_be_tuple 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_param_must_have_length_2 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_proxy_to_tuple 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_enforce_list_type_constraint_invalid_composite_type 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_enforce_list_type_constraint_invalid_simple_type 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_enforce_list_type_constraint_valid_composite_type 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_enforce_list_type_constraint_valid_simple_type 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_getitem_invalid_composite_type_param 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_list_constraint_compatibility 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_list_repr (apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_getitem_proxy_to_union 
(apache_beam.typehints.typehints_test.OptionalHintTestCase) ... ok
test_getitem_sequence_not_allowed 
(apache_beam.typehints.typehints_test.OptionalHintTestCase) ... ok
test_any_return_type_hint 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_must_be_primitive_type_or_type_constraint 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_must_be_single_return_type 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_no_kwargs_accepted 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_type_check_composite_type 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_type_check_simple_type 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_type_check_violation 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_compatibility (apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_getitem_invalid_composite_type_param 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_repr (apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_type_check_invalid_elem_type 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_type_check_must_be_set 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_type_check_valid_elem_composite_type 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_type_check_valid_elem_simple_type 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_any_argument_type_hint 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_basic_type_assertion 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_composite_type_assertion 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_invalid_only_positional_arguments 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_must_be_primitive_type_or_constraint 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_valid_mix_positional_and_keyword_arguments 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_valid_only_positional_arguments 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_valid_simple_type_arguments 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_functions_as_regular_generator 
(apache_beam.typehints.typehints_test.TestGeneratorWrapper) ... ok
test_compatibility (apache_beam.typehints.typehints_test.TupleHintTestCase) ... 
ok
test_compatibility_arbitrary_length 
(apache_beam.typehints.typehints_test.TupleHintTestCase) ... ok
test_getitem_invalid_ellipsis_type_param 

[jira] [Commented] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container

2018-08-17 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583602#comment-16583602
 ] 

Maximilian Michels commented on BEAM-4130:
--

For local runs, where Flink is bootstrapped inside the Docker container, we 
need to default to running the SDK harness in the same container 
({{InProcessEnvironmentFactory}}). This is necessary because we can't start 
Docker containers from inside the Docker container.

For remote Flink clusters, we can use the regular {{DockerEnvironmentFactory}}.

> Portable Flink runner JobService entry point in a Docker container
> --
>
> Key: BEAM-4130
> URL: https://issues.apache.org/jira/browse/BEAM-4130
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Maximilian Michels
>Priority: Minor
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> The portable Flink runner exists as a Job Service that runs somewhere. We 
> need a main entry point that itself spins up the job service (and artifact 
> staging service). The main program itself should be packaged into an uberjar 
> such that it can be run locally or submitted to a Flink deployment via `flink 
> run`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135656=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135656
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210835997
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/common/retry/BaseRetryConfiguration.java
 ##
 @@ -0,0 +1,52 @@
+/*
 
 Review comment:
   As this PR touches more than ESIO, please rename the PR to "factorise 
retry behavior and add it to ESIO".


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135656)
Time Spent: 7.5h  (was: 7h 20m)

> Improve retrying in ElasticSearch client
> 
>
> Key: BEAM-3026
> URL: https://issues.apache.org/jira/browse/BEAM-3026
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-elasticsearch
>Reporter: Tim Robertson
>Assignee: Ravi Pathak
>Priority: Major
> Fix For: 2.7.0
>
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> Currently an overloaded ES server will result in clients failing fast.
> I suggest implementing backoff pauses.  Perhaps something like this:
> {code}
> ElasticsearchIO.ConnectionConfiguration conn = ElasticsearchIO.ConnectionConfiguration
>   .create(new String[]{"http://...:9200"}, "test", "test")
>   .retryWithWaitStrategy(WaitStrategies.exponentialBackoff(1000, TimeUnit.MILLISECONDS))
>   .retryWithStopStrategy(StopStrategies.stopAfterAttempt(10));
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135670=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135670
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210864144
 
 

 ##
 File path: 
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
 ##
 @@ -1025,11 +1172,40 @@ private void flushBatch() throws IOException {
 HttpEntity requestBody =
 new NStringEntity(bulkRequest.toString(), 
ContentType.APPLICATION_JSON);
 response = restClient.performRequest("POST", endPoint, 
Collections.emptyMap(), requestBody);
+if (spec.getRetryConfiguration() != null
+&& spec.getRetryConfiguration()
+.getRetryPredicate()
+.test(new ResponseException(response))) {
+  response = handleRetry("POST", endPoint, Collections.emptyMap(), 
requestBody);
+}
 checkForErrors(response, backendVersion);
   }
 
+  /** retry request based on retry configuration policy. */
+  private Response handleRetry(
+  String method, String endpoint, Map<String, String> params, HttpEntity requestBody)
+  throws IOException, InterruptedException {
+Response response = null;
+Sleeper sleeper = Sleeper.DEFAULT;
+BackOff backoff = retryBackoff.backoff();
+int attempt = 0;
+//while retry policy exists
+while (BackOffUtils.next(sleeper, backoff)) {
+  LOG.warn(String.format(RETRY_ATTEMPT_LOG, ++attempt));
+  response = restClient.performRequest(method, endpoint, params, 
requestBody);
+  if (spec.getRetryConfiguration()
+  .getRetryPredicate()
+  .test(new ResponseException(response))) {
+continue;
+  }
+  //if request was successful or has other errors
+  return response;
+}
+throw new IOException(String.format(RETRY_FAILED_LOG, attempt));
+  }
+
   @Teardown
-  public void closeClient() throws Exception {
+  public void closeClient() throws IOException {
 
 Review comment:
   Good to narrow down the exception! Thanks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135670)
Time Spent: 9h  (was: 8h 50m)

> Improve retrying in ElasticSearch client
> 
>
> Key: BEAM-3026
> URL: https://issues.apache.org/jira/browse/BEAM-3026
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-elasticsearch
>Reporter: Tim Robertson
>Assignee: Ravi Pathak
>Priority: Major
> Fix For: 2.7.0
>
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> Currently an overloaded ES server will result in clients failing fast.
> I suggest implementing backoff pauses.  Perhaps something like this:
> {code}
> ElasticsearchIO.ConnectionConfiguration conn = ElasticsearchIO.ConnectionConfiguration
>   .create(new String[]{"http://...:9200"}, "test", "test")
>   .retryWithWaitStrategy(WaitStrategies.exponentialBackoff(1000, TimeUnit.MILLISECONDS))
>   .retryWithStopStrategy(StopStrategies.stopAfterAttempt(10));
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135664=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135664
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210874230
 
 

 ##
 File path: 
sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIOTestCommon.java
 ##
 @@ -512,4 +527,44 @@ public Void apply(Iterable input) {
   return null;
 }
   }
+
+  /** Test that the default predicate correctly parses chosen error code. */
+  public void testDefaultRetryPredicate(RestClient restClient) throws 
IOException {
+assertFalse(DEFAULT_RETRY_PREDICATE.test(new IOException("test")));
+String x =
+"{ \"index\" : { \"_index\" : \"test\", \"_type\" : \"doc\", \"_id\" : 
\"1\" } }\n"
++ "{ \"field1\" : @ }\n";
+HttpEntity entity = new NStringEntity(x, ContentType.APPLICATION_JSON);
+
+Response response = restClient.performRequest("POST", "/_bulk", 
Collections.emptyMap(), entity);
+assertTrue(CUSTOM_RETRY_PREDICATE.test(new ResponseException(response)));
+  }
+
+  /**
+   * Test that retries are invoked when Elasticsearch returns a specific error 
code. We invoke this
+   * by issuing corrupt data and retrying on the `400` error code. Normal 
behaviour is to retry on
+   * `429` only but that is difficult to simulate reliably. The logger is used 
to verify expected
+   * behavior.
+   */
+  public void testWriteRetry() throws Throwable {
+expectedException.expect(IOException.class);
+expectedException.expectMessage(
+String.format(ElasticsearchIO.Write.WriteFn.RETRY_FAILED_LOG, 2));
+
+String data[] = {"{ \"x\" :a,\"y\":\"ab\" }"};
+ElasticsearchIO.Write write =
+ElasticsearchIO.write()
+.withConnectionConfiguration(connectionConfiguration)
+.withRetryConfiguration(
+ElasticsearchIO.RetryConfiguration.create(3, 
Duration.millis(35000))
+.withRetryPredicate(CUSTOM_RETRY_PREDICATE));
 
 Review comment:
   Very good, I like your predicate approach.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135664)
Time Spent: 8h 40m  (was: 8.5h)

> Improve retrying in ElasticSearch client
> 
>
> Key: BEAM-3026
> URL: https://issues.apache.org/jira/browse/BEAM-3026
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-elasticsearch
>Reporter: Tim Robertson
>Assignee: Ravi Pathak
>Priority: Major
> Fix For: 2.7.0
>
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> Currently an overloaded ES server will result in clients failing fast.
> I suggest implementing backoff pauses.  Perhaps something like this:
> {code}
> ElasticsearchIO.ConnectionConfiguration conn = ElasticsearchIO.ConnectionConfiguration
>   .create(new String[]{"http://...:9200"}, "test", "test")
>   .retryWithWaitStrategy(WaitStrategies.exponentialBackoff(1000, TimeUnit.MILLISECONDS))
>   .retryWithStopStrategy(StopStrategies.stopAfterAttempt(10));
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135669=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135669
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210863705
 
 

 ##
 File path: 
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
 ##
 @@ -1025,11 +1172,40 @@ private void flushBatch() throws IOException {
         HttpEntity requestBody =
             new NStringEntity(bulkRequest.toString(), ContentType.APPLICATION_JSON);
         response = restClient.performRequest("POST", endPoint, Collections.emptyMap(), requestBody);
+        if (spec.getRetryConfiguration() != null
+            && spec.getRetryConfiguration()
+                .getRetryPredicate()
+                .test(new ResponseException(response))) {
+          response = handleRetry("POST", endPoint, Collections.emptyMap(), requestBody);
+        }
 checkForErrors(response, backendVersion);
   }
 
+  /** Retries the request based on the retry configuration policy. */
+  private Response handleRetry(
+      String method, String endpoint, Map<String, String> params, HttpEntity requestBody)
+      throws IOException, InterruptedException {
+    Response response = null;
+    Sleeper sleeper = Sleeper.DEFAULT;
+    BackOff backoff = retryBackoff.backoff();
+    int attempt = 0;
+    // loop while the retry policy allows another attempt
+    while (BackOffUtils.next(sleeper, backoff)) {
+      LOG.warn(String.format(RETRY_ATTEMPT_LOG, ++attempt));
+      response = restClient.performRequest(method, endpoint, params, requestBody);
+      if (spec.getRetryConfiguration()
+          .getRetryPredicate()
+          .test(new ResponseException(response))) {
 
 Review comment:
   It is very strange to create an Exception outside of a catch block: you
construct an exception even when there may be no problem at all. Instead, I
think you should parametrize the Predicate to take a Response object rather
than a ResponseException, and remove the `t instanceof ResponseException`
check in the predicate#test implementation, because it will always be true
when you construct the ResponseException yourself.
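Restating the suggestion as a sketch (all type and field names below are stand-ins, not the real Elasticsearch client or Beam classes): let the predicate inspect the `Response` directly, so no exception object is built when nothing went wrong:

```java
import java.io.Serializable;
import java.util.function.Predicate;

/** Stand-in for the Elasticsearch client's Response; only the status code matters here. */
class StubResponse {
  private final int statusCode;

  StubResponse(int statusCode) {
    this.statusCode = statusCode;
  }

  int getStatusCode() {
    return statusCode;
  }
}

/** A retry predicate over the raw response, so no exception object is built on the happy path. */
interface ResponseRetryPredicate extends Predicate<StubResponse>, Serializable {}

class RetryPredicates {
  // Retry only on 429 (Too Many Requests); no instanceof check is needed because
  // the predicate receives a response directly instead of a wrapped exception.
  static final ResponseRetryPredicate DEFAULT_RETRY_PREDICATE = r -> r.getStatusCode() == 429;
}
```

With this shape, `DEFAULT_RETRY_PREDICATE.test(response)` can be called right after `performRequest` without constructing a throwaway `ResponseException`.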




Issue Time Tracking
---

Worklog Id: (was: 135669)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135662&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135662
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210861856
 
 

 ##
 File path: 
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
 ##
 @@ -1013,7 +1160,7 @@ private void flushBatch() throws IOException {
 }
 batch.clear();
 currentBatchSizeBytes = 0;
-Response response;
+Response response = null;
 
 Review comment:
   redundant




Issue Time Tracking
---

Worklog Id: (was: 135662)
Time Spent: 8.5h  (was: 8h 20m)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135653&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135653
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210634383
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/common/retry/BaseRetryConfiguration.java
 ##
 @@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.common.retry;
+
+import java.io.Serializable;
+import org.joda.time.Duration;
+
+/**
+ * A POJO encapsulating a configuration for retry behavior when issuing 
requests. A retry will be
+ * attempted until the maxAttempts or maxDuration is exceeded, whichever comes 
first.
+ */
+public class BaseRetryConfiguration implements Serializable {
+
+  private int maxAttempts;
+  private Duration maxDuration;
+  protected RetryPredicate retryPredicate;
+
+  protected BaseRetryConfiguration(
+  int maxAttempts, Duration maxDuration, RetryPredicate retryPredicate) {
+super();
 
 Review comment:
   It is useless; you're just calling the no-argument constructor of Object.




Issue Time Tracking
---

Worklog Id: (was: 135653)
Time Spent: 7h 10m  (was: 7h)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135655&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135655
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210640533
 
 

 ##
 File path: 
sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java
 ##
 @@ -220,56 +221,33 @@ private HttpClient createHttpClient() {
*   
* 
*/
-  @AutoValue
-  public abstract static class RetryConfiguration implements Serializable {
-@VisibleForTesting
-static final RetryPredicate DEFAULT_RETRY_PREDICATE = new 
DefaultRetryPredicate();
-
-abstract int getMaxAttempts();
-
-abstract Duration getMaxDuration();
+  public static class RetryConfiguration extends BaseRetryConfiguration {
 
-abstract RetryPredicate getRetryPredicate();
-
-abstract Builder builder();
-
-@AutoValue.Builder
-abstract static class Builder {
-  abstract SolrIO.RetryConfiguration.Builder setMaxAttempts(int 
maxAttempts);
-
-  abstract SolrIO.RetryConfiguration.Builder setMaxDuration(Duration 
maxDuration);
-
-  abstract SolrIO.RetryConfiguration.Builder 
setRetryPredicate(RetryPredicate retryPredicate);
-
-  abstract SolrIO.RetryConfiguration build();
+private RetryConfiguration(
+int maxAttempts, Duration maxDuration, RetryPredicate 
defaultRetryPredicate) {
+  super(maxAttempts, maxDuration, defaultRetryPredicate);
 }
 
+/**
+ * Creates RetryConfiguration for {@link SolrIO} with provided 
maxAttempts, maxDurations and
+ * exponential backoff based retries.
+ */
 public static RetryConfiguration create(int maxAttempts, Duration 
maxDuration) {
   checkArgument(maxAttempts > 0, "maxAttempts must be greater than 0");
   checkArgument(
   maxDuration != null && maxDuration.isLongerThan(Duration.ZERO),
   "maxDuration must be greater than 0");
-  return new AutoValue_SolrIO_RetryConfiguration.Builder()
-  .setMaxAttempts(maxAttempts)
-  .setMaxDuration(maxDuration)
-  .setRetryPredicate(DEFAULT_RETRY_PREDICATE)
-  .build();
+  return new RetryConfiguration(maxAttempts, maxDuration, 
DEFAULT_RETRY_PREDICATE);
 }
 
-// Exposed only to allow tests to easily simulate server errors
 @VisibleForTesting
 RetryConfiguration withRetryPredicate(RetryPredicate predicate) {
 
 Review comment:
   As you are using the with* builder pattern in the `BaseRetryConfiguration` 
subclasses, I would prefer that you use AutoValue to conform to the Beam pattern.
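A rough sketch of the shape AutoValue would give (hypothetical names; `java.time.Duration` stands in for Joda's `Duration`, and a plain `Predicate<Throwable>` for the IO's `RetryPredicate`):

```java
import java.io.Serializable;
import java.time.Duration;
import java.util.function.Predicate;

/**
 * Hand-rolled equivalent of what an @AutoValue RetryConfiguration generates:
 * an immutable value with a validated create() factory and a copying with*()
 * method. Names are illustrative only.
 */
final class RetryConfiguration implements Serializable {
  private final int maxAttempts;
  private final Duration maxDuration;
  private final Predicate<Throwable> retryPredicate;

  private RetryConfiguration(
      int maxAttempts, Duration maxDuration, Predicate<Throwable> retryPredicate) {
    this.maxAttempts = maxAttempts;
    this.maxDuration = maxDuration;
    this.retryPredicate = retryPredicate;
  }

  static RetryConfiguration create(int maxAttempts, Duration maxDuration) {
    if (maxAttempts <= 0) {
      throw new IllegalArgumentException("maxAttempts must be greater than 0");
    }
    if (maxDuration == null || maxDuration.compareTo(Duration.ZERO) <= 0) {
      throw new IllegalArgumentException("maxDuration must be greater than 0");
    }
    // Stand-in default: retry on any IOException-like failure.
    return new RetryConfiguration(maxAttempts, maxDuration, t -> t instanceof java.io.IOException);
  }

  /** Returns a modified copy, which is what AutoValue's toBuilder() would produce. */
  RetryConfiguration withRetryPredicate(Predicate<Throwable> predicate) {
    return new RetryConfiguration(maxAttempts, maxDuration, predicate);
  }

  int getMaxAttempts() { return maxAttempts; }
  Duration getMaxDuration() { return maxDuration; }
  Predicate<Throwable> getRetryPredicate() { return retryPredicate; }
}
```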




Issue Time Tracking
---

Worklog Id: (was: 135655)
Time Spent: 7h 20m  (was: 7h 10m)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135654&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135654
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210833245
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/common/retry/BaseRetryConfiguration.java
 ##
 @@ -0,0 +1,52 @@
+/*
 
 Review comment:
   BaseRetryConfiguration has two subclasses called RetryConfiguration, 
declared as nested classes of the ElasticsearchIO and SolrIO classes. They are 
identical except for the DefaultRetryPredicate, so we need to avoid the code 
duplication: try to move the validation code in the create and 
withRetryPredicate methods up into BaseRetryConfiguration.
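Restated as a sketch (class names and the default predicates are stand-ins, not Beam's actual types): the shared state and the argument checks live once in the base class, and each IO only supplies its own default predicate:

```java
import java.io.Serializable;
import java.time.Duration;
import java.util.function.Predicate;

/** Base class owning the shared fields and the validation that the review
 *  wants pulled out of the two nested RetryConfiguration classes. */
abstract class BaseRetryConfiguration implements Serializable {
  protected final int maxAttempts;
  protected final Duration maxDuration;
  protected final Predicate<Throwable> retryPredicate;

  protected BaseRetryConfiguration(
      int maxAttempts, Duration maxDuration, Predicate<Throwable> retryPredicate) {
    // Validation is written once here instead of in every subclass's create().
    if (maxAttempts <= 0) {
      throw new IllegalArgumentException("maxAttempts must be greater than 0");
    }
    if (maxDuration == null || maxDuration.compareTo(Duration.ZERO) <= 0) {
      throw new IllegalArgumentException("maxDuration must be greater than 0");
    }
    this.maxAttempts = maxAttempts;
    this.maxDuration = maxDuration;
    this.retryPredicate = retryPredicate;
  }
}

/** The Elasticsearch flavour differs from the Solr one only in its default predicate. */
class EsRetryConfiguration extends BaseRetryConfiguration {
  EsRetryConfiguration(int maxAttempts, Duration maxDuration) {
    // Stand-in for the IO-specific DefaultRetryPredicate.
    super(maxAttempts, maxDuration, t -> t instanceof java.io.IOException);
  }
}
```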




Issue Time Tracking
---

Worklog Id: (was: 135654)
Time Spent: 7h 20m  (was: 7h 10m)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135660&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135660
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210871752
 
 

 ##
 File path: 
sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIOTestCommon.java
 ##
 @@ -512,4 +527,44 @@ public Void apply(Iterable input) {
   return null;
 }
   }
+
+  /** Test that the default predicate correctly parses chosen error code. */
+  public void testDefaultRetryPredicate(RestClient restClient) throws 
IOException {
+assertFalse(DEFAULT_RETRY_PREDICATE.test(new IOException("test")));
+String x =
+"{ \"index\" : { \"_index\" : \"test\", \"_type\" : \"doc\", \"_id\" : 
\"1\" } }\n"
++ "{ \"field1\" : @ }\n";
+HttpEntity entity = new NStringEntity(x, ContentType.APPLICATION_JSON);
+
+Response response = restClient.performRequest("POST", "/_bulk", 
Collections.emptyMap(), entity);
+assertTrue(CUSTOM_RETRY_PREDICATE.test(new ResponseException(response)));
+  }
+
+  /**
+   * Test that retries are invoked when Elasticsearch returns a specific error 
code. We invoke this
+   * by issuing corrupt data and retrying on the `400` error code. Normal 
behaviour is to retry on
+   * `429` only but that is difficult to simulate reliably. The logger is used 
to verify expected
 
 Review comment:
   Very good!




Issue Time Tracking
---

Worklog Id: (was: 135660)
Time Spent: 8h 10m  (was: 8h)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135668&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135668
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210873894
 
 

 ##
 File path: 
sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIOTestCommon.java
 ##
 @@ -512,4 +527,44 @@ public Void apply(Iterable input) {
   return null;
 }
   }
+
+  /** Test that the default predicate correctly parses chosen error code. */
+  public void testDefaultRetryPredicate(RestClient restClient) throws 
IOException {
+assertFalse(DEFAULT_RETRY_PREDICATE.test(new IOException("test")));
+String x =
+"{ \"index\" : { \"_index\" : \"test\", \"_type\" : \"doc\", \"_id\" : 
\"1\" } }\n"
++ "{ \"field1\" : @ }\n";
+HttpEntity entity = new NStringEntity(x, ContentType.APPLICATION_JSON);
+
+Response response = restClient.performRequest("POST", "/_bulk", 
Collections.emptyMap(), entity);
+assertTrue(CUSTOM_RETRY_PREDICATE.test(new ResponseException(response)));
+  }
+
+  /**
+   * Test that retries are invoked when Elasticsearch returns a specific error 
code. We invoke this
+   * by issuing corrupt data and retrying on the `400` error code. Normal 
behaviour is to retry on
+   * `429` only but that is difficult to simulate reliably. The logger is used 
to verify expected
+   * behavior.
+   */
+  public void testWriteRetry() throws Throwable {
+expectedException.expect(IOException.class);
+expectedException.expectMessage(
+String.format(ElasticsearchIO.Write.WriteFn.RETRY_FAILED_LOG, 2));
+
+String data[] = {"{ \"x\" :a,\"y\":\"ab\" }"};
+ElasticsearchIO.Write write =
+ElasticsearchIO.write()
+.withConnectionConfiguration(connectionConfiguration)
+.withRetryConfiguration(
+ElasticsearchIO.RetryConfiguration.create(3, 
Duration.millis(35000))
+.withRetryPredicate(CUSTOM_RETRY_PREDICATE));
+pipeline.apply(Create.of(Arrays.asList(data))).apply(write);
+try {
+  pipeline.run();
+} catch (Exception ex) {
+  throw ex.getCause();
+}
+
+fail();
 
 Review comment:
   not sure it is needed as you throw the exception




Issue Time Tracking
---

Worklog Id: (was: 135668)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135658&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135658
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210872468
 
 

 ##
 File path: 
sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIOTestCommon.java
 ##
 @@ -512,4 +527,44 @@ public Void apply(Iterable input) {
   return null;
 }
   }
+
+  /** Test that the default predicate correctly parses chosen error code. */
+  public void testDefaultRetryPredicate(RestClient restClient) throws 
IOException {
+assertFalse(DEFAULT_RETRY_PREDICATE.test(new IOException("test")));
+String x =
+"{ \"index\" : { \"_index\" : \"test\", \"_type\" : \"doc\", \"_id\" : 
\"1\" } }\n"
++ "{ \"field1\" : @ }\n";
+HttpEntity entity = new NStringEntity(x, ContentType.APPLICATION_JSON);
+
+Response response = restClient.performRequest("POST", "/_bulk", 
Collections.emptyMap(), entity);
+assertTrue(CUSTOM_RETRY_PREDICATE.test(new ResponseException(response)));
+  }
+
+  /**
+   * Test that retries are invoked when Elasticsearch returns a specific error 
code. We invoke this
+   * by issuing corrupt data and retrying on the `400` error code. Normal 
behaviour is to retry on
+   * `429` only but that is difficult to simulate reliably. The logger is used 
to verify expected
 
 Review comment:
   Also test that the retry loop stops before the backoff is exhausted once 
the 429 condition is gone (400 in your test case).




Issue Time Tracking
---

Worklog Id: (was: 135658)
Time Spent: 7h 50m  (was: 7h 40m)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135665&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135665
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210848274
 
 

 ##
 File path: 
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
 ##
 @@ -893,6 +1013,18 @@ public PDone expand(PCollection input) {
   private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
   private static final int DEFAULT_RETRY_ON_CONFLICT = 5; // race 
conditions on updates
 
+  private static final Duration RETRY_INITIAL_BACKOFF = Duration.standardSeconds(5);
+  private static final Duration RETRY_MAX_BACKOFF = Duration.standardDays(365);
 
 Review comment:
   Way too long: this value will never be hit, otherwise we would retry for a 
whole year. The maximum number of attempts will always be reached first. Put a 
more reasonable value, otherwise this parameter is useless.
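To put numbers on this comment: with exponential growth from a 5-second initial backoff, even ten attempts never come close to a 365-day cap, so the cap never binds. A sketch with illustrative values:

```java
public class BackoffCap {
  public static void main(String[] args) {
    long initialMs = 5_000;                  // mirrors RETRY_INITIAL_BACKOFF = 5s
    long capMs = 365L * 24 * 60 * 60 * 1000; // mirrors RETRY_MAX_BACKOFF = 365 days
    long totalMs = 0;
    for (int attempt = 0; attempt < 10; attempt++) {
      // Capped exponential backoff: 5s, 10s, 20s, ... doubling each attempt.
      long backoffMs = Math.min(capMs, initialMs << attempt);
      totalMs += backoffMs;
    }
    // After 10 attempts the largest single pause is ~42m40s and the total ~85m,
    // nowhere near the 365-day cap, so the cap never takes effect.
    System.out.println("max single backoff ms = " + (initialMs << 9) + ", total ms = " + totalMs);
  }
}
```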




Issue Time Tracking
---

Worklog Id: (was: 135665)
Time Spent: 8h 50m  (was: 8h 40m)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135666&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135666
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210876306
 
 

 ##
 File path: 
sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIOTestCommon.java
 ##
 @@ -512,4 +527,44 @@ public Void apply(Iterable input) {
   return null;
 }
   }
+
+  /** Test that the default predicate correctly parses chosen error code. */
+  public void testDefaultRetryPredicate(RestClient restClient) throws 
IOException {
+assertFalse(DEFAULT_RETRY_PREDICATE.test(new IOException("test")));
+String x =
+"{ \"index\" : { \"_index\" : \"test\", \"_type\" : \"doc\", \"_id\" : 
\"1\" } }\n"
++ "{ \"field1\" : @ }\n";
+HttpEntity entity = new NStringEntity(x, ContentType.APPLICATION_JSON);
+
+Response response = restClient.performRequest("POST", "/_bulk", 
Collections.emptyMap(), entity);
+assertTrue(CUSTOM_RETRY_PREDICATE.test(new ResponseException(response)));
+  }
+
+  /**
+   * Test that retries are invoked when Elasticsearch returns a specific error 
code. We invoke this
+   * by issuing corrupt data and retrying on the `400` error code. Normal 
behaviour is to retry on
+   * `429` only but that is difficult to simulate reliably. The logger is used 
to verify expected
+   * behavior.
+   */
+  public void testWriteRetry() throws Throwable {
+expectedException.expect(IOException.class);
+expectedException.expectMessage(
+String.format(ElasticsearchIO.Write.WriteFn.RETRY_FAILED_LOG, 2));
 
 Review comment:
   Why is this 2? Is it 3 - 1? Please use constants.




Issue Time Tracking
---

Worklog Id: (was: 135666)
Time Spent: 8h 40m  (was: 8h 50m)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135659&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135659
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210838651
 
 

 ##
 File path: 
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
 ##
 @@ -879,6 +972,33 @@ public Write withUsePartialUpdate(boolean 
usePartialUpdate) {
   return builder().setUsePartialUpdate(usePartialUpdate).build();
 }
 
+/**
+ * Provides configuration to retry a failed batch call to Elastic Search. 
A batch is considered
 
 Review comment:
   Elasticsearch




Issue Time Tracking
---

Worklog Id: (was: 135659)
Time Spent: 8h  (was: 7h 50m)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135661&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135661
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210839163
 
 

 ##
 File path: 
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
 ##
 @@ -879,6 +972,33 @@ public Write withUsePartialUpdate(boolean 
usePartialUpdate) {
   return builder().setUsePartialUpdate(usePartialUpdate).build();
 }
 
+/**
+ * Provides configuration to retry a failed batch call to Elastic Search. 
A batch is considered
+ * as failed if the underlying {@link RestClient} surfaces 429 HTTP status 
code as error for one
+ * or more of the items in the {@link Response}. Users should consider 
that retrying might
+ * compound the underlying problem which caused the initial failure. Users 
should also be aware
+ * that once retrying is exhausted the error is surfaced to the runner 
which may then
+ * opt to retry the current partition in entirety or abort if the max 
number of retries of the
 
 Review comment:
   partition => bundle




Issue Time Tracking
---

Worklog Id: (was: 135661)
Time Spent: 8h 20m  (was: 8h 10m)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135663&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135663
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210867546
 
 

 ##
 File path: 
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
 ##
 @@ -1025,11 +1172,40 @@ private void flushBatch() throws IOException {
 HttpEntity requestBody =
 new NStringEntity(bulkRequest.toString(), 
ContentType.APPLICATION_JSON);
 response = restClient.performRequest("POST", endPoint, 
Collections.emptyMap(), requestBody);
+if (spec.getRetryConfiguration() != null
+&& spec.getRetryConfiguration()
+.getRetryPredicate()
+.test(new ResponseException(response))) {
+  response = handleRetry("POST", endPoint, Collections.emptyMap(), 
requestBody);
+}
 checkForErrors(response, backendVersion);
   }
 
+  /** retry request based on retry configuration policy. */
+  private Response handleRetry(
+  String method, String endpoint, Map<String, String> params, HttpEntity requestBody)
+  throws IOException, InterruptedException {
+Response response = null;
+Sleeper sleeper = Sleeper.DEFAULT;
+BackOff backoff = retryBackoff.backoff();
+int attempt = 0;
+//while retry policy exists
+while (BackOffUtils.next(sleeper, backoff)) {
+  LOG.warn(String.format(RETRY_ATTEMPT_LOG, ++attempt));
+  response = restClient.performRequest(method, endpoint, params, 
requestBody);
+  if (spec.getRetryConfiguration()
+  .getRetryPredicate()
+  .test(new ResponseException(response))) {
+continue;
 
 Review comment:
   I think the condition is the other way around: with this code no matter the 
value of predicate.test() you will always go until the end of the loop. Should 
it be `if (! ... test()) {break;}` ?
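The reviewer's point can be checked with a small stand-alone simulation (hypothetical names, no Beam or Elasticsearch dependencies): breaking out as soon as the predicate stops matching is what lets a successful retry end the loop early instead of consuming every remaining attempt.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.IntPredicate;

public class RetryLoopSketch {
  /** Issues a request, then retries while {@code shouldRetry} matches, up to {@code maxRetries} times. */
  static int performWithRetry(Iterator<Integer> statuses, IntPredicate shouldRetry, int maxRetries) {
    int status = statuses.next(); // initial request
    for (int attempt = 1; attempt <= maxRetries; attempt++) {
      // The suggested fix: exit once a response no longer warrants a retry,
      // instead of `continue`-ing through the remaining attempts.
      if (!shouldRetry.test(status)) {
        break;
      }
      status = statuses.next(); // issue the retry
    }
    return status;
  }

  public static void main(String[] args) {
    // Two overloaded responses (HTTP 429) followed by a success (200):
    // the loop returns after the success rather than burning all 10 attempts.
    int result = performWithRetry(List.of(429, 429, 200).iterator(), s -> s == 429, 10);
    System.out.println(result); // 200
  }
}
```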




Issue Time Tracking
---

Worklog Id: (was: 135663)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135667&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135667
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210861902
 
 

 ##
 File path: 
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
 ##
 @@ -1025,11 +1172,40 @@ private void flushBatch() throws IOException {
 HttpEntity requestBody =
 new NStringEntity(bulkRequest.toString(), 
ContentType.APPLICATION_JSON);
 response = restClient.performRequest("POST", endPoint, 
Collections.emptyMap(), requestBody);
+if (spec.getRetryConfiguration() != null
+&& spec.getRetryConfiguration()
+.getRetryPredicate()
+.test(new ResponseException(response))) {
+  response = handleRetry("POST", endPoint, Collections.emptyMap(), 
requestBody);
+}
 checkForErrors(response, backendVersion);
   }
 
+  /** retry request based on retry configuration policy. */
+  private Response handleRetry(
+  String method, String endpoint, Map<String, String> params, HttpEntity requestBody)
+  throws IOException, InterruptedException {
+Response response = null;
 
 Review comment:
   redundant




Issue Time Tracking
---

Worklog Id: (was: 135667)
Time Spent: 8h 50m  (was: 8h 40m)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135657&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135657
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:05
Start Date: 17/Aug/18 11:05
Worklog Time Spent: 10m 
  Work Description: echauchot commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210838378
 
 

 ##
 File path: 
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
 ##
 @@ -731,6 +744,81 @@ public void close() throws IOException {
   return source;
 }
   }
+  /**
+   * A POJO encapsulating a configuration for retry behavior when issuing 
requests to ES. A retry
+   * will be attempted until the maxAttempts or maxDuration is exceeded, 
whichever comes first, for
+   * 429 TOO_MANY_REQUESTS error.
+   */
+  public static class RetryConfiguration extends BaseRetryConfiguration {
+
+private RetryConfiguration(
+int maxAttempts, Duration maxDuration, RetryPredicate retryPredicate) {
+  super(maxAttempts, maxDuration, retryPredicate);
+}
+
+/**
+ * Creates RetryConfiguration for {@link ElasticsearchIO} with provided 
maxAttempts,
+ * maxDurations and exponential backoff based retries.
+ */
+public static RetryConfiguration create(int maxAttempts, Duration 
maxDuration) {
+  checkArgument(maxAttempts > 0, "maxAttempts must be greater than 0");
+  checkArgument(
+  maxDuration != null && maxDuration.isLongerThan(Duration.ZERO),
+  "maxDuration must be greater than 0");
+  return new RetryConfiguration(maxAttempts, maxDuration, 
DEFAULT_RETRY_PREDICATE);
+}
+
+@VisibleForTesting
+RetryConfiguration withRetryPredicate(RetryPredicate predicate) {
+  this.retryPredicate = predicate;
+  return this;
+}
+
+@VisibleForTesting
+static final RetryPredicate DEFAULT_RETRY_PREDICATE = new 
DefaultRetryPredicate();
+
+/**
+ * This is the default predicate used to test if a failed ES operation 
should be retried. A
+ * retry will be attempted until the maxAttempts or maxDuration is 
exceeded, whichever comes
+ * first, for TOO_MANY_REQUESTS(429) error.
+ */
+@VisibleForTesting
+static class DefaultRetryPredicate implements RetryPredicate {
+
+  private int errorCode;
+
+  DefaultRetryPredicate(int code) {
+this.errorCode = code;
+  }
+
+  DefaultRetryPredicate() {
+this(429);
+  }
+
+  /** Returns true if the response has the error code for any mutation. */
+  private static boolean errorCodePresent(Response response, int 
errorCode) {
+try {
+  JsonNode json = parseResponse(response);
+  if (json.path("errors").asBoolean()) {
+for (JsonNode item : json.path("items")) {
+  if (item.findValue("status").asInt() == errorCode) {
+return true;
+  }
+}
+  }
+} catch (IOException e) {
+  LOG.warn("Could not extract error codes from response {}", response);
+}
+return false;
+  }
+
+  @Override
+  public boolean test(Throwable t) {
+return (t instanceof ResponseException)
+&& errorCodePresent(((ResponseException) t).getResponse(), 
this.errorCode);
 
 Review comment:
   nit: remove `this.` as it is unambiguous.
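Stripped of the Jackson `JsonNode` traversal, the decision made by the quoted `DefaultRetryPredicate` reduces to: a bulk response is retryable if its top-level `errors` flag is set and any item carries the configured status code (429 by default). A dependency-free sketch of that logic, with hypothetical names:

```java
import java.util.List;

public class ErrorCodeScanSketch {
  /** True if the bulk response flags errors and any item reports the given status code. */
  static boolean errorCodePresent(boolean errorsFlag, List<Integer> itemStatuses, int errorCode) {
    if (!errorsFlag) {
      return false; // the top-level "errors" flag short-circuits the item scan
    }
    return itemStatuses.stream().anyMatch(status -> status == errorCode);
  }

  public static void main(String[] args) {
    System.out.println(errorCodePresent(true, List.of(200, 429, 201), 429)); // true
    System.out.println(errorCodePresent(true, List.of(200, 201), 429));     // false
    System.out.println(errorCodePresent(false, List.of(429), 429));         // false
  }
}
```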




Issue Time Tracking
---

Worklog Id: (was: 135657)
Time Spent: 7h 40m  (was: 7.5h)


[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135672&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135672
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:28
Start Date: 17/Aug/18 11:28
Worklog Time Spent: 10m 
  Work Description: aalbatross commented on issue #6146: [BEAM-3026] Adding 
retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#issuecomment-413836816
 
 
   Thank you @echauchot for your comments, I will incorporate them and come up 
with an AutoValue Builder solution for ESIO.




Issue Time Tracking
---

Worklog Id: (was: 135672)
Time Spent: 9h 10m  (was: 9h)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135673&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135673
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:39
Start Date: 17/Aug/18 11:39
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#6146: [BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210882987
 
 

 ##
 File path: 
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
 ##
 @@ -893,6 +1013,18 @@ public PDone expand(PCollection<String> input) {
   private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
   private static final int DEFAULT_RETRY_ON_CONFLICT = 5; // race 
conditions on updates
 
+  private static final Duration RETRY_INITIAL_BACKOFF = 
Duration.standardSeconds(5);
+  private static final Duration RETRY_MAX_BACKOFF = 
Duration.standardDays(365);
 
 Review comment:
   Is not needed - see SolrIO:
   
   
https://github.com/apache/beam/blob/master/sdks/java/io/solr/src/main/java/org/apache/beam/sdk/io/solr/SolrIO.java#L610




Issue Time Tracking
---

Worklog Id: (was: 135673)
Time Spent: 9h 20m  (was: 9h 10m)



[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135676&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135676
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 11:48
Start Date: 17/Aug/18 11:48
Worklog Time Spent: 10m 
  Work Description: timrobertson100 commented on a change in pull request 
#6146: [BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210884836
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/common/retry/BaseRetryConfiguration.java
 ##
 @@ -0,0 +1,52 @@
+/*
 
 Review comment:
   @echauchot 
   Moving withRetryPredicate to the BaseRetryConfiguration will expose it in 
the API. Currently it is package private, visibleForTesting which was by design 
in SolrIO.




Issue Time Tracking
---

Worklog Id: (was: 135676)
Time Spent: 9.5h  (was: 9h 20m)



Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1278

2018-08-17 Thread Apache Jenkins Server
See 


--
[...truncated 20.88 MB...]
INFO: 2018-08-17T13:57:10.967Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/View.AsList/ParDo(ToIsmRecordForGlobalWindow) into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Flatten.Iterables/FlattenIterables/FlatMap
Aug 17, 2018 1:57:12 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:57:11.004Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForKeys/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 17, 2018 1:57:12 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:57:11.051Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Flatten.Iterables/FlattenIterables/FlatMap into 
SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Values/Values/Map
Aug 17, 2018 1:57:12 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:57:11.096Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Values/Values/Map into 
SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues/Extract
Aug 17, 2018 1:57:12 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:57:11.127Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/ParDo(ToIsmMetadataRecordForKey) 
into SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForKeys/Read
Aug 17, 2018 1:57:12 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:57:11.165Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues/Extract
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 
view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues
Aug 17, 2018 1:57:12 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:57:11.201Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/WithKeys/AddKeys/Map into 
SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/ParDo(CollectWindows)
Aug 17, 2018 1:57:12 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:57:11.247Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 
view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Read
Aug 17, 2018 1:57:12 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:57:11.285Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Write mutations to Spanner into 
SpannerIO.Write/Write mutations to Cloud Spanner/Batch mutations together
Aug 17, 2018 1:57:12 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:57:11.330Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Reify
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 
view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey+SpannerIO.Write/Write
 mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues/Partial
Aug 17, 2018 1:57:12 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:57:11.372Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey+SpannerIO.Write/Write
 mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues/Partial
 into SpannerIO.Write/Write mutations to 

Jenkins build is back to normal : beam_PostCommit_Python_Verify #5770

2018-08-17 Thread Apache Jenkins Server
See 




[beam] branch master updated: [BEAM-4130] Build Docker image for Flink's JobServer

2018-08-17 Thread mxm
This is an automated email from the ASF dual-hosted git repository.

mxm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new 102a972  [BEAM-4130] Build Docker image for Flink's JobServer
 new d44a54d  Merge pull request #6238 from mxm/job-server-startup
102a972 is described below

commit 102a97293af08db5d639406879c87c4f3a6ddc40
Author: Maximilian Michels 
AuthorDate: Thu Aug 16 14:55:56 2018 +0200

[BEAM-4130] Build Docker image for Flink's JobServer

This adds a new Gradle module flink-job-server-container which builds a 
docker
image during the `docker` task. The image contains the FlinkJobServerDriver
which is the entry point for submitting Beam pipelines to the cluster.

The image can then be used to spawn a JobServer container when executing a 
Beam
pipeline, i.e. `p.run()`. The SDKs (Java/Python/Go) need to be updated to 
either
spawn up a container or use the address of a remote JobServer.
---
 runners/flink/job-server-container/Dockerfile  | 26 +++
 runners/flink/job-server-container/build.gradle| 54 ++
 .../flink/job-server-container/flink-job-server.sh | 29 
 settings.gradle|  2 +
 4 files changed, 111 insertions(+)

diff --git a/runners/flink/job-server-container/Dockerfile 
b/runners/flink/job-server-container/Dockerfile
new file mode 100644
index 000..a9aff21
--- /dev/null
+++ b/runners/flink/job-server-container/Dockerfile
@@ -0,0 +1,26 @@
+###
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+###
+
+FROM openjdk:8
+MAINTAINER "Apache Beam "
+
+ADD target/beam-runners-flink_2.11-job-server.jar /opt/apache/beam/jars/
+ADD target/flink-job-server.sh /opt/apache/beam/
+
+WORKDIR /opt/apache/beam
+ENTRYPOINT ["./flink-job-server.sh"]
diff --git a/runners/flink/job-server-container/build.gradle 
b/runners/flink/job-server-container/build.gradle
new file mode 100644
index 000..4d5f533
--- /dev/null
+++ b/runners/flink/job-server-container/build.gradle
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Build a Docker image to bootstrap FlinkJobServerDriver which requires a 
Java environment.
+ * Alternatively, it can also be bootstrapped through 
:beam-runners-flink_2.11-job-server:runShadow
+ * or by directly running the generated JAR file.
+ */
+
+apply plugin: org.apache.beam.gradle.BeamModulePlugin
+applyDockerNature()
+
+description = "Apache Beam :: Runners :: Flink :: Job Server :: Container"
+
+configurations {
+  dockerDependency
+}
+
+dependencies {
+  dockerDependency project(path: ":beam-runners-flink_2.11-job-server", 
configuration: "shadow")
+}
+
+task copyDockerfileDependencies(type: Copy) {
+  // Required Jars
+  from configurations.dockerDependency
+  rename 'beam-runners-flink_2.11-job-server.*.jar', 
'beam-runners-flink_2.11-job-server.jar'
+  into "build/target"
+  // Entry script
+  from file("./flink-job-server.sh")
+  into "build/target"
+}
+
+docker {
+  name containerImageName(name: "flink-job-server")
+  files "./build/"
+}
+
+// Ensure that we build the required resources and copy and file dependencies 
from related projects

Build failed in Jenkins: beam_PostCommit_Python_Verify #5769

2018-08-17 Thread Apache Jenkins Server
See 


--
[...truncated 1.04 MB...]
test_compatibility (apache_beam.typehints.typehints_test.IterableHintTestCase) 
... ok
test_getitem_invalid_composite_type_param 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_repr (apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_tuple_compatibility 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_must_be_iterable 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_invalid_composite_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_invalid_simple_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_valid_composite_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_valid_simple_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_enforce_kv_type_constraint 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_param_must_be_tuple 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_param_must_have_length_2 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_proxy_to_tuple 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_enforce_list_type_constraint_invalid_composite_type 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_enforce_list_type_constraint_invalid_simple_type 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_enforce_list_type_constraint_valid_composite_type 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_enforce_list_type_constraint_valid_simple_type 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_getitem_invalid_composite_type_param 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_list_constraint_compatibility 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_list_repr (apache_beam.typehints.typehints_test.ListHintTestCase) ... ok
test_getitem_proxy_to_union 
(apache_beam.typehints.typehints_test.OptionalHintTestCase) ... ok
test_getitem_sequence_not_allowed 
(apache_beam.typehints.typehints_test.OptionalHintTestCase) ... ok
test_any_return_type_hint 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_must_be_primitive_type_or_type_constraint 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_must_be_single_return_type 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_no_kwargs_accepted 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_type_check_composite_type 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_type_check_simple_type 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_type_check_violation 
(apache_beam.typehints.typehints_test.ReturnsDecoratorTestCase) ... ok
test_compatibility (apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_getitem_invalid_composite_type_param 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_repr (apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_type_check_invalid_elem_type 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_type_check_must_be_set 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_type_check_valid_elem_composite_type 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_type_check_valid_elem_simple_type 
(apache_beam.typehints.typehints_test.SetHintTestCase) ... ok
test_any_argument_type_hint 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_basic_type_assertion 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_composite_type_assertion 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_invalid_only_positional_arguments 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_must_be_primitive_type_or_constraint 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_valid_mix_positional_and_keyword_arguments 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_valid_only_positional_arguments 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_valid_simple_type_arguments 
(apache_beam.typehints.typehints_test.TakesDecoratorTestCase) ... ok
test_functions_as_regular_generator 
(apache_beam.typehints.typehints_test.TestGeneratorWrapper) ... ok
test_compatibility (apache_beam.typehints.typehints_test.TupleHintTestCase) ... 
ok
test_compatibility_arbitrary_length 
(apache_beam.typehints.typehints_test.TupleHintTestCase) ... ok
test_getitem_invalid_ellipsis_type_param 

[jira] [Created] (BEAM-5162) Document Metrics API for users

2018-08-17 Thread Maximilian Michels (JIRA)
Maximilian Michels created BEAM-5162:


 Summary: Document Metrics API for users
 Key: BEAM-5162
 URL: https://issues.apache.org/jira/browse/BEAM-5162
 Project: Beam
  Issue Type: Task
  Components: sdk-java-core, website
Reporter: Maximilian Michels
Assignee: Kenneth Knowles


The Metrics API is currently documented only in Beam's JavaDocs. Complementary 
user documentation with examples would be desirable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PreCommit_Java_Cron #237

2018-08-17 Thread Apache Jenkins Server
See 


--
[...truncated 17.50 MB...]
INFO: 2018-08-17T13:32:08.370Z: Fusing unzipped copy of 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/GatherTempFileResults/Reshuffle/Window.Into()/Window.Assign,
 through flatten s13-u58, into producer 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/GatherTempFileResults/Add void 
key/AddKeys/Map
Aug 17, 2018 1:32:19 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:32:08.498Z: Fusing consumer 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Pair
 with random key into 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/FinalizeTempFileBundles/Finalize
Aug 17, 2018 1:32:19 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:32:08.531Z: Fusing consumer 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Reshuffle/ExpandIterable
 into 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Reshuffle/GroupByKey/GroupByWindow
Aug 17, 2018 1:32:19 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:32:08.566Z: Fusing consumer 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Values/Values/Map
 into 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Reshuffle/ExpandIterable
Aug 17, 2018 1:32:19 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:32:08.672Z: Fusing consumer 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/GatherTempFileResults/Reshuffle.ViaRandomKey/Reshuffle/ExpandIterable
 into 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/GatherTempFileResults/Reshuffle.ViaRandomKey/Reshuffle/GroupByKey/GroupByWindow
Aug 17, 2018 1:32:19 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:32:08.723Z: Fusing consumer 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Reshuffle/Window.Into()/Window.Assign
 into 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Pair
 with random key
Aug 17, 2018 1:32:19 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:32:08.832Z: Fusing consumer 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Reshuffle/GroupByKey/Write
 into 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Reshuffle/GroupByKey/Reify
Aug 17, 2018 1:32:19 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:32:08.878Z: Fusing consumer 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Reshuffle/GroupByKey/Reify
 into 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Reshuffle/Window.Into()/Window.Assign
Aug 17, 2018 1:32:19 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:32:08.960Z: Fusing consumer 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/GatherTempFileResults/Reshuffle.ViaRandomKey/Values/Values/Map
 into 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/GatherTempFileResults/Reshuffle.ViaRandomKey/Reshuffle/ExpandIterable
Aug 17, 2018 1:32:19 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:32:09.010Z: Fusing consumer 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/FinalizeTempFileBundles/Finalize 
into 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/GatherTempFileResults/Reshuffle.ViaRandomKey/Values/Values/Map
Aug 17, 2018 1:32:19 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:32:09.055Z: Fusing consumer 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/GatherTempFileResults/Add void 
key/AddKeys/Map into 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/WriteUnshardedBundlesToTempFiles/WriteUnshardedBundles
Aug 17, 2018 1:32:19 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:32:09.199Z: Fusing consumer 
WriteOneFilePerWindow/TextIO.Write/WriteFiles/GatherTempFileResults/Reshuffle/Window.Into()/Window.Assign
 into WriteOneFilePerWindow/TextIO.Write/WriteFiles/GatherTempFileResults/Add 
void key/AddKeys/Map
Aug 17, 2018 1:32:19 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T13:32:09.247Z: Fusing consumer 

[jira] [Work logged] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4130?focusedWorklogId=135710=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135710
 ]

ASF GitHub Bot logged work on BEAM-4130:


Author: ASF GitHub Bot
Created on: 17/Aug/18 14:38
Start Date: 17/Aug/18 14:38
Worklog Time Spent: 10m 
  Work Description: asfgit closed pull request #6238: [BEAM-4130] Build 
Docker image for Flink's JobServer
URL: https://github.com/apache/beam/pull/6238
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/runners/flink/job-server-container/Dockerfile 
b/runners/flink/job-server-container/Dockerfile
new file mode 100644
index 000..a9aff21b6d6
--- /dev/null
+++ b/runners/flink/job-server-container/Dockerfile
@@ -0,0 +1,26 @@
+###
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+###
+
+FROM openjdk:8
+MAINTAINER "Apache Beam "
+
+ADD target/beam-runners-flink_2.11-job-server.jar /opt/apache/beam/jars/
+ADD target/flink-job-server.sh /opt/apache/beam/
+
+WORKDIR /opt/apache/beam
+ENTRYPOINT ["./flink-job-server.sh"]
diff --git a/runners/flink/job-server-container/build.gradle 
b/runners/flink/job-server-container/build.gradle
new file mode 100644
index 000..4d5f53316e3
--- /dev/null
+++ b/runners/flink/job-server-container/build.gradle
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Build a Docker image to bootstrap FlinkJobServerDriver which requires a 
Java environment.
+ * Alternatively, it can also be bootstrapped through 
:beam-runners-flink_2.11-job-server:runShadow
+ * or by directly running the generated JAR file.
+ */
+
+apply plugin: org.apache.beam.gradle.BeamModulePlugin
+applyDockerNature()
+
+description = "Apache Beam :: Runners :: Flink :: Job Server :: Container"
+
+configurations {
+  dockerDependency
+}
+
+dependencies {
+  dockerDependency project(path: ":beam-runners-flink_2.11-job-server", 
configuration: "shadow")
+}
+
+task copyDockerfileDependencies(type: Copy) {
+  // Required Jars
+  from configurations.dockerDependency
+  rename 'beam-runners-flink_2.11-job-server.*.jar', 
'beam-runners-flink_2.11-job-server.jar'
+  into "build/target"
+  // Entry script
+  from file("./flink-job-server.sh")
+  into "build/target"
+}
+
+docker {
+  name containerImageName(name: "flink-job-server")
+  files "./build/"
+}
+
+// Ensure that we build the required resources and copy any file dependencies 
from related projects
+dockerPrepare.dependsOn copyDockerfileDependencies
diff --git a/runners/flink/job-server-container/flink-job-server.sh 
b/runners/flink/job-server-container/flink-job-server.sh
new file mode 100755
index 000..399e5e4c64a
--- /dev/null
+++ b/runners/flink/job-server-container/flink-job-server.sh
@@ -0,0 +1,29 @@
+#!/bin/sh
+###
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with 

Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1279

2018-08-17 Thread Apache Jenkins Server
See 


Changes:

[mxm] [BEAM-4130] Build Docker image for Flink's JobServer

--
[...truncated 20.07 MB...]
Aug 17, 2018 3:20:21 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T15:20:08.301Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample keys/GroupByKey/Reify
Aug 17, 2018 3:20:21 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T15:20:08.339Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/Combine.GroupedValues/Extract into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues
Aug 17, 2018 3:20:21 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T15:20:08.381Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey/Reify into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/GroupByKey+SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues/Partial
Aug 17, 2018 3:20:21 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T15:20:08.418Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey+SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/Combine.GroupedValues/Partial into 
SpannerIO.Write/Write mutations to Cloud Spanner/Extract keys
Aug 17, 2018 3:20:21 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T15:20:08.473Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Write
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 
view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Reify
Aug 17, 2018 3:20:21 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T15:20:08.533Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForSize/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 17, 2018 3:20:21 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T15:20:08.580Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/View.AsList/ParDo(ToIsmRecordForGlobalWindow) into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Flatten.Iterables/FlattenIterables/FlatMap
Aug 17, 2018 3:20:21 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T15:20:08.642Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForKeys/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 17, 2018 3:20:21 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T15:20:08.690Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Flatten.Iterables/FlattenIterables/FlatMap into 
SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Values/Values/Map
Aug 17, 2018 3:20:21 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T15:20:08.748Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Values/Values/Map into 
SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues/Extract
Aug 17, 2018 3:20:21 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T15:20:08.804Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/ParDo(ToIsmMetadataRecordForKey) 
into SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForKeys/Read
Aug 17, 2018 3:20:21 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T15:20:08.861Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues/Extract
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 

Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle #865

2018-08-17 Thread Apache Jenkins Server
See 


--
[...truncated 19.02 MB...]
INFO: Adding 
View.AsSingleton/Combine.GloballyAsSingletonView/CreateDataflowView as step s9
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding Create123/Read(CreateSource) as step s10
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding OutputSideInputs as step s11
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Window.Into()/Window.Assign as step 
s12
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/GatherAllOutputs/Reify.Window/ParDo(Anonymous) as step 
s13
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/WithKeys/AddKeys/Map 
as step s14
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/GatherAllOutputs/Window.Into()/Window.Assign as step 
s15
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/GroupByKey as step 
s16
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GatherAllOutputs/Values/Values/Map as 
step s17
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/RewindowActuals/Window.Assign as step 
s18
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/KeyForDummy/AddKeys/Map as step s19
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveActualsTriggering/Flatten.PCollections as step 
s20
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Create.Values/Read(CreateSource) as 
step s21
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/WindowIntoDummy/Window.Assign as step 
s22
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding 
PAssert$33/GroupGlobally/RemoveDummyTriggering/Flatten.PCollections as step s23
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/FlattenDummyAndContents as step s24
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/NeverTrigger/Flatten.PCollections as 
step s25
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/GroupDummyAndContents as step s26
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/Values/Values/Map as step s27
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GroupGlobally/ParDo(Concat) as step s28
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/GetPane/Map as step s29
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/RunChecks as step s30
Aug 17, 2018 3:19:45 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding PAssert$33/VerifyAssertions/ParDo(DefaultConclude) as step s31
Aug 17, 2018 3:19:45 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Staging pipeline description to 
gs://temp-storage-for-validates-runner-tests//viewtest0testsingletonsideinput-jenkins-0817151925-b79738cc/output/results/staging/
Aug 17, 2018 3:19:45 PM org.apache.beam.runners.dataflow.util.PackageUtil 
tryStagePackage
INFO: Uploading <71190 bytes, hash Zo_7AKz-qoWVDEJTCNVW-g> to 
gs://temp-storage-for-validates-runner-tests//viewtest0testsingletonsideinput-jenkins-0817151925-b79738cc/output/results/staging/pipeline-Zo_7AKz-qoWVDEJTCNVW-g.pb


[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135717=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135717
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 15:10
Start Date: 17/Aug/18 15:10
Worklog Time Spent: 10m 
  Work Description: aalbatross commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210941303
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/common/retry/BaseRetryConfiguration.java
 ##
 @@ -0,0 +1,52 @@
+/*
 
 Review comment:
   @timrobertson100 @echauchot I agree that moving withRetryPredicate to 
BaseRetryConfiguration will expose the API; the only way I can avoid that is by 
overriding the method with the protected final keyword in ESIO or SolrIO. This 
will lead to duplicate code in all the implementations that use the common 
configuration classes.
   
   I think it is better to have duplicates with an AutoValue Builder in ESIO and 
SolrIO instead of using common config objects. I will remove the common code.
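The per-IO configuration approach discussed above could look roughly like this hand-written stand-in for an AutoValue class; every name here is hypothetical and not the actual ElasticsearchIO or SolrIO API, and the predicate-on-HTTP-429 default is only an example:

```java
import java.io.Serializable;
import java.time.Duration;
import java.util.function.Predicate;

/**
 * Illustrative stand-in for a per-IO retry configuration: the retry predicate
 * stays package-private to the IO instead of being exposed on a shared base
 * class. In practice this would be generated with an AutoValue Builder.
 */
public class RetryConfigurationSketch implements Serializable {
  private final int maxAttempts;
  private final Duration maxDuration;
  private final Predicate<Integer> retryPredicate;

  private RetryConfigurationSketch(
      int maxAttempts, Duration maxDuration, Predicate<Integer> retryPredicate) {
    this.maxAttempts = maxAttempts;
    this.maxDuration = maxDuration;
    this.retryPredicate = retryPredicate;
  }

  /** Public factory; defaults to retrying on HTTP 429 (example only). */
  public static RetryConfigurationSketch create(int maxAttempts, Duration maxDuration) {
    return new RetryConfigurationSketch(maxAttempts, maxDuration, status -> status == 429);
  }

  /** Kept package-private in spirit: only the IO itself swaps the predicate. */
  RetryConfigurationSketch withRetryPredicate(Predicate<Integer> predicate) {
    return new RetryConfigurationSketch(maxAttempts, maxDuration, predicate);
  }

  /** Retry while attempts remain and the status code is considered retryable. */
  public boolean shouldRetry(int attempt, int statusCode) {
    return attempt < maxAttempts && retryPredicate.test(statusCode);
  }

  public static void main(String[] args) {
    RetryConfigurationSketch config =
        RetryConfigurationSketch.create(3, Duration.ofMinutes(1));
    System.out.println(config.shouldRetry(1, 429)); // true
    System.out.println(config.shouldRetry(3, 429)); // false: attempts exhausted
  }
}
```

Duplicating such a class per IO trades some repetition for keeping each IO's builder surface self-contained, which is the trade-off weighed in the comment above.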
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135717)
Time Spent: 9h 40m  (was: 9.5h)

> Improve retrying in ElasticSearch client
> 
>
> Key: BEAM-3026
> URL: https://issues.apache.org/jira/browse/BEAM-3026
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-elasticsearch
>Reporter: Tim Robertson
>Assignee: Ravi Pathak
>Priority: Major
> Fix For: 2.7.0
>
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> Currently an overloaded ES server will result in clients failing fast.
> I suggest implementing backoff pauses.  Perhaps something like this:
> {code}
> ElasticsearchIO.ConnectionConfiguration conn = 
> ElasticsearchIO.ConnectionConfiguration
>   .create(new String[]{"http://...:9200"}, "test", "test")
>   .retryWithWaitStrategy(WaitStrategies.exponentialBackoff(1000, 
> TimeUnit.MILLISECONDS))
>   .retryWithStopStrategy(StopStrategies.stopAfterAttempt(10));
> {code}
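The wait/stop semantics proposed above can be sketched independently of any IO; this is a minimal illustration in which all names (BackoffSketch and its methods) are made up and are not the ElasticsearchIO or guava-retrying API:

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal sketch of exponential backoff plus a stop-after-N-attempts strategy. */
public class BackoffSketch {

  /** Wait in ms before the given retry attempt (1-based): base * 2^(attempt-1). */
  static long exponentialBackoffMillis(long baseMillis, int attempt) {
    return baseMillis * (1L << (attempt - 1));
  }

  /** Full wait schedule implied by a stopAfterAttempt(maxAttempts)-style stop strategy. */
  static List<Long> schedule(long baseMillis, int maxAttempts) {
    List<Long> waits = new ArrayList<>();
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      waits.add(exponentialBackoffMillis(baseMillis, attempt));
    }
    return waits;
  }

  public static void main(String[] args) {
    // Base of 1000 ms, stop after 4 attempts -> waits of 1000, 2000, 4000, 8000 ms.
    System.out.println(schedule(1000, 4)); // prints [1000, 2000, 4000, 8000]
  }
}
```

In a real client the loop would sleep for each wait in turn and re-issue the request until the stop strategy fires.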



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135718=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135718
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 15:10
Start Date: 17/Aug/18 15:10
Worklog Time Spent: 10m 
  Work Description: aalbatross commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r210941303
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/io/common/retry/BaseRetryConfiguration.java
 ##
 @@ -0,0 +1,52 @@
+/*
 
 Review comment:
   @timrobertson100 @echauchot I agree that moving withRetryPredicate to 
BaseRetryConfiguration will expose the API; the only way I can avoid it is by 
overriding the method with the protected final keyword in ESIO or SolrIO. This 
will lead to duplicate code in all the implementations that use the common 
configuration classes.
   
   I think it is better to have duplicates with an AutoValue Builder in ESIO and 
SolrIO instead of using common config objects. I will remove the common code.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135718)
Time Spent: 9h 50m  (was: 9h 40m)

> Improve retrying in ElasticSearch client
> 
>
> Key: BEAM-3026
> URL: https://issues.apache.org/jira/browse/BEAM-3026
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-elasticsearch
>Reporter: Tim Robertson
>Assignee: Ravi Pathak
>Priority: Major
> Fix For: 2.7.0
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> Currently an overloaded ES server will result in clients failing fast.
> I suggest implementing backoff pauses.  Perhaps something like this:
> {code}
> ElasticsearchIO.ConnectionConfiguration conn = 
> ElasticsearchIO.ConnectionConfiguration
>   .create(new String[]{"http://...:9200"}, "test", "test")
>   .retryWithWaitStrategy(WaitStrategies.exponentialBackoff(1000, 
> TimeUnit.MILLISECONDS))
>   .retryWithStopStrategy(StopStrategies.stopAfterAttempt(10));
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3371) Add ability to stage directories with compiled classes to Spark

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3371?focusedWorklogId=135735=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135735
 ]

ASF GitHub Bot logged work on BEAM-3371:


Author: ASF GitHub Bot
Created on: 17/Aug/18 16:02
Start Date: 17/Aug/18 16:02
Worklog Time Spent: 10m 
  Work Description: lgajowy commented on issue #6244: [BEAM-3371] Enable 
running integration tests on Spark
URL: https://github.com/apache/beam/pull/6244#issuecomment-413911850
 
 
   Created issues for failing tests: 
   https://issues.apache.org/jira/browse/BEAM-5165
   https://issues.apache.org/jira/browse/BEAM-5164
   https://issues.apache.org/jira/browse/BEAM-5163


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135735)
Time Spent: 50m  (was: 40m)

> Add ability to stage directories with compiled classes to Spark
> ---
>
> Key: BEAM-3371
> URL: https://issues.apache.org/jira/browse/BEAM-3371
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-spark
>Reporter: Lukasz Gajowy
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This one is basically the same issue as
>  [the Flink one|https://issues.apache.org/jira/browse/BEAM-3370], except for 
> two things:
> - detection of files to stage has to be provided in Spark, which is already 
> being developed [here|https://issues.apache.org/jira/browse/BEAM-981]
> - the test execution is not interrupted by a FileNotFoundException but by *the 
> effect* of the directory not being staged (the absence of test classes on 
> Spark's classpath, hence a ClassNotFoundException).
> Again, this could probably be resolved analogously to Flink once the 
> BEAM-981 issue is resolved. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=135747=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135747
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 17/Aug/18 17:05
Start Date: 17/Aug/18 17:05
Worklog Time Spent: 10m 
  Work Description: vectorijk commented on issue #5926: [BEAM-4723] [SQL] 
Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-413928828
 
 
   Run Java PreCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135747)
Time Spent: 4h 10m  (was: 4h)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Datetime*Expression currently only supports the timestamp type for the first 
> operand. We should let it accept all datetime types.
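As an illustration of the requested semantics, plain java.time arithmetic already models "datetime type minus interval" for dates and times, not just timestamps; this is a sketch only and unrelated to Beam SQL's internal expression classes:

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.Period;

/** Sketch of "datetime type minus time interval" for types beyond TIMESTAMP. */
public class DatetimeMinusIntervalSketch {
  public static void main(String[] args) {
    // TIMESTAMP - INTERVAL '1' HOUR (the case already supported)
    Instant ts = Instant.parse("2018-08-17T15:00:00Z").minus(Duration.ofHours(1));
    // DATE - INTERVAL '1' MONTH (one of the additional datetime types)
    LocalDate date = LocalDate.of(2018, 8, 17).minus(Period.ofMonths(1));
    // TIME - INTERVAL '30' MINUTE
    LocalTime time = LocalTime.of(15, 0).minusMinutes(30);
    System.out.println(ts);   // 2018-08-17T14:00:00Z
    System.out.println(date); // 2018-07-17
    System.out.println(time); // 14:30
  }
}
```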



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5163) TFRecordIOIT fails on Spark and Flink

2018-08-17 Thread Lukasz Gajowy (JIRA)
Lukasz Gajowy created BEAM-5163:
---

 Summary: TFRecordIOIT fails on Spark and Flink
 Key: BEAM-5163
 URL: https://issues.apache.org/jira/browse/BEAM-5163
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: Lukasz Gajowy


When run on Spark or Flink remote cluster, TFRecordIOIT fails with the 
following stacktrace:

 
{code:java}
org.apache.beam.sdk.io.tfrecord.TFRecordIOIT > writeThenReadAll FAILED
org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
java.io.FileNotFoundException: No files found for spec: PREFIX_1534520885787*.
at 
org.apache.beam.runners.spark.SparkPipelineResult.beamExceptionFrom(SparkPipelineResult.java:66)
at 
org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:99)
at 
org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:87)
at org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:116)
at org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:61)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)
at 
org.apache.beam.sdk.io.tfrecord.TFRecordIOIT.writeThenReadAll(TFRecordIOIT.java:128)

Caused by:
java.io.FileNotFoundException: No files found for spec: 
PREFIX_1534520885787*.{code}
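For illustration, this failure mode can be reproduced outside Beam: files are written under one directory root while the match runs under another (analogous to outputs not being staged on the cluster nodes), so the glob matches nothing. This is a plain-stdlib sketch, not Beam's FileSystems code; `match_spec` is a hypothetical helper invented for the example.

```python
import glob
import os
import tempfile

def match_spec(spec):
    """Return files matching the glob spec, raising when nothing matches,
    mirroring the 'No files found for spec' failure mode above."""
    files = glob.glob(spec)
    if not files:
        raise FileNotFoundError("No files found for spec: %s" % spec)
    return sorted(files)

# Files written under one root are invisible when the match runs under
# another root -- analogous to outputs not being staged on cluster nodes.
writer_root = tempfile.mkdtemp()
reader_root = tempfile.mkdtemp()
open(os.path.join(writer_root, "PREFIX_00000-of-00001"), "w").close()

print(match_spec(os.path.join(writer_root, "PREFIX_*")))  # one match
try:
    match_spec(os.path.join(reader_root, "PREFIX_*"))
except FileNotFoundError as e:
    print(e)  # No files found for spec: .../PREFIX_*
```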



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2930) Flink support for portable side input

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2930?focusedWorklogId=135737&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135737
 ]

ASF GitHub Bot logged work on BEAM-2930:


Author: ASF GitHub Bot
Created on: 17/Aug/18 16:18
Start Date: 17/Aug/18 16:18
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6208: [WIP] [BEAM-2930] Side 
input support for Flink portable streaming.
URL: https://github.com/apache/beam/pull/6208#issuecomment-413916534
 
 
   What is supposed to happen when the windowing strategy allows early/repeated 
firing? Won't that lead to nondeterministic behavior? 
   
   If repeated firing is possible, then my current approach of just 
accumulating records into an Iterable won't work. 
   
   With the non-portable runner, views are materialized via combine. While that 
would ensure that the Iterable is constructed correctly, repeated firing would 
still lead to nondeterministic results. Can anyone with a better understanding 
of the old Flink runner offer any insight here?
   
   
https://github.com/apache/beam/blob/master/runners/flink/src/main/java/org/apache/beam/runners/flink/CreateStreamingFlinkView.java#L52
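To make the concern above concrete, here is a toy model in plain Python (no Beam APIs) of what repeated firings do to a side-input Iterable built by plain accumulation, versus one that keeps only the latest pane per window; `naive_accumulate` and `latest_pane` are invented names for this sketch only.

```python
from collections import defaultdict

# Toy model: each "firing" re-emits the full accumulated contents of a
# window. Appending every firing into one Iterable duplicates elements;
# keeping only the latest firing per window stays deterministic.

def naive_accumulate(firings):
    view = []
    for window, elements in firings:
        view.extend(elements)          # duplicates across repeated firings
    return view

def latest_pane(firings):
    view = defaultdict(list)
    for window, elements in firings:
        view[window] = list(elements)  # later firing replaces the earlier one
    return [e for w in sorted(view) for e in view[w]]

firings = [("w0", [1, 2]),        # early firing
           ("w0", [1, 2, 3])]     # on-time firing re-emits the whole pane
print(naive_accumulate(firings))  # [1, 2, 1, 2, 3]
print(latest_pane(firings))       # [1, 2, 3]
```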
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135737)
Time Spent: 3.5h  (was: 3h 20m)

> Flink support for portable side input
> -
>
> Key: BEAM-2930
> URL: https://issues.apache.org/jira/browse/BEAM-2930
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle #866

2018-08-17 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PreCommit_Java_Cron #238

2018-08-17 Thread Apache Jenkins Server
See 


Changes:

[mxm] [BEAM-4130] Build Docker image for Flink's JobServer

--
[...truncated 16.00 MB...]
org.apache.beam.sdk.nexmark.queries.sql.SqlQuery5Test > testBids STANDARD_ERROR
Aug 17, 2018 6:14:24 PM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQL:
SELECT `AuctionBids`.`auction`, `AuctionBids`.`num`
FROM (SELECT `B1`.`auction`, COUNT(*) AS `num`, HOP_START(`B1`.`dateTime`, 
INTERVAL '5' SECOND, INTERVAL '10' SECOND) AS `starttime`
FROM `beam`.`Bid` AS `B1`
GROUP BY `B1`.`auction`, HOP(`B1`.`dateTime`, INTERVAL '5' SECOND, INTERVAL 
'10' SECOND)) AS `AuctionBids`
INNER JOIN (SELECT MAX(`CountBids`.`num`) AS `maxnum`, 
`CountBids`.`starttime`
FROM (SELECT COUNT(*) AS `num`, HOP_START(`B2`.`dateTime`, INTERVAL '5' 
SECOND, INTERVAL '10' SECOND) AS `starttime`
FROM `beam`.`Bid` AS `B2`
GROUP BY `B2`.`auction`, HOP(`B2`.`dateTime`, INTERVAL '5' SECOND, INTERVAL 
'10' SECOND)) AS `CountBids`
GROUP BY `CountBids`.`starttime`) AS `MaxBids` ON `AuctionBids`.`starttime` 
= `MaxBids`.`starttime` AND `AuctionBids`.`num` >= `MaxBids`.`maxnum`
Aug 17, 2018 6:14:24 PM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQLPlan>
LogicalProject(auction=[$0], num=[$1])
  LogicalJoin(condition=[AND(=($2, $4), >=($1, $3))], joinType=[inner])
LogicalProject(auction=[$0], num=[$2], starttime=[$1])
  LogicalAggregate(group=[{0, 1}], num=[COUNT()])
LogicalProject(auction=[$0], $f1=[HOP($3, 5000, 1)])
  BeamIOSourceRel(table=[[beam, Bid]])
LogicalProject(maxnum=[$1], starttime=[$0])
  LogicalAggregate(group=[{0}], maxnum=[MAX($1)])
LogicalProject(starttime=[$1], num=[$0])
  LogicalProject(num=[$2], starttime=[$1])
LogicalAggregate(group=[{0, 1}], num=[COUNT()])
  LogicalProject(auction=[$0], $f1=[HOP($3, 5000, 1)])
BeamIOSourceRel(table=[[beam, Bid]])

Aug 17, 2018 6:14:24 PM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: BEAMPlan>
BeamCalcRel(expr#0..4=[{inputs}], proj#0..1=[{exprs}])
  BeamJoinRel(condition=[AND(=($2, $4), >=($1, $3))], joinType=[inner])
BeamCalcRel(expr#0..2=[{inputs}], auction=[$t0], num=[$t2], 
starttime=[$t1])
  BeamAggregationRel(group=[{0, 1}], num=[COUNT()])
BeamCalcRel(expr#0..4=[{inputs}], expr#5=[5000], expr#6=[1], 
expr#7=[HOP($t3, $t5, $t6)], auction=[$t0], $f1=[$t7])
  BeamIOSourceRel(table=[[beam, Bid]])
BeamCalcRel(expr#0..1=[{inputs}], maxnum=[$t1], starttime=[$t0])
  BeamAggregationRel(group=[{1}], maxnum=[MAX($0)])
BeamCalcRel(expr#0..2=[{inputs}], num=[$t2], starttime=[$t1])
  BeamAggregationRel(group=[{0, 1}], num=[COUNT()])
BeamCalcRel(expr#0..4=[{inputs}], expr#5=[5000], 
expr#6=[1], expr#7=[HOP($t3, $t5, $t6)], auction=[$t0], $f1=[$t7])
  BeamIOSourceRel(table=[[beam, Bid]])


org.apache.beam.sdk.nexmark.queries.sql.SqlQuery3Test > 
testJoinsPeopleWithAuctions STANDARD_ERROR
Aug 17, 2018 6:14:24 PM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQL:
SELECT `P`.`name`, `P`.`city`, `P`.`state`, `A`.`id`
FROM `beam`.`Auction` AS `A`
INNER JOIN `beam`.`Person` AS `P` ON `A`.`seller` = `P`.`id`
WHERE `A`.`category` = 10 AND (`P`.`state` = 'OR' OR `P`.`state` = 'ID' OR 
`P`.`state` = 'CA')
Aug 17, 2018 6:14:24 PM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQLPlan>
LogicalProject(name=[$11], city=[$14], state=[$15], id=[$0])
  LogicalFilter(condition=[AND(=($8, 10), OR(=($15, 'OR'), =($15, 'ID'), 
=($15, 'CA')))])
LogicalJoin(condition=[=($7, $10)], joinType=[inner])
  BeamIOSourceRel(table=[[beam, Auction]])
  BeamIOSourceRel(table=[[beam, Person]])

Aug 17, 2018 6:14:24 PM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: BEAMPlan>
BeamCalcRel(expr#0..17=[{inputs}], name=[$t11], city=[$t14], state=[$t15], 
id=[$t0])
  BeamJoinRel(condition=[=($7, $10)], joinType=[inner])
BeamCalcRel(expr#0..9=[{inputs}], expr#10=[10], expr#11=[=($t8, $t10)], 
proj#0..9=[{exprs}], $condition=[$t11])
  BeamIOSourceRel(table=[[beam, Auction]])
BeamCalcRel(expr#0..7=[{inputs}], expr#8=['OR'], expr#9=[=($t5, $t8)], 
expr#10=['ID'], expr#11=[=($t5, $t10)], expr#12=['CA'], expr#13=[=($t5, $t12)], 
expr#14=[OR($t9, $t11, $t13)], proj#0..7=[{exprs}], $condition=[$t14])
  BeamIOSourceRel(table=[[beam, Person]])


org.apache.beam.sdk.nexmark.queries.sql.SqlQuery7Test > testBids STANDARD_ERROR
Aug 17, 2018 6:14:24 PM 

[jira] [Work logged] (BEAM-3371) Add ability to stage directories with compiled classes to Spark

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3371?focusedWorklogId=135730&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135730
 ]

ASF GitHub Bot logged work on BEAM-3371:


Author: ASF GitHub Bot
Created on: 17/Aug/18 15:50
Start Date: 17/Aug/18 15:50
Worklog Time Spent: 10m 
  Work Description: lgajowy opened a new pull request #6244: [BEAM-3371] 
Enable running integration tests on Spark
URL: https://github.com/apache/beam/pull/6244
 
 
   This PR enables running IOIT on the Spark runner. One can do that by setting up 
a remote Spark cluster (Spark master) and running this command: 
   
   ```
   ./gradlew clean integrationTest -p sdks/java/io/file-based-io-tests/ 
-DintegrationTestPipelineOptions='["--numberOfRecords=1000", 
"--filenamePrefix=PREFIX", "--runner=TestSparkRunner", 
"--sparkMaster=spark://LGs-Mac.local:7077", "--tempLocation=/tmp/"]' 
-DintegrationTestRunner=spark --tests org.apache.beam.sdk.io.text.TextIOIT 
--info
   ```
   
   I experienced some difficulties with the TFRecordIOIT, ParquetIOIT and XmlIOIT 
tests: 
- XmlIOIT fails on an assertion (the hashcode of the PCollection is different 
from what it should be)
- ParquetIOIT suffers from a dependency mismatch 
(`java.lang.NoSuchMethodError: 
org.apache.parquet.hadoop.ParquetWriter$Builder.(Lorg/apache/parquet/io/OutputFile;)V`)
- TFRecordIOIT cannot find the created file: 
`java.io.FileNotFoundException: No files found for spec: PREFIX_1534520885787*`
   
   Those issues are of a different nature than the one described in BEAM-3371 and 
have to be tackled separately. 
   
   @iemejia Could you take a look when you're back? There's no need to rush - 
I'll be off for 2 weeks now. 
   
   CC: @pabloem 
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 

[jira] [Created] (BEAM-5164) ParquetIOIT fails on Spark and Flink

2018-08-17 Thread Lukasz Gajowy (JIRA)
Lukasz Gajowy created BEAM-5164:
---

 Summary: ParquetIOIT fails on Spark and Flink
 Key: BEAM-5164
 URL: https://issues.apache.org/jira/browse/BEAM-5164
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: Lukasz Gajowy


When run on Spark or Flink remote cluster, ParquetIOIT fails with the following 
stacktrace: 


{code:java}
org.apache.beam.sdk.io.parquet.ParquetIOIT > writeThenReadAll FAILED
org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
java.lang.NoSuchMethodError: 
org.apache.parquet.hadoop.ParquetWriter$Builder.(Lorg/apache/parquet/io/OutputFile;)V
at 
org.apache.beam.runners.spark.SparkPipelineResult.beamExceptionFrom(SparkPipelineResult.java:66)
at 
org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:99)
at 
org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:87)
at org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:116)
at org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:61)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)
at 
org.apache.beam.sdk.io.parquet.ParquetIOIT.writeThenReadAll(ParquetIOIT.java:133)

Caused by:
java.lang.NoSuchMethodError: 
org.apache.parquet.hadoop.ParquetWriter$Builder.(Lorg/apache/parquet/io/OutputFile;)V{code}
 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=135758&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135758
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 17/Aug/18 17:37
Start Date: 17/Aug/18 17:37
Worklog Time Spent: 10m 
  Work Description: vectorijk commented on issue #5926: [BEAM-4723] [SQL] 
Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-413937489
 
 
   run java precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135758)
Time Spent: 4.5h  (was: 4h 20m)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Datetime*Expression only supports the timestamp type for its first operand now. We 
> should let it accept all Datetime_Types.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5165) XmlIOIT fails on Spark and Flink

2018-08-17 Thread Lukasz Gajowy (JIRA)
Lukasz Gajowy created BEAM-5165:
---

 Summary: XmlIOIT fails on Spark and Flink
 Key: BEAM-5165
 URL: https://issues.apache.org/jira/browse/BEAM-5165
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: Lukasz Gajowy


When run on Spark or Flink remote cluster, XmlIOIT fails with the following 
stacktrace:

 
{code:java}
org.apache.beam.sdk.io.xml.XmlIOIT > writeThenReadAll FAILED
java.lang.AssertionError: Calculate hashcode/Flatten.PCollections.out:
Expected: "7f51adaf701441ee83459a3f705c1b86"
but: was "f45a386eeab2ea5c4a66c6aae07c88ae"
at org.apache.beam.sdk.testing.PAssert$PAssertionSite.capture(PAssert.java:168)
at org.apache.beam.sdk.testing.PAssert.thatSingleton(PAssert.java:413)
at org.apache.beam.sdk.testing.PAssert.thatSingleton(PAssert.java:404)
at org.apache.beam.sdk.io.xml.XmlIOIT.writeThenReadAll(XmlIOIT.java:140)

Caused by:
java.lang.AssertionError:
Expected: "7f51adaf701441ee83459a3f705c1b86"
but: was "f45a386eeab2ea5c4a66c6aae07c88ae"{code}
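The failing assertion compares a content fingerprint of the read-back PCollection. Since a PCollection carries no element order, any such fingerprint must be order-insensitive, e.g. hashing the sorted element strings. The sketch below illustrates the idea in plain Python; it is not Beam's actual hashing utility, and `fingerprint` is an invented name.

```python
import hashlib

def fingerprint(elements):
    """Order-insensitive content hash: sort string forms, then MD5."""
    digest = hashlib.md5()
    for element in sorted(str(e) for e in elements):
        digest.update(element.encode("utf-8"))
    return digest.hexdigest()

# The same elements in any order give the same fingerprint...
assert fingerprint(["a", "b", "c"]) == fingerprint(["c", "a", "b"])
# ...while changed content gives a different one, which is what the
# 'Expected ... but: was ...' assertion above detects.
assert fingerprint(["a", "b", "c"]) != fingerprint(["a", "b", "d"])
print(fingerprint(["a", "b", "c"]))
```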



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=135738&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135738
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 17/Aug/18 16:19
Start Date: 17/Aug/18 16:19
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #5926: [BEAM-4723] [SQL] 
Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-413916829
 
 
   LGTM.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135738)
Time Spent: 3h 50m  (was: 3h 40m)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Datetime*Expression only supports the timestamp type for its first operand now. We 
> should let it accept all Datetime_Types.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=135739&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135739
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 17/Aug/18 16:20
Start Date: 17/Aug/18 16:20
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #5926: [BEAM-4723] [SQL] 
Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-413917169
 
 
   `BeamSqlDatetimeMinusIntervalExpressionTest.java` is failing the rat test. 
It needs a copyright header.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135739)
Time Spent: 4h  (was: 3h 50m)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Datetime*Expression only supports the timestamp type for its first operand now. We 
> should let it accept all Datetime_Types.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3371) Add ability to stage directories with compiled classes to Spark

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3371?focusedWorklogId=135733&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135733
 ]

ASF GitHub Bot logged work on BEAM-3371:


Author: ASF GitHub Bot
Created on: 17/Aug/18 15:54
Start Date: 17/Aug/18 15:54
Worklog Time Spent: 10m 
  Work Description: lgajowy commented on issue #6244: [BEAM-3371] Enable 
running integration tests on Spark
URL: https://github.com/apache/beam/pull/6244#issuecomment-413909600
 
 
   Run Java PreCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135733)
Time Spent: 40m  (was: 0.5h)

> Add ability to stage directories with compiled classes to Spark
> ---
>
> Key: BEAM-3371
> URL: https://issues.apache.org/jira/browse/BEAM-3371
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-spark
>Reporter: Lukasz Gajowy
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> This one is basically the same issue as
>  [this Flink's one|https://issues.apache.org/jira/browse/BEAM-3370], except 
> of two things:
> - a detection of files to stage has to be provided in Spark, which is already 
> being developed [here|https://issues.apache.org/jira/browse/BEAM-981]
> - the test execution is not interrupted by FileNotFoundException but by *the 
> effect* of the directory not being staged (absence of test classes on 
> Spark's classpath, hence ClassNotFoundException).
> Again, this could probably be resolved analogously to Flink, once the 
> BEAM-981 issue is resolved. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3371) Add ability to stage directories with compiled classes to Spark

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3371?focusedWorklogId=135732&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135732
 ]

ASF GitHub Bot logged work on BEAM-3371:


Author: ASF GitHub Bot
Created on: 17/Aug/18 15:54
Start Date: 17/Aug/18 15:54
Worklog Time Spent: 10m 
  Work Description: lgajowy edited a comment on issue #6244: [BEAM-3371] 
Enable running integration tests on Spark
URL: https://github.com/apache/beam/pull/6244#issuecomment-413909472
 
 
   BTW: why is the filesToStage option duplicated 
in `DataflowWorkerPoolOptions`, `FlinkPipelineOptions` and 
`SparkPipelineOptions`? Do you think we could move it to the `PipelineOptions` 
interface? IMO it would make things even easier (maybe not only for this fix)?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135732)
Time Spent: 0.5h  (was: 20m)

> Add ability to stage directories with compiled classes to Spark
> ---
>
> Key: BEAM-3371
> URL: https://issues.apache.org/jira/browse/BEAM-3371
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-spark
>Reporter: Lukasz Gajowy
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This one is basically the same issue as
>  [this Flink's one|https://issues.apache.org/jira/browse/BEAM-3370], except 
> of two things:
> - a detection of files to stage has to be provided in Spark, which is already 
> being developed [here|https://issues.apache.org/jira/browse/BEAM-981]
> - the test execution is not interrupted by FileNotFoundException but by *the 
> effect* of the directory not being staged (absence of test classes on 
> Spark's classpath, hence ClassNotFoundException).
> Again, this could probably be resolved analogously to Flink, once the 
> BEAM-981 issue is resolved. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3371) Add ability to stage directories with compiled classes to Spark

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3371?focusedWorklogId=135731&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135731
 ]

ASF GitHub Bot logged work on BEAM-3371:


Author: ASF GitHub Bot
Created on: 17/Aug/18 15:54
Start Date: 17/Aug/18 15:54
Worklog Time Spent: 10m 
  Work Description: lgajowy commented on issue #6244: [BEAM-3371] Enable 
running integration tests on Spark
URL: https://github.com/apache/beam/pull/6244#issuecomment-413909472
 
 
   BTW: why is the filesToStage option duplicated 
in `DataflowWorkerPoolOptions`, `FlinkPipelineOptions` and 
`SparkPipelineOptions`? Do you think we could move it to the `PipelineOptions` 
interface? IMO it would make things even easier (maybe not only for this fix). 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135731)
Time Spent: 20m  (was: 10m)

> Add ability to stage directories with compiled classes to Spark
> ---
>
> Key: BEAM-3371
> URL: https://issues.apache.org/jira/browse/BEAM-3371
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-spark
>Reporter: Lukasz Gajowy
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This one is basically the same issue as
>  [this Flink's one|https://issues.apache.org/jira/browse/BEAM-3370], except 
> of two things:
> - a detection of files to stage has to be provided in Spark, which is already 
> being developed [here|https://issues.apache.org/jira/browse/BEAM-981]
> - the test execution is not interrupted by FileNotFoundException but by *the 
> effect* of the directory not being staged (absence of test classes on 
> Spark's classpath, hence ClassNotFoundException).
> Again, this could probably be resolved analogously to Flink, once the 
> BEAM-981 issue is resolved. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Java_GradleBuild #1280

2018-08-17 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=135749&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135749
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 17/Aug/18 17:14
Start Date: 17/Aug/18 17:14
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #5926: [BEAM-4723] [SQL] 
Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-413931220
 
 
   LGTM Thanks


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135749)
Time Spent: 4h 20m  (was: 4h 10m)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Datetime*Expression only supports the timestamp type for its first operand now. We 
> should let it accept all Datetime_Types.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4679) Support portable combiner lifting in Java Reference Runner

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4679?focusedWorklogId=135904&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135904
 ]

ASF GitHub Bot logged work on BEAM-4679:


Author: ASF GitHub Bot
Created on: 17/Aug/18 22:54
Start Date: 17/Aug/18 22:54
Worklog Time Spent: 10m 
  Work Description: tweise closed pull request #6141: [BEAM-4679] Re-adding 
Flatten roots in ULR.
URL: https://github.com/apache/beam/pull/6141
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/EmptyInputProvider.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/EmptyInputProvider.java
new file mode 100644
index 000..82edcef7b1a
--- /dev/null
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/EmptyInputProvider.java
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.direct.portable;
+
+import java.util.Collection;
+import java.util.Collections;
+import 
org.apache.beam.runners.core.construction.graph.PipelineNode.PTransformNode;
+
+/** A {@link RootInputProvider} that provides no input bundles. */
+class EmptyInputProvider implements RootInputProvider {
+  EmptyInputProvider() {}
+
+  /**
+   * {@inheritDoc}.
+   *
+   * Returns an empty collection.
+   */
+  @Override
+  public Collection> getInitialInputs(
+  PTransformNode transform, int targetParallelism) {
+return Collections.emptyList();
+  }
+}
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java
index 079c0d94fa4..ae4eb57fb8a 100644
--- 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/ReferenceRunner.java
@@ -155,7 +155,7 @@ public void execute() throws Exception {
 BundleFactory bundleFactory = ImmutableListBundleFactory.create();
 EvaluationContext ctxt =
 EvaluationContext.create(Instant::new, bundleFactory, graph, 
getKeyedPCollections(graph));
-RootProviderRegistry rootRegistry = 
RootProviderRegistry.impulseRegistry(bundleFactory);
+RootProviderRegistry rootRegistry = 
RootProviderRegistry.javaPortableRegistry(bundleFactory);
 int targetParallelism = 
Math.max(Runtime.getRuntime().availableProcessors(), 3);
 ServerFactory serverFactory = createServerFactory();
 ControlClientPool controlClientPool = MapControlClientPool.create();
diff --git a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/RootProviderRegistry.java b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/RootProviderRegistry.java
index e9702ba34fe..6b14a1a93d6 100644
--- a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/RootProviderRegistry.java
+++ b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/portable/RootProviderRegistry.java
@@ -18,6 +18,7 @@
 package org.apache.beam.runners.direct.portable;
 
 import static com.google.common.base.Preconditions.checkNotNull;
+import static org.apache.beam.runners.core.construction.PTransformTranslation.FLATTEN_TRANSFORM_URN;
 import static org.apache.beam.runners.core.construction.PTransformTranslation.IMPULSE_TRANSFORM_URN;
 
 import com.google.common.collect.ImmutableMap;
@@ -25,7 +26,6 @@
 import java.util.Map;
 import org.apache.beam.runners.core.construction.PTransformTranslation;
 import org.apache.beam.runners.core.construction.graph.PipelineNode.PTransformNode;
-import org.apache.beam.sdk.transforms.Impulse;
 import org.apache.beam.sdk.transforms.PTransform;
 
 /**
@@ -33,13 +33,18 @@
  * based on the type of {@link PTransform} of 

[beam] branch master updated (350ccd9 -> 577b036)

2018-08-17 Thread thw
This is an automated email from the ASF dual-hosted git repository.

thw pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 350ccd9  [BEAM-5155] Check sdk absolute path before installing
 add 538521e  [BEAM-4679] Re-adding Flatten roots in ULR.
 new 577b036  Merge pull request #6141: [BEAM-4679] Re-adding Flatten roots in ULR.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../runners/direct/{ => portable}/EmptyInputProvider.java  | 14 --
 .../beam/runners/direct/portable/ReferenceRunner.java  |  2 +-
 .../beam/runners/direct/portable/RootProviderRegistry.java | 11 ---
 3 files changed, 13 insertions(+), 14 deletions(-)
 copy runners/direct-java/src/main/java/org/apache/beam/runners/direct/{ => portable}/EmptyInputProvider.java (69%)



[beam] 01/01: Merge pull request #6141: [BEAM-4679] Re-adding Flatten roots in ULR.

2018-08-17 Thread thw
This is an automated email from the ASF dual-hosted git repository.

thw pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 577b03624b38d52cf0e47d9b7c5364c87b63017c
Merge: 350ccd9 538521e
Author: Thomas Weise 
AuthorDate: Fri Aug 17 15:54:44 2018 -0700

Merge pull request #6141: [BEAM-4679] Re-adding Flatten roots in ULR.

 .../direct/portable/EmptyInputProvider.java| 38 ++
 .../runners/direct/portable/ReferenceRunner.java   |  2 +-
 .../direct/portable/RootProviderRegistry.java  | 11 +--
 3 files changed, 47 insertions(+), 4 deletions(-)



[beam] branch master updated: Pipeline Graph from Interactive Beam -- made faster

2018-08-17 Thread pabloem
This is an automated email from the ASF dual-hosted git repository.

pabloem pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new 0e14965  Pipeline Graph from Interactive Beam -- made faster
0e14965 is described below

commit 0e14965707b5d48a3de7fa69f09d88ef0aa48c09
Author: Sindy Li 
AuthorDate: Tue Aug 14 16:31:05 2018 -0700

Pipeline Graph from Interactive Beam -- made faster

* Optimization
Changed filtering of top-level PTransforms from string manipulation to
looking them up directly via the subtransforms of root PTransforms.
Makes PipelineGraph faster.

* Generalization
* Moved display_graph() method to PipelineGraph
* PipelineGraph now takes pipeline obj or proto
---
 .../interactive/interactive_pipeline_graph.py  | 34 +++
 .../runners/interactive/pipeline_graph.py  | 68 +-
 2 files changed, 60 insertions(+), 42 deletions(-)
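The optimization described in the commit message can be sketched roughly as follows. The field names below mirror Beam's Pipeline proto (`root_transform_ids`, per-transform `subtransforms`), but the data here is plain dicts, so this is an illustration of the lookup strategy, not the real API:

```python
# Find top-level transforms by walking the root transforms' subtransform ids
# directly, instead of filtering every transform by name-string manipulation.

def top_level_transforms(root_transform_ids, transforms_by_id):
    """Yield (id, transform) pairs for the direct children of the roots."""
    for root_id in root_transform_ids:
        root = transforms_by_id[root_id]
        for sub_id in root.get("subtransforms", []):
            yield sub_id, transforms_by_id[sub_id]

transforms = {
    "root": {"unique_name": "", "subtransforms": ["read", "count"]},
    "read": {"unique_name": "Read", "subtransforms": []},
    "count": {"unique_name": "Count", "subtransforms": ["count/pair"]},
    "count/pair": {"unique_name": "Count/Pair", "subtransforms": []},
}

top = dict(top_level_transforms(["root"], transforms))
# Only the roots' direct children are returned; nested subtransforms are not.
print(sorted(top))  # -> ['count', 'read']
```

This touches only the roots and their immediate children, rather than scanning every transform's name.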

diff --git a/sdks/python/apache_beam/runners/interactive/interactive_pipeline_graph.py b/sdks/python/apache_beam/runners/interactive/interactive_pipeline_graph.py
index 229848c..2ad7c1b 100644
--- a/sdks/python/apache_beam/runners/interactive/interactive_pipeline_graph.py
+++ b/sdks/python/apache_beam/runners/interactive/interactive_pipeline_graph.py
@@ -53,33 +53,27 @@ class InteractivePipelineGraph(pipeline_graph.PipelineGraph):
   """Creates the DOT representation of an interactive pipeline. Thread-safe."""
 
   def __init__(self,
-   pipeline_proto,
+   pipeline,
required_transforms=None,
referenced_pcollections=None,
cached_pcollections=None):
 """Constructor of PipelineGraph.
 
-Examples:
-  pipeline_graph = PipelineGraph(pipeline_proto)
-  print(pipeline_graph.get_dot())
-  pipeline_graph.display_graph()
-
 Args:
-  pipeline_proto: (Pipeline proto) Pipeline to be rendered.
+  pipeline: (Pipeline proto) or (Pipeline) pipeline to be rendered.
   required_transforms: (dict from str to PTransform proto) Mapping from
   transform ID to transforms that leads to visible results.
   referenced_pcollections: (dict from str to PCollection proto) PCollection
   ID mapped to PCollection referenced during pipeline execution.
-  cached_pcollections: (set of str) A set of PCollection IDs of those whose
+  cached_pcollections: (set of str) a set of PCollection IDs of those whose
   cached results are used in the execution.
 """
-self._pipeline_proto = pipeline_proto
 self._required_transforms = required_transforms or {}
 self._referenced_pcollections = referenced_pcollections or {}
 self._cached_pcollections = cached_pcollections or set()
 
 super(InteractivePipelineGraph, self).__init__(
-pipeline_proto=pipeline_proto,
+pipeline=pipeline,
 default_vertex_attrs={'color': 'gray', 'fontcolor': 'gray'},
 default_edge_attrs={'color': 'gray'}
 )
@@ -87,14 +81,6 @@ class InteractivePipelineGraph(pipeline_graph.PipelineGraph):
transform_updates, pcollection_updates = self._generate_graph_update_dicts()
 self._update_graph(transform_updates, pcollection_updates)
 
-  def display_graph(self):
-"""Displays graph via IPython or prints DOT if not possible."""
-try:
-  from IPython.core import display  # pylint: disable=import-error
-  display.display(display.HTML(self._get_graph().create_svg()))  # pylint: disable=protected-access
-except ImportError:
-  print(str(self._get_graph()))
-
   def update_pcollection_stats(self, pcollection_stats):
 """Updates PCollection stats.
 
@@ -123,21 +109,15 @@ class InteractivePipelineGraph(pipeline_graph.PipelineGraph):
   vertex_dict: (Dict[str, Dict[str, str]]) maps vertex name to attributes
   edge_dict: (Dict[str, Dict[str, str]]) maps vertex name to attributes
 """
-transforms = self._pipeline_proto.components.transforms
-
 transform_dict = {}  # maps PTransform IDs to properties
 pcoll_dict = {}  # maps PCollection IDs to properties
 
-for transform_id, transform in transforms.items():
-  if not super(
-  InteractivePipelineGraph, self)._is_top_level_transform(transform):
-continue
-
-  transform_dict[transform.unique_name] = {
+for transform_id, transform_proto in self._top_level_transforms():
+  transform_dict[transform_proto.unique_name] = {
   'required': transform_id in self._required_transforms
   }
 
-  for pcoll_id in transform.outputs.values():
+  for pcoll_id in transform_proto.outputs.values():
 pcoll_dict[pcoll_id] = {
 'cached': pcoll_id in self._cached_pcollections,
 'referenced': pcoll_id in self._referenced_pcollections
diff --git 

[jira] [Work logged] (BEAM-3286) Go SDK support for portable side input

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3286?focusedWorklogId=135915=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135915
 ]

ASF GitHub Bot logged work on BEAM-3286:


Author: ASF GitHub Bot
Created on: 17/Aug/18 23:17
Start Date: 17/Aug/18 23:17
Worklog Time Spent: 10m 
  Work Description: herohde commented on a change in pull request #6197: 
[BEAM-3286] Add preliminary Go support for side input
URL: https://github.com/apache/beam/pull/6197#discussion_r211052045
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/harness/statemgr.go
 ##
 @@ -0,0 +1,310 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package harness
+
+import (
+   "context"
+   "fmt"
+   "io"
+   "sync"
+   "sync/atomic"
+   "time"
+
+   "github.com/apache/beam/sdks/go/pkg/beam/core/runtime/exec"
+   "github.com/apache/beam/sdks/go/pkg/beam/log"
+   pb "github.com/apache/beam/sdks/go/pkg/beam/model/fnexecution_v1"
+   "github.com/golang/protobuf/proto"
+   "github.com/pkg/errors"
+)
+
+// ScopedStateManager scopes the global gRPC state manager to a single instruction.
+// The indirection makes it easier to control access.
+type ScopedSideInputReader struct {
+   mgr*StateChannelManager
+   instID string
+
+   opened []io.Closer // track open readers to force close all
+   closed bool
+   mu sync.Mutex
+}
+
+func NewScopedSideInputReader(mgr *StateChannelManager, instID string) *ScopedSideInputReader {
 
 Review comment:
   Today no. But I think we should tease out the fnapi-facing code from the 
harness, so that alternative harnesses and runners could be built. That's why 
I'm leaning towards keeping this public similarly to a lot of the Exec stuff.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135915)
Time Spent: 1.5h  (was: 1h 20m)

> Go SDK support for portable side input
> --
>
> Key: BEAM-3286
> URL: https://issues.apache.org/jira/browse/BEAM-3286
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3286) Go SDK support for portable side input

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3286?focusedWorklogId=135919=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135919
 ]

ASF GitHub Bot logged work on BEAM-3286:


Author: ASF GitHub Bot
Created on: 17/Aug/18 23:24
Start Date: 17/Aug/18 23:24
Worklog Time Spent: 10m 
  Work Description: herohde commented on a change in pull request #6197: 
[BEAM-3286] Add preliminary Go support for side input
URL: https://github.com/apache/beam/pull/6197#discussion_r211052736
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/sideinput.go
 ##
 @@ -0,0 +1,37 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package graphx
+
+import (
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph/coder"
+)
+
+// Side input support
+//
+// The Beam model supports only MultiMap side input, so for
+// iterable side input (= the only kind in Go so far) we must
+// implicitly use a fixed key (""). We use a special execution
+// unit as well to handle nested KVs.
+
+const (
+   URNIterableSideInputKey = "beam:go:transform:iterablesideinputkey:v1"
+)
+
+// MakeBytesKeyedCoder returns KV<[]byte, A> for any coder,
+// even if the coder is already a KV coder.
+func MakeBytesKeyedCoder(c *coder.Coder) *coder.Coder {
 
 Review comment:
   Fixed
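The fixed-key trick described in the Go comment above can be sketched in a language-neutral way. The Beam model only supports MultiMap side input, so an iterable side input is encoded as a multimap whose every value sits under one fixed key (`""`). The helper names below are illustrative, not Beam's real API:

```python
# Sketch of the fixed-key encoding for iterable side inputs: the whole
# iterable is stored as the value list of one implicit key, and reading it
# back is a lookup on that same key.

FIXED_KEY = b""  # the single implicit key used for iterable side inputs

def as_multimap(iterable_side_input):
    """Wrap an iterable side input as a one-key multimap."""
    return {FIXED_KEY: list(iterable_side_input)}

def read_iterable(multimap):
    """Recover the iterable by looking up the fixed key."""
    return multimap.get(FIXED_KEY, [])

mm = as_multimap([1, 2, 3])
print(read_iterable(mm))  # -> [1, 2, 3]
```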


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135919)
Time Spent: 1h 50m  (was: 1h 40m)

> Go SDK support for portable side input
> --
>
> Key: BEAM-3286
> URL: https://issues.apache.org/jira/browse/BEAM-3286
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3286) Go SDK support for portable side input

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3286?focusedWorklogId=135926=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135926
 ]

ASF GitHub Bot logged work on BEAM-3286:


Author: ASF GitHub Bot
Created on: 17/Aug/18 23:37
Start Date: 17/Aug/18 23:37
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #6197: [BEAM-3286] Add 
preliminary Go support for side input
URL: https://github.com/apache/beam/pull/6197#issuecomment-414013053
 
 
   Thanks for the review, @lostluck!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135926)
Time Spent: 2h 10m  (was: 2h)

> Go SDK support for portable side input
> --
>
> Key: BEAM-3286
> URL: https://issues.apache.org/jira/browse/BEAM-3286
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3286) Go SDK support for portable side input

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3286?focusedWorklogId=135925=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135925
 ]

ASF GitHub Bot logged work on BEAM-3286:


Author: ASF GitHub Bot
Created on: 17/Aug/18 23:37
Start Date: 17/Aug/18 23:37
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #6197: [BEAM-3286] Add 
preliminary Go support for side input
URL: https://github.com/apache/beam/pull/6197#issuecomment-414013016
 
 
   Run Go PostCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135925)
Time Spent: 2h  (was: 1h 50m)

> Go SDK support for portable side input
> --
>
> Key: BEAM-3286
> URL: https://issues.apache.org/jira/browse/BEAM-3286
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1282

2018-08-17 Thread Apache Jenkins Server
See 


Changes:

[daniel.o.programmer] [BEAM-4679] Re-adding Flatten roots in ULR.

--
[...truncated 19.67 MB...]
Aug 17, 2018 11:39:09 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T23:39:07.467Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike) into SpannerIO.Write/Write mutations to 
Cloud Spanner/Keys sample as 
view/GBKaSVForData/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Read
Aug 17, 2018 11:39:09 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T23:39:07.490Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow)
Aug 17, 2018 11:39:09 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T23:39:07.508Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow) into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues/Extract
Aug 17, 2018 11:39:09 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T23:39:07.531Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample keys/GroupByKey/Reify
Aug 17, 2018 11:39:09 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T23:39:07.565Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/Combine.GroupedValues/Extract into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues
Aug 17, 2018 11:39:09 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T23:39:07.587Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey/Reify into 
SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/GroupByKey+SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues/Partial
Aug 17, 2018 11:39:09 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T23:39:07.610Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/GroupByKey+SpannerIO.Write/Write 
mutations to Cloud Spanner/Sample keys/Combine.GroupedValues/Partial into 
SpannerIO.Write/Write mutations to Cloud Spanner/Extract keys
Aug 17, 2018 11:39:09 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T23:39:07.635Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Write
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 
view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Reify
Aug 17, 2018 11:39:09 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T23:39:07.666Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForSize/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 17, 2018 11:39:09 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T23:39:07.689Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/View.AsList/ParDo(ToIsmRecordForGlobalWindow) into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Flatten.Iterables/FlattenIterables/FlatMap
Aug 17, 2018 11:39:09 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T23:39:07.712Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/GBKaSVForKeys/Write into 
SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/ParMultiDo(ToIsmRecordForMapLike)
Aug 17, 2018 11:39:09 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T23:39:07.737Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Flatten.Iterables/FlattenIterables/FlatMap into 
SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Values/Values/Map

[jira] [Work logged] (BEAM-3194) Support annotating that a DoFn requires stable / deterministic input for replay/retry

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3194?focusedWorklogId=135942=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135942
 ]

ASF GitHub Bot logged work on BEAM-3194:


Author: ASF GitHub Bot
Created on: 18/Aug/18 00:41
Start Date: 18/Aug/18 00:41
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #6220: 
[BEAM-3194] Add ValidatesRunner test for support of @RequiresStableInput
URL: https://github.com/apache/beam/pull/6220#discussion_r211058610
 
 

 ##
 File path: sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoRequiresStableInputTest.java
 ##
 @@ -0,0 +1,166 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.channels.WritableByteChannel;
+import java.nio.charset.Charset;
+import java.util.Date;
+import java.util.UUID;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MatchResult;
+import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.options.Validation.Required;
+import org.apache.beam.sdk.testing.FileChecksumMatcher;
+import org.apache.beam.sdk.testing.RetryFailures;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.testing.TestPipelineOptions;
+import org.apache.beam.sdk.testing.ValidatesRunner;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.beam.sdk.values.TupleTagList;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * ValidatesRunner test for the support of {@link
+ * org.apache.beam.sdk.transforms.DoFn.RequiresStableInput} annotation.
+ */
+@RunWith(JUnit4.class)
+public class ParDoRequiresStableInputTest {
+
+  private static final String VALUE = "value";
+  // SHA-1 hash of string "value"
+  private static final String VALUE_CHECKSUM = "f32b67c7e26342af42efabc674d441dca0a281c5";
+
+  private static class PairWithRandomKeyFn extends SimpleFunction<String, KV<String, String>> {
+@Override
+public KV<String, String> apply(String value) {
+  String key = UUID.randomUUID().toString();
+  return KV.of(key, value);
+}
+  }
+
+  private static class MakeSideEffectAndThenFailFn extends DoFn<KV<String, String>, String> {
+private final String outputPrefix;
+
+private MakeSideEffectAndThenFailFn(String outputPrefix) {
+  this.outputPrefix = outputPrefix;
+}
+
+@RequiresStableInput
+@ProcessElement
+public void processElement(ProcessContext c) throws Exception {
+  MatchResult matchResult = FileSystems.match(outputPrefix + "*");
+  boolean firstTime = (matchResult.metadata().size() == 0);
+
+  KV<String, String> kv = c.element();
+  writeTextToFileSideEffect(kv.getValue(), outputPrefix + kv.getKey());
+  if (firstTime) {
+throw new Exception("Deliberate failure: should happen only once.");
+  }
+}
+
+private static void writeTextToFileSideEffect(String text, String filename) throws IOException {
+  ResourceId rid = FileSystems.matchNewResource(filename, false);
+  WritableByteChannel chan = FileSystems.create(rid, "text/plain");
+  chan.write(ByteBuffer.wrap(text.getBytes(Charset.defaultCharset())));
+  chan.close();
+}
+  }
+
+  private static void runRequiresStableInputPipeline(RequiresStableInputTestOptions options) {
+Pipeline p = Pipeline.create(options);
+
+PCollection<String> singleton = p.apply("CreatePCollectionOfOneValue", Create.of(VALUE));
+singleton
+.apply("Single-PairWithRandomKey", MapElements.via(new PairWithRandomKeyFn()))
+.apply(
+"Single-MakeSideEffectAndThenFail",
+ParDo.of(new 
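The retry-idempotence pattern the quoted test exercises (a side effect keyed by the stable input, a deliberate failure on the first attempt, success on replay) can be simulated outside Beam. All names below are illustrative; a dict stands in for the filesystem the real test matches against:

```python
# Simulation of the pattern in ParDoRequiresStableInputTest: the "DoFn"
# performs its side effect first, then fails the first time it runs. On
# retry with the SAME (stable) key, the side effect lands on the same
# "file", so exactly one artifact per key exists afterwards.

side_effects = {}  # filename -> text, stands in for the filesystem

def make_side_effect_then_fail(key, value):
    first_time = not any(f.startswith("out-") for f in side_effects)
    side_effects["out-" + key] = value  # side effect happens before failing
    if first_time:
        raise RuntimeError("Deliberate failure: should happen only once.")
    return value

try:
    make_side_effect_then_fail("k1", "value")  # first attempt fails...
except RuntimeError:
    pass
result = make_side_effect_then_fail("k1", "value")  # ...retry succeeds

print(result)             # -> value
print(len(side_effects))  # -> 1  (stable input => same key, no duplicate file)
```

If the input were not stable, the retry would carry a different random key and the test would observe two output files instead of one.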

[jira] [Work logged] (BEAM-3194) Support annotating that a DoFn requires stable / deterministic input for replay/retry

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3194?focusedWorklogId=135939=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135939
 ]

ASF GitHub Bot logged work on BEAM-3194:


Author: ASF GitHub Bot
Created on: 18/Aug/18 00:41
Start Date: 18/Aug/18 00:41
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #6220: 
[BEAM-3194] Add ValidatesRunner test for support of @RequiresStableInput
URL: https://github.com/apache/beam/pull/6220#discussion_r211057120
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/util/ExplicitShardedFile.java
 ##
 @@ -49,18 +49,36 @@
   .withMaxRetries(MAX_READ_RETRIES);
 
   private final List files;
+  private final String filePattern;
 
   /** Constructs an {@link ExplicitShardedFile} for the given files. */
   public ExplicitShardedFile(Collection files) throws IOException {
 this.files = new ArrayList<>();
+this.filePattern = null;
 for (String file : files) {
   this.files.add(FileSystems.matchSingleFileSpec(file));
 }
   }
 
+  /**
+   * Constructs an {@link ExplicitShardedFile} for the given file pattern.
+   *
+   * For internal use only.
+   */
+  public ExplicitShardedFile(String filePattern) {
 
 Review comment:
   Adding this constructor is specifically against why this class exists. The 
intent is to not require a filesystem match.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135939)
Time Spent: 10m
Remaining Estimate: 0h

> Support annotating that a DoFn requires stable / deterministic input for 
> replay/retry
> -
>
> Key: BEAM-3194
> URL: https://issues.apache.org/jira/browse/BEAM-3194
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Kenneth Knowles
>Assignee: Yueyang Qiu
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> See the thread: 
> https://lists.apache.org/thread.html/5fd81ce371aeaf642665348f8e6940e308e04275dd7072f380f9f945@%3Cdev.beam.apache.org%3E
> We need this in order to have truly cross-runner end-to-end exactly once via 
> replay + idempotence.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3194) Support annotating that a DoFn requires stable / deterministic input for replay/retry

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3194?focusedWorklogId=135941=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135941
 ]

ASF GitHub Bot logged work on BEAM-3194:


Author: ASF GitHub Bot
Created on: 18/Aug/18 00:41
Start Date: 18/Aug/18 00:41
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #6220: 
[BEAM-3194] Add ValidatesRunner test for support of @RequiresStableInput
URL: https://github.com/apache/beam/pull/6220#discussion_r211057036
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/testing/RetryFailures.java
 ##
 @@ -0,0 +1,21 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.testing;
+
+/** Category tag for tests that apply to runners that retry under failures. */
+public interface RetryFailures {}
 
 Review comment:
   Since you're relying on runners to retry failures and support RequiresStableInput, I believe you should add the `UsesTransformThatRequiresStableInput` class as well. This would express the need for a runner to support both in order to run the tests. Currently you're assuming that RetryFailures == support for RequiresStableInput.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135941)
Time Spent: 20m  (was: 10m)

> Support annotating that a DoFn requires stable / deterministic input for 
> replay/retry
> -
>
> Key: BEAM-3194
> URL: https://issues.apache.org/jira/browse/BEAM-3194
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Kenneth Knowles
>Assignee: Yueyang Qiu
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See the thread: 
> https://lists.apache.org/thread.html/5fd81ce371aeaf642665348f8e6940e308e04275dd7072f380f9f945@%3Cdev.beam.apache.org%3E
> We need this in order to have truly cross-runner end-to-end exactly once via 
> replay + idempotence.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3194) Support annotating that a DoFn requires stable / deterministic input for replay/retry

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3194?focusedWorklogId=135940=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135940
 ]

ASF GitHub Bot logged work on BEAM-3194:


Author: ASF GitHub Bot
Created on: 18/Aug/18 00:41
Start Date: 18/Aug/18 00:41
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #6220: 
[BEAM-3194] Add ValidatesRunner test for support of @RequiresStableInput
URL: https://github.com/apache/beam/pull/6220#discussion_r211056566
 
 

 ##
 File path: runners/apex/build.gradle
 ##
 @@ -88,6 +88,8 @@ task validatesRunnerBatch(type: Test) {
   testClassesDirs = 
files(project(":beam-sdks-java-core").sourceSets.test.output.classesDirs)
   useJUnit {
 includeCategories 'org.apache.beam.sdk.testing.ValidatesRunner'
+// TODO: support @RequiresStableInput then remove the next line
 
 Review comment:
   Having a JIRA link for each runner would be useful. This would give people a 
place to look to see if it is finished and what the progress is.




Issue Time Tracking
---

Worklog Id: (was: 135940)
Time Spent: 10m
Remaining Estimate: 0h

> Support annotating that a DoFn requires stable / deterministic input for 
> replay/retry
> -
>
> Key: BEAM-3194
> URL: https://issues.apache.org/jira/browse/BEAM-3194
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Kenneth Knowles
>Assignee: Yueyang Qiu
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> See the thread: 
> https://lists.apache.org/thread.html/5fd81ce371aeaf642665348f8e6940e308e04275dd7072f380f9f945@%3Cdev.beam.apache.org%3E
> We need this in order to have truly cross-runner end-to-end exactly once via 
> replay + idempotence.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-1251) Python 3 Support

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-1251?focusedWorklogId=135944=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135944
 ]

ASF GitHub Bot logged work on BEAM-1251:


Author: ASF GitHub Bot
Created on: 18/Aug/18 00:50
Start Date: 18/Aug/18 00:50
Worklog Time Spent: 10m 
  Work Description: cclauss commented on issue #6166: [BEAM-1251] print() 
is a function in Python 3
URL: https://github.com/apache/beam/pull/6166#issuecomment-414020142
 
 
   @charlesccychen @holdenk Your reviews please?




Issue Time Tracking
---

Worklog Id: (was: 135944)
Time Spent: 19h 10m  (was: 19h)

> Python 3 Support
> 
>
> Key: BEAM-1251
> URL: https://issues.apache.org/jira/browse/BEAM-1251
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Eyad Sibai
>Assignee: Robbe
>Priority: Major
>  Time Spent: 19h 10m
>  Remaining Estimate: 0h
>
> I have been trying to use google datalab with python3. As I see there are 
> several packages that does not support python3 yet which google datalab 
> depends on. This is one of them.
> https://github.com/GoogleCloudPlatform/DataflowPythonSDK/issues/6



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1281

2018-08-17 Thread Apache Jenkins Server
See 


Changes:

[github] [BEAM-5155] Check sdk absolute path before installing

--
[...truncated 18.67 MB...]
Aug 17, 2018 10:48:16 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Batch 
mutations together as step s39
Aug 17, 2018 10:48:16 PM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Write 
mutations to Spanner as step s40
Aug 17, 2018 10:48:16 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Staging pipeline description to 
gs://temp-storage-for-end-to-end-tests/spannerwriteit0testreportfailures-jenkins-0817224812-548d963e/output/results/staging/
Aug 17, 2018 10:48:16 PM org.apache.beam.runners.dataflow.util.PackageUtil 
tryStagePackage
INFO: Uploading <115993 bytes, hash Rt1ZfWw6w9-k9Lhy3euzYA> to 
gs://temp-storage-for-end-to-end-tests/spannerwriteit0testreportfailures-jenkins-0817224812-548d963e/output/results/staging/pipeline-Rt1ZfWw6w9-k9Lhy3euzYA.pb

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testReportFailures 
STANDARD_OUT
Dataflow SDK version: 2.7.0-SNAPSHOT

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testReportFailures 
STANDARD_ERROR
Aug 17, 2018 10:48:18 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-08-17_15_48_16-11164281257034985219?project=apache-beam-testing

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testReportFailures 
STANDARD_OUT
Submitted job: 2018-08-17_15_48_16-11164281257034985219

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testReportFailures 
STANDARD_ERROR
Aug 17, 2018 10:48:18 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To cancel the job using the 'gcloud' tool, run:
> gcloud dataflow jobs --project=apache-beam-testing cancel 
--region=us-central1 2018-08-17_15_48_16-11164281257034985219
Aug 17, 2018 10:48:18 PM 
org.apache.beam.runners.dataflow.TestDataflowRunner run
INFO: Running Dataflow job 2018-08-17_15_48_16-11164281257034985219 with 0 
expected assertions.
Aug 17, 2018 10:48:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T22:48:17.005Z: Autoscaling is enabled for job 
2018-08-17_15_48_16-11164281257034985219. The number of workers will be between 
1 and 1000.
Aug 17, 2018 10:48:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T22:48:17.030Z: Autoscaling was automatically enabled for 
job 2018-08-17_15_48_16-11164281257034985219.
Aug 17, 2018 10:48:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T22:48:19.882Z: Checking permissions granted to controller 
Service Account.
Aug 17, 2018 10:48:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T22:48:23.431Z: Worker configuration: n1-standard-1 in 
us-central1-b.
Aug 17, 2018 10:48:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T22:48:23.989Z: Expanding CoGroupByKey operations into 
optimizable parts.
Aug 17, 2018 10:48:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T22:48:24.310Z: Expanding GroupByKey operations into 
optimizable parts.
Aug 17, 2018 10:48:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T22:48:24.336Z: Lifting ValueCombiningMappingFns into 
MergeBucketsMappingFns
Aug 17, 2018 10:48:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T22:48:24.541Z: Fusing adjacent ParDo, Read, Write, and 
Flatten operations
Aug 17, 2018 10:48:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T22:48:24.574Z: Elided trivial flatten 
Aug 17, 2018 10:48:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T22:48:24.609Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map into SpannerIO.Write/Write 
mutations to Cloud Spanner/Create seed/Read(CreateSource)
Aug 17, 2018 10:48:29 PM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-17T22:48:24.644Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Read information schema into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map
Aug 17, 2018 10:48:29 PM 

[jira] [Work logged] (BEAM-2930) Flink support for portable side input

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2930?focusedWorklogId=135903=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135903
 ]

ASF GitHub Bot logged work on BEAM-2930:


Author: ASF GitHub Bot
Created on: 17/Aug/18 22:53
Start Date: 17/Aug/18 22:53
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #6208: [WIP] [BEAM-2930] 
Side input support for Flink portable streaming.
URL: https://github.com/apache/beam/pull/6208#issuecomment-414007244
 
 
   Early/repeated firings provide inconsistent side input values between 
bundles. Side input access is meant to provide the latest firing at the time 
the bundle starts processing; it is up to the runner to keep the latency 
between a side input firing being materialized and becoming visible to the 
consumer minimal, but there is no latency requirement other than best effort.
   
   Only side inputs that have a single firing can be reasoned about concretely.




Issue Time Tracking
---

Worklog Id: (was: 135903)
Time Spent: 3h 40m  (was: 3.5h)

> Flink support for portable side input
> -
>
> Key: BEAM-2930
> URL: https://issues.apache.org/jira/browse/BEAM-2930
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2930) Flink support for portable side input

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2930?focusedWorklogId=135910=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135910
 ]

ASF GitHub Bot logged work on BEAM-2930:


Author: ASF GitHub Bot
Created on: 17/Aug/18 23:02
Start Date: 17/Aug/18 23:02
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6208: [WIP] [BEAM-2930] Side 
input support for Flink portable streaming.
URL: https://github.com/apache/beam/pull/6208#issuecomment-414008492
 
 
   OK, then the idea of building up state as an Iterable, as shown in this PR, 
is definitely not going to work.
   
   Repeated firing isn't idempotent, at least with Flink. If the pipeline 
reverts to a checkpoint, a change in the order of arrival between the 
(updating) side input and the main input can lead to a different result even 
for the same bundle.




Issue Time Tracking
---

Worklog Id: (was: 135910)
Time Spent: 3h 50m  (was: 3h 40m)

> Flink support for portable side input
> -
>
> Key: BEAM-2930
> URL: https://issues.apache.org/jira/browse/BEAM-2930
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3286) Go SDK support for portable side input

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3286?focusedWorklogId=135918=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135918
 ]

ASF GitHub Bot logged work on BEAM-3286:


Author: ASF GitHub Bot
Created on: 17/Aug/18 23:22
Start Date: 17/Aug/18 23:22
Worklog Time Spent: 10m 
  Work Description: herohde commented on a change in pull request #6197: 
[BEAM-3286] Add preliminary Go support for side input
URL: https://github.com/apache/beam/pull/6197#discussion_r211052512
 
 

 ##
 File path: sdks/go/pkg/beam/core/runtime/graphx/sideinput.go
 ##
 @@ -0,0 +1,37 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package graphx
+
+import (
+   "github.com/apache/beam/sdks/go/pkg/beam/core/graph/coder"
+)
+
+// Side input support
+//
+// The Beam model supports only MultiMap side input, so for
+// iterable side input (= the only kind in Go so far) we must
+// implicitly use a fixed key (""). We use a special execution
+// unit as well to handle nested KVs.
+
+const (
+   URNIterableSideInputKey = "beam:go:transform:iterablesideinputkey:v1"
 
 Review comment:
   You're right. I added this file when I expected a lot more supporting code 
and complication for side input support. It looks rather silly now :)
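The fixed-key convention documented in this file (iterable side input stored as a MultiMap under the single key "") can be sketched generically. This is an illustration of the encoding only, not the Go SDK's runtime code; the helper names are made up:

```python
# Illustrative sketch: model an iterable side input on top of a
# MultiMap-only side input model by using one fixed key ("").
FIXED_KEY = ""

def to_multimap(values):
    """Encode an iterable side input as a single-key multimap."""
    return {FIXED_KEY: list(values)}

def iterable_view(multimap):
    """Decode: reading the iterable back is a lookup on the fixed key."""
    return multimap.get(FIXED_KEY, [])

encoded = to_multimap([1, 2, 3])
assert iterable_view(encoded) == [1, 2, 3]
```

With this encoding, only nested KV values need extra handling on the decode side, which is what the special execution unit mentioned in the file comment is for.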




Issue Time Tracking
---

Worklog Id: (was: 135918)
Time Spent: 1h 40m  (was: 1.5h)

> Go SDK support for portable side input
> --
>
> Key: BEAM-3286
> URL: https://issues.apache.org/jira/browse/BEAM-3286
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2930) Flink support for portable side input

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2930?focusedWorklogId=135917=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135917
 ]

ASF GitHub Bot logged work on BEAM-2930:


Author: ASF GitHub Bot
Created on: 17/Aug/18 23:21
Start Date: 17/Aug/18 23:21
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #6208: [WIP] [BEAM-2930] 
Side input support for Flink portable streaming.
URL: https://github.com/apache/beam/pull/6208#issuecomment-414010931
 
 
   I believe having a different result is OK, since it is a possible answer 
that could have happened had the order of the original events been different 
and the checkpoint revert not occurred.




Issue Time Tracking
---

Worklog Id: (was: 135917)
Time Spent: 4h  (was: 3h 50m)

> Flink support for portable side input
> -
>
> Key: BEAM-2930
> URL: https://issues.apache.org/jira/browse/BEAM-2930
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 4h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5121) Investigate flattening issue of nested Row

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5121?focusedWorklogId=135833=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135833
 ]

ASF GitHub Bot logged work on BEAM-5121:


Author: ASF GitHub Bot
Created on: 17/Aug/18 20:42
Start Date: 17/Aug/18 20:42
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #6246: [BEAM-5121] Test 
flattening behavior of nested Row
URL: https://github.com/apache/beam/pull/6246#issuecomment-413982923
 
 
   Check
   https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/impl/parser/BeamDDLNestedTypesTest.java#L49
   and
   https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/utils/QuickCheckGenerators.java#L105
   for how Row is being tested in `BeamDDLNestedTypesTest`.




Issue Time Tracking
---

Worklog Id: (was: 135833)
Time Spent: 0.5h  (was: 20m)

> Investigate flattening issue of nested Row
> --
>
> Key: BEAM-5121
> URL: https://issues.apache.org/jira/browse/BEAM-5121
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135889=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135889
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 22:28
Start Date: 17/Aug/18 22:28
Worklog Time Spent: 10m 
  Work Description: aalbatross commented on a change in pull request #6146: 
[BEAM-3026] Adding retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#discussion_r211046256
 
 

 ##
 File path: 
sdks/java/io/elasticsearch-tests/elasticsearch-tests-common/src/test/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIOTestCommon.java
 ##
 @@ -512,4 +527,44 @@ public Void apply(Iterable input) {
   return null;
 }
   }
+
+  /** Test that the default predicate correctly parses chosen error code. */
+  public void testDefaultRetryPredicate(RestClient restClient) throws 
IOException {
+assertFalse(DEFAULT_RETRY_PREDICATE.test(new IOException("test")));
+String x =
+"{ \"index\" : { \"_index\" : \"test\", \"_type\" : \"doc\", \"_id\" : 
\"1\" } }\n"
++ "{ \"field1\" : @ }\n";
+HttpEntity entity = new NStringEntity(x, ContentType.APPLICATION_JSON);
+
+Response response = restClient.performRequest("POST", "/_bulk", 
Collections.emptyMap(), entity);
+assertTrue(CUSTOM_RETRY_PREDICATE.test(new ResponseException(response)));
+  }
+
+  /**
+   * Test that retries are invoked when Elasticsearch returns a specific error 
code. We invoke this
+   * by issuing corrupt data and retrying on the `400` error code. Normal 
behaviour is to retry on
+   * `429` only but that is difficult to simulate reliably. The logger is used 
to verify expected
 
 Review comment:
   @echauchot the test is interesting, but I am afraid I can't think of a way 
to simulate that condition (since the retry code is internal and I can't change 
the request while the pipeline is executing). Please suggest!




Issue Time Tracking
---

Worklog Id: (was: 135889)
Time Spent: 10h  (was: 9h 50m)

> Improve retrying in ElasticSearch client
> 
>
> Key: BEAM-3026
> URL: https://issues.apache.org/jira/browse/BEAM-3026
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-elasticsearch
>Reporter: Tim Robertson
>Assignee: Ravi Pathak
>Priority: Major
> Fix For: 2.7.0
>
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> Currently an overloaded ES server will result in clients failing fast.
> I suggest implementing backoff pauses.  Perhaps something like this:
> {code}
> ElasticsearchIO.ConnectionConfiguration conn = 
> ElasticsearchIO.ConnectionConfiguration
>   .create(new String[]{"http://...:9200"}, "test", "test")
>   .retryWithWaitStrategy(WaitStrategies.exponentialBackoff(1000, 
> TimeUnit.MILLISECONDS))
>   .retryWithStopStrategy(StopStrategies.stopAfterAttempt(10));
> {code}
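The wait and stop strategies suggested above amount to exponential backoff capped at a fixed number of attempts. A minimal, generic sketch of that retry loop (plain Python, not Beam's or the Elasticsearch client's API; `flaky_call` is a stand-in for a bulk request against an overloaded server):

```python
import time

def retry_with_backoff(op, base_ms=1000, max_attempts=10, sleep=time.sleep):
    """Retry op() with exponentially growing pauses, giving up after max_attempts."""
    for attempt in range(max_attempts):
        try:
            return op()
        except IOError:
            if attempt == max_attempts - 1:
                raise  # stop strategy: stopAfterAttempt(max_attempts)
            # wait strategy: base, 2*base, 4*base, ... milliseconds
            sleep(base_ms * (2 ** attempt) / 1000.0)

calls = []
def flaky_call():
    calls.append(1)
    if len(calls) < 3:  # fail the first two attempts
        raise IOError("429 Too Many Requests")
    return "ok"

# Succeeds on the third attempt, after two backoff pauses.
result = retry_with_backoff(flaky_call, base_ms=1, max_attempts=10)
```

Separating the wait strategy (how long to pause) from the stop strategy (when to give up), as the suggested API does, lets either be swapped independently.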



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-5155) Custom sdk_location parameter not working with fn_api

2018-08-17 Thread Ankur Goenka (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka resolved BEAM-5155.

   Resolution: Fixed
Fix Version/s: 2.7.0

> Custom sdk_location parameter not working with fn_api
> -
>
> Key: BEAM-5155
> URL: https://issues.apache.org/jira/browse/BEAM-5155
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.7.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The custom sdk_location is not taking affect in portability framework.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5121) Investigate flattening issue of nested Row

2018-08-17 Thread Rui Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584484#comment-16584484
 ] 

Rui Wang commented on BEAM-5121:


It turns out that SELECT row.row.row_member (or even row.row) does not work.

> Investigate flattening issue of nested Row
> --
>
> Key: BEAM-5121
> URL: https://issues.apache.org/jira/browse/BEAM-5121
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4012) Suppress tracebacks on failure-catching tests

2018-08-17 Thread Juan Carlos Cardenas (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584547#comment-16584547
 ] 

Juan Carlos Cardenas commented on BEAM-4012:


*Tests Modified:*
 * fn_api_runner_test.py
 * portable_runner_test.py

*Please check:* 

[https://builds.apache.org/job/beam_PreCommit_Python_Commit/874/consoleFull]

The only exceptions remaining in STDERR are the ones related to 
https://issues.apache.org/jira/browse/BEAM-5044 
{code:java}
...
...
  File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/testing/util.py",
 line 119, in _equal
'Failed assert: %r == %r' % (sorted_expected, sorted_actual))
BeamAssertException: Failed assert: ['a'] == ['a', 'b'] [while running 
'assert_that/Match']
ok

{code}
This wrapper was created to suppress printing tracebacks on STDERR during 
tests:
{code:java}
from apache_beam.testing.test_utils import BlockStderr 

with BlockStderr() as b:
    ...  # run the checks whose tracebacks should stay off STDERR
{code}
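For reference, the standard library can achieve the same effect without a custom helper. A minimal sketch using `contextlib.redirect_stderr` (`block_stderr` here is illustrative and is not the proposed `BlockStderr` class):

```python
import contextlib
import io
import sys

@contextlib.contextmanager
def block_stderr():
    """Capture everything written to stderr inside the block instead of printing it."""
    captured = io.StringIO()
    with contextlib.redirect_stderr(captured):
        yield captured

with block_stderr() as buf:
    # Traceback-like noise from a failure-catching test stays off the console.
    print("BeamAssertException: Failed assert: ['a'] == ['a', 'b']", file=sys.stderr)

captured_text = buf.getvalue()  # still available for inspection after the block
```

Note that `redirect_stderr` only swaps `sys.stderr` at the Python level; output written directly to file descriptor 2 by C extensions is unaffected.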

> Suppress tracebacks on failure-catching tests
> -
>
> Key: BEAM-4012
> URL: https://issues.apache.org/jira/browse/BEAM-4012
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Ahmet Altay
>Priority: Major
>  Labels: beginner
>
> Our tests of assert_that (e.g. in 
> apache_beam.runners.portability.fn_api_runner_test) dump the expected "error" 
> to stdout making our tests noisy (and they look like failures). It'd be good 
> to suppress these in tests (while making sure things are still properly 
> logged on workers. 
> There are probably other tests of similar nature. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4447) Python SDK assert_that keyword argument order change

2018-08-17 Thread Juan Carlos Cardenas (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584590#comment-16584590
 ] 

Juan Carlos Cardenas commented on BEAM-4447:


the pull request was merged, should we mark this Jira as resolved?

> Python SDK assert_that keyword argument order change
> 
>
> Key: BEAM-4447
> URL: https://issues.apache.org/jira/browse/BEAM-4447
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, testing
>Reporter: Luke Cwik
>Assignee: María GH
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/pull/5279] changed the argument order for 
> assert_that which was caught when importing the Apache Beam codebase 
> internally into Google.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3026) Improve retrying in ElasticSearch client

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3026?focusedWorklogId=135890=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135890
 ]

ASF GitHub Bot logged work on BEAM-3026:


Author: ASF GitHub Bot
Created on: 17/Aug/18 22:42
Start Date: 17/Aug/18 22:42
Worklog Time Spent: 10m 
  Work Description: aalbatross commented on issue #6146: [BEAM-3026] Adding 
retrying behavior on ElasticSearchIO
URL: https://github.com/apache/beam/pull/6146#issuecomment-414005728
 
 
   @echauchot 
   Updated the PR based on comments:
   
   Changes are as follows:
   1. Updated the design to remove the common classes; used an AutoValue 
builder for RetryConfiguration in ESIO.
   2. RetryPredicate now uses the Response object instead of Throwable.
   3. Removed the maxDuration default in ESIO.
   4. Updated the retry logic to match the logic suggested in the comment.
   5. Explanation of the expected 2 retries in testWriteRetry: 3 is the maximum 
number of attempts, which includes 1 original request and 2 retries. Updated 
the constants.
   6. Updated the Javadoc per the specific comments.




Issue Time Tracking
---

Worklog Id: (was: 135890)
Time Spent: 10h 10m  (was: 10h)

> Improve retrying in ElasticSearch client
> 
>
> Key: BEAM-3026
> URL: https://issues.apache.org/jira/browse/BEAM-3026
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-elasticsearch
>Reporter: Tim Robertson
>Assignee: Ravi Pathak
>Priority: Major
> Fix For: 2.7.0
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently an overloaded ES server will result in clients failing fast.
> I suggest implementing backoff pauses.  Perhaps something like this:
> {code}
> ElasticsearchIO.ConnectionConfiguration conn = 
> ElasticsearchIO.ConnectionConfiguration
>   .create(new String[]{"http://...:9200"}, "test", "test")
>   .retryWithWaitStrategy(WaitStrategies.exponentialBackoff(1000, 
> TimeUnit.MILLISECONDS))
>   .retryWithStopStrategy(StopStrategies.stopAfterAttempt(10));
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Java_GradleBuild #1283

2018-08-17 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-4579) Upgrade Dependency of Beam Python SDK

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4579?focusedWorklogId=135947=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135947
 ]

ASF GitHub Bot logged work on BEAM-4579:


Author: ASF GitHub Bot
Created on: 18/Aug/18 01:03
Start Date: 18/Aug/18 01:03
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on issue #5674: DO NOT MERGE, 
[BEAM-4579] Upgrade dependencies of Python SDK
URL: https://github.com/apache/beam/pull/5674#issuecomment-414021130
 
 
   This pull request has been marked as stale due to 60 days of inactivity. It 
will be closed in 1 week if no further activity occurs. If you think that’s 
incorrect or this pull request requires a review, please simply write any 
comment. If closed, you can revive the PR at any time and @mention a reviewer 
or discuss it on the d...@beam.apache.org list. Thank you for your 
contributions.
   




Issue Time Tracking
---

Worklog Id: (was: 135947)
Time Spent: 1h 20m  (was: 1h 10m)

> Upgrade Dependency of Beam Python SDK
> -
>
> Key: BEAM-4579
> URL: https://issues.apache.org/jira/browse/BEAM-4579
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, sdk-py-harness
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PreCommit_Java_Cron #239

2018-08-17 Thread Apache Jenkins Server
See 


Changes:

[github] [BEAM-5155] Check sdk absolute path before installing

[daniel.o.programmer] [BEAM-4679] Re-adding Flatten roots in ULR.

[pablo] Pipeline Graph from Interactive Beam -- made faster

--
[...truncated 15.29 MB...]
INFO: SQLPlan>
LogicalProject(auction=[$0], num=[$1])
  LogicalJoin(condition=[AND(=($2, $4), >=($1, $3))], joinType=[inner])
LogicalProject(auction=[$0], num=[$2], starttime=[$1])
  LogicalAggregate(group=[{0, 1}], num=[COUNT()])
LogicalProject(auction=[$0], $f1=[HOP($3, 5000, 1)])
  BeamIOSourceRel(table=[[beam, Bid]])
LogicalProject(maxnum=[$1], starttime=[$0])
  LogicalAggregate(group=[{0}], maxnum=[MAX($1)])
LogicalProject(starttime=[$1], num=[$0])
  LogicalProject(num=[$2], starttime=[$1])
LogicalAggregate(group=[{0, 1}], num=[COUNT()])
  LogicalProject(auction=[$0], $f1=[HOP($3, 5000, 1)])
BeamIOSourceRel(table=[[beam, Bid]])

Aug 18, 2018 1:28:56 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: BEAMPlan>
BeamCalcRel(expr#0..4=[{inputs}], proj#0..1=[{exprs}])
  BeamJoinRel(condition=[AND(=($2, $4), >=($1, $3))], joinType=[inner])
BeamCalcRel(expr#0..2=[{inputs}], auction=[$t0], num=[$t2], 
starttime=[$t1])
  BeamAggregationRel(group=[{0, 1}], num=[COUNT()])
BeamCalcRel(expr#0..4=[{inputs}], expr#5=[5000], expr#6=[1], 
expr#7=[HOP($t3, $t5, $t6)], auction=[$t0], $f1=[$t7])
  BeamIOSourceRel(table=[[beam, Bid]])
BeamCalcRel(expr#0..1=[{inputs}], maxnum=[$t1], starttime=[$t0])
  BeamAggregationRel(group=[{1}], maxnum=[MAX($0)])
BeamCalcRel(expr#0..2=[{inputs}], num=[$t2], starttime=[$t1])
  BeamAggregationRel(group=[{0, 1}], num=[COUNT()])
BeamCalcRel(expr#0..4=[{inputs}], expr#5=[5000], 
expr#6=[1], expr#7=[HOP($t3, $t5, $t6)], auction=[$t0], $f1=[$t7])
  BeamIOSourceRel(table=[[beam, Bid]])


org.apache.beam.sdk.nexmark.queries.sql.SqlQuery3Test > 
testJoinsPeopleWithAuctions STANDARD_ERROR
Aug 18, 2018 1:28:56 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQL:
SELECT `P`.`name`, `P`.`city`, `P`.`state`, `A`.`id`
FROM `beam`.`Auction` AS `A`
INNER JOIN `beam`.`Person` AS `P` ON `A`.`seller` = `P`.`id`
WHERE `A`.`category` = 10 AND (`P`.`state` = 'OR' OR `P`.`state` = 'ID' OR 
`P`.`state` = 'CA')
Aug 18, 2018 1:28:56 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQLPlan>
LogicalProject(name=[$11], city=[$14], state=[$15], id=[$0])
  LogicalFilter(condition=[AND(=($8, 10), OR(=($15, 'OR'), =($15, 'ID'), 
=($15, 'CA')))])
LogicalJoin(condition=[=($7, $10)], joinType=[inner])
  BeamIOSourceRel(table=[[beam, Auction]])
  BeamIOSourceRel(table=[[beam, Person]])

Aug 18, 2018 1:28:56 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: BEAMPlan>
BeamCalcRel(expr#0..17=[{inputs}], name=[$t11], city=[$t14], state=[$t15], 
id=[$t0])
  BeamJoinRel(condition=[=($7, $10)], joinType=[inner])
BeamCalcRel(expr#0..9=[{inputs}], expr#10=[10], expr#11=[=($t8, $t10)], 
proj#0..9=[{exprs}], $condition=[$t11])
  BeamIOSourceRel(table=[[beam, Auction]])
BeamCalcRel(expr#0..7=[{inputs}], expr#8=['OR'], expr#9=[=($t5, $t8)], 
expr#10=['ID'], expr#11=[=($t5, $t10)], expr#12=['CA'], expr#13=[=($t5, $t12)], 
expr#14=[OR($t9, $t11, $t13)], proj#0..7=[{exprs}], $condition=[$t14])
  BeamIOSourceRel(table=[[beam, Person]])


org.apache.beam.sdk.nexmark.queries.sql.SqlQuery7Test > testBids STANDARD_ERROR
Aug 18, 2018 1:28:56 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQL:
SELECT `B`.`auction`, `B`.`price`, `B`.`bidder`, `B`.`dateTime`, `B`.`extra`
FROM (SELECT `B`.`auction`, `B`.`price`, `B`.`bidder`, `B`.`dateTime`, 
`B`.`extra`, TUMBLE_START(`B`.`dateTime`, INTERVAL '10' SECOND) AS `starttime`
FROM `beam`.`Bid` AS `B`
GROUP BY `B`.`auction`, `B`.`price`, `B`.`bidder`, `B`.`dateTime`, 
`B`.`extra`, TUMBLE(`B`.`dateTime`, INTERVAL '10' SECOND)) AS `B`
INNER JOIN (SELECT MAX(`B1`.`price`) AS `maxprice`, 
TUMBLE_START(`B1`.`dateTime`, INTERVAL '10' SECOND) AS `starttime`
FROM `beam`.`Bid` AS `B1`
GROUP BY TUMBLE(`B1`.`dateTime`, INTERVAL '10' SECOND)) AS `B1` ON 
`B`.`starttime` = `B1`.`starttime` AND `B`.`price` = `B1`.`maxprice`
Aug 18, 2018 1:28:56 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQLPlan>
LogicalProject(auction=[$0], price=[$1], bidder=[$2], dateTime=[$3], 
extra=[$4])
  

Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1284

2018-08-17 Thread Apache Jenkins Server
See 


--
[...truncated 20.25 MB...]
INFO: 2018-08-18T01:57:32.738Z: Checking required Cloud APIs are enabled.
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:32.929Z: Checking permissions granted to controller 
Service Account.
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:36.708Z: Worker configuration: n1-standard-1 in 
us-central1-b.
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:37.225Z: Expanding CoGroupByKey operations into 
optimizable parts.
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:37.469Z: Expanding GroupByKey operations into 
optimizable parts.
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:37.517Z: Lifting ValueCombiningMappingFns into 
MergeBucketsMappingFns
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:37.773Z: Fusing adjacent ParDo, Read, Write, and 
Flatten operations
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:37.810Z: Elided trivial flatten 
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:37.845Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map into SpannerIO.Write/Write 
mutations to Cloud Spanner/Create seed/Read(CreateSource)
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:37.879Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Read information schema into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:37.921Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Write
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/ParDo(UseWindowHashAsKeyAndWindowAsSortKey)
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:37.963Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/ParDo(IsmRecordForSingularValuePerWindow) 
into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Read
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:38.011Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/WithKeys/AddKeys/Map
 into SpannerIO.Write/Write mutations to Cloud Spanner/Read information schema
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:38.057Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/GroupByKey/Read
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:38.093Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Values/Values/Map
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues/Extract
Aug 18, 2018 1:57:43 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-08-18T01:57:38.132Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 

[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=135818&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135818
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 17/Aug/18 20:00
Start Date: 17/Aug/18 20:00
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6209: [BEAM-4176] clean up 
artifacts per-job in portable VR tests
URL: https://github.com/apache/beam/pull/6209#issuecomment-413973409
 
 
   Ping for merge
   cc: @lukecwik @mxm 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135818)
Time Spent: 16h 10m  (was: 16h)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
> Attachments: Screen Shot 2018-08-14 at 4.18.31 PM.png
>
>  Time Spent: 16h 10m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5121) Investigate flattening issue of nested Row

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5121?focusedWorklogId=135832&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135832
 ]

ASF GitHub Bot logged work on BEAM-5121:


Author: ASF GitHub Bot
Created on: 17/Aug/18 20:41
Start Date: 17/Aug/18 20:41
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #6246: [BEAM-5121] Test 
flattening behavior of nested Row
URL: https://github.com/apache/beam/pull/6246#issuecomment-413982552
 
 
   R: @akedin 
   cc: @apilloud 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135832)
Time Spent: 20m  (was: 10m)

> Investigate flattening issue of nested Row
> --
>
> Key: BEAM-5121
> URL: https://issues.apache.org/jira/browse/BEAM-5121
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4679) Support portable combiner lifting in Java Reference Runner

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4679?focusedWorklogId=135817&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135817
 ]

ASF GitHub Bot logged work on BEAM-4679:


Author: ASF GitHub Bot
Created on: 17/Aug/18 19:58
Start Date: 17/Aug/18 19:58
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6141: [BEAM-4679] 
Re-adding Flatten roots in ULR.
URL: https://github.com/apache/beam/pull/6141#issuecomment-413973038
 
 
   Ping for merge 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135817)
Time Spent: 3h  (was: 2h 50m)

> Support portable combiner lifting in Java Reference Runner
> --
>
> Key: BEAM-4679
> URL: https://issues.apache.org/jira/browse/BEAM-4679
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-direct
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Adjust the Reference Runner in Java to support portable combiner lifting as 
> described in the following doc:
> https://s.apache.org/beam-runner-api-combine-model



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5121) Investigate flattening issue of nested Row

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5121?focusedWorklogId=135831&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135831
 ]

ASF GitHub Bot logged work on BEAM-5121:


Author: ASF GitHub Bot
Created on: 17/Aug/18 20:38
Start Date: 17/Aug/18 20:38
Worklog Time Spent: 10m 
  Work Description: amaliujia opened a new pull request #6246: [BEAM-5121] 
Test flattening behavior of nested Row
URL: https://github.com/apache/beam/pull/6246
 
 
   Test flattening behavior of nested Row.
   
   Note `BeamDDLNestedTypesTest` already tests ROW, so it is not modified in this PR.
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | --- | --- | --- | ---
   
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135831)
Time Spent: 10m
Remaining Estimate: 0h

> Investigate flattening issue of nested Row
> --
>
> Key: BEAM-5121
> URL: https://issues.apache.org/jira/browse/BEAM-5121
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message 

[jira] [Commented] (BEAM-5069) Incorrect string type check in _OutputProcessor.process_outputs.

2018-08-17 Thread Juan Carlos Cardenas (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584329#comment-16584329
 ] 

Juan Carlos Cardenas commented on BEAM-5069:


This change has already been merged; should we close this item?

> Incorrect string type check in _OutputProcessor.process_outputs.
> 
>
> Key: BEAM-5069
> URL: https://issues.apache.org/jira/browse/BEAM-5069
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Blocker
>
> Due to _# cython: language_level=3_ directive in 
> [common.py|https://github.com/apache/beam/blob/8f1d86ea7bd5b118bc7e638d4d484445101526a1/sdks/python/apache_beam/runners/common.py#L18],
>  the type check in 
> [common.py|https://github.com/apache/beam/blob/8f1d86ea7bd5b118bc7e638d4d484445101526a1/sdks/python/apache_beam/runners/common.py#L729]
>  does not recognize Python 2 strings, which have type 'bytes'.
> This causes pipelines to crash when this codepath is executed in Cythonized 
> version of the SDK.
> The regression happened during ongoing efforts to make Beam codebase Python 3 
> compatible.
> The fix will need to be cherry-picked into the 2.6.0 release.
> Working on the fix.
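A minimal sketch (illustrative, not Beam's actual code) of the pitfall the description above names: under the `# cython: language_level=3` directive, a bare `str` check compiles with the Python 3 meaning of `str` (unicode), so native Python 2 strings, which are `bytes`, fall through. Checking against both types is robust under either interpreter; the function names here are made up for illustration.

```python
def is_text_like_narrow(value):
    # Fragile form: with Python 3 semantics, `str` means unicode only,
    # so byte strings (Python 2 native strings) are not recognized.
    return isinstance(value, str)


def is_text_like_robust(value):
    # Accepts both unicode text and byte strings.
    return isinstance(value, (str, bytes))


print(is_text_like_narrow(b"hello"))  # False under Python 3 semantics
print(is_text_like_robust(b"hello"))  # True
```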



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=135824&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135824
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 17/Aug/18 20:07
Start Date: 17/Aug/18 20:07
Worklog Time Spent: 10m 
  Work Description: vectorijk removed a comment on issue #5926: [BEAM-4723] 
[SQL] Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-413937489
 
 
   run java precommit.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135824)
Time Spent: 4h 50m  (was: 4h 40m)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Datetime*Expression currently only supports the timestamp type for the first 
> operand. We should let it accept all Datetime_Types.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4723) Enhance Datetime*Expression Datetime Type

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4723?focusedWorklogId=135825&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135825
 ]

ASF GitHub Bot logged work on BEAM-4723:


Author: ASF GitHub Bot
Created on: 17/Aug/18 20:07
Start Date: 17/Aug/18 20:07
Worklog Time Spent: 10m 
  Work Description: vectorijk commented on issue #5926: [BEAM-4723] [SQL] 
Support datetime type minus time interval
URL: https://github.com/apache/beam/pull/5926#issuecomment-413975092
 
 
   run java precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135825)
Time Spent: 5h  (was: 4h 50m)

> Enhance Datetime*Expression Datetime Type
> -
>
> Key: BEAM-4723
> URL: https://issues.apache.org/jira/browse/BEAM-4723
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Datetime*Expression currently only supports the timestamp type for the first 
> operand. We should let it accept all Datetime_Types.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5110) Reconcile Flink JVM singleton management with deployment

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5110?focusedWorklogId=135822&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135822
 ]

ASF GitHub Bot logged work on BEAM-5110:


Author: ASF GitHub Bot
Created on: 17/Aug/18 20:07
Start Date: 17/Aug/18 20:07
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6189: [BEAM-5110] 
Explicitly count the references for BatchFlinkExecutableStageContext …
URL: https://github.com/apache/beam/pull/6189#issuecomment-413975006
 
 
   This PR only addresses a bug in the current implementation of environment 
management.
   @tweise Shall we wait for the discussion on environment management before 
pursuing this PR?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135822)
Time Spent: 2h 20m  (was: 2h 10m)

> Reconcile Flink JVM singleton management with deployment
> ---
>
> Key: BEAM-5110
> URL: https://issues.apache.org/jira/browse/BEAM-5110
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Ben Sidhom
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> [~angoenka] noticed through debugging that multiple instances of 
> BatchFlinkExecutableStageContext.BatchFactory are loaded for a given job when 
> executing in standalone cluster mode. This context factory is responsible for 
> maintaining singleton state across a TaskManager (JVM) in order to share SDK 
> Environments across workers in a given job. The multiple-loading breaks 
> singleton semantics and results in an indeterminate number of Environments 
> being created.
> It turns out that the [Flink classloading 
> mechanism|https://ci.apache.org/projects/flink/flink-docs-release-1.5/monitoring/debugging_classloading.html]
>  is determined by deployment mode. Note that "user code" as referenced by 
> this link is actually the Flink job server jar. Actual end-user code lives 
> inside of the SDK Environment and uploaded artifacts.
> In order to maintain singletons without resorting to IPC (for example, using 
> file locks and/or additional gRPC servers), we need to force non-dynamic 
> classloading. For example, this happens when jobs are submitted to YARN for 
> one-off deployments via `flink run`. However, connecting to an existing 
> (Flink standalone) deployment results in dynamic classloading.
> We should investigate this behavior and either document (and attempt to 
> enforce) deployment modes that are consistent with our requirements, or (if 
> possible) create a custom classloader that enforces singleton loading.
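The fix the PR title above describes ("explicitly count the references for BatchFlinkExecutableStageContext") can be illustrated with a minimal sketch of reference-counted singleton sharing: one context per key, torn down only when the last user releases it. The actual implementation is Java inside the Flink runner; this Python version and its names are illustrative only.

```python
import threading


class RefCountedContext:
    """Share one expensive context per key; destroy it only when the
    last holder releases it. Illustrative sketch, not Beam's code."""

    _lock = threading.Lock()
    _instances = {}  # key -> (context, refcount)

    @classmethod
    def acquire(cls, key, factory):
        with cls._lock:
            ctx, count = cls._instances.get(key, (None, 0))
            if ctx is None:
                ctx = factory()  # create on first acquire only
            cls._instances[key] = (ctx, count + 1)
            return ctx

    @classmethod
    def release(cls, key):
        with cls._lock:
            ctx, count = cls._instances[key]
            if count <= 1:
                del cls._instances[key]  # last reference: drop the context
            else:
                cls._instances[key] = (ctx, count - 1)
```

With correct singleton loading, two workers acquiring the same key observe the same context object; without it (the multiple-classloader problem described above), each classloader gets its own `_instances` map and the counting silently splits.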



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5121) Investigate flattening issue of nested Row

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5121?focusedWorklogId=135837&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135837
 ]

ASF GitHub Bot logged work on BEAM-5121:


Author: ASF GitHub Bot
Created on: 17/Aug/18 20:49
Start Date: 17/Aug/18 20:49
Worklog Time Spent: 10m 
  Work Description: akedin commented on issue #6246: [BEAM-5121] Test 
flattening behavior of nested Row
URL: https://github.com/apache/beam/pull/6246#issuecomment-413984504
 
 
   This looks good. I would also add more complicated cases though, e.g. arrays 
and field access:
   
   ```
   SELECT col.row_field2.array_field[3]
   
   SELECT top_level_array_field_after_first_row_field[3]
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135837)
Time Spent: 40m  (was: 0.5h)

> Investigate flattening issue of nested Row
> --
>
> Key: BEAM-5121
> URL: https://issues.apache.org/jira/browse/BEAM-5121
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5155) Custom sdk_location parameter not working with fn_api

2018-08-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5155?focusedWorklogId=135886&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-135886
 ]

ASF GitHub Bot logged work on BEAM-5155:


Author: ASF GitHub Bot
Created on: 17/Aug/18 22:09
Start Date: 17/Aug/18 22:09
Worklog Time Spent: 10m 
  Work Description: herohde closed pull request #6233: [BEAM-5155] Check 
sdk absolute path before installing
URL: https://github.com/apache/beam/pull/6233
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/python/container/piputil.go b/sdks/python/container/piputil.go
index 1c0faa39c91..01c289ac2ae 100644
--- a/sdks/python/container/piputil.go
+++ b/sdks/python/container/piputil.go
@@ -160,7 +160,7 @@ func installSdk(files []string, workDir string, sdkSrcFile 
string, acceptableWhl
log.Printf("Could not install Apache Beam SDK from a wheel: %v, 
proceeding to install SDK from source tarball.", err)
}
if !required {
-   _, err := os.Stat(sdkSrcFile)
+   _, err := os.Stat(filepath.Join(workDir, sdkSrcFile))
if os.IsNotExist(err) {
return nil
}

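The one-line Go diff above joins `workDir` and `sdkSrcFile` before calling `os.Stat`: stat-ing a bare file name resolves against the process's current working directory rather than the staging directory, so a staged SDK tarball can look missing. A small Python sketch of the same bug and fix (file name and helper names are made up for illustration):

```python
import os
import tempfile


def sdk_is_staged_buggy(work_dir, sdk_src_file):
    # Wrong: ignores work_dir, so the check runs against the
    # current working directory instead of the staging directory.
    return os.path.exists(sdk_src_file)


def sdk_is_staged_fixed(work_dir, sdk_src_file):
    # Right: resolve the file name relative to the staging directory,
    # mirroring filepath.Join(workDir, sdkSrcFile) in the Go fix.
    return os.path.exists(os.path.join(work_dir, sdk_src_file))


work_dir = tempfile.mkdtemp()
sdk_file = os.path.basename(tempfile.mktemp(prefix="beam-sdk-", suffix=".tar.gz"))
open(os.path.join(work_dir, sdk_file), "w").close()  # simulate staging
print(sdk_is_staged_fixed(work_dir, sdk_file))  # True
```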

 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 135886)
Time Spent: 0.5h  (was: 20m)

> Custom sdk_location parameter not working with fn_api
> -
>
> Key: BEAM-5155
> URL: https://issues.apache.org/jira/browse/BEAM-5155
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The custom sdk_location is not taking effect in the portability framework.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

