[jira] [Commented] (BEAM-8483) Make beam_fn_api flag opt-out rather than opt-in for runners.

2019-10-25 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16959632#comment-16959632
 ] 

Maximilian Michels commented on BEAM-8483:
--

[~altay] We can use this issue to track the changes for the 2.17.0 release.

>  Make beam_fn_api flag opt-out rather than opt-in for runners.
> --
>
> Key: BEAM-8483
> URL: https://issues.apache.org/jira/browse/BEAM-8483
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.17.0
>
>
> This prevents portable runners from forgetting to add the required 
> {{beam_fn_api}} flag.
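For illustration, a minimal sketch of what users currently have to do while the
flag is opt-in, assuming the Python SDK's {{--experiments}} mechanism; making the
flag opt-out means portable runners would set it themselves:

{code:python}
from apache_beam.options.pipeline_options import DebugOptions, PipelineOptions

# While the flag is opt-in, the user must pass the experiment explicitly so a
# portable runner takes the Fn API code path.
options = PipelineOptions(['--experiments=beam_fn_api'])
assert 'beam_fn_api' in (options.view_as(DebugOptions).experiments or [])
{code}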



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8483) Make beam_fn_api flag opt-out rather than opt-in for runners.

2019-10-25 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-8483:
-
Status: Open  (was: Triage Needed)

>  Make beam_fn_api flag opt-out rather than opt-in for runners.
> --
>
> Key: BEAM-8483
> URL: https://issues.apache.org/jira/browse/BEAM-8483
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.17.0
>
>
> This prevents portable runners from forgetting to add the required 
> {{beam_fn_api}} flag.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8218) Implement Apache PulsarIO

2019-10-25 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16959909#comment-16959909
 ] 

Maximilian Michels commented on BEAM-8218:
--

Have you already started working on the connector?

> Implement Apache PulsarIO
> -
>
> Key: BEAM-8218
> URL: https://issues.apache.org/jira/browse/BEAM-8218
> Project: Beam
>  Issue Type: Task
>  Components: io-ideas
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Minor
>
> Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO 
> could be beneficial.
> [https://pulsar.apache.org/|https://pulsar.apache.org/en/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-8218) Implement Apache PulsarIO

2019-10-26 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reassigned BEAM-8218:


Assignee: (was: Alex Van Boxel)

> Implement Apache PulsarIO
> -
>
> Key: BEAM-8218
> URL: https://issues.apache.org/jira/browse/BEAM-8218
> Project: Beam
>  Issue Type: Task
>  Components: io-ideas
>Reporter: Alex Van Boxel
>Priority: Minor
>
> Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO 
> could be beneficial.
> [https://pulsar.apache.org/|https://pulsar.apache.org/en/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-8218) Implement Apache PulsarIO

2019-10-26 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reassigned BEAM-8218:


Assignee: Maximilian Michels

> Implement Apache PulsarIO
> -
>
> Key: BEAM-8218
> URL: https://issues.apache.org/jira/browse/BEAM-8218
> Project: Beam
>  Issue Type: Task
>  Components: io-ideas
>Reporter: Alex Van Boxel
>Assignee: Maximilian Michels
>Priority: Minor
>
> Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO 
> could be beneficial.
> [https://pulsar.apache.org/|https://pulsar.apache.org/en/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-8218) Implement Apache PulsarIO

2019-10-26 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reassigned BEAM-8218:


Assignee: (was: Maximilian Michels)

> Implement Apache PulsarIO
> -
>
> Key: BEAM-8218
> URL: https://issues.apache.org/jira/browse/BEAM-8218
> Project: Beam
>  Issue Type: Task
>  Components: io-ideas
>Reporter: Alex Van Boxel
>Priority: Minor
>
> Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO 
> could be beneficial.
> [https://pulsar.apache.org/|https://pulsar.apache.org/en/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-8218) Implement Apache PulsarIO

2019-10-26 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reassigned BEAM-8218:


Assignee: Taher Koitawala

> Implement Apache PulsarIO
> -
>
> Key: BEAM-8218
> URL: https://issues.apache.org/jira/browse/BEAM-8218
> Project: Beam
>  Issue Type: Task
>  Components: io-ideas
>Reporter: Alex Van Boxel
>Assignee: Taher Koitawala
>Priority: Minor
>
> Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO 
> could be beneficial.
> [https://pulsar.apache.org/|https://pulsar.apache.org/en/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8218) Implement Apache PulsarIO

2019-10-26 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16960316#comment-16960316
 ] 

Maximilian Michels commented on BEAM-8218:
--

Assigning this to [~taherk77] as per mailing list discussion.

> Implement Apache PulsarIO
> -
>
> Key: BEAM-8218
> URL: https://issues.apache.org/jira/browse/BEAM-8218
> Project: Beam
>  Issue Type: Task
>  Components: io-ideas
>Reporter: Alex Van Boxel
>Assignee: Taher Koitawala
>Priority: Minor
>
> Apache Pulsar is starting to gain popularity. Having a native Beam PulsarIO 
> could be beneficial.
> [https://pulsar.apache.org/|https://pulsar.apache.org/en/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-7544) [workaround available] Please provide a build against scala 2.12 for Flink runner

2019-07-04 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-7544.
--
   Resolution: Workaround
Fix Version/s: Not applicable

I'm resolving this because a sensible workaround is available. Most users won't 
need to worry about the Scala version anyways.

> [workaround available] Please provide a build against scala 2.12 for Flink 
> runner
> -
>
> Key: BEAM-7544
> URL: https://issues.apache.org/jira/browse/BEAM-7544
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Affects Versions: 2.11.0, 2.13.0
> Environment: scio 0.7.4 + scala 2.12.8
>Reporter: Cyrille Chépélov
>Priority: Minor
> Fix For: Not applicable
>
>
> Flink has supported Scala 2.12 since version 1.7, and Beam uses Flink 1.8.
> It would be useful to begin supporting Scala 2.12, as a preparation towards 
> Scala 2.13 as well, as soon as Flink supports it.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6747) Adding ExternalTransform in JavaSDK

2019-07-11 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882781#comment-16882781
 ] 

Maximilian Michels commented on BEAM-6747:
--

I think Ismael has a point. {{External}} should be part of the user-facing SDK. 
The {{core-construction-java}} module is only available to runners, not 
necessarily to users. Users should be able to construct pipelines with external 
transforms without a dependency on a runner. Not even all runners (e.g. direct 
runner) depend on {{core-construction-java}}. 

> Adding ExternalTransform in JavaSDK
> ---
>
> Key: BEAM-6747
> URL: https://issues.apache.org/jira/browse/BEAM-6747
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
> Fix For: 2.13.0
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Adding Java counterpart of Python ExternalTransform for testing Python 
> transforms from pipelines in Java SDK.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (BEAM-6747) Adding ExternalTransform in JavaSDK

2019-07-11 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882785#comment-16882785
 ] 

Maximilian Michels commented on BEAM-6747:
--

Actually, the direct runner depends on {{core-construction-java}}. Still, it 
looks like {{External}} should belong in {{java-core}}. I reckon that would 
require decoupling it from its dependency on the expansion service.

> Adding ExternalTransform in JavaSDK
> ---
>
> Key: BEAM-6747
> URL: https://issues.apache.org/jira/browse/BEAM-6747
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
> Fix For: 2.13.0
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Adding Java counterpart of Python ExternalTransform for testing Python 
> transforms from pipelines in Java SDK.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (BEAM-7438) Distribution and Gauge metrics are not being exported to Flink dashboard neither Prometheus IO

2019-07-11 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16882795#comment-16882795
 ] 

Maximilian Michels commented on BEAM-7438:
--

The display issue would also affect other reporters, not only the web UI. I 
still believe the reason is that the Beam distribution type is reported as a 
Flink gauge, which reporters do not interpret correctly. The mapping that you 
linked does exactly that. So if you want to fix this, you need to report 
the Beam distribution as a Flink distribution type. Happy to review the PR.



> Distribution and Gauge metrics are not being exported to Flink dashboard 
> neither Prometheus IO
> --
>
> Key: BEAM-7438
> URL: https://issues.apache.org/jira/browse/BEAM-7438
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.12.0, 2.13.0
>Reporter: Ricardo Bordon
>Priority: Major
> Attachments: image-2019-05-29-11-24-36-911.png, 
> image-2019-05-29-11-26-49-685.png
>
>
> Distribution and gauge metrics are not visible in the Flink dashboard nor in 
> Prometheus.
> I was able to debug the runner code and see that these metrics are being 
> updated via *FlinkMetricContainer#updateDistributions()* and 
> *FlinkMetricContainer#updateGauges()* (meaning they are treated as "attempted 
> metrics"), but they are not visible when looking at them in the Flink 
> Dashboard or Prometheus. On the other hand, *Counter* metrics work as 
> expected.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9

2019-07-12 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-7730:
-
Status: Open  (was: Triage Needed)

> Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
> -
>
> Key: BEAM-7730
> URL: https://issues.apache.org/jira/browse/BEAM-7730
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.14.0
>
>
> Apache Flink 1.9 is coming, and it would be good to add a Flink 1.9 build 
> target and make the Flink Runner compatible with Flink 1.9.
> I will add the brief changes after Flink 1.9.0 is released. 
> I would appreciate it if you could leave your suggestions or comments!



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9

2019-07-12 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883736#comment-16883736
 ] 

Maximilian Michels commented on BEAM-7730:
--

Hey [~sunjincheng121]! Adding a new Flink build target is quite easy if there 
are no backwards-incompatible changes. You can see how it is done in the 
{{runners/flink/1.x}} directories which all inherit from the 
{{flink_runner.gradle}} base file with a few modifications to the source 
directories which may be version-specific.

Once we add the 1.9 target, we may want to remove 1.5 because we already build 
against four versions of Flink. There is also some potential for simplifying the 
code by removing 1.5-specific workarounds. For example, there is 
the `preSnapshotBarrier` hook added in 1.6 that we had to work around in 1.5 by 
buffering elements in `snapshotState` instead. However, you do not have to 
worry about this; we may do it separately.

> Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
> -
>
> Key: BEAM-7730
> URL: https://issues.apache.org/jira/browse/BEAM-7730
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.14.0
>
>
> Apache Flink 1.9 is coming, and it would be good to add a Flink 1.9 build 
> target and make the Flink Runner compatible with Flink 1.9.
> I will add the brief changes after Flink 1.9.0 is released. 
> I would appreciate it if you could leave your suggestions or comments!



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Resolved] (BEAM-6587) Let StringUtf8 be a well-known coder.

2019-07-23 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-6587.
--
   Resolution: Fixed
Fix Version/s: 2.11.0

I believe this has been fixed in the meantime. There is STRING_UTF8 in the 
proto, and both Python and Java understand it.
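As a quick illustration (a sketch, assuming the Python SDK's public coder API), 
the coder is exposed as {{StrUtf8Coder}} and round-trips strings:

{code:python}
from apache_beam.coders import StrUtf8Coder

# Encode and decode a unicode string with the UTF-8 coder; Java reads the same
# encoding via the shared STRING_UTF8 entry in the proto.
coder = StrUtf8Coder()
assert coder.decode(coder.encode(u'süß')) == u'süß'
{code}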

> Let StringUtf8 be a well-known coder.
> -
>
> Key: BEAM-6587
> URL: https://issues.apache.org/jira/browse/BEAM-6587
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-harness, sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Luke Cwik
>Priority: Major
> Fix For: 2.11.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> It doesn't have to be understood by the runner, but it would be good if 
> most/all SDKs understood it.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (BEAM-7850) Make Environment a top level attribute of PTransform

2019-07-31 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897119#comment-16897119
 ] 

Maximilian Michels commented on BEAM-7850:
--

If we make Environment a top-level attribute of PTransform, do we also remove 
Environment from SdkFunctionSpec? There are other transforms like WindowInto 
which make use of SdkFunctionSpec and Environment.

It seems nice to have the environment as a first-class citizen of PTransform, 
but it could be an invasive change considering it is used by multiple 
components of the Proto. Could you expand on the motivation for such a change?

> Make Environment a top level attribute of PTransform
> 
>
> Key: BEAM-7850
> URL: https://issues.apache.org/jira/browse/BEAM-7850
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Chamikara Jayalath
>Priority: Major
>
> Currently, Environment is not a top-level attribute of PTransform (in the 
> runner API proto).
> [https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L99]
> Instead it is hidden inside various payload objects. For example, for ParDo, 
> environment will be inside SdkFunctionSpec of ParDoPayload.
> [https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L99]
>  
> This makes tracking the environment of different types of PTransforms harder, 
> and we have to fork code (on the type of PTransform) to extract the Environment 
> where the PTransform should be executed. It will probably be simpler to just 
> make Environment a top-level attribute of PTransform.
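To make the forking concrete, a rough Python sketch (illustration only; field 
names follow the pre-change proto, where the environment sits inside 
SdkFunctionSpec):

{code:python}
from apache_beam.portability.api import beam_runner_api_pb2

PAR_DO_URN = 'beam:transform:pardo:v1'

def environment_id(transform):
  """Illustrative only: per-payload branching is needed to find the environment."""
  spec = transform.spec
  if spec.urn == PAR_DO_URN:
    payload = beam_runner_api_pb2.ParDoPayload.FromString(spec.payload)
    return payload.do_fn.environment_id  # buried inside SdkFunctionSpec
  # ... a separate branch for every other payload type that carries an environment
  return None
{code}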



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (BEAM-7870) Externally configured KafkaIO consumer causes coder problems

2019-08-01 Thread Maximilian Michels (JIRA)
Maximilian Michels created BEAM-7870:


 Summary: Externally configured KafkaIO consumer causes coder 
problems
 Key: BEAM-7870
 URL: https://issues.apache.org/jira/browse/BEAM-7870
 Project: Beam
  Issue Type: Bug
  Components: runner-flink, sdk-java-core
Reporter: Maximilian Michels
Assignee: Maximilian Michels


There are limitations that prevent the consumer from working correctly. The 
biggest issue is the structure of KafkaIO itself, which uses a combination of the 
source interface and DoFns to generate the desired output. The problem is that the 
source interface is natively translated by the Flink Runner to support 
unbounded sources in portability, while the DoFn runs in a Java environment.

To transfer data between the two, a coder needs to be involved. It happens to be 
that the initial read does not immediately drop the KafkaRecord structure, which 
does not work well with our current assumption of only supporting 
"standard coders" present in all SDKs. Only the subsequent DoFn converts the 
KafkaRecord structure into a raw KV[byte, byte], but the DoFn won't have the 
coder available in its environment.

There are several possible solutions:
 1. Make the DoFn which drops the KafkaRecordCoder a native Java transform in 
the Flink Runner
 2. Modify KafkaIO to immediately drop the KafkaRecord structure
 3. Add the KafkaRecordCoder to all SDKs
 4. Add a generic coder, e.g. AvroCoder to all SDKs

For a workaround which uses (3), please see this patch, which is not a proper 
fix but adds KafkaRecordCoder to the SDK such that it can be used to 
encode/decode records: 
[https://github.com/mxm/beam/commit/b31cf99c75b3972018180d8ccc7e73d311f4cfed]

 

See also 
[https://github.com/apache/beam/pull/8251]
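For reference, a rough sketch of how such an externally configured consumer is 
set up from the Python SDK; the module path, parameters, and the expansion 
service address are assumptions for illustration:

{code:python}
import apache_beam as beam
from apache_beam.io.external.kafka import ReadFromKafka

# The read is expanded by a Java expansion service; records then cross the
# language boundary, which is where the KafkaRecord coder problem surfaces.
with beam.Pipeline() as p:
  _ = (p
       | ReadFromKafka(
           consumer_config={'bootstrap.servers': 'localhost:9092'},
           topics=['my_topic'],
           expansion_service='localhost:8097')  # placeholder address
       | beam.Map(print))
{code}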



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (BEAM-7870) Externally configured KafkaIO consumer causes coder problems

2019-08-01 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-7870:
-
Status: Open  (was: Triage Needed)

> Externally configured KafkaIO consumer causes coder problems
> 
>
> Key: BEAM-7870
> URL: https://issues.apache.org/jira/browse/BEAM-7870
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, sdk-java-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>
> There are limitations that prevent the consumer from working correctly. The 
> biggest issue is the structure of KafkaIO itself, which uses a combination of 
> the source interface and DoFns to generate the desired output. The problem is 
> that the source interface is natively translated by the Flink Runner to support 
> unbounded sources in portability, while the DoFn runs in a Java environment.
> To transfer data between the two, a coder needs to be involved. It happens to 
> be that the initial read does not immediately drop the KafkaRecord structure, 
> which does not work well with our current assumption of only 
> supporting "standard coders" present in all SDKs. Only the subsequent DoFn 
> converts the KafkaRecord structure into a raw KV[byte, byte], but the DoFn 
> won't have the coder available in its environment.
> There are several possible solutions:
>  1. Make the DoFn which drops the KafkaRecordCoder a native Java transform in 
> the Flink Runner
>  2. Modify KafkaIO to immediately drop the KafkaRecord structure
>  3. Add the KafkaRecordCoder to all SDKs
>  4. Add a generic coder, e.g. AvroCoder to all SDKs
> For a workaround which uses (3), please see this patch, which is not a proper 
> fix but adds KafkaRecordCoder to the SDK such that it can be used to 
> encode/decode records: 
> [https://github.com/mxm/beam/commit/b31cf99c75b3972018180d8ccc7e73d311f4cfed]
>  
> See also 
> [https://github.com/apache/beam/pull/8251]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (BEAM-7850) Make Environment a top level attribute of PTransform

2019-08-01 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898260#comment-16898260
 ] 

Maximilian Michels commented on BEAM-7850:
--

Thanks for expanding on this. After taking a closer look, it makes perfect 
sense to move Environment to the top-level PTransform message because all other 
transforms are derived from this base message.

You mentioned WindowStrategy and SideInput still using SdkFunctionSpec with 
Environment. There is also the Coder message. If I'm not mistaken, we could 
remove the environment from SdkFunctionSpec entirely because all these messages 
should be implicitly bound to an environment from the PTransform. Perhaps 
somebody else could comment on whether this would be feasible? I can't think of 
a situation where this is not the case.

+1 Definitely seems like a sensible change.

> Make Environment a top level attribute of PTransform
> 
>
> Key: BEAM-7850
> URL: https://issues.apache.org/jira/browse/BEAM-7850
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Chamikara Jayalath
>Priority: Major
>
> Currently, Environment is not a top-level attribute of PTransform (in the 
> runner API proto).
> [https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L99]
> Instead it is hidden inside various payload objects. For example, for ParDo, 
> environment will be inside SdkFunctionSpec of ParDoPayload.
> [https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L99]
>  
> This makes tracking the environment of different types of PTransforms harder, 
> and we have to fork code (on the type of PTransform) to extract the Environment 
> where the PTransform should be executed. It will probably be simpler to just 
> make Environment a top-level attribute of PTransform.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9

2019-08-01 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898265#comment-16898265
 ] 

Maximilian Michels commented on BEAM-7730:
--

That sounds great. Thanks for your work [~davidmoravek]. Looking forward to 
reviewing the code. I've reassigned the issue to you.

> Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
> -
>
> Key: BEAM-7730
> URL: https://issues.apache.org/jira/browse/BEAM-7730
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.16.0
>
>
> Apache Flink 1.9 is coming, and it would be good to add a Flink 1.9 build 
> target and make the Flink Runner compatible with Flink 1.9.
> I will add the brief changes after Flink 1.9.0 is released. 
> I would appreciate it if you could leave your suggestions or comments!



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9

2019-08-01 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reassigned BEAM-7730:


Assignee: David Moravek  (was: sunjincheng)

> Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
> -
>
> Key: BEAM-7730
> URL: https://issues.apache.org/jira/browse/BEAM-7730
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: sunjincheng
>Assignee: David Moravek
>Priority: Major
> Fix For: 2.16.0
>
>
> Apache Flink 1.9 is coming, and it would be good to add a Flink 1.9 build 
> target and make the Flink Runner compatible with Flink 1.9.
> I will add the brief changes after Flink 1.9.0 is released. 
> I would appreciate it if you could leave your suggestions or comments!



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9

2019-08-01 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16898277#comment-16898277
 ] 

Maximilian Michels commented on BEAM-7730:
--

[~sunjincheng121] I hope reassigning the issue to David is ok, since it looks 
like his implementation is almost complete. If possible, we could combine any 
code from your side, or you could help review David's PR. Thank you.

> Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
> -
>
> Key: BEAM-7730
> URL: https://issues.apache.org/jira/browse/BEAM-7730
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: sunjincheng
>Assignee: David Moravek
>Priority: Major
> Fix For: 2.16.0
>
>
> Apache Flink 1.9 is coming, and it would be good to add a Flink 1.9 build 
> target and make the Flink Runner compatible with Flink 1.9.
> I will add the brief changes after Flink 1.9.0 is released. 
> I would appreciate it if you could leave your suggestions or comments!



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Resolved] (BEAM-9534) Flink testParDoRequiresStableInput timed out

2020-04-14 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9534.
--
Fix Version/s: Not applicable
   Resolution: Fixed

> Flink testParDoRequiresStableInput timed out
> 
>
> Key: BEAM-9534
> URL: https://issues.apache.org/jira/browse/BEAM-9534
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Kyle Weaver
>Priority: Major
> Fix For: Not applicable
>
>
> Regression
> org.apache.beam.runners.flink.FlinkRequiresStableInputTest.testParDoRequiresStableInput
> Failing for the past 1 build (Since Failed#10395 )
> Took 30 sec.
> Error Message
> org.junit.runners.model.TestTimedOutException: test timed out after 3 
> milliseconds
> Stacktrace
> org.junit.runners.model.TestTimedOutException: test timed out after 3 
> milliseconds
>   at java.io.FileOutputStream.write(Native Method)
>   at java.io.FileOutputStream.write(FileOutputStream.java:290)
>   at java.util.zip.ZipOutputStream.writeInt(ZipOutputStream.java:723)
>   at java.util.zip.ZipOutputStream.writeLOC(ZipOutputStream.java:401)
>   at java.util.zip.ZipOutputStream.putNextEntry(ZipOutputStream.java:238)
>   at 
> org.apache.beam.sdk.util.ZipFiles.zipDirectoryInternal(ZipFiles.java:266)
>   at 
> org.apache.beam.sdk.util.ZipFiles.zipDirectoryInternal(ZipFiles.java:253)
>   at 
> org.apache.beam.sdk.util.ZipFiles.zipDirectoryInternal(ZipFiles.java:253)
>   at 
> org.apache.beam.sdk.util.ZipFiles.zipDirectoryInternal(ZipFiles.java:253)
>   at 
> org.apache.beam.sdk.util.ZipFiles.zipDirectoryInternal(ZipFiles.java:253)
>   at 
> org.apache.beam.sdk.util.ZipFiles.zipDirectoryInternal(ZipFiles.java:253)
>   at 
> org.apache.beam.sdk.util.ZipFiles.zipDirectoryInternal(ZipFiles.java:253)
>   at org.apache.beam.sdk.util.ZipFiles.zipDirectory(ZipFiles.java:223)
>   at 
> org.apache.beam.runners.core.construction.Environments.zipDirectory(Environments.java:234)
>   at 
> org.apache.beam.runners.core.construction.Environments.getArtifacts(Environments.java:219)
>   at 
> org.apache.beam.runners.core.construction.Environments.createOrGetDefaultEnvironment(Environments.java:110)
>   at 
> org.apache.beam.runners.core.construction.SdkComponents.create(SdkComponents.java:92)
>   at 
> org.apache.beam.runners.core.construction.ParDoTranslation.getParDoPayload(ParDoTranslation.java:735)
>   at 
> org.apache.beam.runners.core.construction.ParDoTranslation.getSchemaInformation(ParDoTranslation.java:344)
>   at 
> org.apache.beam.runners.flink.FlinkStreamingTransformTranslators$ParDoStreamingTranslator.translateNode(FlinkStreamingTransformTranslators.java:669)
>   at 
> org.apache.beam.runners.flink.FlinkStreamingPipelineTranslator.applyStreamingTransform(FlinkStreamingPipelineTranslator.java:157)
>   at 
> org.apache.beam.runners.flink.FlinkStreamingPipelineTranslator.visitPrimitiveTransform(FlinkStreamingPipelineTranslator.java:136)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:665)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:317)
>   at 
> org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:251)
>   at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:463)
>   at 
> org.apache.beam.runners.flink.FlinkPipelineTranslator.translate(FlinkPipelineTranslator.java:38)
>   at 
> org.apache.beam.runners.flink.FlinkStreamingPipelineTranslator.translate(FlinkStreamingPipelineTranslator.java:88)
>   at 
> org.apache.beam.runners.flink.FlinkPipelineExecutionEnvironment.translate(FlinkPipelineExecutionEnvironment.java:116)
>   at 
> org.apache.beam.runners.flink.FlinkPipelineExecutionEnvironment.getJobGraph(FlinkPipelineExecutionEnvironment.java:153)
>   at 
> org.apache.beam.runners.flink.FlinkRunner.getJobGraph(FlinkRunner.java:209)
>   at 
> org.apache.beam.runners.flink.FlinkRequiresStableInputTest.getJobGraph(FlinkRequiresStableInputTest.java:176)
>   at 
> org.apache.beam.runners.flink.FlinkRequiresStableInputTest.restoreFromSavepoint(FlinkRequiresStableInputTest.java:201)
>   at 
> org.apache.beam.runners.flink.FlinkRequiresStableInputTest.testParDoRequiresStableInput(FlinkRequiresStableInputTest.java:163)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.inv

[jira] [Commented] (BEAM-9655) Stateful Dataflow runner?

2020-04-14 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083447#comment-17083447
 ] 

Maximilian Michels commented on BEAM-9655:
--

I thought Dataflow supported state and timers. Is the input to your transform 
of the form {{KV}}? In Python, a KV can be represented as a (key, value) tuple.
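To illustrate the expected input shape, a minimal sketch (my example, not taken 
from the report) of a stateful DoFn in Python, where every element must be a 
(key, value) tuple:

{code:python}
import apache_beam as beam
from apache_beam.coders import VarIntCoder
from apache_beam.transforms import userstate

class BufferValues(beam.DoFn):
  BUFFER = userstate.BagStateSpec('buffer', VarIntCoder())

  def process(self, element, buffer=beam.DoFn.StateParam(BUFFER)):
    key, value = element  # stateful DoFns require keyed (key, value) input
    buffer.add(value)
    yield key, list(buffer.read())
{code}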

> Stateful Dataflow runner?
> -
>
> Key: BEAM-9655
> URL: https://issues.apache.org/jira/browse/BEAM-9655
> Project: Beam
>  Issue Type: Wish
>  Components: runner-dataflow
>Affects Versions: 2.19.0
>Reporter: Léopold Boudard
>Priority: Major
>
> Hi,
> I'm trying to use the Python portable DataflowRunner with a 
> [BagStateSpec|https://beam.apache.org/releases/pydoc/2.6.0/apache_beam.transforms.userstate.html].
>  However, I encounter the following issue:
> {code:java}
> Traceback (most recent call last):
>   File "/Users/leopold/.pyenv/versions/3.6.0/lib/python3.6/runpy.py", line 
> 193, in _run_module_as_main
> "__main__", mod_spec)
>   File "/Users/leopold/.pyenv/versions/3.6.0/lib/python3.6/runpy.py", line 
> 85, in _run_code
> exec(code, run_globals)
>   File 
> "/Users/leopold/workspace/BenchmarkListingStreaming/listing_beam_pipeline/test_runner.py",
>  line 49, in 
> run()
>   File 
> "/Users/leopold/workspace/BenchmarkListingStreaming/listing_beam_pipeline/test_runner.py",
>  line 44, in run
> | 'write to file' >> WriteToText(known_args.output)
>   File 
> "/Users/leopold/.pyenv/versions/BenchmarkListingStreaming/lib/python3.6/site-packages/apache_beam/pipeline.py",
>  line 481, in __exit__
> self.run().wait_until_finish()
>   File 
> "/Users/leopold/.pyenv/versions/BenchmarkListingStreaming/lib/python3.6/site-packages/apache_beam/runners/dataflow/dataflow_runner.py",
>  line 1449, in wait_until_finish
> (self.state, getattr(self._runner, 'last_error_msg', None)), self)
> apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: 
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.6/site-packages/dataflow_worker/batchworker.py", line 
> 648, in do_work
> work_executor.execute()
>   File "/usr/local/lib/python3.6/site-packages/dataflow_worker/executor.py", 
> line 176, in execute
> op.start()
>   File "apache_beam/runners/worker/operations.py", line 649, in 
> apache_beam.runners.worker.operations.DoOperation.start
>   File "apache_beam/runners/worker/operations.py", line 651, in 
> apache_beam.runners.worker.operations.DoOperation.start
>   File "apache_beam/runners/worker/operations.py", line 652, in 
> apache_beam.runners.worker.operations.DoOperation.start
>   File "apache_beam/runners/worker/operations.py", line 261, in 
> apache_beam.runners.worker.operations.Operation.start
>   File "apache_beam/runners/worker/operations.py", line 266, in 
> apache_beam.runners.worker.operations.Operation.start
>   File "apache_beam/runners/worker/operations.py", line 597, in 
> apache_beam.runners.worker.operations.DoOperation.setup
>   File "apache_beam/runners/worker/operations.py", line 636, in 
> apache_beam.runners.worker.operations.DoOperation.setup
>   File "apache_beam/runners/common.py", line 866, in 
> apache_beam.runners.common.DoFnRunner.__init__
> Exception: Requested execution of a stateful DoFn, but no user state context 
> is available. This likely means that the current runner does not support the 
> execution of stateful DoFns.
> {code}
> I've also seen this issue on Stack Overflow:
> [https://stackoverflow.com/questions/55413690/does-google-dataflow-support-stateful-pipelines-developed-with-python-sdk]
>  
> Do you have any idea/ETA when this feature will be available in Beam?
>  
> Thanks!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9794) Flink pipeline with RequiresStableInput fails after Short.MAX_VALUE checkpoints.

2020-04-21 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088401#comment-17088401
 ] 

Maximilian Michels commented on BEAM-9794:
--

Good catch! I think we have to resort to only using a single state cell for 
buffering on checkpoints, instead of using a new one for every checkpoint. I 
was under the assumption that, if the state cell was cleared, it would not be 
checkpointed but that does not seem to be the case.

Actually, this should be easy to fix by using Flink's namespacing instead of 
creating a new state cell. We currently only use the VoidNamespace.

> Flink pipeline with RequiresStableInput fails after Short.MAX_VALUE 
> checkpoints.
> 
>
> Key: BEAM-9794
> URL: https://issues.apache.org/jira/browse/BEAM-9794
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 2.19.0, 2.20.0
>Reporter: David Morávek
>Assignee: David Morávek
>Priority: Major
>
> Full original report: 
> https://lists.apache.org/thread.html/rb2ebfad16d85bcf668978b3defd442feda0903c20db29c323497a672%40%3Cuser.beam.apache.org%3E
> The exception comes from: 
> https://github.com/apache/flink/blob/release-1.8.0/flink-runtime/src/main/java/org/apache/flink/runtime/state/OperatorBackendSerializationProxy.java#L68
> In the Flink Runner code, each checkpoint results in a new OperatorState (or 
> KeyedState if the stream is keyed):
> https://github.com/apache/beam/blob/v2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L91-L103
> https://github.com/apache/beam/blob/v2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L136-L143



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9794) Flink pipeline with RequiresStableInput fails after Short.MAX_VALUE checkpoints.

2020-04-21 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088752#comment-17088752
 ] 

Maximilian Michels commented on BEAM-9794:
--

Taking this one after I synced with [~dmvk].

> Flink pipeline with RequiresStableInput fails after Short.MAX_VALUE 
> checkpoints.
> 
>
> Key: BEAM-9794
> URL: https://issues.apache.org/jira/browse/BEAM-9794
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 2.19.0, 2.20.0
>Reporter: David Morávek
>Assignee: David Morávek
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Full original report: 
> https://lists.apache.org/thread.html/rb2ebfad16d85bcf668978b3defd442feda0903c20db29c323497a672%40%3Cuser.beam.apache.org%3E
> The exception comes from: 
> https://github.com/apache/flink/blob/release-1.8.0/flink-runtime/src/main/java/org/apache/flink/runtime/state/OperatorBackendSerializationProxy.java#L68
> In the Flink Runner code, each checkpoint results in a new OperatorState (or 
> KeyedState if the stream is keyed):
> https://github.com/apache/beam/blob/v2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L91-L103
> https://github.com/apache/beam/blob/v2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L136-L143



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-9794) Flink pipeline with RequiresStableInput fails after Short.MAX_VALUE checkpoints.

2020-04-21 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reassigned BEAM-9794:


Assignee: Maximilian Michels  (was: David Morávek)

> Flink pipeline with RequiresStableInput fails after Short.MAX_VALUE 
> checkpoints.
> 
>
> Key: BEAM-9794
> URL: https://issues.apache.org/jira/browse/BEAM-9794
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 2.19.0, 2.20.0
>Reporter: David Morávek
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Full original report: 
> https://lists.apache.org/thread.html/rb2ebfad16d85bcf668978b3defd442feda0903c20db29c323497a672%40%3Cuser.beam.apache.org%3E
> The exception comes from: 
> https://github.com/apache/flink/blob/release-1.8.0/flink-runtime/src/main/java/org/apache/flink/runtime/state/OperatorBackendSerializationProxy.java#L68
> In the Flink Runner code, each checkpoint results in a new OperatorState (or 
> KeyedState if the stream is keyed):
> https://github.com/apache/beam/blob/v2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L91-L103
> https://github.com/apache/beam/blob/v2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L136-L143



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9794) Flink pipeline with RequiresStableInput fails after Short.MAX_VALUE checkpoints.

2020-04-21 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088812#comment-17088812
 ] 

Maximilian Michels commented on BEAM-9794:
--

That's also what I found when I started working on this. Instead, I went for a 
fixed number of buffers which are rotated: 
https://github.com/apache/beam/pull/11478
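The rotation idea, sketched in Python for illustration only (the real change 
lives in the Java Flink runner's BufferingDoFnRunner, and the pool size here is 
an arbitrary assumption):

{code:python}
# A fixed pool of buffer ids is reused across checkpoints, so the number of
# distinct state names stays bounded instead of growing past Short.MAX_VALUE.
MAX_BUFFERS = 100  # arbitrary illustrative bound

def buffer_id_for_checkpoint(checkpoint_id):
    return checkpoint_id % MAX_BUFFERS

assert buffer_id_for_checkpoint(32768) == buffer_id_for_checkpoint(32768 + MAX_BUFFERS)
{code}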

> Flink pipeline with RequiresStableInput fails after Short.MAX_VALUE 
> checkpoints.
> 
>
> Key: BEAM-9794
> URL: https://issues.apache.org/jira/browse/BEAM-9794
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 2.19.0, 2.20.0
>Reporter: David Morávek
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Full original report: 
> https://lists.apache.org/thread.html/rb2ebfad16d85bcf668978b3defd442feda0903c20db29c323497a672%40%3Cuser.beam.apache.org%3E
> The exception comes from: 
> https://github.com/apache/flink/blob/release-1.8.0/flink-runtime/src/main/java/org/apache/flink/runtime/state/OperatorBackendSerializationProxy.java#L68
> In the Flink Runner code, each checkpoint results in a new OperatorState (or 
> KeyedState if the stream is keyed):
> https://github.com/apache/beam/blob/v2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L91-L103
> https://github.com/apache/beam/blob/v2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L136-L143



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9801) Setting a timer from a timer callback fails

2020-04-22 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9801:


 Summary: Setting a timer from a timer callback fails
 Key: BEAM-9801
 URL: https://issues.apache.org/jira/browse/BEAM-9801
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-harness
Reporter: Maximilian Michels


Hi,

I'm trying to set a timer from a timer callback in the Python SDK:

{code:Python}
class GenerateLoad(beam.DoFn):
  timer_spec = userstate.TimerSpec('timer', userstate.TimeDomain.WATERMARK)

  def process(self, element, timer=beam.DoFn.TimerParam(timer_spec)):
self.key = element[0]
timer.set(0)

  @userstate.on_timer(timer_spec)
  def process_timer(self, timer=beam.DoFn.TimerParam(timer_spec)):
timer.set(0)
{code}

This yields the following Python stack trace:

{noformat}
INFO:apache_beam.utils.subprocess_server:Caused by: java.lang.RuntimeException: 
Error received from SDK harness for instruction 4: Traceback (most recent call 
last):
INFO:apache_beam.utils.subprocess_server: File 
"apache_beam/runners/worker/sdk_worker.py", line 245, in _execute
INFO:apache_beam.utils.subprocess_server: response = task()
INFO:apache_beam.utils.subprocess_server: File 
"apache_beam/runners/worker/sdk_worker.py", line 302, in 
INFO:apache_beam.utils.subprocess_server: lambda: 
self.create_worker().do_instruction(request), request)
INFO:apache_beam.utils.subprocess_server: File 
"apache_beam/runners/worker/sdk_worker.py", line 471, in do_instruction
INFO:apache_beam.utils.subprocess_server: getattr(request, request_type), 
request.instruction_id)
INFO:apache_beam.utils.subprocess_server: File 
"apache_beam/runners/worker/sdk_worker.py", line 506, in process_bundle
INFO:apache_beam.utils.subprocess_server: 
bundle_processor.process_bundle(instruction_id))
INFO:apache_beam.utils.subprocess_server: File 
"apache_beam/runners/worker/bundle_processor.py", line 910, in process_bundle
INFO:apache_beam.utils.subprocess_server: element.timer_family_id, timer_data)
INFO:apache_beam.utils.subprocess_server: File 
"apache_beam/runners/worker/operations.py", line 688, in process_timer
INFO:apache_beam.utils.subprocess_server: timer_data.fire_timestamp)
INFO:apache_beam.utils.subprocess_server: File "apache_beam/runners/common.py", 
line 990, in process_user_timer
INFO:apache_beam.utils.subprocess_server: self._reraise_augmented(exn)
INFO:apache_beam.utils.subprocess_server: File "apache_beam/runners/common.py", 
line 1043, in _reraise_augmented
INFO:apache_beam.utils.subprocess_server: raise_with_traceback(new_exn)
INFO:apache_beam.utils.subprocess_server: File "apache_beam/runners/common.py", 
line 988, in process_user_timer
INFO:apache_beam.utils.subprocess_server: 
self.do_fn_invoker.invoke_user_timer(timer_spec, key, window, timestamp)
INFO:apache_beam.utils.subprocess_server: File "apache_beam/runners/common.py", 
line 517, in invoke_user_timer
INFO:apache_beam.utils.subprocess_server: self.user_state_context, key, window, 
timestamp))
INFO:apache_beam.utils.subprocess_server: File "apache_beam/runners/common.py", 
line 1093, in process_outputs
INFO:apache_beam.utils.subprocess_server: for result in results:
INFO:apache_beam.utils.subprocess_server: File 
"/Users/max/Dev/beam/sdks/python/apache_beam/testing/load_tests/pardo_test.py", 
line 185, in process_timer
INFO:apache_beam.utils.subprocess_server: timer.set(0)
INFO:apache_beam.utils.subprocess_server: File 
"apache_beam/runners/worker/bundle_processor.py", line 589, in set
INFO:apache_beam.utils.subprocess_server: 
self._timer_coder_impl.encode_to_stream(timer, self._output_stream, True)
INFO:apache_beam.utils.subprocess_server: File 
"apache_beam/coders/coder_impl.py", line 651, in encode_to_stream
INFO:apache_beam.utils.subprocess_server: value.hold_timestamp, out, True)
INFO:apache_beam.utils.subprocess_server: File 
"apache_beam/coders/coder_impl.py", line 608, in encode_to_stream
INFO:apache_beam.utils.subprocess_server: millis = value.micros // 1000
INFO:apache_beam.utils.subprocess_server:AttributeError: 'NoneType' object has 
no attribute 'micros' [while running 'GenerateLoad']
{noformat}

Looking at the code base, I'm not sure we have tests for timer output 
timestamps. Am I missing something?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9794) Flink pipeline with RequiresStableInput fails after Short.MAX_VALUE checkpoints.

2020-04-23 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9794:
-
Fix Version/s: 2.21.0

> Flink pipeline with RequiresStableInput fails after Short.MAX_VALUE 
> checkpoints.
> 
>
> Key: BEAM-9794
> URL: https://issues.apache.org/jira/browse/BEAM-9794
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 2.19.0, 2.20.0
>Reporter: David Morávek
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Full original report: 
> https://lists.apache.org/thread.html/rb2ebfad16d85bcf668978b3defd442feda0903c20db29c323497a672%40%3Cuser.beam.apache.org%3E
> The exception comes from: 
> https://github.com/apache/flink/blob/release-1.8.0/flink-runtime/src/main/java/org/apache/flink/runtime/state/OperatorBackendSerializationProxy.java#L68
> In the Flink Runner code, each checkpoint results in a new OperatorState (or 
> KeyedState if the stream is keyed):
> https://github.com/apache/beam/blob/v2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L91-L103
> https://github.com/apache/beam/blob/v2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L136-L143



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9801) Setting a timer from a timer callback fails

2020-04-24 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091373#comment-17091373
 ] 

Maximilian Michels commented on BEAM-9801:
--

Yes, I agree that this earns the "critical" badge.

> Setting a timer from a timer callback fails
> ---
>
> Key: BEAM-9801
> URL: https://issues.apache.org/jira/browse/BEAM-9801
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Maximilian Michels
>Priority: Critical
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Hi,
> I'm trying to set a timer from a timer callback in the Python SDK:
> {code:Python}
> class GenerateLoad(beam.DoFn):
>   timer_spec = userstate.TimerSpec('timer', userstate.TimeDomain.WATERMARK)
>   def process(self, element, timer=beam.DoFn.TimerParam(timer_spec)):
> self.key = element[0]
> timer.set(0)
>   @userstate.on_timer(timer_spec)
>   def process_timer(self, timer=beam.DoFn.TimerParam(timer_spec)):
> timer.set(0)
> {code}
> This yields the following Python stack trace:
> {noformat}
> INFO:apache_beam.utils.subprocess_server:Caused by: 
> java.lang.RuntimeException: Error received from SDK harness for instruction 
> 4: Traceback (most recent call last):
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 245, in _execute
> INFO:apache_beam.utils.subprocess_server: response = task()
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 302, in 
> INFO:apache_beam.utils.subprocess_server: lambda: 
> self.create_worker().do_instruction(request), request)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 471, in do_instruction
> INFO:apache_beam.utils.subprocess_server: getattr(request, request_type), 
> request.instruction_id)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 506, in process_bundle
> INFO:apache_beam.utils.subprocess_server: 
> bundle_processor.process_bundle(instruction_id))
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/bundle_processor.py", line 910, in process_bundle
> INFO:apache_beam.utils.subprocess_server: element.timer_family_id, timer_data)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/operations.py", line 688, in process_timer
> INFO:apache_beam.utils.subprocess_server: timer_data.fire_timestamp)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 990, in process_user_timer
> INFO:apache_beam.utils.subprocess_server: self._reraise_augmented(exn)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 1043, in _reraise_augmented
> INFO:apache_beam.utils.subprocess_server: raise_with_traceback(new_exn)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 988, in process_user_timer
> INFO:apache_beam.utils.subprocess_server: 
> self.do_fn_invoker.invoke_user_timer(timer_spec, key, window, timestamp)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 517, in invoke_user_timer
> INFO:apache_beam.utils.subprocess_server: self.user_state_context, key, 
> window, timestamp))
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 1093, in process_outputs
> INFO:apache_beam.utils.subprocess_server: for result in results:
> INFO:apache_beam.utils.subprocess_server: File 
> "/Users/max/Dev/beam/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
>  line 185, in process_timer
> INFO:apache_beam.utils.subprocess_server: timer.set(0)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/bundle_processor.py", line 589, in set
> INFO:apache_beam.utils.subprocess_server: 
> self._timer_coder_impl.encode_to_stream(timer, self._output_stream, True)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/coders/coder_impl.py", line 651, in encode_to_stream
> INFO:apache_beam.utils.subprocess_server: value.hold_timestamp, out, True)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/coders/coder_impl.py", line 608, in encode_to_stream
> INFO:apache_beam.utils.subprocess_server: millis = value.micros // 1000
> INFO:apache_beam.utils.subprocess_server:AttributeError: 'NoneType' object 
> has no attribute 'micros' [while running 'GenerateLoad']
> {noformat}
> Looking at the code base, I'm not sure we have tests for timer output 
> timestamps. Am I missing something?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9794) Flink pipeline with RequiresStableInput fails after Short.MAX_VALUE checkpoints.

2020-04-24 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9794.
--
Resolution: Fixed

> Flink pipeline with RequiresStableInput fails after Short.MAX_VALUE 
> checkpoints.
> 
>
> Key: BEAM-9794
> URL: https://issues.apache.org/jira/browse/BEAM-9794
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 2.19.0, 2.20.0
>Reporter: David Morávek
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Full original report: 
> https://lists.apache.org/thread.html/rb2ebfad16d85bcf668978b3defd442feda0903c20db29c323497a672%40%3Cuser.beam.apache.org%3E
> The exception comes from: 
> https://github.com/apache/flink/blob/release-1.8.0/flink-runtime/src/main/java/org/apache/flink/runtime/state/OperatorBackendSerializationProxy.java#L68
> In the Flink Runner code, each checkpoint results in a new OperatorState (or 
> KeyedState if the stream is keyed):
> https://github.com/apache/beam/blob/v2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L91-L103
> https://github.com/apache/beam/blob/v2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/stableinput/BufferingDoFnRunner.java#L136-L143



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9801) Setting a timer from a timer callback fails

2020-04-24 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9801:
-
Fix Version/s: 2.21.0

> Setting a timer from a timer callback fails
> ---
>
> Key: BEAM-9801
> URL: https://issues.apache.org/jira/browse/BEAM-9801
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Hi,
> I'm trying to set a timer from a timer callback in the Python SDK:
> {code:Python}
> class GenerateLoad(beam.DoFn):
>   timer_spec = userstate.TimerSpec('timer', userstate.TimeDomain.WATERMARK)
>   def process(self, element, timer=beam.DoFn.TimerParam(timer_spec)):
> self.key = element[0]
> timer.set(0)
>   @userstate.on_timer(timer_spec)
>   def process_timer(self, timer=beam.DoFn.TimerParam(timer_spec)):
> timer.set(0)
> {code}
> This yields the following Python stack trace:
> {noformat}
> INFO:apache_beam.utils.subprocess_server:Caused by: 
> java.lang.RuntimeException: Error received from SDK harness for instruction 
> 4: Traceback (most recent call last):
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 245, in _execute
> INFO:apache_beam.utils.subprocess_server: response = task()
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 302, in 
> INFO:apache_beam.utils.subprocess_server: lambda: 
> self.create_worker().do_instruction(request), request)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 471, in do_instruction
> INFO:apache_beam.utils.subprocess_server: getattr(request, request_type), 
> request.instruction_id)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 506, in process_bundle
> INFO:apache_beam.utils.subprocess_server: 
> bundle_processor.process_bundle(instruction_id))
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/bundle_processor.py", line 910, in process_bundle
> INFO:apache_beam.utils.subprocess_server: element.timer_family_id, timer_data)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/operations.py", line 688, in process_timer
> INFO:apache_beam.utils.subprocess_server: timer_data.fire_timestamp)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 990, in process_user_timer
> INFO:apache_beam.utils.subprocess_server: self._reraise_augmented(exn)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 1043, in _reraise_augmented
> INFO:apache_beam.utils.subprocess_server: raise_with_traceback(new_exn)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 988, in process_user_timer
> INFO:apache_beam.utils.subprocess_server: 
> self.do_fn_invoker.invoke_user_timer(timer_spec, key, window, timestamp)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 517, in invoke_user_timer
> INFO:apache_beam.utils.subprocess_server: self.user_state_context, key, 
> window, timestamp))
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 1093, in process_outputs
> INFO:apache_beam.utils.subprocess_server: for result in results:
> INFO:apache_beam.utils.subprocess_server: File 
> "/Users/max/Dev/beam/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
>  line 185, in process_timer
> INFO:apache_beam.utils.subprocess_server: timer.set(0)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/bundle_processor.py", line 589, in set
> INFO:apache_beam.utils.subprocess_server: 
> self._timer_coder_impl.encode_to_stream(timer, self._output_stream, True)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/coders/coder_impl.py", line 651, in encode_to_stream
> INFO:apache_beam.utils.subprocess_server: value.hold_timestamp, out, True)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/coders/coder_impl.py", line 608, in encode_to_stream
> INFO:apache_beam.utils.subprocess_server: millis = value.micros // 1000
> INFO:apache_beam.utils.subprocess_server:AttributeError: 'NoneType' object 
> has no attribute 'micros' [while running 'GenerateLoad']
> {noformat}
> Looking at the code base, I'm not sure we have tests for timer output 
> timestamps. Am I missing something?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9801) Setting a timer from a timer callback fails

2020-04-24 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091513#comment-17091513
 ] 

Maximilian Michels commented on BEAM-9801:
--

I'm also putting up for debate whether we should hold the 2.21.0 release until this 
is fixed. A mitigation is in the PR.

> Setting a timer from a timer callback fails
> ---
>
> Key: BEAM-9801
> URL: https://issues.apache.org/jira/browse/BEAM-9801
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Maximilian Michels
>Priority: Critical
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Hi,
> I'm trying to set a timer from a timer callback in the Python SDK:
> {code:Python}
> class GenerateLoad(beam.DoFn):
>   timer_spec = userstate.TimerSpec('timer', userstate.TimeDomain.WATERMARK)
>   def process(self, element, timer=beam.DoFn.TimerParam(timer_spec)):
> self.key = element[0]
> timer.set(0)
>   @userstate.on_timer(timer_spec)
>   def process_timer(self, timer=beam.DoFn.TimerParam(timer_spec)):
> timer.set(0)
> {code}
> This yields the following Python stack trace:
> {noformat}
> INFO:apache_beam.utils.subprocess_server:Caused by: 
> java.lang.RuntimeException: Error received from SDK harness for instruction 
> 4: Traceback (most recent call last):
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 245, in _execute
> INFO:apache_beam.utils.subprocess_server: response = task()
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 302, in 
> INFO:apache_beam.utils.subprocess_server: lambda: 
> self.create_worker().do_instruction(request), request)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 471, in do_instruction
> INFO:apache_beam.utils.subprocess_server: getattr(request, request_type), 
> request.instruction_id)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 506, in process_bundle
> INFO:apache_beam.utils.subprocess_server: 
> bundle_processor.process_bundle(instruction_id))
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/bundle_processor.py", line 910, in process_bundle
> INFO:apache_beam.utils.subprocess_server: element.timer_family_id, timer_data)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/operations.py", line 688, in process_timer
> INFO:apache_beam.utils.subprocess_server: timer_data.fire_timestamp)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 990, in process_user_timer
> INFO:apache_beam.utils.subprocess_server: self._reraise_augmented(exn)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 1043, in _reraise_augmented
> INFO:apache_beam.utils.subprocess_server: raise_with_traceback(new_exn)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 988, in process_user_timer
> INFO:apache_beam.utils.subprocess_server: 
> self.do_fn_invoker.invoke_user_timer(timer_spec, key, window, timestamp)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 517, in invoke_user_timer
> INFO:apache_beam.utils.subprocess_server: self.user_state_context, key, 
> window, timestamp))
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 1093, in process_outputs
> INFO:apache_beam.utils.subprocess_server: for result in results:
> INFO:apache_beam.utils.subprocess_server: File 
> "/Users/max/Dev/beam/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
>  line 185, in process_timer
> INFO:apache_beam.utils.subprocess_server: timer.set(0)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/bundle_processor.py", line 589, in set
> INFO:apache_beam.utils.subprocess_server: 
> self._timer_coder_impl.encode_to_stream(timer, self._output_stream, True)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/coders/coder_impl.py", line 651, in encode_to_stream
> INFO:apache_beam.utils.subprocess_server: value.hold_timestamp, out, True)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/coders/coder_impl.py", line 608, in encode_to_stream
> INFO:apache_beam.utils.subprocess_server: millis = value.micros // 1000
> INFO:apache_beam.utils.subprocess_server:AttributeError: 'NoneType' object 
> has no attribute 'micros' [while running 'GenerateLoad']
> {noformat}
> Looking at the code base, I'm not sure we have tests for timer output 
> timestamps. Am I missing something?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-5440) Add option to mount a directory inside SDK harness containers

2020-04-24 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091848#comment-17091848
 ] 

Maximilian Michels commented on BEAM-5440:
--

I think we got sidetracked, but this is still useful, at least for 
development purposes.

> Add option to mount a directory inside SDK harness containers
> -
>
> Key: BEAM-5440
> URL: https://issues.apache.org/jira/browse/BEAM-5440
> Project: Beam
>  Issue Type: New Feature
>  Components: java-fn-execution, sdk-java-core
>Reporter: Maximilian Michels
>Priority: Major
>  Labels: portability, portability-flink
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> While experimenting with the Python SDK locally, I found it inconvenient that 
> I can't mount a host directory into the Docker containers, i.e. the input must 
> already be in the container and the results of a Write remain inside the 
> container. For local testing, users may want to mount a host directory.
> Since BEAM-5288 the {{Environment}} carries explicit environment information, 
> we could either (a) add volume args to the {{DockerPayload}}, or (b) provide a 
> general Docker arguments field.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9827) Test SplittableDoFnTest#testPairWithIndexBasicBounded is flaky

2020-04-27 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9827:


 Summary: Test SplittableDoFnTest#testPairWithIndexBasicBounded is 
flaky
 Key: BEAM-9827
 URL: https://issues.apache.org/jira/browse/BEAM-9827
 Project: Beam
  Issue Type: Test
  Components: runner-flink
Reporter: Maximilian Michels
Assignee: Maximilian Michels


Both {{testPairWithIndexBasicUnbounded}} and {{testPairWithIndexBasicBounded}} 
from {{SplittableDoFnTest}} are flaky every other run. We need to investigate 
the cause for this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9827) Test SplittableDoFnTest#testPairWithIndexBasicBounded is flaky

2020-04-27 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9827:
-
Status: Open  (was: Triage Needed)

> Test SplittableDoFnTest#testPairWithIndexBasicBounded is flaky
> --
>
> Key: BEAM-9827
> URL: https://issues.apache.org/jira/browse/BEAM-9827
> Project: Beam
>  Issue Type: Test
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Both {{testPairWithIndexBasicUnbounded}} and 
> {{testPairWithIndexBasicBounded}} from {{SplittableDoFnTest}} are flaky every 
> other run. We need to investigate the cause for this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9827) Test SplittableDoFnTest#testPairWithIndexBasicBounded is flaky

2020-04-27 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9827:
-
Fix Version/s: 2.21.0

> Test SplittableDoFnTest#testPairWithIndexBasicBounded is flaky
> --
>
> Key: BEAM-9827
> URL: https://issues.apache.org/jira/browse/BEAM-9827
> Project: Beam
>  Issue Type: Test
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Both {{testPairWithIndexBasicUnbounded}} and 
> {{testPairWithIndexBasicBounded}} from {{SplittableDoFnTest}} are flaky every 
> other run. We need to investigate the cause for this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9827) Test SplittableDoFnTest#testPairWithIndexBasicBounded is flaky

2020-04-27 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9827:
-
Priority: Critical  (was: Major)

> Test SplittableDoFnTest#testPairWithIndexBasicBounded is flaky
> --
>
> Key: BEAM-9827
> URL: https://issues.apache.org/jira/browse/BEAM-9827
> Project: Beam
>  Issue Type: Test
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Both {{testPairWithIndexBasicUnbounded}} and 
> {{testPairWithIndexBasicBounded}} from {{SplittableDoFnTest}} are flaky every 
> other run. We need to investigate the cause for this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2020-04-27 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-3956.
--
Fix Version/s: Not applicable
   Resolution: Fixed

To the best of my knowledge, this was resolved a while ago.
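
For reference, the general Python pattern that addresses this is to re-raise an 
augmented exception with the original traceback attached. This is a simplified 
Python 3 sketch, not Beam's exact implementation:

{code:Python}
# Augment the error message while keeping the original traceback, so the user
# still sees the frames of their own code.
def run_stage(fn, element, stage_name):
  try:
    return fn(element)
  except Exception as exn:
    new_exn = type(exn)('%s [while running %s]' % (exn, stage_name))
    raise new_exn.with_traceback(exn.__traceback__)


def buggy(x):
  return 1 // x


run_stage(buggy, 0, 'StageA')  # ZeroDivisionError, full traceback preserved
{code}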

> Stacktraces from exceptions in user code should be preserved in the Python SDK
> --
>
> Key: BEAM-3956
> URL: https://issues.apache.org/jira/browse/BEAM-3956
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Stephan Hoyer
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> Currently, Beam's Python SDK loses stacktraces for exceptions. It does 
> helpfully add a tag like "[while running StageA]" to exception error 
> messages, but that doesn't include the stacktrace of Python functions being 
> called.
> Including the full stacktraces would make a big difference for the ease of 
> debugging Beam pipelines when things go wrong.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9827) Test SplittableDoFnTest#testPairWithIndexBasicBounded is flaky

2020-04-27 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9827.
--
Resolution: Fixed

> Test SplittableDoFnTest#testPairWithIndexBasicBounded is flaky
> --
>
> Key: BEAM-9827
> URL: https://issues.apache.org/jira/browse/BEAM-9827
> Project: Beam
>  Issue Type: Test
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Both {{testPairWithIndexBasicUnbounded}} and 
> {{testPairWithIndexBasicBounded}} from {{SplittableDoFnTest}} are flaky every 
> other run. We need to investigate the cause for this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-2535) Allow explicit output time independent of firing specification for all timers

2020-04-28 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reassigned BEAM-2535:


Assignee: Reuven Lax  (was: Rehman Murad Ali)

> Allow explicit output time independent of firing specification for all timers
> -
>
> Key: BEAM-2535
> URL: https://issues.apache.org/jira/browse/BEAM-2535
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 25h 10m
>  Remaining Estimate: 0h
>
> Today, we have insufficient control over the event time timestamp of elements 
> output from a timer callback.
> 1. For an event time timer, it is the timestamp of the timer itself.
>  2. For a processing time timer, it is the current input watermark at the 
> time of processing.
> But for both of these, we may want to reserve the right to output a 
> particular time, aka set a "watermark hold".
> A naive implementation of a {{TimerWithWatermarkHold}} would work for making 
> sure output is not droppable, but does not fully explain window expiration 
> and late data/timer dropping.
> In the natural interpretation of a timer as a feedback loop on a transform, 
> timers should be viewed as another channel of input, with a watermark, and 
> items on that channel _all need event time timestamps even if they are 
> delivered according to a different time domain_.
> I propose that the specification for when a timer should fire should be 
> separated (with nice defaults) from the specification of the event time of 
> resulting outputs. These timestamps will determine a side channel with a new 
> "timer watermark" that constrains the output watermark.
>  - We still need to fire event time timers according to the input watermark, 
> so that event time timers fire.
>  - Late data dropping and window expiration will be in terms of the minimum 
> of the input watermark and the timer watermark. In this way, whenever a timer 
> is set, the window is not going to be garbage collected.
>  - We will need to make sure we have a way to "wake up" a window once it is 
> expired; this may be as simple as exhausting the timer channel as soon as the 
> input watermark indicates expiration of a window
> This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It 
> seems reasonable to use timers as an implementation detail (e.g. in 
> runners-core utilities) without wanting any of this additional machinery. For 
> example, if there is no possibility of output from the timer callback.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9841) PortableRunner does not support wait_until_finish(duration=...)

2020-04-28 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9841:
-
Status: Open  (was: Triage Needed)

> PortableRunner does not support wait_until_finish(duration=...)
> ---
>
> Key: BEAM-9841
> URL: https://issues.apache.org/jira/browse/BEAM-9841
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>
> Other runners in the Python SDK support waiting for a finite amount of time 
> on the PipelineResult, and so should PortableRunner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9841) PortableRunner does not support wait_until_finish(duration=...)

2020-04-28 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9841:


 Summary: PortableRunner does not support 
wait_until_finish(duration=...)
 Key: BEAM-9841
 URL: https://issues.apache.org/jira/browse/BEAM-9841
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Maximilian Michels
Assignee: Maximilian Michels


Other runners in the Python SDK support waiting for a finite amount of time on 
the PipelineResult, and so should PortableRunner.
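
For reference, a minimal sketch of the intended usage. The job endpoint below is 
only a placeholder for a locally running job service, and {{duration}} is given 
in milliseconds, as with the other runners:

{code:Python}
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    '--runner=PortableRunner',
    '--job_endpoint=localhost:8099',
    '--environment_type=LOOPBACK',
])

pipeline = beam.Pipeline(options=options)
_ = pipeline | beam.Create([1, 2, 3]) | beam.Map(print)

result = pipeline.run()
# Block for at most 60 seconds, then return the current pipeline state instead
# of waiting forever. This is the call the PortableRunner does not yet honor.
state = result.wait_until_finish(duration=60 * 1000)
print(state)
{code}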



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-2535) Allow explicit output time independent of firing specification for all timers

2020-04-28 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094581#comment-17094581
 ] 

Maximilian Michels commented on BEAM-2535:
--

I think this still needs API support on the Python side.
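
To make the gap concrete, here is a sketch of what such support might look like. 
The {{output_timestamp}} keyword is hypothetical and only mirrors Java's 
{{withOutputTimestamp}}:

{code:Python}
import apache_beam as beam
from apache_beam.transforms import userstate


class DelayedEmit(beam.DoFn):
  timer_spec = userstate.TimerSpec('flush', userstate.TimeDomain.WATERMARK)

  def process(self, element,
              timestamp=beam.DoFn.TimestampParam,
              timer=beam.DoFn.TimerParam(timer_spec)):
    # Hypothetical keyword argument: fire 60s past the element's timestamp,
    # but hold the output watermark at the element's own timestamp so output
    # emitted from the callback cannot become droppable.
    timer.set(timestamp + 60, output_timestamp=timestamp)

  @userstate.on_timer(timer_spec)
  def flush(self, timestamp=beam.DoFn.TimestampParam):
    # The emitted value would carry the held timestamp, not the firing time.
    yield timestamp
{code}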

> Allow explicit output time independent of firing specification for all timers
> -
>
> Key: BEAM-2535
> URL: https://issues.apache.org/jira/browse/BEAM-2535
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 25h 10m
>  Remaining Estimate: 0h
>
> Today, we have insufficient control over the event time timestamp of elements 
> output from a timer callback.
> 1. For an event time timer, it is the timestamp of the timer itself.
>  2. For a processing time timer, it is the current input watermark at the 
> time of processing.
> But for both of these, we may want to reserve the right to output a 
> particular time, aka set a "watermark hold".
> A naive implementation of a {{TimerWithWatermarkHold}} would work for making 
> sure output is not droppable, but does not fully explain window expiration 
> and late data/timer dropping.
> In the natural interpretation of a timer as a feedback loop on a transform, 
> timers should be viewed as another channel of input, with a watermark, and 
> items on that channel _all need event time timestamps even if they are 
> delivered according to a different time domain_.
> I propose that the specification for when a timer should fire should be 
> separated (with nice defaults) from the specification of the event time of 
> resulting outputs. These timestamps will determine a side channel with a new 
> "timer watermark" that constrains the output watermark.
>  - We still need to fire event time timers according to the input watermark, 
> so that event time timers fire.
>  - Late data dropping and window expiration will be in terms of the minimum 
> of the input watermark and the timer watermark. In this way, whenever a timer 
> is set, the window is not going to be garbage collected.
>  - We will need to make sure we have a way to "wake up" a window once it is 
> expired; this may be as simple as exhausting the timer channel as soon as the 
> input watermark indicates expiration of a window
> This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It 
> seems reasonable to use timers as an implementation detail (e.g. in 
> runners-core utilities) without wanting any of this additional machinery. For 
> example, if there is no possibility of output from the timer callback.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9801) Setting a timer from a timer callback fails

2020-04-28 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094864#comment-17094864
 ] 

Maximilian Michels commented on BEAM-9801:
--

I'm not sure whether this ever worked and later broke when new features were 
introduced. It appears to be untested, so it likely never worked. It was 
relatively easy to fix, though; see the PR.

> Setting a timer from a timer callback fails
> ---
>
> Key: BEAM-9801
> URL: https://issues.apache.org/jira/browse/BEAM-9801
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Hi,
> I'm trying to set a timer from a timer callback in the Python SDK:
> {code:Python}
> class GenerateLoad(beam.DoFn):
>   timer_spec = userstate.TimerSpec('timer', userstate.TimeDomain.WATERMARK)
>   def process(self, element, timer=beam.DoFn.TimerParam(timer_spec)):
> self.key = element[0]
> timer.set(0)
>   @userstate.on_timer(timer_spec)
>   def process_timer(self, timer=beam.DoFn.TimerParam(timer_spec)):
> timer.set(0)
> {code}
> This yields the following Python stack trace:
> {noformat}
> INFO:apache_beam.utils.subprocess_server:Caused by: 
> java.lang.RuntimeException: Error received from SDK harness for instruction 
> 4: Traceback (most recent call last):
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 245, in _execute
> INFO:apache_beam.utils.subprocess_server: response = task()
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 302, in 
> INFO:apache_beam.utils.subprocess_server: lambda: 
> self.create_worker().do_instruction(request), request)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 471, in do_instruction
> INFO:apache_beam.utils.subprocess_server: getattr(request, request_type), 
> request.instruction_id)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 506, in process_bundle
> INFO:apache_beam.utils.subprocess_server: 
> bundle_processor.process_bundle(instruction_id))
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/bundle_processor.py", line 910, in process_bundle
> INFO:apache_beam.utils.subprocess_server: element.timer_family_id, timer_data)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/operations.py", line 688, in process_timer
> INFO:apache_beam.utils.subprocess_server: timer_data.fire_timestamp)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 990, in process_user_timer
> INFO:apache_beam.utils.subprocess_server: self._reraise_augmented(exn)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 1043, in _reraise_augmented
> INFO:apache_beam.utils.subprocess_server: raise_with_traceback(new_exn)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 988, in process_user_timer
> INFO:apache_beam.utils.subprocess_server: 
> self.do_fn_invoker.invoke_user_timer(timer_spec, key, window, timestamp)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 517, in invoke_user_timer
> INFO:apache_beam.utils.subprocess_server: self.user_state_context, key, 
> window, timestamp))
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 1093, in process_outputs
> INFO:apache_beam.utils.subprocess_server: for result in results:
> INFO:apache_beam.utils.subprocess_server: File 
> "/Users/max/Dev/beam/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
>  line 185, in process_timer
> INFO:apache_beam.utils.subprocess_server: timer.set(0)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/bundle_processor.py", line 589, in set
> INFO:apache_beam.utils.subprocess_server: 
> self._timer_coder_impl.encode_to_stream(timer, self._output_stream, True)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/coders/coder_impl.py", line 651, in encode_to_stream
> INFO:apache_beam.utils.subprocess_server: value.hold_timestamp, out, True)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/coders/coder_impl.py", line 608, in encode_to_stream
> INFO:apache_beam.utils.subprocess_server: millis = value.micros // 1000
> INFO:apache_beam.utils.subprocess_server:AttributeError: 'NoneType' object 
> has no attribute 'micros' [while running 'GenerateLoad']
> {noformat}
> Looking at the code base, I'm not sure we have tests for timer output 
> timestamps. Am I missing something?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Resolved] (BEAM-6661) FnApi gRPC setup/teardown glitch

2020-04-28 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-6661.
--
Fix Version/s: 2.22.0
 Assignee: (was: Heejong Lee)
   Resolution: Fixed

> FnApi gRPC setup/teardown glitch
> 
>
> Key: BEAM-6661
> URL: https://issues.apache.org/jira/browse/BEAM-6661
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution
>Affects Versions: 2.11.0
>Reporter: Heejong Lee
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Multiple exceptions are observed during FnApi gRPC setup/teardown. The 
> examples are
> {noformat}
> 14:53:22 [grpc-default-executor-1] WARN 
> org.apache.beam.runners.fnexecution.logging.GrpcLoggingService - Logging 
> client failed unexpectedly.
> 14:53:22 org.apache.beam.vendor.grpc.v1p13p1.io.grpc.StatusRuntimeException: 
> CANCELLED: cancelled before receiving half close
> 14:53:22 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Status.asRuntimeException(Status.java:517)
> 14:53:22 at{noformat}
> {noformat}
>  
> 14:52:56 [main] INFO org.apache.beam.runners.flink.FlinkJobServerDriver - 
> JobService started on localhost:58179
> 14:52:57 [grpc-default-executor-0] ERROR 
> org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService - 
> Encountered Unexpected Exception for Invocation 
> job_bfb7df0e-408e-4bfd-bb3c-432e946ca819
> 14:52:57 org.apache.beam.vendor.grpc.v1p13p1.io.grpc.StatusException: 
> NOT_FOUND
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Status.asException(Status.java:534)
> 14:52:57 at 
> org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService.getInvocation(InMemoryJobService.java:341)
> 14:52:57 at 
> org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService.getStateStream(InMemoryJobService.java:262)
> 14:52:57 at 
> org.apache.beam.model.jobmanagement.v1.JobServiceGrpc$MethodHandlers.invoke(JobServiceGrpc.java:693)
>  14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Contexts$ContextualizedServerCallListener.onHalfClose(Contexts.java:86)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>  14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
> 14:52:57 at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 14:52:57 at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> {noformat}
> {noformat}
> 14:54:50 Feb 07, 2019 10:54:50 PM 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference
>  cleanQueue
> 14:54:50 SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=628, 
> target=localhost:41409} was not shutdown properly!!! ~*~*~*
> 14:54:50 Make sure to call shutdown()/shutdownNow() and wait until 
> awaitTermination() returns true.
> 14:54:50 java.lang.RuntimeException: ManagedChannel allocation site
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference.(ManagedChannelOrphanWrapper.java:103)
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:53)
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:44)
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.AbstractManagedChannelImplBuilder.build(AbstractManagedChannelImplBuilder.java:410)
> 14:54:50 at 
> org.apache.beam.sdk.fn.channel.ManagedChannelFactory.forDescriptor(ManagedChannelFactory.java:44)
> 14:54:50 at 
> org.apache.beam.runners.fnexecution.environment.ExternalEnvironmentFactory.createEnvironment(ExternalEnvironmentFa

[jira] [Updated] (BEAM-9851) "DNS resolution failed"

2020-04-29 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9851:
-
Status: Open  (was: Triage Needed)

> "DNS resolution failed"
> ---
>
> Key: BEAM-9851
> URL: https://issues.apache.org/jira/browse/BEAM-9851
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Maximilian Michels
>Priority: Major
>
> I'm seeing this one all the time, e.g. 
> https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Phrase/158/console
> {noformat}
> 23:17:02 ERROR:apache_beam.runners.worker.data_plane:Failed to read inputs in 
> the data plane.
> 23:17:02 Traceback (most recent call last):
> 23:17:02   File "apache_beam/runners/worker/data_plane.py", line 528, in 
> _read_inputs
> 23:17:02 for elements in elements_iterator:
> 23:17:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Phrase/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py",
>  line 413, in next
> 23:17:02 return self._next()
> 23:17:02   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Phrase/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py",
>  line 689, in _next
> 23:17:02 raise self
> 23:17:02 _MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that 
> terminated with:
> 23:17:02  status = StatusCode.UNAVAILABLE
> 23:17:02  details = "DNS resolution failed"
> 23:17:02  debug_error_string = 
> "{"created":"@1588108621.907750662","description":"Failed to pick 
> subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3981,"referenced_errors":[{"created":"@1588108621.907745000","description":"Resolver
>  transient 
> failure","file":"src/core/ext/filters/client_channel/resolving_lb_policy.cc","file_line":214,"referenced_errors":[{"created":"@1588108621.907743049","description":"DNS
>  resolution 
> failed","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/dns_resolver_ares.cc","file_line":357,"grpc_status":14,"referenced_errors":[{"created":"@1588108621.907719737","description":"C-ares
>  status is not ARES_SUCCESS: Misformatted domain 
> name","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc","file_line":244,"referenced_errors":[{"created":"@1588108621.907691960","description":"C-ares
>  status is not ARES_SUCCESS: Misformatted domain 
> name","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc","file_line":244}]}]}]}]}"
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9851) "DNS resolution failed"

2020-04-29 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9851:


 Summary: "DNS resolution failed"
 Key: BEAM-9851
 URL: https://issues.apache.org/jira/browse/BEAM-9851
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Maximilian Michels


I'm seeing this one all the time, e.g. 
https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Phrase/158/console

{noformat}
23:17:02 ERROR:apache_beam.runners.worker.data_plane:Failed to read inputs in 
the data plane.
23:17:02 Traceback (most recent call last):
23:17:02   File "apache_beam/runners/worker/data_plane.py", line 528, in 
_read_inputs
23:17:02 for elements in elements_iterator:
23:17:02   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Phrase/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py",
 line 413, in next
23:17:02 return self._next()
23:17:02   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Phrase/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py",
 line 689, in _next
23:17:02 raise self
23:17:02 _MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that 
terminated with:
23:17:02status = StatusCode.UNAVAILABLE
23:17:02details = "DNS resolution failed"
23:17:02debug_error_string = 
"{"created":"@1588108621.907750662","description":"Failed to pick 
subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3981,"referenced_errors":[{"created":"@1588108621.907745000","description":"Resolver
 transient 
failure","file":"src/core/ext/filters/client_channel/resolving_lb_policy.cc","file_line":214,"referenced_errors":[{"created":"@1588108621.907743049","description":"DNS
 resolution 
failed","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/dns_resolver_ares.cc","file_line":357,"grpc_status":14,"referenced_errors":[{"created":"@1588108621.907719737","description":"C-ares
 status is not ARES_SUCCESS: Misformatted domain 
name","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc","file_line":244,"referenced_errors":[{"created":"@1588108621.907691960","description":"C-ares
 status is not ARES_SUCCESS: Misformatted domain 
name","file":"src/core/ext/filters/client_channel/resolver/dns/c_ares/grpc_ares_wrapper.cc","file_line":244}]}]}]}]}"
{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9854) Docker container build fails (:sdks:python:container:py37:docker)

2020-04-29 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9854:


 Summary: Docker container build fails 
(:sdks:python:container:py37:docker)
 Key: BEAM-9854
 URL: https://issues.apache.org/jira/browse/BEAM-9854
 Project: Beam
  Issue Type: Bug
  Components: build-system, sdk-py-harness
Reporter: Maximilian Michels


The {{:sdks:python:container:py37:docker}} goal doesn't build, e.g.: 
https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/5/console

{noformat}
17:48:10 > Task :sdks:python:container:py37:docker
17:49:36 The command '/bin/sh -c pip install -r 
/tmp/base_image_requirements.txt && python -c "from 
google.protobuf.internal import api_implementation; assert 
api_implementation._default_implementation_type == 'cpp'; print ('Verified fast 
protobuf used.')" && rm -rf /root/.cache/pip' returned a non-zero code: 1
17:49:36 
17:49:36 > Task :sdks:python:container:py37:docker FAILED
17:49:36 
17:49:36 FAILURE: Build failed with an exception.
17:49:36 
17:49:36 * What went wrong:
17:49:36 Execution failed for task ':sdks:python:container:py37:docker'.
17:49:36 > Process 'command 'docker'' finished with non-zero exit value 1
{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9854) Docker container build fails (:sdks:python:container:py37:docker)

2020-04-29 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9854:
-
Status: Open  (was: Triage Needed)

> Docker container build fails (:sdks:python:container:py37:docker)
> -
>
> Key: BEAM-9854
> URL: https://issues.apache.org/jira/browse/BEAM-9854
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, sdk-py-harness
>Reporter: Maximilian Michels
>Priority: Major
>
> The {{:sdks:python:container:py37:docker}} goal doesn't build, e.g.: 
> https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/5/console
> {noformat}
> 17:48:10 > Task :sdks:python:container:py37:docker
> 17:49:36 The command '/bin/sh -c pip install -r 
> /tmp/base_image_requirements.txt && python -c "from 
> google.protobuf.internal import api_implementation; assert 
> api_implementation._default_implementation_type == 'cpp'; print ('Verified 
> fast protobuf used.')" && rm -rf /root/.cache/pip' returned a non-zero 
> code: 1
> 17:49:36 
> 17:49:36 > Task :sdks:python:container:py37:docker FAILED
> 17:49:36 
> 17:49:36 FAILURE: Build failed with an exception.
> 17:49:36 
> 17:49:36 * What went wrong:
> 17:49:36 Execution failed for task ':sdks:python:container:py37:docker'.
> 17:49:36 > Process 'command 'docker'' finished with non-zero exit value 1
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9855) Make it easier to configure a Flink state backend

2020-04-29 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9855:


 Summary: Make it easier to configure a Flink state backend
 Key: BEAM-9855
 URL: https://issues.apache.org/jira/browse/BEAM-9855
 Project: Beam
  Issue Type: Improvement
  Components: runner-flink
Reporter: Maximilian Michels


We should make it easier to configure a Flink state backend. At the moment, 
users have to either (1) configure the default state backend in their Flink 
cluster, or (2) make sure they (a) include the dependency in their Gradle/Maven 
project (e.g. 
{{"org.apache.flink:flink-statebackend-rocksdb_2.11:$flink_version"}} for 
RocksDB) and (b) set the state backend factory in the {{FlinkPipelineOptions}}.

The drawback of option (2) is that it only works in Java, because the factory 
has to be specified as a Java class.

We can make this easier by simply adding pipeline options for the state backend 
name and the checkpoint directory, which is enough to configure the state 
backend. We can add the RocksDB state backend as a default dependency.
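
For illustration, this is roughly how it could look from the Python SDK once 
such options exist. The names {{--state_backend}} and 
{{--state_backend_storage_path}} are placeholders, not an existing API; 
{{--checkpointing_interval}} already exists:

{code:Python}
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    '--runner=FlinkRunner',
    '--checkpointing_interval=60000',
    # Hypothetical options proposed above:
    '--state_backend=rocksdb',
    '--state_backend_storage_path=file:///tmp/flink-checkpoints',
])

with beam.Pipeline(options=options) as pipeline:
  _ = pipeline | beam.Create(range(10)) | beam.Map(lambda x: x * x)
{code}

With that in place, neither a cluster-side default nor a Java-only factory class 
would be required.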



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9855) Make it easier to configure a Flink state backend

2020-04-29 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9855:
-
Status: Open  (was: Triage Needed)

> Make it easier to configure a Flink state backend
> -
>
> Key: BEAM-9855
> URL: https://issues.apache.org/jira/browse/BEAM-9855
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Maximilian Michels
>Priority: Major
>
> We should make it easier to configure a Flink state backend. At the moment, 
> users have to either (1) configure the default state backend in their Flink 
> cluster, or (2) make sure they (a) include the dependency in their Gradle/Maven 
> project (e.g. 
> {{"org.apache.flink:flink-statebackend-rocksdb_2.11:$flink_version"}} for 
> RocksDB) and (b) set the state backend factory in the {{FlinkPipelineOptions}}.
> The drawback of option (2) is that it only works in Java, because the factory 
> has to be specified as a Java class.
> We can make this easier by simply adding pipeline options for the state backend 
> name and the checkpoint directory, which is enough to configure the state 
> backend. We can add the RocksDB state backend as a default dependency.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-6661) FnApi gRPC setup/teardown glitch

2020-04-29 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-6661:
-
Status: Open  (was: Triage Needed)

> FnApi gRPC setup/teardown glitch
> 
>
> Key: BEAM-6661
> URL: https://issues.apache.org/jira/browse/BEAM-6661
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution
>Affects Versions: 2.11.0
>Reporter: Heejong Lee
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Multiple exceptions are observed during FnApi gRPC setup/teardown. The 
> examples are
> {noformat}
> 14:53:22 [grpc-default-executor-1] WARN 
> org.apache.beam.runners.fnexecution.logging.GrpcLoggingService - Logging 
> client failed unexpectedly.
> 14:53:22 org.apache.beam.vendor.grpc.v1p13p1.io.grpc.StatusRuntimeException: 
> CANCELLED: cancelled before receiving half close
> 14:53:22 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Status.asRuntimeException(Status.java:517)
> 14:53:22 at{noformat}
> {noformat}
>  
> 14:52:56 [main] INFO org.apache.beam.runners.flink.FlinkJobServerDriver - 
> JobService started on localhost:58179
> 14:52:57 [grpc-default-executor-0] ERROR 
> org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService - 
> Encountered Unexpected Exception for Invocation 
> job_bfb7df0e-408e-4bfd-bb3c-432e946ca819
> 14:52:57 org.apache.beam.vendor.grpc.v1p13p1.io.grpc.StatusException: 
> NOT_FOUND
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Status.asException(Status.java:534)
> 14:52:57 at 
> org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService.getInvocation(InMemoryJobService.java:341)
> 14:52:57 at 
> org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService.getStateStream(InMemoryJobService.java:262)
> 14:52:57 at 
> org.apache.beam.model.jobmanagement.v1.JobServiceGrpc$MethodHandlers.invoke(JobServiceGrpc.java:693)
>  14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Contexts$ContextualizedServerCallListener.onHalfClose(Contexts.java:86)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>  14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
> 14:52:57 at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 14:52:57 at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> {noformat}
> {noformat}
> 14:54:50 Feb 07, 2019 10:54:50 PM 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference
>  cleanQueue
> 14:54:50 SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=628, 
> target=localhost:41409} was not shutdown properly!!! ~*~*~*
> 14:54:50 Make sure to call shutdown()/shutdownNow() and wait until 
> awaitTermination() returns true.
> 14:54:50 java.lang.RuntimeException: ManagedChannel allocation site
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference.(ManagedChannelOrphanWrapper.java:103)
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:53)
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:44)
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.AbstractManagedChannelImplBuilder.build(AbstractManagedChannelImplBuilder.java:410)
> 14:54:50 at 
> org.apache.beam.sdk.fn.channel.ManagedChannelFactory.forDescriptor(ManagedChannelFactory.java:44)
> 14:54:50 at 
> org.apache.beam.runners.fnexecution.environment.ExternalEnvironmentFactory.createEnvironment(ExternalEnvironmentFactory.java:108)
> 14:54:50 at 
> org.apache.beam.runne

[jira] [Resolved] (BEAM-6661) FnApi gRPC setup/teardown glitch

2020-04-29 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-6661.
--
Fix Version/s: (was: 2.22.0)
   2.21.0
   Resolution: Fixed

> FnApi gRPC setup/teardown glitch
> 
>
> Key: BEAM-6661
> URL: https://issues.apache.org/jira/browse/BEAM-6661
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution
>Affects Versions: 2.11.0
>Reporter: Heejong Lee
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Multiple exceptions are observed during FnApi gRPC setup/teardown. The 
> examples are
> {noformat}
> 14:53:22 [grpc-default-executor-1] WARN 
> org.apache.beam.runners.fnexecution.logging.GrpcLoggingService - Logging 
> client failed unexpectedly.
> 14:53:22 org.apache.beam.vendor.grpc.v1p13p1.io.grpc.StatusRuntimeException: 
> CANCELLED: cancelled before receiving half close
> 14:53:22 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Status.asRuntimeException(Status.java:517)
> 14:53:22 at{noformat}
> {noformat}
>  
> 14:52:56 [main] INFO org.apache.beam.runners.flink.FlinkJobServerDriver - 
> JobService started on localhost:58179
> 14:52:57 [grpc-default-executor-0] ERROR 
> org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService - 
> Encountered Unexpected Exception for Invocation 
> job_bfb7df0e-408e-4bfd-bb3c-432e946ca819
> 14:52:57 org.apache.beam.vendor.grpc.v1p13p1.io.grpc.StatusException: 
> NOT_FOUND
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Status.asException(Status.java:534)
> 14:52:57 at 
> org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService.getInvocation(InMemoryJobService.java:341)
> 14:52:57 at 
> org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService.getStateStream(InMemoryJobService.java:262)
> 14:52:57 at 
> org.apache.beam.model.jobmanagement.v1.JobServiceGrpc$MethodHandlers.invoke(JobServiceGrpc.java:693)
>  14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Contexts$ContextualizedServerCallListener.onHalfClose(Contexts.java:86)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>  14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
> 14:52:57 at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 14:52:57 at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> {noformat}
> {noformat}
> 14:54:50 Feb 07, 2019 10:54:50 PM 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference
>  cleanQueue
> 14:54:50 SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=628, 
> target=localhost:41409} was not shutdown properly!!! ~*~*~*
> 14:54:50 Make sure to call shutdown()/shutdownNow() and wait until 
> awaitTermination() returns true.
> 14:54:50 java.lang.RuntimeException: ManagedChannel allocation site
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference.(ManagedChannelOrphanWrapper.java:103)
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:53)
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:44)
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.AbstractManagedChannelImplBuilder.build(AbstractManagedChannelImplBuilder.java:410)
> 14:54:50 at 
> org.apache.beam.sdk.fn.channel.ManagedChannelFactory.forDescriptor(ManagedChannelFactory.java:44)
> 14:54:50 at 
> org.apache.beam.runners.fnexecution.environment.ExternalEnvironmentFactory.createEnvironment(ExternalEnvironmentFacto

[jira] [Reopened] (BEAM-6661) FnApi gRPC setup/teardown glitch

2020-04-29 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reopened BEAM-6661:
--

> FnApi gRPC setup/teardown glitch
> 
>
> Key: BEAM-6661
> URL: https://issues.apache.org/jira/browse/BEAM-6661
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution
>Affects Versions: 2.11.0
>Reporter: Heejong Lee
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Multiple exceptions are observed during FnApi gRPC setup/teardown. The 
> examples are
> {noformat}
> 14:53:22 [grpc-default-executor-1] WARN 
> org.apache.beam.runners.fnexecution.logging.GrpcLoggingService - Logging 
> client failed unexpectedly.
> 14:53:22 org.apache.beam.vendor.grpc.v1p13p1.io.grpc.StatusRuntimeException: 
> CANCELLED: cancelled before receiving half close
> 14:53:22 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Status.asRuntimeException(Status.java:517)
> 14:53:22 at{noformat}
> {noformat}
>  
> 14:52:56 [main] INFO org.apache.beam.runners.flink.FlinkJobServerDriver - 
> JobService started on localhost:58179
> 14:52:57 [grpc-default-executor-0] ERROR 
> org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService - 
> Encountered Unexpected Exception for Invocation 
> job_bfb7df0e-408e-4bfd-bb3c-432e946ca819
> 14:52:57 org.apache.beam.vendor.grpc.v1p13p1.io.grpc.StatusException: 
> NOT_FOUND
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Status.asException(Status.java:534)
> 14:52:57 at 
> org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService.getInvocation(InMemoryJobService.java:341)
> 14:52:57 at 
> org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService.getStateStream(InMemoryJobService.java:262)
> 14:52:57 at 
> org.apache.beam.model.jobmanagement.v1.JobServiceGrpc$MethodHandlers.invoke(JobServiceGrpc.java:693)
>  14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Contexts$ContextualizedServerCallListener.onHalfClose(Contexts.java:86)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
> 14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>  14:52:57 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
> 14:52:57 at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 14:52:57 at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> {noformat}
> {noformat}
> 14:54:50 Feb 07, 2019 10:54:50 PM 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference
>  cleanQueue
> 14:54:50 SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=628, 
> target=localhost:41409} was not shutdown properly!!! ~*~*~*
> 14:54:50 Make sure to call shutdown()/shutdownNow() and wait until 
> awaitTermination() returns true.
> 14:54:50 java.lang.RuntimeException: ManagedChannel allocation site
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference.(ManagedChannelOrphanWrapper.java:103)
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:53)
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:44)
> 14:54:50 at 
> org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.AbstractManagedChannelImplBuilder.build(AbstractManagedChannelImplBuilder.java:410)
> 14:54:50 at 
> org.apache.beam.sdk.fn.channel.ManagedChannelFactory.forDescriptor(ManagedChannelFactory.java:44)
> 14:54:50 at 
> org.apache.beam.runners.fnexecution.environment.ExternalEnvironmentFactory.createEnvironment(ExternalEnvironmentFactory.java:108)
> 14:54:50 at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBund

[jira] [Commented] (BEAM-9854) Docker container build fails (:sdks:python:container:py37:docker)

2020-04-30 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096596#comment-17096596
 ] 

Maximilian Michels commented on BEAM-9854:
--

Apparently caused by insufficient disk space on the Jenkins hosts.

> Docker container build fails (:sdks:python:container:py37:docker)
> -
>
> Key: BEAM-9854
> URL: https://issues.apache.org/jira/browse/BEAM-9854
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, sdk-py-harness
>Reporter: Maximilian Michels
>Priority: Major
>
> The {{:sdks:python:container:py37:docker}} goal doesn't build, e.g.: 
> https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/5/console
> {noformat}
> 17:48:10 > Task :sdks:python:container:py37:docker
> 17:49:36 The command '/bin/sh -c pip install -r 
> /tmp/base_image_requirements.txt && python -c "from 
> google.protobuf.internal import api_implementation; assert 
> api_implementation._default_implementation_type == 'cpp'; print ('Verified 
> fast protobuf used.')" && rm -rf /root/.cache/pip' returned a non-zero 
> code: 1
> 17:49:36 
> 17:49:36 > Task :sdks:python:container:py37:docker FAILED
> 17:49:36 
> 17:49:36 FAILURE: Build failed with an exception.
> 17:49:36 
> 17:49:36 * What went wrong:
> 17:49:36 Execution failed for task ':sdks:python:container:py37:docker'.
> 17:49:36 > Process 'command 'docker'' finished with non-zero exit value 1
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8944) Python SDK harness performance degradation with UnboundedThreadPoolExecutor

2020-05-01 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17097302#comment-17097302
 ] 

Maximilian Michels commented on BEAM-8944:
--

[~lcwik] For the time being, do you think we could support two modes?

1) dynamic thread allocation
2) a static number of threads

The second mode could be removed once we can ensure that mode (1) performs as 
efficiently as the static one. We can default to mode (1). What do you think?
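
Not the actual sdk_worker change, just a minimal sketch to make the two modes concrete. The {{static_worker_threads}} parameter name is made up for illustration, and the import path of {{UnboundedThreadPoolExecutor}} is an assumption:

{code:python}
from concurrent import futures

# Assumed location of the unbounded executor used by the SDK harness.
from apache_beam.utils.thread_pool_executor import UnboundedThreadPoolExecutor


def create_task_executor(static_worker_threads=None):
  """Mode (2) when a fixed thread count is given, mode (1) otherwise."""
  if static_worker_threads:
    # Mode (2): a static number of threads, matching the previous behaviour.
    return futures.ThreadPoolExecutor(max_workers=static_worker_threads)
  # Mode (1): dynamic thread allocation, the proposed default.
  return UnboundedThreadPoolExecutor()
{code}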

> Python SDK harness performance degradation with UnboundedThreadPoolExecutor
> ---
>
> Key: BEAM-8944
> URL: https://issues.apache.org/jira/browse/BEAM-8944
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Affects Versions: 2.18.0
>Reporter: Yichi Zhang
>Priority: Critical
> Attachments: checkpoint-duration.png, profiling.png, 
> profiling_one_thread.png, profiling_twelve_threads.png
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> We are seeing a performance degradation for python streaming word count load 
> tests.
>  
> After some investigation, it appears to be caused by swapping the original 
> ThreadPoolExecutor to UnboundedThreadPoolExecutor in sdk worker. Suspicion is 
> that python performance is worse with more threads on cpu-bounded tasks.
>  
> A simple test for comparing the multiple thread pool executor performance:
>  
> {code:python}
> def test_performance(self):
>   def run_perf(executor):
>     total_number = 100
>     q = queue.Queue()
>
>     def task(number):
>       hash(number)
>       q.put(number + 200)
>       return number
>
>     t = time.time()
>     count = 0
>     for i in range(200):
>       q.put(i)
>
>     while count < total_number:
>       executor.submit(task, q.get(block=True))
>       count += 1
>     print('%s uses %s' % (executor, time.time() - t))
>
>   with UnboundedThreadPoolExecutor() as executor:
>     run_perf(executor)
>
>   with futures.ThreadPoolExecutor(max_workers=1) as executor:
>     run_perf(executor)
>
>   with futures.ThreadPoolExecutor(max_workers=12) as executor:
>     run_perf(executor)
> {code}
> Results (executors in the order above):
> UnboundedThreadPoolExecutor uses 268.160675049
> ThreadPoolExecutor(max_workers=1) uses 79.904583931
> ThreadPoolExecutor(max_workers=12) uses 191.179054976
> Profiling:
> UnboundedThreadPoolExecutor:
>  !profiling.png! 
> 1 Thread ThreadPoolExecutor:
>  !profiling_one_thread.png! 
> 12 Threads ThreadPoolExecutor:
>  !profiling_twelve_threads.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9874) Portable timers can't be cleared in batch mode

2020-05-03 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9874:


 Summary: Portable timers can't be cleared in batch mode
 Key: BEAM-9874
 URL: https://issues.apache.org/jira/browse/BEAM-9874
 Project: Beam
  Issue Type: Bug
  Components: runner-flink, runner-spark
Reporter: Maximilian Michels
Assignee: Maximilian Michels


After BEAM-9801, the {{test_pardo_timers_clear}} test fails.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9874) Portable timers can't be cleared in batch mode

2020-05-03 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9874:
-
Fix Version/s: 2.21.0

> Portable timers can't be cleared in batch mode
> --
>
> Key: BEAM-9874
> URL: https://issues.apache.org/jira/browse/BEAM-9874
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.21.0
>
>
> After BEAM-9801, the {{test_pardo_timers_clear}} test fails. The test was 
> probably broken before but we weren't depleting timers on shutdown.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9874) Portable timers can't be cleared in batch mode

2020-05-03 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9874:
-
Description: After BEAM-9801, the {{test_pardo_timers_clear}} test fails. 
The test was probably broken before but we weren't depleting timers on 
shutdown.  (was: After BEAM-9801, the {{test_pardo_timers_clear}} test fails.)

> Portable timers can't be cleared in batch mode
> --
>
> Key: BEAM-9874
> URL: https://issues.apache.org/jira/browse/BEAM-9874
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>
> After BEAM-9801, the {{test_pardo_timers_clear}} test fails. The test was 
> probably broken before but we weren't depleting timers on shutdown.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9801) Setting a timer from a timer callback fails

2020-05-04 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9801.
--
Resolution: Fixed

> Setting a timer from a timer callback fails
> ---
>
> Key: BEAM-9801
> URL: https://issues.apache.org/jira/browse/BEAM-9801
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> Hi,
> I'm trying to set a timer from a timer callback in the Python SDK:
> {code:Python}
> class GenerateLoad(beam.DoFn):
>   timer_spec = userstate.TimerSpec('timer', userstate.TimeDomain.WATERMARK)
>   def process(self, element, timer=beam.DoFn.TimerParam(timer_spec)):
> self.key = element[0]
> timer.set(0)
>   @userstate.on_timer(timer_spec)
>   def process_timer(self, timer=beam.DoFn.TimerParam(timer_spec)):
> timer.set(0)
> {code}
> This yields the following Python stack trace:
> {noformat}
> INFO:apache_beam.utils.subprocess_server:Caused by: 
> java.lang.RuntimeException: Error received from SDK harness for instruction 
> 4: Traceback (most recent call last):
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 245, in _execute
> INFO:apache_beam.utils.subprocess_server: response = task()
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 302, in <lambda>
> INFO:apache_beam.utils.subprocess_server: lambda: 
> self.create_worker().do_instruction(request), request)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 471, in do_instruction
> INFO:apache_beam.utils.subprocess_server: getattr(request, request_type), 
> request.instruction_id)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 506, in process_bundle
> INFO:apache_beam.utils.subprocess_server: 
> bundle_processor.process_bundle(instruction_id))
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/bundle_processor.py", line 910, in process_bundle
> INFO:apache_beam.utils.subprocess_server: element.timer_family_id, timer_data)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/operations.py", line 688, in process_timer
> INFO:apache_beam.utils.subprocess_server: timer_data.fire_timestamp)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 990, in process_user_timer
> INFO:apache_beam.utils.subprocess_server: self._reraise_augmented(exn)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 1043, in _reraise_augmented
> INFO:apache_beam.utils.subprocess_server: raise_with_traceback(new_exn)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 988, in process_user_timer
> INFO:apache_beam.utils.subprocess_server: 
> self.do_fn_invoker.invoke_user_timer(timer_spec, key, window, timestamp)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 517, in invoke_user_timer
> INFO:apache_beam.utils.subprocess_server: self.user_state_context, key, 
> window, timestamp))
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 1093, in process_outputs
> INFO:apache_beam.utils.subprocess_server: for result in results:
> INFO:apache_beam.utils.subprocess_server: File 
> "/Users/max/Dev/beam/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
>  line 185, in process_timer
> INFO:apache_beam.utils.subprocess_server: timer.set(0)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/bundle_processor.py", line 589, in set
> INFO:apache_beam.utils.subprocess_server: 
> self._timer_coder_impl.encode_to_stream(timer, self._output_stream, True)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/coders/coder_impl.py", line 651, in encode_to_stream
> INFO:apache_beam.utils.subprocess_server: value.hold_timestamp, out, True)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/coders/coder_impl.py", line 608, in encode_to_stream
> INFO:apache_beam.utils.subprocess_server: millis = value.micros // 1000
> INFO:apache_beam.utils.subprocess_server:AttributeError: 'NoneType' object 
> has no attribute 'micros' [while running 'GenerateLoad']
> {noformat}
> Looking at the code base, I'm not sure we have tests for timer output 
> timestamps. Am I missing something?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-9801) Setting a timer from a timer callback fails

2020-05-04 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reassigned BEAM-9801:


Assignee: Maximilian Michels

> Setting a timer from a timer callback fails
> ---
>
> Key: BEAM-9801
> URL: https://issues.apache.org/jira/browse/BEAM-9801
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> Hi,
> I'm trying to set a timer from a timer callback in the Python SDK:
> {code:Python}
> class GenerateLoad(beam.DoFn):
>   timer_spec = userstate.TimerSpec('timer', userstate.TimeDomain.WATERMARK)
>   def process(self, element, timer=beam.DoFn.TimerParam(timer_spec)):
> self.key = element[0]
> timer.set(0)
>   @userstate.on_timer(timer_spec)
>   def process_timer(self, timer=beam.DoFn.TimerParam(timer_spec)):
> timer.set(0)
> {code}
> This yields the following Python stack trace:
> {noformat}
> INFO:apache_beam.utils.subprocess_server:Caused by: 
> java.lang.RuntimeException: Error received from SDK harness for instruction 
> 4: Traceback (most recent call last):
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 245, in _execute
> INFO:apache_beam.utils.subprocess_server: response = task()
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 302, in <lambda>
> INFO:apache_beam.utils.subprocess_server: lambda: 
> self.create_worker().do_instruction(request), request)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 471, in do_instruction
> INFO:apache_beam.utils.subprocess_server: getattr(request, request_type), 
> request.instruction_id)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/sdk_worker.py", line 506, in process_bundle
> INFO:apache_beam.utils.subprocess_server: 
> bundle_processor.process_bundle(instruction_id))
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/bundle_processor.py", line 910, in process_bundle
> INFO:apache_beam.utils.subprocess_server: element.timer_family_id, timer_data)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/operations.py", line 688, in process_timer
> INFO:apache_beam.utils.subprocess_server: timer_data.fire_timestamp)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 990, in process_user_timer
> INFO:apache_beam.utils.subprocess_server: self._reraise_augmented(exn)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 1043, in _reraise_augmented
> INFO:apache_beam.utils.subprocess_server: raise_with_traceback(new_exn)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 988, in process_user_timer
> INFO:apache_beam.utils.subprocess_server: 
> self.do_fn_invoker.invoke_user_timer(timer_spec, key, window, timestamp)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 517, in invoke_user_timer
> INFO:apache_beam.utils.subprocess_server: self.user_state_context, key, 
> window, timestamp))
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/common.py", line 1093, in process_outputs
> INFO:apache_beam.utils.subprocess_server: for result in results:
> INFO:apache_beam.utils.subprocess_server: File 
> "/Users/max/Dev/beam/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
>  line 185, in process_timer
> INFO:apache_beam.utils.subprocess_server: timer.set(0)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/runners/worker/bundle_processor.py", line 589, in set
> INFO:apache_beam.utils.subprocess_server: 
> self._timer_coder_impl.encode_to_stream(timer, self._output_stream, True)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/coders/coder_impl.py", line 651, in encode_to_stream
> INFO:apache_beam.utils.subprocess_server: value.hold_timestamp, out, True)
> INFO:apache_beam.utils.subprocess_server: File 
> "apache_beam/coders/coder_impl.py", line 608, in encode_to_stream
> INFO:apache_beam.utils.subprocess_server: millis = value.micros // 1000
> INFO:apache_beam.utils.subprocess_server:AttributeError: 'NoneType' object 
> has no attribute 'micros' [while running 'GenerateLoad']
> {noformat}
> Looking at the code base, I'm not sure we have tests for timer output 
> timestamps. Am I missing something?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9874) Portable timers can't be cleared in batch mode

2020-05-04 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9874.
--
Resolution: Fixed

> Portable timers can't be cleared in batch mode
> --
>
> Key: BEAM-9874
> URL: https://issues.apache.org/jira/browse/BEAM-9874
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> After BEAM-9801, the {{test_pardo_timers_clear}} test fails. The test was 
> probably broken before but this was not visible because we weren't depleting 
> timers on shutdown.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9874) Portable timers can't be cleared in batch mode

2020-05-04 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9874:
-
Description: After BEAM-9801, the {{test_pardo_timers_clear}} test fails. 
The test was probably broken before but this was not visible because we weren't 
depleting timers on shutdown.  (was: After BEAM-9801, the 
{{test_pardo_timers_clear}} test fails. The test was probably broken before but 
we weren't depleting timers on shutdown.)

> Portable timers can't be cleared in batch mode
> --
>
> Key: BEAM-9874
> URL: https://issues.apache.org/jira/browse/BEAM-9874
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> After BEAM-9801, the {{test_pardo_timers_clear}} test fails. The test was 
> probably broken before but this was not visible because we weren't depleting 
> timers on shutdown.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9841) PortableRunner does not support wait_until_finish(duration=...)

2020-05-04 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9841.
--
Fix Version/s: 2.22.0
   Resolution: Fixed

> PortableRunner does not support wait_until_finish(duration=...)
> ---
>
> Key: BEAM-9841
> URL: https://issues.apache.org/jira/browse/BEAM-9841
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Other runners in the Python SDK support waiting for a finite amount of time 
> on the PipelineResult, and so should PortableRunner.
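
A minimal usage sketch of what this enables (the job endpoint and pipeline contents are placeholders; {{duration}} is given in milliseconds):

{code:python}
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    '--runner=PortableRunner',
    '--job_endpoint=localhost:8099',   # placeholder job server address
    '--environment_type=LOOPBACK',
])

p = beam.Pipeline(options=options)
_ = p | beam.Create([1, 2, 3]) | beam.Map(print)

result = p.run()
# Wait at most 60 seconds for the job instead of blocking indefinitely.
state = result.wait_until_finish(duration=60 * 1000)
{code}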



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-1819) Key should be available in @OnTimer methods (Java)

2020-05-05 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-1819:
-
Summary: Key should be available in @OnTimer methods (Java)  (was: Key 
should be available in @OnTimer methods)

> Key should be available in @OnTimer methods (Java)
> --
>
> Key: BEAM-1819
> URL: https://issues.apache.org/jira/browse/BEAM-1819
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Rehman Murad Ali
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 32.5h
>  Remaining Estimate: 0h
>
> Every timer firing has an associated key. This key should be available when 
> the timer is delivered to a user's {{DoFn}}, so they don't have to store it 
> in state.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-1819) Key should be available in @OnTimer methods

2020-05-05 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099771#comment-17099771
 ] 

Maximilian Michels commented on BEAM-1819:
--

It looks like Python already passes the key. So good to close.
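
For reference, a minimal sketch of how the key already surfaces in a Python timer callback via {{DoFn.KeyParam}} (illustration only, not the code added for this issue; the Java change tracked here exposes the key analogously):

{code:python}
import apache_beam as beam
from apache_beam.transforms import userstate


class EmitKeyOnTimer(beam.DoFn):
  TIMER = userstate.TimerSpec('timer', userstate.TimeDomain.WATERMARK)

  def process(self, element, timer=beam.DoFn.TimerParam(TIMER)):
    timer.set(0)

  @userstate.on_timer(TIMER)
  def on_timer(self, key=beam.DoFn.KeyParam):
    # The key arrives with the timer firing; no need to stash it in state.
    yield key
{code}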

> Key should be available in @OnTimer methods
> ---
>
> Key: BEAM-1819
> URL: https://issues.apache.org/jira/browse/BEAM-1819
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Rehman Murad Ali
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 32.5h
>  Remaining Estimate: 0h
>
> Every timer firing has an associated key. This key should be available when 
> the timer is delivered to a user's {{DoFn}}, so they don't have to store it 
> in state.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-1819) Key should be available in @OnTimer methods (Java)

2020-05-05 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-1819.
--
Resolution: Fixed

> Key should be available in @OnTimer methods (Java)
> --
>
> Key: BEAM-1819
> URL: https://issues.apache.org/jira/browse/BEAM-1819
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Rehman Murad Ali
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 32.5h
>  Remaining Estimate: 0h
>
> Every timer firing has an associated key. This key should be available when 
> the timer is delivered to a user's {{DoFn}}, so they don't have to store it 
> in state.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9900) Remove the need for shutdownSourcesOnFinalWatermark flag

2020-05-06 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9900:


 Summary: Remove the need for shutdownSourcesOnFinalWatermark flag
 Key: BEAM-9900
 URL: https://issues.apache.org/jira/browse/BEAM-9900
 Project: Beam
  Issue Type: Improvement
  Components: runner-flink
Reporter: Maximilian Michels
Assignee: Maximilian Michels


The {{shutdownSourcesOnFinalWatermark}} flag has caused some confusion in the 
past. It is generally used for testing pipelines to ensure that the pipeline and 
the testing cluster shut down at the end of the job. Without it, the pipeline 
will run forever in streaming mode, regardless of whether the input is finite or 
not.

We didn't want to enable the flag by default because shutting down any 
operators, including sources, prevents checkpointing from working in Flink. 
With a side input, for example, some sources may finish and shut down even in 
long-running pipelines. However, we can detect whether checkpointing is enabled 
and set the flag automatically.

The only situation where we may want the flag to be disabled is when users do 
not have checkpointing enabled but want to take a savepoint. This should be 
rare, and users can mitigate it by setting the flag to false to prevent 
operators from shutting down.
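
The detection logic, sketched in Python for illustration only; the actual change lives in the Java Flink runner, and the names below are descriptive, not real APIs:

{code:python}
def should_shutdown_sources(explicit_flag, checkpointing_interval_ms):
  """Resolves the effective shutdown behaviour for sources."""
  if explicit_flag is not None:
    # An explicit user setting always wins, e.g. disabling shutdown to take
    # a savepoint without checkpointing enabled.
    return explicit_flag
  # Without checkpointing, shutting down sources cannot break anything,
  # so shut them down once the final watermark has been emitted.
  return checkpointing_interval_ms <= 0
{code}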



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9853) Remove spurious or irrelevant error messages from portable runner log output.

2020-05-07 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9853:
-
Status: Open  (was: Triage Needed)

> Remove spurious or irrelevant error messages from portable runner log output.
> -
>
> Key: BEAM-9853
> URL: https://issues.apache.org/jira/browse/BEAM-9853
> Project: Beam
>  Issue Type: Wish
>  Components: runner-flink, runner-spark
>Reporter: Kyle Weaver
>Priority: Major
>  Labels: portability-flink, portability-spark
>
> This is a tracking bug for various other bugs that cause misleading log 
> messages to be printed in the output of portable runners. Such messages 
> usually don't actually affect pipeline execution, but they pollute logs, 
> making it difficult for developers and users to diagnose legitimate issues.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-8742) Add stateful processing to ParDo load test

2020-05-07 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-8742.
--
Fix Version/s: 2.22.0
   Resolution: Fixed

> Add stateful processing to ParDo load test
> --
>
> Key: BEAM-8742
> URL: https://issues.apache.org/jira/browse/BEAM-8742
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 16h 10m
>  Remaining Estimate: 0h
>
> So far, the ParDo load test is not stateful. We should add a basic counter to 
> test the stateful processing.
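
Not the code added by the PR, just a minimal sketch of the kind of per-key counter meant here, using the Python user-state API on (key, value) input:

{code:python}
import apache_beam as beam
from apache_beam.transforms import userstate


class CountPerKey(beam.DoFn):
  # One combining-value cell per key, summing the increments.
  COUNT_STATE = userstate.CombiningValueStateSpec('count', sum)

  def process(self, element, count=beam.DoFn.StateParam(COUNT_STATE)):
    count.add(1)
    yield element[0], count.read()
{code}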



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9900) Remove the need for shutdownSourcesOnFinalWatermark flag

2020-05-07 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9900.
--
Fix Version/s: 2.22.0
   Resolution: Fixed

> Remove the need for shutdownSourcesOnFinalWatermark flag
> 
>
> Key: BEAM-9900
> URL: https://issues.apache.org/jira/browse/BEAM-9900
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.22.0
>
>
> The {{shutdownSourcesOnFinalWatermark}} has caused some confusion in the 
> past. It is generally used for testing pipelines to ensure that the pipeline 
> and the testing cluster shuts down at the end of the job. Without it, the 
> pipeline will run forever in streaming mode, regardless of whether the input 
> is finite or not.
> We didn't want to enable the flag by default because shutting down any 
> operators including sources in Flink will prevent checkpointing from working. 
> If we have side input, for example, that may be the case even for 
> long-running pipelines. However, we can detect whether checkpointing is 
> enabled and set the flag automatically.
> The only situation where we may want the flag to be disabled is when users do 
> not have checkpointing enabled but want to take a savepoint. This should be 
> rare and users can mitigate by setting the flag to false to prevent operators 
> from shutting down.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9900) Remove the need for shutdownSourcesOnFinalWatermark flag

2020-05-07 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17101838#comment-17101838
 ] 

Maximilian Michels commented on BEAM-9900:
--

Fixed via d106f263e625e5d7c4f3848f16da301871f65142.

> Remove the need for shutdownSourcesOnFinalWatermark flag
> 
>
> Key: BEAM-9900
> URL: https://issues.apache.org/jira/browse/BEAM-9900
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.22.0
>
>
> The {{shutdownSourcesOnFinalWatermark}} has caused some confusion in the 
> past. It is generally used for testing pipelines to ensure that the pipeline 
> and the testing cluster shuts down at the end of the job. Without it, the 
> pipeline will run forever in streaming mode, regardless of whether the input 
> is finite or not.
> We didn't want to enable the flag by default because shutting down any 
> operators including sources in Flink will prevent checkpointing from working. 
> If we have side input, for example, that may be the case even for 
> long-running pipelines. However, we can detect whether checkpointing is 
> enabled and set the flag automatically.
> The only situation where we may want the flag to be disabled is when users do 
> not have checkpointing enabled but want to take a savepoint. This should be 
> rare and users can mitigate by setting the flag to false to prevent operators 
> from shutting down.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9930) Add announcement for Beam Summit Digital 2020 to the blog

2020-05-08 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9930:


 Summary: Add announcement for Beam Summit Digital 2020 to the blog
 Key: BEAM-9930
 URL: https://issues.apache.org/jira/browse/BEAM-9930
 Project: Beam
  Issue Type: Task
  Components: website
Reporter: Maximilian Michels
Assignee: Maximilian Michels


We need to announce Beam Summit Digital 2020 on the blog.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9930) Add announcement for Beam Summit Digital 2020 to the blog

2020-05-10 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9930.
--
Fix Version/s: Not applicable
   Resolution: Fixed

> Add announcement for Beam Summit Digital 2020 to the blog
> -
>
> Key: BEAM-9930
> URL: https://issues.apache.org/jira/browse/BEAM-9930
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We need to announce Beam Summit Digital 2020 on the blog.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9947) Timer coder contains a faulty key coder leading to a corrupted encoding

2020-05-11 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9947:


 Summary: Timer coder contains a faulty key coder leading to a 
corrupted encoding
 Key: BEAM-9947
 URL: https://issues.apache.org/jira/browse/BEAM-9947
 Project: Beam
  Issue Type: Bug
  Components: java-fn-execution, sdk-py-core
Reporter: Maximilian Michels
Assignee: Maximilian Michels
 Fix For: 2.21.0


The coder for timers contains a key coder which may have to be length-prefixed 
in the case of non-standard coders. We noticed that this was not reflected in the
ProcessBundleDescriptor leading to errors like this one for non-standard 
coders, e.g. Python's {{FastPrimitivesCoder}}:

{noformat}
Caused by: org.apache.beam.sdk.coders.CoderException: java.io.EOFException: 
reached end of stream after reading 36 bytes; 68 bytes expected
at 
org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:104)
at 
org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:90)
at 
org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:194)
at 
org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:157)
at 
org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:81)
at 
org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:31)
at 
org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:180)
at 
org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:127)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:251)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:309)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:292)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:782)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 more
{noformat}
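
A small Python illustration of why the length prefix matters for a non-standard key coder; this is a sketch of the concept only, not the actual fix, which adjusts the key coder in the {{ProcessBundleDescriptor}}:

{code:python}
from apache_beam.coders import coders

key_coder = coders.FastPrimitivesCoder()       # non-standard, opaque to the runner
wrapped = coders.LengthPrefixCoder(key_coder)  # what the descriptor should reference

payload = wrapped.encode('user-key-42')
# The length prefix lets the Java side skip or copy the opaque key bytes without
# understanding them; if the two sides disagree on the prefix, the key length is
# mis-read, which is the kind of mismatch behind the EOFException above.
assert wrapped.decode(payload) == 'user-key-42'
{code}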




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9947) Timer coder contains a faulty key coder leading to a corrupted encoding

2020-05-11 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9947:
-
Description: 
The coder for timers contains a key coder which may have to be length-prefixed 
in case of a non-standard coders. We noticed that this was not reflected in the 
{{ProcessBundleDescriptor}} leading to errors like this one for non-standard 
coders, e.g. Python's {{FastPrimitivesCoder}}:

{noformat}
Caused by: org.apache.beam.sdk.coders.CoderException: java.io.EOFException: 
reached end of stream after reading 36 bytes; 68 bytes expected
at 
org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:104)
at 
org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:90)
at 
org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:194)
at 
org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:157)
at 
org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:81)
at 
org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:31)
at 
org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:180)
at 
org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:127)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:251)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:309)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:292)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:782)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 more
{noformat}


  was:
The coder for timers contains a key coder which may have to be length-prefixed 
in case of a non-standard coders. We noticed that this was not reflected in the
ProcessBundleDescriptor leading to errors like this one for non-standard 
coders, e.g. Python's {{FastPrimitivesCoder}}:

{noformat}
Caused by: org.apache.beam.sdk.coders.CoderException: java.io.EOFException: 
reached end of stream after reading 36 bytes; 68 bytes expected
at 
org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:104)
at 
org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:90)
at 
org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:194)
at 
org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:157)
at 
org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:81)
at 
org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:31)
at 
org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:180)
at 
org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:127)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:251)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:309)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:292)
at 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:782)
at 
org.apache

[jira] [Updated] (BEAM-9947) Timer coder contains a faulty key coder leading to a corrupted encoding

2020-05-11 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9947:
-
Affects Version/s: 2.20.0

> Timer coder contains a faulty key coder leading to a corrupted encoding
> ---
>
> Key: BEAM-9947
> URL: https://issues.apache.org/jira/browse/BEAM-9947
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, sdk-py-core
>Affects Versions: 2.20.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The coder for timers contains a key coder which may have to be 
> length-prefixed in case of a non-standard coders. We noticed that this was 
> not reflected in the {{ProcessBundleDescriptor}} leading to errors like this 
> one for non-standard coders, e.g. Python's {{FastPrimitivesCoder}}:
> {noformat}
> Caused by: org.apache.beam.sdk.coders.CoderException: java.io.EOFException: 
> reached end of stream after reading 36 bytes; 68 bytes expected
>   at 
> org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:104)
>   at 
> org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:90)
>   at 
> org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:194)
>   at 
> org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:157)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:81)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:31)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:180)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:127)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:251)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:309)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:292)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:782)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   ... 1 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9947) Timer coder contains a faulty key coder leading to a corrupted encoding

2020-05-11 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9947:
-
Affects Version/s: (was: 2.20.0)
   2.21.0

> Timer coder contains a faulty key coder leading to a corrupted encoding
> ---
>
> Key: BEAM-9947
> URL: https://issues.apache.org/jira/browse/BEAM-9947
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, sdk-py-core
>Affects Versions: 2.21.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The coder for timers contains a key coder which may have to be 
> length-prefixed in case of a non-standard coders. We noticed that this was 
> not reflected in the {{ProcessBundleDescriptor}} leading to errors like this 
> one for non-standard coders, e.g. Python's {{FastPrimitivesCoder}}:
> {noformat}
> Caused by: org.apache.beam.sdk.coders.CoderException: java.io.EOFException: 
> reached end of stream after reading 36 bytes; 68 bytes expected
>   at 
> org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:104)
>   at 
> org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:90)
>   at 
> org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:194)
>   at 
> org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:157)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:81)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:31)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:180)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:127)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:251)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:309)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:292)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:782)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   ... 1 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9164) [PreCommit_Java] [Flake] UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest

2020-05-12 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105184#comment-17105184
 ] 

Maximilian Michels commented on BEAM-9164:
--

I was not aware that these tests had been disabled. I'm a bit shocked that we 
just disabled them without mentioning the relevant folks or at least posting in 
the JIRA issue.

I changed the logic around watermark propagation in BEAM-9900, which could have 
removed the flakiness. I'll re-enable the tests.

> [PreCommit_Java] [Flake] 
> UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest
> ---
>
> Key: BEAM-9164
> URL: https://issues.apache.org/jira/browse/BEAM-9164
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Kirill Kozlov
>Priority: Critical
>  Labels: flake
>
> Test:
> org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest
>  >> testWatermarkEmission[numTasks = 1; numSplits=1]
> Fails with the following exception:
> {code:java}
> org.junit.runners.model.TestTimedOutException: test timed out after 3 
> milliseconds{code}
> Affected Jenkins job: 
> [https://builds.apache.org/job/beam_PreCommit_Java_Phrase/1665/]
> Gradle build scan: [https://scans.gradle.com/s/nvgeb425fe63q]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8742) Add stateful processing to ParDo load test

2020-05-12 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105273#comment-17105273
 ] 

Maximilian Michels commented on BEAM-8742:
--

I just saw your comment here. Thank you. I added streaming tests in 
https://github.com/apache/beam/pull/11558. Have a look if you want.

I've created a dashboard here: 
https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056

> Add stateful processing to ParDo load test
> --
>
> Key: BEAM-8742
> URL: https://issues.apache.org/jira/browse/BEAM-8742
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 16h 10m
>  Remaining Estimate: 0h
>
> So far, the ParDo load test is not stateful. We should add a basic counter to 
> test the stateful processing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9947) Timer coder contains a faulty key coder leading to a corrupted encoding

2020-05-12 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9947.
--
Resolution: Fixed

> Timer coder contains a faulty key coder leading to a corrupted encoding
> ---
>
> Key: BEAM-9947
> URL: https://issues.apache.org/jira/browse/BEAM-9947
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, sdk-py-core
>Affects Versions: 2.21.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The coder for timers contains a key coder which may have to be 
> length-prefixed in case of a non-standard coders. We noticed that this was 
> not reflected in the {{ProcessBundleDescriptor}} leading to errors like this 
> one for non-standard coders, e.g. Python's {{FastPrimitivesCoder}}:
> {noformat}
> Caused by: org.apache.beam.sdk.coders.CoderException: java.io.EOFException: 
> reached end of stream after reading 36 bytes; 68 bytes expected
>   at 
> org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:104)
>   at 
> org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:90)
>   at 
> org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:194)
>   at 
> org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:157)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:81)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:31)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:180)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:127)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:251)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:309)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:292)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:782)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   ... 1 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9963) Flink streaming load tests fail with TypeError

2020-05-12 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9963:


 Summary: Flink streaming load tests fail with TypeError
 Key: BEAM-9963
 URL: https://issues.apache.org/jira/browse/BEAM-9963
 Project: Beam
  Issue Type: Test
  Components: build-system, runner-flink
Reporter: Maximilian Michels
Assignee: Maximilian Michels


The newly added load tests now fail on the latest master.

https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/39/console

{noformat}
16:00:38 INFO:apache_beam.runners.portability.portable_runner:Job state changed 
to FAILED
16:00:38 Traceback (most recent call last):
16:00:38   File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
16:00:38 "__main__", mod_spec)
16:00:38   File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
16:00:38 exec(code, run_globals)
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
 line 225, in <module>
16:00:38 ParDoTest().run()
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/load_test.py",
 line 115, in run
16:00:38 self.result.wait_until_finish(duration=self.timeout_ms)
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/runners/portability/portable_runner.py",
 line 583, in wait_until_finish
16:00:38 raise self._runtime_exception
16:00:38 RuntimeError: Pipeline 
load-tests-python-flink-streaming-pardo-5-0508133349_3afe3b8e-23f7-474b-af08-4ee7feb3adfa
 failed in state FAILED: java.lang.RuntimeException: Error received from SDK 
harness for instruction 2: Traceback (most recent call last):
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 245, in _execute
16:00:38 response = task()
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 302, in <lambda>
16:00:38 lambda: self.create_worker().do_instruction(request), request)
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 471, in do_instruction
16:00:38 getattr(request, request_type), request.instruction_id)
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 506, in process_bundle
16:00:38 bundle_processor.process_bundle(instruction_id))
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 918, in process_bundle
16:00:38 op.finish()
16:00:38   File "apache_beam/runners/worker/operations.py", line 697, in 
apache_beam.runners.worker.operations.DoOperation.finish
16:00:38   File "apache_beam/runners/worker/operations.py", line 699, in 
apache_beam.runners.worker.operations.DoOperation.finish
16:00:38   File "apache_beam/runners/worker/operations.py", line 702, in 
apache_beam.runners.worker.operations.DoOperation.finish
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 709, in commit
16:00:38 state.commit()
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 414, in commit
16:00:38 self._underlying_bag_state.commit()
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 481, in commit
16:00:38 is_cached=True)
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 951, in extend
16:00:38 coder.encode_to_stream(element, out, True)
16:00:38   File "apache_beam/coders/coder_impl.py", line 690, in 
apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream
16:00:38   File "apache_beam/coders/coder_impl.py", line 692, in 
apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream
16:00:38 TypeError: an integer is required
16:00:38 
{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9963) Flink streaming load tests fail with TypeError

2020-05-12 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9963:
-
Description: 
The newly added load tests now fail on the latest master.

https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/39/console

{noformat}
16:00:38 INFO:apache_beam.runners.portability.portable_runner:Job state changed 
to FAILED
16:00:38 Traceback (most recent call last):
16:00:38   File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
16:00:38 "__main__", mod_spec)
16:00:38   File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
16:00:38 exec(code, run_globals)
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
 line 225, in <module>
16:00:38 ParDoTest().run()
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/load_test.py",
 line 115, in run
16:00:38 self.result.wait_until_finish(duration=self.timeout_ms)
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/runners/portability/portable_runner.py",
 line 583, in wait_until_finish
16:00:38 raise self._runtime_exception
16:00:38 RuntimeError: Pipeline 
load-tests-python-flink-streaming-pardo-5-0508133349_3afe3b8e-23f7-474b-af08-4ee7feb3adfa
 failed in state FAILED: java.lang.RuntimeException: Error received from SDK 
harness for instruction 2: Traceback (most recent call last):
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 245, in _execute
16:00:38 response = task()
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 302, in 
16:00:38 lambda: self.create_worker().do_instruction(request), request)
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 471, in do_instruction
16:00:38 getattr(request, request_type), request.instruction_id)
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 506, in process_bundle
16:00:38 bundle_processor.process_bundle(instruction_id))
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 918, in process_bundle
16:00:38 op.finish()
16:00:38   File "apache_beam/runners/worker/operations.py", line 697, in 
apache_beam.runners.worker.operations.DoOperation.finish
16:00:38   File "apache_beam/runners/worker/operations.py", line 699, in 
apache_beam.runners.worker.operations.DoOperation.finish
16:00:38   File "apache_beam/runners/worker/operations.py", line 702, in 
apache_beam.runners.worker.operations.DoOperation.finish
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 709, in commit
16:00:38 state.commit()
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 414, in commit
16:00:38 self._underlying_bag_state.commit()
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 481, in commit
16:00:38 is_cached=True)
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 951, in extend
16:00:38 coder.encode_to_stream(element, out, True)
16:00:38   File "apache_beam/coders/coder_impl.py", line 690, in 
apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream
16:00:38   File "apache_beam/coders/coder_impl.py", line 692, in 
apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream
16:00:38 TypeError: an integer is required
16:00:38 
{noformat}

Cron fails as well: 
https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming/5/console
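As a hedged illustration of the error class (not taken from the failing job): 
Python's {{VarIntCoder}} only accepts integers, so encoding any other element 
type raises a TypeError like the one in the traceback.

{code:python}
# Minimal sketch, assuming a non-integer value reaches the integer coder;
# the values below are illustrative.
from apache_beam.coders import VarIntCoder

coder = VarIntCoder()
coder.encode(42)    # works: VarInt encoding of an integer
coder.encode("42")  # raises TypeError, the coder rejects non-integer values
{code}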

  was:
The newly added load tests now fail on the latest master.

https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/39/console

{noformat}
16:00:38 INFO:apache_beam.runners.portability.portable_runner:Job state changed 
to FAILED
16:00:38 Traceback (most recent call last):
16:00:38   File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
16:00:38 "__main__", mod_spec)
16:00:38   File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
16:00:38 exec(code, run_globals)
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
 line 225, in 
16:00:38 ParDoTest().run()
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/load_test.py",
 line 115

[jira] [Resolved] (BEAM-9963) Flink streaming load tests fail with TypeError

2020-05-12 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9963.
--
Fix Version/s: Not applicable
   Resolution: Fixed

> Flink streaming load tests fail with TypeError
> --
>
> Key: BEAM-9963
> URL: https://issues.apache.org/jira/browse/BEAM-9963
> Project: Beam
>  Issue Type: Test
>  Components: build-system, runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: Not applicable
>
>
> The newly added load tests now fail on the latest master.
> https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/39/console
> {noformat}
> 16:00:38 INFO:apache_beam.runners.portability.portable_runner:Job state 
> changed to FAILED
> 16:00:38 Traceback (most recent call last):
> 16:00:38   File "/usr/lib/python3.7/runpy.py", line 193, in 
> _run_module_as_main
> 16:00:38 "__main__", mod_spec)
> 16:00:38   File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
> 16:00:38 exec(code, run_globals)
> 16:00:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
>  line 225, in 
> 16:00:38 ParDoTest().run()
> 16:00:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/load_test.py",
>  line 115, in run
> 16:00:38 self.result.wait_until_finish(duration=self.timeout_ms)
> 16:00:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/runners/portability/portable_runner.py",
>  line 583, in wait_until_finish
> 16:00:38 raise self._runtime_exception
> 16:00:38 RuntimeError: Pipeline 
> load-tests-python-flink-streaming-pardo-5-0508133349_3afe3b8e-23f7-474b-af08-4ee7feb3adfa
>  failed in state FAILED: java.lang.RuntimeException: Error received from SDK 
> harness for instruction 2: Traceback (most recent call last):
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 245, in _execute
> 16:00:38 response = task()
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 302, in 
> 16:00:38 lambda: self.create_worker().do_instruction(request), request)
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 471, in do_instruction
> 16:00:38 getattr(request, request_type), request.instruction_id)
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 506, in process_bundle
> 16:00:38 bundle_processor.process_bundle(instruction_id))
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 918, in process_bundle
> 16:00:38 op.finish()
> 16:00:38   File "apache_beam/runners/worker/operations.py", line 697, in 
> apache_beam.runners.worker.operations.DoOperation.finish
> 16:00:38   File "apache_beam/runners/worker/operations.py", line 699, in 
> apache_beam.runners.worker.operations.DoOperation.finish
> 16:00:38   File "apache_beam/runners/worker/operations.py", line 702, in 
> apache_beam.runners.worker.operations.DoOperation.finish
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 709, in commit
> 16:00:38 state.commit()
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 414, in commit
> 16:00:38 self._underlying_bag_state.commit()
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 481, in commit
> 16:00:38 is_cached=True)
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 951, in extend
> 16:00:38 coder.encode_to_stream(element, out, True)
> 16:00:38   File "apache_beam/coders/coder_impl.py", line 690, in 
> apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream
> 16:00:38   File "apache_beam/coders/coder_impl.py", line 692, in 
> apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream
> 16:00:38 TypeError: an integer is required
> 16:00:38 
> {noformat}
> Cron fails as well: 
> https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming/5/console



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8742) Add stateful processing to ParDo load test

2020-05-12 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-8742:
-
Description: 
So far, the ParDo load test is not stateful. We should add a basic counter to 
test the stateful processing.

The test should work in streaming mode and with checkpointing.
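A hedged sketch of what such a counter could look like in the Python SDK 
(names and structure are illustrative, not the actual pardo_test.py code):

{code:python}
# Illustrative stateful DoFn with a per-key counter; requires keyed input.
import apache_beam as beam
from apache_beam.coders import VarIntCoder
from apache_beam.transforms.userstate import CombiningValueStateSpec


class StatefulCounterDoFn(beam.DoFn):
  COUNT_STATE = CombiningValueStateSpec('count', VarIntCoder(), sum)

  def process(self, element, count=beam.DoFn.StateParam(COUNT_STATE)):
    # element is a (key, value) pair; state is scoped per key and window.
    count.add(1)
    yield element
{code}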

  was:So far, the ParDo load test is not stateful. We should add a basic 
counter to test the stateful processing.


> Add stateful processing to ParDo load test
> --
>
> Key: BEAM-8742
> URL: https://issues.apache.org/jira/browse/BEAM-8742
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 16h 10m
>  Remaining Estimate: 0h
>
> So far, the ParDo load test is not stateful. We should add a basic counter to 
> test the stateful processing.
> The test should work in streaming mode and with checkpointing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8742) Add stateful processing to ParDo load test

2020-05-12 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105339#comment-17105339
 ] 

Maximilian Michels commented on BEAM-8742:
--

Cron: https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming/
PR: 
https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/

> Add stateful processing to ParDo load test
> --
>
> Key: BEAM-8742
> URL: https://issues.apache.org/jira/browse/BEAM-8742
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 16h 10m
>  Remaining Estimate: 0h
>
> So far, the ParDo load test is not stateful. We should add a basic counter to 
> test the stateful processing.
> The test should work in streaming mode and with checkpointing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9900) Remove the need for shutdownSourcesOnFinalWatermark flag

2020-05-12 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105620#comment-17105620
 ] 

Maximilian Michels commented on BEAM-9900:
--

For context: 
https://github.com/apache/beam/pull/11558/commits/d106f263e625e5d7c4f3848f16da301871f65142

> Remove the need for shutdownSourcesOnFinalWatermark flag
> 
>
> Key: BEAM-9900
> URL: https://issues.apache.org/jira/browse/BEAM-9900
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.22.0
>
>
> The {{shutdownSourcesOnFinalWatermark}} flag has caused some confusion in the 
> past. It is generally used in test pipelines to ensure that the pipeline and 
> the testing cluster shut down at the end of the job. Without it, the pipeline 
> will run forever in streaming mode, regardless of whether the input is finite.
> We didn't want to enable the flag by default because shutting down any 
> operators, including sources, prevents checkpointing from working in Flink. 
> Pipelines with side inputs, for example, may run into this even if they are 
> long-running. However, we can detect whether checkpointing is enabled and set 
> the flag automatically.
> The only situation where we may want the flag disabled is when users do not 
> have checkpointing enabled but want to take a savepoint. This should be rare, 
> and users can mitigate it by setting the flag to false to prevent operators 
> from shutting down.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9966) Investigate variance in checkpoint duration of ParDo streaming tests

2020-05-12 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9966:


 Summary: Investigate variance in checkpoint duration of ParDo 
streaming tests
 Key: BEAM-9966
 URL: https://issues.apache.org/jira/browse/BEAM-9966
 Project: Beam
  Issue Type: Bug
  Components: build-system, runner-flink
Reporter: Maximilian Michels
Assignee: Maximilian Michels


We need to take a closer look at the variance in checkpoint duration which, for 
different test runs, fluctuates between one second and one minute. 
https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9164) [PreCommit_Java] [Flake] UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest

2020-05-12 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9164.
--
Fix Version/s: Not applicable
   Resolution: Fixed

This should be fixed, but feel free to re-open if it pops up again.

> [PreCommit_Java] [Flake] 
> UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest
> ---
>
> Key: BEAM-9164
> URL: https://issues.apache.org/jira/browse/BEAM-9164
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Kirill Kozlov
>Priority: Critical
>  Labels: flake
> Fix For: Not applicable
>
>
> Test:
> org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest
>  >> testWatermarkEmission[numTasks = 1; numSplits=1]
> Fails with the following exception:
> {code:java}
> org.junit.runners.model.TestTimedOutException: test timed out after 3 
> milliseconds{code}
> Affected Jenkins job: 
> [https://builds.apache.org/job/beam_PreCommit_Java_Phrase/1665/]
> Gradle build scan: [https://scans.gradle.com/s/nvgeb425fe63q]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9843) Flink UnboundedSourceWrapperTest flaky due to a timeout

2020-05-12 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105723#comment-17105723
 ] 

Maximilian Michels commented on BEAM-9843:
--

Thanks for reporting this. This should be fixed. Feel free to ping me here if 
it fails again.

> Flink UnboundedSourceWrapperTest flaky due to a timeout
> ---
>
> Key: BEAM-9843
> URL: https://issues.apache.org/jira/browse/BEAM-9843
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Chamikara Madhusanka Jayalath
>Priority: Major
> Fix For: Not applicable
>
>
> For example,
> [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2685/]
> [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2684/]
> [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2682/]
> [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2680/]
> [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2685/testReport/junit/org.apache.beam.runners.flink.translation.wrappers.streaming.io/UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest/testWatermarkEmission_numTasks___4__numSplits_4_/]
> org.junit.runners.model.TestTimedOutException: test timed out after 3 
> milliseconds at sun.misc.Unsafe.park(Native Method) at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest.testWatermarkEmission(UnboundedSourceWrapperTest.java:354)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9976) FlinkSavepointTest timeout flake

2020-05-13 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9976:
-
Status: Open  (was: Triage Needed)

> FlinkSavepointTest timeout flake
> 
>
> Key: BEAM-9976
> URL: https://issues.apache.org/jira/browse/BEAM-9976
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, test-failures
>Reporter: Kyle Weaver
>Priority: Minor
>
> org.apache.beam.runners.flink.FlinkSavepointTest.testSavepointRestorePortable
> org.junit.runners.model.TestTimedOutException: test timed out after 60 seconds
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
>   at 
> org.apache.beam.runners.flink.FlinkSavepointTest.runSavepointAndRestore(FlinkSavepointTest.java:188)
>   at 
> org.apache.beam.runners.flink.FlinkSavepointTest.testSavepointRestorePortable(FlinkSavepointTest.java:154)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9976) FlinkSavepointTest timeout flake

2020-05-13 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106196#comment-17106196
 ] 

Maximilian Michels commented on BEAM-9976:
--

Do you have a link to the full log?

> FlinkSavepointTest timeout flake
> 
>
> Key: BEAM-9976
> URL: https://issues.apache.org/jira/browse/BEAM-9976
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, test-failures
>Reporter: Kyle Weaver
>Priority: Minor
>
> org.apache.beam.runners.flink.FlinkSavepointTest.testSavepointRestorePortable
> org.junit.runners.model.TestTimedOutException: test timed out after 60 seconds
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>   at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
>   at 
> org.apache.beam.runners.flink.FlinkSavepointTest.runSavepointAndRestore(FlinkSavepointTest.java:188)
>   at 
> org.apache.beam.runners.flink.FlinkSavepointTest.testSavepointRestorePortable(FlinkSavepointTest.java:154)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9981) Use FlinkRunner instead of PortableRunner for load tests

2020-05-13 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9981:


 Summary: Use FlinkRunner instead of PortableRunner for load tests
 Key: BEAM-9981
 URL: https://issues.apache.org/jira/browse/BEAM-9981
 Project: Beam
  Issue Type: Test
  Components: build-system, runner-flink
Reporter: Maximilian Michels
Assignee: Maximilian Michels


We may want to use the FlinkRunner to exercise it and to get rid of the job 
server container dependency.
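A hedged sketch of the difference in pipeline options (endpoint values are 
placeholders):

{code:python}
from apache_beam.options.pipeline_options import PipelineOptions

# Current setup: PortableRunner pointed at a separately started job server
# container.
portable_opts = PipelineOptions(
    ["--runner=PortableRunner", "--job_endpoint=localhost:8099"])

# Proposed: FlinkRunner, which brings up the job server itself and submits
# to the Flink master.
flink_opts = PipelineOptions(
    ["--runner=FlinkRunner", "--flink_master=localhost:8081"])
{code}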



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9966) Investigate variance in checkpoint duration of ParDo streaming tests

2020-05-13 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106442#comment-17106442
 ] 

Maximilian Michels commented on BEAM-9966:
--

It seems to have been caused by the test using local storage instead of SSDs 
or a distributed file system.
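For illustration only (this is not the exact test setup), pointing Flink's 
checkpoint storage at a distributed file system rather than the local disk 
would look roughly like this in flink-conf.yaml:

{noformat}
# Illustrative flink-conf.yaml excerpt; the HDFS path is a placeholder.
state.backend: rocksdb
state.checkpoints.dir: hdfs://namenode:8020/flink/checkpoints
{noformat}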

> Investigate variance in checkpoint duration of ParDo streaming tests
> 
>
> Key: BEAM-9966
> URL: https://issues.apache.org/jira/browse/BEAM-9966
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>
> We need to take a closer look at the variance in checkpoint duration which, 
> for different test runs, fluctuates between one second and one minute. 
> https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8239) Docker options in --environment_config

2020-05-16 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108991#comment-17108991
 ] 

Maximilian Michels commented on BEAM-8239:
--

To date, there is no option to mount a volume using the environment config. I 
think it would be good to revisit BEAM-5440 for the next release.

For now, the only option is to use a distributed file system (e.g. HDFS, S3, 
Google Cloud Storage).
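For reference, a hedged sketch of how the DOCKER environment is typically 
configured: the config carries only the container image, and extra docker run 
flags such as "-v /tmp:/tmp" are not parsed (the image name below is taken 
from the report, the endpoints are placeholders):

{code:python}
from apache_beam.options.pipeline_options import PipelineOptions

# Illustrative only: --environment_config holds just the image name.
options = PipelineOptions([
    "--runner=PortableRunner",
    "--job_endpoint=localhost:8099",
    "--environment_type=DOCKER",
    "--environment_config=benjamintan-docker-apache.bintray.io/beam/python3:latest",
])
{code}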

> Docker options in --environment_config
> --
>
> Key: BEAM-8239
> URL: https://issues.apache.org/jira/browse/BEAM-8239
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Affects Versions: 2.15.0
>Reporter: Benjamin Tan
>Priority: Major
>
> {{I'm trying to mount a directory by providing additional arguments via 
> --environment_config in the PipelineOptions:}}
>  
> {{pipeline_options = PipelineOptions(["--runner=PortableRunner", 
> "--job_endpoint=localhost:8099", "--environment_config=-v /tmp:/tmp 
> benjamintan-docker-apache.bintray.io/beam/python3:latest"], 
> pipeline_type_check=True)}}
>  
> However,  the command fails with the following:
>  
>  
> {{RuntimeError: Pipeline 
> BeamApp-benjamintan-091616-839e633f_994659f0-7da9-412e-91e2-f32dd4f24b5c 
> failed in state FAILED: java.io.IOException: Received exit code 125 for 
> command 'docker run -d --mount 
> type=bind,src=/home/benjamintan/.config/gcloud,dst=/root/.config/gcloud 
> --network=host --env=DOCKER_MAC_CONTAINER=null -v /tmp:/tmp 
> benjamintan-docker-apache.bintray.io/beam/python3:latest --id=7-1 
> --logging_endpoint=localhost:41835 --artifact_endpoint=localhost:40063 
> --provision_endpoint=localhost:39827 --control_endpoint=localhost:45355'. 
> stderr: unknown flag: --id. See 'docker run --help'.}}
>  
> However, if I were to copy and paste the `docker run ...` command, the 
> command seems OK (no syntax errors)
>  
> This seems related to BEAM-5440. It isn't clear if there's a "right" way to 
> pass in additional Docker run arguments.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9966) Investigate variance in checkpoint duration of ParDo streaming tests

2020-05-19 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9966.
--
Fix Version/s: Not applicable
   Resolution: Fixed

The load test had a bug which was addressed in the PR.

> Investigate variance in checkpoint duration of ParDo streaming tests
> 
>
> Key: BEAM-9966
> URL: https://issues.apache.org/jira/browse/BEAM-9966
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: P2
> Fix For: Not applicable
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> We need to take a closer look at the variance in checkpoint duration which, 
> for different test runs, fluctuates between one second and one minute. 
> https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (BEAM-7949) Add time-based cache threshold support in the data service of the Python SDK harness

2019-12-25 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed BEAM-7949.

Fix Version/s: 2.19.0
 Assignee: sunjincheng
   Resolution: Fixed

> Add time-based cache threshold support in the data service of the Python SDK 
> harness
> 
>
> Key: BEAM-7949
> URL: https://issues.apache.org/jira/browse/BEAM-7949
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-harness
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently only a size-based cache threshold is supported in the data service 
> of the Python SDK harness. It should also support a time-based cache 
> threshold. This is very important, especially for streaming jobs, which are 
> sensitive to latency.
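A hedged sketch of the general idea behind a combined size- and time-based 
flush (this is not the actual sdk_worker implementation; names and thresholds 
are illustrative):

{code:python}
import time

# Illustrative thresholds, not the defaults used by the SDK harness.
SIZE_FLUSH_THRESHOLD_BYTES = 10 << 20
TIME_FLUSH_THRESHOLD_SECS = 10


class BufferingOutputStream(object):
  """Buffers writes and flushes when the buffer is large enough or old enough."""

  def __init__(self, flush):
    self._flush = flush
    self._buffer = bytearray()
    self._last_flush = time.time()

  def write(self, data):
    self._buffer.extend(data)
    now = time.time()
    if (len(self._buffer) >= SIZE_FLUSH_THRESHOLD_BYTES
        or now - self._last_flush >= TIME_FLUSH_THRESHOLD_SECS):
      self._flush(bytes(self._buffer))
      del self._buffer[:]
      self._last_flush = now
{code}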



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9006) Meta space memory leak caused by the shutdown hook of ProcessManager

2020-01-02 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9006.
--
Resolution: Fixed

> Meta space memory leak caused by the shutdown hook of ProcessManager 
> -
>
> Key: BEAM-9006
> URL: https://issues.apache.org/jira/browse/BEAM-9006
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Currently the class `ProcessManager` adds a shutdown hook to stop all the 
> living processes before the JVM exits. The shutdown hook is never removed. 
> If this class is loaded by the user class loader, it prevents the user class 
> loader from being garbage collected, which eventually causes a metaspace 
> memory leak.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

