[ https://issues.apache.org/jira/browse/BEAM-10670?focusedWorklogId=486063&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-486063 ]
ASF GitHub Bot logged work on BEAM-10670:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 18/Sep/20 05:25
Start Date: 18/Sep/20 05:25
Worklog Time Spent: 10m
Work Description: lukecwik commented on pull request #12603:
URL: https://github.com/apache/beam/pull/12603#issuecomment-694659139
@iemejia I have updated the code and added a `SparkProcessedKeyedElements`
that uses `updateStateByKey` to evaluate a splittable DoFn. I based the logic
on `SparkGroupAlsoByWindowViaWindowSet`. I also added some
`SplittableDoFnTest`s and have been using
`org.apache.beam.runners.spark.translation.streaming.SplittableDoFnTest#testPairWithIndexBasicUnbounded`
as my base test to get it working. So far I am able to get output produced,
and the splittable DoFn state is saved and restored for the next round of
execution. I do see most of the output via the `LoggingDoFn` that I added to
the test, but the `PAssert` is failing because it doesn't see any of the
output. It is also triggering too early, before all of the output has been
produced. Any suggestions as to what to take a look at?
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 486063)
Time Spent: 18h 10m (was: 18h)
> Make non-portable Splittable DoFn the only option when executing Java "Read"
> transforms
> ---------------------------------------------------------------------------------------
>
> Key: BEAM-10670
> URL: https://issues.apache.org/jira/browse/BEAM-10670
> Project: Beam
> Issue Type: Improvement
> Components: sdk-java-core
> Reporter: Luke Cwik
> Assignee: Luke Cwik
> Priority: P2
> Time Spent: 18h 10m
> Remaining Estimate: 0h
>
> All runners seem to be capable of migrating to splittable DoFn for
> non-portable execution except for Dataflow runner v1, which will internalize
> the current primitive read implementation that is shared across runner
> implementations.