[
https://issues.apache.org/jira/browse/BEAM-7116?focusedWorklogId=358237&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-358237
]
ASF GitHub Bot logged work on BEAM-7116:
----------------------------------------
Author: ASF GitHub Bot
Created on: 12/Dec/19 01:57
Start Date: 12/Dec/19 01:57
Worklog Time Spent: 10m
Work Description: reuvenlax commented on issue #10151: [BEAM-7116] Remove
use of KV in Schema transforms
URL: https://github.com/apache/beam/pull/10151#issuecomment-564815005
The short answer is that the failure is caused by a limitation of the Spark
runner - the Beam model allows the Iterable produced by a GroupByKey to be
reiterated, but Spark is unable to support that.
The longer answer is that this is exposing a performance bug in the Group
transform. We were accidentally walking over the iterator, which we should
not be doing here. I'll send a PR to fix it.
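To illustrate the distinction, here is a minimal plain-Java sketch. It is not Beam or Spark code, and the names ReiterationSketch, oneShot and countThenSum are hypothetical. It assumes a grouped-values Iterable that can only be consumed once and shows what happens when a caller walks it twice. In this sketch the second pass silently sees no elements; a real runner may behave differently (for example, by failing).

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReiterationSketch {

  /** Hypothetical stand-in for grouped values that can be consumed only once. */
  public static <T> Iterable<T> oneShot(List<T> values) {
    Iterator<T> shared = values.iterator();
    // Every call to iterator() hands back the same, possibly exhausted, iterator.
    return () -> shared;
  }

  /** Walks the values twice: first to count them, then to sum them. */
  public static int[] countThenSum(Iterable<Integer> values) {
    int count = 0;
    for (Integer ignored : values) {
      count++;
    }
    int sum = 0;
    for (Integer v : values) {
      sum += v;
    }
    return new int[] {count, sum};
  }

  public static void main(String[] args) {
    List<Integer> grouped = Arrays.asList(1, 2, 3);

    // A List is re-iterable, so both passes see every element.
    int[] reiterable = countThenSum(grouped);
    System.out.println(reiterable[0] + " " + reiterable[1]); // prints "3 6"

    // The one-shot Iterable is drained by the first pass.
    int[] single = countThenSum(oneShot(grouped));
    System.out.println(single[0] + " " + single[1]); // prints "3 0"
  }
}
```

Code that makes a single pass over the values works in both cases, which is why avoiding the extra walk sidesteps the limitation.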
On Wed, Dec 11, 2019 at 3:43 PM Brian Hulette <[email protected]>
wrote:
> #10358 <https://github.com/apache/beam/pull/10358> should fix Spark
> ValidatesRunner
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 358237)
Time Spent: 2h 10m (was: 2h)
> Remove KV from Schema transforms
> --------------------------------
>
> Key: BEAM-7116
> URL: https://issues.apache.org/jira/browse/BEAM-7116
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-java-core
> Reporter: Reuven Lax
> Assignee: Brian Hulette
> Priority: Major
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> Instead of returning KV objects, we should return a Schema with two fields.
> The Convert transform should be able to convert these to KV objects.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)