[ https://issues.apache.org/jira/browse/HUDI-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt Pouttu updated HUDI-1873:
------------------------------
Description: A collect() call causes resource issues with very large upserts,
and it is only used for reporting error messages that are already in the Spark
task logs. I replaced it with an .isEmpty() call and amended the error message
to direct the user to the task logs. (was: A collect() call causes resource
issues with very large upserts, and it is only used for reporting error
messages that are already in the Spark task logs. I replaced it with an
.isEmpty() call and amended the error message to direct the user to the task
logs.
PR: https://github.com/apache/hudi/pull/2907)
> collect() call causing issues with very large upserts
> -----------------------------------------------------
>
> Key: HUDI-1873
> URL: https://issues.apache.org/jira/browse/HUDI-1873
> Project: Apache Hudi
> Issue Type: Bug
> Components: Spark Integration
> Affects Versions: 0.7.0, 0.8.0
> Environment: EMR 5.28 Spark 11
> Reporter: Matt Pouttu
> Priority: Major
> Labels: newbie, pull-request-available
> Fix For: 0.9.0
>
> Original Estimate: 0h
> Remaining Estimate: 0h
>
> A collect() call causes resource issues with very large upserts, and it is
> only used for reporting error messages that are already in the Spark task
> logs. I replaced it with an .isEmpty() call and amended the error message to
> direct the user to the task logs.
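
The difference between the two approaches described above can be sketched in plain Python, without Spark. This is only an analogy, not the Hudi code or the change in the PR: `WriteStatus`, `make_statuses`, and both `report_*` functions are hypothetical names invented for illustration, and a lazy generator stands in for a distributed dataset. The point it shows is that a collect-style report materializes every failed status in one place, while an emptiness check stops at the first failure and holds at most one element.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class WriteStatus:
    """Hypothetical stand-in for a per-record write result."""
    record_key: str
    error: Optional[str] = None


def make_statuses(n: int, consumed: List[int]) -> Iterator[WriteStatus]:
    """Lazily yield n statuses; even-numbered records carry a simulated error.

    Every index that is actually produced is appended to `consumed`, so the
    caller can see how much of the stream each strategy pulled.
    """
    for i in range(n):
        consumed.append(i)
        yield WriteStatus(f"key-{i}", "simulated failure" if i % 2 == 0 else None)


def failed(statuses: Iterable[WriteStatus]) -> Iterator[WriteStatus]:
    """Lazily filter down to the statuses that carry an error."""
    return (s for s in statuses if s.error is not None)


def report_by_collecting(statuses: Iterable[WriteStatus]) -> str:
    """Collect-style: pull every failed status into memory to build the message."""
    errors = list(failed(statuses))  # grows with the number of failures
    if not errors:
        return "ok"
    return "errors: " + ", ".join(f"{s.record_key}: {s.error}" for s in errors)


def report_by_emptiness_check(statuses: Iterable[WriteStatus]) -> str:
    """isEmpty-style: stop at the first failure and point the user at the logs."""
    has_errors = next(failed(statuses), None) is not None
    if has_errors:
        return "upsert failed; see the Spark task logs for details"
    return "ok"


if __name__ == "__main__":
    pulled: List[int] = []
    print(report_by_emptiness_check(make_statuses(1_000_000, pulled)))
    print(f"records pulled before deciding: {len(pulled)}")
```

When there are no failures at all, the emptiness check still has to scan the whole stream; the saving is that it never accumulates the failures themselves, which is what becomes expensive on a very large upsert.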