Weichen Xu created SPARK-28366:
----------------------------------
Summary: Logging in driver when loading single large gzipped file
via sc.textFile
Key: SPARK-28366
URL: https://issues.apache.org/jira/browse/SPARK-28366
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 2.4.3
Reporter: Weichen Xu
A large gzipped file is not splittable, so Spark has to read and decompress it
in a single task on one partition. This can be very slow.
We should log a message on the driver side when this happens.
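
To illustrate the kind of check proposed here, below is a minimal sketch in plain Python. It is not Spark code and not the eventual implementation. The function name, the 1 GiB threshold, the logger name, and the list of non-splittable suffixes are all assumptions made for illustration; the issue itself does not define what counts as "large".

```python
import logging

# Hypothetical logger name standing in for the driver-side logger.
logger = logging.getLogger("driver")

# Assumed threshold; the issue does not say what size counts as "large".
LARGE_FILE_BYTES = 1024 * 1024 * 1024  # 1 GiB

# Suffixes treated as non-splittable in this sketch (assumption).
UNSPLITTABLE_SUFFIXES = (".gz",)


def warn_if_single_large_unsplittable(files, threshold=LARGE_FILE_BYTES):
    """Log a warning when the input is one large non-splittable file.

    files: list of (path, size_in_bytes) tuples describing the input.
    Returns True if a warning was logged, False otherwise.
    """
    if len(files) != 1:
        # Several files can be read by several tasks, so stay quiet.
        return False
    path, size = files[0]
    if path.lower().endswith(UNSPLITTABLE_SUFFIXES) and size > threshold:
        logger.warning(
            "Loading one large unsplittable file %s (%d bytes) with a single "
            "partition; reading and decompressing it may be slow.",
            path,
            size,
        )
        return True
    return False
```

For example, `warn_if_single_large_unsplittable([("data.gz", 2 * 1024**3)])` logs the warning and returns `True`, while a small `.gz` file or an uncompressed file of the same size returns `False`.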