Hey all, a reply on this would be great!
Thanks,
A.B.

On 17-May-2017 1:43 AM, "Daniel Siegmann" <dsiegm...@securityscorecard.io> wrote:

> When using spark.read on a large number of small files, these are
> automatically coalesced into fewer partitions. The only documentation I can
> find on this is in the Spark 2.0.0 release notes, where it simply says
> (http://spark.apache.org/releases/spark-release-2-0-0.html):
>
> "Automatic file coalescing for native data sources"
>
> Can anyone point me to documentation explaining what triggers this
> feature, how it decides how many partitions to coalesce to, and what counts
> as a "native data source"? I couldn't find any mention of this feature in
> the SQL Programming Guide and Google was not helpful.
>
> --
> Daniel Siegmann
> Senior Software Engineer
> *SecurityScorecard Inc.*
> 214 W 29th Street, 5th Floor
> New York, NY 10001
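Not an authoritative answer, but as far as I can tell the coalescing applies to the built-in file-based sources (Parquet, JSON, CSV, text, ORC) and is driven by two configs that do exist in Spark 2.x: spark.sql.files.maxPartitionBytes (default 128 MB) and spark.sql.files.openCostInBytes (default 4 MB). Roughly, each file is padded with the "open cost" and files are then bin-packed into partitions up to a target split size. The pure-Python sketch below simulates that packing so you can estimate partition counts; the helper `pack_files` is illustrative, not Spark's actual code, and it ignores that Spark also splits large splittable files:

```python
def pack_files(file_sizes, max_partition_bytes=128 * 1024 * 1024,
               open_cost_in_bytes=4 * 1024 * 1024, default_parallelism=8):
    """Estimate how many read partitions Spark creates for the given files.

    Loosely modeled on FileSourceScanExec's non-bucketed read logic; treat
    the result as an approximation, not a guarantee.
    """
    # Each file is padded with an "open cost" so tiny files aren't free.
    padded = [size + open_cost_in_bytes for size in file_sizes]
    bytes_per_core = sum(padded) // default_parallelism
    # Target split size: capped by maxPartitionBytes, floored by openCostInBytes.
    max_split_bytes = min(max_partition_bytes,
                          max(open_cost_in_bytes, bytes_per_core))
    # Greedily pack padded files (largest first) into partitions up to the
    # target size. Note: real Spark would also split files larger than the
    # target; this sketch does not.
    partitions, current = [], 0
    for size in sorted(padded, reverse=True):
        if current + size > max_split_bytes and current > 0:
            partitions.append(current)
            current = 0
        current += size
    if current > 0:
        partitions.append(current)
    return len(partitions)

# 1,000 files of 10 KB each collapse into a few dozen partitions
# instead of 1,000.
print(pack_files([10 * 1024] * 1000))
```

In a live session you can check the real behavior directly with `spark.read.parquet(path).rdd.getNumPartitions()` before and after changing those two configs.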