When reading a Parquet file with more than 50 parts, Spark is giving me a DataFrame with far fewer in-memory partitions than I expected.
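For context, here is a minimal sketch of how I'm checking the count. The path is a placeholder, and I'm assuming Spark 2.x, where the spark.sql.files.maxPartitionBytes setting appears to govern the target split size for file sources (I may be wrong about that; it's part of what I'm asking):

    import org.apache.spark.sql.SparkSession

    object PartitionCountCheck {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("parquet-partition-count")
          // Target max bytes per read partition for file sources in
          // Spark 2.x; the default is 128 MB. Lowering it should, as I
          // understand it, yield more partitions.
          .config("spark.sql.files.maxPartitionBytes", 32L * 1024 * 1024)
          .getOrCreate()

        // Placeholder path for the >50-part Parquet dataset.
        val df = spark.read.parquet("/path/to/parquet")

        // The in-memory partition count Spark chose for the scan.
        println(s"partitions = ${df.rdd.getNumPartitions}")

        spark.stop()
      }
    }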
I'm happy to troubleshoot this further, but I don't know Scala well and could use some help pointing me in the right direction. Where in the code base should I be looking to understand how many partitions will result from reading a Parquet file?

Thanks,
Shea