When reading a Parquet file with more than 50 parts, Spark is giving me a DataFrame with far fewer in-memory partitions than I expected.
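For context, here is a minimal sketch of how I'm checking the count. The path is a placeholder, and I'm assuming Spark 2.x, where the spark.sql.files.maxPartitionBytes setting appears to govern the target split size for file sources (I may be wrong about that; it's part of what I'm asking):

    import org.apache.spark.sql.SparkSession

    object PartitionCountCheck {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("parquet-partition-count")
          // Target max bytes per read partition for file sources in
          // Spark 2.x; the default is 128 MB. Lowering it should, as I
          // understand it, yield more partitions.
          .config("spark.sql.files.maxPartitionBytes", 32L * 1024 * 1024)
          .getOrCreate()

        // Placeholder path for the >50-part Parquet dataset.
        val df = spark.read.parquet("/path/to/parquet")

        // The in-memory partition count Spark chose for the scan.
        println(s"partitions = ${df.rdd.getNumPartitions}")

        spark.stop()
      }
    }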
I'm happy to troubleshoot this further, but I don't know Scala well and could use some help pointing me in the right direction. Where in the code base should I be looking to understand how many partitions will result from reading a Parquet file?

Thanks,
Shea