the detailed
LocatedBlocks info.
Another finding: if I read the Parquet file via Scala code from
spark-shell as below, it works fine, and the computation returns the result
as quickly as before.
sqlContext.parquetFile("data/myparquettable")
Any idea about it? Thank you!
--
郑旭东
Zheng, Xudong
it to
executor side and each task only needs to reconcile those part-files it
needs to touch. This is also what the Parquet developers did recently for
parquet-hadoop https://github.com/apache/incubator-parquet-mr/pull/45.
Cheng
On 3/31/15 11:49 PM, Zheng, Xudong wrote:
Thanks Cheng!
Set
Is it possible to set the number of cores per executor on a standalone
cluster?
We find that the distribution of cores across executors can sometimes be
very skewed, so the workload is skewed too, which makes our job slow.
Thanks!
--
郑旭东
Zheng, Xudong
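
For reference, the settings this question concerns can be sketched as a spark-defaults.conf fragment. This is only a sketch under assumptions: it presumes Spark 1.4 or later, where spark.executor.cores is honored in standalone mode (earlier releases may not honor it there), and the values 4 and 16 are illustrative, not taken from this thread.

```
# spark-defaults.conf (illustrative values, not from this thread)
# Cores given to each executor (assumes standalone support, Spark 1.4+)
spark.executor.cores   4
# Upper bound on the total cores the application may take across the cluster
spark.cores.max        16
```

Under these assumptions the application would be granted at most four executors of four cores each, rather than whatever uneven share of cores each worker happens to have free.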