Hi, I use Spark to generate data, which we then analyze with Hive, Pig, Presto, and Spark. Even though I use bucketBy and sortBy with a fixed bucket number in Spark, the number of result files Spark generates under each partition is always far more than the bucket number, so Presto cannot recognize the buckets. How can I control the number of output files in Spark?
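For reference, the write looks roughly like the sketch below (PySpark). The function name, the parquet format, and the table and column names in the usage comment are placeholders I made up for illustration, not our real job.

```python
def write_bucketed(df, table_name, partition_col, bucket_col, num_buckets):
    """Write a DataFrame as a partitioned, bucketed, sorted table.

    `df` is expected to be a pyspark.sql.DataFrame; every argument value
    is a placeholder chosen by the caller.
    """
    (
        df.write
        .format("parquet")                   # assumed file format
        .partitionBy(partition_col)          # one directory per partition value
        .bucketBy(num_buckets, bucket_col)   # requested number of buckets
        .sortBy(bucket_col)                  # sort rows within each bucket
        .mode("overwrite")
        .saveAsTable(table_name)             # bucketBy needs saveAsTable, not save()
    )

# Hypothetical usage, with made-up names:
# write_bucketed(events_df, "analytics.events", "event_date", "user_id", 32)
```

With a write like this, each partition directory ends up holding far more files than the bucket number passed to bucketBy.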
Unfortunately, I have not found any way to limit the file count. Thank you.

-- 
Adam
App Annie Ops
Phone: +86 18610024053
Email: q...@appannie.com