I use Spark to generate data, and we then analyze it with Hive, Pig,
Presto, and Spark. However, even when I use bucketBy and sortBy with a
fixed number of buckets in Spark, the number of result files Spark
generates under each partition is always far greater than the bucket
count, so Presto cannot recognize the buckets. How can I control the
number of files in Spark?

Unfortunately, I did not find any way to do that.
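
To make the symptom above concrete, here is a rough model in plain Python
(not Spark code). It assumes that each write task emits one file for every
bucket whose rows it happens to hold, which would explain file counts far
above the bucket count. The function name, the modulo "hash", and the random
keys are all hypothetical stand-ins, not Spark's actual implementation.

```python
import random


def count_bucket_files(num_rows, num_tasks, num_buckets, seed=0):
    """Count distinct (task, bucket) pairs under one partition.

    In this simplified model each distinct pair corresponds to one
    output file, so the result is the modeled file count.
    """
    rng = random.Random(seed)
    files = set()
    for _ in range(num_rows):
        task = rng.randrange(num_tasks)   # assumption: rows are spread arbitrarily over tasks
        key = rng.randrange(10**6)        # hypothetical bucketing key
        bucket = key % num_buckets        # stand-in for the real bucket hash
        files.add((task, bucket))
    return len(files)


if __name__ == "__main__":
    # 20 write tasks and 8 buckets: the count can approach 20 * 8.
    print(count_bucket_files(10_000, num_tasks=20, num_buckets=8))
    # A single write task: the count cannot exceed the bucket count.
    print(count_bucket_files(10_000, num_tasks=1, num_buckets=8))
```

Under this assumption the file count per partition is bounded by
num_tasks * num_buckets rather than by num_buckets alone, which matches
what I am seeing.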

Thank you.

Adam - App Annie Ops
Phone: +86 18610024053
Email: q...@appannie.com

