HeartSaVioR commented on pull request #32875: URL: https://github.com/apache/spark/pull/32875#issuecomment-1023715717
Oh well... I may need to try hacking the file stream source to support bucketing (regardless of whether it works correctly or not) and check the physical plan. cc. @cloud-fan Could you please help triage that the problem may exist even before this PR? Would using HashClusteredDistribution "force" using Spark's internal hash function on distribution?
