Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21608
cc: @wzhfy @gatorsmile
---
-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail:
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21608
OK, can you put the result in the description? Also, can you make the title
more precise, e.g., "Parallelize size computation in ANALYZE command"?
---
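For readers following the thread, the idea behind the suggested title can be illustrated with a small sketch. It is not the code from the PR and not Spark's implementation: it is a language-neutral illustration in Python that uses a local directory tree as a stand-in for table partitions, and every name in it (`directory_size`, `total_size_serial`, `total_size_parallel`, `make_demo_table`) is hypothetical. The assumption is that each partition's size can be computed independently, so the per-partition calls can be issued concurrently and summed afterwards.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def directory_size(path):
    """Sum the sizes of all regular files under one partition directory."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def total_size_serial(partition_paths):
    """Baseline: visit the partitions one after another."""
    return sum(directory_size(p) for p in partition_paths)


def total_size_parallel(partition_paths, max_workers=8):
    """Issue the per-partition size calls concurrently, then add them up.

    Threads are used on the assumption that each call mostly waits on
    storage I/O rather than on the CPU.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(directory_size, partition_paths))


def make_demo_table(root, num_partitions=4, bytes_per_file=10):
    """Create a tiny fake partitioned table on local disk (demo only)."""
    paths = []
    for i in range(num_partitions):
        part = os.path.join(root, f"part={i}")
        os.makedirs(part)
        with open(os.path.join(part, "data.bin"), "wb") as f:
            f.write(b"x" * bytes_per_file)
        paths.append(part)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        paths = make_demo_table(root)
        # Both strategies must report the same total.
        print(total_size_serial(paths), total_size_parallel(paths))
```

On a local disk the two versions will likely take about the same time. Any benefit would be expected only when each per-partition call has high latency, which is the situation the S3 comment in this thread describes.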
Github user Achuth17 commented on the issue:
https://github.com/apache/spark/pull/21608
Yes, in the case where the data is stored in S3, I noticed a significant
difference.
Some rough numbers: when done serially for a table in S3 with 1000
partitions, the calculateTotalSize
Github user maropu commented on the issue:
https://github.com/apache/spark/pull/21608
Does this PR improve actual performance? (My question is whether the
calculation is a bottleneck.)
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/21608
Can one of the admins verify this patch?
---