wangshisan commented on issue #25869: [SPARK-29189][SQL] Add an option to 
ignore block locations when listing file
URL: https://github.com/apache/spark/pull/25869#issuecomment-533809214
 
 
   Yes, I see. A new API call was introduced in #24175, and it does improve 
things a lot. However, the new API still fetches all of the block location 
information, and in our benchmark it can take tens of seconds to fetch it for 
a huge table.
   In my opinion, if a Spark cluster is deployed physically separate from the 
HDFS cluster, we do not need any of this block location information, and that 
is what this PR is for.
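
To illustrate the idea, here is a minimal, self-contained sketch in Python. It is a toy model, not the Spark or Hadoop code touched by this PR: `FileEntry`, `list_files`, `CountingLookup`, and the `ignore_locality` flag are all hypothetical names chosen for this example. It only shows that when the flag is set, the per-file block location lookup is never issued.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class FileEntry:
    """Hypothetical listing result: one file plus the hosts of each of its blocks."""
    path: str
    size: int
    # One tuple of host names per block; stays empty when locations are skipped.
    block_hosts: List[Tuple[str, ...]] = field(default_factory=list)


def list_files(
    files: List[Tuple[str, int]],
    fetch_block_hosts: Callable[[str], List[Tuple[str, ...]]],
    ignore_locality: bool,
) -> List[FileEntry]:
    """List files, calling the (possibly expensive) lookup only when locality matters."""
    entries = []
    for path, size in files:
        # Skipping the lookup saves one remote call per file in this model.
        hosts = [] if ignore_locality else fetch_block_hosts(path)
        entries.append(FileEntry(path, size, hosts))
    return entries


class CountingLookup:
    """Stand-in for a remote block location lookup; counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, path: str) -> List[Tuple[str, ...]]:
        self.calls += 1
        return [("datanode-1", "datanode-2")]  # made-up host names
```

The trade-off in this model is the same one described above: with the flag set, the listing is cheaper, but the entries carry no host information that a scheduler could use for locality.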
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
