steveloughran commented on pull request #33828:
URL: https://github.com/apache/spark/pull/33828#issuecomment-1064048607
oh, also, I'm thinking of making some GCS enhancements which turn off some
checks under _temporary/ paths, breaking "strict" FS semantics but delivering
performance through reduced IO
* skipping all the overwrite, parent-is-a-directory and
destination-is-not-a-directory checks when creating a file
* not worrying about recreating parent dir markers after renaming or
deleting files
... etc. S3A will do the same under paths with `__magic` as an element above
them; this saves a HEAD and a LIST for every parquet file written (it sets
overwrite=false when creating files, for no reason at all).
So you should always use _temporary as one path element in your staging dir
to get any of those benefits.
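
To make the staging-dir advice concrete, here is a minimal sketch of checking whether a path has one of these marker names as a path element. The function name `has_relaxed_marker` and the fixed marker set are assumptions made for illustration; this is not code from the GCS or S3A connectors, and the connectors may apply different rules.

```python
from urllib.parse import urlparse

# Marker directory names taken from the comment above. Treating them as one
# fixed set shared by all stores is an assumption of this sketch.
RELAXED_MARKERS = frozenset({"_temporary", "__magic"})


def has_relaxed_marker(path: str) -> bool:
    """Return True if any element of the path exactly matches a marker name.

    Hypothetical helper: it only inspects the path string and does not talk
    to any filesystem or connector.
    """
    # urlparse separates the scheme and bucket, leaving only the object path.
    elements = urlparse(path).path.split("/")
    # Exact element match: "my_temporary_dir" does not count as a marker.
    return any(element in RELAXED_MARKERS for element in elements)
```

With this sketch, `has_relaxed_marker("gs://bucket/out/_temporary/0/part-0000.parquet")` is true, while a staging dir such as `gs://bucket/out/staging/` is not and so would see none of the benefits described above.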