steveloughran commented on PR #34864:
URL: https://github.com/apache/spark/pull/34864#issuecomment-2059907838

   @michaelbilow Hadoop S3A is on the v2 SDK; the com.amazonaws classes are not 
on the classpath, and Amazon is slowly ending support for them. You cannot, for 
example, use the lower-latency S3 Express stores with the v1 SDK.
   
   Like I say: I think you would be better off using the Hadoop file system 
APIs to talk to S3. If there are aspects of S3 storage which aren't available 
through the API, or are available only very inefficiently because of the effort 
to preserve the POSIX metaphor, then let's fix the API so that other stores can 
offer the same features and other apps can pick them up.
   
   For example, here's our in-progress bulk delete API for Iceberg and other 
manifest-based tables:
   https://github.com/apache/hadoop/pull/6726
   It maps to S3 bulk delete calls, but there's scope to add it to other stores 
(we now actually want to offer it as a page-size == 1 option for all 
filesystems, as that simplifies Iceberg integration).
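
   A rough sketch of the paging idea, not the API in that PR: the caller hands 
over a list of paths, the store deletes them in pages of whatever size it 
supports, and a store with no bulk call just uses a page size of 1. The names 
`delete_in_pages` and `delete_page` are hypothetical and chosen only for 
illustration; they are not taken from the Hadoop code.

```python
from typing import Callable, Iterable, List


def delete_in_pages(
    paths: Iterable[str],
    delete_page: Callable[[List[str]], List[str]],
    page_size: int = 1,
) -> List[str]:
    """Delete paths in pages of at most page_size; return the paths that failed.

    delete_page is a hypothetical store-specific callback: it receives one
    page of paths and returns the subset it could not delete.
    """
    if page_size < 1:
        raise ValueError("page_size must be >= 1")
    failures: List[str] = []
    page: List[str] = []
    for path in paths:
        page.append(path)
        if len(page) == page_size:
            failures.extend(delete_page(page))
            page = []
    if page:  # flush the final, possibly short, page
        failures.extend(delete_page(page))
    return failures
```

   With the default page size of 1 this degrades to one delete call per path, 
which is what a filesystem without a bulk operation would do anyway; an object 
store with a bulk call would pass its own, larger page size.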


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

