raulcd commented on PR #50195:
URL: https://github.com/apache/arrow/pull/50195#issuecomment-4771317143

   @pitrou @kou I've been working on splitting the S3 filesystem (and the AWS SDK) 
out of `libarrow.so` into its own library, `libarrow_s3.so`.
   In this PR I am only moving the AWS SDK and the S3 filesystem sources 
into the new library. Anyone who wants to use it has to either link 
against it (as we do with the bindings) or dlopen it via 
`LoadFileSystemFactories` (path), which registers the factories at load time 
with no link dependency.
   In this PR I am not planning to move our existing bindings to the 
`FileSystemFromUriAndOptions` and `LoadFileSystemFactories` path, and I am not 
sure we should. I think that path is good for a user who doesn't want 
to link against `libarrow_s3.so` but still wants the same functionality, but it 
is probably not what the majority of users (and our internal bindings) should 
do. What are your thoughts on that?
   As for the size of the artifacts, these are the sizes of `libarrow.so` 
and `libarrow_s3.so` with the new code:
   ```bash
   $ ls -lhL libarrow.so libarrow_s3.so
   -rwxrwxr-x 1 raulcd raulcd  59M Jun 22 19:30 libarrow_s3.so
   -rwxrwxr-x 1 raulcd raulcd 317M Jun 22 19:29 libarrow.so
   ```
   And we can see that AWS symbols are no longer present in `libarrow.so`:
   ```bash
   $ nm -C libarrow.so | grep -c "Aws::"
   0
   $ nm -C libarrow_s3.so | grep -c "Aws::"
   33991
   ```
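   If this check were ever scripted (for example in CI), something like the 
following sketch could work. The helper name `assert_no_aws_symbols` is 
hypothetical and not part of this PR; it reads an `nm -C` listing on stdin and 
fails if any `Aws::` symbol shows up.
   ```bash
   # Hypothetical helper (not part of the PR): succeed only if the symbol
   # listing on stdin contains no AWS SDK symbols.
   assert_no_aws_symbols() {
     local count
     # grep -c prints 0 but exits with status 1 when nothing matches,
     # so its exit status is deliberately ignored here.
     count=$(grep -c "Aws::" || true)
     if [ "$count" -ne 0 ]; then
       echo "found $count AWS symbols" >&2
       return 1
     fi
   }
   # Usage: nm -C libarrow.so | assert_no_aws_symbols
   ```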
   For comparison, on current main `libarrow.so` is larger and contains the AWS SDK symbols:
   ```bash
   $ ls -lhL libarrow.so
   -rwxrwxr-x 1 raulcd raulcd 368M Jun 22 19:45 libarrow.so
   $ ls -lhL libarrow_s3.so
   ls: cannot access 'libarrow_s3.so': No such file or directory
   $ nm -C libarrow.so | grep -c "Aws::"
   33991
   ```
   These are debug builds, but as a summary:
   `libarrow.so` goes from 368M to 317M (~51M smaller), and the AWS symbols 
(33,991) move entirely into the new 59M `libarrow_s3.so`.
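   The figures above are rounded by `ls -h`. To get the exact difference in 
bytes between two builds, a small helper along these lines could be used. The 
function name `size_delta` and the example paths are assumptions for 
illustration, not something the PR provides.
   ```bash
   # Hypothetical helper: print how many bytes smaller the second file is
   # than the first (negative if it grew). Symlinks are followed, like ls -L.
   size_delta() {
     local before after
     before=$(wc -c < "$1") || return 1
     after=$(wc -c < "$2") || return 1
     echo $(( before - after ))
   }
   # Usage (paths are placeholders):
   #   size_delta main-build/libarrow.so pr-build/libarrow.so
   ```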


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
