raulcd commented on code in PR #41559:
URL: https://github.com/apache/arrow/pull/41559#discussion_r1910521567


##########
cpp/cmake_modules/DefineOptions.cmake:
##########
@@ -414,6 +414,12 @@ takes precedence over ccache if a storage backend is configured" ON)
                 DEPENDS
                 ARROW_FILESYSTEM)
 
+  define_option(ARROW_S3_MODULE

Review Comment:
   We are not enabling this in any of our CI builds at the moment, right?
   Should we have at least one job that enables it, so that `s3fs_module_test`
   runs on CI?



##########
cpp/src/arrow/CMakeLists.txt:
##########
@@ -876,6 +876,18 @@ if(ARROW_FILESYSTEM)
     foreach(ARROW_FILESYSTEM_TARGET ${ARROW_FILESYSTEM_TARGETS})
       target_link_libraries(${ARROW_FILESYSTEM_TARGET} PRIVATE ${AWSSDK_LINK_LIBRARIES})
     endforeach()
+
+    if(ARROW_S3_MODULE)
+      if(NOT ARROW_BUILD_SHARED)
+        message(FATAL_ERROR "ARROW_S3_MODULE without shared libarrow is not supported")
+      endif()
+
+      add_library(arrow_s3fs MODULE filesystem/s3fs_module.cc filesystem/s3fs.cc)
+      target_link_libraries(arrow_s3fs PRIVATE ${AWSSDK_LINK_LIBRARIES} arrow_shared)
+      set_source_files_properties(filesystem/s3fs.cc filesystem/s3fs_module.cc
+                                  PROPERTIES SKIP_PRECOMPILE_HEADERS ON
+                                             SKIP_UNITY_BUILD_INCLUSION ON)

Review Comment:
   Did you have any ideas on how to make the module expose the missing
   functionality too? I've been experimenting with your PR by exposing
   `LoadFileSystemFactories` to pyarrow and running a small test that loads
   the S3 module from pyarrow, using a libarrow built without S3. Loading
   works, but some symbols are missing; I can resolve them by reloading the
   module with `ctypes.CDLL(libarrow_s3fs_path, mode=ctypes.RTLD_GLOBAL)`.
   Basically, this test passes using a libarrow built without S3:
   ```python
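   import ctypes

   from pyarrow.fs import FileSystem, FileType


   def test_filesystem_from_uri_s3(s3_server):
       # Load libarrow_s3fs.so through the binding added locally for this
       # experiment (not part of released pyarrow).
       libarrow_s3fs_path = '/home/raulcd/code/libarrow_s3fs.so'
       FileSystem.load_file_system(libarrow_s3fs_path)
       # Reload with RTLD_GLOBAL so the otherwise-missing symbols resolve.
       lib = ctypes.CDLL(libarrow_s3fs_path, mode=ctypes.RTLD_GLOBAL)
       assert lib is not None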
   
       host, port, access_key, secret_key = s3_server['connection']
   
       uri = "s3://{}:{}@mybucket/foo/bar?scheme=http&endpoint_override={}:{}"\
             "&allow_bucket_creation=True" \
             .format(access_key, secret_key, host, port)
       fs, path = FileSystem.from_uri(uri)
       assert path == "mybucket/foo/bar"
       fs.create_dir(path)
       [info] = fs.get_file_info([path])
       assert info.path == path
       assert info.type == FileType.Directory
   ```
   I am just trying to understand whether you have a plan for the next steps.
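
The `RTLD_GLOBAL` reload used in the test above can be isolated into a small helper, which may make the workaround easier to reason about. This is only a minimal sketch: `load_with_global_symbols` is a hypothetical name, not a pyarrow API, and it assumes the module path is known up front. It relies only on the standard `ctypes.CDLL` constructor and its `mode` argument.

```python
import ctypes
import os


def load_with_global_symbols(path):
    """Load a shared library and make its symbols globally visible.

    Hypothetical helper for illustration; not part of pyarrow.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(f"shared library not found: {path}")
    # On most platforms ctypes loads libraries with RTLD_LOCAL by default,
    # so their symbols are not used to resolve references from libraries
    # loaded afterwards. RTLD_GLOBAL changes that.
    return ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
```

In the test above, this would stand in for the direct `ctypes.CDLL(...)` call. The `mode` argument is ignored on Windows, so the sketch only matters on platforms with `dlopen`-style loading.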



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
