advancedxy commented on PR #55:
URL: 
https://github.com/apache/arrow-datafusion-comet/pull/55#issuecomment-1954094484

   > Ah, yes. The failures observed for now are all on X86 macs.
   Let me see if there's other ways to investigate this.
   
   I believe I have found the root cause: x86 Macs are failing to load 
libcrypto.dylib.
   
   ## Solutions:
   1. Install OpenSSL and set `DYLD_LIBRARY_PATH` to OpenSSL's actual lib 
path (chosen by this PR).
   2. Or pass the property 
`commons.crypto.cipher.classes=org.apache.commons.crypto.cipher.JceCipher` so 
that Apache commons-crypto loads only the JCE cipher.
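
   For option 2, here is a minimal sketch of the property as a 
`java.util.Properties` entry. The key and value are the ones quoted above. The 
class name `JceOnlyCipherConfig` and its helper are hypothetical, and how the 
properties actually reach commons-crypto in this build (for example through 
Spark configuration or a JVM flag) is an assumption not covered in this comment.

```java
import java.util.Properties;

public class JceOnlyCipherConfig {
    // Key and value quoted from option 2; everything else here is illustrative.
    static final String CIPHER_CLASSES_KEY = "commons.crypto.cipher.classes";
    static final String JCE_CIPHER = "org.apache.commons.crypto.cipher.JceCipher";

    /** Builds properties that request only the pure-Java JCE cipher. */
    static Properties jceOnlyProperties() {
        Properties props = new Properties();
        props.setProperty(CIPHER_CLASSES_KEY, JCE_CIPHER);
        return props;
    }

    public static void main(String[] args) {
        Properties props = jceOnlyProperties();
        // Print the entry so it can be compared against the CI configuration.
        System.out.println(CIPHER_CLASSES_KEY + "=" + props.getProperty(CIPHER_CLASSES_KEY));
    }
}
```

   With only `JceCipher` listed, the intent is that the OpenSSL-backed cipher 
is never requested, so the native lib should have no reason to pull in 
libcrypto.dylib.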
   
   ## Detailed investigations
   Q1: Why are the mac runners failing with `loading libcrypto in an unsafe way`?
   A1: See this comment: 
https://github.com/cl-plus-ssl/cl-plus-ssl/issues/114#issuecomment-770370592
   
   Q2: Why is libcrypto.dylib loaded? Who is loading it?
   A2: It's loaded by Apache's commons-crypto lib, which Spark uses to 
enable IO encryption. With `DYLD_PRINT_SEARCHING=true` enabled on the mac 
runners, the output shows that it is loaded by commons-crypto's native lib.
   <img width="1536" alt="image" 
src="https://github.com/apache/arrow-datafusion-comet/assets/807537/89d17c78-11f9-4ea2-aeb7-5da2b5463434">
   Log path: 
https://github.com/apache/arrow-datafusion-comet/actions/runs/7971374594/job/21760923967
   
   Q3: Why are only the x86 runners failing?
   A3: Because commons-crypto bundles a native lib only for x86. 
   <img width="359" alt="image" 
src="https://github.com/apache/arrow-datafusion-comet/assets/807537/86ca86d5-6e4e-42df-a404-3b31e58b7773">
   When running on Apple Silicon Macs, there is no native lib, so loading it 
fails. However, the native code only provides `OpenSslCipher`. Spark's IO 
encryption doesn't require OpenSSL; `JceCipher` is sufficient.
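
   Since A3 ties the failure to the runner's CPU architecture, a small check 
of what the JVM reports can help when reading CI logs. This is only a sketch: 
the class `RunnerArch` and the method `isX86Mac` are hypothetical, and it 
assumes the JVM reports `x86_64` or `amd64` in `os.arch` on Intel Macs and 
`aarch64` on Apple Silicon.

```java
import java.util.Locale;

public class RunnerArch {
    /** Returns true when the given OS name and arch look like an Intel (x86) Mac. */
    static boolean isX86Mac(String osName, String osArch) {
        String os = osName.toLowerCase(Locale.ROOT);
        String arch = osArch.toLowerCase(Locale.ROOT);
        boolean mac = os.contains("mac");
        // Assumed arch strings; adjust if a JDK reports something else.
        boolean x86 = arch.equals("x86_64") || arch.equals("amd64");
        return mac && x86;
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        String osArch = System.getProperty("os.arch");
        System.out.println("os.name=" + osName + ", os.arch=" + osArch
                + ", x86 mac=" + isX86Mac(osName, osArch));
    }
}
```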
   
   Q4: Why is it flaky?
   A4: Not sure. Maybe the dynamic libs are loaded in a different order on each 
run, so the correct libcrypto.dylib sometimes gets loaded by the JVM.

