sezruby commented on code in PR #12226:
URL: https://github.com/apache/gluten/pull/12226#discussion_r3350599505


##########
package/pom.xml:
##########
@@ -121,10 +121,22 @@
                 <relocation>
                   <pattern>org.apache.arrow</pattern>
                   <shadedPattern>${gluten.shade.packageName}.org.apache.arrow</shadedPattern>
-                  <!--arrow's C and dataset wrapper refers to the original class path, so we should not relocate here-->
+                  <!--
+                    arrow's C and dataset wrappers refer to the original class
+                    path, so they must not be relocated. Their public APIs also
+                    take and return org.apache.arrow.memory.* and
+                    org.apache.arrow.vector.* types, so those packages must also
+                    stay unshaded — otherwise the bundled (unshaded)
+                    ArrowArrayStream/ArrowSchema get compiled against the
+                    relocated BufferAllocator/VectorSchemaRoot, producing
+                    `NoSuchMethodError` for any caller passing a vanilla
+                    Apache Arrow allocator. See #12225.
+                  -->
                   <excludes>
                     <exclude>org.apache.arrow.c.*</exclude>
                     <exclude>org.apache.arrow.c.jni.*</exclude>
+                    <exclude>org.apache.arrow.memory.**</exclude>
+                    <exclude>org.apache.arrow.vector.**</exclude>

Review Comment:
   Excluding all of org.apache.arrow.* would give up Gluten's isolation from the
   user's Arrow version everywhere, not just at the C-Data boundary. The C-Data
   classes must stay unshaded because their JNI native library hardcodes the
   original class names; arrow.memory.* and arrow.vector.* follow because they
   appear in the public method signatures of arrow.c.*. Anything else under
   org.apache.arrow.* (flight, algorithm, adapter, etc.) is internal to Gluten's
   columnar batch handling and is safer to keep shaded so that it cannot
   conflict with the user's Arrow. The narrow exclusion is the minimum that
   makes the public C-Data API self-consistent without giving up isolation
   elsewhere.
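
   To make the effect of the narrow exclusion concrete, here is a rough sketch
   of which class names would and would not be relocated under the four
   excludes in the hunk above. It is an approximation, not the shade plugin's
   actual matcher: it assumes `*` stays within one package segment and `**`
   spans sub-packages, and `gluten.shaded` is a hypothetical stand-in for
   `${gluten.shade.packageName}`.

```java
import java.util.List;
import java.util.regex.Pattern;

public class ShadeExcludeSketch {
    // Relocation pattern and excludes copied from the diff hunk.
    static final String PATTERN = "org.apache.arrow";
    static final List<String> EXCLUDES = List.of(
        "org.apache.arrow.c.*",
        "org.apache.arrow.c.jni.*",
        "org.apache.arrow.memory.**",
        "org.apache.arrow.vector.**");

    // Hypothetical value standing in for ${gluten.shade.packageName}.
    static final String SHADE_PREFIX = "gluten.shaded";

    // Simplified glob (an assumption about the matching semantics):
    // "**" may span package separators, "*" stays within one segment.
    static Pattern toRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < glob.length()) {
            if (glob.startsWith("**", i)) {
                sb.append(".*");
                i += 2;
            } else if (glob.charAt(i) == '*') {
                sb.append("[^.]*");
                i++;
            } else {
                sb.append(Pattern.quote(String.valueOf(glob.charAt(i))));
                i++;
            }
        }
        return Pattern.compile(sb.toString());
    }

    static boolean isExcluded(String className) {
        return EXCLUDES.stream()
            .anyMatch(glob -> toRegex(glob).matcher(className).matches());
    }

    // Returns the name the class would carry in the shaded jar.
    static String relocate(String className) {
        if (!className.startsWith(PATTERN + ".") || isExcluded(className)) {
            return className; // outside the pattern or excluded: stays unshaded
        }
        return SHADE_PREFIX + "." + className;
    }

    public static void main(String[] args) {
        List<String> samples = List.of(
            "org.apache.arrow.c.ArrowArrayStream",
            "org.apache.arrow.c.jni.JniWrapper",
            "org.apache.arrow.memory.BufferAllocator",
            "org.apache.arrow.vector.VectorSchemaRoot",
            "org.apache.arrow.flight.FlightClient");
        for (String name : samples) {
            System.out.println(name + " -> " + relocate(name));
        }
    }
}
```

   Under these assumptions only the flight class is renamed, so the C-Data
   classes and the memory/vector types in their signatures stay in the same
   unshaded namespace as a caller's vanilla Arrow allocator, while the rest of
   Arrow remains isolated.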



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

