DenisTarasyuk opened a new issue, #40839:
URL: https://github.com/apache/arrow/issues/40839

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   Using Java to build Projector and evalute it. Projector calls function sinh 
with double precision Float argument. 
   Running built jar inside docker on ubuntu:latest aarch64 image. Using 
openjdk 11 jdk and temurin 11 jdk.
   On openjdk I am getting valid result.
   ON temurin jdk I am getting SIGSEGV (0xb) at pc=0x0000000000000000, 
pid=6522, tid=6523
   Same reproduces on aarch64 docker image openjdk:11 
   Issue seems to be related to libc or some java libs dependencies.
   Also reproduces for functions: 
sqrt,cbrt,exp,log,log10,sin,asin,cos,acos,tan,atan,cosh,tanh,atan2,cot,radians,degrees,power.
   I have used debugger to investigate llvm generated projector code and found 
that when I get SIGSEGV generated function has code like this:
   ```
   expr_0_0:
   0000ffff4cd52000:   stp     x30, x21, [sp, #-32]!  
   0000ffff4cd52004:   ldr     x8, [x1, #24]. 
   0000ffff4cd52008:   cmp     x6, #0x1
   0000ffff4cd5200c:   ldr     x9, [x0, #24]
   0000ffff4cd52010:   stp     x20, x19, [sp, #16]
   0000ffff4cd52014:   csinc   x19, x6, xzr, gt
   0000ffff4cd52018:   ldr     x20, [x0]
   0000ffff4cd5201c:   add     x21, x9, x8, lsl #3
   0000ffff4cd52020:   ldr     x0, [x21], #8
   0000ffff4cd52024:   bl      0xffff4cd5204c
   0000ffff4cd52028:   bl      0xffff4cd52060
   0000ffff4cd5202c:   bl      0xffff4cd52074
   0000ffff4cd52030:   subs    x19, x19, #0x1
   0000ffff4cd52034:   str     d0, [x20], #8
   0000ffff4cd52038:   b.ne    0xffff4cd52020 <expr_0_0+32>  // b.any
   0000ffff4cd5203c:   ldp     x20, x19, [sp, #16]
   0000ffff4cd52040:   mov     w0, wzr
   0000ffff4cd52044:   ldp     x30, x21, [sp], #32
   0000ffff4cd52048:   ret     
   0000ffff4cd5204c:   movz    x16, #0x0, lsl #48
   0000ffff4cd52050:   movk    x16, #0x0, lsl #32
   0000ffff4cd52054:   movk    x16, #0x0, lsl #16
   0000ffff4cd52058:   movk    x16, #0x0
   0000ffff4cd5205c:   br      x16
   0000ffff4cd52060:   movz    x16, #0x0, lsl #48
   0000ffff4cd52064:   movk    x16, #0x0, lsl #32
   0000ffff4cd52068:   movk    x16, #0x0, lsl #16
   0000ffff4cd5206c:   movk    x16, #0x0
   0000ffff4cd52070:   br      x16
   0000ffff4cd52074:   movz    x16, #0x0, lsl #48
   0000ffff4cd52078:   movk    x16, #0x0, lsl #32
   0000ffff4cd5207c:   movk    x16, #0x0, lsl #16
   0000ffff4cd52080:   movk    x16, #0x0
   0000ffff4cd52084:   br      x16
   ```
   
   Here #0x0 is where actual address of external functions should be but it is 
not populated so code jumps to 0x0000000000000000.
   Here is example of valid llvm function that works:
   
   ```
   expr_0_0:
   0000ffff86635000:   stp     x30, x21, [sp, #-32]!
   0000ffff86635004:   ldr     x8, [x1, #24]
   0000ffff86635008:   cmp     x6, #0x1
   0000ffff8663500c:   ldr     x9, [x0, #24]
   0000ffff86635010:   stp     x20, x19, [sp, #16]
   0000ffff86635014:   csinc   x19, x6, xzr, gt
   0000ffff86635018:   ldr     x20, [x0]
   0000ffff8663501c:   add     x21, x9, x8, lsl #3
   0000ffff86635020:   ldr     x0, [x21], #8
   0000ffff86635024:   bl      0xffff8663504c
   0000ffff86635028:   bl      0xffff86635060
   0000ffff8663502c:   bl      0xffff86635074
   0000ffff86635030:   subs    x19, x19, #0x1
   0000ffff86635034:   str     d0, [x20], #8
   0000ffff86635038:   b.ne    0xffff86635020 <expr_0_0+32>  // b.any
   0000ffff8663503c:   ldp     x20, x19, [sp, #16]
   0000ffff86635040:   mov     w0, wzr
   0000ffff86635044:   ldp     x30, x21, [sp], #32
   0000ffff86635048:   ret     
   0000ffff8663504c:   movz    x16, #0x0, lsl #48
   0000ffff86635050:   movk    x16, #0xffff, lsl #32
   0000ffff86635054:   movk    x16, #0x8661, lsl #16
   0000ffff86635058:   movk    x16, #0xa4a0
   0000ffff8663505c:   br      x16
   0000ffff86635060:   movz    x16, #0x0, lsl #48
   0000ffff86635064:   movk    x16, #0xffff, lsl #32
   0000ffff86635068:   movk    x16, #0x8664, lsl #16
   0000ffff8663506c:   movk    x16, #0xe6c0
   0000ffff86635070:   br      x16
   0000ffff86635074:   movz    x16, #0x0, lsl #48
   0000ffff86635078:   movk    x16, #0xffff, lsl #32
   0000ffff8663507c:   movk    x16, #0x8661, lsl #16
   0000ffff86635080:   movk    x16, #0xb4d0
   0000ffff86635084:   br      x16
   ```
        
   This does not seem to happen on x86. Both openjdk and temurin works there.
   Here is example java code I have used to reproduce:
   ```
   
       package org.example;
       import com.google.common.collect.Lists;
       import java.math.BigDecimal;
       import java.util.ArrayList;
       import java.util.Arrays;
       import java.util.List;
   
       import org.apache.arrow.gandiva.evaluator.Projector;
       import org.apache.arrow.gandiva.exceptions.GandivaException;
       import org.apache.arrow.gandiva.expression.ExpressionTree;
       import org.apache.arrow.gandiva.expression.TreeBuilder;
       import org.apache.arrow.memory.ArrowBuf;
       import org.apache.arrow.memory.BufferAllocator;
       import org.apache.arrow.memory.RootAllocator;
       import org.apache.arrow.vector.DecimalVector;
       import org.apache.arrow.vector.Float8Vector;
       import org.apache.arrow.vector.ValueVector;
       import org.apache.arrow.vector.ipc.message.ArrowFieldNode;
       import org.apache.arrow.vector.ipc.message.ArrowRecordBatch;
       import org.apache.arrow.vector.types.FloatingPointPrecision;
       import org.apache.arrow.vector.types.pojo.ArrowType.Decimal;
       import org.apache.arrow.vector.types.pojo.ArrowType.FloatingPoint;
       import org.apache.arrow.vector.types.pojo.Field;
               import org.apache.arrow.vector.types.pojo.Schema;
   
       public class Main {
   
       public static void main(String[] _args) throws GandivaException {
           BufferAllocator allocator= new RootAllocator(Long.MAX_VALUE);
           FloatingPoint fPoint64 = new 
FloatingPoint(FloatingPointPrecision.DOUBLE);
           Field aFp = Field.nullable("a", fPoint64);
           List<Field> args = Lists.newArrayList(aFp);
   
           List<ExpressionTree> exprs = new ArrayList<>(
                   Arrays.asList(
                           TreeBuilder.makeExpression(
                                   TreeBuilder.makeFunction("sinh",
                                           
Lists.newArrayList(TreeBuilder.makeField(aFp)),
                                           fPoint64),
                                   Field.nullable("f_field", fPoint64))
                   )
           );
   
           Schema schema = new Schema(args);
           Projector eval = Projector.make(schema, exprs);
   
           List<ValueVector> output = null;
           ArrowRecordBatch batch = null;
           try {
               int numRows = 4;
               String[] aValues = new String[]{"1.23", "1.58", "-1.23", 
"-1.58"};
   
               Float8Vector valuesa = floatVector(allocator, aValues);
               batch =
                       new ArrowRecordBatch(
                               numRows,
                               Lists.newArrayList(new ArrowFieldNode(numRows, 
0)),
                               Lists.newArrayList(valuesa.getValidityBuffer(), 
valuesa.getDataBuffer()));
   
   
               // Allocate output vectors.
               output = new ArrayList<>(
                       Arrays.asList(
                               new Float8Vector("f_field", allocator)
                       )
               );
               for (ValueVector v : output) {
                   v.allocateNew();
               }
   
               // evaluate expressions.
               eval.evaluate(batch, output);
   
               // compare the outputs.
               for (int idx = 0; idx < output.size(); ++idx) {
                   Float8Vector resultVector = (Float8Vector) output.get(idx);
   
                   for (int i = 0; i < numRows; i++) {
                       System.out.println("value: "+resultVector.getObject(i));
                   }
               }
           } finally {
               // free buffers
               if (batch != null) {
                   releaseRecordBatch(batch);
               }
               if (output != null) {
                   releaseValueVectors(output);
               }
               eval.close();
               allocator.close();
           }
       }
   
       static Float8Vector floatVector(BufferAllocator allocator, String[] 
values) {
           Float8Vector vector = new Float8Vector("double" + Math.random(), 
allocator);
           vector.allocateNew();
           for (int i = 0; i < values.length; i++) {
               Double doubleValue = new Double(values[i]);
               vector.setSafe(i, doubleValue);
           }
   
           vector.setValueCount(values.length);
           return vector;
       }
   
       static void releaseRecordBatch(ArrowRecordBatch recordBatch) {
           // There are 2 references to the buffers
           // One in the recordBatch - release that by calling close()
           // One in the allocator - release that explicitly
           List<ArrowBuf> buffers = recordBatch.getBuffers();
           recordBatch.close();
           for (ArrowBuf buf : buffers) {
               buf.getReferenceManager().release();
           }
       }
   
       static void releaseValueVectors(List<ValueVector> valueVectors) {
           for (ValueVector valueVector : valueVectors) {
               valueVector.close();
           }
       }
   }
   ```
   
   pom.xml
   ```
   <?xml version="1.0" encoding="UTF-8"?>
   <project xmlns="http://maven.apache.org/POM/4.0.0";
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
            xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd";>
       <modelVersion>4.0.0</modelVersion>
   
       <groupId>org.example</groupId>
       <artifactId>test</artifactId>
       <version>1.0-SNAPSHOT</version>
   
       <properties>
           <maven.compiler.source>11</maven.compiler.source>
           <maven.compiler.target>11</maven.compiler.target>
           <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
           <arrow.version>14.0.2</arrow.version>
       </properties>
       <dependencies>
   
           <dependency>
               <groupId>com.googlecode.json-simple</groupId>
               <artifactId>json-simple</artifactId>
               <version>1.1</version>
           </dependency>
           <dependency>
               <groupId>org.apache.arrow.gandiva</groupId>
               <artifactId>arrow-gandiva</artifactId>
               <version>${arrow.version}</version>
           </dependency>
           <dependency>
               <groupId>org.apache.arrow</groupId>
               <artifactId>arrow-vector</artifactId>
               <version>${arrow.version}</version>
           </dependency>
           <dependency>
               <groupId>org.apache.arrow</groupId>
               <artifactId>arrow-memory-netty</artifactId>
               <version>${arrow.version}</version>
           </dependency>
           <dependency>
               <groupId>junit</groupId>
               <artifactId>junit</artifactId>
               <version>4.4</version>
               <scope>test</scope>
           </dependency>
   
       </dependencies>
       <build>
           <plugins>
               <plugin>
                   <groupId>org.apache.maven.plugins</groupId>
                   <artifactId>maven-shade-plugin</artifactId>
                   <version>3.5.2</version>
                   <executions>
                       <execution>
                           <phase>package</phase>
                           <goals>
                               <goal>shade</goal>
                           </goals>
                           <configuration>
                               <transformers>
                                   <transformer 
implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                       <mainClass>org.example.Main</mainClass>
                                   </transformer>
                               </transformers>
                           </configuration>
                       </execution>
                   </executions>
               </plugin>
           </plugins>
       </build>
   
   </project>
   ```
   
   Here is ldd output:
   openjdk
   ```
        linux-vdso.so.1 (0x0000ffffa48da000)
        libz.so.1 => /lib/aarch64-linux-gnu/libz.so.1 (0x0000ffffa4850000)
        libjli.so => 
/usr/lib/jvm/java-11-openjdk-arm64/bin/../lib/jli/libjli.so (0x0000ffffa4820000)
        libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffffa4670000)
        /lib/ld-linux-aarch64.so.1 (0x0000ffffa48a1000)
   ```
   temurin jdk
   
   ```
        linux-vdso.so.1 (0x0000ffff9f62d000)
        libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 
(0x0000ffff9f5b0000)
        libjli.so => /usr/lib/jvm/temurin-11-jdk-arm64/bin/../lib/jli/libjli.so 
(0x0000ffff9f570000)
        libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000ffff9f550000)
        libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000ffff9f3a0000)
        /lib/ld-linux-aarch64.so.1 (0x0000ffff9f5f4000)
   ```
   
   ### Component(s)
   
   C++ - Gandiva


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to