Zinway created IMPALA-12665:
-------------------------------

             Summary: Update ScratchMicroBatch length to new 
scratch_batch_->capacity after ScratchTupleBatch::Reset
                 Key: IMPALA-12665
                 URL: https://issues.apache.org/jira/browse/IMPALA-12665
             Project: IMPALA
          Issue Type: Bug
          Components: be
    Affects Versions: Impala 4.3.0
            Reporter: Zinway


{panel}
Happened when parquet table scanning where row_size > 4096 bytes and row batch 
> 1024.
{panel}
h3. Log with AddressSanitizer

 
{code:java}
==557405==ERROR: AddressSanitizer: heap-buffer-overflow on address 
0x7fa162333408 at pc 0x00000413a68c bp 0x7fa162f2fc10 sp 0x7fa162f2fc08
WRITE of size 4 at 0x7fa162333408 thread T559
    #0  0x413a68b  (/usr/lib/impala/sbin/impalad+0x413a68b)    # addr2line => 
apache-impala-4.3.0/be/src/exec/parquet/parquet-common.h:570
    #1  0x419b76f  (/usr/lib/impala/sbin/impalad+0x419b76f)    # addr2line => 
apache-impala-4.3.0/be/src/exec/parquet/parquet-common.h:616
    #2  0x4199769  (/usr/lib/impala/sbin/impalad+0x4199769)    # addr2line => 
apache-impala-4.3.0/be/src/exec/parquet/parquet-column-readers.cc:864
    #3  0x4195e74  (/usr/lib/impala/sbin/impalad+0x4195e74)    # addr2line => 
apache-impala-4.3.0/be/src/exec/parquet/parquet-column-readers.cc:663
    #4  0x419f719  (/usr/lib/impala/sbin/impalad+0x419f719)    # addr2line => 
apache-impala-4.3.0/be/src/exec/parquet/parquet-column-readers.cc:496
    #5  0x38876d4  (/usr/lib/impala/sbin/impalad+0x38876d4)    # addr2line => 
apache-impala-4.3.0/be/src/exec/parquet/hdfs-parquet-scanner.cc:?
    #6  0x388ef4f  (/usr/lib/impala/sbin/impalad+0x388ef4f)    # addr2line => 
apache-impala-4.3.0/be/src/exec/parquet/hdfs-parquet-scanner.cc:2370
    #7  0x386db0d  (/usr/lib/impala/sbin/impalad+0x386db0d)    # addr2line => 
apache-impala-4.3.0/be/src/exec/parquet/hdfs-parquet-scanner.cc:532
    #8  0x386b7d1  (/usr/lib/impala/sbin/impalad+0x386b7d1)    # addr2line => 
apache-impala-4.3.0/be/src/exec/parquet/hdfs-parquet-scanner.cc:416
    #9  0x3742adf  (/usr/lib/impala/sbin/impalad+0x3742adf)    # addr2line => 
apache-impala-4.3.0/be/src/exec/hdfs-scan-node.cc:495
    #10 0x37418b8  (/usr/lib/impala/sbin/impalad+0x37418b8)    # addr2line => 
apache-impala-4.3.0/be/src/exec/hdfs-scan-node.cc:413
    #11 0x28720f6  (/usr/lib/impala/sbin/impalad+0x28720f6)
    #12 0x33db1ef  (/usr/lib/impala/sbin/impalad+0x33db1ef)
    #13 0x33e74f8  (/usr/lib/impala/sbin/impalad+0x33e74f8)
    #14 0x33e734b  (/usr/lib/impala/sbin/impalad+0x33e734b)
    #15 0x4b016f6  (/usr/lib/impala/sbin/impalad+0x4b016f6)
    #16 0x7fa5a4d1cdd4  (/lib64/libpthread.so.0+0x7dd4)
    #17 0x7fa5a1d0102c  (/lib64/libc.so.6+0xfe02c)0x7fa162333408 is located 8 
bytes to the right of 4193280-byte region [0x7fa161f33800,0x7fa162333400)
allocated by thread T559 here:
    #0  0x1eb956f  (/usr/lib/impala/sbin/impalad+0x1eb956f)    # addr2line => 
??:?
    #1  0x28fe1c3  (/usr/lib/impala/sbin/impalad+0x28fe1c3)    # addr2line => 
apache-impala-4.3.0/be/src/runtime/mem-pool.cc:132
    #2  0x2966b08  (/usr/lib/impala/sbin/impalad+0x2966b08)    # addr2line => 
apache-impala-4.3.0/be/src/runtime/mem-pool.h:295
    #3  0x2961bfd  (/usr/lib/impala/sbin/impalad+0x2961bfd)    # addr2line => 
apache-impala-4.3.0/be/src/runtime/row-batch.cc:528
    #4  0x3818295  (/usr/lib/impala/sbin/impalad+0x3818295)    # addr2line => 
apache-impala-4.3.0/be/src/exec/scratch-tuple-batch.h:92
    #5  0x388ee46  (/usr/lib/impala/sbin/impalad+0x388ee46)    # addr2line => 
apache-impala-4.3.0/be/src/exec/parquet/hdfs-parquet-scanner.cc:2363
    #6  0x386db0d  (/usr/lib/impala/sbin/impalad+0x386db0d)    # addr2line => 
apache-impala-4.3.0/be/src/exec/parquet/hdfs-parquet-scanner.cc:532
    #7  0x386b7d1  (/usr/lib/impala/sbin/impalad+0x386b7d1)    # addr2line => 
apache-impala-4.3.0/be/src/exec/parquet/hdfs-parquet-scanner.cc:416
    #8  0x3742adf  (/usr/lib/impala/sbin/impalad+0x3742adf)    # addr2line => 
apache-impala-4.3.0/be/src/exec/hdfs-scan-node.cc:495
    #9  0x37418b8  (/usr/lib/impala/sbin/impalad+0x37418b8)    # addr2line => 
apache-impala-4.3.0/be/src/exec/hdfs-scan-node.cc:413
    #10 0x28720f6  (/usr/lib/impala/sbin/impalad+0x28720f6)
    #11 0x33db1ef  (/usr/lib/impala/sbin/impalad+0x33db1ef)
    #12 0x33e74f8  (/usr/lib/impala/sbin/impalad+0x33e74f8)
    #13 0x33e734b  (/usr/lib/impala/sbin/impalad+0x33e734b)
    #14 0x4b016f6  (/usr/lib/impala/sbin/impalad+0x4b016f6) {code}
 

 
h3. Fault Reproduction Steps
h4. Prepare data with bash and hive client
{code:sh}
#!/bin/bash

# Table Name
TABLE_NAME="p3"

# Generate Hive SQL to create the table
echo "CREATE TABLE $TABLE_NAME (id INT," > create_table.sql
for i in $(seq 1 600)
do
    if [ $i -ne 600 ]; then
        echo "field$i STRING," >> create_table.sql
    else
        echo "field$i STRING" >> create_table.sql
    fi
done
echo ") STORED AS PARQUET" >> create_table.sql

# Execute the SQL to create the table in Hive
hive -e "$(cat create_table.sql)"

# Generate Hive SQL for inserting data
echo "INSERT INTO $TABLE_NAME SELECT s.id," > insert_data.sql
for i in $(seq 1 600)
do
    if [ $i -ne 600 ]; then
        echo "cast(rand() as string) AS field$i," >> insert_data.sql
    else
        echo "cast(rand() as string) AS field$i" >> insert_data.sql
    fi
done
echo "FROM (SELECT posexplode(SPLIT(REPEAT(' ', 2000), ' ')) AS (id, val) FROM 
(SELECT 1) t) s LIMIT 2000;" >> insert_data.sql

# Execute SQL to insert data in Hive
hive -e "$(cat insert_data.sql)"

 {code}
 
h4. Query with Impala
{code:sql}
SELECT * FROM p3 where field1 = '123'; {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to