Quanlong Huang created IMPALA-12855:
---------------------------------------

             Summary: NullPointerException in firing RELOAD events if the partition is just dropped
                 Key: IMPALA-12855
                 URL: https://issues.apache.org/jira/browse/IMPALA-12855
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
            Reporter: Quanlong Huang


REFRESH <table> PARTITION can fail while firing RELOAD events (when 
--enable_reload_events=true) if the partition is dropped by a concurrent DDL. 
The failure is a NullPointerException:
{noformat}
E0229 15:04:25.578933  7381 JniUtil.java:183] 824a23c46a6f71de:78a2f3dc00000000] Error in REFRESH TABLE default.part_tbl PARTITIONS issued by quanlong. Time spent: 1s061ms
I0229 15:04:25.579373  7381 jni-util.cc:302] 824a23c46a6f71de:78a2f3dc00000000] java.lang.NullPointerException
        at org.apache.impala.catalog.HdfsPartition.access$500(HdfsPartition.java:101)
        at org.apache.impala.catalog.HdfsPartition$Builder.<init>(HdfsPartition.java:1314)
        at org.apache.impala.service.CatalogOpExecutor.fireReloadEventAndUpdateRefreshEventId(CatalogOpExecutor.java:6810)
        at org.apache.impala.service.CatalogOpExecutor.execResetMetadataImpl(CatalogOpExecutor.java:6744)
        at org.apache.impala.service.CatalogOpExecutor.execResetMetadata(CatalogOpExecutor.java:6612)
        at org.apache.impala.service.JniCatalog.lambda$resetMetadata$4(JniCatalog.java:327)
        at org.apache.impala.service.JniCatalogOp.lambda$execAndSerialize$1(JniCatalogOp.java:90)
        at org.apache.impala.service.JniCatalogOp.execOp(JniCatalogOp.java:58)
        at org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:89)
        at org.apache.impala.service.JniCatalogOp.execAndSerialize(JniCatalogOp.java:100)
        at org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:243)
        at org.apache.impala.service.JniCatalog.execAndSerialize(JniCatalog.java:257)
        at org.apache.impala.service.JniCatalog.resetMetadata(JniCatalog.java:326){noformat}
The problem is that execResetMetadataImpl() does not hold the table lock for 
the whole operation. It takes the lock while reloading the metadata, releases 
it, and takes it again when it needs to fire RELOAD events. Between those two 
critical sections, a concurrent DDL can drop the partition. Firing the RELOAD 
event then fails because the partition can no longer be found.
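
To make the window concrete, here is a minimal, self-contained sketch of the same locking pattern. It is not Impala's code: the class ReloadEventRaceSketch, its Table type, and the refreshPartition() helper with its betweenPhases hook and guard flag are all hypothetical names invented for illustration. The hook stands in for a concurrent DDL that runs after the first lock is released, and the guard shows one possible way to tolerate the missing partition (an assumption, not necessarily the fix chosen for this issue).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.locks.ReentrantLock;

public class ReloadEventRaceSketch {

    /** Hypothetical stand-in for a catalog table; not Impala's real class. */
    public static final class Table {
        private final ReentrantLock lock = new ReentrantLock();
        // Partition name -> location. Purely illustrative metadata.
        private final Map<String, String> partitions = new HashMap<>();

        public void addPartition(String name, String location) {
            lock.lock();
            try { partitions.put(name, location); } finally { lock.unlock(); }
        }

        public void dropPartition(String name) {
            lock.lock();
            try { partitions.remove(name); } finally { lock.unlock(); }
        }
    }

    /**
     * Mimics a refresh that locks twice instead of once.
     * betweenPhases runs while no lock is held, like a concurrent DDL would.
     * Returns the "fired" event text, or empty if the event was skipped.
     */
    public static Optional<String> refreshPartition(
            Table table, String name, Runnable betweenPhases, boolean guard) {
        // Phase 1: "reload" the partition metadata under the table lock.
        table.lock.lock();
        try {
            // Placeholder for the reload; it leaves the entry unchanged.
            table.partitions.computeIfPresent(name, (k, location) -> location);
        } finally {
            table.lock.unlock();
        }

        // The lock is released here, so the partition may disappear.
        betweenPhases.run();

        // Phase 2: re-acquire the lock and build the event from the partition.
        table.lock.lock();
        try {
            String location = table.partitions.get(name);
            if (guard && location == null) {
                // Partition was dropped in the gap: skip the event.
                return Optional.empty();
            }
            // Without the guard, a dropped partition causes an NPE here.
            return Optional.of("RELOAD " + name + " at " + location.trim());
        } finally {
            table.lock.unlock();
        }
    }

    public static void main(String[] args) {
        Table table = new Table();
        table.addPartition("p=1", "/warehouse/part_tbl/p=1");
        // Simulate DROP PARTITION landing between the two locked phases.
        System.out.println(
            refreshPartition(table, "p=1", () -> table.dropPartition("p=1"), true));
    }
}
```

With guard set to false and the same hook, the second phase dereferences a null partition, which mirrors the failure mode in the stack trace. Holding the lock across both phases would be another way to close the window; which approach suits the real code depends on constraints not covered in this sketch.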

*Reproducing the issue*

To reproduce the issue, start catalogd with --enable_reload_events=true
{code:bash}
bin/start-impala-cluster.py --catalogd_args="--enable_reload_events=true"{code}
Create a partitioned table
{code:sql}
create table part_tbl (i int) partitioned by (p int);{code}
Run a loop to ADD+DROP partition on this table
{code:bash}
while true; do
  impala-shell.sh --quiet -B -q "ALTER TABLE part_tbl ADD PARTITION (p=1); ALTER TABLE part_tbl DROP PARTITION (p=1);" > /dev/null
done{code}
In another session, run a loop to REFRESH the partition
{code:bash}
while true; do
  impala-shell.sh --quiet -B -q "REFRESH part_tbl PARTITION (p=1)" > /dev/null
done{code}
After a while, some REFRESH statements fail:
{noformat}
Could not execute command: REFRESH part_tbl PARTITION (p=1)
ERROR: NullPointerException: null{noformat}


