[ 
https://issues.apache.org/jira/browse/IMPALA-8243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bharath v updated IMPALA-8243:
------------------------------
    Description: 
The following is the full stack trace from the Catalog server logs.

{noformat}

14:09:29.474424 14829 jni-util.cc:256] java.util.ConcurrentModificationException
java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
java.util.ArrayList$Itr.next(ArrayList.java:851)
org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1449)
org.apache.hadoop.hive.metastore.api.StorageDescriptor$StorageDescriptorStandardScheme.write(StorageDescriptor.java:1278)
org.apache.hadoop.hive.metastore.api.StorageDescriptor.write(StorageDescriptor.java:1144)
org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:1062)
org.apache.hadoop.hive.metastore.api.Partition$PartitionStandardScheme.write(Partition.java:919)
org.apache.hadoop.hive.metastore.api.Partition.write(Partition.java:815)
org.apache.impala.thrift.TPartialPartitionInfo$TPartialPartitionInfoStandardScheme.write(TPartialPartitionInfo.java:862)
org.apache.impala.thrift.TPartialPartitionInfo$TPartialPartitionInfoStandardScheme.write(TPartialPartitionInfo.java:759)
org.apache.impala.thrift.TPartialPartitionInfo.write(TPartialPartitionInfo.java:665)
org.apache.impala.thrift.TPartialTableInfo$TPartialTableInfoStandardScheme.write(TPartialTableInfo.java:731)
org.apache.impala.thrift.TPartialTableInfo$TPartialTableInfoStandardScheme.write(TPartialTableInfo.java:624)
org.apache.impala.thrift.TPartialTableInfo.write(TPartialTableInfo.java:543)
org.apache.impala.thrift.TGetPartialCatalogObjectResponse$TGetPartialCatalogObjectResponseStandardScheme.write(TGetPartialCatalogObjectResponse.java:977)
org.apache.impala.thrift.TGetPartialCatalogObjectResponse$TGetPartialCatalogObjectResponseStandardScheme.write(TGetPartialCatalogObjectResponse.java:857)
org.apache.impala.thrift.TGetPartialCatalogObjectResponse.write(TGetPartialCatalogObjectResponse.java:739)
org.apache.thrift.TSerializer.serialize(TSerializer.java:79)
org.apache.impala.service.JniCatalog.getPartialCatalogObject(JniCatalog.java:233)
{noformat}

It looks like the bug is in the following piece of code.

{noformat}
/**
   * Returns a Hive-compatible partition object that may be used in calls to the
   * metastore.
   */
  public org.apache.hadoop.hive.metastore.api.Partition toHmsPartition() {
    if (cachedMsPartitionDescriptor_ == null) return null;
    Preconditions.checkNotNull(table_.getNonPartitionFieldSchemas());
    // Update the serde library class based on the currently used file format.
    org.apache.hadoop.hive.metastore.api.StorageDescriptor storageDescriptor =
        new org.apache.hadoop.hive.metastore.api.StorageDescriptor(
            table_.getNonPartitionFieldSchemas(),  <===== Reference to the actual field schema list.
            getLocation(),
            cachedMsPartitionDescriptor_.sdInputFormat,
            cachedMsPartitionDescriptor_.sdOutputFormat,
            cachedMsPartitionDescriptor_.sdCompressed,
{noformat}

It appears we are leaking a reference to {{nonPartFieldSchemas_}} into the 
thrift object. Once the thread leaves the lock scope, another thread 
(e.g. load()) can modify the source list, and the serialization 
code can then throw {{ConcurrentModificationException}}.

While the stack above is Catalog-v2 only, other threads could race in a 
similar fashion.
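The pattern is easy to reproduce in isolation. The sketch below (plain Java; the class and variable names are illustrative, not Impala's) shows both the failure mode and the defensive-copy fix:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class LeakedListDemo {
    // Returns true if mutating the list mid-iteration triggers a CME,
    // mirroring what the Thrift serializer hit in the stack above.
    static boolean mutateDuringIteration(List<String> schemas) {
        try {
            for (String ignored : schemas) {
                schemas.add("extraCol"); // simulates a concurrent load()
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> schemas = new ArrayList<>(List.of("col1", "col2"));

        // Leaked reference: the fail-fast iterator sees the mutation and throws.
        System.out.println(mutateDuringIteration(schemas)); // true

        // Fix: iterate a defensive copy; mutations of the original source
        // list no longer disturb the copy's iterator.
        List<String> copy = new ArrayList<>(schemas);
        boolean threw = false;
        try {
            for (String ignored : copy) {
                schemas.add("anotherCol"); // mutating the source, not the copy
            }
        } catch (ConcurrentModificationException e) {
            threw = true;
        }
        System.out.println(threw); // false: the copy is isolated
    }
}
```

Accordingly, one plausible fix here is to hand the {{StorageDescriptor}} a copy (or an immutable snapshot) of {{nonPartFieldSchemas_}} rather than the live list.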

> ConcurrentModificationException in Catalog stress tests
> -------------------------------------------------------
>
>                 Key: IMPALA-8243
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8243
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 3.1.0
>            Reporter: bharath v
>            Assignee: bharath v
>            Priority: Major
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
