[ 
https://issues.apache.org/jira/browse/IMPALA-9887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17145815#comment-17145815
 ] 

Joe McDonnell edited comment on IMPALA-9887 at 6/25/20, 8:55 PM:
-----------------------------------------------------------------

I did a test run that split EE test execution into 5 shards (1 serial + 4 
parallel) and restarted Impala between the shards. With this setup, the EE 
tests run in about 5 hours. They took about 12 hours prior to the current 
troubles. So, clearly, things get much slower over time when Impala is not 
restarted. We could use this sharding for ASAN runs.
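
As a rough illustration of the sharding idea (not the actual change to the test scripts), the sketch below splits a list of test files into shards and restarts the cluster between shards. The names `split_into_shards`, `run_sharded`, `run_shard`, and `restart_cluster` are hypothetical, and the two callbacks stand in for whatever the test harness really uses to run tests and restart Impala.

```python
def split_into_shards(items, num_shards):
    """Split items round-robin into num_shards lists."""
    return [items[i::num_shards] for i in range(num_shards)]


def run_sharded(test_files, num_shards, run_shard, restart_cluster):
    """Run each shard in turn, restarting the cluster between shards.

    run_shard and restart_cluster are caller-supplied callables
    (hypothetical placeholders for the real harness hooks).
    """
    results = []
    for index, shard in enumerate(split_into_shards(test_files, num_shards)):
        if index > 0:
            # Start every shard after the first against a fresh cluster.
            restart_cluster()
        results.append(run_shard(shard))
    return results
```

A serial shard, as in the run described above, could be handled by calling `run_shard` once on the serial tests before passing the parallel tests to `run_sharded`.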

It sounds like Chrome ran into a similar issue with V8 JIT compilation:

[https://gn.googlesource.com/gn/+/e375226345b2be2bfc8c4549702ee3e437b5c136/build/sanitizers/sanitizer_options.cc#33]

[https://github.com/google/sanitizers/issues/177]

They use malloc_context_size=5, but that doesn't help us (the degradation is 
still there). The problem with setting malloc_context_size even lower is that 
the shorter allocation stack traces make the errors less useful. The 
degradation still happens with malloc_context_size=3, but it doesn't happen 
with malloc_context_size=2.
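
For anyone who wants to try these values, here is a minimal sketch of setting malloc_context_size through the ASAN_OPTIONS environment variable before launching a process. The helper name `with_asan_option` is hypothetical, and the sketch assumes the options are colon-separated and that no option value itself contains a colon.

```python
def with_asan_option(env, key, value):
    """Return a copy of env with key=value set in ASAN_OPTIONS.

    Any existing setting of the same key is replaced. Assumes
    colon-separated options whose values contain no colons.
    """
    options = [o for o in env.get("ASAN_OPTIONS", "").split(":") if o]
    # Drop a previous setting of the same key, keep everything else.
    options = [o for o in options if o.split("=", 1)[0] != key]
    options.append(f"{key}={value}")
    new_env = dict(env)
    new_env["ASAN_OPTIONS"] = ":".join(options)
    return new_env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run`, for example `with_asan_option(dict(os.environ), "malloc_context_size", 2)`.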


> ASAN builds timeout frequently
> ------------------------------
>
>                 Key: IMPALA-9887
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9887
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 4.0
>            Reporter: Vihang Karajgaonkar
>            Priority: Blocker
>              Labels: broken-build
>
> It has happened at least a couple of times this week on the ASAN builds. The 
> custom cluster tests fail with a test setup error, and the logs suggest that 
> coordinator nodes don't start up due to the following exception trace:
> {noformat}
> F0623 17:22:42.725920 25786 frontend.cc:136] IllegalStateException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
> CAUSED BY: RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
> CAUSED BY: InvocationTargetException: null
> CAUSED BY: MetaException: Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
>         at org.apache.thrift.transport.TSocket.open(TSocket.java:226)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:631)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:241)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>         at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:84)
>         at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:95)
>         at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:148)
>         at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:119)
>         at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:112)
>         at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:99)
>         at org.apache.impala.catalog.MetaStoreClientPool$MetaStoreClient.<init>(MetaStoreClientPool.java:78)
>         at org.apache.impala.catalog.MetaStoreClientPool.initClients(MetaStoreClientPool.java:174)
>         at org.apache.impala.catalog.MetaStoreClientPool.<init>(MetaStoreClientPool.java:163)
>         at org.apache.impala.catalog.MetaStoreClientPool.<init>(MetaStoreClientPool.java:155)
>         at org.apache.impala.service.Frontend.<init>(Frontend.java:331)
>         at org.apache.impala.service.Frontend.<init>(Frontend.java:288)
>         at org.apache.impala.service.JniFrontend.<init>(JniFrontend.java:144)
> Caused by: java.net.ConnectException: Connection refused (Connection refused)
>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>         at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>         at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>         at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>         at java.net.Socket.connect(Socket.java:589)
>         at org.apache.thrift.transport.TSocket.open(TSocket.java:221)
>         ... 19 more
> . Impalad exiting.
> {noformat}


