Hi folks, I've got an Impala CDH 5.14.2 cluster with a handful of users (2-3 at any one time). All of a sudden the JVM inside the Impalad started running out of memory.
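As an aside, before wrestling with the full dump, a class histogram from `jmap -histo <pid>` is a much cheaper first look at which classes dominate the heap. A minimal sketch of sorting such a histogram by instance count (the sample output below is made up for illustration, not from my dump, and `top_classes` is just a name I'm using here):

```python
# Rank classes from `jmap -histo <pid>` output by instance count.
# SAMPLE_HISTO is illustrative, hand-written sample output.
SAMPLE_HISTO = """\
 num     #instances         #bytes  class name
----------------------------------------------
   1:        176836       42440640  org.apache.impala.analysis.Analyzer
   2:         11048        1060608  org.apache.impala.analysis.Analyzer$GlobalState
   3:            13          12480  org.apache.impala.catalog.HdfsTable
"""

def top_classes(histo_text, n=3):
    rows = []
    for line in histo_text.splitlines():
        parts = line.split()
        # Data rows look like: "<rank>:", instances, bytes, class name.
        if len(parts) == 4 and parts[0].rstrip(":").isdigit():
            rows.append((int(parts[1]), int(parts[2]), parts[3]))
    rows.sort(reverse=True)  # most instances first
    return rows[:n]

for count, nbytes, cls in top_classes(SAMPLE_HISTO):
    print(f"{count:>10}  {cls}")
```

With `jmap -histo:live` the JVM first runs a full GC, so the counts reflect only reachable objects.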
I got a heap dump, but the heap was 32GB (the host has 240GB), so it's very large. As a result I wasn't able to get Memory Analyzer Tool (MAT) to open it. I was able to get JHAT to open it by setting JHAT's heap to 160GB, but it's pretty unwieldy, so much of the JHAT functionality doesn't work. I am spelunking around, but am really curious whether there are some places I should check.

I am only an occasional reader of the Impala source, so I am just pointing out things which felt interesting:

* Impalad was restarted shortly before the JVM OOM
* We are joining Parquet on S3 with Kudu
* Only 13 instances of org.apache.impala.catalog.HdfsTable
* 176836 instances of org.apache.impala.analysis.Analyzer - this feels odd to me. I remember a bug a while back in Hive where it would clone the query tree until it ran OOM.
* 176796 of those _user fields point at the same user
* org.apache.impala.thrift.TQueryCtx@0x7f90975297f8 has 11048 org.apache.impala.analysis.Analyzer$GlobalState objects pointing at it.
* There is only a single instance of org.apache.impala.thrift.TQueryCtx alive in the JVM, which appears to indicate there is only a single query running. I've tracked that query down in CM.

The users need to compute stats, but I don't feel that is relevant to this JVM OOM condition.

Any pointers on what I might look for?

Cheers,
Brock
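P.S. In case it helps anyone reproduce the counts: once JHAT has loaded the dump, it serves an OQL page at http://localhost:7000/oql/, and queries along these lines (run one at a time; untested against this particular dump) give the instance counts above:

```
select count(heap.objects("org.apache.impala.analysis.Analyzer"))

select count(heap.objects("org.apache.impala.analysis.Analyzer$GlobalState"))
```

The built-in `referrers(obj)` function can then walk backwards from a specific object (e.g. the lone TQueryCtx) to see what is holding it.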