[
https://issues.apache.org/jira/browse/TRAFODION-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320835#comment-15320835
]
David Wayne Birdsall commented on TRAFODION-2041:
-------------------------------------------------
UPDATE STATISTICS requires an estimate of the row count before it selects an
algorithm to use for computing histograms. If the row count is small, it will
choose a very fast in-memory algorithm. If the row count is larger but the
sample size will fit in memory, it will use a sampling query that reads data
directly into memory. If the sample size will not fit in memory, it creates a
sample table and performs an UPSERT into it, capturing a sample of the data.
It then calculates the histograms a few columns at a time, as many as will fit
in memory.
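The three-way choice above can be sketched as follows. This is an illustrative
model only; the threshold value, parameter names, and method names are my
assumptions, not Trafodion's actual constants.

```java
// Hypothetical sketch of the algorithm selection described above.
// The smallTableThreshold value is illustrative, not Trafodion's real cutoff.
public class HistogramStrategy {
    enum Algorithm { IN_MEMORY, SAMPLING_QUERY, SAMPLE_TABLE }

    static Algorithm choose(long estimatedRows, double sampleRatio,
                            long memoryBudgetBytes, int bytesPerRow) {
        long smallTableThreshold = 1_000_000L;       // illustrative cutoff
        if (estimatedRows <= smallTableThreshold)
            return Algorithm.IN_MEMORY;              // very fast in-memory algorithm
        long sampleBytes =
            (long) (estimatedRows * sampleRatio) * bytesPerRow;
        if (sampleBytes <= memoryBudgetBytes)
            return Algorithm.SAMPLING_QUERY;         // sample read directly into memory
        return Algorithm.SAMPLE_TABLE;               // UPSERT a sample into a sample table
    }
}
```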
If sampling is not chosen, UPDATE STATISTICS uses a select count(*) to obtain
the row count. This is typically done only for smaller tables.
If sampling is chosen, UPDATE STATISTICS uses a Java routine,
HBaseClient.estimateRowCount, which uses server-side HBase code to estimate the
row count. It looks at size information stored in the HFiles, samples a few
rows to estimate how many cells there are per row, and, combining this with
knowledge of the Trafodion metadata, estimates the number of rows.
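The arithmetic behind that kind of estimate can be sketched as below: divide
the total cell count reported by file-level metadata by the average
cells-per-row observed in a small sample. The method and parameter names here
are my assumptions, not the actual HBaseClient.estimateRowCount signature.

```java
// Illustrative arithmetic for an estimateRowCount-style computation.
// totalCells would come from HFile metadata; sampledCells/sampledRows
// from scanning a few rows. Names are hypothetical.
public class RowCountEstimate {
    static long estimate(long totalCells, long sampledCells, long sampledRows) {
        if (sampledRows == 0)
            return 0;                                 // nothing sampled -> no estimate
        double cellsPerRow = (double) sampledCells / sampledRows;
        return Math.round(totalCells / cellsPerRow);  // rows ~ cells / (cells per row)
    }
}
```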
On HDP 2.3.4, however, this routine is failing with a Java OutOfMemory
exception. The exception is not reported to the UPDATE STATISTICS code (the
layers between HBaseClient.estimateRowCount and the UPDATE STATISTICS code
silently discard it); instead, the row estimate returned to UPDATE
STATISTICS is zero. In this case, UPDATE STATISTICS does a select count(*)
instead, since the error from trusting such a small estimate would be
prohibitively large.
Unfortunately, this is happening when in fact the tables are large. And select
count(*) on large tables is a very slow operation that tends to bottleneck all
the RegionServers in a cluster.
Here are the details:
NATable::estimateHBaseRowCount() calls ExpHbaseInterface_JNI::estimateRowCount,
which calls HBaseClient_JNI::estimateRowCount. That function is returning an
error HBC_ERROR_ROWCOUNT_EST_EXCEPTION, which the higher callers ignore. They
just use a zero row count in that case, which triggers the “select count(*)”.
Not unreasonable; it does ultimately give the correct result.
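The fallback the callers perform can be sketched like this; the method name
and return convention are hypothetical, but the behavior matches what is
described above: any JNI-level error surfaces only as a zero estimate, which
routes UPDATE STATISTICS to the exact but slow count.

```java
// Hedged sketch (hypothetical names): a zero estimate from the JNI layer
// is treated as "no estimate", so the caller falls back to an exact
// SELECT COUNT(*) rather than plan against an unreliable number.
public class RowCountFallback {
    static String planRowCountQuery(long jniEstimate, String tableName) {
        if (jniEstimate > 0)
            return "estimate:" + jniEstimate;              // trust the estimate
        return "SELECT COUNT(*) FROM " + tableName;        // estimate failed; count exactly
    }
}
```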
While I was still in the debugger in HBaseClient_JNI::estimateRowCount, though,
I had a look at the exception details that get saved off. (In my stepping, I
observed that we went through this code path:
  if (jenv_->ExceptionCheck())
  {
     getExceptionDetails();
     logError(CAT_SQL_HBASE, __FILE__, __LINE__);
     logError(CAT_SQL_HBASE, "HBaseClient_JNI::estimateRowCount()",
              getLastError());
     jenv_->PopLocalFrame(NULL);
     return HBC_ERROR_ROWCOUNT_EST_EXCEPTION;
  }
Now, by looking at cli_globals->getJniErrorStr(), one can see the exception
details captured.)
Here are the exception details:
(gdb) set print elements 1000
(gdb) p cli_globals->getJniErrorStr()
$25 = {<NABasicObject> = {_vptr.NABasicObject = 0x7f6c2ebaa9d0,
h_ = 0x7f6c1b06f068}, fbstring_ = {static npos = <optimized out>,
store_ = {heap_ = 0x0, {
small_ =
"x\227'\003\000\000\000\000\257\003\000\000\000\000\000\000\360\003\000\000\000\000\000@",
ml_ = {
data_ = 0x3279778 "\njava.lang.OutOfMemoryError: Direct buffer
memory\njava.nio.Bits.reserveMemory(Bits.java:658)\njava.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)\njava.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)\norg.apache.hadoop.hbase.util.ByteBufferArray.<init>(ByteBufferArray.java:65)\norg.apache.hadoop.hbase.io.hfile.bucket.ByteBufferIOEngine.<init>(ByteBufferIOEngine.java:47)\norg.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:307)\norg.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:217)\norg.apache.hadoop.hbase.io.hfile.CacheConfig.getBucketCache(CacheConfig.java:614)\norg.apache.hadoop.hbase.io.hfile.CacheConfig.getL2(CacheConfig.java:553)\norg.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:637)\norg.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:231)\norg.trafodion.sql.HBaseClient.estimateRowCount(HBaseClient.java:1155)\n",
size_ = 943, capacity_ = 4611686018427388912}}}}}
(gdb)
Analyzing the code in this exception call stack suggests that the cause is that
we are creating an HBase off-heap bucket cache, and there is not enough memory
to do so. The choice of what kind of cache to create is determined by the
setting of the hbase.bucketcache.ioengine property in hbase-site.xml.
Evidently, on HDP 2.3.4, this setting is "offheap", while on the other
distributions we support it is "heap". So, the speculation is that it is this
"offheap" setting that causes the problem.
In general, though, this kind of issue raises the question of why we are
calling server-side APIs in client-side logic. It is probably better
architecturally to package this logic in a co-processor, and invoke it in the
server. That seems like the right long-term approach.
For the moment, though, a quick solution is needed. We can add logic to
HBaseClient.estimateRowCount to override the configuration as follows:
  String ioEngine = config.get(BUCKET_CACHE_IOENGINE_KEY, null);
  if (ioEngine != null)
  {
     if (ioEngine.startsWith("offheap"))
     {
        config.set(BUCKET_CACHE_IOENGINE_KEY, "heap");
     }
  }
The constant BUCKET_CACHE_IOENGINE_KEY ("hbase.bucketcache.ioengine") comes
from the HBase module HConstants.java.
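To show the effect of that override in isolation, here is a standalone
version that uses a plain Map in place of the HBase Configuration object, so
it runs without HBase on the classpath; the helper name forceHeapEngine is my
own.

```java
import java.util.HashMap;
import java.util.Map;

// Standalone illustration of the override above. A Map stands in for the
// HBase Configuration; only the "hbase.bucketcache.ioengine" handling is real.
public class IoEngineOverride {
    static final String BUCKET_CACHE_IOENGINE_KEY = "hbase.bucketcache.ioengine";

    static void forceHeapEngine(Map<String, String> config) {
        String ioEngine = config.get(BUCKET_CACHE_IOENGINE_KEY);
        // Only rewrite configurations that request an off-heap bucket cache;
        // "heap", other values, and an unset key are left untouched.
        if (ioEngine != null && ioEngine.startsWith("offheap"))
            config.put(BUCKET_CACHE_IOENGINE_KEY, "heap");
    }
}
```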
> UPDATE STATS uses select count(*) on large tables on HDP 2.3.4
> --------------------------------------------------------------
>
> Key: TRAFODION-2041
> URL: https://issues.apache.org/jira/browse/TRAFODION-2041
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-cmp
> Affects Versions: 2.1-incubating
> Reporter: David Wayne Birdsall
> Assignee: David Wayne Birdsall
> Priority: Critical
> Fix For: 2.1-incubating
>
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)