[ https://issues.apache.org/jira/browse/TRAFODION-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839020#comment-15839020 ]
ASF GitHub Bot commented on TRAFODION-2455:
-------------------------------------------
Github user selvaganesang commented on a diff in the pull request:
https://github.com/apache/incubator-trafodion/pull/929#discussion_r97919059
--- Diff: core/sql/executor/HBaseClient_JNI.cpp ---
@@ -1319,17 +1334,21 @@ HBC_RetCode HBaseClient_JNI::estimateRowCount(const char* tblName,
jint jPartialRowSize = partialRowSize;
jint jNumCols = numCols;
+ jint jRetryLimitMilliSeconds = retryLimitMilliSeconds;
jlongArray jRowCount = jenv_->NewLongArray(1);
tsRecentJMFromJNI = JavaMethods_[JM_EST_RC].jm_full_name;
jboolean jresult = jenv_->CallBooleanMethod(javaObj_,
JavaMethods_[JM_EST_RC].methodID,
js_tblName, jPartialRowSize,
- jNumCols, jRowCount);
+ jNumCols, jRetryLimitMilliSeconds, jRowCount);
jboolean isCopy;
jlong* arrayElems = jenv_->GetLongArrayElements(jRowCount, &isCopy);
rowCount = *arrayElems;
if (isCopy == JNI_TRUE)
jenv_->ReleaseLongArrayElements(jRowCount, arrayElems, JNI_ABORT);
+ jenv_->DeleteLocalRef(js_tblName);
--- End diff --
PopLocalFrame would do this for you. Again, I cleaned up this code earlier to
remove the unnecessary calls to DeleteLocalRef wherever push/pop local frame is used.
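
To illustrate the point about local frames: every local reference created after PushLocalFrame is released in one step by the matching PopLocalFrame, so a per-reference DeleteLocalRef adds nothing. The sketch below is not the Trafodion code. FakeEnv, NewLocalRef, liveRefs, LocalFrameGuard and callWithFrame are hypothetical names, and FakeEnv is only a stand-in for JNIEnv that counts references so the example runs without a JVM.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for JNIEnv (assumption: it only models reference
// counting per local frame, nothing else about the real JNI environment).
struct FakeEnv {
  std::vector<int> frames{0};  // frames.back() = refs in the current frame

  int PushLocalFrame(int /*capacity*/) { frames.push_back(0); return 0; }
  void* PopLocalFrame(void* result) { frames.pop_back(); return result; }
  void NewLocalRef() { ++frames.back(); }  // models NewStringUTF, NewLongArray, ...
  int liveRefs() const {
    int n = 0;
    for (int f : frames) n += f;
    return n;
  }
};

// RAII guard: pushes a local frame on construction and pops it on
// destruction. Templated so the same shape could wrap a real JNIEnv,
// whose PushLocalFrame/PopLocalFrame calls have compatible forms.
template <typename Env>
class LocalFrameGuard {
 public:
  LocalFrameGuard(Env* env, int capacity)
      : env_(env), pushed_(env->PushLocalFrame(capacity) == 0) {}
  ~LocalFrameGuard() {
    if (pushed_) env_->PopLocalFrame(nullptr);
  }
  LocalFrameGuard(const LocalFrameGuard&) = delete;
  LocalFrameGuard& operator=(const LocalFrameGuard&) = delete;
  bool ok() const { return pushed_; }

 private:
  Env* env_;
  bool pushed_;
};

// Models a JNI wrapper method that creates two local references and
// deletes neither explicitly; the guard's destructor releases both.
int callWithFrame(FakeEnv& env) {
  LocalFrameGuard<FakeEnv> frame(&env, 16);
  if (!frame.ok()) return -1;
  env.NewLocalRef();      // e.g. the table-name string
  env.NewLocalRef();      // e.g. the row-count array
  return env.liveRefs();  // references alive while the frame is open
}
```

Once callWithFrame returns, the frame has been popped and no references from the call remain, which is why the added DeleteLocalRef in the diff is redundant when the method already runs inside a pushed local frame.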
> Initial Update Stats on 22B row 2.5TB OE table gets 0 rowcount from
> estimator, fails with timeouts by doing select count (*)
> ----------------------------------------------------------------------------------------------------------------------------
>
> Key: TRAFODION-2455
> URL: https://issues.apache.org/jira/browse/TRAFODION-2455
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-cmp
> Affects Versions: 2.1-incubating
> Environment: A cluster large enough to host a 22 billion row table
> Reporter: David Wayne Birdsall
> Assignee: David Wayne Birdsall
>
> When loading a scale factor 73728 Order Entry database, if UPDATE STATISTICS
> is done soon after the load on one particular table (the largest table,
> having 22 billion rows), we get the following failure:
> SQLEXCEPTION on Statement, Error Code = -9200
> update statistics for table trafodion.javabench.oe_orderline_73728 on
> every column, (OL_W_ID, OL_I_ID), (OL_D_ID, OL_W_ID), (OL_D_ID, OL_I_ID)
> sample
> *** ERROR[9200] UPDATE STATISTICS for table
> TRAFODION.JAVABENCH.OE_ORDERLINE_73728 encountered an error (8448) from
> statement getRow(). [2017-01-09 02:07:22]
> *** ERROR[8448] Unable to access Hbase interface. Call to
> ExpHbaseInterface::coProcAggr returned error HBASE_ACCESS_ERROR(-706). Cause:
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
> attempts=3, exceptions:
> Mon Jan 09 01:47:21 PST 2017,
> RpcRetryingCaller{globalStartTime=1483954641419, pause=100, retries=3},
> java.io.IOException: Call to nap015.esgyn.local/10.1.10.20:60020 failed on
> local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call
> id=73, waitTime=600001, operationTimeout=600000 expired.
> Mon Jan 09 01:57:21 PST 2017,
> RpcRetryingCaller{globalStartTime=1483954641419, pause=100, retries=3},
> java.io.IOException: Call to nap015.esgyn.local/10.1.10.20:60020 failed on
> local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call
> id=185, waitTime=600001, operationTimeout=600000 expired.
> Mon Jan 09 02:07:22 PST 2017,
> RpcRetryingCaller{globalStartTime=1483954641419, pause=100, retries=3},
> java.io.IOException: Call to nap015.esgyn.local/10.1.10.20:60020 failed on
> local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call
> id=310, waitTime=600001, operationTimeout=600000 expired.
> A subsequent update statistics command succeeds, but these failures take a
> half hour or more.
> Enabling logging for update stats shows that getrowcount returns 0, so update
> stats assumes the table is small enough to do a select count (*). The plan
> for this select count (*) (perhaps suffering from the same issue that causes
> getrowcount to return a non-estimate) chooses the HBase aggregate
> coprocessor. The table in question has 22 billion rows, so the
> coprocessor isn't a good choice, and the query times out. But the real issue
> is: why can't the table get a rowcount estimate?
> Rerunning UPDATE STATS on this table a few hours later succeeds.