[
https://issues.apache.org/jira/browse/TRAFODION-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15634120#comment-15634120
]
ASF GitHub Bot commented on TRAFODION-2322:
-------------------------------------------
Github user DaveBirdsall commented on a diff in the pull request:
https://github.com/apache/incubator-trafodion/pull/811#discussion_r86434531
--- Diff: core/sql/ustat/hs_cli.cpp ---
@@ -5442,12 +5427,17 @@ NAString HSSample::getTempTablePartitionInfo(NABoolean unpartitionedSample,
 //
 void HSSample::addTruncatedSelectList(NAString & qry)
 {
+  bool first = true;
   for (Lng32 i = 0; i < objDef->getNumCols(); i++)
     {
-      if (i)
-        qry += ", ";
+      if (!ComTrafReservedColName(*objDef->getColInfo(i).colname))
--- End diff --
This is doable, but it would take significant work. Today UPDATE STATS gets its
column information by preparing SELECT [SYSKEY,] * FROM table_name and then
doing a CLI describe of the prepared statement. The CLI does not currently
report whether a column is a system column, and we probably don't want to add
that, since it would mean enhancing the compiler to produce another synthesized
attribute in the compiled plan.

Instead, I'd propose replacing the prepare/describe scheme with a query against
the metadata COLUMNS table. I think we already have the object UID, so a
single-table query of COLUMNS should be possible. That should easily give us
all the information we get today, plus whether each column is a user or system
column. Alternatively, we may be able to get the column information from the
NATableDB cache.

Either way, I think these approaches are likely to be more efficient than the
existing logic, which would be an added bonus of such a redesign. I will
consider this for follow-on work.
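
For illustration only, here is a minimal standalone sketch of the select-list
logic the diff is moving toward, assuming the column metadata already carried a
"system column" flag. ColumnInfo, isSystem and appendUserColumnList are
hypothetical names invented for this sketch; they are not Trafodion's objDef or
HSSample interfaces, and std::string stands in for NAString. It also shows why
the "first" flag replaces the old "if (i)" test: once columns can be skipped,
column 0 may be one of them, and "if (i)" would then emit a leading comma.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for per-column metadata. The isSystem flag is an
// assumption: the current prepare/describe path does not provide it.
struct ColumnInfo {
    std::string name;
    bool isSystem;
};

// Append a comma-separated list of user column names to qry, skipping
// system columns. A separate "first" flag decides when to emit the
// separator, because the loop index no longer tells us whether anything
// has been written yet.
void appendUserColumnList(const std::vector<ColumnInfo>& cols, std::string& qry)
{
    bool first = true;
    for (const ColumnInfo& col : cols) {
        if (col.isSystem)
            continue;          // skipped columns must not produce a comma
        if (!first)
            qry += ", ";
        qry += col.name;
        first = false;
    }
}
```

With a column list that starts with a system column, the separator is emitted
only between the user columns that are actually written.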
> UPDATE STATS for Hive TPC-H Lineitem table takes much longer now
> ----------------------------------------------------------------
>
> Key: TRAFODION-2322
> URL: https://issues.apache.org/jira/browse/TRAFODION-2322
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-cmp
> Affects Versions: 2.0-incubating
> Environment: All
> Reporter: David Wayne Birdsall
> Assignee: David Wayne Birdsall
>
> When using a LINEITEM table with about 12 million rows, and storing that
> LINEITEM table in Hive files, UPDATE STATISTICS has regressed in its
> performance. On one test system, the elapsed time changed from 6 minutes 20
> seconds to 31 minutes 31 seconds.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)