[ 
https://issues.apache.org/jira/browse/TRAFODION-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955838#comment-15955838
 ] 

ASF GitHub Bot commented on TRAFODION-2575:
-------------------------------------------

GitHub user DaveBirdsall opened a pull request:

    https://github.com/apache/incubator-trafodion/pull/1044

    [TRAFODION-2575] Truncate long strings when computing USTAT aggregates

    This change affects the SELECTs that UPDATE STATISTICS generates when
    computing aggregates on long character and varchar columns. We now
    truncate the string before doing aggregation. (Note: We already do this
    in other code paths in UPDATE STATISTICS, namely internal sorts and
    sample table population. This particular code path was overlooked.)
    
    Besides improving performance, this avoids UPDATE STATISTICS failures
    caused by lingering engine issues with sorts on long varchars.
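
    As a rough illustration of the idea (not the SQL that UPDATE STATISTICS
    actually generates), the sketch below builds a SELECT that groups on a
    truncated prefix of a column and runs it against an in-memory SQLite
    table. The function name, the table and column names, and the limit of
    8 characters are all assumptions made for this example.

```python
import sqlite3

# Hypothetical truncation limit for this sketch; not Trafodion's actual value.
MAX_CHARS = 8


def build_aggregate_query(table: str, column: str, max_chars: int = MAX_CHARS) -> str:
    """Build a SELECT that aggregates on a truncated prefix of `column`."""
    truncated = f"SUBSTR({column}, 1, {max_chars})"
    return (
        f"SELECT {truncated} AS val, COUNT(*) AS freq "
        f"FROM {table} GROUP BY {truncated} ORDER BY val"
    )


def demo() -> list:
    """Run the generated query on a tiny table of long strings."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (c TEXT)")
    rows = [("a" * 100,), ("a" * 200,), ("b" * 50,)]
    conn.executemany("INSERT INTO t VALUES (?)", rows)
    result = conn.execute(build_aggregate_query("t", "c")).fetchall()
    conn.close()
    return result
```

    Note that the two strings of 'a' characters differ in full length but
    fall into the same group once truncated, so the sort and the grouping
    only ever compare short values.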

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/DaveBirdsall/incubator-trafodion Trafodion2575

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-trafodion/pull/1044.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1044
    
----
commit ca7c9c589eb01497ed7b24a83da812a75361999f
Author: Dave Birdsall <[email protected]>
Date:   2017-04-04T21:14:47Z

    [TRAFODION-2575] Truncate long strings when computing USTAT aggregates

----


> UPDATE STATS sometimes fails on very long varchars
> --------------------------------------------------
>
>                 Key: TRAFODION-2575
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2575
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>    Affects Versions: 2.2-incubating
>            Reporter: David Wayne Birdsall
>            Assignee: David Wayne Birdsall
>
> One symptom is 6003 or 6004 warnings after UPDATE STATISTICS completes. 
> That is, UPDATE STATISTICS seems to succeed, but when repeated it fails with 
> error 9200, along with 6003 or 6004 warnings. These warnings indicate that 
> the histogram interval values are not in order. In one example that I 
> debugged, the problem was that we used a SELECT statement to compute 
> aggregates on the column, and the plan for that SELECT statement did not sort 
> the results, even though an ORDER BY was specified. While debugging that, I 
> ran into several other issues with sorts on long varchars. Concluding that 
> the engine is simply not ready for prime time on such sorts (and seeing that 
> such sorts are unneeded anyway), I plan to change UPDATE STATISTICS so that 
> the SELECT it generates first truncates the string to a shorter length.
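
The "interval values are not in order" condition described above can be
illustrated with a small check. This is only a sketch: the function name is
hypothetical, it is not Trafodion code, and it compares values with Python's
default string ordering rather than any SQL collation.

```python
from typing import List, Sequence, Tuple


def out_of_order_boundaries(boundaries: Sequence[str]) -> List[Tuple[int, str, str]]:
    """Return (index, previous, current) for each boundary that sorts
    before its predecessor."""
    problems = []
    for i in range(1, len(boundaries)):
        if boundaries[i] < boundaries[i - 1]:
            problems.append((i, boundaries[i - 1], boundaries[i]))
    return problems
```

An empty result means the boundaries are non-decreasing. Any entry points at
a pair of adjacent values that an unsorted result set could have produced.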



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
