[
https://issues.apache.org/jira/browse/TRAFODION-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958120#comment-15958120
]
ASF GitHub Bot commented on TRAFODION-2576:
-------------------------------------------
Github user zellerh commented on a diff in the pull request:
https://github.com/apache/incubator-trafodion/pull/1046#discussion_r110060012
--- Diff: core/sql/ustat/hs_globals.cpp ---
@@ -1882,6 +1882,55 @@ NABoolean HSColumnStruct::operator==(const HSColumnStruct& other) const
return ( colnum == other.colnum );
}
+//
+// METHOD: addTruncatedColumnReference()
+//
+// PURPOSE: Generates a column reference or a SUBSTRING
+// on a column reference which truncates the
+// column to the maximum length allowed in
+// UPDATE STATISTICS.
+//
+// INPUT: 'qry' - the SQL query string to append the
+// reference to. Datatype info about the
+// column comes from this HSColumnStruct
+// object itself.
+//
+void HSColumnStruct::addTruncatedColumnReference(NAString & qry)
+ {
+ HSGlobalsClass *hs_globals = GetHSContext();
+ Lng32 maxLengthInBytes = hs_globals->maxCharColumnLengthInBytes;
+ bool isOverSized = DFS2REC::isAnyCharacter(datatype) &&
+ (length > maxLengthInBytes);
+ if (isOverSized)
+ {
+ // Note: The result data type of SUBSTRING is VARCHAR, always.
+ // But if the column is CHAR, many places in the ustat code are not
+ // expecting a VARCHAR. So, we stick a CAST around it to convert
+ // it back to a CHAR in these cases.
+
+ NABoolean isFixedChar = DFS2REC::isSQLFixedChar(datatype);
+ if (isFixedChar)
+ qry += "CAST(";
+ qry += "SUBSTRING(";
+ qry += externalColumnName->data();
+ qry += " FOR ";
+
+ char temp[20]; // big enough for "nnnnnn)"
+ sprintf(temp,"%d)", maxLengthInBytes / CharInfo::maxBytesPerChar(charset));
+ qry += temp;
+ if (isFixedChar)
+ {
+ qry += " AS CHAR(";
+ qry += temp;
+ qry += ")";
--- End diff ---
... thinking about this again, we don't have a SUBSTRING method that will
give the first n _bytes_ of a UTF-8 string, truncated at a UTF-8 character
boundary. So, without such a function, we can't really do what I suggested
above.
If we have a long UTF-8 column c and maxLengthInBytes is 256, then we will
take SUBSTRING(c FOR 64) here (256 bytes / 4 max bytes per UTF-8 character).
In many situations and languages, this will capture only about 64 bytes of
information, not 256.
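The missing piece described above -- truncating a UTF-8 string to at most n
bytes without splitting a multi-byte character -- can be sketched in a few
lines. This is an illustrative standalone function, not Trafodion code;
the name `truncateUtf8AtBytes` and its std::string signature are hypothetical:

```cpp
#include <cstddef>
#include <string>

// Return the longest prefix of 'utf8' that is at most 'maxBytes' bytes and
// does not end in the middle of a multi-byte UTF-8 character. UTF-8
// continuation bytes have the form 10xxxxxx, so if the cut point lands on
// one, we back up to the previous character boundary.
std::string truncateUtf8AtBytes(const std::string &utf8, std::size_t maxBytes)
{
  if (utf8.size() <= maxBytes)
    return utf8;                       // already short enough
  std::size_t cut = maxBytes;
  // Back up while the byte at 'cut' is a continuation byte (0b10xxxxxx),
  // i.e. cutting here would split a character.
  while (cut > 0 &&
         (static_cast<unsigned char>(utf8[cut]) & 0xC0) == 0x80)
    --cut;
  return utf8.substr(0, cut);
}
```

With a helper like this, the generated expression could keep 256 bytes of a
UTF-8 column instead of only 64 characters.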
> Incremental UPDATE STATS fails on long varchar values
> -----------------------------------------------------
>
> Key: TRAFODION-2576
> URL: https://issues.apache.org/jira/browse/TRAFODION-2576
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-cmp
> Environment: All
> Reporter: David Wayne Birdsall
> Assignee: David Wayne Birdsall
>
> The following test script demonstrates the problem:
> -- This script creates a table with varchar values longer than 256 characters.
> -- It then demonstrates issues with incremental stats on such values.
> ?section setup
> drop table if exists testbigcol;
> create table testbigcol
> ( a int not null,
> b varchar(500),
> primary key (a) )
> salt using 4 partitions;
> insert into testbigcol values (1,'axq'),(2,'bxw');
> insert into testbigcol select x.a + 2 * y.a, x.b || y.b from testbigcol x
> cross join testbigcol y;
> insert into testbigcol select x.a + 6 * y.a, x.b || y.b from testbigcol x
> cross join testbigcol y;
> insert into testbigcol select x.a + 42 * y.a, x.b || y.b from testbigcol x
> cross join testbigcol y;
> -- 0         1         2         3         4         5
> -- 012345678901234567890123456789012345678901234567890123
> update testbigcol set b = b || ' and now for something completely different 0123456789'
> where mod(a,2) = 1;
> update testbigcol set b = b || ' that was fun, and now for something completely different 0123456789'
> where mod(a,3) = 1;
> update testbigcol set b = b || ' and then there was a great hue and cry and now for something completely different 0123456789'
> where mod(a,5) = 1;
> update testbigcol set b = b || ' and we can do all sorts of things to make these strings longer, and now for something completely different 0123456789'
> where mod(a,7) = 1;
> ?section testit
> update statistics for table testbigcol on every column sample random 100 percent persistent;
> showstats for table testbigcol on existing columns detail;
> update testbigcol set b = 'c' || b where a > 1700;
> update statistics for table testbigcol on existing columns incremental where a > 1700;
> showstats for table testbigcol on existing columns detail;
> When run, the last UPDATE STATISTICS command fails as follows:
> >>update statistics for table testbigcol on existing columns incremental where a > 1700;
> *** ERROR[9200] UPDATE STATISTICS for table TRAFODION.SEABASE.TESTBIGCOL encountered an error (8402) from statement IUS data set I creation.
> *** ERROR[8402] A string overflow occurred during the evaluation of a character expression. Conversion of Source Type:VARCHAR(REC_BYTE_V_ASCII,290 BYTES,ISO88591) Source Value:caxqbxwbxwaxqaxqbxwbxwbxw and now for something completely different 0123456789 and then there was a great hue and cry and now for something completely different 0123456789 and we can do all sorts of to Target Type:VARCHAR(REC_BYTE_V_ASCII,256 BYTES,ISO88591).
> --- SQL operation failed with errors.
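The patch under review addresses this overflow by wrapping oversized
character columns in SUBSTRING (plus a CAST back to CHAR for fixed-length
columns). A minimal standalone sketch of that string-building logic, with
std::string standing in for NAString and plain parameters standing in for
hs_globals->maxCharColumnLengthInBytes and CharInfo::maxBytesPerChar(charset):

```cpp
#include <cstdio>
#include <string>

// Sketch of the query-building logic from addTruncatedColumnReference().
// 'maxLengthInBytes' models the UPDATE STATISTICS cap (256 in the repro)
// and 'bytesPerChar' models the charset's maximum bytes per character
// (1 for ISO88591, 4 for UTF-8).
std::string truncatedColumnReference(const std::string &colName,
                                     bool isFixedChar,
                                     int maxLengthInBytes,
                                     int bytesPerChar)
{
  std::string qry;
  if (isFixedChar)
    qry += "CAST(";          // SUBSTRING yields VARCHAR; cast back to CHAR
  qry += "SUBSTRING(";
  qry += colName;
  qry += " FOR ";

  char temp[20];             // big enough for "nnnnnn)"
  std::snprintf(temp, sizeof(temp), "%d)", maxLengthInBytes / bytesPerChar);
  qry += temp;               // e.g. "256)" -- closes the SUBSTRING
  if (isFixedChar)
  {
    qry += " AS CHAR(";
    qry += temp;             // "nnn)" again -- closes the CHAR(...)
    qry += ")";              // closes the CAST
  }
  return qry;
}
```

For the ISO88591 VARCHAR(500) column B in the repro this yields
SUBSTRING(B FOR 256), capping the value at the 256-byte target type; for a
fixed CHAR column it yields CAST(SUBSTRING(B FOR 256) AS CHAR(256)). The
64-character result for UTF-8 (256 / 4) is exactly the information loss the
review comment above points out.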
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)