[ 
https://issues.apache.org/jira/browse/CALCITE-2572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16621132#comment-16621132
 ] 

Julian Hyde commented on CALCITE-2572:
--------------------------------------

Here's actual text from the standard:

{quote}
<character substring function> ::=
  SUBSTRING <left paren> <character value expression> FROM <start position>
      [ FOR <string length> ] [ USING <char length units> ] <right paren>

If <character substring function> is specified, then:
a) If the character encoding form of <character value expression> is UTF8, 
UTF16, or UTF32, then, in the remainder of this General Rule, the term 
“character” shall be taken to mean “unit specified by <char length units>”.
b) Let C be the value of the <character value expression>, let LC be the length 
in characters of C, and let S be the value of the <start position>.
c) If <string length> is specified, then let L be the value of <string length> 
and let E be S+L. Otherwise, let E be the larger of LC + 1 and S.
d) If at least one of C, S, and L is the null value, then the result of the 
<character substring function> is the null value.
e) If E is less than S, then an exception condition is raised: data exception — 
substring error.
f) Case:
  i) If S is greater than LC or if E is less than 1 (one), then the result of 
the <character substring function> is a zero-length string.
  ii) Otherwise,
    1) Let S1 be the larger of S and 1 (one). Let E1 be the smaller of E and 
LC+1. Let L1 be E1–S1.
    2) The result of the <character substring function> is a character string 
containing the L1 characters of C starting at character number S1 in the same 
order that the characters appear in C.
{quote}

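The clause "Let S1 be the larger of S and 1 (one)" proves your assertion: a start position less than 1 is valid, and extraction begins at character 1. Note that E is still computed from the original S, so SUBSTRING('abc' FROM 0 FOR 2) yields 'a', not 'ab'.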

Can you please write some tests, perhaps in 
{{SqlOperatorTest.testSubstringFunc}}? Cover the corner cases, and include at 
least one test that raises "substring error".
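
To help work out expected values for those corner cases, here is a minimal Java sketch of the General Rule quoted above. It is not Calcite's implementation. The class and method names ({{SubstringRule.substring}}, {{slice}}) are hypothetical, "character" is assumed to mean a Java char (a UTF-16 code unit), and an IllegalArgumentException stands in for the "data exception: substring error" condition.

```java
/** Hypothetical helper that follows rules b) through f) of the quoted General Rule. */
final class SubstringRule {
  private SubstringRule() {}

  /** SUBSTRING(c FROM s): no FOR clause, so E is the larger of LC + 1 and S (rule c). */
  static String substring(String c, Integer s) {
    if (c == null || s == null) {
      return null; // rule d: any null operand gives a null result
    }
    long lc = c.length();
    return slice(c, s, Math.max(lc + 1, s));
  }

  /** SUBSTRING(c FROM s FOR l): E is S + L (rule c). */
  static String substring(String c, Integer s, Integer l) {
    if (c == null || s == null || l == null) {
      return null; // rule d
    }
    return slice(c, s, (long) s + l); // long arithmetic avoids int overflow
  }

  /** Applies rules e) and f) to a 1-based start s and exclusive end e. */
  private static String slice(String c, long s, long e) {
    long lc = c.length();
    if (e < s) {
      // rule e: only reachable with a negative FOR length
      throw new IllegalArgumentException("substring error");
    }
    if (s > lc || e < 1) {
      return ""; // rule f.i: the window lies entirely outside the string
    }
    int s1 = (int) Math.max(s, 1);      // rule f.ii.1: clamp the start to 1
    int e1 = (int) Math.min(e, lc + 1); // clamp the end to LC + 1
    // rule f.ii.2: L1 = E1 - S1 characters starting at 1-based position S1
    return c.substring(s1 - 1, e1 - 1);
  }
}
```

Under these assumptions, {{substring("abc", 0, 2)}} gives "a", {{substring("abc", -1)}} gives "abc", {{substring("abc", 4, 1)}} gives a zero-length string, and {{substring("abc", 2, -1)}} raises the substring error.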

> Calcite substring fails with a start position less than 1.
> ----------------------------------------------------------
>
>                 Key: CALCITE-2572
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2572
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Andrew Pilloud
>            Assignee: Julian Hyde
>            Priority: Major
>
> Calcite substring throws an IndexOutOfBoundsException when the start position 
> is less than 1. Per the SQL standard, the position should be treated as the 
> larger of the provided value and 1; however, many implementations treat 
> negative values as counting from the end of the string.
> Implementations that extend the standard:
> https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions162.htm
> https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_substring
> https://spark.apache.org/docs/2.3.0/api/sql/index.html#substring
> https://www.sqlite.org/lang_corefunc.html#substr
> Implementations that follow the standard:
> https://docs.microsoft.com/en-us/sql/t-sql/functions/substring-transact-sql?view=sql-server-2017
> https://www.postgresql.org/docs/9.1/static/functions-string.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)