[jira] [Updated] (DRILL-2807) Revisit SQL array indexing _before GA/1.0_ (non-std. zero-based vs. std. one-based)

Jacques Nadeau (JIRA) Tue, 05 May 2015 06:37:28 -0700

     [ 
https://issues.apache.org/jira/browse/DRILL-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Jacques Nadeau updated DRILL-2807:
----------------------------------
    Fix Version/s: 1.2.0

> Revisit SQL array indexing _before GA/1.0_ (non-std. zero-based vs. std. 
> one-based)
> -----------------------------------------------------------------------------------
>
>                 Key: DRILL-2807
>                 URL: https://issues.apache.org/jira/browse/DRILL-2807
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Daniel Barclay (Drill)
>             Fix For: 1.2.0
>
>
> Currently, Drill's SQL uses zero-based indexing for arrays.  ({{A\[0]}} gets 
> the first element of the array value of {{A}}.)
> This is not compatible with standard SQL, which uses one-based indexing for 
> arrays.  ({{A\[1]}} gets the first element of the array value of {{A}}.)
> &nbsp;
> Most probably, Drill should use standard one-base indexing.
> More definitely, this should be revisited before 1.0/GA, because it will be 
> much harder to change the index origin later.
> (Even though it could be made changeable later, that would still be 
> confusing--the meaning of an indexed array expression wouldn't be clear 
> without knowledge of the current setting.)
> &nbsp;
> Some arguments directly for using standard SQL one-based array indexing are:
> 1. The SQL standard specifies that array indexing is one-based, so Drill 
> should use one-based indexing.  \\  \\ This is true especially because Drill 
> is intended to be compatible with standard SQL rather than having its own 
> dialect of SQL (other than where necessary because of the essentially 
> different nature of Drill).
> 2. Since standard SQL's array indexing is one-based, users familiar with 
> arrays will expect one-based indexing. Drill shouldn't surprise users, or 
> make them have to remember another difference between Drill and standard SQL.
> 3. Even users unfamiliar with SQL arrays will expect their indexing to be  
> one-based since the rest of standard SQL also uses one-based indexing:
>   - Column numbers are one-based:
>     -- in "ORDER BY 1"
>     -- in INFORMATION_SCHEMA.COLUMNS.ORDINAL_POSITION
>   - Character and binary string positions are one-based
>     -- in definition of SUBSTRING
>     -- in "If <start position> is not specified, then 1 (one) is implicit."
> &nbsp;
> The reason that Drill's SQL array indexing currently is zero-based is that 
> previously it was thought that it would be good to be  consistent with 
> indexing in JavaScript, which is expected to be used by a significant number 
> of Drill users to process JSON output from or input to Drill.
> However, besides the direct arguments above, there are additional arguments 
> for not making Drill's SQL non-standard in order to try to be consistent with 
> JavaScript:
> 1. SQL simply isn't JavaScript, just as it isn't Java, C, C++, or any other 
> zero-based language that can be used to process data read into or written 
> from Drill. \\ \\ (One could argue that SQL _should_ have been designed to be 
> zero-based in the first place, but it wasn't, and so now it simply isn't.)
> 2. Drill's SQL doesn't forgo compatibility with the SQL standard to make 
> other things zero-based in order to try to be consistent with JavaScript:
>   - It doesn't make string index zero- based (in {{SUBSTRING}}, etc.).
>   - It doesn't make column indexes zero-based (e.g., changing standard 
> "{{ORDER BY 1}}" to non-standard "{{ORDER BY 0}}").
> 3. If Drill were defining a JavaScript API for accessing SQL-related data 
> from JavaScript (as JDBC provides access from Java), then the index origin 
> used in method parameters in that API could be zero-based:
>   - It could be chosen to be zero-based to be consistent with JavaScript 
> rather than being consistent with SQL.
>   - It could be chosen to be one-based to be consistent with SQL rather than 
> being consistent with JavaScript.  However, note that it could still take and 
> return JavaScript objects, which would still use JavaScript's zero-based 
> indexing--just as JDBC uses SQL-consistent one-based indexing in its  
> methods, but also returns Java objects which of course are indexed in Java's 
> zero-based way.
>   - (More JDBC/Java details:
>     -- is one-based in SQL part (JDBC Java objects representing SQL concepts):
>       --- column numbers (e.g., {{resultSet.getDate(1)}} for first column)
>       --- row numbers (e.g., {{resultSet.absolute(1)}} for first now) 
>       --- element numbers (e.g., {{array.getArray(1, 1)}} for first element, 
> {{Array.getResultSet(...)}}'s element numbers)
>       --- character/byte numbers in Clob/Blob (e.g., {{getSubString(1, ...)}} 
> for first, {{position(...)}}, {{getBytes(1...)}} starts at first)
>     -- is zero-based once back at pure-Java level (non-JDBC objects 
> representing values):
>       --- {{resultSet.getString(...)}}, then {{.charAt(0)}} on string
>       --- {{resultSet.getBytes(...)}}, then {{\[0]}} on byte array
>       --- {{resultSet.getArray(...)}}, then {{.getArray(...)}}, then {{\[0]}} 
> on array
>     -- So, if you want to use Java's zero-based indexing, you retrieve data 
> into generic Java types and then do so.  Java's SQL API (JDBC) doesn't do 
> everything zero-based even though the host language is zero-based.
> )
> However, making the JavaScript _API_'s indexing consistent with JavaScript 
> still does not imply having Drill use non-standard zero-based indexing in 
> Drill's _SQL_.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (DRILL-2807) Revisit SQL array indexing _before GA/1.0_ (non-std. zero-based vs. std. one-based)

Reply via email to