[
https://issues.apache.org/jira/browse/DRILL-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jacques Nadeau updated DRILL-2807:
----------------------------------
Fix Version/s: 1.2.0
> Revisit SQL array indexing _before GA/1.0_ (non-std. zero-based vs. std.
> one-based)
> -----------------------------------------------------------------------------------
>
> Key: DRILL-2807
> URL: https://issues.apache.org/jira/browse/DRILL-2807
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
>
> Currently, Drill's SQL uses zero-based indexing for arrays. ({{A\[0]}} gets
> the first element of the array value of {{A}}.)
> This is not compatible with standard SQL, which uses one-based indexing for
> arrays. ({{A\[1]}} gets the first element of the array value of {{A}}.)
>
> Most probably, Drill should use standard one-base indexing.
> More definitely, this should be revisited before 1.0/GA, because it will be
> much harder to change the index origin later.
> (Even though it could be made changeable later, that would still be
> confusing--the meaning of an indexed array expression wouldn't be clear
> without knowledge of the current setting.)
>
> Some arguments directly for using standard SQL one-based array indexing are:
> 1. The SQL standard specifies that array indexing is one-based, so Drill
> should use one-based indexing. \\ \\ This is true especially because Drill
> is intended to be compatible with standard SQL rather than having its own
> dialect of SQL (other than where necessary because of the essentially
> different nature of Drill).
> 2. Since standard SQL's array indexing is one-based, users familiar with
> arrays will expect one-based indexing. Drill shouldn't surprise users, or
> make them have to remember another difference between Drill and standard SQL.
> 3. Even users unfamiliar with SQL arrays will expect their indexing to be
> one-based since the rest of standard SQL also uses one-based indexing:
> - Column numbers are one-based:
> -- in "ORDER BY 1"
> -- in INFORMATION_SCHEMA.COLUMNS.ORDINAL_POSITION
> - Character and binary string positions are one-based
> -- in definition of SUBSTRING
> -- in "If <start position> is not specified, then 1 (one) is implicit."
>
> The reason that Drill's SQL array indexing currently is zero-based is that
> previously it was thought that it would be good to be consistent with
> indexing in JavaScript, which is expected to be used by a significant number
> of Drill users to process JSON output from or input to Drill.
> However, besides the direct arguments above, there are additional arguments
> for not making Drill's SQL non-standard in order to try to be consistent with
> JavaScript:
> 1. SQL simply isn't JavaScript, just as it isn't Java, C, C++, or any other
> zero-based language that can be used to process data read into or written
> from Drill. \\ \\ (One could argue that SQL _should_ have been designed to be
> zero-based in the first place, but it wasn't, and so now it simply isn't.)
> 2. Drill's SQL doesn't forgo compatibility with the SQL standard to make
> other things zero-based in order to try to be consistent with JavaScript:
> - It doesn't make string index zero- based (in {{SUBSTRING}}, etc.).
> - It doesn't make column indexes zero-based (e.g., changing standard
> "{{ORDER BY 1}}" to non-standard "{{ORDER BY 0}}").
> 3. If Drill were defining a JavaScript API for accessing SQL-related data
> from JavaScript (as JDBC provides access from Java), then the index origin
> used in method parameters in that API could be zero-based:
> - It could be chosen to be zero-based to be consistent with JavaScript
> rather than being consistent with SQL.
> - It could be chosen to be one-based to be consistent with SQL rather than
> being consistent with JavaScript. However, note that it could still take and
> return JavaScript objects, which would still use JavaScript's zero-based
> indexing--just as JDBC uses SQL-consistent one-based indexing in its
> methods, but also returns Java objects which of course are indexed in Java's
> zero-based way.
> - (More JDBC/Java details:
> -- is one-based in SQL part (JDBC Java objects representing SQL concepts):
> --- column numbers (e.g., {{resultSet.getDate(1)}} for first column)
> --- row numbers (e.g., {{resultSet.absolute(1)}} for first now)
> --- element numbers (e.g., {{array.getArray(1, 1)}} for first element,
> {{Array.getResultSet(...)}}'s element numbers)
> --- character/byte numbers in Clob/Blob (e.g., {{getSubString(1, ...)}}
> for first, {{position(...)}}, {{getBytes(1...)}} starts at first)
> -- is zero-based once back at pure-Java level (non-JDBC objects
> representing values):
> --- {{resultSet.getString(...)}}, then {{.charAt(0)}} on string
> --- {{resultSet.getBytes(...)}}, then {{\[0]}} on byte array
> --- {{resultSet.getArray(...)}}, then {{.getArray(...)}}, then {{\[0]}}
> on array
> -- So, if you want to use Java's zero-based indexing, you retrieve data
> into generic Java types and then do so. Java's SQL API (JDBC) doesn't do
> everything zero-based even though the host language is zero-based.
> )
> However, making the JavaScript _API_'s indexing consistent with JavaScript
> still does not imply having Drill use non-standard zero-based indexing in
> Drill's _SQL_.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)