[
https://issues.apache.org/jira/browse/DRILL-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997086#comment-15997086
]
Paul Rogers commented on DRILL-5329:
------------------------------------
To be honest, I'm a bit unclear on the goals for data types. We have a
documented list on the web site, but that walks back certain types. We have
confusion over our date/time types (the docs say they are in UTC, but they are
implemented as local time and shift values if the client and server have
different time zones). Decimal types are kinda-but-not-really supported. Some
types, it seems, are never used by readers.
So, my goal is to make sure that the types supported by the sort in Drill 1.9
are also supported in the "managed" sort for Drill 1.11.
Beyond that, I'm open to suggestions.
> External sort does not support "obscure" data types
> ---------------------------------------------------
>
> Key: DRILL-5329
> URL: https://issues.apache.org/jira/browse/DRILL-5329
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.10.0
> Reporter: Paul Rogers
>
> A unit test was created to exercise the "Sorter" mechanism within the
> External Sort, which is used to sort each incoming batch. The sorter was
> tested with each Drill data type.
> The following types fail:
> * TINYINT
> * UINT1
> * SMALLINT
> * UINT2
> * UINT4
> * UINT8
> * VAR16CHAR
> * DECIMAL28SPARSE
> * DECIMAL38SPARSE
> The types that work include:
> * INT
> * BIGINT
> * FLOAT4
> * FLOAT8
> * DECIMAL9
> * DECIMAL18
> * VARCHAR
> * VARBINARY
> * DATE
> * TIME
> * TIMESTAMP
> * INTERVAL
> * INTERVALDAY
> * INTERVALYEAR
> Could not find a way to test the following:
> * DECIMAL28DENSE
> * DECIMAL38DENSE
> * LIST
> * MAP
> * GENERIC_OBJECT
> * UNION
> Not yet supported in Drill:
> * MONEY
> * FIXEDCHAR
> * FIXED16CHAR
> * FIXEDBINARY
> * NULL
> * TIMETZ
> * TIMESTAMPTZ
> * LATE
> The failure manifests in one of two ways:
> * If dynamic UDFs are enabled, the query crashes with an NPE. (See
> DRILL-5331.)
> * If dynamic UDFs are disabled, the generated code silently skips the
> comparison step, resulting in the sort not actually being done:
> Sorting a set of 20 pseudo-random rows produces the following output:
> {code}
> #, row #, key, value
> 0(0): 11, "0"
> 1(1): 14, "1"
> 2(2): 17, "2"
> 3(3): 0, "3"
> {code}
> By contrast, the (working) Int type produces the correct results:
> {code}
> #, row #, key, value
> 0(3): 0, "3"
> 1(10): 1, "10"
> 2(17): 2, "17"
> 3(4): 3, "4"
> {code}
> The first number is the output row index; the number in parentheses is the
> row pointed to by the sv2 (the selection vector, which the sort should
> update to produce sort order). Sort was done ASC, NULLS_HIGH, by the key field.
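To make the output format above concrete, here is a minimal Java sketch of an indirect sort through a selection vector. It is an illustration only, not Drill's generated code or its sv2 class: the class name Sv2SortSketch, the helper names, and the use of a boxed Integer[] for the key column are all assumptions made for this example.

```java
import java.util.Arrays;

// Hypothetical sketch; none of these names come from Drill.
class Sv2SortSketch {

    // ASC ordering with nulls sorted high (after all non-null keys).
    static int compareNullsHigh(Integer a, Integer b) {
        if (a == null) return b == null ? 0 : 1;
        if (b == null) return -1;
        return Integer.compare(a, b);
    }

    // Returns a selection vector: sv[i] is the row that belongs at output
    // position i. The key data itself is never moved.
    static int[] sortIndirect(Integer[] keys) {
        Integer[] sv = new Integer[keys.length];
        for (int i = 0; i < sv.length; i++) {
            sv[i] = i; // identity order, which is what an unsorted batch shows
        }
        Arrays.sort(sv, (a, b) -> compareNullsHigh(keys[a], keys[b]));
        int[] result = new int[sv.length];
        for (int i = 0; i < sv.length; i++) {
            result[i] = sv[i];
        }
        return result;
    }

    // Formats one line as: position(row): key, "value", where the value is
    // the original row number rendered as a string.
    static String format(int pos, int row, Integer key) {
        return pos + "(" + row + "): " + key + ", \"" + row + "\"";
    }

    public static void main(String[] args) {
        Integer[] keys = {11, 14, 17, 0};
        int[] sv = sortIndirect(keys);
        for (int i = 0; i < sv.length; i++) {
            System.out.println(format(i, sv[i], keys[sv[i]]));
        }
    }
}
```

If the comparison step is skipped, the vector stays in identity order (0(0), 1(1), 2(2), ...), which matches the failing output. With the comparison in place, the four keys used here yield the vector 3, 0, 1, 2.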
> A strong concern here is that there is no error or other warning to the user
> that Drill cannot sort this type; Drill just silently declines to perform the
> operation.
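One possible way to address the concern above is to fail loudly when no comparison exists for a type, rather than emitting a sort that does nothing. The Java sketch below shows only the idea. The class ComparatorLookupSketch, its method, and the two-type subset are hypothetical and do not reflect how Drill resolves comparison functions.

```java
import java.util.Comparator;

// Hypothetical sketch; the lookup and names are not Drill APIs.
class ComparatorLookupSketch {

    // Illustrative subset: two of the types reported above as working.
    // Any other type name is rejected instead of being silently skipped.
    static Comparator<Object> comparatorFor(String typeName) {
        switch (typeName) {
            case "INT":
                return (a, b) -> Integer.compare((Integer) a, (Integer) b);
            case "BIGINT":
                return (a, b) -> Long.compare((Long) a, (Long) b);
            default:
                throw new UnsupportedOperationException(
                        "Sort does not support type " + typeName);
        }
    }

    public static void main(String[] args) {
        System.out.println(comparatorFor("INT").compare(1, 2) < 0);
        try {
            comparatorFor("TINYINT");
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check of this kind, a query sorting on an unsupported type would stop with a message naming the type, so the user would not receive unsorted results that look like a successful sort.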