Thanks Daniel, I'll make some sub-JIRAs to try to fill out the task list.
This should be a good opportunity for a newbie contribution if someone
wants to get to know the Drill code.

On Mon, Jun 8, 2015 at 1:51 PM, Daniel Barclay <[email protected]>
wrote:

> Note DRILL-2470, "Implement SMALLINT and TINYINT [umbrella]".
>
>
> Jacques Nadeau wrote:
>
>> I think it would be worthwhile to first open up a set of JIRAs associated
>> with finishing support for these datatypes.  I'm guessing the scale of
>> effort is less than one might initially guess.  Once those are opened, it
>> would be easier to give feedback on the relative merit of that work versus
>> the alternative solution you suggested.
>>
>> On Mon, Jun 8, 2015 at 11:12 AM, Jason Altekruse <
>> [email protected]>
>> wrote:
>>
>>> Hello Drillers,
>>>
>>> I have been working on DRILL-3209, which aims to speed up reading from
>>> Hive tables by re-planning them as native Drill reads in the case where
>>> the tables are backed by files that have available native readers. This
>>> will begin with Parquet and delimited text files.
>>>
>>> To provide the same behavior as reading through the SerDe interface, I
>>> must insert a cast above the read operation to provide the same types
>>> that the Hive scan otherwise would.
>>>
>>> The issue I am seeing is that Hive appears to be reading into both the
>>> tinyint and smallint types, which I believe are not fully supported
>>> (currently my new injected project is failing to find a function to
>>> cast to tinyint). See the unsupported note in the docs here [1] for
>>> smallint; tinyint is not even listed.
>>>
>>> I can simply add the function to provide the same type as we currently
>>> read out of the scan, but I believe we will have other issues with trying
>>> to support this right now, as we have not thoroughly tested these other
>>> integer types.
>>>
>>> I would like to instead propose that we change the behavior of Hive to
>>> read data of these types into regular integer columns for now and try
>>> to remove any outstanding references to tinyint and smallint until we
>>> can commit to fully supporting them.
>>>
>>> [1] http://drill.apache.org/docs/supported-data-types/
>>>
>>>
>>
>
> --
> Daniel Barclay
> MapR Technologies
>
