Thanks Daniel, I'll make some sub-JIRAs to try to fill out the task list. This should be a good opportunity for a newbie contribution if someone wants to get to know the Drill code.
On Mon, Jun 8, 2015 at 1:51 PM, Daniel Barclay <[email protected]> wrote:

> Note DRILL-2470, "Implement SMALLINT and TINYINT [umbrella]".
>
> Jacques Nadeau wrote:
>
>> I think it would be worthwhile to first open up a set of JIRAs associated
>> with finishing support for these datatypes. I'm guessing the scale of
>> effort is less than one might initially guess. Once those are opened, it
>> would be easier to give feedback on the relative merit of that work versus
>> the alternative solution you suggested.
>>
>> On Mon, Jun 8, 2015 at 11:12 AM, Jason Altekruse <[email protected]> wrote:
>>
>>> Hello Drillers,
>>>
>>> I have been working on DRILL-3209, which aims to speed up reading from
>>> hive tables by re-planning them as native Drill reads in the case where
>>> the tables are backed by files that have available native readers. This
>>> will begin with parquet and delimited text files.
>>>
>>> To provide the same behavior as reading through the Serde interface, I
>>> must insert a cast above the read operation to provide the same types
>>> that the Hive scan otherwise would.
>>>
>>> The issue I am seeing is that Hive appears to be reading into both the
>>> tinyint and smallint types which I believe are not fully supported
>>> (currently my new injected project is failing to find a function to cast
>>> to tinyint). See the unsupported note in the docs here [1] for smallint,
>>> tinyint is not even listed.
>>>
>>> I can simply add the function to provide the same type as we currently
>>> read out of the scan, but I believe we will have other issues with trying
>>> to support this right now as we have not thoroughly tested these other
>>> integer types.
>>>
>>> I would like to instead propose that we change the behavior of Hive to
>>> read data of these types into regular integer columns for now and try to
>>> remove any outstanding references to tinyint and smallint until we can
>>> commit to fully supporting them.
>>>
>>> [1] http://drill.apache.org/docs/supported-data-types/
>
> --
> Daniel Barclay
> MapR Technologies
