Got it. Should be fine, then.

On Mon, Jun 8, 2015 at 12:46 PM, Jason Altekruse <[email protected]> wrote:
> I was going to be changing them on the schema side as well. As I am
> currently implementing the feature as a rewrite rule, I have to match the
> schema of the relational tree I am replacing. To make it work in execution
> I have to cast to an integer (or add the tinyint cast). If I choose the
> former, planning will fail on mismatched types between the tinyint
> expected from the Hive scan and the integer coming out of the cast.
>
> On Mon, Jun 8, 2015 at 12:43 PM, Jacques Nadeau <[email protected]>
> wrote:
>
> > The only concern I have around changing the types in execution is that
> > it may cause strange behaviors. Are you planning on changing them on the
> > schema side as well? That way Calcite wouldn't insert weird expression
> > patterns that would cause other problems if you change the execution
> > side.
> >
> > On Mon, Jun 8, 2015 at 12:41 PM, Jason Altekruse <[email protected]>
> > wrote:
> >
> > > I am in support of opening JIRAs to enumerate the steps necessary to
> > > support these types. However, I think it would be good to get a fix
> > > into master for the functional bug that is in the code today. That fix
> > > is easy and the only overhead is taking a little more space for the
> > > data after it has been read into Drill.
> > >
> > > As we are looking to keep up with our near-monthly release schedule,
> > > I'm uncertain that we can have these types implemented and well tested
> > > by the next release, but I think we very realistically could start
> > > testing Hive more thoroughly after this small fix.
> > >
> > > On Mon, Jun 8, 2015 at 12:29 PM, Jacques Nadeau <[email protected]>
> > > wrote:
> > >
> > > > I think it would be worthwhile to first open up a set of JIRAs
> > > > associated with finishing support for these datatypes. I'm guessing
> > > > the scale of effort is less than one might initially guess. Once
> > > > those are opened, it would be easier to give feedback on the
> > > > relative merit of that work versus the alternative solution you
> > > > suggested.
> > > >
> > > > On Mon, Jun 8, 2015 at 11:12 AM, Jason Altekruse
> > > > <[email protected]> wrote:
> > > >
> > > > > Hello Drillers,
> > > > >
> > > > > I have been working on DRILL-3209, which aims to speed up reading
> > > > > from Hive tables by re-planning them as native Drill reads in the
> > > > > case where the tables are backed by files that have available
> > > > > native readers. This will begin with parquet and delimited text
> > > > > files.
> > > > >
> > > > > To provide the same behavior as reading through the Serde
> > > > > interface, I must insert a cast above the read operation to
> > > > > provide the same types that the Hive scan otherwise would.
> > > > >
> > > > > The issue I am seeing is that Hive appears to be reading into both
> > > > > the tinyint and smallint types, which I believe are not fully
> > > > > supported (currently my new injected project is failing to find a
> > > > > function to cast to tinyint). See the unsupported note in the docs
> > > > > here [1] for smallint; tinyint is not even listed.
> > > > >
> > > > > I can simply add the function to provide the same type as we
> > > > > currently read out of the scan, but I believe we will have other
> > > > > issues with trying to support this right now as we have not
> > > > > thoroughly tested these other integer types.
> > > > >
> > > > > I would like to instead propose that we change the behavior of
> > > > > Hive to read data of these types into regular integer columns for
> > > > > now and try to remove any outstanding references to tinyint and
> > > > > smallint until we can commit to fully supporting them.
> > > > >
> > > > > [1] http://drill.apache.org/docs/supported-data-types/
