Jason, using trim() did the trick: 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`, CAST(trim(x.`a`.`c1`) as INTEGER) from maprfs.cmatta.`cmatta_test` . . . . . . . > x ; +----------+---------+ | row_key | EXPR$1 | +----------+---------+ | row1 | 1 | | row2 | null | | row3 | 5 | | row4 | 7 | +----------+---------+ 4 rows selected (0.542 seconds) 0: jdbc:drill:>
Chris Matta [email protected] 215-701-3146 On Wed, Sep 23, 2015 at 5:15 PM, Jason Altekruse <[email protected]> wrote: > I actually noticed while I was testing this out that it is not working with > casts to date, I will be opening a JIRA for it. That being said, I did find > a unit test for integer, so this should be working. It is possible that the > data in the column is actually a series of spaces instead of the empty > string. Could you try to throw a call to trim() inside of the cast? > > It is also possible that this system option might only impact the cast > functions from varchar to the various other types like double, int, etc. > All of the data read out of HBase is placed in the varbinary type. We might > have decent handling if the data is stored in Hbase in the format we happen > to assume in the cast varbinary to int implementation, but we generally > recommend using the convert_from functions when dealing with binary. I > don't know exactly how this option was implemented, so it might only apply > with varchar. We were primarily focused on issues with lots of case > statements when querying text files when it was added. > > - Jason > > On Wed, Sep 23, 2015 at 1:49 PM, Christopher Matta <[email protected]> > wrote: > > > This system setting was set, yet I still experience the > > NumberFormatException with MapR-DB when a column contains a NULL value: > > > > 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`, > > CAST(x.`a`.`c1` as varchar(64)) from maprfs.cmatta.`cmatta_test` > > . . . . . . . > x; > > +----------+---------+ > > | row_key | EXPR$1 | > > +----------+---------+ > > | row1 | 1 | > > | row2 | | > > | row3 | 5 | > > | row4 | 7 | > > +----------+---------+ > > 4 rows selected (0.64 seconds) > > 0: jdbc:drill:> select name, type, bool_val from sys.options where > > name like '%null%'; > > +-------------------------------------------------+---------+-----------+ > > | name | type | bool_val | > > +-------------------------------------------------+---------+-----------+ > > | drill.exec.functions.cast_empty_string_to_null | SYSTEM | true | > > +-------------------------------------------------+---------+-----------+ > > 1 row selected (0.661 seconds) > > 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`, > > CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test` x; > > Error: SYSTEM ERROR: NumberFormatException: > > > > Fragment 0:0 > > > > [Error Id: 2a0d104c-33cd-4680-80fe-f908147b5c0a on > > se-node11.se.lab:31010] (state=,code=0) > > > > Shouldn’t that have returned a null for row2? > > > > > > Chris Matta > > [email protected] > > 215-701-3146 > > > > On Wed, Sep 23, 2015 at 3:14 PM, Jason Altekruse < > [email protected] > > > > > wrote: > > > > > Sudheesh, > > > > > > What you said is true for all of the options besides this one. When it > > was > > > added it was decided that users of Drill would likely want the results > of > > > their queries to be consistent for all users, so this option only takes > > > effect at the system level. Unfortunately there is not an error > returned > > > when it is set at the session level, the option just does nothing. > > > > > > Since then it seems like users are expected it to function like the > rest > > of > > > the options (they are afraid of changing other peoples results once > > several > > > users are all running queries), so they have been reluctant to set it > at > > > the system level. > > > > > > There is an outstanding JIRA to change this to make it consistent with > > the > > > others: https://issues.apache.org/jira/browse/DRILL-3259 > > > > > > If this JIRA really isn't going to be scheduled before 1.4, I think it > > > might be good to create a sub-task for improving the experience right > now > > > to at least tell users when they set it at the session level that it is > > not > > > allowed. > > > > > > > > > > > > On Wed, Sep 23, 2015 at 6:01 AM, Christopher Matta <[email protected]> > > > wrote: > > > > > > > Just bumping this, I’m sure other users are going to be running > across > > > this > > > > eventually and I wanted to see if what I’m experiencing is expected > > (even > > > > after setting ALTER SESSION SET > > > > `drill.exec.functions.cast_empty_string_to_null` = true;), or if > it’s a > > > > bug. > > > > > > > > > > > > Chris Matta > > > > [email protected] > > > > 215-701-3146 > > > > > > > > On Tue, Sep 22, 2015 at 2:33 PM, Christopher Matta <[email protected]> > > > > wrote: > > > > > > > > > Was the result I saw expected? Seems like I got back an empty > string > > > and > > > > > not a NULL. > > > > > > > > > > Chris Matta > > > > > [email protected] > > > > > 215-701-3146 > > > > > > > > > > On Tue, Sep 22, 2015 at 1:48 PM, Sudheesh Katkam < > > [email protected] > > > > > > > > > wrote: > > > > > > > > > >> See below: > > > > >> > > > > >> > On Sep 21, 2015, at 8:22 AM, USC <[email protected]> wrote: > > > > >> > > > > > >> > Hi, > > > > >> > This is a system wide setting. Meaning, you need to say > > > > >> > > > > > >> > Alter system set > `drill.exec.functions.cast_empty_string_to_null` > > = > > > > >> true; > > > > >> > > > > >> To clarify, all options available through sys.options (except some > > > used > > > > >> for testing) can be set at system and session level. So what Chris > > did > > > > >> (setting at session level) works. > > > > >> > > > > >> Thank you, > > > > >> Sudheesh > > > > >> > > > > >> > > > > > >> > > > > > >> > Sent from my iPhone > > > > >> > > > > > >> >> On Sep 21, 2015, at 7:18 AM, Christopher Matta < > [email protected]> > > > > >> wrote: > > > > >> >> > > > > >> >> I’m not sure if it worked, the result looks the same when > casting > > > as > > > > a > > > > >> >> string (empty field, not a NULL value): > > > > >> >> > > > > >> >> 0: jdbc:drill:> ALTER SESSION SET > > > > >> >> `drill.exec.functions.cast_empty_string_to_null` = true; > > > > >> >> > > > +-------+----------------------------------------------------------+ > > > > >> >> | ok | summary > > > | > > > > >> >> > > > +-------+----------------------------------------------------------+ > > > > >> >> | true | drill.exec.functions.cast_empty_string_to_null > updated. > > > | > > > > >> >> > > > +-------+----------------------------------------------------------+ > > > > >> >> 1 row selected (1.606 seconds) > > > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as > > > > `row_key`, > > > > >> >> CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test` x; > > > > >> >> Error: SYSTEM ERROR: NumberFormatException: > > > > >> >> > > > > >> >> Fragment 0:0 > > > > >> >> > > > > >> >> [Error Id: 33e94b4d-6450-40bf-9f2c-bbbfab9f5990 on > > > > >> >> se-node10.se.lab:31010] (state=,code=0) > > > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as > > > > `row_key`, > > > > >> >> CASE WHERE x.`a`.`c1` is not null CAST(x.`a`.`c1` as INTEGER) > fr > > > > >> >> Command canceled.`cmatta_test` x; > > > > >> >> 0: jdbc:drill:> select 'hello' from sys.version; > > > > >> >> +---------+ > > > > >> >> | EXPR$0 | > > > > >> >> +---------+ > > > > >> >> | hello | > > > > >> >> +---------+ > > > > >> >> 1 row selected (0.417 seconds) > > > > >> >> 0: jdbc:drill:> select cast(NULL as INTEGER) from sys.version; > > > > >> >> +---------+ > > > > >> >> | EXPR$0 | > > > > >> >> +---------+ > > > > >> >> | null | > > > > >> >> +---------+ > > > > >> >> 1 row selected (0.4 seconds) > > > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as > > > > `row_key`, > > > > >> >> CAST(CAST(x.`a`.`c1` as varchar(64)) as INTEGER) from > maprfs.cma > > > > >> >> tta.`cmatta_test` x; > > > > >> >> Error: SYSTEM ERROR: NumberFormatException: > > > > >> >> > > > > >> >> Fragment 0:0 > > > > >> >> > > > > >> >> [Error Id: 71593a43-54ac-4e1d-b3d8-21a2d4d4acd6 on > > > > >> >> se-node10.se.lab:31010] (state=,code=0) > > > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as > > > > `row_key`, > > > > >> >> CAST(x.`a`.`c1` as varchar(64)) from > maprfs.cmatta.`cmatta_test` > > > > >> >> x; > > > > >> >> +----------+---------+ > > > > >> >> | row_key | EXPR$1 | > > > > >> >> +----------+---------+ > > > > >> >> | row1 | 1 | > > > > >> >> | row2 | | > > > > >> >> | row3 | 5 | > > > > >> >> | row4 | 7 | > > > > >> >> +----------+---------+ > > > > >> >> 4 rows selected (0.54 seconds) > > > > >> >> 0: jdbc:drill:> > > > > >> >> > > > > >> >> Is this how it’s expected to work? > > > > >> >> > > > > >> >> > > > > >> >> Chris Matta > > > > >> >> [email protected] > > > > >> >> 215-701-3146 > > > > >> >> > > > > >> >>> On Fri, Sep 18, 2015 at 9:56 PM, Jacques Nadeau < > > > [email protected] > > > > > > > > > >> wrote: > > > > >> >>> > > > > >> >>> Does this system option not work: > > > > >> >>> > > > > >> >>> ALTER SESSION SET > > > `drill.exec.functions.cast_empty_string_to_null` = > > > > >> true; > > > > >> >>> > > > > >> >>> The reason the bug was marked INVALID is that SQL engines (not > > > sure > > > > >> about > > > > >> >>> the spec) don't allow casting from empty string to number. The > > > > system > > > > >> >>> option above is supposed to allow changing this behavior from > > the > > > > SQL > > > > >> >>> standard for your type of situation. That being said, I see > the > > > docs > > > > >> say > > > > >> >>> "not supported in this release". Not sure why that is there. > Can > > > you > > > > >> give > > > > >> >>> it a try? > > > > >> >>> > > > > >> >>> That being said, it seems like the original issue was a NPE > not > > a > > > > >> NFE. That > > > > >> >>> definitely seems like something else. > > > > >> >>> > > > > >> >>> > > > > >> >>> -- > > > > >> >>> Jacques Nadeau > > > > >> >>> CTO and Co-Founder, Dremio > > > > >> >>> > > > > >> >>> On Thu, Sep 17, 2015 at 10:53 AM, Christopher Matta < > > > > [email protected]> > > > > >> >>> wrote: > > > > >> >>> > > > > >> >>>> Here is my attempt at building a reproduction, btw, it seems > > like > > > > >> this is > > > > >> >>>> the same issue as DRILL-862 > > > > >> >>>> <https://issues.apache.org/jira/browse/DRILL-862> where > > Jacques > > > > >> >>> determined > > > > >> >>>> the error to be invalid. Is trying to cast an empty string, > or > > > null > > > > >> value > > > > >> >>>> to an integer invalid? What's the workaround? > > > > >> >>>> > > > > >> >>>> Data > > > > >> >>>> > > > > >> >>>> row1,1,2 > > > > >> >>>> row2,,4 > > > > >> >>>> row3,5,6 > > > > >> >>>> row4,7,8 > > > > >> >>>> > > > > >> >>>> Create Table > > > > >> >>>> > > > > >> >>>> $ maprcli table create -path > /user/cmatta/projects/cmatta_test > > > > >> >>>> $ maprcli table cf create -path > > /user/cmatta/projects/cmatta_test > > > > >> >>> -cfname a > > > > >> >>>> > > > > >> >>>> Load into Hbase table: > > > > >> >>>> > > > > >> >>>> hbase org.apache.hadoop.hbase.mapreduce.ImportTsv > > > > >> >>>> -Dimporttsv.separator=',' > > > > -Dimporttsv.columns=HBASE_ROW_KEY,a:c1,a:c2 > > > > >> >>>> /user/cmatta/projects/cmatta_test > > > > >> >>>> maprfs:///user/cmatta/projects/testdata_hbase_null > > > > >> >>>> > > > > >> >>>> Query (error): > > > > >> >>>> > > > > >> >>>> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as > > > > >> `row_key`, > > > > >> >>>> CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test` > x; > > > > >> >>>> Error: SYSTEM ERROR: NumberFormatException: > > > > >> >>>> > > > > >> >>>> Fragment 0:0 > > > > >> >>>> > > > > >> >>>> [Error Id: cbcb3327-3699-4191-9c26-9b95c9922690 on > > > > >> >>>> se-node11.se.lab:31010] (state=,code=0) > > > > >> >>>> > > > > >> >>>> Query that works on the column (c2) that doesn’t have a NULL > > > value: > > > > >> >>>> > > > > >> >>>> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as > > > > >> `row_key`, > > > > >> >>>> CAST(x.`a`.`c2` as INTEGER) from maprfs.cmatta.`cmatta_test` > x; > > > > >> >>>> +----------+---------+ > > > > >> >>>> | row_key | EXPR$1 | > > > > >> >>>> +----------+---------+ > > > > >> >>>> | row1 | 2 | > > > > >> >>>> | row2 | 4 | > > > > >> >>>> | row3 | 6 | > > > > >> >>>> | row4 | 8 | > > > > >> >>>> +----------+---------+ > > > > >> >>>> 4 rows selected (0.308 seconds) > > > > >> >>>> > > > > >> >>>> > > > > >> >>>> Chris Matta > > > > >> >>>> [email protected] > > > > >> >>>> 215-701-3146 > > > > >> >>>> > > > > >> >>>> On Tue, Sep 15, 2015 at 3:09 PM, Christopher Matta < > > > > [email protected]> > > > > >> >>>> wrote: > > > > >> >>>> > > > > >> >>>>> Did a Jira get opened for this? If not, Nathaniel, can you > > > provide > > > > >> me > > > > >> >>>> with > > > > >> >>>>> your query and a sample data set so I can open one? > > > > >> >>>>> > > > > >> >>>>> Chris Matta > > > > >> >>>>> [email protected] > > > > >> >>>>> 215-701-3146 > > > > >> >>>>> > > > > >> >>>>> On Thu, Sep 10, 2015 at 5:03 PM, Jason Altekruse < > > > > >> >>>> [email protected] > > > > >> >>>>>> wrote: > > > > >> >>>>> > > > > >> >>>>>> A SQL level null is different than a null at the JAVA level > > > that > > > > >> would > > > > >> >>>> be > > > > >> >>>>>> giving this exception (we don't represent nulls with an > > actual > > > > null > > > > >> >>> java > > > > >> >>>>>> object). There might be a way to work around it, but this > is > > a > > > > bug > > > > >> in > > > > >> >>>>>> Drill. You should be able to make a cast between compatible > > > types > > > > >> even > > > > >> >>>> if > > > > >> >>>>>> there are nulls in the dataset. > > > > >> >>>>>> > > > > >> >>>>>> Could you open a JIRA and put as much description about > your > > > data > > > > >> and > > > > >> >>>>>> query > > > > >> >>>>>> as possible? If you have the time to create a small repro, > > > > loading > > > > >> an > > > > >> >>>>>> Hbase > > > > >> >>>>>> table with a small set of data that produces the problem > that > > > > >> would be > > > > >> >>>>>> very > > > > >> >>>>>> helpful. In any case, whatever you can provide will be > useful > > > the > > > > >> dev > > > > >> >>>> who > > > > >> >>>>>> picks up the JIRA. > > > > >> >>>>>> > > > > >> >>>>>> On Thu, Sep 10, 2015 at 12:49 PM, Christopher Matta < > > > > >> [email protected]> > > > > >> >>>>>> wrote: > > > > >> >>>>>> > > > > >> >>>>>>> Does changing the query to something like this work?: > > > > >> >>>>>>> > > > > >> >>>>>>> SELECT CAST(CASE WHEN table.cf.myColumn IS NOT NULL THEN > > > > >> >>>>>>> table.cf.myColumn ELSE NULL END as INTEGER) > > > > >> >>>>>>> ... > > > > >> >>>>>>> > > > > >> >>>>>>> > > > > >> >>>>>>> > > > > >> >>>>>>> Chris Matta > > > > >> >>>>>>> [email protected] > > > > >> >>>>>>> 215-701-3146 > > > > >> >>>>>>> > > > > >> >>>>>>> On Thu, Sep 10, 2015 at 12:49 PM, Nathaniel Auvil < > > > > >> >>>>>>> [email protected] > > > > >> >>>>>>>> wrote: > > > > >> >>>>>>> > > > > >> >>>>>>>> i have a MaprDB table which has nulls for some of the > > > columns. > > > > >> >>>> When i > > > > >> >>>>>>>> create a select where i CAST(table.cf.myColumn as > INTEGER) > > > and > > > > >> >>> there > > > > >> >>>>>> are > > > > >> >>>>>>>> nulls instead of simply returning null for the value, > there > > > is > > > > a > > > > >> >>>>>>>> NullPointerException. > > > > >> >>>>>>>> > > > > >> >>>>>>>> Why not simply return null for the value if it is missing > > > from > > > > a > > > > >> >>>> row? > > > > >> >>>>>>>> > > > > >> >>>>>>> > > > > >> >>>>>> > > > > >> >>>>> > > > > >> >>>>> > > > > >> >>>> > > > > >> >>> > > > > >> > > > > >> > > > > > > > > > > > > > > >
