This system setting was set, yet I still experience the NumberFormatException with MapR-DB when a column contains a NULL value:
0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`, CAST(x.`a`.`c1` as varchar(64)) from maprfs.cmatta.`cmatta_test` . . . . . . . > x; +----------+---------+ | row_key | EXPR$1 | +----------+---------+ | row1 | 1 | | row2 | | | row3 | 5 | | row4 | 7 | +----------+---------+ 4 rows selected (0.64 seconds) 0: jdbc:drill:> select name, type, bool_val from sys.options where name like '%null%'; +-------------------------------------------------+---------+-----------+ | name | type | bool_val | +-------------------------------------------------+---------+-----------+ | drill.exec.functions.cast_empty_string_to_null | SYSTEM | true | +-------------------------------------------------+---------+-----------+ 1 row selected (0.661 seconds) 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`, CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test` x; Error: SYSTEM ERROR: NumberFormatException: Fragment 0:0 [Error Id: 2a0d104c-33cd-4680-80fe-f908147b5c0a on se-node11.se.lab:31010] (state=,code=0) Shouldn’t that have returned a null for row2? Chris Matta [email protected] 215-701-3146 On Wed, Sep 23, 2015 at 3:14 PM, Jason Altekruse <[email protected]> wrote: > Sudheesh, > > What you said is true for all of the options besides this one. When it was > added it was decided that users of Drill would likely want the results of > their queries to be consistent for all users, so this option only takes > effect at the system level. Unfortunately there is not an error returned > when it is set at the session level, the option just does nothing. > > Since then it seems like users are expected it to function like the rest of > the options (they are afraid of changing other peoples results once several > users are all running queries), so they have been reluctant to set it at > the system level. > > There is an outstanding JIRA to change this to make it consistent with the > others: https://issues.apache.org/jira/browse/DRILL-3259 > > If this JIRA really isn't going to be scheduled before 1.4, I think it > might be good to create a sub-task for improving the experience right now > to at least tell users when they set it at the session level that it is not > allowed. > > > > On Wed, Sep 23, 2015 at 6:01 AM, Christopher Matta <[email protected]> > wrote: > > > Just bumping this, I’m sure other users are going to be running across > this > > eventually and I wanted to see if what I’m experiencing is expected (even > > after setting ALTER SESSION SET > > `drill.exec.functions.cast_empty_string_to_null` = true;), or if it’s a > > bug. > > > > > > Chris Matta > > [email protected] > > 215-701-3146 > > > > On Tue, Sep 22, 2015 at 2:33 PM, Christopher Matta <[email protected]> > > wrote: > > > > > Was the result I saw expected? Seems like I got back an empty string > and > > > not a NULL. > > > > > > Chris Matta > > > [email protected] > > > 215-701-3146 > > > > > > On Tue, Sep 22, 2015 at 1:48 PM, Sudheesh Katkam <[email protected] > > > > > wrote: > > > > > >> See below: > > >> > > >> > On Sep 21, 2015, at 8:22 AM, USC <[email protected]> wrote: > > >> > > > >> > Hi, > > >> > This is a system wide setting. Meaning, you need to say > > >> > > > >> > Alter system set `drill.exec.functions.cast_empty_string_to_null` = > > >> true; > > >> > > >> To clarify, all options available through sys.options (except some > used > > >> for testing) can be set at system and session level. So what Chris did > > >> (setting at session level) works. > > >> > > >> Thank you, > > >> Sudheesh > > >> > > >> > > > >> > > > >> > Sent from my iPhone > > >> > > > >> >> On Sep 21, 2015, at 7:18 AM, Christopher Matta <[email protected]> > > >> wrote: > > >> >> > > >> >> I’m not sure if it worked, the result looks the same when casting > as > > a > > >> >> string (empty field, not a NULL value): > > >> >> > > >> >> 0: jdbc:drill:> ALTER SESSION SET > > >> >> `drill.exec.functions.cast_empty_string_to_null` = true; > > >> >> > +-------+----------------------------------------------------------+ > > >> >> | ok | summary > | > > >> >> > +-------+----------------------------------------------------------+ > > >> >> | true | drill.exec.functions.cast_empty_string_to_null updated. > | > > >> >> > +-------+----------------------------------------------------------+ > > >> >> 1 row selected (1.606 seconds) > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as > > `row_key`, > > >> >> CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test` x; > > >> >> Error: SYSTEM ERROR: NumberFormatException: > > >> >> > > >> >> Fragment 0:0 > > >> >> > > >> >> [Error Id: 33e94b4d-6450-40bf-9f2c-bbbfab9f5990 on > > >> >> se-node10.se.lab:31010] (state=,code=0) > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as > > `row_key`, > > >> >> CASE WHERE x.`a`.`c1` is not null CAST(x.`a`.`c1` as INTEGER) fr > > >> >> Command canceled.`cmatta_test` x; > > >> >> 0: jdbc:drill:> select 'hello' from sys.version; > > >> >> +---------+ > > >> >> | EXPR$0 | > > >> >> +---------+ > > >> >> | hello | > > >> >> +---------+ > > >> >> 1 row selected (0.417 seconds) > > >> >> 0: jdbc:drill:> select cast(NULL as INTEGER) from sys.version; > > >> >> +---------+ > > >> >> | EXPR$0 | > > >> >> +---------+ > > >> >> | null | > > >> >> +---------+ > > >> >> 1 row selected (0.4 seconds) > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as > > `row_key`, > > >> >> CAST(CAST(x.`a`.`c1` as varchar(64)) as INTEGER) from maprfs.cma > > >> >> tta.`cmatta_test` x; > > >> >> Error: SYSTEM ERROR: NumberFormatException: > > >> >> > > >> >> Fragment 0:0 > > >> >> > > >> >> [Error Id: 71593a43-54ac-4e1d-b3d8-21a2d4d4acd6 on > > >> >> se-node10.se.lab:31010] (state=,code=0) > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as > > `row_key`, > > >> >> CAST(x.`a`.`c1` as varchar(64)) from maprfs.cmatta.`cmatta_test` > > >> >> x; > > >> >> +----------+---------+ > > >> >> | row_key | EXPR$1 | > > >> >> +----------+---------+ > > >> >> | row1 | 1 | > > >> >> | row2 | | > > >> >> | row3 | 5 | > > >> >> | row4 | 7 | > > >> >> +----------+---------+ > > >> >> 4 rows selected (0.54 seconds) > > >> >> 0: jdbc:drill:> > > >> >> > > >> >> Is this how it’s expected to work? > > >> >> > > >> >> > > >> >> Chris Matta > > >> >> [email protected] > > >> >> 215-701-3146 > > >> >> > > >> >>> On Fri, Sep 18, 2015 at 9:56 PM, Jacques Nadeau < > [email protected] > > > > > >> wrote: > > >> >>> > > >> >>> Does this system option not work: > > >> >>> > > >> >>> ALTER SESSION SET > `drill.exec.functions.cast_empty_string_to_null` = > > >> true; > > >> >>> > > >> >>> The reason the bug was marked INVALID is that SQL engines (not > sure > > >> about > > >> >>> the spec) don't allow casting from empty string to number. The > > system > > >> >>> option above is supposed to allow changing this behavior from the > > SQL > > >> >>> standard for your type of situation. That being said, I see the > docs > > >> say > > >> >>> "not supported in this release". Not sure why that is there. Can > you > > >> give > > >> >>> it a try? > > >> >>> > > >> >>> That being said, it seems like the original issue was a NPE not a > > >> NFE. That > > >> >>> definitely seems like something else. > > >> >>> > > >> >>> > > >> >>> -- > > >> >>> Jacques Nadeau > > >> >>> CTO and Co-Founder, Dremio > > >> >>> > > >> >>> On Thu, Sep 17, 2015 at 10:53 AM, Christopher Matta < > > [email protected]> > > >> >>> wrote: > > >> >>> > > >> >>>> Here is my attempt at building a reproduction, btw, it seems like > > >> this is > > >> >>>> the same issue as DRILL-862 > > >> >>>> <https://issues.apache.org/jira/browse/DRILL-862> where Jacques > > >> >>> determined > > >> >>>> the error to be invalid. Is trying to cast an empty string, or > null > > >> value > > >> >>>> to an integer invalid? What's the workaround? > > >> >>>> > > >> >>>> Data > > >> >>>> > > >> >>>> row1,1,2 > > >> >>>> row2,,4 > > >> >>>> row3,5,6 > > >> >>>> row4,7,8 > > >> >>>> > > >> >>>> Create Table > > >> >>>> > > >> >>>> $ maprcli table create -path /user/cmatta/projects/cmatta_test > > >> >>>> $ maprcli table cf create -path /user/cmatta/projects/cmatta_test > > >> >>> -cfname a > > >> >>>> > > >> >>>> Load into Hbase table: > > >> >>>> > > >> >>>> hbase org.apache.hadoop.hbase.mapreduce.ImportTsv > > >> >>>> -Dimporttsv.separator=',' > > -Dimporttsv.columns=HBASE_ROW_KEY,a:c1,a:c2 > > >> >>>> /user/cmatta/projects/cmatta_test > > >> >>>> maprfs:///user/cmatta/projects/testdata_hbase_null > > >> >>>> > > >> >>>> Query (error): > > >> >>>> > > >> >>>> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as > > >> `row_key`, > > >> >>>> CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test` x; > > >> >>>> Error: SYSTEM ERROR: NumberFormatException: > > >> >>>> > > >> >>>> Fragment 0:0 > > >> >>>> > > >> >>>> [Error Id: cbcb3327-3699-4191-9c26-9b95c9922690 on > > >> >>>> se-node11.se.lab:31010] (state=,code=0) > > >> >>>> > > >> >>>> Query that works on the column (c2) that doesn’t have a NULL > value: > > >> >>>> > > >> >>>> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as > > >> `row_key`, > > >> >>>> CAST(x.`a`.`c2` as INTEGER) from maprfs.cmatta.`cmatta_test` x; > > >> >>>> +----------+---------+ > > >> >>>> | row_key | EXPR$1 | > > >> >>>> +----------+---------+ > > >> >>>> | row1 | 2 | > > >> >>>> | row2 | 4 | > > >> >>>> | row3 | 6 | > > >> >>>> | row4 | 8 | > > >> >>>> +----------+---------+ > > >> >>>> 4 rows selected (0.308 seconds) > > >> >>>> > > >> >>>> > > >> >>>> Chris Matta > > >> >>>> [email protected] > > >> >>>> 215-701-3146 > > >> >>>> > > >> >>>> On Tue, Sep 15, 2015 at 3:09 PM, Christopher Matta < > > [email protected]> > > >> >>>> wrote: > > >> >>>> > > >> >>>>> Did a Jira get opened for this? If not, Nathaniel, can you > provide > > >> me > > >> >>>> with > > >> >>>>> your query and a sample data set so I can open one? > > >> >>>>> > > >> >>>>> Chris Matta > > >> >>>>> [email protected] > > >> >>>>> 215-701-3146 > > >> >>>>> > > >> >>>>> On Thu, Sep 10, 2015 at 5:03 PM, Jason Altekruse < > > >> >>>> [email protected] > > >> >>>>>> wrote: > > >> >>>>> > > >> >>>>>> A SQL level null is different than a null at the JAVA level > that > > >> would > > >> >>>> be > > >> >>>>>> giving this exception (we don't represent nulls with an actual > > null > > >> >>> java > > >> >>>>>> object). There might be a way to work around it, but this is a > > bug > > >> in > > >> >>>>>> Drill. You should be able to make a cast between compatible > types > > >> even > > >> >>>> if > > >> >>>>>> there are nulls in the dataset. > > >> >>>>>> > > >> >>>>>> Could you open a JIRA and put as much description about your > data > > >> and > > >> >>>>>> query > > >> >>>>>> as possible? If you have the time to create a small repro, > > loading > > >> an > > >> >>>>>> Hbase > > >> >>>>>> table with a small set of data that produces the problem that > > >> would be > > >> >>>>>> very > > >> >>>>>> helpful. In any case, whatever you can provide will be useful > the > > >> dev > > >> >>>> who > > >> >>>>>> picks up the JIRA. > > >> >>>>>> > > >> >>>>>> On Thu, Sep 10, 2015 at 12:49 PM, Christopher Matta < > > >> [email protected]> > > >> >>>>>> wrote: > > >> >>>>>> > > >> >>>>>>> Does changing the query to something like this work?: > > >> >>>>>>> > > >> >>>>>>> SELECT CAST(CASE WHEN table.cf.myColumn IS NOT NULL THEN > > >> >>>>>>> table.cf.myColumn ELSE NULL END as INTEGER) > > >> >>>>>>> ... > > >> >>>>>>> > > >> >>>>>>> > > >> >>>>>>> > > >> >>>>>>> Chris Matta > > >> >>>>>>> [email protected] > > >> >>>>>>> 215-701-3146 > > >> >>>>>>> > > >> >>>>>>> On Thu, Sep 10, 2015 at 12:49 PM, Nathaniel Auvil < > > >> >>>>>>> [email protected] > > >> >>>>>>>> wrote: > > >> >>>>>>> > > >> >>>>>>>> i have a MaprDB table which has nulls for some of the > columns. > > >> >>>> When i > > >> >>>>>>>> create a select where i CAST(table.cf.myColumn as INTEGER) > and > > >> >>> there > > >> >>>>>> are > > >> >>>>>>>> nulls instead of simply returning null for the value, there > is > > a > > >> >>>>>>>> NullPointerException. > > >> >>>>>>>> > > >> >>>>>>>> Why not simply return null for the value if it is missing > from > > a > > >> >>>> row? > > >> >>>>>>>> > > >> >>>>>>> > > >> >>>>>> > > >> >>>>> > > >> >>>>> > > >> >>>> > > >> >>> > > >> > > >> > > > > > >
