I actually noticed while I was testing this out that it is not working with
casts to date, I will be opening a JIRA for it. That being said, I did find
a unit test for integer, so this should be working. It is possible that the
data in the column is actually a series of spaces instead of the empty
string. Could you try to throw a call to trim() inside of the cast?

It is also possible that this system option might only impact the cast
functions from varchar to the various other types like double, int, etc.
All of the data read out of HBase is placed in the varbinary type. We might
have decent handling if the data is stored in Hbase in the format we happen
to assume in the cast varbinary to int implementation, but we generally
recommend using the convert_from functions when dealing with binary. I
don't know exactly how this option was implemented, so it might only apply
with varchar. We were primarily focused on issues with lots of case
statements when querying text files when it was added.

- Jason

On Wed, Sep 23, 2015 at 1:49 PM, Christopher Matta <[email protected]> wrote:

> This system setting was set, yet I still experience the
> NumberFormatException with MapR-DB when a column contains a NULL value:
>
> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`,
> CAST(x.`a`.`c1` as varchar(64)) from maprfs.cmatta.`cmatta_test`
> . . . . . . . >  x;
> +----------+---------+
> | row_key  | EXPR$1  |
> +----------+---------+
> | row1     | 1       |
> | row2     |         |
> | row3     | 5       |
> | row4     | 7       |
> +----------+---------+
> 4 rows selected (0.64 seconds)
> 0: jdbc:drill:> select name, type, bool_val from sys.options where
> name like '%null%';
> +-------------------------------------------------+---------+-----------+
> |                      name                       |  type   | bool_val  |
> +-------------------------------------------------+---------+-----------+
> | drill.exec.functions.cast_empty_string_to_null  | SYSTEM  | true      |
> +-------------------------------------------------+---------+-----------+
> 1 row selected (0.661 seconds)
> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`,
> CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test` x;
> Error: SYSTEM ERROR: NumberFormatException:
>
> Fragment 0:0
>
> [Error Id: 2a0d104c-33cd-4680-80fe-f908147b5c0a on
> se-node11.se.lab:31010] (state=,code=0)
>
> Shouldn’t that have returned a null for row2?
> ​
>
> Chris Matta
> [email protected]
> 215-701-3146
>
> On Wed, Sep 23, 2015 at 3:14 PM, Jason Altekruse <[email protected]
> >
> wrote:
>
> > Sudheesh,
> >
> > What you said is true for all of the options besides this one. When it
> was
> > added it was decided that users of Drill would likely want the results of
> > their queries to be consistent for all users, so this option only takes
> > effect at the system level. Unfortunately there is not an error returned
> > when it is set at the session level, the option just does nothing.
> >
> > Since then it seems like users are expected it to function like the rest
> of
> > the options (they are afraid of changing other peoples results once
> several
> > users are all running queries), so they have been reluctant to set it at
> > the system level.
> >
> > There is an outstanding JIRA to change this to make it consistent with
> the
> > others: https://issues.apache.org/jira/browse/DRILL-3259
> >
> > If this JIRA really isn't going to be scheduled before 1.4, I think it
> > might be good to create a sub-task for improving the experience right now
> > to at least tell users when they set it at the session level that it is
> not
> > allowed.
> >
> >
> >
> > On Wed, Sep 23, 2015 at 6:01 AM, Christopher Matta <[email protected]>
> > wrote:
> >
> > > Just bumping this, I’m sure other users are going to be running across
> > this
> > > eventually and I wanted to see if what I’m experiencing is expected
> (even
> > > after setting ALTER SESSION SET
> > > `drill.exec.functions.cast_empty_string_to_null` = true;), or if it’s a
> > > bug.
> > > ​
> > >
> > > Chris Matta
> > > [email protected]
> > > 215-701-3146
> > >
> > > On Tue, Sep 22, 2015 at 2:33 PM, Christopher Matta <[email protected]>
> > > wrote:
> > >
> > > > Was the result I saw expected? Seems like I got back an empty string
> > and
> > > > not a NULL.
> > > >
> > > > Chris Matta
> > > > [email protected]
> > > > 215-701-3146
> > > >
> > > > On Tue, Sep 22, 2015 at 1:48 PM, Sudheesh Katkam <
> [email protected]
> > >
> > > > wrote:
> > > >
> > > >> See below:
> > > >>
> > > >> > On Sep 21, 2015, at 8:22 AM, USC <[email protected]> wrote:
> > > >> >
> > > >> > Hi,
> > > >> > This is a system wide setting. Meaning, you need to say
> > > >> >
> > > >> > Alter system set `drill.exec.functions.cast_empty_string_to_null`
> =
> > > >> true;
> > > >>
> > > >> To clarify, all options available through sys.options (except some
> > used
> > > >> for testing) can be set at system and session level. So what Chris
> did
> > > >> (setting at session level) works.
> > > >>
> > > >> Thank you,
> > > >> Sudheesh
> > > >>
> > > >> >
> > > >> >
> > > >> > Sent from my iPhone
> > > >> >
> > > >> >> On Sep 21, 2015, at 7:18 AM, Christopher Matta <[email protected]>
> > > >> wrote:
> > > >> >>
> > > >> >> I’m not sure if it worked, the result looks the same when casting
> > as
> > > a
> > > >> >> string (empty field, not a NULL value):
> > > >> >>
> > > >> >> 0: jdbc:drill:> ALTER SESSION SET
> > > >> >> `drill.exec.functions.cast_empty_string_to_null` = true;
> > > >> >>
> > +-------+----------------------------------------------------------+
> > > >> >> |  ok   |                         summary
> > |
> > > >> >>
> > +-------+----------------------------------------------------------+
> > > >> >> | true  | drill.exec.functions.cast_empty_string_to_null updated.
> > |
> > > >> >>
> > +-------+----------------------------------------------------------+
> > > >> >> 1 row selected (1.606 seconds)
> > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as
> > > `row_key`,
> > > >> >> CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test` x;
> > > >> >> Error: SYSTEM ERROR: NumberFormatException:
> > > >> >>
> > > >> >> Fragment 0:0
> > > >> >>
> > > >> >> [Error Id: 33e94b4d-6450-40bf-9f2c-bbbfab9f5990 on
> > > >> >> se-node10.se.lab:31010] (state=,code=0)
> > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as
> > > `row_key`,
> > > >> >> CASE WHERE x.`a`.`c1` is not null CAST(x.`a`.`c1` as INTEGER) fr
> > > >> >> Command canceled.`cmatta_test` x;
> > > >> >> 0: jdbc:drill:> select 'hello' from sys.version;
> > > >> >> +---------+
> > > >> >> | EXPR$0  |
> > > >> >> +---------+
> > > >> >> | hello   |
> > > >> >> +---------+
> > > >> >> 1 row selected (0.417 seconds)
> > > >> >> 0: jdbc:drill:> select cast(NULL as INTEGER) from sys.version;
> > > >> >> +---------+
> > > >> >> | EXPR$0  |
> > > >> >> +---------+
> > > >> >> | null    |
> > > >> >> +---------+
> > > >> >> 1 row selected (0.4 seconds)
> > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as
> > > `row_key`,
> > > >> >> CAST(CAST(x.`a`.`c1` as varchar(64)) as INTEGER) from maprfs.cma
> > > >> >> tta.`cmatta_test` x;
> > > >> >> Error: SYSTEM ERROR: NumberFormatException:
> > > >> >>
> > > >> >> Fragment 0:0
> > > >> >>
> > > >> >> [Error Id: 71593a43-54ac-4e1d-b3d8-21a2d4d4acd6 on
> > > >> >> se-node10.se.lab:31010] (state=,code=0)
> > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as
> > > `row_key`,
> > > >> >> CAST(x.`a`.`c1` as varchar(64)) from maprfs.cmatta.`cmatta_test`
> > > >> >> x;
> > > >> >> +----------+---------+
> > > >> >> | row_key  | EXPR$1  |
> > > >> >> +----------+---------+
> > > >> >> | row1     | 1       |
> > > >> >> | row2     |         |
> > > >> >> | row3     | 5       |
> > > >> >> | row4     | 7       |
> > > >> >> +----------+---------+
> > > >> >> 4 rows selected (0.54 seconds)
> > > >> >> 0: jdbc:drill:>
> > > >> >>
> > > >> >> Is this how it’s expected to work?
> > > >> >>
> > > >> >>
> > > >> >> Chris Matta
> > > >> >> [email protected]
> > > >> >> 215-701-3146
> > > >> >>
> > > >> >>> On Fri, Sep 18, 2015 at 9:56 PM, Jacques Nadeau <
> > [email protected]
> > > >
> > > >> wrote:
> > > >> >>>
> > > >> >>> Does this system option not work:
> > > >> >>>
> > > >> >>> ALTER SESSION SET
> > `drill.exec.functions.cast_empty_string_to_null` =
> > > >> true;
> > > >> >>>
> > > >> >>> The reason the bug was marked INVALID is that SQL engines (not
> > sure
> > > >> about
> > > >> >>> the spec) don't allow casting from empty string to number. The
> > > system
> > > >> >>> option above is supposed to allow changing this behavior from
> the
> > > SQL
> > > >> >>> standard for your type of situation. That being said, I see the
> > docs
> > > >> say
> > > >> >>> "not supported in this release". Not sure why that is there. Can
> > you
> > > >> give
> > > >> >>> it a try?
> > > >> >>>
> > > >> >>> That being said, it seems like the original issue was a NPE not
> a
> > > >> NFE. That
> > > >> >>> definitely seems like something else.
> > > >> >>>
> > > >> >>>
> > > >> >>> --
> > > >> >>> Jacques Nadeau
> > > >> >>> CTO and Co-Founder, Dremio
> > > >> >>>
> > > >> >>> On Thu, Sep 17, 2015 at 10:53 AM, Christopher Matta <
> > > [email protected]>
> > > >> >>> wrote:
> > > >> >>>
> > > >> >>>> Here is my attempt at building a reproduction, btw, it seems
> like
> > > >> this is
> > > >> >>>> the same issue as DRILL-862
> > > >> >>>> <https://issues.apache.org/jira/browse/DRILL-862> where
> Jacques
> > > >> >>> determined
> > > >> >>>> the error to be invalid. Is trying to cast an empty string, or
> > null
> > > >> value
> > > >> >>>> to an integer invalid? What's the workaround?
> > > >> >>>>
> > > >> >>>> Data
> > > >> >>>>
> > > >> >>>> row1,1,2
> > > >> >>>> row2,,4
> > > >> >>>> row3,5,6
> > > >> >>>> row4,7,8
> > > >> >>>>
> > > >> >>>> Create Table
> > > >> >>>>
> > > >> >>>> $ maprcli table create -path /user/cmatta/projects/cmatta_test
> > > >> >>>> $ maprcli table cf create -path
> /user/cmatta/projects/cmatta_test
> > > >> >>> -cfname a
> > > >> >>>>
> > > >> >>>> Load into Hbase table:
> > > >> >>>>
> > > >> >>>> hbase org.apache.hadoop.hbase.mapreduce.ImportTsv
> > > >> >>>> -Dimporttsv.separator=','
> > > -Dimporttsv.columns=HBASE_ROW_KEY,a:c1,a:c2
> > > >> >>>> /user/cmatta/projects/cmatta_test
> > > >> >>>> maprfs:///user/cmatta/projects/testdata_hbase_null
> > > >> >>>>
> > > >> >>>> Query (error):
> > > >> >>>>
> > > >> >>>> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as
> > > >> `row_key`,
> > > >> >>>> CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test` x;
> > > >> >>>> Error: SYSTEM ERROR: NumberFormatException:
> > > >> >>>>
> > > >> >>>> Fragment 0:0
> > > >> >>>>
> > > >> >>>> [Error Id: cbcb3327-3699-4191-9c26-9b95c9922690 on
> > > >> >>>> se-node11.se.lab:31010] (state=,code=0)
> > > >> >>>>
> > > >> >>>> Query that works on the column (c2) that doesn’t have a NULL
> > value:
> > > >> >>>>
> > > >> >>>> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as
> > > >> `row_key`,
> > > >> >>>> CAST(x.`a`.`c2` as INTEGER) from maprfs.cmatta.`cmatta_test` x;
> > > >> >>>> +----------+---------+
> > > >> >>>> | row_key  | EXPR$1  |
> > > >> >>>> +----------+---------+
> > > >> >>>> | row1     | 2       |
> > > >> >>>> | row2     | 4       |
> > > >> >>>> | row3     | 6       |
> > > >> >>>> | row4     | 8       |
> > > >> >>>> +----------+---------+
> > > >> >>>> 4 rows selected (0.308 seconds)
> > > >> >>>>
> > > >> >>>>
> > > >> >>>> Chris Matta
> > > >> >>>> [email protected]
> > > >> >>>> 215-701-3146
> > > >> >>>>
> > > >> >>>> On Tue, Sep 15, 2015 at 3:09 PM, Christopher Matta <
> > > [email protected]>
> > > >> >>>> wrote:
> > > >> >>>>
> > > >> >>>>> Did a Jira get opened for this? If not, Nathaniel, can you
> > provide
> > > >> me
> > > >> >>>> with
> > > >> >>>>> your query and a sample data set so I can open one?
> > > >> >>>>>
> > > >> >>>>> Chris Matta
> > > >> >>>>> [email protected]
> > > >> >>>>> 215-701-3146
> > > >> >>>>>
> > > >> >>>>> On Thu, Sep 10, 2015 at 5:03 PM, Jason Altekruse <
> > > >> >>>> [email protected]
> > > >> >>>>>> wrote:
> > > >> >>>>>
> > > >> >>>>>> A SQL level null is different than a null at the JAVA level
> > that
> > > >> would
> > > >> >>>> be
> > > >> >>>>>> giving this exception (we don't represent nulls with an
> actual
> > > null
> > > >> >>> java
> > > >> >>>>>> object). There might be a way to work around it, but this is
> a
> > > bug
> > > >> in
> > > >> >>>>>> Drill. You should be able to make a cast between compatible
> > types
> > > >> even
> > > >> >>>> if
> > > >> >>>>>> there are nulls in the dataset.
> > > >> >>>>>>
> > > >> >>>>>> Could you open a JIRA and put as much description about your
> > data
> > > >> and
> > > >> >>>>>> query
> > > >> >>>>>> as possible? If you have the time to create a small repro,
> > > loading
> > > >> an
> > > >> >>>>>> Hbase
> > > >> >>>>>> table with a small set of data that produces the problem that
> > > >> would be
> > > >> >>>>>> very
> > > >> >>>>>> helpful. In any case, whatever you can provide will be useful
> > the
> > > >> dev
> > > >> >>>> who
> > > >> >>>>>> picks up the JIRA.
> > > >> >>>>>>
> > > >> >>>>>> On Thu, Sep 10, 2015 at 12:49 PM, Christopher Matta <
> > > >> [email protected]>
> > > >> >>>>>> wrote:
> > > >> >>>>>>
> > > >> >>>>>>> Does changing the query to something like this work?:
> > > >> >>>>>>>
> > > >> >>>>>>> SELECT CAST(CASE WHEN table.cf.myColumn IS NOT NULL THEN
> > > >> >>>>>>> table.cf.myColumn ELSE NULL END as INTEGER)
> > > >> >>>>>>> ...
> > > >> >>>>>>>
> > > >> >>>>>>>
> > > >> >>>>>>>
> > > >> >>>>>>> Chris Matta
> > > >> >>>>>>> [email protected]
> > > >> >>>>>>> 215-701-3146
> > > >> >>>>>>>
> > > >> >>>>>>> On Thu, Sep 10, 2015 at 12:49 PM, Nathaniel Auvil <
> > > >> >>>>>>> [email protected]
> > > >> >>>>>>>> wrote:
> > > >> >>>>>>>
> > > >> >>>>>>>> i have a MaprDB table which has nulls for some of the
> > columns.
> > > >> >>>> When i
> > > >> >>>>>>>> create a select where i CAST(table.cf.myColumn as INTEGER)
> > and
> > > >> >>> there
> > > >> >>>>>> are
> > > >> >>>>>>>> nulls instead of simply returning null for the value, there
> > is
> > > a
> > > >> >>>>>>>> NullPointerException.
> > > >> >>>>>>>>
> > > >> >>>>>>>> Why not simply return null for the value if it is missing
> > from
> > > a
> > > >> >>>> row?
> > > >> >>>>>>>>
> > > >> >>>>>>>
> > > >> >>>>>>
> > > >> >>>>>
> > > >> >>>>>
> > > >> >>>>
> > > >> >>>
> > > >>
> > > >>
> > > >
> > >
> >
>

Reply via email to