Jason, using trim() did the trick:

0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`,
CAST(trim(x.`a`.`c1`) as INTEGER) from maprfs.cmatta.`cmatta_test`
. . . . . . . > x ;
+----------+---------+
| row_key  | EXPR$1  |
+----------+---------+
| row1     | 1       |
| row2     | null    |
| row3     | 5       |
| row4     | 7       |
+----------+---------+
4 rows selected (0.542 seconds)
0: jdbc:drill:>

​

Chris Matta
[email protected]
215-701-3146

On Wed, Sep 23, 2015 at 5:15 PM, Jason Altekruse <[email protected]>
wrote:

> I actually noticed while I was testing this out that it is not working with
> casts to date, I will be opening a JIRA for it. That being said, I did find
> a unit test for integer, so this should be working. It is possible that the
> data in the column is actually a series of spaces instead of the empty
> string. Could you try to throw a call to trim() inside of the cast?
>
> It is also possible that this system option might only impact the cast
> functions from varchar to the various other types like double, int, etc.
> All of the data read out of HBase is placed in the varbinary type. We might
> have decent handling if the data is stored in Hbase in the format we happen
> to assume in the cast varbinary to int implementation, but we generally
> recommend using the convert_from functions when dealing with binary. I
> don't know exactly how this option was implemented, so it might only apply
> with varchar. We were primarily focused on issues with lots of case
> statements when querying text files when it was added.
>
> - Jason
>
> On Wed, Sep 23, 2015 at 1:49 PM, Christopher Matta <[email protected]>
> wrote:
>
> > This system setting was set, yet I still experience the
> > NumberFormatException with MapR-DB when a column contains a NULL value:
> >
> > 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`,
> > CAST(x.`a`.`c1` as varchar(64)) from maprfs.cmatta.`cmatta_test`
> > . . . . . . . >  x;
> > +----------+---------+
> > | row_key  | EXPR$1  |
> > +----------+---------+
> > | row1     | 1       |
> > | row2     |         |
> > | row3     | 5       |
> > | row4     | 7       |
> > +----------+---------+
> > 4 rows selected (0.64 seconds)
> > 0: jdbc:drill:> select name, type, bool_val from sys.options where
> > name like '%null%';
> > +-------------------------------------------------+---------+-----------+
> > |                      name                       |  type   | bool_val  |
> > +-------------------------------------------------+---------+-----------+
> > | drill.exec.functions.cast_empty_string_to_null  | SYSTEM  | true      |
> > +-------------------------------------------------+---------+-----------+
> > 1 row selected (0.661 seconds)
> > 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`,
> > CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test` x;
> > Error: SYSTEM ERROR: NumberFormatException:
> >
> > Fragment 0:0
> >
> > [Error Id: 2a0d104c-33cd-4680-80fe-f908147b5c0a on
> > se-node11.se.lab:31010] (state=,code=0)
> >
> > Shouldn’t that have returned a null for row2?
> > ​
> >
> > Chris Matta
> > [email protected]
> > 215-701-3146
> >
> > On Wed, Sep 23, 2015 at 3:14 PM, Jason Altekruse <
> [email protected]
> > >
> > wrote:
> >
> > > Sudheesh,
> > >
> > > What you said is true for all of the options besides this one. When it
> > was
> > > added it was decided that users of Drill would likely want the results
> of
> > > their queries to be consistent for all users, so this option only takes
> > > effect at the system level. Unfortunately there is not an error
> returned
> > > when it is set at the session level, the option just does nothing.
> > >
> > > Since then it seems like users are expected it to function like the
> rest
> > of
> > > the options (they are afraid of changing other peoples results once
> > several
> > > users are all running queries), so they have been reluctant to set it
> at
> > > the system level.
> > >
> > > There is an outstanding JIRA to change this to make it consistent with
> > the
> > > others: https://issues.apache.org/jira/browse/DRILL-3259
> > >
> > > If this JIRA really isn't going to be scheduled before 1.4, I think it
> > > might be good to create a sub-task for improving the experience right
> now
> > > to at least tell users when they set it at the session level that it is
> > not
> > > allowed.
> > >
> > >
> > >
> > > On Wed, Sep 23, 2015 at 6:01 AM, Christopher Matta <[email protected]>
> > > wrote:
> > >
> > > > Just bumping this, I’m sure other users are going to be running
> across
> > > this
> > > > eventually and I wanted to see if what I’m experiencing is expected
> > (even
> > > > after setting ALTER SESSION SET
> > > > `drill.exec.functions.cast_empty_string_to_null` = true;), or if
> it’s a
> > > > bug.
> > > > ​
> > > >
> > > > Chris Matta
> > > > [email protected]
> > > > 215-701-3146
> > > >
> > > > On Tue, Sep 22, 2015 at 2:33 PM, Christopher Matta <[email protected]>
> > > > wrote:
> > > >
> > > > > Was the result I saw expected? Seems like I got back an empty
> string
> > > and
> > > > > not a NULL.
> > > > >
> > > > > Chris Matta
> > > > > [email protected]
> > > > > 215-701-3146
> > > > >
> > > > > On Tue, Sep 22, 2015 at 1:48 PM, Sudheesh Katkam <
> > [email protected]
> > > >
> > > > > wrote:
> > > > >
> > > > >> See below:
> > > > >>
> > > > >> > On Sep 21, 2015, at 8:22 AM, USC <[email protected]> wrote:
> > > > >> >
> > > > >> > Hi,
> > > > >> > This is a system wide setting. Meaning, you need to say
> > > > >> >
> > > > >> > Alter system set
> `drill.exec.functions.cast_empty_string_to_null`
> > =
> > > > >> true;
> > > > >>
> > > > >> To clarify, all options available through sys.options (except some
> > > used
> > > > >> for testing) can be set at system and session level. So what Chris
> > did
> > > > >> (setting at session level) works.
> > > > >>
> > > > >> Thank you,
> > > > >> Sudheesh
> > > > >>
> > > > >> >
> > > > >> >
> > > > >> > Sent from my iPhone
> > > > >> >
> > > > >> >> On Sep 21, 2015, at 7:18 AM, Christopher Matta <
> [email protected]>
> > > > >> wrote:
> > > > >> >>
> > > > >> >> I’m not sure if it worked, the result looks the same when
> casting
> > > as
> > > > a
> > > > >> >> string (empty field, not a NULL value):
> > > > >> >>
> > > > >> >> 0: jdbc:drill:> ALTER SESSION SET
> > > > >> >> `drill.exec.functions.cast_empty_string_to_null` = true;
> > > > >> >>
> > > +-------+----------------------------------------------------------+
> > > > >> >> |  ok   |                         summary
> > > |
> > > > >> >>
> > > +-------+----------------------------------------------------------+
> > > > >> >> | true  | drill.exec.functions.cast_empty_string_to_null
> updated.
> > > |
> > > > >> >>
> > > +-------+----------------------------------------------------------+
> > > > >> >> 1 row selected (1.606 seconds)
> > > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as
> > > > `row_key`,
> > > > >> >> CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test` x;
> > > > >> >> Error: SYSTEM ERROR: NumberFormatException:
> > > > >> >>
> > > > >> >> Fragment 0:0
> > > > >> >>
> > > > >> >> [Error Id: 33e94b4d-6450-40bf-9f2c-bbbfab9f5990 on
> > > > >> >> se-node10.se.lab:31010] (state=,code=0)
> > > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as
> > > > `row_key`,
> > > > >> >> CASE WHERE x.`a`.`c1` is not null CAST(x.`a`.`c1` as INTEGER)
> fr
> > > > >> >> Command canceled.`cmatta_test` x;
> > > > >> >> 0: jdbc:drill:> select 'hello' from sys.version;
> > > > >> >> +---------+
> > > > >> >> | EXPR$0  |
> > > > >> >> +---------+
> > > > >> >> | hello   |
> > > > >> >> +---------+
> > > > >> >> 1 row selected (0.417 seconds)
> > > > >> >> 0: jdbc:drill:> select cast(NULL as INTEGER) from sys.version;
> > > > >> >> +---------+
> > > > >> >> | EXPR$0  |
> > > > >> >> +---------+
> > > > >> >> | null    |
> > > > >> >> +---------+
> > > > >> >> 1 row selected (0.4 seconds)
> > > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as
> > > > `row_key`,
> > > > >> >> CAST(CAST(x.`a`.`c1` as varchar(64)) as INTEGER) from
> maprfs.cma
> > > > >> >> tta.`cmatta_test` x;
> > > > >> >> Error: SYSTEM ERROR: NumberFormatException:
> > > > >> >>
> > > > >> >> Fragment 0:0
> > > > >> >>
> > > > >> >> [Error Id: 71593a43-54ac-4e1d-b3d8-21a2d4d4acd6 on
> > > > >> >> se-node10.se.lab:31010] (state=,code=0)
> > > > >> >> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as
> > > > `row_key`,
> > > > >> >> CAST(x.`a`.`c1` as varchar(64)) from
> maprfs.cmatta.`cmatta_test`
> > > > >> >> x;
> > > > >> >> +----------+---------+
> > > > >> >> | row_key  | EXPR$1  |
> > > > >> >> +----------+---------+
> > > > >> >> | row1     | 1       |
> > > > >> >> | row2     |         |
> > > > >> >> | row3     | 5       |
> > > > >> >> | row4     | 7       |
> > > > >> >> +----------+---------+
> > > > >> >> 4 rows selected (0.54 seconds)
> > > > >> >> 0: jdbc:drill:>
> > > > >> >>
> > > > >> >> Is this how it’s expected to work?
> > > > >> >>
> > > > >> >>
> > > > >> >> Chris Matta
> > > > >> >> [email protected]
> > > > >> >> 215-701-3146
> > > > >> >>
> > > > >> >>> On Fri, Sep 18, 2015 at 9:56 PM, Jacques Nadeau <
> > > [email protected]
> > > > >
> > > > >> wrote:
> > > > >> >>>
> > > > >> >>> Does this system option not work:
> > > > >> >>>
> > > > >> >>> ALTER SESSION SET
> > > `drill.exec.functions.cast_empty_string_to_null` =
> > > > >> true;
> > > > >> >>>
> > > > >> >>> The reason the bug was marked INVALID is that SQL engines (not
> > > sure
> > > > >> about
> > > > >> >>> the spec) don't allow casting from empty string to number. The
> > > > system
> > > > >> >>> option above is supposed to allow changing this behavior from
> > the
> > > > SQL
> > > > >> >>> standard for your type of situation. That being said, I see
> the
> > > docs
> > > > >> say
> > > > >> >>> "not supported in this release". Not sure why that is there.
> Can
> > > you
> > > > >> give
> > > > >> >>> it a try?
> > > > >> >>>
> > > > >> >>> That being said, it seems like the original issue was a NPE
> not
> > a
> > > > >> NFE. That
> > > > >> >>> definitely seems like something else.
> > > > >> >>>
> > > > >> >>>
> > > > >> >>> --
> > > > >> >>> Jacques Nadeau
> > > > >> >>> CTO and Co-Founder, Dremio
> > > > >> >>>
> > > > >> >>> On Thu, Sep 17, 2015 at 10:53 AM, Christopher Matta <
> > > > [email protected]>
> > > > >> >>> wrote:
> > > > >> >>>
> > > > >> >>>> Here is my attempt at building a reproduction, btw, it seems
> > like
> > > > >> this is
> > > > >> >>>> the same issue as DRILL-862
> > > > >> >>>> <https://issues.apache.org/jira/browse/DRILL-862> where
> > Jacques
> > > > >> >>> determined
> > > > >> >>>> the error to be invalid. Is trying to cast an empty string,
> or
> > > null
> > > > >> value
> > > > >> >>>> to an integer invalid? What's the workaround?
> > > > >> >>>>
> > > > >> >>>> Data
> > > > >> >>>>
> > > > >> >>>> row1,1,2
> > > > >> >>>> row2,,4
> > > > >> >>>> row3,5,6
> > > > >> >>>> row4,7,8
> > > > >> >>>>
> > > > >> >>>> Create Table
> > > > >> >>>>
> > > > >> >>>> $ maprcli table create -path
> /user/cmatta/projects/cmatta_test
> > > > >> >>>> $ maprcli table cf create -path
> > /user/cmatta/projects/cmatta_test
> > > > >> >>> -cfname a
> > > > >> >>>>
> > > > >> >>>> Load into Hbase table:
> > > > >> >>>>
> > > > >> >>>> hbase org.apache.hadoop.hbase.mapreduce.ImportTsv
> > > > >> >>>> -Dimporttsv.separator=','
> > > > -Dimporttsv.columns=HBASE_ROW_KEY,a:c1,a:c2
> > > > >> >>>> /user/cmatta/projects/cmatta_test
> > > > >> >>>> maprfs:///user/cmatta/projects/testdata_hbase_null
> > > > >> >>>>
> > > > >> >>>> Query (error):
> > > > >> >>>>
> > > > >> >>>> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as
> > > > >> `row_key`,
> > > > >> >>>> CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test`
> x;
> > > > >> >>>> Error: SYSTEM ERROR: NumberFormatException:
> > > > >> >>>>
> > > > >> >>>> Fragment 0:0
> > > > >> >>>>
> > > > >> >>>> [Error Id: cbcb3327-3699-4191-9c26-9b95c9922690 on
> > > > >> >>>> se-node11.se.lab:31010] (state=,code=0)
> > > > >> >>>>
> > > > >> >>>> Query that works on the column (c2) that doesn’t have a NULL
> > > value:
> > > > >> >>>>
> > > > >> >>>> 0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as
> > > > >> `row_key`,
> > > > >> >>>> CAST(x.`a`.`c2` as INTEGER) from maprfs.cmatta.`cmatta_test`
> x;
> > > > >> >>>> +----------+---------+
> > > > >> >>>> | row_key  | EXPR$1  |
> > > > >> >>>> +----------+---------+
> > > > >> >>>> | row1     | 2       |
> > > > >> >>>> | row2     | 4       |
> > > > >> >>>> | row3     | 6       |
> > > > >> >>>> | row4     | 8       |
> > > > >> >>>> +----------+---------+
> > > > >> >>>> 4 rows selected (0.308 seconds)
> > > > >> >>>>
> > > > >> >>>>
> > > > >> >>>> Chris Matta
> > > > >> >>>> [email protected]
> > > > >> >>>> 215-701-3146
> > > > >> >>>>
> > > > >> >>>> On Tue, Sep 15, 2015 at 3:09 PM, Christopher Matta <
> > > > [email protected]>
> > > > >> >>>> wrote:
> > > > >> >>>>
> > > > >> >>>>> Did a Jira get opened for this? If not, Nathaniel, can you
> > > provide
> > > > >> me
> > > > >> >>>> with
> > > > >> >>>>> your query and a sample data set so I can open one?
> > > > >> >>>>>
> > > > >> >>>>> Chris Matta
> > > > >> >>>>> [email protected]
> > > > >> >>>>> 215-701-3146
> > > > >> >>>>>
> > > > >> >>>>> On Thu, Sep 10, 2015 at 5:03 PM, Jason Altekruse <
> > > > >> >>>> [email protected]
> > > > >> >>>>>> wrote:
> > > > >> >>>>>
> > > > >> >>>>>> A SQL level null is different than a null at the JAVA level
> > > that
> > > > >> would
> > > > >> >>>> be
> > > > >> >>>>>> giving this exception (we don't represent nulls with an
> > actual
> > > > null
> > > > >> >>> java
> > > > >> >>>>>> object). There might be a way to work around it, but this
> is
> > a
> > > > bug
> > > > >> in
> > > > >> >>>>>> Drill. You should be able to make a cast between compatible
> > > types
> > > > >> even
> > > > >> >>>> if
> > > > >> >>>>>> there are nulls in the dataset.
> > > > >> >>>>>>
> > > > >> >>>>>> Could you open a JIRA and put as much description about
> your
> > > data
> > > > >> and
> > > > >> >>>>>> query
> > > > >> >>>>>> as possible? If you have the time to create a small repro,
> > > > loading
> > > > >> an
> > > > >> >>>>>> Hbase
> > > > >> >>>>>> table with a small set of data that produces the problem
> that
> > > > >> would be
> > > > >> >>>>>> very
> > > > >> >>>>>> helpful. In any case, whatever you can provide will be
> useful
> > > the
> > > > >> dev
> > > > >> >>>> who
> > > > >> >>>>>> picks up the JIRA.
> > > > >> >>>>>>
> > > > >> >>>>>> On Thu, Sep 10, 2015 at 12:49 PM, Christopher Matta <
> > > > >> [email protected]>
> > > > >> >>>>>> wrote:
> > > > >> >>>>>>
> > > > >> >>>>>>> Does changing the query to something like this work?:
> > > > >> >>>>>>>
> > > > >> >>>>>>> SELECT CAST(CASE WHEN table.cf.myColumn IS NOT NULL THEN
> > > > >> >>>>>>> table.cf.myColumn ELSE NULL END as INTEGER)
> > > > >> >>>>>>> ...
> > > > >> >>>>>>>
> > > > >> >>>>>>>
> > > > >> >>>>>>>
> > > > >> >>>>>>> Chris Matta
> > > > >> >>>>>>> [email protected]
> > > > >> >>>>>>> 215-701-3146
> > > > >> >>>>>>>
> > > > >> >>>>>>> On Thu, Sep 10, 2015 at 12:49 PM, Nathaniel Auvil <
> > > > >> >>>>>>> [email protected]
> > > > >> >>>>>>>> wrote:
> > > > >> >>>>>>>
> > > > >> >>>>>>>> i have a MaprDB table which has nulls for some of the
> > > columns.
> > > > >> >>>> When i
> > > > >> >>>>>>>> create a select where i CAST(table.cf.myColumn as
> INTEGER)
> > > and
> > > > >> >>> there
> > > > >> >>>>>> are
> > > > >> >>>>>>>> nulls instead of simply returning null for the value,
> there
> > > is
> > > > a
> > > > >> >>>>>>>> NullPointerException.
> > > > >> >>>>>>>>
> > > > >> >>>>>>>> Why not simply return null for the value if it is missing
> > > from
> > > > a
> > > > >> >>>> row?
> > > > >> >>>>>>>>
> > > > >> >>>>>>>
> > > > >> >>>>>>
> > > > >> >>>>>
> > > > >> >>>>>
> > > > >> >>>>
> > > > >> >>>
> > > > >>
> > > > >>
> > > > >
> > > >
> > >
> >
>

Reply via email to