Hi,

I have nothing meaningful to add but I had share that this BigInt
assumption has caused more grief than any other single feature in Drill.

I would go so far as to say that the "type intolerance" and lack of
"sensible conversion"  is the biggest hurdle on the way to fulfilling the
"eliminate ETL and start working on your data right away" promise.

Don't get me wrong. I love Drill and we plan on using quite a lot.

Regards,
  -Stefan

On Fri, Sep 18, 2015 at 9:59 PM, Jacques Nadeau <[email protected]> wrote:

> This is a limitation of really late appearing fields. Right now, depending
> on the situation, if a value doesn't show up in the first ~4k, we assume
> that the value is BigInt. I think a developer is working on improving this
> behavior right now. I'll ping them to see when we might have a fix.
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Fri, Sep 18, 2015 at 1:04 AM, Mustafa Engin Sözer <
> [email protected]> wrote:
>
> > Hi Andries,
> >
> > I've already tried writing where and case statements but none of them
> > worked unfortunately. It's still:
> >
> > DATA_READ ERROR: Error parsing JSON - You tried to write a VarChar type
> > when you are using a ValueWriter of type NullableIntWriterImpl.
> >
> > When I try this on sqlline, I can confirm that the output of the query is
> > written until 5665th row and then the Varchar value comes up and the
> query
> > fails.
> >
> > Used queries:
> >
> > 1) SELECT CASE WHEN field_a IS NULL THEN 1 ELSE 0 END  from
> > dfs.poc.`poc.json` t;
> > 2) SELECT field_a FROM dfs.poc.`poc.json` where field_a is not null; (I
> > actually also need those rows but anyway, this is also not working)
> >
> > As you see from the first query, even if I do not select the field itself
> > in the statement as part of the output, the problem occurs during the
> > reading of json file. (Specifically this field)
> >
> > On 17 September 2015 at 17:15, Andries Engelbrecht <
> > [email protected]> wrote:
> >
> > > Don’t know what the use case is for the queries you are trying to run.
> > >
> > > See if these 2 workarounds can work for your needs.
> > >
> > > 1. The simplest, if you are not interested in the records where the
> field
> > > value is null or not null.
> > > Use a predicate to filter out the records.
> > >
> > > 2. Use a case statement before casting to a specific data type to
> handle
> > > the records with the null field differently.
> > >
> > > —Andries
> > >
> > >
> > > > On Sep 16, 2015, at 5:22 AM, Mustafa Engin Sözer <
> > > [email protected]> wrote:
> > > >
> > > > By the way, I forgot to mention that we use Drill 1.0 currently.
> > > > Additionall point:
> > > >
> > > > I've did some other tests and the point is that even if I change that
> > > > varchar field into an integer (e.g. 123456) without the double
> quotes,
> > it
> > > > still does not work. The only way it works is if I set that also to
> > null.
> > > > That's really weird. I might be missing something here but can't
> figure
> > > out
> > > > what at the moment.
> > > >
> > > > On 16 September 2015 at 10:57, Mustafa Engin Sözer <
> > > > [email protected]> wrote:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> I'm having an issue here. I have the following sample set as json:
> > > >>
> > > >> {
> > > >> "field_a":null
> > > >> }
> > > >> {
> > > >> "field_a":"e900511b2bff6b9d33cc"
> > > >> }
> > > >>
> > > >> Due to some problems in the dataset, I had to already set:
> > > >>
> > > >> store.json.all_text_mode to true
> > > >>
> > > >> But even now, when i try to query the dataset, the following error
> is
> > > >> thrown:
> > > >>
> > > >> DATA_READ ERROR: Error parsing JSON - You tried to write a VarChar
> > type
> > > >> when you are using a ValueWriter of type NullableIntWriterImpl.
> > > >>
> > > >>
> > > >> There are more than 40K rows already in this dataset (I've just
> > > summarized
> > > >> the related part here). The thing is up to
> > > >> "field_a":"e900511b2bff6b9d33cc", the value of field_a was always
> > null.
> > > And
> > > >> I presume that Drill already assigned a NullableInt type during
> schema
> > > >> recovery, thus when it faces a Varchar value at the 5665th record,
> it
> > > just
> > > >> fails.
> > > >>
> > > >> As far as I know, if I enclose the null value with double quotes,
> then
> > > >> it's not really a true null representation. At the end, the question
> > > is: do
> > > >> you know what might be the problem and is there any workaround or
> > > setting
> > > >> to overcome this issue?
> > > >>
> > > >> Another thing is: when store.json.all_text_mode is set to true, why
> > does
> > > >> Drill still recognize this field as NullableInt ? Shouldn't it
> > consider
> > > >> everything as Varchar already?
> > > >>
> > > >> Thanks a lot for your help.
> > > >>
> > > >> Cheers,
> > > >> --
> > > >>
> > > >> *M. Engin Sözer*
> > > >> Junior Datawarehouse Manager
> > > >> [email protected]
> > > >>
> > > >> Goodgame Studios
> > > >> Theodorstr. 42-90, House 9
> > > >> 22761 Hamburg, Germany
> > > >> Phone: +49 (0)40 219 880 -0
> > > >> *www.goodgamestudios.com <http://www.goodgamestudios.com>*
> > > >>
> > > >> Goodgame Studios is a branch of Altigi GmbH
> > > >> Altigi GmbH, District court Hamburg, HRB 99869
> > > >> Board of directors: Dr. Kai Wawrzinek, Dr. Christian Wawrzinek,
> Fabian
> > > >> Ritter
> > > >>
> > > >>
> > > >
> > > >
> > > > --
> > > >
> > > > *M. Engin Sözer*
> > > > Junior Datawarehouse Manager
> > > > [email protected]
> > > >
> > > > Goodgame Studios
> > > > Theodorstr. 42-90, House 9
> > > > 22761 Hamburg, Germany
> > > > Phone: +49 (0)40 219 880 -0
> > > > *www.goodgamestudios.com <http://www.goodgamestudios.com>*
> > > >
> > > > Goodgame Studios is a branch of Altigi GmbH
> > > > Altigi GmbH, District court Hamburg, HRB 99869
> > > > Board of directors: Dr. Kai Wawrzinek, Dr. Christian Wawrzinek,
> Fabian
> > > > Ritter
> > >
> > >
> >
> >
> > --
> >
> > *M. Engin Sözer*
> > Junior Datawarehouse Manager
> > [email protected]
> >
> > Goodgame Studios
> > Theodorstr. 42-90, House 9
> > 22761 Hamburg, Germany
> > Phone: +49 (0)40 219 880 -0
> > *www.goodgamestudios.com <http://www.goodgamestudios.com>*
> >
> > Goodgame Studios is a branch of Altigi GmbH
> > Altigi GmbH, District court Hamburg, HRB 99869
> > Board of directors: Dr. Kai Wawrzinek, Dr. Christian Wawrzinek, Fabian
> > Ritter
> >
>

Reply via email to