This is a limitation with late-appearing fields. Right now, depending on the situation, if a value doesn't show up in the first ~4k records, we assume that the value is BigInt. I think a developer is working on improving this behavior right now. I'll ping them to see when we might have a fix.
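The inference window described above can be reproduced with a small generator script. This is a sketch under assumptions: the `write_repro` function name and the 5000-row count are mine, and the exact size of Drill's inference window is an internal detail that may differ between versions; the sample value is taken from the thread below.

```python
import json

def write_repro(path, null_rows=5000):
    """Write a newline-delimited JSON file that triggers the schema
    problem discussed in this thread: field_a is null for the first
    null_rows records, so Drill's reader guesses a nullable numeric
    type before it ever sees a string value."""
    with open(path, "w") as f:
        for _ in range(null_rows):
            f.write(json.dumps({"field_a": None}) + "\n")
        # The first non-null (VarChar) value arrives only after the
        # inference window, which is what provokes the DATA_READ error.
        f.write(json.dumps({"field_a": "e900511b2bff6b9d33cc"}) + "\n")

write_repro("poc.json")
```

Querying a file like this with `SELECT field_a FROM dfs.poc.`poc.json`` on an affected Drill version should reproduce the `NullableIntWriterImpl` error from the thread.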
--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Fri, Sep 18, 2015 at 1:04 AM, Mustafa Engin Sözer <[email protected]> wrote:

> Hi Andries,
>
> I've already tried writing WHERE and CASE statements, but unfortunately
> none of them worked. It's still:
>
> DATA_READ ERROR: Error parsing JSON - You tried to write a VarChar type
> when you are using a ValueWriter of type NullableIntWriterImpl.
>
> When I try this in sqlline, I can confirm that the output of the query is
> written up to the 5665th row; then the VarChar value comes up and the
> query fails.
>
> Queries used:
>
> 1) SELECT CASE WHEN field_a IS NULL THEN 1 ELSE 0 END FROM
>    dfs.poc.`poc.json` t;
> 2) SELECT field_a FROM dfs.poc.`poc.json` WHERE field_a IS NOT NULL;
>    (I actually also need those rows, but this is not working either.)
>
> As you can see from the first query, even if I do not select the field
> itself as part of the output, the problem occurs while reading the JSON
> file (specifically for this field).
>
> On 17 September 2015 at 17:15, Andries Engelbrecht <[email protected]> wrote:
>
> > I don't know what the use case is for the queries you are trying to run.
> >
> > See if these two workarounds can work for your needs:
> >
> > 1. The simplest, if you are not interested in the records where the
> >    field value is null (or not null): use a predicate to filter out
> >    those records.
> >
> > 2. Use a CASE statement before casting to a specific data type, to
> >    handle the records with the null field differently.
> >
> > —Andries
> >
> >
> > > On Sep 16, 2015, at 5:22 AM, Mustafa Engin Sözer <[email protected]> wrote:
> > >
> > > By the way, I forgot to mention that we currently use Drill 1.0.
> > > An additional point:
> > >
> > > I did some other tests, and the point is that even if I change that
> > > VarChar field into an integer (e.g. 123456) without the double quotes,
> > > it still does not work. The only way it works is if I also set that
> > > value to null.
> > > That's really weird. I might be missing something here, but I can't
> > > figure out what at the moment.
> > >
> > > On 16 September 2015 at 10:57, Mustafa Engin Sözer <[email protected]> wrote:
> > >
> > >> Hi everyone,
> > >>
> > >> I'm having an issue here. I have the following sample set as JSON:
> > >>
> > >> {
> > >>   "field_a": null
> > >> }
> > >> {
> > >>   "field_a": "e900511b2bff6b9d33cc"
> > >> }
> > >>
> > >> Due to some problems in the dataset, I already had to set
> > >> store.json.all_text_mode to true.
> > >>
> > >> But even now, when I try to query the dataset, the following error is
> > >> thrown:
> > >>
> > >> DATA_READ ERROR: Error parsing JSON - You tried to write a VarChar
> > >> type when you are using a ValueWriter of type NullableIntWriterImpl.
> > >>
> > >> There are more than 40K rows in this dataset (I've just summarized the
> > >> relevant part here). The thing is, up to
> > >> "field_a":"e900511b2bff6b9d33cc", the value of field_a was always
> > >> null, and I presume that Drill assigned a NullableInt type during
> > >> schema discovery, so when it encounters a VarChar value at the 5665th
> > >> record, it just fails.
> > >>
> > >> As far as I know, if I enclose the null value in double quotes, it is
> > >> no longer a true null representation. In the end, the question is: do
> > >> you know what the problem might be, and is there any workaround or
> > >> setting to overcome this issue?
> > >>
> > >> Another thing: when store.json.all_text_mode is set to true, why does
> > >> Drill still recognize this field as NullableInt? Shouldn't it already
> > >> consider everything as VarChar?
> > >>
> > >> Thanks a lot for your help.
> > >>
> > >> Cheers,
> > >> --
> > >> *M. Engin Sözer*
> > >> Junior Datawarehouse Manager
> > >> [email protected]
> > >>
> > >> Goodgame Studios
> > >> Theodorstr. 42-90, House 9
> > >> 22761 Hamburg, Germany
> > >> Phone: +49 (0)40 219 880 -0
> > >> *www.goodgamestudios.com <http://www.goodgamestudios.com>*
> > >>
> > >> Goodgame Studios is a branch of Altigi GmbH
> > >> Altigi GmbH, District court Hamburg, HRB 99869
> > >> Board of directors: Dr. Kai Wawrzinek, Dr. Christian Wawrzinek, Fabian
> > >> Ritter
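Since all_text_mode did not help on the affected Drill version, one practical workaround is to preprocess the file so the field is never a bare null. This is a sketch of my own, not an official Drill fix; the `patch_nulls` function name is mine, and note that it changes the data: downstream queries must map the empty string back to null (e.g. `NULLIF(field_a, '')` in Drill SQL).

```python
import json

def patch_nulls(src, dst, field="field_a"):
    """Rewrite null values of `field` as empty strings in a
    newline-delimited JSON file, so that every value the reader
    sees is a string and the type guess can never be NullableInt.
    Caveat: '' is not a true null; handle it downstream."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            rec = json.loads(line)
            if rec.get(field) is None:
                rec[field] = ""
            fout.write(json.dumps(rec) + "\n")
```

The same idea works with any stream-processing tool; the key point is that the first few thousand records must already carry a string-typed value for the field.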
