Working with non-sane data - JSON Types

John Omernik Mon, 18 Jan 2016 14:13:11 -0800

I am working with a LARGE volume of data (I state that because even my
first reaction was "I'll just write a simple sed command and fix this data
up lickety-split").


However, lots of files, lots of data, so let's avoid that as the initial
answer if possible. (Ideally I am looking for an "on read" solution in
Drill)

Basically, when I try to read a file, I get this error:

Error: DATA_READ ERROR: You tried to start when you are using a ValueWriter
of type SingleMapWriter.

The field in question has a silly setup: if it's empty they use {}, and if
it's not empty it's an array of data.

So:

"field1":{}
or
"field1":[{"foo":"bar"}, {"bar":"foo"}]

I am pretty sure this is the cause of the error. Side point: the error
message above does not make the cause intuitive to me. Perhaps some TLC on
the error messages could help less Drill-aware users know what's actually
breaking (in fairness, the message in 1.4 showed me the line, column, and
field, which helped me infer what could POSSIBLY be wrong).

So, is there a way to address this without reprocessing a lot of data? An
option in Drill that would allow a dirty read of some sort?
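
To make the mismatch concrete, here is a minimal Python sketch of the
per-record fix-up I would otherwise have to run. It assumes one JSON object
per line; the function name normalize_field1 is hypothetical, and this is
the reprocessing route I'd rather avoid, not anything Drill provides:

```python
import json

def normalize_field1(line, field="field1"):
    """Rewrite an empty-map value ({}) for `field` as an empty list ([]).

    Assumes `line` holds a single JSON object. Records where the field is
    already a list, or is missing, pass through unchanged.
    """
    record = json.loads(line)
    if record.get(field) == {}:
        record[field] = []
    return json.dumps(record)

# The two shapes described above: after the pass, both carry a list.
print(normalize_field1('{"field1":{}}'))
print(normalize_field1('{"field1":[{"foo":"bar"}, {"bar":"foo"}]}'))
```

After a pass like this the field is always an array, so a reader never has
to switch between a map and a list for the same field.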

Thanks in advance!!

John
