I've seen similar issues when dealing with JSON files whose schema changes
across records. For example, if a field is a String in one record and an
Int in another, I hit this issue.
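
If it helps to track down which field is changing type, here is a rough
sketch of how one could scan the source JSON outside of Drill. It is not
part of Drill; it assumes newline-delimited JSON with one object per line
and only looks at top-level fields. The function names and the sample
records are made up for illustration.

```python
import json
from collections import defaultdict


def field_types(lines):
    """Map each top-level field name to the set of Python type names seen for it."""
    seen = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)  # assumption: each line is one JSON object
        for name, value in record.items():
            seen[name].add(type(value).__name__)
    return seen


def conflicting_fields(lines):
    """Return only the fields that were seen with more than one type."""
    return {
        name: sorted(types)
        for name, types in field_types(lines).items()
        if len(types) > 1
    }


# Hypothetical sample: "code" is a string, an int, and a null across records.
sample = [
    '{"id": 1, "code": "A17"}',
    '{"id": 2, "code": 42}',
    '{"id": 3, "code": null}',
]
print(conflicting_fields(sample))
```

Nulls show up as NoneType here, so a field that is a string in some records
and null in others is reported too. To scan a real file, pass an open file
handle instead of the sample list.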

As a workaround, I set store.json.all_text_mode to true. This reads every
field in the JSON as a String.

I'm not sure such an option exists for Parquet. Maybe if you set this
before your CTAS statement (where it reads the JSON files), you could work
around it.

-Abhishek

On Tue, Sep 1, 2015 at 6:19 PM, Stefán Baxter <[email protected]>
wrote:

> Hi,
>
> I'm battling minor inconsistencies in two Parquet files generated from the
> same(ish) JSON structure. (They are the product of two separate CTAS
> processes, but the JSON was compatible before conversion.)
>
> I cannot create a query that reads from both of them, and this is the
> error I get:
>
> [Error Id: 4ee4c131-31fc-4252-a664-5a2e855349fb on localhost:31010]
>   (java.lang.IllegalStateException) Failure while reading vector.  Expected
> vector class of org.apache.drill.exec.vector.NullableVarCharVector but was
> holding vector class org.apache.drill.exec.vector.NullableIntVector.
>
> Turning on verbose logging produces a stacktrace that gives me no usable
> information regarding tracking down the field or the value.
>
> I'm assuming, because it has happened to me too many times before with
> Drill, that this is a null value that is interpreted as a numeric value,
> which then clashes with a string value.
>
> Is there anyone here that can assist me in working around this?
>
> (there are no data-type-changes in these files and the only difference may
> be fields that are missing in one and present in the other)
>
> Regards,
>  -Stefán
>
