The problem was that the JSON field names were not valid Scala/Java
field names (they contained spaces, dashes, plus signs, and various
other symbols). Works now. Thanks.
On 08/26/2014 04:06 PM, Nathan Howell wrote:
I've used it successfully for schemas containing a mix of nested structs
and arrays, somewhere in the 50-100 column range.
-n
On 8/26/14, 1:01 PM, "Jim" <[email protected]> wrote:
Funny you should mention that. I tried that first. It failed in
saveAsParquetFile with a cryptic error:
java.lang.RuntimeException: Unsupported dataType:
StructType(ArrayBuffer(StructField( ... 500 columns worth of the
same...) [1.7784] failure: `,' expected but `A' found"
I assumed this had to do with not including a schema.
On 08/26/2014 03:31 PM, Dmitriy Ryaboy wrote:
Nice -- using Spark to infer the JSON schema. Also a good way to do
that.
Does it handle nesting and everything?