Re: StructType has more rows, than corresponding Row has objects.
Davies, that seems to have been my issue; my colleague helped me resolve it. The problem was that we build the RDD and the corresponding StructType ourselves (no JSON, Parquet, Cassandra, etc.: we take a list of business objects, convert them to Rows, then infer the struct type), and I missed one thing.
--
Be well!
Jean Morozov

On Tue, Oct 6, 2015 at 1:58 AM, Davies Liu wrote:
> Could you tell us a way to reproduce this failure? Reading from JSON or Parquet?
>
> On Mon, Oct 5, 2015 at 4:28 AM, Eugene Morozov wrote:
> > Hi,
> >
> > We're building our own framework on top of Spark, and we give users a pretty complex schema to work with. That requires us to build DataFrames ourselves: we transform business objects into rows and struct types and use the two to create a DataFrame.
> >
> > Everything was fine until I started upgrading to Spark 1.5.0 (from 1.3.1). It seems the Catalyst engine has changed, and now, using almost the same code to produce rows and struct types, I get the following: http://ibin.co/2HzUsoe9O96l. Some of the rows in the end result have a different number of values than the corresponding struct types.
> >
> > I'm almost sure it's my own fault, but there is always a small chance that something is wrong in the Spark codebase. If you've seen something similar, or if there is a JIRA for something similar, I'd be glad to know. Thanks.
> > --
> > Be well!
> > Jean Morozov
Re: StructType has more rows, than corresponding Row has objects.
Could you tell us a way to reproduce this failure? Reading from JSON or Parquet?

On Mon, Oct 5, 2015 at 4:28 AM, Eugene Morozov wrote:
> Hi,
>
> We're building our own framework on top of Spark, and we give users a pretty complex schema to work with. That requires us to build DataFrames ourselves: we transform business objects into rows and struct types and use the two to create a DataFrame.
>
> Everything was fine until I started upgrading to Spark 1.5.0 (from 1.3.1). It seems the Catalyst engine has changed, and now, using almost the same code to produce rows and struct types, I get the following: http://ibin.co/2HzUsoe9O96l. Some of the rows in the end result have a different number of values than the corresponding struct types.
>
> I'm almost sure it's my own fault, but there is always a small chance that something is wrong in the Spark codebase. If you've seen something similar, or if there is a JIRA for something similar, I'd be glad to know. Thanks.
> --
> Be well!
> Jean Morozov
StructType has more rows, than corresponding Row has objects.
Hi,

We're building our own framework on top of Spark, and we give users a pretty complex schema to work with. That requires us to build DataFrames ourselves: we transform business objects into rows and struct types and use the two to create a DataFrame.

Everything was fine until I started upgrading to Spark 1.5.0 (from 1.3.1). It seems the Catalyst engine has changed, and now, using almost the same code to produce rows and struct types, I get the following: http://ibin.co/2HzUsoe9O96l. Some of the rows in the end result have a different number of values than the corresponding struct types.

I'm almost sure it's my own fault, but there is always a small chance that something is wrong in the Spark codebase. If you've seen something similar, or if there is a JIRA for something similar, I'd be glad to know. Thanks.
--
Be well!
Jean Morozov
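
Editor's note: the thread never says which detail was missed, so the following is not the poster's fix. It is a minimal sketch, in plain Python with no Spark dependency, of a sanity check that can catch this kind of mismatch before hand-built rows and a hand-built schema are turned into a DataFrame. The function name `find_arity_mismatches` and the sample data are hypothetical, and the check only compares top-level value counts; nested structs would need a recursive version.

```python
from typing import List, Sequence, Tuple


def find_arity_mismatches(
    rows: Sequence[Sequence[object]],
    field_names: Sequence[str],
) -> List[Tuple[int, int]]:
    """Return (row_index, value_count) for every row whose number of
    values differs from the number of schema fields."""
    expected = len(field_names)
    return [
        (index, len(row))
        for index, row in enumerate(rows)
        if len(row) != expected
    ]


# Hypothetical data standing in for converted business objects.
field_names = ["id", "name", "price"]
rows = [
    (1, "apple", 0.5),
    (2, "pear"),        # one value short of the schema
    (3, "plum", 0.7),
]

mismatches = find_arity_mismatches(rows, field_names)
for index, count in mismatches:
    print(f"row {index}: {count} values, schema has {len(field_names)} fields")
```

With PySpark, the same check should work on a sample of the real data, assuming the rows are `pyspark.sql.Row` objects (which are tuples, so `len(row)` applies) and the field names are taken from the schema as `[f.name for f in schema.fields]`.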