[ 
https://issues.apache.org/jira/browse/SPARK-17952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Baghel updated SPARK-17952:
--------------------------------
    Description: 
As per the latest Spark documentation for Java at 
http://spark.apache.org/docs/latest/sql-programming-guide.html#inferring-the-schema-using-reflection:

{quote}
Nested JavaBeans and List or Array fields are supported though.
{quote}

However, a nested JavaBean does not work. Please see the code below.

SubCategory class

{code}
import java.io.Serializable;

public class SubCategory implements Serializable {
    private String id;
    private String name;

    public String getId() {
        return id;
    }
    public void setId(String id) {
        this.id = id;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
}
{code}

Category class

{code}
import java.io.Serializable;

public class Category implements Serializable {
    private String id;
    private SubCategory subCategory;

    public String getId() {
        return id;
    }
    public void setId(String id) {
        this.id = id;
    }
    public SubCategory getSubCategory() {
        return subCategory;
    }
    public void setSubCategory(SubCategory subCategory) {
        this.subCategory = subCategory;
    }
}
{code}

SparkSample class

{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkSample {
    public static void main(String[] args) throws IOException {
        SparkSession spark = SparkSession
                .builder()
                .appName("SparkSample")
                .master("local")
                .getOrCreate();
        // SubCategory
        SubCategory sub = new SubCategory();
        sub.setId("sc-111");
        sub.setName("Sub-1");
        // Category
        Category category = new Category();
        category.setId("s-111");
        category.setSubCategory(sub);
        // categoryList
        List<Category> categoryList = new ArrayList<Category>();
        categoryList.add(category);
        // DataFrame
        Dataset<Row> dframe = spark.createDataFrame(categoryList, Category.class);
        dframe.show();
    }
}
{code}


The above code throws the following error.

{code}
Exception in thread "main" scala.MatchError: com.sample.SubCategory@e7391d (of class com.sample.SubCategory)
        at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:256)
        at org.apache.spark.sql.catalyst.CatalystTypeConverters$StructConverter.toCatalystImpl(CatalystTypeConverters.scala:251)
        at org.apache.spark.sql.catalyst.CatalystTypeConverters$CatalystTypeConverter.toCatalyst(CatalystTypeConverters.scala:103)
        at org.apache.spark.sql.catalyst.CatalystTypeConverters$$anonfun$createToCatalystConverter$2.apply(CatalystTypeConverters.scala:403)
        at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1106)
        at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1106)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
        at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1.apply(SQLContext.scala:1106)
        at org.apache.spark.sql.SQLContext$$anonfun$beansToRows$1.apply(SQLContext.scala:1104)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$class.toStream(Iterator.scala:1322)
        at scala.collection.AbstractIterator.toStream(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.toSeq(TraversableOnce.scala:298)
        at scala.collection.AbstractIterator.toSeq(Iterator.scala:1336)
        at org.apache.spark.sql.SparkSession.createDataFrame(SparkSession.scala:373)
        at com.sample.SparkSample.main(SparkSample.java:33)
{code}


The createDataFrame method throws the above exception. Judging by the stack trace, the schema inferred from the bean class does include the nested struct, but the bean-to-row conversion (SQLContext.beansToRows) does not appear to convert the nested bean recursively, so the raw SubCategory instance reaches the Catalyst converter and triggers the MatchError. However, I observed that the createDataset method works fine with the code below.

{code}
Encoder<Category> encoder = Encoders.bean(Category.class); 
Dataset<Category> dframe = spark.createDataset(categoryList, encoder);
dframe.show();
{code}
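
If a Dataset<Row> is specifically needed, the bean-encoder path above can be combined with toDF() as a workaround. A minimal sketch, reusing categoryList from SparkSample above; the schema in the comment is my expectation, not verified output:

{code}
// Workaround sketch: build a typed Dataset with the bean encoder,
// then convert it to an untyped DataFrame with toDF().
Encoder<Category> encoder = Encoders.bean(Category.class);
Dataset<Row> rows = spark.createDataset(categoryList, encoder).toDF();
rows.printSchema(); // expectation: id: string, subCategory: struct<id:string,name:string>
rows.show();
{code}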


> Java SparkSession createDataFrame method throws exception for nested JavaBeans
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-17952
>                 URL: https://issues.apache.org/jira/browse/SPARK-17952
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 2.0.0, 2.0.1
>            Reporter: Amit Baghel
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
