Re: Java to show struct field from a Dataframe

2016-12-17 Thread Richard Xin
Super, that works! Thanks



Re: Java to show struct field from a Dataframe

2016-12-17 Thread Yong Zhang
Why not just return the struct you defined, instead of an array?


@Override
public Row call(Double x, Double y) throws Exception {
    Row row = RowFactory.create(x, y);
    return row;
}
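
To illustrate why passing x and y separately behaves differently from passing a double[], here is a minimal plain-Java sketch with no Spark involved. The create method below is a hypothetical stand-in that only mirrors the varargs shape of RowFactory.create(Object... values); the class name and sample values are assumptions made for illustration.

```java
class VarargsDemo {
    // Hypothetical stand-in for a varargs factory with the same shape as
    // RowFactory.create(Object... values); it returns the values it received.
    static Object[] create(Object... values) {
        return values;
    }

    public static void main(String[] args) {
        Double x = 1.5;
        Double y = 2.5;

        // A double[] is not an Object[], so the whole array becomes ONE value.
        Object[] wrapped = create(new double[] { x, y });
        System.out.println(wrapped.length);                     // 1
        System.out.println(wrapped[0].getClass().getName());    // [D

        // Passing the values separately yields one boxed Double per field.
        Object[] separate = create(x, y);
        System.out.println(separate.length);                    // 2
        System.out.println(separate[0].getClass().getName());   // java.lang.Double
    }
}
```

With the array form, a two-field struct schema would be handed a single value of type double[], which matches the ClassCastException reported in this thread.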



Re: Java to show struct field from a Dataframe

2016-12-17 Thread Richard Xin
I tried to transform 
root
 |-- latitude: double (nullable = false)
 |-- longitude: double (nullable = false)
 |-- name: string (nullable = true)
to: 
root
 |-- name: string (nullable = true)
 |-- location: struct (nullable = true)
 |    |-- longitude: double (nullable = true)
 |    |-- latitude: double (nullable = true)
The code snippet is as follows:

        sqlContext.udf().register("toLocation", new UDF2<Double, Double, Row>() {
            @Override
            public Row call(Double x, Double y) throws Exception {
                Row row = RowFactory.create(new double[] { x, y });
                return row;
            }
        }, DataTypes.createStructType(new StructField[] {
                new StructField("longitude", DataTypes.DoubleType, true, Metadata.empty()),
                new StructField("latitude", DataTypes.DoubleType, true, Metadata.empty())
            }));
        DataFrame transformedDf1 = citiesDF.withColumn("location",
                callUDF("toLocation", col("longitude"), col("latitude")));

        // Prints the schema tree as expected.
        transformedDf1.drop("latitude").drop("longitude").schema().printTreeString();
        // Throws java.lang.ClassCastException: [D cannot be cast to java.lang.Double
        transformedDf1.show();

It seems to me that the return type of the UDF2 might be the root cause, but I
am not sure how to correct it.

Thanks,
Richard


 


Re: Java to show struct field from a Dataframe

2016-12-17 Thread Yong Zhang
"[D" is the JVM's name for the double array type. So this error simply means
you have double[] data, but Spark needs to cast it to Double, as your schema
defines.


The error message clearly indicates that the data doesn't match the type
specified in the schema.


I wonder how you can be so sure about your data. Did you check it with another
tool?


Yong
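
As a small illustration of the point above, the following plain-Java sketch (no Spark involved) shows that double[] reports the class name "[D" and that casting such a value to Double fails at runtime. The class and helper names are hypothetical and only mimic an accessor that casts a stored value to Double before unboxing it.

```java
class ArrayCastDemo {
    // Hypothetical stand-in for an accessor that expects a boxed Double.
    static double unboxToDouble(Object value) {
        return ((Double) value).doubleValue();
    }

    // Reports whether unboxing succeeded; the exception text itself varies by JVM.
    static String tryUnbox(Object value) {
        try {
            return "ok: " + unboxToDouble(value);
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(double[].class.getName());              // [D
        System.out.println(tryUnbox(Double.valueOf(1.5)));         // ok: 1.5
        System.out.println(tryUnbox(new double[] { 1.5, 2.5 }));   // ClassCastException
    }
}
```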




Re: Java to show struct field from a Dataframe

2016-12-17 Thread Richard Xin
data is good
 


Re: Java to show struct field from a Dataframe

2016-12-17 Thread zjp_j...@163.com
I think the cause is invalid Double data. Have you checked your data?



zjp_j...@163.com
 
From: Richard Xin
Date: 2016-12-17 23:28
To: User
Subject: Java to show struct field from a Dataframe
Let's say I have a DataFrame with the following schema:
root
 |-- name: string (nullable = true)
 |-- location: struct (nullable = true)
 |    |-- longitude: double (nullable = true)
 |    |-- latitude: double (nullable = true)

df.show(); throws the following exception:

java.lang.ClassCastException: [D cannot be cast to java.lang.Double
    at scala.runtime.BoxesRunTime.unboxToDouble(BoxesRunTime.java:119)
    at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getDouble(rows.scala:44)
    at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getDouble(rows.scala:221)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)


Any advice?
Thanks in advance.
Richard