
Re: Spark 1.6.2 can read hive tables created with sqoop, but Spark 2.0.0 cannot

2016-08-11 Thread cdecleene

The data itself is not corrupted: in Spark 2.0.0 I can create the dataframe
from the underlying raw parquet files if I use SparkSession.read.parquet()
instead of SparkSession.sql().
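
For anyone who wants to reproduce the comparison, here is a minimal sketch
that reads the same data both ways and returns a few rows from each. The
helper name compare_read_paths, the sample_size parameter, and the path in
the usage comment are made up for illustration; only spark.sql(),
spark.read.parquet() and DataFrame.take() are actual PySpark calls.

```python
def compare_read_paths(spark, table_name, parquet_path, sample_size=5):
    """Fetch a few rows via the Hive metastore and via the raw Parquet files.

    `spark` is an existing SparkSession created with enableHiveSupport().
    `parquet_path` is the directory holding the table's Parquet files
    (hypothetical here; use the table's real location).
    """
    # Path 1: resolve the table through the Hive metastore.
    via_sql = spark.sql("SELECT * FROM {}".format(table_name))
    # Path 2: bypass the metastore and read the files directly.
    via_files = spark.read.parquet(parquet_path)
    return {
        "sql": via_sql.take(sample_size),
        "parquet": via_files.take(sample_size),
    }

# Example call from a pyspark shell (the path is a placeholder):
# samples = compare_read_paths(
#     spark,
#     "dra_agency_analytics.raw_ewt_agcy_dim",
#     "/path/to/table/parquet/files",
# )
# print(samples["sql"])
# print(samples["parquet"])
```

If the "sql" sample contains only None values while the "parquet" sample
contains real strings, that matches the behaviour described in this thread.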





--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-6-2-can-read-hive-tables-created-with-sqoop-but-Spark-2-0-0-cannot-tp27502p27516.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Re: Spark 1.6.2 can read hive tables created with sqoop, but Spark 2.0.0 cannot

2016-08-10 Thread cdecleene

Using the Scala API instead of the Python API yields the same results.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-6-2-can-read-hive-tables-created-with-sqoop-but-Spark-2-0-0-cannot-tp27502p27506.html

Spark 1.6.2 can read hive tables created with sqoop, but Spark 2.0.0 cannot

2016-08-09 Thread cdecleene

Some details of an example Hive table that Spark 2.0 could not read...

SerDe Library:  org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat:    org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
OutputFormat:   org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat

COLUMN_STATS_ACCURATE   false
kite.compression.type   snappy
numFiles                0
numRows                 -1
rawDataSize             -1
totalSize               0

All fields within the table are of type "string" and there are fewer than 20
of them.

When I say that spark 2.0 cannot read the hive table, I mean that when I
attempt to execute the following from a pyspark shell... 

from pyspark.sql import SparkSession  # needed when running outside the pyspark shell

spark = SparkSession.builder.enableHiveSupport().getOrCreate()
df = spark.sql("SELECT * FROM dra_agency_analytics.raw_ewt_agcy_dim")

... the dataframe df has the correct number of rows and the correct columns,
but all values read as "None". 
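
One way to confirm the symptom without eyeballing rows is to count the
non-null values per column in a sample. This is only a sketch: the helper
name non_null_counts is made up for illustration, and it works on rows
already pulled to the driver (for example with df.take(), whose Row results
behave like tuples), so it is meant for a small sample and not the full
table.

```python
def non_null_counts(rows, columns):
    """Count non-None values per column in a list of tuple-like rows."""
    counts = {name: 0 for name in columns}
    for row in rows:
        # Pair each column name with the value at the same position.
        for name, value in zip(columns, row):
            if value is not None:
                counts[name] += 1
    return counts

# Example call from a pyspark shell, on a 100-row sample:
# print(non_null_counts(df.take(100), df.columns))
```

A count of 0 for every column, together with a non-zero row count, is the
"all values read as None" situation described above.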




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-6-2-can-read-hive-tables-created-with-sqoop-but-Spark-2-0-0-cannot-tp27502.html
