Navin Goel created SPARK-19742:
----------------------------------
Summary: When using SparkSession to write a Dataset to Hive the
schema is ignored
Key: SPARK-19742
URL: https://issues.apache.org/jira/browse/SPARK-19742
Project: Spark
Issue Type: Bug
Components: Java API
Affects Versions: 2.0.1
Environment: Running on Ubuntu with HDP 2.4.
Reporter: Navin Goel
I am saving a Dataset, created by reading JSON and applying some selects and
filters, into a Hive table. The dataset.write().insertInto function does not
look at the schema when writing to the table; instead it writes the columns
to the Hive table in order.
The schemas of both tables are the same.
Schema of the Dataset being written, as printed from Spark:
StructType(StructField(countrycode,StringType,true),
StructField(systemflag,StringType,true),
StructField(classcode,StringType,true), StructField(classname,StringType,true),
StructField(rangestart,StringType,true), StructField(rangeend,StringType,true),
StructField(tablename,StringType,true),
StructField(last_updated_date,TimestampType,true))
Schema of the dataset after loading the same table from Hive:
StructType(StructField(systemflag,StringType,true),
StructField(RangeEnd,StringType,true), StructField(classcode,StringType,true),
StructField(classname,StringType,true),
StructField(last_updated_date,TimestampType,true),
StructField(countrycode,StringType,true),
StructField(rangestart,StringType,true), StructField(tablename,StringType,true))
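The two schemas above hold the same columns in a different order, which is what one would expect to cause trouble if insertInto matches columns by position rather than by name. A possible workaround, sketched below under that assumption, is to reorder the Dataset's columns to follow the table's column order before writing. The class ColumnAligner and the method alignToTarget are hypothetical names introduced for illustration and are not part of Spark; names are matched case-insensitively because the table reports RangeEnd where the Dataset has rangeend.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not a Spark API): computes the order in which the
// source columns should be selected so that they line up positionally with
// the target table's columns.
//
// Assumed usage with Spark, untested against 2.0.1:
//   String[] tableCols = spark.table("mytable").columns();
//   List<String> ordered = ColumnAligner.alignToTarget(
//       Arrays.asList(dataset.columns()), Arrays.asList(tableCols));
//   dataset.selectExpr(ordered.toArray(new String[0]))
//          .write().insertInto("mytable");
final class ColumnAligner {

    private ColumnAligner() {
    }

    /**
     * Returns the source column names reordered to follow the target
     * table's column order. Names are compared case-insensitively.
     */
    static List<String> alignToTarget(List<String> sourceColumns,
                                      List<String> targetColumns) {
        if (sourceColumns.size() != targetColumns.size()) {
            throw new IllegalArgumentException(
                "Column count differs: " + sourceColumns.size()
                    + " vs " + targetColumns.size());
        }

        // Index the source columns by their lower-cased name.
        Map<String, String> byLowerName = new HashMap<>();
        for (String column : sourceColumns) {
            byLowerName.put(column.toLowerCase(Locale.ROOT), column);
        }

        // Walk the target order and pick the matching source column.
        List<String> ordered = new ArrayList<>();
        for (String target : targetColumns) {
            String match = byLowerName.get(target.toLowerCase(Locale.ROOT));
            if (match == null) {
                throw new IllegalArgumentException(
                    "No source column for target column: " + target);
            }
            ordered.add(match);
        }
        return ordered;
    }
}
```

For the two schemas in this report, alignToTarget would return systemflag, rangeend, classcode, classname, last_updated_date, countrycode, rangestart, tablename, which is the table's order expressed with the Dataset's own column names.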
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)