Navin Goel created SPARK-19742:
----------------------------------

             Summary: When using SparkSession to write a Dataset to Hive the 
schema is ignored
                 Key: SPARK-19742
                 URL: https://issues.apache.org/jira/browse/SPARK-19742
             Project: Spark
          Issue Type: Bug
          Components: Java API
    Affects Versions: 2.0.1
         Environment: Running on Ubuntu with HDP 2.4.
            Reporter: Navin Goel


I am saving a Dataset, created by reading a JSON file and applying some selects 
and filters, into a Hive table. The dataset.write().insertInto function does 
not look at the schema (column names) when writing to the table; instead it 
writes the columns to the Hive table in positional order.

Both schemas contain the same columns (ignoring case), but in a different order.

Schema printed from Spark of the Dataset being written:
StructType(StructField(countrycode,StringType,true), 
StructField(systemflag,StringType,true), 
StructField(classcode,StringType,true), StructField(classname,StringType,true), 
StructField(rangestart,StringType,true), StructField(rangeend,StringType,true), 
StructField(tablename,StringType,true), 
StructField(last_updated_date,TimestampType,true))

Schema of the dataset after loading the same table from Hive:
StructType(StructField(systemflag,StringType,true), 
StructField(RangeEnd,StringType,true), StructField(classcode,StringType,true), 
StructField(classname,StringType,true), 
StructField(last_updated_date,TimestampType,true), 
StructField(countrycode,StringType,true), 
StructField(rangestart,StringType,true), StructField(tablename,StringType,true))
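A possible workaround, offered as a sketch rather than a confirmed fix: since 
insertInto resolves columns by position, the Dataset's columns can be put into 
the table's order before writing. The helper below is hypothetical (the class 
and method names ColumnOrder and alignToTable are not part of Spark); it only 
computes the reordered column names in plain Java, matching names 
case-insensitively because Hive reports RangeEnd where the Dataset has 
rangeend. The Spark calls in the comments assume a SparkSession named spark, a 
Dataset named dataset and a table name in tableName.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ColumnOrder {

    /**
     * Returns the Dataset's column names rearranged into the table's column
     * order. Names are compared case-insensitively. Extra Dataset columns
     * that the table does not have are dropped from the result.
     */
    public static List<String> alignToTable(List<String> datasetCols,
                                            List<String> tableCols) {
        // Index the Dataset's columns by lower-cased name.
        Map<String, String> byLowerName = new HashMap<>();
        for (String c : datasetCols) {
            byLowerName.put(c.toLowerCase(Locale.ROOT), c);
        }

        // Walk the table's columns in order and pick the matching Dataset column.
        List<String> ordered = new ArrayList<>();
        for (String t : tableCols) {
            String match = byLowerName.get(t.toLowerCase(Locale.ROOT));
            if (match == null) {
                throw new IllegalArgumentException(
                        "Dataset has no column for table column: " + t);
            }
            ordered.add(match);
        }
        return ordered;
    }

    // Assumed usage with Spark (not executed here):
    //   String[] cols = alignToTable(
    //           Arrays.asList(dataset.columns()),
    //           Arrays.asList(spark.table(tableName).columns()))
    //       .toArray(new String[0]);
    //   dataset.selectExpr(cols).write().insertInto(tableName);
}
```

With the two schemas above, the helper would place systemflag first and 
tablename last, so each value lands in the column of the same name.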



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

