[GitHub] carbondata pull request #2923: [CARBONDATA-3101] Fixed dataload failure when...

manishgupta88 Sun, 18 Nov 2018 21:47:56 -0800

Github user manishgupta88 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2923#discussion_r234499385
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableQueryTestCase.scala ---
    @@ -437,6 +437,20 @@ test("Creation of partition table should fail if the colname in table schema and
         sql("drop datamap if exists preaggTable on table partitionTable")
       }
     
    +  test("validate data in partition table after dropping and adding a column") {
    +    sql("drop table if exists par")
    +    sql("create table par(name string) partitioned by (age double) stored by " +
    +              "'carbondata'")
    +    sql(s"load data local inpath '$resourcesPath/uniqwithoutheader.csv' into table par options" +
    +        s"('header'='false')")
    +    sql("alter table par drop columns(name)")
    +    sql("alter table par add columns(name string)")
    +    sql(s"load data local inpath '$resourcesPath/uniqwithoutheader.csv' into table par options" +
    +        s"('header'='false')")
    --- End diff --
    
    Keeping the partition column at the end is CarbonData behavior that the user may or may not know about. For a normal table, whenever a column is dropped and added again, the data for the added column must either be supplied as the last column in the CSV file or be mapped through fileheader; that is the correct behavior.
    Because your test case reuses the same CSV file without changing the order of the data and without providing a header, the behavior explained above might not hold. Please revisit the changes and get opinions from other PMC members/committers on this behavioral change.
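
To make the concern concrete, here is a minimal sketch in plain Scala of positional versus header-based mapping. It is a simplified model and not CarbonData code: `ColumnMappingSketch`, `mapByPosition`, `mapByHeader`, and the sample row are hypothetical, and the post-alter column order `(age, name)` is an assumption that follows the normal-table behavior described above, where a re-added column lands last.

```scala
// Simplified model of CSV-to-column mapping; this is NOT CarbonData code.
object ColumnMappingSketch {
  // Positional mapping: the i-th CSV field is assigned to the i-th schema column.
  def mapByPosition(schema: Seq[String], row: Seq[String]): Map[String, String] =
    schema.zip(row).toMap

  // Header-based mapping: each CSV field is assigned to the column named at the
  // same index of the file header, independent of the schema's column order.
  def mapByHeader(
      fileHeader: Seq[String],
      schema: Seq[String],
      row: Seq[String]): Map[String, String] = {
    val byName = fileHeader.zip(row).toMap
    schema.map(col => col -> byName(col)).toMap
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical row from a file written for the original order (name, age).
    val row = Seq("alice", "30.0")
    // Assumed order once name has been dropped and re-added as the last column.
    val schemaAfterAlter = Seq("age", "name")

    // Positional mapping puts the name value into the age column.
    println(mapByPosition(schemaAfterAlter, row))
    // An explicit header keeps each value with the column it was written for.
    println(mapByHeader(Seq("name", "age"), schemaAfterAlter, row))
  }
}
```

In this model, loading the unchanged file by position after the alter swaps the two values, while an explicit header preserves them. That is why reusing the same headerless CSV in the test deserves a second look.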
