Hi 

Can you explain which scenario would produce two carbondata files with
different schemas?

Regards
Liang


akashrn5 wrote
> Hi dev,
> 
> Currently we have a validation that fails the query if two carbondata files
> in a location have different schemas. I think there is no need to fail;
> Parquet's behavior in this case illustrates why.
> 
> I think failing here is not good. We can read the latest schema from the
> latest carbondata file in the given location, read all the files based on
> that schema, and return the query output. For data files that do not
> contain some of the columns, those columns will have null values.
> 
> But basically we do not merge schemas here. We can keep that behavior; the
> only change is that we take the latest schema.
> 
> For example:
> 1. One data file has columns a, b and c; a second file has columns
> a, b, c, d and e. We can then read them and create the table with 5 columns
> or 3 columns, whichever is latest (this applies when the user does not
> specify a schema). If the user specifies a schema, the table will be
> created with the specified schema.
> 
> I have created a jira for this
> https://issues.apache.org/jira/browse/CARBONDATA-3287
> If you have any input, please let me know.
> 
> Regards,
> Akash
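
The proposal quoted above could be sketched roughly as follows. This is only a minimal illustration, not CarbonData code: the dict layout for a data file (`written_at`, `columns`, `rows`) and the function name `read_with_latest_schema` are assumptions made up for the example, and "latest" is assumed to mean the file with the newest write timestamp.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical in-memory stand-in for a carbondata file (NOT the CarbonData API).
# Assumed keys: "written_at" (sortable timestamp), "columns" (list of names),
# "rows" (list of dicts mapping column name -> value).
DataFile = Dict[str, Any]


def read_with_latest_schema(
    files: List[DataFile],
    user_schema: Optional[List[str]] = None,
) -> Tuple[List[str], List[List[Any]]]:
    """Read all files using one schema instead of failing on a mismatch.

    If the user gives a schema, it wins. Otherwise the schema of the most
    recently written file is used. No schema merging is performed.
    """
    if not files:
        return list(user_schema or []), []

    if user_schema is not None:
        schema = list(user_schema)
    else:
        latest = max(files, key=lambda f: f["written_at"])
        schema = list(latest["columns"])

    rows: List[List[Any]] = []
    for data_file in sorted(files, key=lambda f: f["written_at"]):
        for row in data_file["rows"]:
            # Columns absent from this file come back as None (null).
            rows.append([row.get(col) for col in schema])
    return schema, rows


old_file = {"written_at": 1, "columns": ["a", "b", "c"],
            "rows": [{"a": 1, "b": 2, "c": 3}]}
new_file = {"written_at": 2, "columns": ["a", "b", "c", "d", "e"],
            "rows": [{"a": 4, "b": 5, "c": 6, "d": 7, "e": 8}]}

schema, rows = read_with_latest_schema([old_file, new_file])
```

With the two example files, the inferred schema is the five-column one and the row from the older file is padded with nulls for d and e; passing `user_schema=["a", "b", "c"]` would instead project both files onto three columns.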




