Re: Optimize and refactor insert into command

2020-01-02 Thread Ravindra Pesala
Hi, +1. It’s long-pending work. Most welcome. Regards, Ravindra. On Fri, 20 Dec 2019 at 7:55 AM, Ajantha Bhat wrote: > Currently carbondata "insert into" uses the CarbonLoadDataCommand itself. > The load process has steps like parsing and a converter step with bad record > support. > Insert into

Re: Optimize and refactor insert into command

2020-01-01 Thread sujith chacko
@ajantha Even from a carbon table to a carbon table, the scenarios I mentioned may still apply. As I said above, even though the schemas are the same in all aspects, if there is a difference in column properties, how are you going to handle it? If the destination table needs the bad record feature enabled

Re: Optimize and refactor insert into command

2020-01-01 Thread Ajantha Bhat
Hi Sujith, I still keep the converter step for some scenarios, like insert from Parquet to Carbon; there we need an optimized converter to convert the timestamp long value (divide by 1000) and to convert null values of direct-dictionary columns to 1. So, for the scenarios you mentioned, I will be using this flow
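
To make the two conversions above concrete, here is a minimal Scala sketch with hypothetical names (not the actual CarbonData converter API). It assumes the source reports timestamps in microseconds while the target stores milliseconds, and that 1 is the reserved null surrogate for direct-dictionary columns.

// Hypothetical sketch of the slimmed-down converter described above:
// for an insert from a Parquet source only two conversions are needed,
// so the full field-converter / bad-record pipeline can be skipped.
object OptimizedInsertConverterSketch {

  // Assumption: source timestamps are microseconds since epoch and the
  // target expects milliseconds, hence the divide by 1000.
  def convertTimestamp(micros: Long): Long = micros / 1000L

  // Assumption: for direct-dictionary columns (date/timestamp), a null
  // value from the source maps to the reserved surrogate value 1.
  def convertDirectDictionary(value: java.lang.Long): Long =
    if (value == null) 1L else value.longValue()

  def main(args: Array[String]): Unit = {
    println(convertTimestamp(1576800000000000L)) // 1576800000000 (ms)
    println(convertDirectDictionary(null))       // 1
  }
}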

Re: Optimize and refactor insert into command

2020-01-01 Thread sujith chacko
Hi Ajantha, Thanks for your initiative. I have a couple of questions, though. a) As per your explanation, the dataset validation is already done as part of the source table; is that what you mean? What I understand is that insert-select queries are going to get some benefit since we don’t do

Re: Optimize and refactor insert into command

2019-12-27 Thread Jacky Li
Definitely +1, please feel free to create a JIRA issue and PR. Regards, Jacky > On Dec 20, 2019, at 7:55 AM, Ajantha Bhat wrote: > > Currently carbondata "insert into" uses the CarbonLoadDataCommand itself. > The load process has steps like parsing and a converter step with bad record > support. > Insert into

Optimize and refactor insert into command

2019-12-19 Thread Ajantha Bhat
Currently carbondata "insert into" uses the CarbonLoadDataCommand itself. The load process has steps like parsing and a converter step with bad record support. Insert into doesn't require these steps, as the data is already validated and converted from the source table or dataframe. Some identified changes are
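
A minimal Scala sketch of the proposed split, using hypothetical names rather than the actual CarbonLoadDataCommand code paths: load keeps the full parse/convert/bad-record pipeline, while an insert from an already validated source table or dataframe takes a direct path.

// Sketch only: the real decision would live inside the insert-into command.
object InsertFlowSketch {
  sealed trait WriteFlow
  case object FullLoadFlow extends WriteFlow     // parse + convert + bad record handling
  case object DirectInsertFlow extends WriteFlow // rows already typed and validated by the source

  // Assumption: an insert from a typed source (table or dataframe) can skip
  // the converter step; anything else (e.g. a CSV load) keeps the full flow.
  def chooseWriteFlow(isInsertInto: Boolean, sourceAlreadyTyped: Boolean): WriteFlow =
    if (isInsertInto && sourceAlreadyTyped) DirectInsertFlow else FullLoadFlow

  def main(args: Array[String]): Unit = {
    println(chooseWriteFlow(isInsertInto = true, sourceAlreadyTyped = true))   // DirectInsertFlow
    println(chooseWriteFlow(isInsertInto = false, sourceAlreadyTyped = false)) // FullLoadFlow
  }
}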