Hello! Can you filter out the newly arriving data and run CTAS only on that data? That would avoid redoing work for data that has already been loaded.
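
A minimal sketch of that idea, assuming the source data lands in S3 in per-batch directories (for example one directory per day) and that each batch is written by its own CTAS into a subdirectory of the target table, so that querying the parent directory sees all batches. The plugin and workspace names (`s3.root`, `dfs.efs`), the table name `events`, and the helper functions are hypothetical; the script only builds the statements and does not submit them to Drill.

```python
def new_batches(source_dirs, loaded_dirs):
    """Return the source batch directories that have not been loaded yet."""
    return sorted(set(source_dirs) - set(loaded_dirs))


def build_ctas(target_workspace, table, source_workspace, source_root, batch):
    """Build a Drill CTAS statement that materializes a single batch.

    All workspace, table, and path names are placeholders for illustration.
    """
    return (
        f"CREATE TABLE {target_workspace}.`{table}/{batch}` AS "
        f"SELECT * FROM {source_workspace}.`{source_root}/{batch}`"
    )


if __name__ == "__main__":
    # Hypothetical listings: what exists in S3 vs. what was already converted.
    in_source = ["2018-12-22", "2018-12-23", "2018-12-24"]
    already_loaded = ["2018-12-22", "2018-12-23"]

    for batch in new_batches(in_source, already_loaded):
        print(build_ctas("dfs.efs", "events", "s3.root", "events", batch))
```

Each generated statement touches only one new batch, so the data that was converted earlier is not read again.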

Kind regards
Vitalii

On Mon, Dec 24, 2018 at 6:53 PM Idea Everywhere <[email protected]> wrote:

> Hi Team,
>
> My current situation:
> I have apache drill installed in AWS EC2 (M4.4x large) instances cluster
> of 3 nodes. My source data is coming from S3 bucket.
> I want to engage drill to read that data from S3, create tables within
> itself (using CTAS) while the table data is stored in AWS EFS mapped to ec2
> instances created as mentioned above and allow the user to read the data
> from those tables.
> Tables and Partitioned tables are created as of now.
>
> Questions:
> 1. It is observed that, when the tables are created, it reads the data
> from source and the table is created along with that data (ie., if the
> original source is 10GB, the tables stored in the file system are
> comparable to that size). However, I have a question, if the source is
> growing, how it gets into the CTAS tables or CTAS Partition by tables, so
> that queries will result latest output
>
> Kind Regards
> Kiran
