Re: Query on tables

Kiran Kumar NS Thu, 27 Dec 2018 06:44:15 -0800

Hi Team,

Is there any update on the questions below, please?


Kind Regards 
Kiran

Sent from my iPhone

> On Dec 26, 2018, at 10:41 AM, Kiran Kumar NS <[email protected]> wrote:
> 
> Hi Vitalii,
> 
> Thanks for getting back.
> Yes, I did try it, but I got a message that the table is already present. 
> That is because I don’t want to create another table with a new table name 
> for the changed data.
> 
> Also, if it can be made to work in some way, is there an automatic way to 
> get the Drill tables updated every time the source changes?
> 
> Also, I have another set of queries.
> 
> If I set max proc mem or max direct mem, it is not reflected in the cluster. 
> I see it in the UI. But if I change the Java heap mem, it is reflected. I 
> want to understand how Drill distributes these settings to the cluster when 
> we set them. I did observe that whatever settings I change in the Options 
> link of the UI are reflected in the whole cluster. But if I change them in 
> drill-env.sh, they are not; I have to go and change them on all the nodes. 
> Also, it is mentioned that distrib-env.sh should not be used by users. 
> Essentially, I want to understand how the configurations are propagated to 
> all nodes in the cluster, and how I can reap the benefit of the cluster's 
> additional compute so that my queries execute faster. Currently one CTAS 
> command and the PARTITION BY commands are taking hours to process GBs of 
> data.
> 
> If required, I can provide elapsed-time statistics of the query executions 
> for your analysis.
> 
> Kind Regards 
> Kiran
> 
> Sent from my iPhone
> 
>> On Dec 25, 2018, at 2:16 PM, Vitalii Diravka <[email protected]> wrote:
>> 
>> Hello!
>> 
>> Can you filter the newly arriving data and run CTAS only for that data? 
>> That would avoid extra work.
>> 
>> Kind regards 
>> Vitalii 
>> 
>> 
>>> On Mon, Dec 24, 2018 at 6:53 PM Idea Everywhere <[email protected]> 
>>> wrote:
>>> Hi Team,
>>> 
>>> My current situation:
>>> I have Apache Drill installed on a 3-node cluster of AWS EC2 (m4.4xlarge) 
>>> instances. My source data comes from an S3 bucket.
>>> I want Drill to read that data from S3 and create tables (using CTAS), 
>>> with the table data stored in AWS EFS mounted on the EC2 instances 
>>> mentioned above, and to allow users to read the data from those tables.
>>> Tables and partitioned tables have been created as of now.
>>> 
>>> Questions:
>>> 1. When the tables are created, Drill reads the data from the source and 
>>> the table is created along with that data (i.e., if the original source is 
>>> 10GB, the tables stored in the file system are comparable in size). 
>>> However, if the source keeps growing, how does the new data get into the 
>>> CTAS tables or CTAS PARTITION BY tables, so that queries return the latest 
>>> output?
>>> 
>>> Kind Regards
>>> Kiran
>>>
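
The suggestion in the thread to run CTAS only for newly arrived data can be sketched as below. This is a minimal illustration, not a confirmed recipe: it assumes each new batch is written into its own subdirectory under the existing table directory, relying on Drill querying a directory (including its subdirectories) as one table, so the table name users query stays the same. The workspace `dfs.efs`, the source `s3.root.sales_raw`, the table name `sales`, and the `load_date` column are all hypothetical names, and whether CTAS into a subpath behaves this way should be verified on your Drill version.

```python
def build_incremental_ctas(workspace, table, batch, source, batch_column):
    """Return a Drill CTAS statement that writes one batch of new rows
    into its own subdirectory of an existing table directory."""
    # Target is <table>/<batch>. Queries against <table> then read every
    # batch subdirectory, so no new top-level table name is introduced.
    target = f"{workspace}.`{table}/{batch}`"
    return (
        f"CREATE TABLE {target} AS\n"
        f"SELECT * FROM {source}\n"
        f"WHERE {batch_column} = '{batch}'"
    )


# Hypothetical names, for illustration only.
stmt = build_incremental_ctas(
    workspace="dfs.efs",
    table="sales",
    batch="2018-12-26",
    source="s3.root.`sales_raw`",
    batch_column="load_date",
)
print(stmt)
```

The generated statement would be submitted once per new batch (for example from a scheduled job), which avoids rewriting data that was already converted.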
