Hi Abhishek,

Thanks for your response. I will try the approach you have suggested and come back if I need any further help.
Best regards,
________
Tilak

-----Original Message-----
From: Abhishek Girish [mailto:[email protected]]
Sent: Monday, July 30, 2018 9:43 PM
To: user <[email protected]>
Subject: Re: Drill Configuration Requirements To Query Data in Tera Bytes

Hey Tilak,

We don't have any official sizing guidelines for planning a Drill cluster. A lot depends on the type of queries being executed (simple look-ups vs. complex joins), the data format (columnar data such as Parquet gives the best performance), and the system load (e.g., running a single query on nodes dedicated to Drill). It also depends on the type of machines you have: for example, with beefy nodes that have lots of RAM and CPU, you'll need fewer nodes running Drill.

I would recommend getting started with a 4-10 node cluster with a good amount of memory you can spare, and based on the results, working out your own sizing guideline (either add more nodes or increase memory [1]). If you share more details, it may be possible to suggest more.

[1] http://drill.apache.org/docs/configuring-drill-memory/

On Mon, Jul 30, 2018 at 1:57 AM Surneni Tilak <[email protected]> wrote:
> Hi Team,
>
> May I know the ideal configuration requirements to query data of size
> 10 TB with a query time under 5 minutes. Please advise me on the
> number of Drillbits that I have to use and the RAM (direct memory &
> heap memory) that each Drillbit should consist of to complete the
> queries within the desired time.
>
> Best regards,
> _________
> Tilak
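For anyone following this thread: the per-Drillbit memory limits referenced in [1] are set in conf/drill-env.sh on each node. A minimal sketch is below; the values shown are illustrative placeholders, not a recommendation, and should be tuned per the linked memory-configuration docs.

```shell
# conf/drill-env.sh -- per-node memory limits for the Drillbit JVM
# (illustrative values only; tune using http://drill.apache.org/docs/configuring-drill-memory/)

# JVM heap, used mainly for planning and metadata.
export DRILL_HEAP="8G"

# Off-heap direct memory, used by query execution (sorts, joins, buffers).
# This is usually the setting to grow first for large scans/joins.
export DRILL_MAX_DIRECT_MEMORY="32G"
```

After editing the file, restart the Drillbit on that node for the new limits to take effect.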
