Re: Clarification regarding Apache drill setup

2020-01-21 Thread Dan Blondowski
unsubscribe On 8/16/19, 7:16 AM, "Nitin Pawar" wrote: From my learning, and I could be wrong on a few things, so wait for others to answer as well. 1. When setting up the Drill cluster in a prod environment to quer

Re: Clarification regarding Apache drill setup

2019-08-16 Thread Ted Dunning
My guess is that spilling to S3 will be disastrously slow. On Fri, Aug 16, 2019 at 9:37 AM Paul Rogers wrote: Hi Manu, To add a bit more background... Drill uses local storage only for spilling result sets when they are too large for memory. Otherwise, data never touches disk once re

Re: Clarification regarding Apache drill setup

2019-08-16 Thread Paul Rogers
Hi Manu, To add a bit more background... Drill uses local storage only for spilling result sets when they are too large for memory. Otherwise, data never touches disk once read from S3. Unlike Snowflake, Drill does not cache S3 data locally. This means that, if you query the same file multiple

Re: Clarification regarding Apache drill setup

2019-08-16 Thread Nitin Pawar
From my learning, and I could be wrong on a few things, so wait for others to answer as well. 1. When setting up the Drill cluster in a prod environment to query data ranging from several gigabytes to a few terabytes hosted in S3/blob storage/cloud storage, what are the considerations for disk space?

Clarification regarding Apache drill setup

2019-08-15 Thread Manu Mukundan
Hi, My name is Manu and I am working as a Big Data architect in a small startup company in Kochi, India. Our new project handles visualizing large volumes of unstructured data in cloud storage (it can be S3, Azure Blob Storage or Google Cloud Storage). We are planning to use Apache Drill as SQL q