Drill with Hadoop cluster

Siddharth Jain Wed, 17 Nov 2021 04:46:40 -0800

Hi,

I am evaluating Drill for a requirement to query an HDFS cluster where the data is stored in Parquet file format. I was able to set up a 3-node Drill cluster with ZooKeeper after following some links. In the storage plugin I set up hdfs with a connection to my HDFS URL, and I can successfully run SQL queries in the Drill web UI and get results, but this only gets data from 1 node.


I now have some basic questions:

1. Does the storage plugin need to point to the master node of the HDFS cluster?

2. Once a SQL query is fired, will it fetch data from all nodes in the cluster or just one node? Or do I have to set up Drill on YARN (https://drill.apache.org/docs/drill-on-yarn-introduction/) to get results from all nodes?

3. My requirement is to use JDBC to query the HDFS cluster (the data searched can get large) in real time and display the results in a web UI. Do let me know if Drill will be a good fit for this use case.

4. Do we have any performance benchmarks of Drill against Presto and Impala?
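
To make question 3 concrete, the JDBC access in question would look roughly like the sketch below. It is only an illustration under assumptions: the ZooKeeper hosts (zk1, zk2, zk3), the ZooKeeper root "drill", the cluster ID "drillbits1", and the path /data/events are placeholders, not values from my setup. "hdfs" is the storage plugin name mentioned above. It also assumes the Drill JDBC driver jar is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class DrillJdbcSketch {

    // Builds a ZooKeeper-based Drill JDBC URL of the form
    // jdbc:drill:zk=<host:port,...>/<zk root>/<cluster id>.
    // Connecting through ZooKeeper lets the driver pick any registered Drillbit.
    static String drillUrl(String zkQuorum, String zkRoot, String clusterId) {
        return "jdbc:drill:zk=" + zkQuorum + "/" + zkRoot + "/" + clusterId;
    }

    public static void main(String[] args) {
        // Placeholder hosts and cluster ID (assumptions, replace with real values).
        String url = drillUrl("zk1:2181,zk2:2181,zk3:2181", "drill", "drillbits1");

        // Hypothetical Parquet directory queried through the "hdfs" storage plugin.
        String sql = "SELECT * FROM hdfs.`/data/events` LIMIT 10";

        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                // Print the first column of each returned row.
                System.out.println(rs.getString(1));
            }
        } catch (SQLException e) {
            // Also reached when the Drill JDBC driver is not on the classpath.
            System.err.println("Query failed: " + e.getMessage());
        }
    }
}
```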


Thanks in advance,
Sidd
