[
https://issues.apache.org/jira/browse/DRILL-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16327012#comment-16327012
]
Hari Sekhon commented on DRILL-6061:
------------------------------------
Yes, this is what we found after raising this. We will point to MapR-FS
/apps/drill/pstore, following Hadoop layout best-practice conventions, and test.
I think this should be documented a bit better and made easier to find, perhaps
in the FAQ or in a section titled something like "Global Query List - how to see
the queries on the cluster from any Drill node" in the Apache Drill
documentation. There is a MapR community response to this as well:
. This then makes sense to
I recommend also replacing {{<directory to store pstore data>}} with a single
best practice path of {{/apps/drill/pstore}} to standardize this and fall in
line with other apps on Hadoop clusters.
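As a rough illustration of the standardized path suggested above, the
drill-override.conf entry might look like the sketch below. The
sys.store.provider.zk.blobroot key and the maprfs:/// scheme are assumptions
about how the pstore location is configured on MapR and should be checked
against the Drill documentation for the version in use; the cluster ID and
ZooKeeper address are placeholders.

```
drill.exec: {
  # Placeholder values, replace with the real cluster ID and ZooKeeper quorum.
  cluster-id: "<cluster-id>",
  zk.connect: "<zkhostname>:<port>",
  # Assumed key: shared pstore location visible to every drillbit.
  sys.store.provider.zk.blobroot: "maprfs:///apps/drill/pstore"
}
```

The same value would need to be set on every Drill node so that all of them
read and write the same shared directory.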
It's also worth documenting the algorithm used to balance load across Drill
nodes when a client acquires a drillbit via ZooKeeper quorum referral (random,
round robin, least connections, etc.).
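For reference, the three strategies named above differ as sketched below. This
is a minimal Python illustration only: it does not describe what Drill actually
does when selecting a drillbit, and the host names and function names are
hypothetical.

```python
import itertools
import random

# Hypothetical drillbit host names, for illustration only.
DRILLBITS = ["drill1", "drill2", "drill3"]


def pick_random(nodes, rng=random):
    """Random: every node is equally likely on each request."""
    return rng.choice(nodes)


def round_robin(nodes):
    """Round robin: return an iterator that cycles through nodes in order."""
    return itertools.cycle(nodes)


def pick_least_connections(active):
    """Least connections: pick the node with the fewest open connections.

    `active` maps node name -> current number of open connections.
    """
    return min(active, key=active.get)
```

Random and round robin need no knowledge of node state, while least
connections requires the selector to track open connections per node, which is
why it matters which one is in use.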
> Feature Request: Global Query List showing queries from all Drill foreman
> nodes
> -------------------------------------------------------------------------------
>
> Key: DRILL-6061
> URL: https://issues.apache.org/jira/browse/DRILL-6061
> Project: Apache Drill
> Issue Type: New Feature
> Components: Server, Documentation, Metadata, Query Planning &
> Optimization, Tools, Build & Test, Web Server
> Affects Versions: 1.11.0
> Environment: MapR 5.2
> Reporter: Hari Sekhon
> Priority: Major
>
> Feature Request to add a Global Query List to show all queries executed
> across all Drill nodes in a cluster for better management and auditing.
> Right now there doesn't appear to be a way to see all queries across all
> nodes in a Drill cluster. The Web UI on any given Drill node only shows the
> queries coordinated by that local node if acting as the foreman for the
> query, so if using ZooKeeper or a Load Balancer to distribute queries via
> different Drill nodes then the query list will be spread across lots of
> different nodes with no global timeline of queries.
> This seems to leave a bit of a gap in auditing functionality. The only
> immediately available alternative that I can think of is to route all query
> submissions through a single foreman node so the query list is complete on
> that node, although that doesn't seem like a great idea in terms of load
> distribution of query planning, coordination and final aggregation steps.
> I've made load balancing configurations for Apache Drill and similar
> technologies that could be used for that purpose with failover support to
> maintain high availability at
> https://github.com/HariSekhon/nagios-plugins/tree/master/haproxy but would
> still prefer if Drill was designed to store the global list of queries
> submitted in a centralized place.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)