Hi, I would like to know more about how Drill's parallel processing of queries relates, if at all, to the parallel nature of a data source such as Hadoop. Am I correct in thinking that if a Drill cluster is querying data from a Hadoop cluster, the drillbits are unaware of where the data resides in HDFS, as their interaction is through the NameNode? If this is the case, how does scaling Drill out help performance if it always has to route through the NameNode?
Sorry if this is a silly question. I've tried to find the answer by reading the documentation and the mailing list, but I'm still not clear on it. Thanks, Tom
