Drill can read any files that are available through the HDFS API, which includes remote reads. In fact, a good use case for Drill is installing it on only a subset of your nodes, which saves the overhead of running the server everywhere while still letting you query all of your data. I have not seen this error before, but it looks like a low-level HDFS error. Someone might have a better suggestion for testing this, but could you try writing a simple program (a MapReduce job, a Pig script, etc.) to read the file and see whether it succeeds?
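As one possible version of that test, here is a minimal sketch in Python that reads the first bytes of a file over WebHDFS, using only the standard library. It assumes WebHDFS is enabled on the remote cluster and that the NameNode web port is 50070 (the usual Hadoop 2.x default; adjust it if yours differs). The names `webhdfs_open_url` and `read_head` are made up for this example. Note that this goes over HTTP rather than the native HDFS RPC path that Drill uses, so a success here does not fully rule out a problem on that path, but a failure would point at HDFS or the network rather than at Drill.

```python
import sys
import urllib.parse
import urllib.request


def webhdfs_open_url(namenode_host, path, port=50070, user=None):
    """Build the WebHDFS URL that opens `path` for reading.

    port=50070 is an assumption (Hadoop 2.x default NameNode web port).
    """
    if not path.startswith("/"):
        path = "/" + path
    query = {"op": "OPEN"}
    if user:
        # Optional: the HDFS user to act as when security is off.
        query["user.name"] = user
    return "http://%s:%d/webhdfs/v1%s?%s" % (
        namenode_host,
        port,
        urllib.parse.quote(path),
        urllib.parse.urlencode(query),
    )


def read_head(namenode_host, path, nbytes=1024, **kwargs):
    """Return up to `nbytes` bytes from the start of the HDFS file.

    The NameNode answers OPEN with a redirect to a DataNode, which
    urllib follows, so the DataNode hostname must also be resolvable
    and reachable from the machine running this script.
    """
    url = webhdfs_open_url(namenode_host, path, **kwargs)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read(nbytes)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python read_hdfs.py <namenode-host> </path/in/hdfs>
    host, hdfs_path = sys.argv[1], sys.argv[2]
    print(read_head(host, hdfs_path).decode("utf-8", errors="replace"))
```

Run it from the laptop where Drill is installed, pointing it at the other laptop's NameNode and at the same file the failing query reads. If it fails, the exception should show whether the NameNode or the DataNode is the part that cannot be reached.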
On Thu, Aug 20, 2015 at 4:13 AM, Malathi <[email protected]> wrote:

> Hi,
>
> I have drill and zookeeper installed in my laptop. I started HDFS in my
> laptop and see that I can query the csv and json files in HDFS. Now I
> wanted to query the files located in another laptop. Hence I started hdfs
> in the other laptop and when I gave the select * query, it failed(though I
> can execute `show files` query without issues).
>
> The error I am getting is there in the dropbox link:
> https://www.dropbox.com/s/5bgyw4jetweczoj/drill.log?dl=0
>
> Environment : Both the laptops running Ubuntu
> Apache drill version : 1.1.0
>
> I have the following questions:
> 1) Is it possible to run drill in a machine outside hadoop cluster and
> query the hdfs files in the cluster?
> 2) If yes, is there any need of additional configuration change?
>
> Thanks,
> Malathi
