First off, this is my first attempt at Drill
(BTW: congratulations on the release ;-),
so perhaps I have misunderstood something.

I want to query my parquet files on HDFS.

I set up the 1.0 release on a machine (node1)
that already had CDH5 and a working ZooKeeper.
With the hdfs storage plugin config below I can query a parquet file
on the local machine just fine.
E.g.:
0: jdbc:drill:drillbit=localhost> select a,b,c FROM hdfs.`/hdfs/path/test.par` limit 5;

  ## drill-override.conf
  drill.exec: {
    cluster-id: "mydrillcluster",
    zk.connect: "node1:2181"
  }
  ## storage plugin config
  {
    "type": "file",
    "enabled": true,
    "connection": "hdfs://127.0.0.1:8020/",
    "workspaces": null,
    "formats": {
      "parquet": {
        "type": "parquet"
      }
    }
  }

Can I query a remote HDFS simply by pointing the storage plugin config at it?
After changing the IP address in the connection parameter above, I get the
error below.
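
To be clear, the only thing I changed was the connection line; assuming the
remote namenode is the 10.10.10.10:8020 used in the hdfs command further down,
it now reads something like:

    "connection": "hdfs://10.10.10.10:8020/",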

0: jdbc:drill:drillbit=localhost> select a,b,c FROM hdfs.`/tmp/test.par` limit 5;

Error: PARSE ERROR: From line 1, column 38 to line 1, column 41: Table 'hdfs./tmp/test.par' not found
[Error Id: 4156f66c-3dac-4e87-b7f8-f0bdc19d57d7 on node1.company.com:31010] (state=,code=0)
.....
Caused by: org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: From line 1, column 38 to line 1, column 41: Table 'hdfs./tmp/test.par' not found

But the namenode:port/path is correct, because this works from node1:

[alan@node1 drill]$ hdfs dfs -fs hdfs://10.10.10.10:8020/ -ls /tmp/test.par
-rw-r--r--   1 alan supergroup    4947359 2015-05-21 13:55 /tmp/test.par


Alan
