Hi,
I am using Drill on an AWS EMR cluster and trying to connect the Hive and
MapR-FS storage plugins.
The Hive plugin connected successfully with the configuration inlined below.
I created a dummy table in the Hive shell and then observed the following
Drill error when browsing the dummy table with Drill Explorer:
*[30024]Query execution error. Details:[ IOException: No FileSystem for
scheme: maprfs*
I tried to connect the MapR-FS storage plugin but I am not sure if the port
number I am using is correct.
According to the Drill documentation, any distributed file system included in
core-site.xml can be used by Drill, but I do not see any information related
to MapR-FS in this file. However, when I use the Amazon Hadoop distribution,
the information (HDFS namenode IP and port number) is there and I can connect
to Hive storage through Drill.
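As a rough illustration of the check described above, here is a minimal Python sketch that lists the file-system entries (property names starting with "fs.") in a core-site.xml. The helper name filesystem_properties and the sample XML, including its address, are hypothetical and only for illustration; they are not taken from my cluster.

```python
import xml.etree.ElementTree as ET


def filesystem_properties(core_site_xml: str) -> dict:
    """Return the fs.* properties found in the text of a Hadoop core-site.xml."""
    root = ET.fromstring(core_site_xml)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        if name.startswith("fs."):
            props[name] = value
    return props


# Hypothetical sample content; the address below is made up for illustration.
SAMPLE = """<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://10.0.0.1:9000</value></property>
  <property><name>hadoop.tmp.dir</name><value>/mnt/tmp</value></property>
</configuration>"""

if __name__ == "__main__":
    for name, value in filesystem_properties(SAMPLE).items():
        print(name, "=", value)
```

On a real node the file contents would be read from wherever the distribution keeps core-site.xml; if no value mentions maprfs, that matches what I am seeing.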
I wonder if you could let me know what I am missing in crafting the storage
plugins below:
Hive storage plugin:
{
  "type": "hive",
  "enabled": true,
  "configProps": {
    "hive.metastore.uris": "thrift://localhost:9083",
    "hive.metastore.local": "false",
    "hive.metastore.warehouse.dir": "/user/hive/warehouse"
  }
}
MapR-FS storage plugin:
{
  "type": "file",
  "enabled": "false",
  "connection": "maprfs://54.224.109.121:8041/",
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null
    },
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "psv": {
      "type": "text",
      "extensions": [
        "tbl"
      ],
      "delimiter": "|"
    },
    "csv": {
      "type": "text",
      "extensions": [
        "csv"
      ],
      "delimiter": ","
    },
    "tsv": {
      "type": "text",
      "extensions": [
        "tsv"
      ],
      "delimiter": "\t"
    },
    "parquet": {
      "type": "parquet"
    },
    "json": {
      "type": "json"
    }
  }
}
Thanks,
Alex