Re: Queries over Swift?

Sudheesh Katkam Wed, 14 Sep 2016 10:51:07 -0700

AFAIK, there is no documentation, and I am not sure anyone has tried it before. 
That said, according to [1], the hadoop-openstack module lets Apache Hadoop 
applications, including MapReduce jobs, read and write data to and from 
instances of the OpenStack Swift object store. Drill uses the HDFS client 
library, so using Swift through Drill should be possible.

My guess: create a storage plugin named “swift” and copy the contents of the 
“dfs” plugin into it as a starting point. I am not sure exactly what the 
“swift” plugin should contain; see [1] and [2]. The parameters and values 
mentioned in the “Configuring” section of [1] should be provided through the 
“config” map in the storage plugin (or maybe through conf/core-site.xml in the 
Drill installation directory).

Something like:
{
  "type": "file",
  "enabled": true,
  "connection": "swift://dmitry.privatecloud/out/results",
  "workspaces": {
    ...
  },
  "formats": {
    ...
  },
  "config": {
    ...
  }
}
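
As an untested sketch, the value of the “config” map could carry the Swift 
properties listed in the “Configuring” section of [1]. The property names 
follow that page; the service name "privatecloud" is taken from the connection 
URL above, and every value here (auth URL, username, password, tenant) is a 
placeholder assumption that must be replaced with real settings:

```json
{
  "fs.swift.impl": "org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem",
  "fs.swift.service.privatecloud.auth.url": "https://AUTH_HOST:5000/v2.0/tokens",
  "fs.swift.service.privatecloud.username": "USERNAME",
  "fs.swift.service.privatecloud.password": "PASSWORD",
  "fs.swift.service.privatecloud.tenant": "TENANT",
  "fs.swift.service.privatecloud.public": "true"
}
```

The hadoop-openstack jar would presumably also have to be on Drill's classpath 
for the swift:// scheme to resolve; that part is a guess as well.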

A roundabout way would be to access Swift through an S3-compatible API [3] and 
then use the S3 storage plugin [2]. Again, I do not know the exact 
configuration details.
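
A rough, untested sketch of that route: with the swift3 middleware [3] exposing 
an S3-compatible endpoint, the plugin's "connection" would be an s3a:// URL 
(for example "s3a://BUCKET", a placeholder) and the “config” map could point 
the Hadoop S3A client at that endpoint. The property names are standard Hadoop 
S3A settings; the endpoint host and the credentials are placeholder 
assumptions:

```json
{
  "fs.s3a.endpoint": "SWIFT_PROXY_HOST:8080",
  "fs.s3a.access.key": "ACCESS_KEY",
  "fs.s3a.secret.key": "SECRET_KEY"
}
```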

Once you get things to work, you can also add a section to the Drill docs based 
on your experience!

Thank you,
Sudheesh

[1] https://hadoop.apache.org/docs/stable2/hadoop-openstack/index.html
[2] http://drill.apache.org/docs/s3-storage-plugin/
[3] https://github.com/openstack/swift3

> On Sep 14, 2016, at 9:50 AM, MattK <[email protected]> wrote:
> 
> The Drill FAQ mentions that Swift can be queried as well as S3.
> 
> I have found an S3 plugin (https://drill.apache.org/docs/s3-storage-plugin/) 
> but nothing yet for docs, examples, or plugins for Swift.
> 
> Is there any documentation available?
