Can you set up an NFS connection to the web server? If so, you could just use a local fs storage plugin with the NFS mount as the workspace.
I have not tried it myself, but it may be an option to test in your case.

--Andries

> On Oct 7, 2016, at 11:39 AM, Di Pe <[email protected]> wrote:
>
> Hi,
>
> I have a couple of hundred csv files on a web server that I can just pull down
> via https without any credentials. I wonder how I can write a storage
> plugin for Drill that pulls these files directly from the web server
> without having to download them to the local file system.
>
> I have a couple of options:
>
> 1) the plugin could just do a simple http directory listing to get these
> files
> 2) I could provide a text file with the urls of the files, simply like
> https://mywebserver.com/myfolder/myfile1.csv
> https://mywebserver.com/myfolder/myfile2.csv
> 3) the web server supports json file listing like this
> curl -s https://mywebserver.com/myfolder?format=json | python -m json.tool
> [
>     {
>         "hash": "e5f62378c79ec9c491aa130374dba93b",
>         "last_modified": "2016-09-30T19:15:45.730950",
>         "bytes": 211169,
>         "name": "myfile1.csv",
>         "content_type": "text/csv"
>     },
>     {
>
> Option 3 would be the most elegant to me
>
> does something like this already exist or would I duplicate the s3 plugin
> and modify it?
>
> like this ?
>
> Thanks for your help!
> dipe
>
> {
>   "type": "file",
>   "enabled": true,
>   "connection": "https://mywebserver.com/myfolder?format=json",
>   "config": null,
>   "workspaces": {
>     "root": {
>       "location": "/",
>       "writable": false,
>       "defaultInputFormat": null
>     },
>     "tmp": {
>       "location": "/tmp",
>       "writable": true,
>       "defaultInputFormat": null
>     }
>   },
>   "formats": {
>     "psv": {
>       "type": "text",
>       "extensions": [
>         "tbl"
>       ],
>       "delimiter": "|"
>     },
>     "csv": {
>       "type": "text",
>       "extensions": [
>         "csv"
>       ],
>       "delimiter": ","
>     },
>     "tsv": {
>       "type": "text",
>       "extensions": [
>         "tsv"
>       ],
>       "delimiter": "\t"
>     },
>     "parquet": {
>       "type": "parquet"
>     },
>     "json": {
>       "type": "json",
>       "extensions": [
>         "json"
>       ]
>     },
>     "avro": {
>       "type": "avro"
>     },
>     "sequencefile": {
>       "type": "sequencefile",
>       "extensions": [
>         "seq"
>       ]
>     },
>     "csvh": {
>       "type": "text",
>       "extensions": [
>         "csvh"
>       ],
>       "extractHeader": true,
>       "delimiter": ","
>     }
>   }
> }
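
If an NFS mount is not possible, one fallback is to mirror the files into a directory that a local fs workspace points at, driven by the JSON listing from option 3 in the quoted message. The following is only a rough sketch under several assumptions: it does download the files (which the original question hoped to avoid), `TARGET_DIR` is a hypothetical path, the helper names `csv_urls` and `mirror` are made up for illustration, and it assumes each file is reachable at the folder URL plus its `name` field. It has not been tested against the actual server.

```python
import json
import os
import urllib.request
from urllib.parse import quote

BASE_URL = "https://mywebserver.com/myfolder"  # folder URL from the quoted message
TARGET_DIR = "/mnt/webserver_csv"              # hypothetical directory for the workspace


def csv_urls(listing, base_url=BASE_URL):
    """Return (name, url) pairs for the .csv entries of a parsed JSON listing."""
    pairs = []
    for entry in listing:
        name = entry["name"]
        if name.lower().endswith(".csv"):
            # Assumption: each file lives directly under the folder URL.
            pairs.append((name, base_url.rstrip("/") + "/" + quote(name)))
    return pairs


def mirror(base_url=BASE_URL, target_dir=TARGET_DIR):
    """Fetch the listing, download every listed CSV, and return the file count."""
    with urllib.request.urlopen(base_url + "?format=json") as resp:
        listing = json.load(resp)
    os.makedirs(target_dir, exist_ok=True)
    pairs = csv_urls(listing, base_url)
    for name, url in pairs:
        # basename() guards against names that contain path separators.
        dest = os.path.join(target_dir, os.path.basename(name))
        urllib.request.urlretrieve(url, dest)
    return len(pairs)

# Usage (performs network access): mirror()
```

A workspace in a file-type storage plugin could then use `TARGET_DIR` as its `location`, so the existing `csv` format entry would pick the files up.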
