[ 
https://issues.apache.org/jira/browse/DRILL-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andre Pomp updated DRILL-4306:
------------------------------
    Description: 
Hello everyone,

Today I defined my own storage plugin and observed some strange behaviour. 

The plugin has the following structure: 
{
  "type": "file",
  "enabled": true,
  "connection": "file:///",
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null
    },
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "hs": {
      "type": "json",
      "extensions": [
        "hs"
      ]
    }
  }
}

I prepared a folder with the following files inside. 
.../testdir/test.hs
.../testdir/test1.hs
.../testdir/test2.hs
.../testdir/test3.hss
.../testdir/test4.csv

Based on this folder, I started to prepare queries:

myplugin.`C:Users\someuser\Desktop\testdir`

Here, I expected that the plugin only selects files with the .hs extension. 
However, I found that all files were loaded, instead of only the .hs 
files, which is what Drill loads when querying 
myplugin.`C:Users\someuser\Desktop\testdir\*.hs`
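
To make the expected behaviour concrete, here is a minimal Python sketch of the filtering I would expect for a directory query. It is not Drill code. The names `configured_extensions` and `select_files` are hypothetical, the "formats" section is copied from the plugin definition above, and the shortened "testdir/..." paths stand in for the real folder.

```python
import json
from pathlib import PurePosixPath

# "formats" section copied from the plugin definition in this report.
PLUGIN_FORMATS = json.loads('{"hs": {"type": "json", "extensions": ["hs"]}}')


def configured_extensions(formats):
    """Collect every extension declared by any format of the plugin."""
    return {
        ext.lower()
        for fmt in formats.values()
        for ext in fmt.get("extensions", [])
    }


def select_files(paths, formats):
    """Keep only the files whose extension is declared in the plugin formats."""
    allowed = configured_extensions(formats)
    return [
        p for p in paths
        if PurePosixPath(p).suffix.lstrip(".").lower() in allowed
    ]


# Hypothetical stand-ins for the files listed in this report.
files = [
    "testdir/test.hs",
    "testdir/test1.hs",
    "testdir/test2.hs",
    "testdir/test3.hss",
    "testdir/test4.csv",
]

# Only the three .hs files remain; test3.hss and test4.csv are skipped.
print(select_files(files, PLUGIN_FORMATS))
```

The comparison is on the whole extension, so test3.hss is not treated as an .hs file.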

However, I also observed further strange behaviour. 
When the first file in the folder has the .hs extension, all files are loaded 
(regardless of their extension, as described above). However, when the first 
file has another extension (e.g., .json, .csv), Apache Drill fails 
and reports that the folder contains invalid extensions. Therefore, I currently 
assume that the implementation only checks the extension of the first 
file in the folder and then reacts as described above. 
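
The behaviour I am assuming can be modelled with the following Python sketch. It only illustrates my guess and is not taken from the Drill source code. The function name `hypothesized_directory_scan`, the error text, and the assumption that the "first file" is simply the first entry of the listing are all hypothetical.

```python
def hypothesized_directory_scan(paths, allowed_extensions):
    """Model of the assumed behaviour: only the first file's extension is checked.

    If the first file matches, every file is returned regardless of its
    extension. If it does not match, the whole query fails.
    """
    if not paths:
        return []
    first_ext = paths[0].rsplit(".", 1)[-1].lower()
    if first_ext not in allowed_extensions:
        # Placeholder message, not the actual Drill error text.
        raise ValueError("folder contains invalid extensions")
    return list(paths)


# Hypothetical stand-ins for the files listed in this report.
files = [
    "testdir/test.hs",
    "testdir/test1.hs",
    "testdir/test2.hs",
    "testdir/test3.hss",
    "testdir/test4.csv",
]

# First file is .hs, so all five files are returned, including .hss and .csv.
print(hypothesized_directory_scan(files, {"hs"}))

# First file is .csv, so the whole scan fails.
try:
    hypothesized_directory_scan(["testdir/a.csv"] + files, {"hs"})
except ValueError as err:
    print("failed:", err)
```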

In my opinion, this behaviour is strange. 

I would expect Drill to behave as it does for the 
following query:
myplugin.`C:Users\someuser\Desktop\testdir\*.hs`

Best regards,
André

  was:
Hello everyone,

Today I defined my own storage plugin and observed some strange behaviour. 

The plugin has the following structure: 
{
  "type": "file",
  "enabled": true,
  "connection": "file:///",
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null
    },
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "hs": {
      "type": "json",
      "extensions": [
        "hs"
      ]
    }
  }
}

I prepared a folder with the following files inside. 
.../testdir/test.hs
.../testdir/test1.hs
.../testdir/test2.hs
.../testdir/test3.hss
.../testdir/test4.csv

Based on this folder, I started to prepare queries:

dfs.`C:Users\someuser\Desktop\testdir`

Here, I expected that the plugin only selects files with the .hs extension. 
However, I found that all files were loaded, instead of only the .hs 
files, which is what Drill loads when querying 
dfs.`C:Users\someuser\Desktop\testdir\*.hs`

However, I also observed further strange behaviour. 
When the first file in the folder has the .hs extension, all files are loaded 
(regardless of their extension, as described above). However, when the first 
file has another extension (e.g., .json, .csv), Apache Drill fails 
and reports that the folder contains invalid extensions. Therefore, I currently 
assume that the implementation only checks the extension of the first 
file in the folder and then reacts as described above. 

In my opinion, this behaviour is strange. 

I would expect Drill to behave as it does for the 
following query:
dfs.`C:Users\someuser\Desktop\testdir\*.hs`

Best regards,
André


> Extensions for Storage Plugins are not recognized properly 
> -----------------------------------------------------------
>
>                 Key: DRILL-4306
>                 URL: https://issues.apache.org/jira/browse/DRILL-4306
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Functions - Drill
>    Affects Versions: 1.3.0, 1.4.0
>         Environment: Apache Drill 1.3.0 (and 1.4.0), Windows 10, Java 8
>            Reporter: Andre Pomp
>              Labels: bug, improve
>             Fix For: Future
>
>
> Hello everyone,
> Today I defined my own storage plugin and observed some strange 
> behaviour. 
> The plugin has the following structure: 
> {
>   "type": "file",
>   "enabled": true,
>   "connection": "file:///",
>   "workspaces": {
>     "root": {
>       "location": "/",
>       "writable": false,
>       "defaultInputFormat": null
>     },
>     "tmp": {
>       "location": "/tmp",
>       "writable": true,
>       "defaultInputFormat": null
>     }
>   },
>   "formats": {
>     "hs": {
>       "type": "json",
>       "extensions": [
>         "hs"
>       ]
>     }
>   }
> }
> I prepared a folder with the following files inside. 
> .../testdir/test.hs
> .../testdir/test1.hs
> .../testdir/test2.hs
> .../testdir/test3.hss
> .../testdir/test4.csv
> Based on this folder, I started to prepare queries:
> myplugin.`C:Users\someuser\Desktop\testdir`
> Here, I expected that the plugin only selects files with the .hs extension. 
> However, I found that all files were loaded, instead of only the .hs 
> files, which is what Drill loads when querying 
> myplugin.`C:Users\someuser\Desktop\testdir\*.hs`
> However, I also observed further strange behaviour. 
> When the first file in the folder has the .hs extension, all files are loaded 
> (regardless of their extension, as described above). However, when the first 
> file has another extension (e.g., .json, .csv), Apache Drill 
> fails and reports that the folder contains invalid extensions. Therefore, I 
> currently assume that the implementation only checks the extension of 
> the first file in the folder and then reacts as described above. 
> In my opinion, this behaviour is strange. 
> I would expect Drill to behave as it does for the 
> following query:
> myplugin.`C:Users\someuser\Desktop\testdir\*.hs`
> Best regards,
> André



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
