[ 
https://issues.apache.org/jira/browse/DRILL-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087229#comment-16087229
 ] 

Volodymyr Vysotskyi commented on DRILL-4264:
--------------------------------------------

On the one hand, the user knows that the {{tmp}} workspace is inside the 
{{dfs}} plugin, so the user expects that the query 
{code:sql}
use dfs.tmp;
{code}
should work. On the other hand, the query 
{code:sql}
show schemas;
{code}
returns the schema name {{dfs.tmp}}, so the user also expects that a query 
using that schema name should work:
{code:sql}
use `dfs.tmp`;
{code}
Both of these cases work because the schema name and its path are the same.

Schema names containing dots currently do not work in Drill. This is because 
schema paths are handled in [this 
way|https://github.com/apache/drill/blob/3e8b01d5b0d3013e3811913f0fd6028b22c1ac3f/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserSession.java#L201].
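
To illustrate the problem, here is a minimal sketch that assumes the schema 
path is resolved by splitting the string on every dot. The class and method 
names ({{SchemaPathSketch}}, {{splitSchemaPath}}) and the schema name 
{{my.schema}} are hypothetical and are not taken from the Drill code base.
{code:java}
import java.util.Arrays;
import java.util.List;

public class SchemaPathSketch {
    // Hypothetical helper (not Drill's API): treat every dot as a separator.
    static List<String> splitSchemaPath(String schemaPath) {
        return Arrays.asList(schemaPath.split("\\."));
    }

    public static void main(String[] args) {
        // Workspace "tmp" inside plugin "dfs": two components, as intended.
        System.out.println(splitSchemaPath("dfs.tmp"));       // [dfs, tmp]
        // A schema literally named "my.schema" inside "dfs" is split as well,
        // so its name can never be looked up as a single component.
        System.out.println(splitSchemaPath("dfs.my.schema")); // [dfs, my, schema]
    }
}
{code}
Under this assumption {{dfs.tmp}} resolves correctly only because its name 
and its path coincide, while any schema whose own name contains a dot is 
broken into extra components.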

The queries
{code:sql}
SELECT `dfs`.`ds`.`foo.json`.`a`.`b.c` FROM `dfs`.`ds`.`foo.json`; -- (1)
SELECT `dfs.ds.foo.json.a`.`b.c` FROM `dfs.ds`.`foo.json`; -- (2)
SELECT `dfs.ds.foo.json.a.b.c` FROM `dfs.ds.foo.json`; -- (3)
{code}
will not work, since Drill allows only table names or aliases before field 
names. 
Considering only the FROM clause, query (3) would still not work, since Drill 
assumes that {{`dfs.ds.foo.json`}} is a schema name only.
Queries on directories with dots and on table names with dots also work 
correctly. 
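
The difference between the three FROM clauses can be sketched by splitting 
each compound identifier only on dots that lie outside backtick quotes. This 
is an illustration under that assumption, not Drill's actual parser; 
{{QuotedIdentifierSketch}} and {{segments}} are hypothetical names.
{code:java}
import java.util.ArrayList;
import java.util.List;

public class QuotedIdentifierSketch {
    // Hypothetical helper: split on dots outside backticks; dots inside
    // a quoted segment stay part of that segment.
    static List<String> segments(String identifier) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean quoted = false;
        for (char c : identifier.toCharArray()) {
            if (c == '`') {
                quoted = !quoted;               // toggle state, drop the quote
            } else if (c == '.' && !quoted) {
                result.add(current.toString()); // unquoted dot ends a segment
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        result.add(current.toString());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(segments("`dfs`.`ds`.`foo.json`")); // [dfs, ds, foo.json]
        System.out.println(segments("`dfs.ds`.`foo.json`"));   // [dfs.ds, foo.json]
        System.out.println(segments("`dfs.ds.foo.json`"));     // [dfs.ds.foo.json]
    }
}
{code}
In forms (1) and (2) the last segment is the table name {{foo.json}}. In form 
(3) there is only one segment, so nothing is left to serve as a table name 
once that segment is taken to be the schema.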

Hive has an 
[option|https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.support.quoted.identifiers]
that allows dots in column names (dots in column names are allowed by 
default). 
Parquet also allows field names with dots. 
In addition, the current version of Drill can create Parquet files with dots 
in field names, but Drill will fail when querying such a file.

> Dots in identifier are not escaped correctly
> --------------------------------------------
>
>                 Key: DRILL-4264
>                 URL: https://issues.apache.org/jira/browse/DRILL-4264
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Codegen
>            Reporter: Alex
>            Assignee: Volodymyr Vysotskyi
>
> If you have some json data like this...
> {code:javascript}
>     {
>       "0.0.1":{
>         "version":"0.0.1",
>         "date_created":"2014-03-15"
>       },
>       "0.1.2":{
>         "version":"0.1.2",
>         "date_created":"2014-05-21"
>       }
>     }
> {code}
> ... there is no way to select any of the rows since their identifiers contain 
> dots and when trying to select them, Drill throws the following error:
> Error: SYSTEM ERROR: UnsupportedOperationException: Unhandled field reference 
> "0.0.1"; a field reference identifier must not have the form of a qualified 
> name
> This must be fixed since there are many json data files containing dots in 
> some of the keys (e.g. when specifying version numbers etc)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
