[jira] [Comment Edited] (DRILL-7308) Incorrect Metadata from text file queries

Paul Rogers (JIRA) Mon, 24 Jun 2019 22:28:10 -0700


    [ 
https://issues.apache.org/jira/browse/DRILL-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16871990#comment-16871990
 ]


Paul Rogers edited comment on DRILL-7308 at 6/25/19 5:27 AM:
-------------------------------------------------------------

The width issue appears to have been introduced with this commit: "DRILL-6847: 
Add Query Metadata to RESTful Interface" (which, ahem, [~cgivre], was your 
PR...). In {{WebUserConnection}}:

{code:java}
          //For DECIMAL type
          if (col.getType().hasPrecision()) {
            dataType.append("(");
            dataType.append(col.getType().getPrecision());

            if (col.getType().hasScale()) {
              dataType.append(", ");
              dataType.append(col.getType().getScale());
            }

            dataType.append(")");
          } else if (col.getType().hasWidth()) {
            //Case for VARCHAR columns with specified width
            dataType.append("(");
            dataType.append(col.getType().getWidth());
            dataType.append(")");
          }
{code}

I did not debug the code, but it appears that {{hasPrecision()}} and 
{{hasScale()}} simply report whether the field is set; they do *not* tell us 
whether the value is zero.
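
To make the distinction concrete, here is a minimal sketch of "set" versus "non-zero". {{FakeType}} is a hypothetical stand-in written for this example (a null {{Integer}} models an unset field); it is not Drill's actual type class, and the real accessors may differ in detail.

```java
// Hypothetical stand-in for a type descriptor with optional fields.
// A null Integer models "field not set"; this is NOT Drill's own class.
final class PresenceDemo {
  static final class FakeType {
    private final Integer precision;
    private final Integer scale;

    FakeType(Integer precision, Integer scale) {
      this.precision = precision;
      this.scale = scale;
    }

    boolean hasPrecision() { return precision != null; }
    int getPrecision() { return precision == null ? 0 : precision; }
    boolean hasScale() { return scale != null; }
    int getScale() { return scale == null ? 0 : scale; }
  }

  // Same shape as the check quoted above: tests presence only, ignores the value.
  static String suffixPresenceOnly(FakeType type) {
    StringBuilder sb = new StringBuilder();
    if (type.hasPrecision()) {
      sb.append("(").append(type.getPrecision());
      if (type.hasScale()) {
        sb.append(", ").append(type.getScale());
      }
      sb.append(")");
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    FakeType setToZero = new FakeType(0, 0);   // fields set, but zero
    FakeType unset = new FakeType(null, null); // fields never set
    System.out.println("VARCHAR" + suffixPresenceOnly(setToZero)); // VARCHAR(0, 0)
    System.out.println("VARCHAR" + suffixPresenceOnly(unset));     // VARCHAR
  }
}
```

If the text reader sets precision and scale explicitly to zero, a presence-only check would produce exactly the {{VARCHAR(0, 0)}} seen in the report.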

Also, about a year or so ago, Drill moved {{VARCHAR}} width to the precision 
field, so the supposed {{VARCHAR}} code block is a no-op.

The correct code would be something like:

{code:java}
          //For DECIMAL and VARCHAR types
          if (col.getType().hasPrecision() && col.getType().getPrecision() > 0) {
            dataType.append("(");
            dataType.append(col.getType().getPrecision());

            if (col.getType().hasScale() && col.getType().getScale() > 0) {
              dataType.append(", ");
              dataType.append(col.getType().getScale());
            }

            dataType.append(")");
          }
{code}
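
The guarded check can be tried in isolation with a small sketch like the one below. {{FakeType}} and {{label()}} are hypothetical names invented for this example (null models an unset field); this is not the {{WebUserConnection}} code itself, only the same condition applied to a stand-in type.

```java
// Stand-alone sketch of the guarded formatting logic.
// FakeType is a hypothetical stand-in; null means "field not set".
final class TypeLabelDemo {
  static final class FakeType {
    private final String name;
    private final Integer precision;
    private final Integer scale;

    FakeType(String name, Integer precision, Integer scale) {
      this.name = name;
      this.precision = precision;
      this.scale = scale;
    }

    String getName() { return name; }
    boolean hasPrecision() { return precision != null; }
    int getPrecision() { return precision == null ? 0 : precision; }
    boolean hasScale() { return scale != null; }
    int getScale() { return scale == null ? 0 : scale; }
  }

  // Appends "(precision[, scale])" only when the values are set AND positive.
  static String label(FakeType type) {
    StringBuilder dataType = new StringBuilder(type.getName());
    if (type.hasPrecision() && type.getPrecision() > 0) {
      dataType.append("(");
      dataType.append(type.getPrecision());
      if (type.hasScale() && type.getScale() > 0) {
        dataType.append(", ");
        dataType.append(type.getScale());
      }
      dataType.append(")");
    }
    return dataType.toString();
  }

  public static void main(String[] args) {
    System.out.println(label(new FakeType("VARCHAR", 0, 0)));     // VARCHAR
    System.out.println(label(new FakeType("VARCHAR", 50, null))); // VARCHAR(50)
    System.out.println(label(new FakeType("DECIMAL", 10, 2)));    // DECIMAL(10, 2)
  }
}
```

One consequence of the {{> 0}} guard on scale: a decimal with precision 10 and scale 0 would print as {{DECIMAL(10)}} rather than {{DECIMAL(10, 0)}}, which may or may not be what REST clients expect.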


> Incorrect Metadata from text file queries
> -----------------------------------------
>
>                 Key: DRILL-7308
>                 URL: https://issues.apache.org/jira/browse/DRILL-7308
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Metadata
>    Affects Versions: 1.17.0
>            Reporter: Charles Givre
>            Priority: Major
>         Attachments: Screen Shot 2019-06-24 at 3.16.40 PM.png, domains.csvh
>
>
> I'm noticing some strange behavior with the newest version of Drill.  If you 
> query a CSV file, you get the following metadata:
> {code:sql}
> SELECT * FROM dfs.test.`domains.csvh` LIMIT 1
> {code}
> {code:json}
> {
>   "queryId": "22eee85f-c02c-5878-9735-091d18788061",
>   "columns": [
>     "domain"
>   ],
>   "rows": [
>     {
>       "domain": "thedataist.com"
>     }
>   ],
>   "metadata": [
>     "VARCHAR(0, 0)",
>     "VARCHAR(0, 0)"
>   ],
>   "queryState": "COMPLETED",
>   "attemptedAutoLimit": 0
> }
> {code}
> There are two issues here:
> 1.  VARCHAR is now reported with a precision and scale: {{VARCHAR(0, 0)}}
> 2.  There are twice as many columns as there should be.
> Additionally, if you query a regular CSV, without the columns extracted, you 
> get the following:
> {code:json}
>   "rows": [
>     {
>       "columns": "[\"ACCT_NUM\",\"PRODUCT\",\"MONTH\",\"REVENUE\"]"
>     }
>   ],
>   "metadata": [
>     "VARCHAR(0, 0)",
>     "VARCHAR(0, 0)"
>   ],
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
