Paul Rogers created DRILL-5553:
----------------------------------
Summary: SELECT *, columns produces nonsense results
Key: DRILL-5553
URL: https://issues.apache.org/jira/browse/DRILL-5553
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Paul Rogers
Priority: Minor
Consider the case discussed in DRILL-5551. Create a slight variation.
Input file: CSV with headers:
{code}
a,b,c
10,foo,bar
{code}
As in DRILL-5550, the CSV plugin is configured to use headers.
Run this (admittedly strange) query:
{code}
SELECT *, columns FROM `dfs.data.example.csv`
{code}
The resulting schema is:
{code}
BatchSchema [fields=[
a(VARCHAR:REQUIRED) [$offsets$(UINT4:REQUIRED)],
b(VARCHAR:REQUIRED) [$offsets$(UINT4:REQUIRED)],
c(VARCHAR:REQUIRED) [$offsets$(UINT4:REQUIRED)],
columns(INT:OPTIONAL) [$bits$(UINT1:REQUIRED), columns(INT:OPTIONAL)]],
selectionVector=NONE]
{code}
To make it easier to read:
{code}
a(VARCHAR:REQUIRED),
b(VARCHAR:REQUIRED),
c(VARCHAR:REQUIRED),
columns(INT:OPTIONAL)
{code}
In DRILL-5551, {{columns}} changes meaning from an array of columns to a blank,
ordinary column. Here, it changes meaning again, this time to a nullable INT
(our usual "placeholder" type for missing columns).
Expected:
1. That, per DRILL-5552, no other column reference can occur with "*".
2. If item 1 is not fixed, that the scanner (or text reader) forbid the use of
either "*" or "columns" with other column references.
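A minimal sketch of the check described in item 2, assuming the projection list reaches the reader as a plain list of column names. The class {{ProjectionCheck}} and its {{validate}} method are hypothetical names used for illustration only; they are not existing Drill code, and the real scanner may represent projections differently.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Drill: rejects projection lists that mix
// the wildcard or the `columns` array with any other column reference.
class ProjectionCheck {

  static final String WILDCARD = "*";
  static final String COLUMNS = "columns";

  // Throws IllegalArgumentException if "*" or "columns" appears together
  // with any other projected name; otherwise returns normally.
  // Assumption: column names are matched case-insensitively.
  static void validate(List<String> projection) {
    String special = null;
    for (String name : projection) {
      if (WILDCARD.equals(name) || COLUMNS.equalsIgnoreCase(name)) {
        special = name;
        break;
      }
    }
    // A special name is only legal when it is the sole projected item.
    if (special != null && projection.size() > 1) {
      throw new IllegalArgumentException(
          "Cannot combine \"" + special + "\" with other column references: "
              + projection);
    }
  }

  public static void main(String[] args) {
    validate(Arrays.asList("*"));       // accepted
    validate(Arrays.asList("a", "b"));  // accepted
    try {
      validate(Arrays.asList("*", "columns"));  // the query from this report
    } catch (IllegalArgumentException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

This sketch also rejects a repeated special name such as "*, *", which seems acceptable for a first cut. Failing at plan or setup time would replace the current behavior of silently producing a nullable INT column.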