Lucian Poth created DRILL-5535:
----------------------------------
Summary: Paging Problem with Querying Directories
Key: DRILL-5535
URL: https://issues.apache.org/jira/browse/DRILL-5535
Project: Apache Drill
Issue Type: Bug
Components: Functions - Drill
Affects Versions: 1.10.0
Environment: Debian 8
Hadoop with Kerberos security
Reporter: Lucian Poth
The problem occurs with the following Drill query:
"SELECT * FROM <<mySource>>
WHERE (dir0='Test1' AND dir1='TestDataSourceID1')
OR (dir0='Test2' AND dir1='TestDataSourceID2')
LIMIT 2 OFFSET 0"
If this query is run twice, the file that the returned rows come from varies
randomly between runs. So if I build a query to page through the result, I
cannot tell which source a given page came from.
In addition, if file1 contains the columns a, b, c and file2 contains the
columns b, c, d, the shape of the result is inconsistent: for example, the
first results contain the columns a, b, c and the second half of the results
contains a, b, c, d with a filled with null.
The example on your webpage
(https://drill.apache.org/docs/querying-directories/) queries specific columns
and orders the result without any paging, so I am wondering whether this
problem only occurs when using the star in the query.
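For reference, a sketch of the non-star variant this question is about. It is
untested. The column names b and c are taken from the example above (the
columns both files share), and ordering by dir0, dir1, b is only an assumption
about what might make the paging repeatable:

```sql
-- Untested sketch: explicit columns plus ORDER BY instead of SELECT *.
-- <<mySource>> is the same placeholder as in the query above.
-- b and c are the hypothetical columns shared by file1 and file2.
SELECT dir0, dir1, b, c
FROM <<mySource>>
WHERE (dir0='Test1' AND dir1='TestDataSourceID1')
   OR (dir0='Test2' AND dir1='TestDataSourceID2')
ORDER BY dir0, dir1, b
LIMIT 2 OFFSET 0
```

Selecting dir0 and dir1 would also show which directory each row came from.
The order would only be fully repeatable if the ORDER BY columns identify each
row uniquely.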