Jaehwa Jung created TAJO-2093:
---------------------------------
Summary: Partition Pruning doesn't handle Constant folding
occasionally for BETWEEN clause.
Key: TAJO-2093
URL: https://issues.apache.org/jira/browse/TAJO-2093
Project: Tajo
Issue Type: Bug
Components: Planner/Optimizer
Affects Versions: 0.12.0, 0.11.1
Reporter: Jaehwa Jung
Assignee: Jaehwa Jung
Fix For: 0.11.2
In some cases, Partition Pruning doesn't handle Constant folding for
{{BETWEEN|| clause as shown below.
{code:xml}
default> CREATE EXTERNAL TABLE student (
id INT,
name TEXT,
grade TEXT,
phone TEXT
) USING TEXT
PARTITION BY COLUMN (entrance_date TEXT)
LOCATION '/data/student';
default> \dfs -ls /data/student;
Found 6 items
drwx------ - tajo supergroup 0 2016-03-14 11:13
/data/student/entrance_date=20160308
drwx------ - tajo supergroup 0 2016-03-14 11:13
/data/student/entrance_date=20160309
drwx------ - tajo supergroup 0 2016-03-14 11:13
/data/student/entrance_date=20160310
drwx------ - tajo supergroup 0 2016-03-14 11:13
/data/student/entrance_date=20160311
drwx------ - tajo supergroup 0 2016-03-14 11:13
/data/student/entrance_date=20160312
drwx------ - tajo supergroup 0 2016-03-14 11:13
/data/student/entrance_date=20160313
default> \! date
2016. 03. 14. (MON) 15:19:27 KST
default> SELECT * FROM student WHERE entrance_date BETWEEN
TO_CHAR(ADD_DAYS(CURRENT_DATE(), -2) , 'YYYYMMDD') AND TO_CHAR(
ADD_DAYS(CURRENT_DATE(), -1) , 'YYYYMMDD');
{code}
Above select query returns two rows as expected. But when I found logical plan
on Tajo web UI, tajo retrieves all sub-directories of the example table as
following:
{code:xml}
PARTITIONS_SCAN(4) on default.student
=> filter: default.student.entrance_date (TEXT) BETWEEN ASYMMETRIC
to_char(add_days(current_date(),-2),YYYYMMDD) AND
to_char(add_days(current_date(),-1),YYYYMMDD)
=> target list: default.student.id (INT4), default.student.name (TEXT),
default.student.grade (TEXT), default.student.phone (TEXT),
default.student.entrance_date (TEXT)
=> num of filtered paths: 6
=> out schema: {(5) default.student.id (INT4), default.student.name (TEXT),
default.student.grade (TEXT), default.student.phone (TEXT),
default.student.entrance_date (TEXT)}
=> in schema: {(4) default.student.id (INT4), default.student.name (TEXT),
default.student.grade (TEXT), default.student.phone (TEXT)}
=> 0: hdfs://localhost:9010/data/student/entrance_date=20160308
=> 1: hdfs://localhost:9010/data/student/entrance_date=20160309
=> 2: hdfs://localhost:9010/data/student/entrance_date=20160310
=> 3: hdfs://localhost:9010/data/student/entrance_date=20160311
=> 4: hdfs://localhost:9010/data/student/entrance_date=20160312
=> 5: hdfs://localhost:9010/data/student/entrance_date=20160313
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)