Robert Hou created DRILL-6594:
---------------------------------

             Summary: Data batches for Project operator are not being split properly and exceed the maximum specified
                 Key: DRILL-6594
                 URL: https://issues.apache.org/jira/browse/DRILL-6594
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.14.0
            Reporter: Robert Hou
            Assignee: Karthikeyan Manivannan
             Fix For: 1.14.0


I ran these session settings and this query:
alter session set `drill.exec.memory.operator.project.output_batch_size` = 131072;
alter session set `planner.width.max_per_node` = 1;
alter session set `planner.width.max_per_query` = 1;
select * from (
  select
    case when false then c.CharacterValuea else i.IntegerValuea end IntegerValuea,
    case when false then c.CharacterValueb else i.IntegerValueb end IntegerValueb,
    case when false then c.CharacterValuec else i.IntegerValuec end IntegerValuec,
    case when false then c.CharacterValued else i.IntegerValued end IntegerValued,
    case when false then c.CharacterValuee else i.IntegerValuee end IntegerValuee
  from (select * from dfs.`/drill/testdata/batch_memory/character5_1MB.parquet`
        order by CharacterValuea) c,
       dfs.`/drill/testdata/batch_memory/integer5_1MB.parquet` i
  where i.Index = c.Index and
        c.CharacterValuea = '1234567890123100') limit 10;

An incoming batch looks like this:
2018-06-14 19:28:10,905 [24dcdbc7-2f42-16a9-56f1-9cf58bc549bc:frag:5:0] DEBUG o.a.d.e.p.i.p.ProjectMemoryManager - BATCH_STATS, incoming: Batch size: { Records: 32768, Total size: 20512768, Data size: 9175040, Gross row width: 626, Net row width: 280, Density: 45% }

An outgoing batch looks like this:
2018-06-14 19:28:10,911 [24dcdbc7-2f42-16a9-56f1-9cf58bc549bc:frag:5:0] DEBUG o.a.d.e.p.i.p.ProjectRecordBatch - BATCH_STATS, outgoing: Batch size: { Records: 1023, Total size: 11018240, Data size: 138105, Gross row width: 10771, Net row width: 135, Density: 2% }

The data size of the outgoing batch (138105 bytes) exceeds the maximum batch size set for the Project operator (131072 bytes).
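
The comparison can be reproduced from the log line with a short script. This is only a sketch for checking the numbers in this report: parse_batch_stats is a hypothetical helper, not part of Drill, and the row-count estimate assumes that data size equals records times net row width, which both log lines above are consistent with (1023 * 135 = 138105 and 32768 * 280 = 9175040).

```python
import re

# Batch size limit from the `alter session` statement in the report (bytes).
OUTPUT_BATCH_SIZE_LIMIT = 131072

# Outgoing BATCH_STATS text copied from the report, joined onto one line.
OUTGOING = (
    "BATCH_STATS, outgoing: Batch size: { Records: 1023, "
    "Total size: 11018240, Data size: 138105, Gross row width: 10771, "
    "Net row width: 135, Density: 2% }"
)


def parse_batch_stats(line):
    """Hypothetical helper: extract the integer fields of a BATCH_STATS line."""
    fields = re.findall(r"([A-Za-z ]+): (\d+)", line)
    return {name.strip(): int(value) for name, value in fields}


stats = parse_batch_stats(OUTGOING)

# How far the outgoing batch is over the configured limit.
overshoot = stats["Data size"] - OUTPUT_BATCH_SIZE_LIMIT

# Assumption: data size = records * net row width, so this is the largest
# row count that would have stayed within the limit at this row width.
max_rows = OUTPUT_BATCH_SIZE_LIMIT // stats["Net row width"]

print("data size:", stats["Data size"])
print("overshoot in bytes:", overshoot)
print("rows that fit at this row width:", max_rows)
```

Under that assumption the batch is 7033 bytes over the limit, and at a net row width of 135 the batch would have needed to stop at 970 records rather than 1023 to stay within 131072 bytes.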



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)