Re: Can not apply my custom filter instead of EnumerableCalc

2018-02-08 Thread Masayuki Takahashi
> But I cannot get my Arrow filter applied instead of EnumerableCalc.
> Could you tell me the reason?

I solved my problem by adding an Arrow convention to ArrowFilter. Thanks, Julian.
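
For anyone who finds this thread later, here is a toy model of why that helps. The log prints the node as ArrowFilter.NONE, so it was still in the NONE convention, and as far as I understand the Volcano planner costs such nodes as infinite, so it could never beat EnumerableCalc. The sketch is plain Java, not Calcite API: Plan, Convention, and cheapest are hypothetical names, 125.0 is the row cost from the final plan, and 50.0 is an invented figure for illustration only.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model only: these types are hypothetical and are not Calcite classes.
final class ConventionCostSketch {
  enum Convention { NONE, ARROW, ENUMERABLE }

  static final class Plan {
    final String name;
    final Convention convention;
    final double selfCost;

    Plan(String name, Convention convention, double selfCost) {
      this.name = name;
      this.convention = convention;
      this.selfCost = selfCost;
    }

    // A node still in the NONE convention is treated as not implementable,
    // so it is costed as infinite regardless of its own estimate.
    double cost() {
      return convention == Convention.NONE ? Double.POSITIVE_INFINITY : selfCost;
    }
  }

  // Pick the candidate with the lowest cost, as a planner would.
  static Plan cheapest(List<Plan> candidates) {
    return candidates.stream().min(Comparator.comparingDouble(Plan::cost)).get();
  }

  public static void main(String[] args) {
    Plan calc = new Plan("EnumerableCalc", Convention.ENUMERABLE, 125.0);
    // 50.0 is an assumed cost, chosen only to be lower than 125.0.
    Plan filterNone = new Plan("ArrowFilter", Convention.NONE, 50.0);
    Plan filterArrow = new Plan("ArrowFilter", Convention.ARROW, 50.0);

    System.out.println(cheapest(Arrays.asList(calc, filterNone)).name);  // EnumerableCalc
    System.out.println(cheapest(Arrays.asList(calc, filterArrow)).name); // ArrowFilter
  }
}
```

With the NONE convention the cheaper ArrowFilter still loses; once it carries a real convention it can win on cost.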

> Would you mind logging a JIRA case for it, to track progress?

Created:
https://issues.apache.org/jira/browse/CALCITE-2173







-- 
高橋 真之


Re: Can not apply my custom filter instead of EnumerableCalc

2018-02-05 Thread Masayuki Takahashi
Dear Julian,

> I am delighted that someone is implementing an Arrow adapter. It would
> be a good replacement for the Enumerable adapter someday. Would you
> mind logging a JIRA case for it, to track progress?

I don't yet know the right way to scan data in a column-oriented
style in Calcite, but I have a rough idea, and I would like you to
look at it if you have time. I will create a JIRA case and describe it
once I have solved this problem.

> Maybe the cost of ArrowFilter followed by ArrowToEnumerableConverter is
> higher than ArrowToEnumerableConverter followed by EnumerableCalc.

I see. I will look into the cost calculation. Thanks.




-- 
高橋 真之


Re: Can not apply my custom filter instead of EnumerableCalc

2018-02-04 Thread Julian Hyde
I am delighted that someone is implementing an Arrow adapter. It would
be a good replacement for the Enumerable adapter someday. Would you
mind logging a JIRA case for it, to track progress?

I think ArrowFilter is not being chosen because of its cost. Note
this:

> [rel#56:ArrowFilter.NONE.[](input=rel#17:Subset#0.ARROW.[],expr#0..3={inputs},expr#4=?0,expr#5==($t2,
> $t4),N_NATIONKEY=$t0,N_NAME=$t1,N_REGIONKEY=$t2,N_COMMENT=$t3,$condition=$t5)]
> PLANNER = org.apache.calcite.plan.volcano.VolcanoPlanner@499683c4;
> TICK = 28/26; PHASE = OPTIMIZE; COST = {inf}

Maybe the cost of ArrowFilter followed by ArrowToEnumerableConverter is
higher than ArrowToEnumerableConverter followed by EnumerableCalc.
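
To see what each operator contributes, the cumulative costs in the final plan quoted below can be split into per-operator costs by subtracting each operator's input. A minimal sketch in plain Java (not Calcite API; selfCost is a hypothetical helper), using only the numbers from that plan:

```java
import java.util.Arrays;

// Illustrative arithmetic only; not part of Calcite.
final class CumulativeCostSketch {
  // Costs are {rows, cpu, io}. An operator's own cost is its cumulative
  // cost minus the cumulative cost of its input.
  static double[] selfCost(double[] cumulative, double[] inputCumulative) {
    double[] self = new double[cumulative.length];
    for (int i = 0; i < cumulative.length; i++) {
      self[i] = cumulative[i] - inputCumulative[i];
    }
    return self;
  }

  public static void main(String[] args) {
    double[] scan = {100.0, 101.0, 0.0};      // ArrowTableScan
    double[] converter = {110.0, 111.0, 0.0}; // ArrowToEnumerableConverter
    double[] calc = {125.0, 911.0, 0.0};      // EnumerableCalc

    System.out.println(Arrays.toString(selfCost(converter, scan))); // [10.0, 10.0, 0.0]
    System.out.println(Arrays.toString(selfCost(calc, converter))); // [15.0, 800.0, 0.0]
  }
}
```

An ArrowFilter plan would have to come in below the 125.0 rows / 911.0 cpu total to be preferred, which it cannot do while its own cost is reported as {inf}.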

Julian


On Sun, Feb 4, 2018 at 1:46 AM, Masayuki Takahashi wrote:
> Hi,
>
> I am trying to implement an Arrow adapter.
>
> https://github.com/masayuki038/calcite/tree/arrow/arrow/src/main/java/org/apache/calcite/adapter/arrow
>
> But I cannot get my Arrow filter applied instead of EnumerableCalc.
> Could you tell me the reason?
>
> ArrowFilter was created by ArrowFilterTableScanRule.
> (The logs follow.)
>
> ---
> PLANNER = org.apache.calcite.plan.volcano.VolcanoPlanner@499683c4;
> TICK = 27/25; PHASE = OPTIMIZE; COST = {inf}
> Pop match: rule [ArrowFilterTableScanRule] rels
> [rel#31:LogicalFilter.NONE.[](input=rel#17:Subset#0.ARROW.[],condition==($2,
> ?0)), rel#0:ArrowTableScan.ARROW.[](table=[SAMPLES,
> NATIONSSF],fields=[0, 1, 2, 3])]
> call#254: Apply rule [ArrowFilterTableScanRule] to
> [rel#31:LogicalFilter.NONE.[](input=rel#17:Subset#0.ARROW.[],condition==($2,
> ?0)), rel#0:ArrowTableScan.ARROW.[](table=[SAMPLES,
> NATIONSSF],fields=[0, 1, 2, 3])]
> Transform to: rel#56 via ArrowFilterTableScanRule
> call#254 generated 1 successors:
> [rel#56:ArrowFilter.NONE.[](input=rel#17:Subset#0.ARROW.[],expr#0..3={inputs},expr#4=?0,expr#5==($t2,
> $t4),N_NATIONKEY=$t0,N_NAME=$t1,N_REGIONKEY=$t2,N_COMMENT=$t3,$condition=$t5)]
> PLANNER = org.apache.calcite.plan.volcano.VolcanoPlanner@499683c4;
> TICK = 28/26; PHASE = OPTIMIZE; COST = {inf}
> ---
>
> But it was not present in the final plan.
>
> ---
> Plan after physical tweaks: EnumerableCalc(expr#0..2=[{inputs}],
> expr#3=[?0], expr#4=[=($t2, $t3)], proj#0..1=[{exprs}],
> $condition=[$t4]): rowcount = 15.0, cumulative cost = {125.0 rows,
> 911.0 cpu, 0.0 io}, id = 80
> ArrowToEnumerableConverter: rowcount = 100.0, cumulative cost = {110.0
> rows, 111.0 cpu, 0.0 io}, id = 70
> ArrowTableScan(table=[[SAMPLES, NATIONSSF]], fields=[[0, 1, 2]]):
> rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io},
> id = 54
> ---
>
> I would like to know which rules caused EnumerableCalc to be chosen.
>
> Thanks.
>
> --
> Masayuki Takahashi