Re: Catalyst optimizer cpu/Io cost

2016-06-10 Thread Kazuaki Ishizaki
Hi
Yin Huai's slides are available at
http://www.slideshare.net/databricks/deep-dive-into-catalyst-apache-spark-20s-optimizer

Kazuaki Ishizaki



From:   Takeshi Yamamuro <linguin@gmail.com>
To: Srinivasan Hariharan02 <srinivasan_...@infosys.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Date:   2016/06/10 18:09
Subject:        Re: Catalyst optimizer cpu/Io cost



Hi,

There is no way to retrieve that information in Spark.
In fact, the current optimizer only considers the byte size of outputs in
LogicalPlan.
Related code can be found in 
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala#L90
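
To illustrate what "only the byte size of outputs" can mean in practice, here is a minimal toy sketch in Python. It is not Spark's code: the Plan class, size_in_bytes and can_broadcast are hypothetical names, the rule that an inner node multiplies its children's sizes is an assumption about a pessimistic default, and the threshold constant merely stands in for a setting like spark.sql.autoBroadcastJoinThreshold.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List

# Hypothetical stand-in for a broadcast-join size threshold (10 MB here).
BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


@dataclass
class Plan:
    """Toy logical plan node; NOT Spark's LogicalPlan class."""
    name: str
    children: List["Plan"] = field(default_factory=list)
    leaf_size_in_bytes: int = 0  # only meaningful for leaf nodes

    def size_in_bytes(self) -> int:
        # Leaves report their own size. Inner nodes multiply the sizes of
        # their children (assumed pessimistic default; no CPU or I/O terms).
        if not self.children:
            return self.leaf_size_in_bytes
        return prod(child.size_in_bytes() for child in self.children)


def can_broadcast(plan: Plan) -> bool:
    """Size is the only statistic consulted when choosing a join strategy."""
    return plan.size_in_bytes() <= BROADCAST_THRESHOLD_BYTES


# Example plans with made-up sizes.
small = Plan("dim_table", leaf_size_in_bytes=2 * 1024 * 1024)
large = Plan("fact_table", leaf_size_in_bytes=5 * 1024 ** 3)
filtered = Plan("Filter", [small])
join = Plan("Join", [large, filtered])
```

In this sketch the filtered dimension table stays under the threshold and would be a broadcast candidate, while the fact table and the join output would not. Nothing in the decision looks at CPU or I/O cost.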

If you want to know more about Catalyst, you can check Yin Huai's
slides from Spark Summit 2016.
https://spark-summit.org/2016/speakers/yin-huai/
# Note: the slides are not available yet; it seems they will be in a few
weeks.

// maropu


On Fri, Jun 10, 2016 at 3:29 PM, Srinivasan Hariharan02 <
srinivasan_...@infosys.com> wrote:
Hi,

How can I get the CPU and I/O cost of a Spark SQL query after it has been
optimized into the best logical plan? Is there any API to retrieve this
information? It would help if anyone could point me to the code where the
CPU and I/O cost is actually computed in the Catalyst module.
 
Regards,
Srinivasan Hariharan
+91-9940395830
 
 
 



-- 
---
Takeshi Yamamuro




RE: Catalyst optimizer cpu/Io cost

2016-06-10 Thread Srinivasan Hariharan02
Thanks Mich for your reply. I am curious about one thing: Hive uses a CBO
that takes CPU cost into account. Does the Hive optimizer have any advantage
over the Spark Catalyst optimizer?

Regards,
Srinivasan Hariharan
+91-9940395830

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: Friday, June 10, 2016 3:27 PM
To: Srinivasan Hariharan02 <srinivasan_...@infosys.com>
Cc: Takeshi Yamamuro <linguin@gmail.com>; user@spark.apache.org
Subject: Re: Catalyst optimizer cpu/Io cost

In an SMP system such as Oracle or Sybase, the CBO will take into account LIO
(logical I/O), PIO (physical I/O) and CPU costing, or use some empirical costing.

In a distributed system like Spark with so many nodes, that may not be that easy,
or its contribution to the Catalyst decision may be subject to variations that
may not make it worthwhile.

HTH


Dr Mich Talebzadeh



LinkedIn  
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 10 June 2016 at 10:45, Srinivasan Hariharan02
<srinivasan_...@infosys.com> wrote:
Thanks Takeshi. Is there any reason for not using I/O and CPU cost in the
Catalyst optimizer? Some SQL engines that leverage Apache Calcite have a
cost-based planner such as VolcanoPlanner, which takes CPU and I/O cost into
account for plan optimization.
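
As a rough illustration of what taking CPU and I/O cost into account for plan optimization could look like, here is a minimal Python sketch. It is neither Calcite's VolcanoPlanner nor anything in Catalyst: the Cost class, the weights in total() and the candidate numbers are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Cost:
    """Toy cost vector with row count, CPU and I/O components (hypothetical)."""
    rows: float
    cpu: float
    io: float

    def __add__(self, other: "Cost") -> "Cost":
        # The cost of a plan is the component-wise sum of its operators' costs.
        return Cost(self.rows + other.rows, self.cpu + other.cpu, self.io + other.io)

    def total(self, cpu_weight: float = 1.0, io_weight: float = 4.0) -> float:
        # Collapse CPU and I/O into one number; the weights are made up.
        return cpu_weight * self.cpu + io_weight * self.io


def cheapest(candidates: Dict[str, Cost]) -> str:
    """Return the name of the candidate plan with the lowest weighted cost."""
    return min(candidates, key=lambda name: candidates[name].total())


# Two equivalent plans for the same join, with invented cost estimates.
candidates = {
    "hash_join": Cost(rows=1e6, cpu=3e6, io=2e5),
    "nested_loop_join": Cost(rows=1e6, cpu=1e9, io=2e5),
}
best = cheapest(candidates)
```

Both candidates produce the same number of rows and do the same I/O here, so a planner that looked only at output size could not tell them apart. The CPU term is what separates them in this sketch.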

Regards,
Srinivasan Hariharan
+91-9940395830



Re: Catalyst optimizer cpu/Io cost

2016-06-10 Thread Mich Talebzadeh
In an SMP system such as Oracle or Sybase, the CBO will take into account
LIO (logical I/O), PIO (physical I/O) and CPU costing, or use some empirical
costing.

In a distributed system like Spark with so many nodes, that may not be that
easy, or its contribution to the Catalyst decision may be subject to
variations that may not make it worthwhile.

HTH

Dr Mich Talebzadeh



LinkedIn
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com





RE: Catalyst optimizer cpu/Io cost

2016-06-10 Thread Srinivasan Hariharan02
Thanks Takeshi. Is there any reason for not using I/O and CPU cost in the
Catalyst optimizer? Some SQL engines that leverage Apache Calcite have a
cost-based planner such as VolcanoPlanner, which takes CPU and I/O cost into
account for plan optimization.

Regards,
Srinivasan Hariharan
+91-9940395830



Re: Catalyst optimizer cpu/Io cost

2016-06-10 Thread Takeshi Yamamuro
Hi,

There is no way to retrieve that information in Spark.
In fact, the current optimizer only considers the byte size of outputs in
LogicalPlan.
Related code can be found in
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala#L90

If you want to know more about Catalyst, you can check Yin Huai's slides
from Spark Summit 2016.
https://spark-summit.org/2016/speakers/yin-huai/
# Note: the slides are not available yet; it seems they will be in a few
weeks.

// maropu





-- 
---
Takeshi Yamamuro