Re: optimization of tree plan generated by Hive...

Carl Steinbach Wed, 08 Feb 2012 18:52:46 -0800

Hi Alexis,

Work is already underway to add the YSmart optimizer to Hive. Please take a
look at https://issues.apache.org/jira/browse/HIVE-2206.


Thanks.

Carl

On Wed, Feb 8, 2012 at 6:17 PM, Alexis De La Cruz Toledo <
alexis...@gmail.com> wrote:

> Hi! My name is Alexis. I am a master's student at Cinvestav, DF, México.
> I am currently working on my thesis and I would like to participate in
> Google Summer of Code 2012 (
> http://google-melange.appspot.com/gsoc/events/google/gsoc2012).
> I'm interested in improving Hive, and I have been studying Hadoop and Hive.
>
> I am interested in the tree plan generated by Hive.
> It caught my attention that Hive reads the same table many times
> and generates many Hadoop jobs when the query could be
> expressed in fewer jobs, with only one read of the table,
> if I programmed the same query directly in Hadoop.
>
> I think that I can reduce the number of jobs needed to process a query
> and read each table only once, even if it is used in several jobs.
>
> The solution could be approached in two ways:
>
> 1. Changing the part where the DAG is created, making the optimizations at
> that moment.
> 2. Applying the optimizations after the DAG is created; these
> optimizations can be implemented in another class.
>
> Where could I do this? I think that the method that compiles the queries is
> the compile method in class Driver, am I right?
> Can someone guide me to where I could implement it?
>
> There is a paper that discusses what I describe:
> http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf
> We can take it and improve it, or implement our own ideas.
>
> Personally, I would like to take the second option due to time constraints.
>
> On the other hand, is anyone interested in working with me and being my
> mentor in Google Summer of Code 2012?
>
> Thanks.
>
> Regards.
>
> --
> Ing. Alexis de la Cruz Toledo.
> *Av. Instituto Politécnico Nacional No. 2508 Col. San Pedro Zacatenco.
> México,
> D.F, 07360 *
> *CINVESTAV, DF.*
>
