Hi Laukik,

I did see their work, but I guess my question was too vague, my bad (: I asked about Parallax because in that paper they strongly emphasize that it was implemented in Pig, so I wondered about it. In the case of ParaTimer I kind of assumed they hadn't open-sourced it. It is probably just the way they described the two systems.

Going back to the original question: the main difference between Parallax and ParaTimer is that the latter can make progress estimates for queries with tasks that run in parallel, such as joins, which have two or more inputs. But I was actually referring to the cardinality estimates for pre-defined operators, where both works use the same approach: regular query optimization techniques, like cost formulas. However, using these formulas requires either a-priori knowledge of the data or collecting the estimates during their debug runs. Would this be the best way to do it while leveraging Pig?

Thanks in advance.
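For concreteness, the kind of textbook cost formulas in question can be sketched as below. This is only an illustration and not code from Parallax, ParaTimer, or Pig: the `TableStats` container, the function names, and all the numbers are made up, and the statistics are assumed to be known up front (or gathered in a sample/debug run).

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    """A-priori statistics about one input (hypothetical container)."""
    rows: int        # number of tuples in the input
    distinct: dict   # column name -> number of distinct values


def filter_cardinality(stats: TableStats, selectivity: float) -> float:
    """Estimated output rows of a filter: selectivity * input rows."""
    return selectivity * stats.rows


def equijoin_cardinality(left: TableStats, right: TableStats, key: str) -> float:
    """Textbook estimate |R join S| = |R| * |S| / max(V(R, key), V(S, key))."""
    return left.rows * right.rows / max(left.distinct[key], right.distinct[key])


def progress(processed: float, estimated_total: float) -> float:
    """Fraction done, clamped to [0, 1] because the estimate may be too low."""
    if estimated_total <= 0:
        return 1.0
    return min(processed / estimated_total, 1.0)


# Made-up statistics for two inputs of a join.
users = TableStats(rows=1_000, distinct={"user_id": 1_000})
clicks = TableStats(rows=50_000, distinct={"user_id": 800})

join_rows = equijoin_cardinality(users, clicks, "user_id")
print(join_rows)                     # 50000.0
print(progress(12_500, join_rows))   # 0.25
```

The point of the sketch is the dependency it exposes: without `rows` and `distinct` for each input, neither the join estimate nor the progress fraction can be computed, which is why the statistics have to come from prior knowledge or from a preliminary run.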
Renato M.

2011/2/16 Laukik Chitnis <[email protected]>

> Hi Renato,
>
> They extended Parallax to ParaTimer -- progress estimator for MR DAGs
> (which covers the join case). Here is the paper that talks about it:
> http://www.cs.washington.edu/homes/kmorton/camera-ready.pdf
>
> Cheers,
> Laukik
>
> On 2/16/11 11:59 AM, "Renato Marroquín Mogrovejo" <[email protected]> wrote:
>
> I see, I thought it was already part of Pig internals, but oh well, we will
> get there (:
> There was one thing that made me think about that work, and it was the join
> case. I mean they say that the progress estimator woudn't deal with joins.
> What do you think it would be a good approach for a join progress estimator?
> Any ideas are more than welcome. Thanks in advance!
>
> Renato M.
>
> 2011/2/15 Alan Gates <[email protected]>
>
> Parallax is implemented on a pull from Pig trunk around the 0.2 timeframe.
> However, we are working with Kristi to get it integrated into current Pig.
> With some luck, we'll get it integrated into 0.9 (but no promises).
>
> Alan.
>
> On Feb 15, 2011, at 7:54 AM, Renato Marroquín Mogrovejo wrote:
>
> Hi, I wanted to know if "Parallax" [1] is implemented on Pig0.8 and how it
> is being used.
> Any hints about its classes would be awesome.
> Thanks in advance.
>
> Renato M.
>
> [1] *Toward A Progress Indicator for Parallel Queries.*
> ftp://ftp.cs.washington.edu/tr/2009/07/UW-CSE-09-07-01.PDF
