Re: Future of Pig on Spark ?

Xuefu Zhang Tue, 09 Jun 2015 17:36:07 -0700

Hi Bharath,

Unfortunately there isn't an ETA for a meaningful milestone at this moment,
mostly due to a lack of contributors. Right now there are only one or two
engineers working on this, and even they cannot be counted as full-time
resources for this effort. I'm happy to see that there is demand for this,
and at the same time I'd like to see more contributors joining the
effort. If you and your team can help, that would be a great plus for the
project.


Thanks,
Xuefu

On Tue, Jun 9, 2015 at 10:24 AM, Bharath Ravi Kumar <[email protected]>
wrote:

> A follow-up question: Is there a tentative ETA for the initial milestone
> (e.g. on the order of weeks vs. months)? Knowing the trunk merge & stabilization
> plan will help me deprecate my internal patch-build-deploy process for Pig
> on Spark. On a related note, I'd like to reaffirm that Pig on Spark is
> important for us (i.e. my team) to help transparently migrate legacy
> Pig jobs as we look to move away from MR. Considering various technical
> and ecosystem reasons, in addition to our familiarity with running Spark in
> production, Pig on Spark is more appealing to us than Pig on MR or Tez.
>
>
> On Tue, Jun 9, 2015 at 7:57 AM, Bharath Ravi Kumar <[email protected]>
> wrote:
>
>> Thanks Xuefu. That's good to know. I'd be glad to start with feedback /
>> bug reports based on initial usage, and hopefully follow up with patches
>> later. Looking forward to the initial milestone.
>>  On 09-Jun-2015 7:25 am, "Xuefu Zhang" <[email protected]> wrote:
>>
>>> Hi Bharath,
>>>
>>> Thanks for your inquiry. A small team is committed to the Pig on Spark
>>> project and is actively working in that direction, though the pace of
>>> progress is slower than we would like. This is mostly due to constrained
>>> resources and the diversion of the initial contributors. We certainly
>>> welcome any sort of feedback, and especially contributions, from the
>>> community.
>>>
>>> We have a detailed design doc that's ready to be shared with the community
>>> so that prospective contributors can use it as a reference. While our initial
>>> objective is to achieve functional completeness, we are committed to
>>> enhancements and optimizations that make Pig on Spark run better and faster.
>>> As we get closer to the initial milestone, we will work on the user doc
>>> for consumption.
>>>
>>> Let me know if you have any questions.
>>>
>>> Thanks,
>>> Xuefu
>>>
>>> On Mon, Jun 8, 2015 at 6:09 PM, Bharath Ravi Kumar <[email protected]>
>>> wrote:
>>>
>>> > Hi,
>>> >
>>> > I'm looking for clarity on the future of the Spark execution engine for
>>> > Pig. While I've noticed activity on the spark branch of the Pig git repo,
>>> > it hasn't been merged to trunk since the initial Spork announcement.
>>> > Besides, it's not clear if it will continue to be maintained and enhanced
>>> > to exploit Spark's capabilities to a greater extent (e.g. caching of RDDs
>>> > etc). I also see very little (up-to-date) user documentation on the setup.
>>> > I hence seek clarity on the future of the backend. Thanks.
>>> >
>>> > -Bharath
>>> >
>>>
>>
>
