Re: dropping support for pre-0.20 Hadoop versions

S. Venkatesh Tue, 31 Aug 2010 03:14:02 -0700

I think this is a huge benefit for all of us. Looking forward to it.
Do you have a timeline in mind?


Thanks,
Venkatesh

On Tue, Aug 31, 2010 at 12:33 AM, John Sichi <[email protected]> wrote:
> As Carl mentioned below, there was agreement at the last Hive contributor
> meeting that we should drop support for pre-0.20 Hadoop versions in Hive
> trunk.  This means that starting with the Hive 0.7 release, Hadoop 0.20 or
> later will be required.  Anyone stuck on an earlier Hadoop version will need
> to remain on Hive 0.6 and backport any patches they need from trunk.
> There are two major benefits to this:
> * we can finally move from mapred to mapreduce APIs across all of Hive
> * we'll enjoy a significant reduction in code maintenance and testing
> overhead (not to mention commit latency) for Hive contributors and
> committers
> Note that although we'll delete the pre-0.20 shim implementations, we will
> still keep the generic shim mechanism itself in place so that we can
> continue to support multiple Hadoop API versions as new ones are released in
> the future.
> For those who were not present at the contributor meeting, please speak up
> if you have an opinion on this.
> JVS
> On Aug 28, 2010, at 2:59 AM, Carl Steinbach wrote:
>
> August 8th, 2010
>
> Yongqiang He gave a presentation about his work on index support in Hive.
>
> Slides are available here: http://files.meetup.com/1658206/Hive%20Index.pptx
>
> John Sichi talked about his work on filter-pushdown optimizations. This is
> applicable to the HBase storage handler and the new index infrastructure.
> Pradeep Kamath gave an update on progress with Howl.
>
> The Howl source code is available
> on GitHub here: http://github.com/yahoo/howl
> Starting to work on security for Howl. For the first iteration the plan is
> to base it on DFS permissions.
>
> General agreement that we should aim to desupport pre-0.20.0 versions of
> Hadoop in Hive 0.7.0. This will allow us to remove the shim layer and will
> make it easier to transition to the new mapreduce APIs. But we also want to
> get a better idea of how many users are stuck on pre-0.20 versions of
> Hadoop.
>
> Remove Thrift generated code from repository.
>
> Pro: reduce noise in diffs during reviews.
> Con: requires developers to install Thrift compiler.
>
> Discussed moving the documentation from the wiki to version control.
>
> Probably not practical to maintain the trunk version of the docs on the wiki
> and roll over to version control at release time, so trunk version of docs
> will be maintained in vcs.
> It was agreed that feature patches should include updates to the docs, but
> it is also acceptable to file a doc ticket if there is time pressure to
> commit.
> Will maintain an errata page on the wiki for collecting updates/corrections
> from users. These notes will be rolled into the documentation in vcs on a
> monthly basis.
>
> The next meeting will be held in September at Cloudera's office in Palo
> Alto.
>



-- 
Regards,
Venkatesh

“Perfection (in design) is achieved not when there is nothing more to
add, but rather when there is nothing more to take away.”
- Antoine de Saint-Exupéry
