> 1) Why 0.20.0 in the following command?  Why not 0.20.1?
>
> 2) Why do I have to download all previous versions of Hadoop?  Does Hive
> need these?
>

Since the Hadoop API changes from version to version, Hive accesses it
through a shim layer. The build is set up to compile shim libraries for the
four most recent minor versions of Hadoop (0.17, 0.18, 0.19 and 0.20). To
do this it needs access to the Hadoop jars, which it gets by downloading
the Hadoop release tarballs.

In terms of the results of the build, there is no difference between
specifying 0.20.1 and 0.20.0, since the API stays the same between patch
versions. However, you'll end up doing extra work if you specify 0.20.1,
because the build script always downloads and builds shims against 0.20.0.
There is no point in also telling it to download and build shims against
0.20.1, since the shims built against Hadoop 0.20.0 work just as well
against 0.20.1.
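
To make the idea concrete, here is a rough sketch of how a shim layer can
pick an implementation by minor version. It is not Hive's actual code: the
names ShimSketch, HadoopShim, minorLine and shimFor are made up for
illustration, and the only thing taken from the explanation above is that
one shim is built per minor release line (0.17 through 0.20).

```java
import java.util.HashMap;
import java.util.Map;

public class ShimSketch {
    /** Hypothetical version-neutral interface that callers would program against. */
    interface HadoopShim {
        String describe();
    }

    // One hypothetical shim per supported minor release line.
    private static final Map<String, HadoopShim> SHIMS = new HashMap<>();
    static {
        for (String line : new String[] {"0.17", "0.18", "0.19", "0.20"}) {
            SHIMS.put(line, () -> "shim built against Hadoop " + line + ".x");
        }
    }

    /** Reduce a full version such as "0.20.1" to its minor line, "0.20". */
    static String minorLine(String fullVersion) {
        String[] parts = fullVersion.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unexpected version: " + fullVersion);
        }
        return parts[0] + "." + parts[1];
    }

    /** Look up the shim by minor line only; the patch number is ignored. */
    static HadoopShim shimFor(String fullVersion) {
        HadoopShim shim = SHIMS.get(minorLine(fullVersion));
        if (shim == null) {
            throw new IllegalArgumentException("No shim for Hadoop " + fullVersion);
        }
        return shim;
    }

    public static void main(String[] args) {
        // Both patch releases resolve to the same shim instance: prints "true".
        System.out.println(shimFor("0.20.0") == shimFor("0.20.1"));
    }
}
```

Because the lookup key drops the patch number, 0.20.0 and 0.20.1 land on the
same shim, which is why building against one of them is enough.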

Carl
