Hi Parag,
I think your question boils down to:

How does one access Hive metadata from MapReduce jobs?

In the past, when I've had to write MR jobs that needed Hive metadata, I
ended up writing a wrapper Hive query that invoked a custom mapper and
reducer via Hive's TRANSFORM functionality to do the job.

However, if you want to stick with a plain MR job, you seem to be on the
right track.

Also, HCatalog's premise (
http://incubator.apache.org/hcatalog/docs/r0.4.0/) is to make metadata
access easier across Hive, Pig, and MR. Perhaps you want to take a look at
that and see whether it fits your use case?
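For what it's worth, here is a minimal sketch of the approach from your
steps 1 and 2 (hedged: the database name "default" and table name
"my_table" are placeholders, and this assumes hive-metastore and its
dependencies are on the classpath):

```java
import java.util.List;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Partition;

public class ListPartitions {
    public static void main(String[] args) throws Exception {
        // HiveConf loads hive-site.xml from the classpath, so putting
        // $HIVE_CONF_DIR on the classpath is enough -- no need to parse
        // the environment variable yourself.
        HiveConf conf = new HiveConf();

        // With no hive.metastore.uris set, the client talks directly to
        // the metastore database (embedded mode) -- no separate metastore
        // service process required. If the property is set, it connects
        // to that Thrift service instead.
        HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
        try {
            // (short) -1 means "no limit on the number of partitions".
            List<Partition> parts =
                client.listPartitions("default", "my_table", (short) -1);
            for (Partition p : parts) {
                System.out.println(p.getValues() + " -> "
                    + p.getSd().getLocation());
            }
        } finally {
            client.close();
        }
    }
}
```

To make HIVE_CONF_DIR visible when launching with hadoop jar, something
along these lines should work (untested, adjust jar paths to your setup):

HADOOP_CLASSPATH=$HIVE_CONF_DIR:/path/to/hive/lib/* hadoop jar myjob.jar ListPartitions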

Mark

On Mon, Feb 11, 2013 at 2:59 PM, Parag Sarda <psa...@walmartlabs.com> wrote:

> Hello Hive Users,
>
> I am writing a program in java which is bundled as JAR and executed using
> hadoop jar command. I would like to access hive metadata (read partition
> information) in this program. I can ask the user to set the HIVE_CONF_DIR
> environment variable before calling my program, or ask for any reasonable
> parameters to be passed. If possible, I do not want to force the user to
> run the hive metastore service, to increase the reliability of the program
> by avoiding external dependencies.
>
> What is the recommended way to get partition information? Here is my
> understanding:
> 1. Make sure my jar is bundled with hive-metastore[1] library.
> 2. Use HiveMetastoreClient[2]
>
> Is this correct? If yes, how do I read the hive configuration[3] from
> HIVE_CONF_DIR?
>
> [1] http://mvnrepository.com/artifact/org.apache.hive/hive-metastore
> [2]
> http://hive.apache.org/docs/r0.7.1/api/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.html
> [3]
> http://hive.apache.org/docs/r0.7.1/api/org/apache/hadoop/hive/conf/HiveConf.html
>
> Thanks in advance,
> Parag
>
>