RE: Implementing a custom StorageHandler

Lavelle, Shawn Wed, 29 Jun 2016 14:33:25 -0700

I don’t have answers for you, except for #1 – as I understand it, the 
org.apache.hadoop.mapreduce package holds the newer Hadoop API classes.  They’ve 
been out for a while, but the Hive storage handler API hasn’t been updated to 
make use of them.  Which leads me to my closely related question: when might Hive 
provide a storage handler interface that uses the new classes, and if it won’t, 
why not?

  Thanks,

~ Shawn M Lavelle

From: Long, Andrew [mailto:[email protected]]
Sent: Monday, June 27, 2016 5:59 PM
To: user <[email protected]>
Subject: Implementing a custom StorageHandler

Hello everyone,

I’m in the process of implementing a custom StorageHandler and I had some 
questions.

1)      What is the difference between org.apache.hadoop.mapred.InputFormat and 
org.apache.hadoop.mapreduce.InputFormat?

2)      How is numSplits calculated in 
org.apache.hadoop.mapred.InputFormat.getSplits(JobConf job, int numSplits)?

3)      Is there a way to enforce a maximum number of splits?  What would 
happen if I ignore numSplits and just returned an array of splits that was the 
actual maximum number of splits?

4)      How is InputSplit.getLocations() used?  If I’m accessing non-HDFS 
resources, what should I return?  Currently I’m just returning an empty 
array.

Thanks for your time,
Andrew Long
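
To make questions 3 and 4 concrete, here is a minimal sketch in plain Java of one way a handler could cap the number of splits itself. It assumes the source can be addressed as a contiguous range of records. SplitPlanner, plan, and locations are hypothetical names used only for illustration; they are not part of Hive or Hadoop. A real getSplits(JobConf job, int numSplits) would wrap each range in its own InputSplit implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hive or Hadoop: divides the record range
// [0, totalRecords) into at most maxSplits contiguous, near-equal ranges.
final class SplitPlanner {

    // Each long[] in the result is {startInclusive, endExclusive}.
    static List<long[]> plan(long totalRecords, int maxSplits) {
        if (totalRecords < 0 || maxSplits < 1) {
            throw new IllegalArgumentException(
                "totalRecords must be >= 0 and maxSplits must be >= 1");
        }
        List<long[]> ranges = new ArrayList<>();
        // Never create more ranges than there are records.
        long count = Math.min(totalRecords, (long) maxSplits);
        if (count == 0) {
            return ranges;
        }
        long base = totalRecords / count;
        long remainder = totalRecords % count;
        long start = 0;
        for (long i = 0; i < count; i++) {
            // The first `remainder` ranges take one extra record each.
            long size = base + (i < remainder ? 1 : 0);
            ranges.add(new long[] {start, start + size});
            start += size;
        }
        return ranges;
    }

    // Assumption: a source with no HDFS block locality has no preferred
    // hosts, so this mirrors the empty array described in question 4.
    static String[] locations() {
        return new String[0];
    }
}
```

With this sketch, SplitPlanner.plan(10, 3) yields the ranges [0, 4), [4, 7), and [7, 10). The cap is enforced by the handler rather than by the numSplits argument.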
