"The latter is usually a minor issue (I haven't tried it; I'm just speaking from experience converting MR jobs to use APIs from new package). Are you finding it otherwise Geoff?"
We've been using the new mapreduce apis. We moved off of mapred a long time ago. It was a straightforward move, and we liked it because we were moving in the "right direction" (i.e. to new apis). The version of IndexOutputFormat that was part of 20.6 derived from the mapreduce apis. So basically, the hbasene package just seems stale in that it seems to have moved backwards to the "mapred" apis. Having already moved all our jobs from mapred to mapreduce, going backward seems weird.

"Why do you have to touch HBase at all Geoff? Can you not just make a mapreduce job of adjusted IndexOutputFormat bundling lucene and have it run against HBase APIs?"

IndexOutputFormat isn't part of Lucene, is it? As far as I know it exists in two packages:

1) the "original": http://hbase.apache.org/docs/r0.20.6/api/org/apache/hadoop/hbase/mapreduce/IndexOutputFormat.html
2) and in this Hbasene project (flakey? stale?): org.hbasene.index.create.mapred.IndexOutputFormat

If IndexOutputFormat is available in Lucene, I'd be thrilled to use it! Is it available in Lucene?

I can appreciate the need to jettison cruft. I just wish that IndexOutputFormat existed somewhere in a discrete jar in a way that my existing code required no code change. Rather, I'd just put a new jar in my lib.

-g

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Stack
Sent: Wednesday, January 19, 2011 3:13 PM
To: [email protected]
Subject: Re: IndexOuputFormat?

On Wed, Jan 19, 2011 at 3:00 PM, Geoff Hendrey <[email protected]> wrote:
> I investigated hbasene. The source download relies on ".mapred" api, not
> ".mapreduce". Its maven pom doesn't build without a lot of hacking and
> fixing unresolved dependencies, and even when I was able to build the
> source, I am still out of luck because of the mapred vs mapreduce issue.
>

The latter is usually a minor issue (I haven't tried it; I'm just speaking from experience converting MR jobs to use APIs from new package). Are you finding it otherwise Geoff?

> Is my only recourse to make a custom build of 0.89 which mixes source
> from the 0.26 release of HBase in with the 0.89 release? I would have
> thought IndexOutputFormat was an important feature to move forward in
> the trunk.
>

Why do you have to touch HBase at all Geoff? Can you not just make a mapreduce job of adjusted IndexOutputFormat bundling lucene and have it run against HBase APIs?

Regards it being an important feature for core HBase, for sure it's a nice-to-have, but we've been trying to jettison all but core from HBase and have add-ons live elsewhere. We found that carrying along all contribs and additions with their different rates of development (and with flux in developer interest in keeping up the add-on) proved a drag on core development.

St.Ack
