Thanks, Robin, for your quick response; that's great news. As I understand it, this will allow me to store the generated classifier model in HBase. Are there any examples of its usage? Does anyone know where I can find some test cases (such as the ones in MAHOUT-124 <https://issues.apache.org/jira/browse/MAHOUT-124>)?
My other question is how feasible it is to use HBase as a data source instead of the file system (HDFS). Are there any ideas or best practices for reading and writing documents, and the vectors generated from those documents, directly from and to HBase rather than HDFS?

Thanks,
NW

On Mon, Jul 25, 2011 at 11:03 PM, Robin Anil <[email protected]> wrote:

> We dropped it after pruning the dependencies in Mahout. You can simply bring
> back the class (from the repository) and use it to connect to HBase in your
> client code.
>
> Robin
>
> On Mon, Jul 25, 2011 at 6:23 PM, NightWolf <[email protected]> wrote:
>
> > Hi all,
> >
> > I'm working on a large text classification project and we have our text
> > data (simple messages) stored in HBase.
> >
> > We have two problems. First, we would like to use HBase as the source for
> > Mahout classifiers, namely Bayes and Random Forests.
> >
> > Second, we would like to be able to store the generated model in HBase
> > instead of using the in-memory approach (InMemoryBayesDatastore); as our
> > sets grow we are running into problems with memory utilization and would
> > like to test out HBase as a viable alternative.
> >
> > There seems to be little material floating around on using HBase with
> > Mahout, or on whether it's possible to use it as a datasource. I'm using
> > the Mahout 0.6 core API in Java, which has the InMemory datastore.
> >
> > Doing a bit of digging, I believe that there (was) an HBase Bayes datastore
> > component:
> > org.apache.mahout.classifier.bayes.datastore.HBaseBayesDatastore
> > See the older JavaDoc here:
> > http://www.jarvana.com/jarvana/view/org/apache/mahout/mahout-core/0.3/mahout-core-0.3-javadoc.jar!/org/apache/mahout/classifier/bayes/datastore/HBaseBayesDatastore.html
> >
> > However, looking at the latest documentation, it looks like this feature
> > has disappeared:
> > https://builds.apache.org/job/Mahout-Quality/javadoc/
> >
> > I wanted to know if it is still possible to use HBase as a datasource for
> > Bayes and Random Forests, and whether there are any previous use cases
> > for this.
> >
> > Thanks!
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/HBase-Mahout-Using-HBase-as-a-Datastore-source-for-Mahout-Classification-tp3197368p3197368.html
> > Sent from the Mahout User List mailing list archive at Nabble.com.
