Derek Pappas,

If you are using Heritrix to crawl, you can try hbase-writer to write the crawled output to HBase (instead of ARC files): http://code.google.com/p/hbase-writer/

On Sat, Apr 25, 2009 at 11:41 PM, Ryan Rawson <[email protected]> wrote:

> Man I really understand the frustration when something just
> _does_not_work_, especially if advertised to work so.
>
> But the thing to remember here, is hbase is cutting edge database stuff -
> highly clustered distributed databases are not super straightforward. For an
> example, take Oracle RAC - you won't be able to get that up and running at
> any interesting performance levels without paying oracle or a highly
> experienced oracle dba to tune it just right.
>
> So, considering how few tuning parameters, and how scalable hbase is, I
> think it's a great deal for the price.
>
> On Sat, Apr 25, 2009 at 8:19 PM, Andrew Purtell <[email protected]> wrote:
>
> > Right, well "hbase did not work" with no details as to why
> > does not help us to improve it. Please kindly consider
> > asking your colleague to forward details at your and/or
> > that person's convenience. Also, for future reference, HBase
> > has a responsive developer community and could have likely
> > helped for only the cost of time to file a bug report and
> > respond to inquiries for more information.
> >
> > - Andy
> >
> > > From: Derek Pappas
> > > Subject: Re: Crawling Using HBase as a back end --Issue
> > > Date: Thursday, April 23, 2009, 11:35 PM
> > > Someone else in the company knows the details. Sorry did not
> > > mean to pan hbase. We are a very small startup and needed to
> > > get a prototype (version 2) working. We tried using hbase
> > > back in the Dec/Jan time frame.
