Thanks Bradford.

On Tue, Jun 16, 2009 at 2:17 AM, Bradford Stephens <[email protected]> wrote:
> Right now, we're storing the documents in HBase. The indices are
> stored in HDFS and then 'sharded' to each node using Katta. Not sure
> if there's much of an advantage to storing the index itself in HBase,
> though I'd be interested to see some use cases for it.
>
> On Sat, Jun 13, 2009 at 11:27 AM, zsongbo<[email protected]> wrote:
> > Hi Bradford Stephens,
> > Could you please share something about your practices on "Katta+HBase"?
> > Do you store the documents or indexes in HBase?
> >
> > Schubert
> >
> > On Fri, Jun 12, 2009 at 1:19 PM, Bradford Stephens <[email protected]> wrote:
> >
> >> That actually make a lot of sense. Thanks, awesome people! Me and the
> >> dev team are here to get Katta + HBase to play together, and it's
> >> looking pretty nice.
> >>
> >> On Thu, Jun 11, 2009 at 9:47 PM, stack<[email protected]> wrote:
> >> > On Thu, Jun 11, 2009 at 6:10 PM, Bradford Stephens <[email protected]> wrote:
> >> >
> >> >> What I'm noticing is that it's writing to mostly one or two regions on
> >> >> one box at a time, even though I have 7 reducers running. Monitoring
> >> >> everything with dstat -v, I notice that only 2 of my servers are doing
> >> >> much. These boxes have very low CPU idling, and high disk output (a
> >> >> few GB a minute).
> >> >
> >> > How many regions in your table?
> >> >
> >> > At first, there is one. All reducers will go against it. When it splits,
> >> > then two regions field the 7 reducers and so on.
> >> >
> >> > You can manually split regions from the command-line. See if that helps:
> >> >
> >> > hbase> split_region 'REGIONNAME'
> >> >
> >> > (IIRC -- type 'tools' in shell for help on the admin facilities).
> >> >
> >> > St.Ack
