Right now, we're storing the documents in HBase. The indices are
stored in HDFS and then 'sharded' to each node using Katta. Not sure
if there's much of an advantage to storing the index itself in HBase,
though I'd be interested to see some use cases for it.

On Sat, Jun 13, 2009 at 11:27 AM, zsongbo<[email protected]> wrote:
> Hi Bradford Stephens,
> Could you please share something about your practices on "Katta+HBase"?
> Do you store the documents or indexes in HBase?
>
> Schubert
>
> On Fri, Jun 12, 2009 at 1:19 PM, Bradford Stephens <
> [email protected]> wrote:
>
>> That actually makes a lot of sense. Thanks, awesome people! Me and the
>> dev team are here to get Katta + HBase to play together, and it's
>> looking pretty nice.
>>
>> On Thu, Jun 11, 2009 at 9:47 PM, stack<[email protected]> wrote:
>> > On Thu, Jun 11, 2009 at 6:10 PM, Bradford Stephens <
>> > [email protected]> wrote:
>> >
>> >>
>> >> What I'm noticing is that it's writing to mostly one or two regions on
>> >> one box at a time, even though I have 7 reducers running. Monitoring
>> >> everything with dstat -v, I notice that only 2 of my servers are doing
>> >> much. These boxes have very low CPU idling, and high disk output (a
>> >> few GB a minute).
>> >>
>> >
>> >
>> > How many regions in your table?
>> >
>> > At first, there is one.  All reducers will go against it.  When it
>> > splits, then two regions field the 7 reducers and so on.
>> >
>> > You can manually split regions from the command-line.  See if that helps:
>> >
>> > hbase> split_region 'REGIONNAME'
>> >
>> > (IIRC -- type 'tools' in shell for help on the admin facilities).
>> >
>> > St.Ack
>> >
>>
>
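
For anyone reading this thread later, here is a rough sketch of the
behavior stack describes in the quoted text: with a single region, every
write lands on it, and the load only spreads out once the key space is
split. This is a toy model in plain Python, not HBase code. The function
names, the row keys and the split points are all made up for
illustration, and it assumes a region covers the range from its start
key (inclusive) up to the next region's start key.

```python
import bisect
from collections import Counter


def region_for(row_key, split_points):
    """Return the index of the region that would hold row_key.

    split_points is a sorted list of hypothetical region start keys,
    not counting the first region, which starts at the empty key.
    """
    return bisect.bisect_right(split_points, row_key)


def write_distribution(row_keys, split_points):
    """Count how many writes each region would receive."""
    return Counter(region_for(key, split_points) for key in row_keys)


# Hypothetical row keys standing in for what the reducers emit.
keys = [f"doc{n:04d}" for n in range(1000)]

# A brand new table: one region, so every write goes to region 0.
one_region = write_distribution(keys, [])

# After splits (made-up split points): the same writes are spread out.
four_regions = write_distribution(keys, ["doc0250", "doc0500", "doc0750"])

print(one_region)
print(four_regions)
```

The number of reducers does not appear anywhere in the model. That is
the point: it is the number of regions covering the keys being written,
not the number of writers, that decides how many servers take the load.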
