> Do you want to do Term- or Document partitioning?

It sounds like hardly anyone uses term partitioning, so doc partitioning
seems to be the most logical default?
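
To make the distinction concrete, here is a minimal sketch of both layouts over a toy in-memory inverted index. Everything in it (the DOCS corpus, the function names, the byte-sum shard hash) is hypothetical and for illustration only; it is not how Lucene, Katta, or HBasene actually lay out their data.

```python
from collections import defaultdict

# Toy corpus: doc id -> text (hypothetical data, for illustration only).
DOCS = {
    0: "hbase stores rows",
    1: "lucene indexes terms",
    2: "hbase and lucene",
    3: "katta serves lucene shards",
}


def shard_for_term(term, num_shards):
    # Simple stable hash (an assumption), so the result does not depend on
    # Python's per-process string hash randomization.
    return sum(term.encode("utf-8")) % num_shards


def doc_partition(docs, num_shards):
    """Each shard holds a complete inverted index over a subset of the docs."""
    shards = [defaultdict(list) for _ in range(num_shards)]
    for doc_id in sorted(docs):
        shard = shards[doc_id % num_shards]
        for term in sorted(set(docs[doc_id].split())):
            shard[term].append(doc_id)
    return shards


def term_partition(docs, num_shards):
    """Each shard holds the full posting lists for a subset of the terms."""
    shards = [defaultdict(list) for _ in range(num_shards)]
    for doc_id in sorted(docs):
        for term in sorted(set(docs[doc_id].split())):
            shards[shard_for_term(term, num_shards)][term].append(doc_id)
    return shards


def search_doc_partitioned(shards, term):
    # The query fans out to every shard and the partial results are merged.
    hits = []
    for shard in shards:
        hits.extend(shard.get(term, []))
    return sorted(hits)


def search_term_partitioned(shards, term):
    # The query is routed to the single shard that owns the term.
    shard = shards[shard_for_term(term, len(shards))]
    return list(shard.get(term, []))
```

With doc partitioning a single-term query touches every shard but each shard scans only a short posting list; with term partitioning it touches one shard, which has to hold and scan the whole list for that term.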

> serve the index shards from memory

In Lucene-land this is a function of allocating enough RAM for the
system IO cache.
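
As a rough illustration of what "served from the IO cache" means, the sketch below reads a slice of a file through a read-only memory map. The file name, payload, and helper names are made up for the example. Lucene's MMapDirectory takes a comparable approach for index files; whether repeated reads actually stay in RAM depends on how much memory the OS has left for its page cache.

```python
import mmap
import os
import tempfile


def write_segment(path, payload):
    """Write a stand-in 'segment' file (hypothetical format, bytes only)."""
    with open(path, "wb") as f:
        f.write(payload)


def read_slice(path, offset, length):
    """Read bytes through a read-only memory map.

    The process does not copy the file into its own heap up front; pages are
    faulted in on access and kept by the OS page cache while RAM allows.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


def demo():
    # Build a tiny file in a temp directory and read part of it back.
    path = os.path.join(tempfile.mkdtemp(), "segment.bin")
    write_segment(path, b"term:hbase|0,2;")
    return read_slice(path, 5, 5)
```

The point of the design is that the application keeps no cache of its own; sizing the machine so the hot index files fit in the OS cache is what makes reads memory-speed.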

On Sun, Feb 13, 2011 at 8:26 AM, Thomas Koch <tho...@koch.ro> wrote:
> Jason Rutherglen:
>> Hello,
>>
>> I'm curious as to what a 'good' approach would be for implementing
>> search in HBase (using Lucene) with the end goal being the integration
>> of realtime search into HBase.  I think the use case makes sense as
>> HBase is realtime and has a write-ahead log, performs automatic
>> partitioning, splitting of data, failover, redundancy, etc.  These are
>> all things Lucene does not have out of the box, that we'd essentially
>> get for 'free'.
>>
>> For starters: Where would be the right place to store Lucene segments
>> or postings?  Eg, we need to be able to efficiently perform a linear
>> iteration of the per-term posting list(s).
>>
>> Thanks!
>>
>> Jason Rutherglen
> Hi Jason,
>
> I had the same idea about a year ago but didn't pursue it, since I'm leaving
> the company right now.
> Do you want to do Term- or Document partitioning? Both have advantages and
> disadvantages. You can get a very good introduction in chapter 14.1 of this
> book:
> http://www.ir.uwaterloo.ca/book
>
> The following lecture gives a very interesting insight into Google's index
> architecture:
> http://videolectures.net/wsdm09_dean_cblirs
>
> Projects that do Document partitioning:
> distributed solr, katta, elasticsearch, linkedin's Sensei
> Projects that do Term partitioning:
> lucandra/solandra (using cassandra), hbasene (which has been abandoned for a year)
>
> I very much thought that hbasene would be a perfect solution for scalable
> search, but the above book and video convinced me that improving katta would
> be the way to go:
> - implement an indexing solution for katta
> - serve the index shards from memory, as Google apparently does
>
> Hope I could help, please keep us posted,
>
> Thomas Koch, http://www.koch.ro
>
