Re: Document serializable representation

Denis Bazhenov Thu, 30 Mar 2017 18:08:50 -0700

Interesting. In the case of addIndexes(), does Lucene perform any optimization
on the segments before searching, or are those indexes searched "as is"?


> On Mar 30, 2017, at 19:09, Mikhail Khludnev <[email protected]> wrote:
> 
> I believe you can have more shards for indexing and then merge them (not
> literally, just via addIndexes() or so) into a smaller number for
> search. Transferring indices (scp -C) is more efficient than sending
> separate tokens and their attributes over the wire.
> 
> On Thu, Mar 30, 2017 at 12:02 PM, Denis Bazhenov <[email protected]> wrote:
> 
>> We have already done this. Many years ago :)
>> 
>> At the moment we have 7 shards. The problem with adding more shards is
>> that search becomes less cost-effective (in terms of cluster CPU time per
>> request) as you split the index into more shards. Considering that response
>> time is good enough and that search nodes are ~90% of the cluster's hardware
>> budget, it’s much more cost-effective to split analysis from IndexWriter
>> than to split the index into more shards. More shards would simply require
>> us to put disproportionately more hardware into the cluster.
>> 
>>> On Mar 30, 2017, at 18:36, Uwe Schindler <[email protected]> wrote:
>>> 
>>> What you would better do is just split your index into multiple
>>> shards and have separate IndexWriter instances on different machines. Those
>>> can act on their own. This is what Elasticsearch and Solr do: they
>>> accept the document, decide which shard it should be located on, and transfer
>>> the plain fieldname:value pairs over the network. Each node then creates
>>> Lucene IndexableDocuments out of them and passes them to its own IndexWriter.
>> 
>> ---
>> Denis Bazhenov <[email protected]>
>> 
> 
> 
> -- 
> Sincerely yours
> Mikhail Khludnev
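
For reference, the routing step described in the quoted thread (accept a
document, pick a shard, ship plain fieldname:value pairs to it) can be sketched
roughly as below. This is a minimal sketch under stated assumptions, not Solr or
Elasticsearch code: ShardRouter, shardFor() and toWire() are hypothetical names,
hash-based routing on a document id is assumed, and the one-pair-per-line wire
format is invented for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.zip.CRC32;

/** Hypothetical router for illustration only; not part of Lucene, Solr or Elasticsearch. */
final class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    /** Stable shard choice (assumed scheme): CRC32 of the document id modulo the shard count. */
    int shardFor(String docId) {
        CRC32 crc = new CRC32();
        crc.update(docId.getBytes(StandardCharsets.UTF_8));
        // getValue() is a non-negative long, so the remainder is in [0, shardCount).
        return (int) (crc.getValue() % shardCount);
    }

    /**
     * Naive wire format (assumption): one "fieldname:value" pair per line.
     * Values containing newlines are not escaped. Analysis is left to the
     * receiving node, which feeds its own IndexWriter.
     */
    static String toWire(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append(e.getKey()).append(':').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```

With 7 shards, new ShardRouter(7).shardFor(id) always returns the same value in
0..6 for a given id, so updates to a document land on the same node. The sketch
says nothing about search cost, which is the concern raised in the thread.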

---
Denis Bazhenov <[email protected]>





