Hi.
Thanks for the reply. Of course, each document goes into exactly one shard.
> On Mar 31, 2017, at 15:01, Erick Erickson wrote:
>
> I don't believe addIndexes does much except rewrite the
> segments file (i.e. the file that tells Lucene what
> the current segments are).
>
> That said, if you're desperate you can optimize/force-merge.
I don't believe addIndexes does much except rewrite the
segments file (i.e. the file that tells Lucene what
the current segments are).
That said, if you're desperate you can optimize/force-merge.
Do note, though, that no deduplication is done. So if the indexes you're
merging have docs with the same IDs, you will end up with duplicates.
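A minimal sketch of that workflow (the paths, analyzer and class name are placeholders, not anything from this thread): addIndexes() copies the source segments into the target index as-is, and forceMerge(1) is the optional, expensive step mentioned above.

import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class MergeShards {
  public static void main(String[] args) throws Exception {
    // Target index that will receive the segments of the source shards.
    Directory target = FSDirectory.open(Paths.get("/path/to/merged-index"));
    IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
    try (IndexWriter writer = new IndexWriter(target, config)) {
      // Copies the source segments and rewrites the segments file;
      // no re-analysis and no deduplication happens here.
      writer.addIndexes(
          FSDirectory.open(Paths.get("/path/to/shard1")),
          FSDirectory.open(Paths.get("/path/to/shard2")));
      // Optional and expensive: collapse everything into a single segment.
      writer.forceMerge(1);
      writer.commit();
    }
  }
}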
Yeah, I definitely will look into PreAnalyzedField as you and Mikhail suggest.
Thank you.
> On Mar 30, 2017, at 19:15, Uwe Schindler wrote:
>
> But that's hard to implement. I'd go for Solr instead of doing that on your
> own!
---
Denis Bazhenov
Interesting. In the case of addIndexes(), does Lucene perform any optimization on
the segments before searching over them, or are those indexes
searched "as is"?
> On Mar 30, 2017, at 19:09, Mikhail Khludnev wrote:
>
> I believe you can have more shards for indexing and then merge them (not
> literally, but via addIndexes() or so) into a smaller number of shards for search.
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
> -----Original Message-----
> From: Denis Bazhenov [mailto:dot...@gmail.com]
> Sent: Thursday, March 30, 2017 11:02 AM
> To: java-user@lucene.apache.org
> Subject: Re: Document serializable
I believe you can have more shards for indexing and then merge them (not
literally, but via addIndexes() or so) into a smaller number of shards for
search. Transferring indices (e.g., scp -C) is more efficient than sending
separate tokens and their attributes over the wire.
On Thu, Mar 30, 2017 at 12:02 PM, Denis Bazhenov wrote:
We have already done this, many years ago :)
At the moment we have 7 shards. The problem with adding more shards is that
search becomes less cost-effective (in terms of cluster CPU time per request) as
you split the index into more shards. Considering response time is good enough and
the fact search
Hi,
the document does not contain the analyzed tokens. The Lucene Analyzers are
called inside the IndexWriter *during* indexing, so there is no way to do that
anywhere else. The document instances consumed by Lucene are just iterables of
IndexableField that contain the unparsed full text as passed in by your code.
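To make that concrete, here is a minimal sketch (field names, paths and the analyzer are illustrative only): the Document holds nothing but raw field values, and tokenization only happens when IndexWriter.addDocument() runs the configured analyzer.

import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class IndexRawText {
  public static void main(String[] args) throws Exception {
    Directory dir = FSDirectory.open(Paths.get("/path/to/index"));
    IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
    try (IndexWriter writer = new IndexWriter(dir, config)) {
      // The Document carries only the raw, unanalyzed field values.
      Document doc = new Document();
      doc.add(new StringField("id", "42", Field.Store.YES));
      doc.add(new TextField("body", "the raw, unanalyzed full text", Field.Store.NO));
      // Analysis happens here, inside the IndexWriter, using the analyzer
      // from IndexWriterConfig -- not when the Document is built.
      writer.addDocument(doc);
      writer.commit();
    }
  }
}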