It is actually possible in Lucene 4, but there is nothing really
convenient set up to do this.

You have two choices there:
1. Trigger a massive merge (essentially an optimize) by wrapping all
readers and calling IndexWriter.addIndexes(IndexReader...).
2. Wrap readers in a custom merge policy and do it slowly over time.

In both cases you'd use something like
http://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/test-framework/src/java/org/apache/lucene/index/FieldFilterAtomicReader.java
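
To make the idea concrete, here is a toy model in plain Java. It is not
Lucene code: SegmentView, StoredSegment, HidingView and merge are made-up
names for illustration only. It just shows why the wrapping trick works:
the merge copies whatever the wrapped reader exposes, so fields the
wrapper hides never reach the new segment. In Lucene 4 the wrapper role
would be played by something like FieldFilterAtomicReader and the merge
by addIndexes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: none of these types are Lucene classes.
final class FieldFilterSketch {

    /** Minimal read-only view of one segment: field name -> terms. */
    interface SegmentView {
        Set<String> fieldNames();
        List<String> terms(String field);
    }

    /** A segment as stored, including fields that no longer have any terms. */
    static final class StoredSegment implements SegmentView {
        private final Map<String, List<String>> fields = new TreeMap<>();

        StoredSegment put(String field, String... terms) {
            fields.put(field, new ArrayList<>(Arrays.asList(terms)));
            return this;
        }

        @Override public Set<String> fieldNames() {
            return Collections.unmodifiableSet(fields.keySet());
        }

        @Override public List<String> terms(String field) {
            List<String> t = fields.get(field);
            return t == null ? Collections.<String>emptyList() : t;
        }
    }

    /** Wrapper that hides the given fields from whoever reads through it. */
    static final class HidingView implements SegmentView {
        private final SegmentView in;
        private final Set<String> hidden;

        HidingView(SegmentView in, Set<String> hidden) {
            this.in = in;
            this.hidden = new HashSet<>(hidden);
        }

        @Override public Set<String> fieldNames() {
            Set<String> visible = new TreeSet<>(in.fieldNames());
            visible.removeAll(hidden);
            return visible;
        }

        @Override public List<String> terms(String field) {
            return hidden.contains(field)
                    ? Collections.<String>emptyList()
                    : in.terms(field);
        }
    }

    /** "Merge": builds a new segment from only what the views expose. */
    static StoredSegment merge(List<? extends SegmentView> views) {
        Map<String, Set<String>> merged = new TreeMap<>();
        for (SegmentView view : views) {
            for (String field : view.fieldNames()) {
                Set<String> terms = merged.get(field);
                if (terms == null) {
                    terms = new TreeSet<>();
                    merged.put(field, terms);
                }
                terms.addAll(view.terms(field));
            }
        }
        StoredSegment out = new StoredSegment();
        for (Map.Entry<String, Set<String>> e : merged.entrySet()) {
            out.put(e.getKey(), e.getValue().toArray(new String[0]));
        }
        return out;
    }

    public static void main(String[] args) {
        // Two segments that each still carry an empty rogue field.
        StoredSegment a = new StoredSegment().put("title", "lucene").put("rogue_1");
        StoredSegment b = new StoredSegment().put("title", "merge").put("rogue_2");
        Set<String> rogue = new HashSet<>(Arrays.asList("rogue_1", "rogue_2"));

        List<SegmentView> wrapped = new ArrayList<>();
        wrapped.add(new HidingView(a, rogue));
        wrapped.add(new HidingView(b, rogue));

        System.out.println(merge(wrapped).fieldNames());
    }
}
```

Merging the unwrapped segments instead would carry rogue_1 and rogue_2
into the result even though they have no terms, which matches what you
saw after the plain optimize.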

For Lucene 3 this would be more complicated. I don't think it's
impossible, but unfortunately there is no available code for that case.


On Mon, Mar 31, 2014 at 11:37 PM, Paul Smith <[email protected]> wrote:
> OK, this is more low-level Lucene, but in the context of an ElasticSearch
> cluster, is there any way to get an index/shard to optimize away a bunch of
> fields that are no longer used (literally have no term values associated
> with them)?
>
> We had an application bug introduced that polluted an index with a very
> large number of fields (25,000 fields... *cough*), and let's just say things
> weren't well after that.
>
> We've deleted all the rogue records, but the shards still contain the raw
> Lucene Field information (we've inspected these with Luke) and the cluster
> is heavily CPU bound processing "refreshVersionTable" calls, which run a
> large loop whose cost is a function of the number of fields in the segments.
>
> We've attempted a test optimize of the index using Luke on a single shard,
> but the residual segments post-optimize still contain a large number of
> these fields, all with no values associated with them.
>
> Obviously a reindex would do this, but if there's any other bright ideas
> that are quicker than that (45 million item index we're trying to keep up)
> would be most welcome!
>
> We're on ES 0.19.10 still (lucene 3.6.1).  (you can tell me "upgrade"
> another day please..)
>
> Here's a snapshot picture from the Luke on a single shard from this index.
>
> cheers!
>
> Paul Smith
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAHfYWB5nO%3DDQ50SQ4kgde6JvT%3DgjQ_7FmLbVcXVk5Kiurwme%2Bg%40mail.gmail.com.
> For more options, visit https://groups.google.com/d/optout.

