Some responses below:

On Tue, Jun 24, 2014 at 7:04 PM, Cindy Hsin <[email protected]> wrote:

> Looks like the memory usage increased a lot with 10k fields with these two
> parameters disabled.
>
> Based on the experiments we have done, it looks like ES has abnormal memory
> usage and performance degradation when the number of fields is large (i.e.
> 10k), whereas Solr's memory usage and performance hold up with a large
> number of fields.
>
> If we are only looking at 10k fields scenario, is there a way for ES to
> make the ingest performance better (perhaps via a bug fix)?
>

I've opened an ES issue to address the slowdown as more and more unique
fields are added via dynamic templates:
https://github.com/elasticsearch/elasticsearch/issues/6619


> The solution Mike suggested sounds like a workaround (i.e. combine multiple
> fields into one field to reduce the large number of fields). I can run it
> by our team but I'm not sure if this will fly.
>

Well, I think both Solr and ES (once we fix the above issue) will still
have a high cost if you index so many fields, since both are based on
Lucene.

One simple but effective approach, whether you use Solr or ES, is to use
nested documents, where the parent document holds any "common" fields
across all of your documents, and then each child document has two fields,
key and value.  key holds the original field name you wanted to index, and
value holds the original field value, so you have as many child documents
as you had field+values to index for your original document.  This approach
has worked well in other applications that needed so many fields...

It essentially changes the wide range of field names into a wide range of
field values instead, which Lucene handles very well.  It results in more,
smaller documents, but this scales out well as you add nodes.
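
To make the key/value idea concrete, here is a minimal Python sketch of the
transformation on the client side, before indexing.  The names used here
(to_nested, "attrs", the sample field names) are hypothetical and only for
illustration; the mapping fragment assumes the "nested" field type, and the
exact mapping syntax depends on the ES version you run.

```python
import json


def to_nested(doc, common_fields, child_field="attrs"):
    """Turn one wide document into a parent with key/value children.

    Fields named in common_fields stay on the parent; every other
    field becomes one child of the form {"key": name, "value": value}.
    """
    parent = {k: v for k, v in doc.items() if k in common_fields}
    parent[child_field] = [
        # Values are stringified so that a single "value" field can hold
        # them all (an assumption; typed value fields are another option).
        {"key": k, "value": str(v)}
        for k, v in doc.items()
        if k not in common_fields
    ]
    return parent


# Hypothetical mapping fragment: the children must be mapped as nested
# so that each key/value pair is indexed as its own hidden document.
mapping_fragment = {"properties": {"attrs": {"type": "nested"}}}

if __name__ == "__main__":
    wide = {"id": 1, "timestamp": "2014-06-24", "cpu_0": 0.5, "cpu_1": 0.7}
    print(json.dumps(to_nested(wide, {"id", "timestamp"}), indent=2))
```

With this layout the index only ever sees a fixed handful of fields (the
common ones plus key and value), no matter how many distinct original
field names show up in the data.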

Mike McCandless

http://blog.mikemccandless.com
