Some responses below:

On Tue, Jun 24, 2014 at 7:04 PM, Cindy Hsin <[email protected]> wrote:
> Looks like the memory usage increased a lot with 10k fields with these two
> parameter disabled.
>
> Based on the experiment we have done, looks like ES have abnormal memory
> usage and performance degradation when number of fields are large (ie.
> 10k). Where Solr memory usage and performance remains for the large number
> fields.
>
> If we are only looking at 10k fields scenario, is there a way for ES to
> make the ingest performance better (perhaps via a bug fix)?
>

I've opened an ES issue to address the slowdown as more and more unique
fields are added via dynamic templates:

https://github.com/elasticsearch/elasticsearch/issues/6619

> The solution Mike suggestion sounds like a workaround (ie combine multiple
> fields into one field to reduce the large number of fields). I can run it
> by our team but not sure if this will fly.
>

Well, I think both Solr and ES (once we fix the above issue) will still
have a high cost if you index so many fields, since both are based on
Lucene.

One simple but effective approach, whether you use Solr or ES, is to use
nested documents, where the parent document holds any "common" fields
across all of your documents, and then each child document has two fields,
key and value. key holds the original field name you wanted to index, and
value holds the original field value, so you have as many child documents
as you had field/value pairs to index for your original document.

This approach has worked well in other applications that needed so many
fields... It essentially turns the wide range of field names into a wide
range of field values instead, which Lucene handles very well. It results
in more, smaller documents, but this scales out well as you add nodes.

Mike McCandless

http://blog.mikemccandless.com
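
The key/value restructuring described above can be sketched roughly as follows. This is only an illustration, not code from this thread: the function name `to_key_value_docs`, the `common_fields` parameter, and the `children` field name are all hypothetical, and the sketch only reshapes a plain dictionary without talking to ES or Solr.

```python
from typing import Any, Dict, Iterable


def to_key_value_docs(doc: Dict[str, Any], common_fields: Iterable[str]) -> Dict[str, Any]:
    """Reshape one wide document into a parent with key/value children.

    Fields named in common_fields stay on the parent. Every other field
    becomes one child of the form {"key": <field name>, "value": <field value>}.
    The "children" field name is an assumption made for this sketch.
    """
    common = set(common_fields)

    # Parent keeps only the fields shared across all documents.
    parent = {name: value for name, value in doc.items() if name in common}

    # Each remaining field turns into its own small child document.
    parent["children"] = [
        {"key": name, "value": value}
        for name, value in doc.items()
        if name not in common
    ]
    return parent


if __name__ == "__main__":
    wide = {"id": "1", "timestamp": "2014-06-24", "cpu_temp": "71", "fan_rpm": "1200"}
    print(to_key_value_docs(wide, ["id", "timestamp"]))
```

With this shape the index only ever sees a fixed, small set of field names (the common fields plus key and value), no matter how many distinct original field names show up. One caveat of the sketch: all values land in a single value field, so in practice they would need a consistent type.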
