Wow, this is exactly what I was looking for. I am a bit curious about #5: I assume there is a Java API for accessing ES. Is there a link on how to get started using Java with ES? I would like to know how to import the ES framework/API into a Java project.
Thanks again, this is a great clarification!

On Tuesday, January 14, 2014 4:17:31 PM UTC-5, Jörg Prante wrote:
>
> 1. Mostly, indexes are result of a partition design outside ES. For
> example, by time, user, data origin. The beauty of ES is that it can host
> as many indexes as you wish.
>
> 2. If your maximum number of nodes (hosts) you want to spend to ES is
> known, use that node number for the number of shards. So you make sure your
> cluster can scale. If the number is not known, try to estimate the total
> number of documents to get indexed, the total volume of that documents, and
> an estimated index volume per shard. Rule of thumb: a shard should be sized
> so it can fit into the Java heap and so that it can be moved between nodes
> in reasonable time (~1-10 GB).
>
> 3. You can scale up by adding nodes - just start ES on another host. Scale
> down is also easy, stop ES on a node.
>
> 4. You have to write a program that traverses your folders, picks up each
> document, and extracts fields from the document to get them indexed. With
> scrutmydocs.org you can experiment how this works by using such a file
> traverser which is already prepared to handle quite a lot of file types
> automatically.
>
> 5. You should consider using one of the standard clients. As ES supports
> HTTP REST, and the standard clients are designed to support a comparable
> set of features, it does not matter what language you use. Just pick your
> favorite language. (My personal favorite is Java, where there is no need to
> use HTTP REST, instead the native transport protocol can be used)
>
> Jörg
>
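To make the rule of thumb in point 2 concrete, here is a rough sketch of the arithmetic in Java. The class name `ShardEstimate`, its methods, and the sample figures (50 million documents, about 2 KB of index volume each, a 5 GB target shard size) are all made up for illustration; only the "use the node count if known, otherwise size shards at roughly 1-10 GB" idea comes from the quoted message.

```java
final class ShardEstimate {
    private ShardEstimate() {}

    static final long GB = 1024L * 1024L * 1024L;

    /** Estimated total index volume: document count times average index bytes per document. */
    static long estimateIndexBytes(long documentCount, long avgIndexBytesPerDoc) {
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact(documentCount, avgIndexBytesPerDoc);
    }

    /** Shard count when the node budget is unknown: ceil(total / target), at least 1. */
    static int shardsForVolume(long totalIndexBytes, long targetShardBytes) {
        if (totalIndexBytes < 0 || targetShardBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and target must be positive");
        }
        long shards = totalIndexBytes / targetShardBytes
                + (totalIndexBytes % targetShardBytes == 0 ? 0 : 1);
        return (int) Math.max(1L, Math.min(shards, (long) Integer.MAX_VALUE));
    }

    /** Shard count when the maximum number of nodes is known: one shard per node. */
    static int shardsForNodes(int maxNodes) {
        if (maxNodes < 1) {
            throw new IllegalArgumentException("maxNodes must be at least 1");
        }
        return maxNodes;
    }

    public static void main(String[] args) {
        // Hypothetical workload, not taken from the thread
        long total = estimateIndexBytes(50_000_000L, 2_048L);
        System.out.println("estimated index bytes: " + total);
        System.out.println("shards at 5 GB each:   " + shardsForVolume(total, 5 * GB));
        System.out.println("shards for 8 nodes:    " + shardsForNodes(8));
    }
}
```

The average index volume per document is the shakiest input; indexing a small sample and measuring the resulting index size is probably a better basis than guessing.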
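For point 4, a folder traverser could start out something like the sketch below. It uses only the JDK (`java.nio.file`) and makes no ES calls; the class name `FolderTraverser` and the field names (`path`, `name`, `size_bytes`, `last_modified`, `content`) are invented for this example, and it assumes every file is UTF-8 plain text. Binary formats such as PDF or Office files would need a real parser (Apache Tika is one option) in place of the `content` line.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FolderTraverser {
    private FolderTraverser() {}

    /** Walks root recursively and returns one field map per regular file, sorted by path. */
    static List<Map<String, Object>> collect(Path root) throws IOException {
        List<Path> files;
        // Files.walk holds directory handles open, so close the stream when done
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        List<Map<String, Object>> docs = new ArrayList<>();
        for (Path file : files) {
            docs.add(extractFields(root, file));
        }
        return docs;
    }

    /** Builds the fields for one file; the field names are illustrative only. */
    static Map<String, Object> extractFields(Path root, Path file) throws IOException {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("path", root.relativize(file).toString());
        fields.put("name", file.getFileName().toString());
        fields.put("size_bytes", Files.size(file));
        fields.put("last_modified", Files.getLastModifiedTime(file).toString());
        // Assumption: the file is UTF-8 plain text
        fields.put("content", new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        return fields;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Map<String, Object> doc : collect(root)) {
            System.out.println(doc.get("path") + " (" + doc.get("size_bytes") + " bytes)");
        }
    }
}
```

Each returned map corresponds to one document; handing it to whichever client you pick in point 5 would be the indexing step, which is left out here on purpose.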
