Re: what are the research papers that ES relies on?

MrBu Mon, 30 Mar 2015 09:04:26 -0700

Aaron, thanks for the reply.

You can't distribute all of the documents to every node if their total size 
is larger than a typical hard disk. Also, that was just an example I gave. I am 
trying to figure out which magical techniques ES adds on top of what Lucene 
already has.
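
For reference, the hash-based routing that Aaron describes in the quoted reply 
below can be sketched roughly like this. It is only an illustration of the 
documented rule shard = hash(routing) % number_of_primary_shards. The names 
pick_shard and shard_histogram are made up for this sketch, and zlib.crc32 is a 
stand-in hash, not the function Elasticsearch actually uses.

```python
import zlib


def pick_shard(routing_value: str, num_primary_shards: int) -> int:
    """Map a routing value (by default the document id) to a shard number."""
    # Assumption: crc32 stands in for Elasticsearch's internal hash function.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % num_primary_shards


def shard_histogram(routing_values, num_primary_shards: int = 5):
    """Count how many documents land on each shard for the given routing values."""
    counts = [0] * num_primary_shards
    for value in routing_values:
        counts[pick_shard(value, num_primary_shards)] += 1
    return counts


if __name__ == "__main__":
    # Distinct ids are spread over the shards by the hash.
    print(shard_histogram([f"doc-{i}" for i in range(1000)]))
    # One shared routing value sends every document to the same shard (a hot spot).
    print(shard_histogram(["user-42"] * 100))
```

The second call shows how a hot spot can appear: if many documents share one 
routing value, they all hash to the same shard, however many shards exist.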


On Monday, 30 March 2015 at 18:55:49 UTC+3, Aaron Mefford wrote:
>
> "Automagic" routing already happens by hashing the document id.  It sounds 
> like you may have a situation where your document ids are creating a hot 
> spot.  If so, what you want is not automagic routing but more control over 
> the routing, or a better document id.  You can code your own routing to get 
> a more even distribution for your given keyset, but I think you would be 
> better served by a better document key.  This isn't Mongo or HBase, where 
> the document key rules the world.
>
> The other possible reason you are hot-spotting is index creation.  In a 
> log ingestion scenario, the most recent index is almost always the hottest 
> index.  That is where all indexing occurs, and that is where all queries 
> start.  If you have changed the default of 5 shards and are creating only 
> 1 shard, that shard will be hot in this scenario.
>
> Your comment about routing a shard to another shard does not make sense.  
> You need to read a bit more about what shards are and how they work.  That 
> said, if you have multiple replicas of a shard, those copies will 
> automatically be distributed across your nodes.  In fact, if the number of 
> copies of each shard (primary plus replicas) equals the number of nodes in 
> the cluster, you should automatically have all data on all nodes; any node 
> will be able to query local data, and no node will be hot because of query 
> volume.  However, indexing is still routed to the primary shard first.
>
> As was mentioned previously, the code is open; however, it sounds like 
> you are looking to go deep-water diving before learning to swim.
>
> On Monday, March 30, 2015 at 8:57:51 AM UTC-6, MrBu wrote:
>>
>> Jörg,
>>
>> Thanks for the input. I have read many tutorials and guides (the official 
>> one too). I just want to re-route in a more automagic way, like routing 
>> evenly to the shards and maybe duplicating the most-used shard to other 
>> shards.
>>
>> On Monday, 30 March 2015 at 10:33:19 UTC+3, Jörg Prante wrote:
>>>
>>> Elasticsearch is open source, so reading (and using and modifying) the 
>>> algorithms is possible. There is also a lot of introductory material 
>>> available online, and I recommend "Elasticsearch: The Definitive Guide" if 
>>> you want something on paper.
>>>
>>> If you create an index, ES creates shards for this index (5 by default), 
>>> and each shard is assigned to one of the nodes, so indexing and search are 
>>> automatically distributed over the participating nodes. ES keeps a map of 
>>> shards in the cluster state, so every node is able to route a query or an 
>>> index command. You don't need to manually route queries to shards.
>>>
>>> You can force ES to put all data on the 3rd node, and in that case you 
>>> already know what you want... there is no surprise. ES follows the 
>>> principle of least surprise.
>>>
>>> Jörg
>>>
>>> On Mon, Mar 30, 2015 at 5:07 AM, MrBu <metin....@gmail.com> wrote:
>>>
>>>> Other than Lucene's own research papers, what research papers or 
>>>> special algorithms are used by Elastic? I couldn't find a list of them 
>>>> in the documentation.
>>>>
>>>> Are special algorithms used, and which ones are used where? For 
>>>> example, which algorithm is used for load distribution: is it just a 
>>>> round-robin algorithm?
>>>>
>>>> I really want to get in deep with Elastic :)
>>>>
>>>> This way I could learn more. For example, suppose there are 20 
>>>> nodes, and surprisingly (somehow) only the data on the 3rd node is being 
>>>> searched all the time (say the popular documents somehow ended up only 
>>>> on this node). Does Elastic spread this load across the whole cluster by 
>>>> redistributing this data to other nodes?  Or will it always use only the 
>>>> 3rd node? There are tons of questions in my mind, waiting to be answered. 
>>>> The only way to answer them is to read the algorithms. It would help me 
>>>> a lot.
>>>>
>>>> Thanks
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "elasticsearch" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to elasticsearc...@googlegroups.com.
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/elasticsearch/75907f69-38be-49fb-bf69-2f5dbf83cc45%40googlegroups.com
>>>>  
>>>> <https://groups.google.com/d/msgid/elasticsearch/75907f69-38be-49fb-bf69-2f5dbf83cc45%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>>
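
To make the replica point in Aaron's reply above concrete, here is a toy 
placement sketch. It assumes a simple round-robin rule and is not the real 
Elasticsearch allocator, which weighs many more factors. The function name 
allocate and the layout format are invented for this example. It only shows 
that copies of one shard land on different nodes, and that every node holds 
every shard once the number of copies equals the number of nodes.

```python
def allocate(num_nodes: int, num_shards: int, num_replicas: int):
    """Toy round-robin placement of primary and replica shard copies on nodes."""
    copies = num_replicas + 1  # one primary plus its replicas
    if copies > num_nodes:
        # Elasticsearch would leave the extra replicas unassigned, since two
        # copies of a shard never share a node; this toy simply refuses.
        raise ValueError("more copies per shard than nodes")
    layout = {node: [] for node in range(num_nodes)}
    for shard in range(num_shards):
        for copy in range(copies):
            # Offsetting by the copy number keeps copies on distinct nodes.
            node = (shard + copy) % num_nodes
            role = "primary" if copy == 0 else "replica"
            layout[node].append((shard, role))
    return layout


if __name__ == "__main__":
    # 3 nodes, 5 shards, 2 replicas: 3 copies per shard, one on each node.
    for node, shards in allocate(3, 5, 2).items():
        print(node, shards)
```

With 3 nodes and 2 replicas, any node can answer a search from local data, 
while an index request for a given shard still goes to the one node holding 
its primary.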

