Re: Decoupling Data and indexing

Amish Asthana Tue, 11 Nov 2014 10:46:59 -0800

With the existing Elasticsearch, I can think of an architecture like this.

Index: indexForDataDump. No mapping (is that possible?) or a minimal mapping.
It is used only to dump data from the external system. The data has some
primary key.


There are different search indexes with different mappings: search-index1,
search-index2, etc.
These indexes get populated from indexForDataDump using the technique
described here:
<http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/>.
This way I can drop a search index whenever needed and create a new one with
a new mapping.
Are there any pros/cons or issues with this approach? There will be data
duplication, but I am hoping it is minimal. (Is there any way to quantify it?)
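
To make the "drop and recreate a search index" step concrete, here is a
minimal sketch of the alias swap that the linked post relies on. It only
builds the JSON body for a POST to the _aliases endpoint; sending it to the
cluster and copying documents out of indexForDataDump (for example with a
scrolled search plus bulk indexing) are left out. The helper name
alias_swap_actions and the versioned index names search-index1_v1 and
search-index1_v2 are assumptions for illustration, and search-index1 is
assumed to be an alias rather than a concrete index.

```python
import json
from typing import Optional


def alias_swap_actions(alias: str, new_index: str,
                       old_index: Optional[str] = None) -> dict:
    """Build the body for POST /_aliases that points `alias` at `new_index`.

    If `old_index` is given, the alias is removed from it in the same
    request, so searches against the alias move over in one atomic step.
    All names passed in are hypothetical examples, not fixed conventions.
    """
    actions = []
    if old_index is not None:
        actions.append({"remove": {"index": old_index, "alias": alias}})
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}


if __name__ == "__main__":
    # Assumed flow: search-index1_v2 was created with the new mapping and
    # filled from indexForDataDump before this swap is sent.
    body = alias_swap_actions("search-index1", "search-index1_v2",
                              old_index="search-index1_v1")
    print(json.dumps(body, indent=2))
```

Once the swap has gone through, the old concrete index (search-index1_v1 in
this sketch) can be deleted, which is what keeps the duplication limited to
one dump index plus the currently live search indexes.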

regards and thanks
amish

On Tuesday, November 11, 2014 10:02:46 AM UTC-8, Amish Asthana wrote:
>
> I am not aware of FAST, but the idea looks promising.
> However, it might not be that easy to just have a plugin for ES, as the data 
> itself is distributed across different machines.
> So it will not be possible to have just one server holding the data, as it 
> would become a single point of failure.
> regards and thanks
> amish
>
> On Tuesday, November 11, 2014 1:21:53 AM UTC-8, Jörg Prante wrote:
>>
>> I know from the FAST search engine ten years ago that there was a two-phase 
>> commit for distributed search and indexing. One server could listen on the 
>> API and keep the (compressed) input stored, and all the other indexing 
>> servers were supplied with this input in another phase to create binary 
>> indexes, either automatically or by manual operation, called the 
>> "suspend/resume indexing API". 
>>
>> The advantage was that data could be received continuously via the API while 
>> FAST indexing could be stopped temporarily in order to balance 
>> indexing and search performance on limited hardware.
>>
>> Are you thinking of something like that for Elasticsearch as well? This 
>> architecture could be implemented as a plugin.
>>
>> Jörg
>>
>> On Mon, Nov 10, 2014 at 10:13 PM, Amish Asthana <[email protected]> 
>> wrote:
>>
>>> Hi
>>> Is there a way we can decouple data and the associated mapping/indexing in 
>>> Elasticsearch itself?
>>> Basically, store the raw data as source (JSON or some other format) so that 
>>> various mappings/indexes can be used on top of it.
>>> I understand that one can use an outside database or file system, but 
>>> can it be achieved natively in ES itself?
>>>
>>> Basically, we are trying to see how our ES instance will work when we 
>>> have to change the mapping of existing and continuously incoming data without 
>>> any downtime for the end user.
>>> We have an added wrinkle: our indexing has to be edit-aware for 
>>> versioning purposes, unlike ES, where each edit is a new record.
>>> regards and thanks
>>> amish
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/elasticsearch/0bb1f5ef-3991-4568-9891-018baf79ebae%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/elasticsearch/0bb1f5ef-3991-4568-9891-018baf79ebae%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>

