[
https://issues.apache.org/jira/browse/USERGRID-255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Johnson updated USERGRID-255:
-----------------------------------
Description:
The two-dot-o Query Index module currently stores both document source and
fields in ElasticSearch. Since we only ever retrieve ID numbers from ES, there
is no need for us to store source; doing so is a waste of resources.
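For illustration only, here is a minimal sketch of what turning source storage on or off could look like as an ElasticSearch type mapping. The `_source.enabled` mapping setting is a real ElasticSearch option; the function name, the `entity` type name, and the idea of driving it from a single boolean are assumptions, not the Query Index module's actual code.

```python
import json


def build_mapping(type_name, store_source):
    """Return a type mapping body that enables or disables _source storage.

    type_name and store_source are hypothetical parameters chosen for this
    sketch; a real mapping would also carry the field definitions.
    """
    return {type_name: {"_source": {"enabled": bool(store_source)}}}


# Example: a mapping that does not store document source.
print(json.dumps(build_mapping("entity", False), indent=2))
```

With source disabled, queries can still return document IDs, which is all the description above says we need from ES.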
This is what we need:
1) A way to configure Usergrid to either store source or not store source.
2) An index-rebuild "Tool" (implemented as a REST end-point) that either removes
or adds source, depending on how the system is configured to operate. The
Tool must allow us to re-index without downtime. Possible approach:
For each application:
a) Tool creates a new index and adds that index to the application's read and
write aliases.
b) Tool removes the old index from the application's write alias so it is no
longer written to.
c) Tool deletes the mappings for each newly added index, then re-creates them
with the new store-source settings.
d) Tool re-indexes the application's collections.
e) Once re-index is complete, Tool deletes the old index.
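The alias handling in the approach above could be sketched roughly as follows. This only builds the ordered list of ElasticSearch REST calls (create index, `POST /_aliases` with `add`/`remove` actions, delete index) and sends nothing; the function name, the index names, and the alias names are hypothetical. The mapping re-creation and the re-indexing of collections are left as a comment because they depend on Usergrid internals not described here.

```python
def reindex_plan(old_index, new_index, read_alias, write_alias):
    """Return the ordered (method, path, body) calls for one application.

    All names are assumptions made for this sketch.
    """
    return [
        # Create the new index.
        ("PUT", "/" + new_index, None),
        # Add the new index to both the read and the write alias.
        ("POST", "/_aliases", {"actions": [
            {"add": {"index": new_index, "alias": read_alias}},
            {"add": {"index": new_index, "alias": write_alias}},
        ]}),
        # Stop writing to the old index; it stays readable for now.
        ("POST", "/_aliases", {"actions": [
            {"remove": {"index": old_index, "alias": write_alias}},
        ]}),
        # Mapping re-creation and collection re-indexing would happen here.
        # Once the re-index is complete, drop the old index.
        ("DELETE", "/" + old_index, None),
    ]


for method, path, body in reindex_plan("app_v1", "app_v2", "app_read", "app_write"):
    print(method, path, body)
```

Because the old index stays in the read alias until it is deleted, queries keep working while the new index is being filled, which is the point of the no-downtime requirement.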
was: From Todd: When I was testing our indexing schema changes, I noticed our
document source was appearing in ES. We should verify if this is happening in
production, and if so, turn this off on the client side. Since we store all
data in Cassandra, we do not need to also store it in ES. This means we can
index more in ES and use a lot less disk space, so it's very important
operationally.
Summary: Re-indexer That Removes Source from ES (was: Verify we're not
storing document source in Elastic Search)
> Re-indexer That Removes Source from ES
> --------------------------------------
>
> Key: USERGRID-255
> URL: https://issues.apache.org/jira/browse/USERGRID-255
> Project: Usergrid
> Issue Type: Story
> Components: Stack
> Reporter: Todd Nine
> Assignee: David Johnson
>