[
https://issues.apache.org/jira/browse/USERGRID-536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Nine updated USERGRID-536:
-------------------------------
Description:
Currently, our dynamic mapping causes several issues with Elasticsearch. We
should change our mapping to use a static structure, and resolve this
operational pain.
We need to make the following changes.
h2. Modify our IndexScope.
This should more closely resemble the elements of an edge since this represents
an edge. It will simplify the use of our query module and make development
clearer. This scope should be refactored into the following objects.
* IndexEdge - Id, name, timestamp, edgeType (source or target)
* SearchEdge - Id, name, edgeType
Note: edgeType is the type of the Id within the edge. Does this Id represent a
source Id, or does it represent a targetId? The entity to be indexed will
implicitly be the opposite of the type specified. I.e., if it's a source edge,
the document is the target. If it's a target edge, the document is the source.
These values should also be stored within our document, so that we can index
our documents. Note that we perform bidirectional indexing in some cases, such
as users, groups, etc. When we do this, we need to ensure that we mark the
direction of the edge appropriately.
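As a rough sketch of the two objects (Python is used here only for brevity; the
class layout, the `node_id` attribute name, and the `document_role` and
`search_edge` helpers are illustrative assumptions, not existing Usergrid code):

```python
from dataclasses import dataclass
from enum import Enum


class EdgeType(Enum):
    """Which end of the edge the stored Id refers to."""
    SOURCE = "source"
    TARGET = "target"


@dataclass(frozen=True)
class SearchEdge:
    """Edge identity used for searching: Id, name, edgeType."""
    node_id: str
    name: str
    edge_type: EdgeType

    def document_role(self) -> EdgeType:
        # The indexed entity is implicitly the opposite end of the edge.
        if self.edge_type is EdgeType.SOURCE:
            return EdgeType.TARGET
        return EdgeType.SOURCE


@dataclass(frozen=True)
class IndexEdge:
    """Edge identity used for indexing: Id, name, timestamp, edgeType."""
    node_id: str
    name: str
    timestamp: int
    edge_type: EdgeType

    def search_edge(self) -> SearchEdge:
        # Dropping the timestamp yields the edge a query would search on.
        return SearchEdge(self.node_id, self.name, self.edge_type)
```

For bidirectional indexing, the same entity pair would simply be indexed twice,
once with a SOURCE edge and once with a TARGET edge.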
h2. Change default sort ordering
When sorting is unspecified, we should order by timestamp descending from our
index edge. This ensures that we retain the correct edge time semantics, and
will properly order collections and connections.
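A minimal sketch of that fallback, assuming the sort clause is built as an
Elasticsearch request-body `sort` list and that the timestamp is stored in the
`edgeTimestamp` field of the static mapping (`build_sort` is a hypothetical
helper name):

```python
from typing import Any, Dict, List, Optional


def build_sort(requested: Optional[List[Dict[str, Any]]] = None) -> List[Dict[str, Any]]:
    """Return the sort clause for a search request body."""
    if requested:
        # The caller asked for an explicit ordering; use it unchanged.
        return requested
    # No sort specified: fall back to the edge timestamp, newest first.
    return [{"edgeTimestamp": {"order": "desc"}}]
```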
h2. Remove the legacy query class
We don't need the Query class; it has far too many functions to be a well
encapsulated object. Instead, we should simply take the string QL, the
SearchEdge and the limit to return our candidates. From there, we should parse
and visit the query internally to the query logic, NOT externally.
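To illustrate the intended shape of the entry point, here is a hedged sketch in
which the caller passes only the QL string, the SearchEdge, and the limit, and
all parsing happens inside. Everything here is hypothetical: `build_search`,
`_parse`, the toy one-clause grammar, and the use of "_" to join the searchEdge
parts are assumptions for illustration, not the real QL parser.

```python
import re
from typing import Any, Dict, NamedTuple, Tuple


class SearchEdge(NamedTuple):
    """Id, name, and edgeType of the edge being searched."""
    node_id: str
    name: str
    edge_type: str  # "source" or "target"


def _parse(ql: str) -> Tuple[str, str]:
    # Toy grammar for illustration only: <field> = '<value>'
    match = re.fullmatch(r"\s*([\w.]+)\s*=\s*'([^']*)'\s*", ql)
    if match is None:
        raise ValueError(f"unsupported query: {ql!r}")
    return match.group(1), match.group(2)


def build_search(ql: str, edge: SearchEdge, limit: int) -> Dict[str, Any]:
    """Turn the raw QL string, the edge, and the limit into a search request."""
    field, value = _parse(ql)  # parsing stays internal to the query logic
    return {
        "searchEdge": f"{edge.node_id}_{edge.name}_{edge.edge_type}",
        "limit": limit,
        "criteria": {"field": field, "value": value},
    }
```

The point of the design is that callers never see a parsed query tree, so the
visitor can change freely behind this one function.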
h2. Create a static mapping.
The mapping should contain the following static fields.
* entityId - The entity id
* entityVersion - The entity version
* edgeId - The edge Id
* edgeName - The edge name
* edgeTimestamp - The edge timestamp
* edgeType - source | target
* searchEdge - edgeId + edgeName + edgeType
It will then contain an array of "fields". Each of these fields will have the
following format.
{code}
{ "name":"[entity field name as a path]", "[field type]":[field value] }
{code}
We will define a field type for each type of field. Note that each field tuple
will always contain a single field and a single value. Possible field types
are the following.
* string - This will be mapped into 2 mappings using multi-field mapping. It
will be an unanalyzed string and an analyzed string. The 2 fields will then be
"string_u" and "string_a". The Query visitor will need to update the field
name appropriately.
* long - An unanalyzed long
* double - An unanalyzed double
* boolean - An unanalyzed boolean
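One possible reading of this mapping, written as a Python dict for readability.
This is a sketch under assumptions: it uses Elasticsearch 1.x syntax (the
`string` type with `index: not_analyzed`), it models "fields" as a `nested`
type per the Nested Objects reference, it places "string_u" and "string_a" as
multi-field sub-fields of `string`, and the `dynamic: strict` setting and the
`long` type for entityVersion and edgeTimestamp are our own guesses.

```python
def not_analyzed_string():
    """Mapping for a string stored as a single exact token (ES 1.x syntax)."""
    return {"type": "string", "index": "not_analyzed"}


STATIC_MAPPING = {
    # Assumption: reject unmapped fields so dynamic mapping can never kick in.
    "dynamic": "strict",
    "properties": {
        "entityId": not_analyzed_string(),
        "entityVersion": {"type": "long"},   # assumed numeric
        "edgeId": not_analyzed_string(),
        "edgeName": not_analyzed_string(),
        "edgeTimestamp": {"type": "long"},   # assumed numeric
        "edgeType": not_analyzed_string(),   # "source" or "target"
        "searchEdge": not_analyzed_string(),
        "fields": {
            "type": "nested",
            "properties": {
                "name": not_analyzed_string(),
                "string": {
                    "type": "string",
                    "index": "analyzed",
                    # Multi-field: one value, indexed two ways.
                    "fields": {
                        "string_u": not_analyzed_string(),
                        "string_a": {"type": "string", "index": "analyzed"},
                    },
                },
                "long": {"type": "long"},
                "double": {"type": "double"},
                "boolean": {"type": "boolean"},
            },
        },
    },
}
```

With this layout the set of mapped field names is fixed no matter how many
distinct entity properties are indexed, which is the operational win we want.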
The entity path will be a flattened path from the root JSON element to the leaf
JSON element. It can be thought of as a path through the tree of JSON elements.
We will use a dot '.' to delimit the fields, e.g. X.Y.Z for nested objects.
Primitive arrays will contain a field object for each element in the array.
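The flattening could look roughly like the sketch below. `flatten` and
`_field_type` are hypothetical names; treating objects inside arrays as
flattened under the array's path, and raising on unsupported values such as
null, are assumptions that the description does not settle.

```python
from typing import Any, Dict, List


def _field_type(value: Any) -> str:
    # bool must be checked before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported field value: {value!r}")


def flatten(entity: Dict[str, Any], prefix: str = "") -> List[Dict[str, Any]]:
    """Flatten an entity into {"name": path, "<field type>": value} tuples."""
    fields: List[Dict[str, Any]] = []
    for key, value in entity.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested object: extend the dotted path and recurse.
            fields.extend(flatten(value, path))
        elif isinstance(value, list):
            # Arrays: one field object per element, all sharing the same path.
            for element in value:
                if isinstance(element, dict):
                    fields.extend(flatten(element, path))
                else:
                    fields.append({"name": path, _field_type(element): element})
        else:
            fields.append({"name": path, _field_type(value): value})
    return fields
```

For example, `flatten({"location": {"lat": 37.5}, "tags": ["a", "b"]})` yields
one `double` field named `location.lat` and two `string` fields named `tags`.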
h3. References
Multi Field Mapping:
http://www.elastic.co/guide/en/elasticsearch/reference/current/_multi_fields.html
Nested Objects:
http://www.elastic.co/guide/en/elasticsearch/guide/current/nested-objects.html
was:
Currently, our dynamic mapping causes several issues with Elasticsearch. We
should change our mapping to use a static structure, and resolve this
operational pain.
We need to make the following changes.
h2. Modify our IndexScope.
This should more closely resemble the elements of an edge since this represents
an edge. This scope should be refactored into the following objects.
* IndexEdge - Id, name, timestamp, edgeType (source or target)
* SearchEdge - Id, name, edgeType
Note: edgeType is the type of the Id within the edge. Does this Id represent a
source Id, or does it represent a targetId? The entity to be indexed will
implicitly be the opposite of the type specified. I.e., if it's a source edge,
the document is the target. If it's a target edge, the document is the source.
These values should also be stored within our document, so that we can index
our documents. Note that we perform bidirectional indexing in some cases, such
as users, groups, etc. When we do this, we need to ensure that we mark the
direction of the edge appropriately.
h2. Create a static mapping.
The mapping should contain the following static fields.
* entityId - The entity id
* entityVersion - The entity version
* edgeId - The edge Id
* edgeName - The edge name
* edgeTimestamp - The edge timestamp
* edgeType - source | target
* searchEdge - edgeId + edgeName + edgeType
It will then contain an array of "fields". Each of these fields will have the
following format.
{code}
{ "name":"[entity field name]", "[field type]":[field value] }
{code}
> Change our index structure to eliminate static mapping
> ------------------------------------------------------
>
> Key: USERGRID-536
> URL: https://issues.apache.org/jira/browse/USERGRID-536
> Project: Usergrid
> Issue Type: Story
> Components: Stack
> Reporter: Todd Nine
> Assignee: Todd Nine
>