[ 
https://issues.apache.org/jira/browse/ATLAS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16913530#comment-16913530
 ] 

Sridhar commented on ATLAS-3370:
--------------------------------

[~bolke]
The full-text indexing we originally used indexed a single column holding the 
concatenation of all field values. It hurt CRUD performance because that value 
was recalculated every time an entity went through a CRUD operation. We 
replaced it with our own free-text handler, which uses the original field 
values as they are indexed. This eliminates the need for the FullTextIndex 
that we used to maintain.

This, however, caused a problem when the indices were used for aggregation 
metrics. For example, when an owner name like "batman@cloudera.com" was 
indexed, it was split into the tokens "batman" and "cloudera.com", which threw 
off the aggregation calculations (we use the SOLR indexing feature for these). 
As the fix in this ticket, we changed the indexing style used for fields that 
we don't want to tokenize. SOLR now indexes those values as is instead of 
tokenizing them, which gives correct aggregation values with the framework we 
are using.
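
To illustrate the effect, here is a minimal sketch in plain Python. It is not 
Atlas or SOLR code: tokenize() is a hypothetical stand-in for a tokenizing 
analyzer that only reproduces the split described above, and facet_counts() 
and the sample owner values are made up for the example.

```python
from collections import Counter


def tokenize(value):
    # Hypothetical stand-in for a tokenizing analyzer: splitting at "@"
    # reproduces the "batman" / "cloudera.com" split described above.
    return [token for token in value.lower().split("@") if token]


def facet_counts(values, tokenized):
    # Count how many values contain each indexed term, roughly what a
    # facet/aggregation over the index would report.
    counts = Counter()
    for value in values:
        terms = tokenize(value) if tokenized else [value]
        counts.update(set(terms))
    return counts


# Made-up owner values for three entities.
owners = ["batman@cloudera.com", "batman@cloudera.com", "robin@cloudera.com"]

tokenized_counts = facet_counts(owners, tokenized=True)
exact_counts = facet_counts(owners, tokenized=False)

print("tokenized:", dict(tokenized_counts))
print("as is:    ", dict(exact_counts))
```

With tokenization the buckets are fragments of owner names and their counts 
add up to more than the number of entities. With the values indexed as is, 
there is one bucket per owner and the counts add up to the entity count.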

I hope that the above explanation is clear. Please let me know if you need more 
clarity on it.

thanks,
Sridhar

 

> Aggregation Metrics with quick search, Counts don't add up
> ----------------------------------------------------------
>
>                 Key: ATLAS-3370
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3370
>             Project: Atlas
>          Issue Type: Bug
>            Reporter: Sridhar
>            Assignee: Sridhar
>            Priority: Major
>
> The issue was caused by tokenization of the affected fields.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
