Hi Sreenivasulu,

The reasoning behind this code block is as follows.

The Solr schema for JanusGraph is configured with a StandardTokenizer. This causes the index to be built in such a way that any index query containing special characters gets tokenized into smaller chunks and returns wrong results. For example, with the StandardTokenizer you can't index an email address as a single value: it gets tokenized and the special characters are stripped out. When you fire an index query with the same email address, the query value is tokenized again and the smaller tokens are used in the query. This produces false positives, because those smaller tokens might be present in some other entity.

This problem only arises when Atlas does a direct index query. The graph query works fine because the graph layer has other mechanisms of retrieval, but as you've seen, the query becomes slower.

Here's what can be done to fix this problem:

1. Change the tokenizers/analyzers in the Solr schema file to more suitable ones.
2. Delete the indexes in JanusGraph and use the index repair utility to rebuild them.

Keep in mind that re-indexing is a large overhead for Atlas. It might take days or weeks to complete on an existing cluster with lots of data. This solution is better suited to a new install of Atlas where the data is minimal.

Please let me know if you have any further questions.

On Sun, Apr 21, 2019 at 11:16 PM Nallapati, Sreenivasulu <[email protected]> wrote:

> Hi Apoorv/Madhan,
>
> Can you please help us understanding the below issue?
>
> ---
> Regards,
> Sreeni
>
> *From: *"Nallapati, Sreenivasulu" <[email protected]>
> *Date: *Sunday, 21 April 2019 at 11:24 PM
> *To: *"[email protected]" <[email protected]>, "[email protected]" <[email protected]>
> *Subject: *Special characters attribute filter search behaviour
>
> Hello Atlas team,
>
> We are facing some performance issue with respect to the Entity Type
> attribute filter search.
>
> [image: cid:[email protected]]
>
> What is the reason behind for going in-memory or graph query in case of
> special characters in string attribute filter query, even though the
> attribute is indexed? Its taking very very long time to get the results
> back and we are getting gateway timeouts.
>
> The below code snippet says “in-memory or graph query in case” but
> doesn’t explain what is the reason behind it.
>
> private boolean isIndexSearchable(FilterCriteria filterCriteria, AtlasStructType structType) throws AtlasBaseException {
>     String      qualifiedName = structType.getQualifiedAttributeName(filterCriteria.getAttributeName());
>     Set<String> indexedKeys   = context.getIndexedKeys();
>     boolean     ret           = indexedKeys != null && indexedKeys.contains(qualifiedName);
>
>     if (ret) { // index exists
>         // for string type attributes, don't use index query in the following cases:
>         //   - operation is NEQ, as it might return fewer entries due to *tokenization* of vertex property value
>         //   - value-to-compare has special characters
>         AtlasType attributeType = structType.getAttributeType(filterCriteria.getAttributeName());
>
>         if (AtlasBaseTypeDef.ATLAS_TYPE_STRING.equals(attributeType.getTypeName())) {
>             if (filterCriteria.getOperator() == SearchParameters.Operator.NEQ) {
>                 if (LOG.isDebugEnabled()) {
>                     LOG.debug("NEQ operator found for string attribute {}, deferring to in-memory or graph query (might cause poor performance)", qualifiedName);
>                 }
>
>                 ret = false;
>             } else if (hasIndexQuerySpecialChar(filterCriteria.getAttributeValue())) {
>                 if (LOG.isDebugEnabled()) {
>                     LOG.debug("special characters found in filter value {}, deferring to in-memory or graph query (might cause poor performance)", filterCriteria.getAttributeValue());
>                 }
>
>                 ret = false;
>             }
>         }
>     }
>
>     if (LOG.isDebugEnabled()) {
>         if (!ret) {
>             LOG.debug("Not using index query for: attribute='{}', operator='{}', value='{}'", qualifiedName, filterCriteria.getOperator(), filterCriteria.getAttributeValue());
>         }
>     }
>
>     return ret;
> }
>
> ---
> Regards,
> Sreeni
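
P.S. To make the false-positive effect from the top of this reply concrete, here is a minimal Java sketch. It is not Atlas, JanusGraph, or Solr code. The class TokenizedIndexSketch and its methods are hypothetical names made up for illustration, the tokenize() method is a simplified stand-in that splits on every non-alphanumeric character (the real StandardTokenizer follows different, more detailed rules), and the query is assumed to match when any one token matches.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not part of Atlas, JanusGraph, or Solr.
public class TokenizedIndexSketch {

    // Simplified stand-in for an analyzer: lower-case the value and split it
    // on every run of non-alphanumeric characters. This is an assumption made
    // for the example, not the real StandardTokenizer algorithm.
    static List<String> tokenize(String value) {
        List<String> tokens = new ArrayList<>();
        for (String part : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Index-style lookup: both the stored value and the query value are
    // tokenized, and an entity is a hit if it shares at least one token
    // with the query (assumed OR semantics).
    static List<String> indexQuery(Map<String, String> entities, String queryValue) {
        List<String> queryTokens = tokenize(queryValue);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : entities.entrySet()) {
            List<String> indexedTokens = tokenize(entry.getValue());
            for (String token : queryTokens) {
                if (indexedTokens.contains(token)) {
                    hits.add(entry.getKey());
                    break;
                }
            }
        }
        return hits;
    }

    // Exact comparison on the full stored value, standing in for the slower
    // in-memory or graph path that does not rely on tokenized index entries.
    static List<String> exactQuery(Map<String, String> entities, String queryValue) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : entities.entrySet()) {
            if (entry.getValue().equals(queryValue)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, String> entities = new LinkedHashMap<>();
        entities.put("e1", "john.doe@example.com");
        entities.put("e2", "jane.doe@other.org");
        entities.put("e3", "bob@corp.net");

        String query = "john.doe@example.com";
        System.out.println("tokens:      " + tokenize(query));
        System.out.println("index query: " + indexQuery(entities, query));
        System.out.println("exact query: " + exactQuery(entities, query));
    }
}
```

In this sketch the index-style lookup returns both e1 and e2, because the two values share the token "doe", while the exact comparison returns only e1. That is the trade-off behind isIndexSearchable(): skipping the index for values with special characters avoids the false positives, at the cost of the slower path.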
