Yushi Hayasaka created ATLAS-5095:
-------------------------------------

             Summary: Cache shell entity after creation to prevent from cache 
miss
                 Key: ATLAS-5095
                 URL: https://issues.apache.org/jira/browse/ATLAS-5095
             Project: Atlas
          Issue Type: Improvement
            Reporter: Yushi Hayasaka


Sometimes Atlas attempts to load an entity from the cache (e.g., to notify 
listeners of processed entities after `createOrUpdate()`).
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java#L176]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java#L595]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java#L465]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java#L418]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java#L111-L115]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java#L1145-L1146]

If the specified entity is not found in the cache, Atlas falls back to 
retrieving it through `EntityGraphRetriever#toAtlasEntity`, which is a slow path 
compared to the cache.
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java#L181]
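
To illustrate the lookup pattern described above, here is a minimal, self-contained sketch. It is not Atlas code: `CacheFirstLookup`, `slowLoader` and `getSlowPathCalls` are hypothetical names, a plain `String` stands in for the entity, and storing the slow-path result back into the cache is an assumption made for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative only: none of these names are Atlas classes. */
public class CacheFirstLookup {
    // guid -> entity; a String stands in for the real entity object
    private final Map<String, String> cache = new HashMap<>();
    // Hypothetical stand-in for the slow retrieval path (e.g. reading from the graph)
    private final Function<String, String> slowLoader;
    private int slowPathCalls = 0;

    public CacheFirstLookup(Function<String, String> slowLoader) {
        this.slowLoader = slowLoader;
    }

    /** Puts an already-built entity into the cache. */
    public void cache(String guid, String entity) {
        cache.put(guid, entity);
    }

    /** Tries the cache first and uses the slow loader only on a miss. */
    public String getEntity(String guid) {
        String entity = cache.get(guid);
        if (entity == null) {
            slowPathCalls++;
            entity = slowLoader.apply(guid);
            if (entity != null) {
                // Assumption for this sketch: remember the result for later lookups
                cache.put(guid, entity);
            }
        }
        return entity;
    }

    /** Number of times the slow loader was invoked. */
    public int getSlowPathCalls() {
        return slowPathCalls;
    }
}
```

Every entity that was never cached costs one call to the slow loader the first time it is requested, which is the cost this issue is about.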

Currently, we observe that Atlas retrieves shell entities through 
`EntityGraphRetriever` instead of from the cache.
When an event contains many shell entities, this increases the operation 
time.

As introduced in ATLAS-3405, if an event references entities that do not exist, 
Atlas creates shell entities for them.
In my understanding (please correct me if wrong), a shell entity should only 
have the properties that are set in `createShellEntityVertex`.
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java#L291-L312]

So, I guess it is safe to cache the shell entity after creation (e.g. right after 
`createShellEntityVertex`), which should improve performance by reducing 
calls to the slow path.
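
A rough sketch of the idea, under stated assumptions: the class below is a toy model, not the Atlas implementation. `ShellEntityCaching`, `Entity`, `createShellEntity`, the `cacheOnCreate` flag and the in-memory `graph` map are all hypothetical, and the three fields on `Entity` are illustrative rather than the actual list of properties set by `createShellEntityVertex`.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of "cache the shell entity right after creating it". Not Atlas code. */
public class ShellEntityCaching {
    /** Hypothetical minimal shell entity; the fields are illustrative. */
    public static final class Entity {
        public final String guid;
        public final String typeName;
        public final boolean incomplete;

        public Entity(String guid, String typeName, boolean incomplete) {
            this.guid = guid;
            this.typeName = typeName;
            this.incomplete = incomplete;
        }
    }

    private final Map<String, Entity> graph = new HashMap<>(); // stands in for the graph store
    private final Map<String, Entity> cache = new HashMap<>(); // stands in for the entity cache
    private final boolean cacheOnCreate;
    private int slowPathCalls = 0;

    public ShellEntityCaching(boolean cacheOnCreate) {
        this.cacheOnCreate = cacheOnCreate;
    }

    /** Creates a shell entity and, if enabled, caches it immediately. */
    public Entity createShellEntity(String guid, String typeName) {
        Entity shell = new Entity(guid, typeName, true);
        graph.put(guid, shell);
        if (cacheOnCreate) {
            // The proposed change: the shell entity is fully known here, so cache it now
            cache.put(guid, shell);
        }
        return shell;
    }

    /** Cache-first lookup; a miss falls back to the (slow) graph read. */
    public Entity getEntity(String guid) {
        Entity entity = cache.get(guid);
        if (entity == null) {
            slowPathCalls++;
            entity = graph.get(guid);
            if (entity != null) {
                cache.put(guid, entity);
            }
        }
        return entity;
    }

    public int getSlowPathCalls() {
        return slowPathCalls;
    }
}
```

In this model, with `cacheOnCreate` set to `false`, each shell entity costs one slow-path read when it is first looked up. With it set to `true`, those reads do not happen. Whether the real change is this simple depends on whether the shell entity really carries nothing beyond what is set at creation.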



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
