Yushi Hayasaka created ATLAS-5095:
-------------------------------------

Summary: Cache shell entity after creation to prevent from cache miss
Key: ATLAS-5095
URL: https://issues.apache.org/jira/browse/ATLAS-5095
Project: Atlas
Issue Type: Improvement
Reporter: Yushi Hayasaka

Sometimes Atlas attempts to load an entity from the cache (e.g., to notify listeners of processed entities after `createOrUpdate()`).

[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java#L176]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java#L595]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java#L465]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java#L418]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java#L111-L115]
[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java#L1145-L1146]

If the specified entity is not found in the cache, Atlas falls back to retrieving it through `EntityGraphRetriever#toAtlasEntity`, which is a slow path compared to the cache.

[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java#L181]

Currently, we observe that Atlas retrieves shell entities through `EntityGraphRetriever` instead of from the cache. When an event contains many shell entities, this increases the operation time.

As introduced in ATLAS-3405, if an event includes entities that do not exist yet, Atlas creates shell entities for them.

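To illustrate the lookup pattern described above, here is a minimal, self-contained sketch. It does not use the real Atlas classes: `ShellEntityCacheSketch`, `cache`, `loadFromGraph`, `getEntity`, `createShellEntity`, and the property names are all hypothetical stand-ins, and the map-based entity is only an assumption about what a shell entity might carry. It shows how a freshly created entity that is not put into the cache forces one slow-path read, while one that is cached at creation does not.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names are part of the Atlas API.
class ShellEntityCacheSketch {
    // Stand-in for a request-scoped entity cache, keyed by GUID.
    static final Map<String, Map<String, Object>> cache = new HashMap<>();

    // Counts how often the (simulated) slow path is taken.
    static int slowPathCalls = 0;

    // Stand-in for the slow graph read (the role played by
    // EntityGraphRetriever#toAtlasEntity in the description above).
    static Map<String, Object> loadFromGraph(String guid) {
        slowPathCalls++;
        Map<String, Object> entity = new HashMap<>();
        entity.put("guid", guid);
        return entity;
    }

    // Cache-first lookup with fallback to the slow path on a miss.
    static Map<String, Object> getEntity(String guid) {
        Map<String, Object> cached = cache.get(guid);
        if (cached != null) {
            return cached;
        }
        Map<String, Object> loaded = loadFromGraph(guid);
        cache.put(guid, loaded);
        return loaded;
    }

    // Creates a shell entity with only a few properties (assumed set);
    // optionally puts it into the cache right after creation.
    static Map<String, Object> createShellEntity(String guid, String typeName,
                                                 boolean cacheAfterCreate) {
        Map<String, Object> shell = new HashMap<>();
        shell.put("guid", guid);
        shell.put("typeName", typeName);
        shell.put("isIncomplete", Boolean.TRUE);
        if (cacheAfterCreate) {
            cache.put(guid, shell);
        }
        return shell;
    }

    public static void main(String[] args) {
        // Not cached at creation: the later lookup misses and hits the slow path.
        createShellEntity("guid-1", "hive_table", false);
        getEntity("guid-1");
        System.out.println("slow-path calls after uncached shell: " + slowPathCalls);

        // Cached at creation: the later lookup is served from the cache.
        createShellEntity("guid-2", "hive_table", true);
        getEntity("guid-2");
        System.out.println("slow-path calls after cached shell: " + slowPathCalls);
    }
}
```

In this sketch the counter only increases for the shell entity that was not cached at creation; with many shell entities in one event, each uncached one would add another slow-path read.
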
In my understanding (please correct me if I am wrong), a shell entity should only have the few properties that are set in `createShellEntityVertex`.

[https://github.com/apache/atlas/blob/18d7f9dccf5658988d32e387339948286810f0a8/repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java#L291-L312]

So I think it is safe to cache the shell entity after creation (e.g., right after `createShellEntityVertex`), which should improve performance by reducing calls to the slow path.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)