Ashutosh Mestry created ATLAS-1995:
--------------------------------------
Summary: Performance of Entity Creation Can Be Improved By Using
Index Query to Fetch Entity Using Unique Attributes
Key: ATLAS-1995
URL: https://issues.apache.org/jira/browse/ATLAS-1995
Project: Atlas
Issue Type: Improvement
Components: atlas-core
Affects Versions: 0.8-incubating, trunk
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry
*Background*
Profiling the entity creation flow showed that several calls are made to
_AtlasGraphUtilsV1.getVertexByUniqueAttributes_.
Each of these calls queries the database with a graph query. Using an index
query instead could improve performance.
*Analysis*
In experiments, entity creation performance improved by 50% when this method
was replaced with an equivalent that uses _indexQuery_.
Also, when a large number of entities were created (typically using
_import_hive.sh_), CPU usage on Atlas was reduced, because Solr was doing some
of the work.
*Implementation Guidance*
* Add a new method, _AtlasGraphUtilsV1.getAtlasVertexFromIndexQuery_, that uses
_AtlasGraphProvider.indexQuery_ to fetch vertices.
* Ensure that the generated query is escaped appropriately.
* Include logic to fall back to a graph query if the property being queried is
not indexed.
Since this is a high-impact change, it will be worthwhile to verify that other
dependent modules are not affected.
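The guidance above could be sketched roughly as follows. This is a minimal, standalone illustration and not the actual Atlas code: the class name _IndexQueryLookupSketch_, the helper methods, the list of escaped characters, the query form v."<property>":<value>, and the property names in _main_ are all assumptions made for the example. The index and graph lookups are passed in as plain functions so the sketch runs without Atlas on the classpath.

```java
import java.util.Collections;
import java.util.Set;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical sketch: escape a value, build an index query string, and
// fall back to a graph lookup when the property is not indexed.
class IndexQueryLookupSketch {
    // Characters assumed to be special in Lucene/Solr query syntax.
    private static final String SPECIAL_CHARS = "\\+-!():^[]\"{}~*?|&;/";

    /** Backslash-escapes special characters and whitespace in a query value. */
    static String escape(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (SPECIAL_CHARS.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Builds a query of the assumed form v."<propertyKey>":<escaped value>. */
    static String buildIndexQuery(String propertyKey, String value) {
        return "v.\"" + propertyKey + "\":" + escape(value);
    }

    /**
     * Uses the index lookup when the property is indexed; otherwise falls
     * back to the graph lookup. Both lookups are supplied by the caller.
     */
    static <V> V findVertex(String propertyKey, String value,
                            Set<String> indexedKeys,
                            Function<String, V> indexLookup,
                            BiFunction<String, String, V> graphLookup) {
        if (indexedKeys.contains(propertyKey)) {
            return indexLookup.apply(buildIndexQuery(propertyKey, value));
        }
        return graphLookup.apply(propertyKey, value);
    }

    public static void main(String[] args) {
        // Illustrative property names; only the first is treated as indexed.
        Set<String> indexed = Collections.singleton("Referenceable.qualifiedName");
        Function<String, String> index = q -> "index lookup: " + q;
        BiFunction<String, String, String> graph =
                (k, v) -> "graph lookup: " + k + " = " + v;

        System.out.println(findVertex("Referenceable.qualifiedName",
                "default.t1@cl1", indexed, index, graph));
        System.out.println(findVertex("hive_table.owner",
                "etl user", indexed, index, graph));
    }
}
```

In this sketch the fallback is decided only by whether the property key is known to be indexed, which mirrors the third bullet. Whether an empty index result should also trigger a graph query is a separate design question that the issue does not settle.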