Ashutosh Mestry created ATLAS-3132:
--------------------------------------

             Summary: Data Patch Fx: Improve Data Patching Performance
                 Key: ATLAS-3132
                 URL: https://issues.apache.org/jira/browse/ATLAS-3132
             Project: Atlas
          Issue Type: Improvement
          Components: atlas-core
    Affects Versions: trunk
            Reporter: Ashutosh Mestry
            Assignee: Ashutosh Mestry
             Fix For: trunk

*Background*

The Java patch framework (now called the data patching framework), introduced recently, patches about 1 million entities per 15 hours (roughly 18 entities per second). This can be improved.

*Proposed Solution*

* Use the Producer-Consumer framework to spawn multiple workers that perform concurrent updates to entity vertices.
* Use _AtlasGraph_ in bulk loading mode to gain further performance.
* Perform duplicate data checks during processing.

*Projected Performance Improvement*

* Based on various tests, these changes increase throughput. The new rate can be ~300K entities per 5 minutes (about 1,000 entities per second).
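
The producer-consumer idea in the proposal can be illustrated with a minimal, self-contained sketch. This is not the Atlas implementation: the class `ConcurrentPatchSketch`, the method `patchAll`, and the use of plain string ids in place of entity vertices are all hypothetical, and the `patch` callback stands in for whatever update the real framework applies. Bulk loading mode is not shown because it depends on the graph backend.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical sketch: names and types are illustrative, not Atlas APIs.
public class ConcurrentPatchSketch {

    /**
     * Applies `patch` once per distinct id using `workerCount` consumer threads.
     * Returns the number of ids actually patched (duplicates are skipped).
     */
    public static int patchAll(List<String> vertexIds, int workerCount, Consumer<String> patch)
            throws InterruptedException {
        // Bounded queue: the producer blocks when workers fall behind.
        BlockingQueue<Optional<String>> queue = new ArrayBlockingQueue<>(1024);
        // Thread-safe set used for the duplicate check during processing.
        Set<String> seen = ConcurrentHashMap.newKeySet();
        AtomicInteger patched = new AtomicInteger();

        Thread[] workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++) {
            workers[i] = new Thread(() -> {
                try {
                    while (true) {
                        Optional<String> item = queue.take();
                        if (!item.isPresent()) {
                            return; // empty Optional marks end of input
                        }
                        String id = item.get();
                        // add() returns false if another worker already claimed this id.
                        if (seen.add(id)) {
                            patch.accept(id);
                            patched.incrementAndGet();
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }

        // Producer: enqueue all ids, then one end marker per worker.
        for (String id : vertexIds) {
            queue.put(Optional.of(id));
        }
        for (int i = 0; i < workerCount; i++) {
            queue.put(Optional.empty());
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return patched.get();
    }
}
```

Calling `ConcurrentPatchSketch.patchAll(ids, 4, id -> update(id))` with a hypothetical `update` method would spread the updates across four workers. Error handling is omitted: a real implementation would need to handle a `patch` call that throws, and to commit graph transactions in batches.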