Ashutosh Mestry created ATLAS-3132:
--------------------------------------

             Summary: Data Patch Fx: Improve Data Patching Performance
                 Key: ATLAS-3132
                 URL: https://issues.apache.org/jira/browse/ATLAS-3132
             Project: Atlas
          Issue Type: Improvement
          Components:  atlas-core
    Affects Versions: trunk
            Reporter: Ashutosh Mestry
            Assignee: Ashutosh Mestry
             Fix For: trunk


*Background*

The recently introduced Java patch framework (now called the data patching 
framework) patches entities at a rate of about 1 million per 15 hrs. This 
rate can be improved.

*Proposed Solution*
 * Use the Producer-Consumer framework to spawn multiple workers to perform 
concurrent updates to entity vertices.
 * Use _AtlasGraph_ in bulk loading mode to further gain performance.
 * Perform duplicate data checks during processing.
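The first and third points above can be sketched roughly as follows. This is a minimal illustration using only java.util.concurrent, not the actual Atlas implementation: the class name _PatchWorkerSketch_, the methods _patchAll_ and _applyPatch_, the queue size and the use of plain Long vertex ids are all assumptions made for the example. It does not touch _AtlasGraph_ or its bulk loading mode.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one producer feeds vertex ids to several consumer workers.
class PatchWorkerSketch {
    // Sentinel telling a worker that no more work will arrive.
    // Assumes Long.MIN_VALUE is never a real vertex id.
    private static final Long POISON_PILL = Long.MIN_VALUE;

    /** Patches each distinct id once using the given number of workers; returns how many were patched. */
    static int patchAll(List<Long> vertexIds, int workers) throws InterruptedException {
        BlockingQueue<Long> queue = new ArrayBlockingQueue<>(1024);
        Set<Long> seen = ConcurrentHashMap.newKeySet(); // duplicate check shared by all workers
        AtomicInteger patched = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        // Consumers: take ids off the queue until the sentinel shows up.
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        Long id = queue.take();
                        if (id.equals(POISON_PILL)) {
                            return;
                        }
                        if (seen.add(id)) { // true only the first time an id is seen
                            applyPatch(id);
                            patched.incrementAndGet();
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Producer: enqueue all work, then one sentinel per worker.
        for (Long id : vertexIds) {
            queue.put(id);
        }
        for (int i = 0; i < workers; i++) {
            queue.put(POISON_PILL);
        }

        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            pool.shutdownNow();
            throw new IllegalStateException("workers did not finish in time");
        }
        return patched.get();
    }

    // Placeholder for the real per-vertex update, which is not shown in this issue.
    static void applyPatch(long vertexId) {
    }

    public static void main(String[] args) throws InterruptedException {
        int n = patchAll(List.of(1L, 2L, 2L, 3L, 1L), 4);
        System.out.println("patched " + n + " vertices"); // prints "patched 3 vertices"
    }
}
```

The bounded queue keeps the producer from running far ahead of the workers, and the shared concurrent set ensures a vertex id that is enqueued twice is patched only once.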

*Projected Performance Improvement*
 * Based on various tests, these changes increase throughput. The new rate 
can be ~300K entities per 5 mins, roughly 54 times the current rate (1 
million entities per 15 hrs is about 5.6K per 5 mins).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
