[
https://issues.apache.org/jira/browse/ATLAS-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashutosh Mestry updated ATLAS-2434:
-----------------------------------
Attachment: ATLAS-2434-Import-Perf-Improvement.patch
> Import: Performance Improvement
> -------------------------------
>
> Key: ATLAS-2434
> URL: https://issues.apache.org/jira/browse/ATLAS-2434
> Project: Atlas
> Issue Type: Bug
> Components: atlas-core
> Affects Versions: trunk
> Reporter: Ashutosh Mestry
> Assignee: Ashutosh Mestry
> Priority: Major
> Fix For: trunk
>
> Attachments: ATLAS-2434-Import-Perf-Improvement.patch
>
>
> *Background*
> The introduction of _relationships_ in Atlas caused the
> _EntityMutationResponse_ to report many more entities as modified than
> before.
> This has an adverse impact on performance during bulk entity creation.
> Bulk entity creation happens during the import process. Single entity creation.
> *Behavior*
> During import, in a typical scenario where a database is being imported, the
> set of updated entities in the _EntityMutationResponse_ grows progressively.
> This happens because every edge created between a database and a table, or
> between a table and a column, marks the connected entities as updated.
> Import thus slows down progressively.
> A ZIP file used for benchmarks showed:
> * Branch-0.8 (last release): 2 minutes.
> * Master (current development): 40+ minutes.
> Performance deteriorates further as the size of the import increases.
> *Possible Solution*
> During the import process, avoid marking entities that are affected only by
> relationship edge creation as modified.
>
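The idea in the possible solution can be sketched roughly as follows. This is a minimal illustration, not the attached patch and not Atlas's actual classes: MutationResponseSketch, ImportSketch, the importInProgress flag, and the GUID strings are all hypothetical names invented for this example. It only shows how skipping relationship-driven "updated" entries during import keeps the response from growing.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical, simplified stand-in for a mutation response (assumption:
// not the real EntityMutationResponse API).
class MutationResponseSketch {
    private final Set<String> created = new LinkedHashSet<>();
    private final Set<String> updated = new LinkedHashSet<>();
    private final boolean importInProgress;

    MutationResponseSketch(boolean importInProgress) {
        this.importInProgress = importInProgress;
    }

    void recordCreated(String guid) {
        created.add(guid);
    }

    // Called for each end of a newly created relationship edge.
    void recordRelationshipEndpoint(String guid) {
        if (importInProgress) {
            return; // during import, do not mark edge endpoints as updated
        }
        updated.add(guid);
    }

    int createdCount() { return created.size(); }
    int updatedCount() { return updated.size(); }
}

class ImportSketch {
    // Simulates creating one database with tables and columns, plus the
    // database-table and table-column relationship edges.
    static MutationResponseSketch importDatabase(int tables, int columnsPerTable,
                                                 boolean importInProgress) {
        MutationResponseSketch response = new MutationResponseSketch(importInProgress);
        String db = "db";
        response.recordCreated(db);
        for (int t = 0; t < tables; t++) {
            String table = db + ".t" + t;
            response.recordCreated(table);
            response.recordRelationshipEndpoint(db);
            response.recordRelationshipEndpoint(table);
            for (int c = 0; c < columnsPerTable; c++) {
                String column = table + ".c" + c;
                response.recordCreated(column);
                response.recordRelationshipEndpoint(table);
                response.recordRelationshipEndpoint(column);
            }
        }
        return response;
    }

    public static void main(String[] args) {
        MutationResponseSketch regular = importDatabase(3, 4, false);
        MutationResponseSketch duringImport = importDatabase(3, 4, true);
        System.out.println("updated (regular): " + regular.updatedCount());
        System.out.println("updated (import):  " + duringImport.updatedCount());
    }
}
```

In this sketch the created set is the same in both modes; only the relationship-driven updated set is suppressed when the import flag is set. Whether the real fix uses a flag of this kind is an assumption.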