-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70463/
-----------------------------------------------------------

Review request for atlas, Kapildeo Nayak, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Bugs: ATLAS-3132
    https://issues.apache.org/jira/browse/ATLAS-3132


Repository: atlas


Description
-------

**Approach**
- Refactored the existing implementation to fit the new design.
- Renamed 'Java Patch Framework' to 'Data Patch Framework', because its 
purpose is to modify the structure of existing data.
- New _DataPatchService_: Modified the order in which services are called. 
_DataPatchService_ is called before other services are invoked, which gives 
it a chance to complete before any new data is accepted.
- New _DataPatchRegistry_: Data access (CRUD) operations for data patches.
- New _UniqueAttributePatchHandler_: Current implementation for adding the new 
property to data vertices. Implemented rudimentary caching to prevent 
repetitive look-ups.
- New REST Endpoint to query status of patches.
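
To illustrate the intent of the registry and the start-up ordering, here is a minimal sketch. It is not the code in this patch: the class _PatchRegistrySketch_, the _Patch_ interface and the three status values are hypothetical names chosen for the example, and the real _DataPatchRegistry_ persists its records in the graph rather than in a map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only; names and statuses are assumptions, not the Atlas API.
class PatchRegistrySketch {
    enum Status { UNKNOWN, APPLIED, FAILED }

    interface Patch {
        String id();
        void apply() throws Exception;
    }

    // In-memory stand-in for the persisted patch records.
    private final Map<String, Status> statuses = new LinkedHashMap<>();

    Status statusOf(String id) {
        return statuses.getOrDefault(id, Status.UNKNOWN);
    }

    // Snapshot of all recorded statuses, e.g. what a status endpoint could return.
    Map<String, Status> all() {
        return new LinkedHashMap<>(statuses);
    }

    // Meant to run at start-up, before any other service accepts new data.
    // Patches already marked APPLIED are skipped, so a restart does not redo them.
    void applyPending(Iterable<Patch> patches) {
        for (Patch patch : patches) {
            if (statusOf(patch.id()) == Status.APPLIED) {
                continue;
            }
            try {
                patch.apply();
                statuses.put(patch.id(), Status.APPLIED);
            } catch (Exception e) {
                statuses.put(patch.id(), Status.FAILED);
            }
        }
    }
}
```

Recording a status per patch is what makes the start-up step safe to repeat: a second call to `applyPending` only retries patches that are not yet marked as applied.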

**Performance**
Because data patching is a high-volume operation, its performance has been 
given priority. 
- New _NewPropertyDataHandler_ uses the database in bulk-loading mode for rapid 
processing. Throughput scales with the available resources. Additional properties:
    - _atlas.processing.batchSize_: Size of each batch.
    - _atlas.processing.numWorkers_: Number of worker threads to employ.
- Leverages the existing PC framework.
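
As a rough illustration of how a batch size and a worker count interact, the sketch below splits a list of vertex ids into batches and hands each batch to a fixed pool of threads. It uses a plain `ExecutorService` from the JDK instead of the Atlas PC framework, and _BatchedPatchRunner_, `partition` and `run` are hypothetical names, so treat it as a simplified model rather than the implementation in this patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical sketch; this is not the WorkItemManager/WorkItemConsumer API.
class BatchedPatchRunner {
    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    // Applies patchOne to every item on numWorkers threads and
    // returns the number of items that were processed successfully.
    static <T> long run(List<T> items, int batchSize, int numWorkers, Consumer<T> patchOne) {
        ExecutorService pool = Executors.newFixedThreadPool(numWorkers);
        AtomicLong processed = new AtomicLong();
        for (List<T> batch : partition(items, batchSize)) {
            pool.submit(() -> {
                for (T item : batch) {
                    patchOne.accept(item);
                    processed.incrementAndGet();
                }
                // A real handler would commit its graph transaction here, once per batch.
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                throw new IllegalStateException("patch did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return processed.get();
    }
}
```

In this model a larger batch means fewer commits per vertex, while more workers means more batches in flight at once. If `patchOne` throws, the rest of that batch is abandoned and the returned count falls short of the input size; this sketch does not retry.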

Observed processing times:
- 300K vertices: ~5 mins
- 4.2M entities: ~45 mins (from: 2019-04-12 04:44:50 to 2019-04-12 05:29:04)
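
For reference, the timestamps of the larger run can be turned into an average rate. The helper below is only a worked calculation: _PatchThroughput_ is a hypothetical name, and it assumes the count is exactly 4,200,000 although the figure quoted is rounded.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Hypothetical helper for the arithmetic only; not part of the patch.
class PatchThroughput {
    // Average items per second between two timestamps.
    static double perSecond(long count, LocalDateTime start, LocalDateTime end) {
        long seconds = Duration.between(start, end).getSeconds();
        if (seconds <= 0) {
            throw new IllegalArgumentException("end must be after start");
        }
        return (double) count / seconds;
    }

    // Elapsed seconds of the larger run reported in this request.
    static long largeRunSeconds() {
        LocalDateTime start = LocalDateTime.of(2019, 4, 12, 4, 44, 50);
        LocalDateTime end = LocalDateTime.of(2019, 4, 12, 5, 29, 4);
        return Duration.between(start, end).getSeconds();
    }
}
```

The run lasted 44 min 14 s (2,654 seconds), which works out to roughly 1,580 entities per second on average. The smaller run, at about 300 seconds for 300K vertices, is on the order of 1,000 vertices per second.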


Diffs
-----

  graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/AtlasGraph.java d282c9966 
  graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/DataPatchGraphDBHandler.java PRE-CREATION 
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/VertexIterator.java PRE-CREATION 
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/patches/NewPropertyDataPatch.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/pc/WorkItemConsumer.java b7eb4d89c 
  intg/src/main/java/org/apache/atlas/pc/WorkItemManager.java 0e7d3f22d 
  notification/src/main/java/org/apache/atlas/kafka/EmbeddedKafkaServer.java 32b597fb6 
  notification/src/main/java/org/apache/atlas/kafka/KafkaNotification.java 1d0a2734b 
  repository/src/main/java/org/apache/atlas/repository/patches/AtlasJavaPatchHandler.java 9153d497b 
  repository/src/main/java/org/apache/atlas/repository/patches/DataPatchHandler.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/repository/patches/DataPatchManager.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/repository/patches/DataPatchRegistry.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/repository/patches/DataPatchService.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/repository/patches/PatchContext.java a60422b80 
  repository/src/main/java/org/apache/atlas/repository/patches/TypeNameAttributeCache.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/repository/patches/UniqueAttributePatch.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/repository/patches/UniqueAttributePatchHandler.java f2238f1b0 
  repository/src/main/java/org/apache/atlas/repository/store/bootstrap/AtlasTypeDefStoreInitializer.java 78f3faf99 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasGraphUtilsV2.java 80141b4f1 
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java 5aa6c8f0e 
  repository/src/test/java/org/apache/atlas/patches/DataPatchRegistryTest.java PRE-CREATION 
  webapp/src/main/java/org/apache/atlas/notification/NotificationHookConsumer.java ce2d76f11 
  webapp/src/main/java/org/apache/atlas/web/resources/AdminResource.java c5ceb9d6d 
  webapp/src/test/java/org/apache/atlas/web/resources/AdminResourceTest.java 223a90a9c 


Diff: https://reviews.apache.org/r/70463/diff/1/


Testing
-------

**Unit tests**
Additional tests added.

**Volume tests**
Verification with large datasets: 
- 4M entities
- 3.2M entities
- 16K entities

**Performance tests**
CPU usage, memory usage and disk IO.

**Pre-commit build**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1031/


Thanks,

Ashutosh Mestry
