Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-10 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219859
---


Ship it!




Ship It!

- Sarath Subramanian


On March 5, 2020, 9:43 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71025/
> ---
> 
> (Updated March 5, 2020, 9:43 a.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3320
> https://issues.apache.org/jira/browse/ATLAS-3320
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> - Use existing producer-consumer (PC) framework.
> - Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
> - Add support for configuring number of workers and batch size within 
> _AtlasImportRequest_.
> - Existing import implementation continues to function as before. This is 
> maintained for backward compatibility.
> - The new implementation supports an additional, more memory-efficient zip 
> format (_ZipDirect_). This drastically reduces the memory requirement during 
> import.
> - The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of 
> _JanusGraph_, thereby achieving high ingest rates.
> 
> _AtlasImportRequest_
> ```
> {
>   "options": {
>     "numWorkers": 8,
>     "batchSize": 25
>   }
> }
> ```
> Support for ZipDirect format:
> _AtlasImportRequest_
> ```
> {
>   "options": {
>     "numWorkers": 8,
>     "batchSize": 25,
>     "format": "zipDirect",
>     "migration": "true"
>   }
> }
> ```
> 
> 
> **CURL**
> ```
> curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" \
>   -H "Cache-Control: no-cache" -F request=@./import-options.json \
>   -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
> ```
> 
> 
> Diffs
> -
> 
>   
>   graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
>   intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
>   repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
>   repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java 0f2b4bfae
>   repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
>   repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
>   repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
>   repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java d7020a702
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java fdf117a25
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/IAtlasEntityChangeNotifier.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
>   repository/src/test/java/org/apache/atlas/TestModules.java 06e0ebc6c
> 
> 
> Diff: https://reviews.apache.org/r/71025/diff/14/
> 
> 
> Testing
> ---
> 
> **Unit tests**
> Existing tests.
> 
> **Functional tests**
> - Verified import for pre-1.0 and 

Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-05 Thread Ashutosh Mestry via Review Board


> On March 5, 2020, 9:30 a.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java
> > Lines 34 (patched)
> > 
> >
> > The methods defined here look more like helper methods than interface 
> > methods.

Since this is a drop-in replacement meant to limit the impact of the change, 
the interface needs to have the same method signatures as the original concrete 
implementation. Changing this would involve refactoring the original code. I 
can take it up after this commit.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219785
---



Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-05 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

(Updated March 5, 2020, 5:43 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include: 
- Addressed review comments.


Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320


Repository: atlas


Description
---

**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within 
_AtlasImportRequest_.
- Existing import implementation continues to function as before. This is 
maintained for backward compatibility.
- The new implementation supports an additional, more memory-efficient zip 
format (_ZipDirect_). This drastically reduces the memory requirement during 
import.
- The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of 
_JanusGraph_, thereby achieving high ingest rates.

_AtlasImportRequest_
```
{
  "options": {
    "numWorkers": 8,
    "batchSize": 25
  }
}
```
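
As a rough illustration of how `numWorkers` and `batchSize` could interact, the sketch below splits a list of entity identifiers into batches and hands each batch to a fixed-size worker pool. It is not the Atlas producer-consumer framework or _BulkImporterImpl_; the names `ConcurrentIngestSketch` and `ingest` are hypothetical, and the per-batch work is only a placeholder for "create the entities, then commit once per batch".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the Atlas PC framework.
class ConcurrentIngestSketch {

    // Splits the input into batches of at most batchSize items and processes
    // the batches on numWorkers threads. Returns the number of items processed.
    static int ingest(List<String> entityGuids, int numWorkers, int batchSize) {
        ExecutorService workers = Executors.newFixedThreadPool(numWorkers);
        AtomicInteger processed = new AtomicInteger();
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (int start = 0; start < entityGuids.size(); start += batchSize) {
                int end = Math.min(start + batchSize, entityGuids.size());
                List<String> batch = new ArrayList<>(entityGuids.subList(start, end));
                pending.add(workers.submit(() -> {
                    // Placeholder for the real work on one batch.
                    processed.addAndGet(batch.size());
                }));
            }
            for (Future<?> result : pending) {
                result.get(); // wait for the batch and surface any worker failure
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("ingest interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a batch failed", e.getCause());
        } finally {
            workers.shutdown();
        }
        return processed.get();
    }
}
```

With the values from the request above, `ConcurrentIngestSketch.ingest(guids, 8, 25)` would run up to 8 batches of 25 entities at a time.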
Support for ZipDirect format:
_AtlasImportRequest_
```
{
  "options": {
    "numWorkers": 8,
    "batchSize": 25,
    "format": "zipDirect",
    "migration": "true"
  }
}
```
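
This thread does not describe how _ZipDirect_ is implemented. Purely as an assumption about why a streaming zip reader can need less memory, the sketch below visits zip entries one at a time with `java.util.zip.ZipInputStream`, so only the current entry is held in memory. `ZipStreamingSketch`, `EntryHandler`, and `sampleZip` are hypothetical names used for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration only; not the Atlas ZipSourceDirect class.
class ZipStreamingSketch {

    interface EntryHandler {
        void handle(String name, byte[] content);
    }

    // Reads entries sequentially and passes each one to the handler.
    // Returns the number of entries visited.
    static int forEachEntry(InputStream source, EntryHandler handler) {
        int count = 0;
        byte[] buffer = new byte[8192];
        try (ZipInputStream zip = new ZipInputStream(source)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int read;
                while ((read = zip.read(buffer)) != -1) {
                    content.write(buffer, 0, read);
                }
                handler.handle(entry.getName(), content.toByteArray());
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Builds a small in-memory zip so the sketch can be tried without a file.
    static byte[] sampleZip(String... entryNames) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : entryNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("{}".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```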


**CURL**
```
curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" \
  -H "Cache-Control: no-cache" -F request=@./import-options.json \
  -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
```


Diffs (updated)
-

  
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
  repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java 0f2b4bfae
  repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
  repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
  repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
  repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java d7020a702
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/IAtlasEntityChangeNotifier.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
  repository/src/test/java/org/apache/atlas/TestModules.java 06e0ebc6c


Diff: https://reviews.apache.org/r/71025/diff/13/

Changes: https://reviews.apache.org/r/71025/diff/12-13/


Testing
---

**Unit tests**
Existing tests.

**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1712/

**Volume tests**
- Measure performance with large data.

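+----------+----------+----------+------------------------+
| File     | Before   | After    | Configuration          |
+----------+----------+----------+------------------------+
| smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
| (2.2 MB) |          |          |                        |
+----------+----------+----------+------------------------+
| largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |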
| (40 MB)  

Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-05 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219785
---




repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java
Lines 34 (patched)


The methods defined here look more like helper methods than interface 
methods.


- Sarath Subramanian


On March 4, 2020, 10:09 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71025/
> ---
> 
> (Updated March 4, 2020, 10:09 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3320
> https://issues.apache.org/jira/browse/ATLAS-3320
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> - Use existing producer-consumer (PC) framework.
> - Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
> - Add support for configuring number of workers and batch size within 
> _AtlasImportRequest_.
> - Existing import implementation continues to function as before. This is 
> maintained for backward compatibility.
> - The new implementation supports an additional, more memory-efficient zip 
> format (_ZipDirect_). This drastically reduces the memory requirement during 
> import.
> - The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of 
> _JanusGraph_, thereby achieving high ingest rates.
> 
> _AtlasImportRequest_
> ```
> {
>   "options": {
>     "numWorkers": 8,
>     "batchSize": 25
>   }
> }
> ```
> Support for ZipDirect format:
> _AtlasImportRequest_
> ```
> {
>   "options": {
>     "numWorkers": 8,
>     "batchSize": 25,
>     "format": "zipDirect",
>     "migration": "true"
>   }
> }
> ```
> 
> 
> **CURL**
> ```
> curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" \
>   -H "Cache-Control: no-cache" -F request=@./import-options.json \
>   -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
> ```
> 
> 
> Diffs
> -
> 
>   
>   graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
>   intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
>   repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
>   repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java 0f2b4bfae
>   repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
>   repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
>   repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
>   repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java d7020a702
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/IAtlasEntityChangeNotifier.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
>   repository/src/test/java/org/apache/atlas/TestModules.java 06e0ebc6c
> 
> 
> Diff: https://reviews.apache.org/r/71025/diff/12/
> 
> 
> Testing
> ---
> 
> 

Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-05 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219784
---




intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java
Line 114 (original), 117 (patched)


nit: casting to String is not needed.



repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java
Line 56 (original), 56 (patched)


Add the '@Override' annotation to methods that implement interface methods.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java
Line 69 (original), 69 (patched)


Add the '@Override' annotation to methods that implement interface methods.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java
Line 73 (original), 64 (patched)


The ternary operation here is long and not intuitive. Consider refactoring it 
into a method:

ImportStrategy importStrategy = initImportStrategy(importResult);
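
One way the suggested refactoring could look is sketched below. The types are minimal stand-ins, not the Atlas classes: the real `initImportStrategy` takes an `importResult` whose shape is not shown in this thread, so this sketch assumes a plain options map and assumes the `"migration"` option selects the strategy.

```java
import java.util.Map;

// Hypothetical stand-ins for the types named in the review comment.
interface ImportStrategy {
    String name();
}

class RegularImport implements ImportStrategy {
    @Override
    public String name() { return "regular"; }
}

class MigrationImport implements ImportStrategy {
    @Override
    public String name() { return "migration"; }
}

class ImportStrategySelector {

    // Replaces a long inline ternary with a named method that reads top to bottom.
    // Assumption: the "migration" option (sent as the string "true") picks the strategy.
    static ImportStrategy initImportStrategy(Map<String, String> options) {
        boolean migration = options != null
                && Boolean.parseBoolean(options.get("migration"));

        if (migration) {
            return new MigrationImport();
        }
        return new RegularImport();
    }
}
```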


- Sarath Subramanian



Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-04 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

(Updated March 5, 2020, 6:09 a.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include: 
- Found a fix for the failing unit test.


Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320


Repository: atlas


Description
---

**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within 
_AtlasImportRequest_.
- Existing import implementation continues to function as before. This is 
maintained for backward compatibility.
- The new implementation supports an additional, more memory-efficient zip 
format (_ZipDirect_). This drastically reduces the memory requirement during 
import.
- The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of 
_JanusGraph_, thereby achieving high ingest rates.

_AtlasImportRequest_
```
{
  "options": {
    "numWorkers": 8,
    "batchSize": 25
  }
}
```
Support for ZipDirect format:
_AtlasImportRequest_
```
{
  "options": {
    "numWorkers": 8,
    "batchSize": 25,
    "format": "zipDirect",
    "migration": "true"
  }
}
```


**CURL**
```
curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" \
  -H "Cache-Control: no-cache" -F request=@./import-options.json \
  -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
```


Diffs (updated)
-

  
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
  repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java 0f2b4bfae
  repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
  repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
  repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
  repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java d7020a702
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/IAtlasEntityChangeNotifier.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
  repository/src/test/java/org/apache/atlas/TestModules.java 06e0ebc6c


Diff: https://reviews.apache.org/r/71025/diff/12/

Changes: https://reviews.apache.org/r/71025/diff/11-12/


Testing (updated)
---

**Unit tests**
Existing tests.

**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1712/

**Volume tests**
- Measure performance with large data.

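+----------+----------+----------+------------------------+
| File     | Before   | After    | Configuration          |
+----------+----------+----------+------------------------+
| smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
| (2.2 MB) |          |          |                        |
+----------+----------+----------+------------------------+
| largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |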
| 

Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-03 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

(Updated March 4, 2020, 6:30 a.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include:
- Addressed review comments.


Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320


Repository: atlas


Description
---

**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within 
_AtlasImportRequest_.
- Existing import implementation continues to function as before. This is 
maintained for backward compatibility.
- The new implementation supports an additional, more memory-efficient zip 
format (_ZipDirect_). This drastically reduces the memory requirement during 
import.
- The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of 
_JanusGraph_, thereby achieving high ingest rates.

_AtlasImportRequest_
```
{
  "options": {
    "numWorkers": 8,
    "batchSize": 25
  }
}
```
Support for ZipDirect format:
_AtlasImportRequest_
```
{
  "options": {
    "numWorkers": 8,
    "batchSize": 25,
    "format": "zipDirect",
    "migration": "true"
  }
}
```


**CURL**
```
curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" \
  -H "Cache-Control: no-cache" -F request=@./import-options.json \
  -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
```


Diffs (updated)
-

  
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
  repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
  repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
  repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
  repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java fdf117a25
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION


Diff: https://reviews.apache.org/r/71025/diff/10/

Changes: https://reviews.apache.org/r/71025/diff/9-10/


Testing
---

**Unit tests**
Existing tests.

**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1292

**Volume tests**
- Measure performance with large data.

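+----------+----------+----------+------------------------+
| File     | Before   | After    | Configuration          |
+----------+----------+----------+------------------------+
| smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
| (2.2 MB) |          |          |                        |
+----------+----------+----------+------------------------+
| largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |
| (40 MB)  |          |          |                        |
+----------+----------+----------+------------------------+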
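
As a rough reading of the volume-test figures, the speedup is the ratio of the "Before" duration to the "After" duration. A minimal sketch of that arithmetic, using only the durations reported in the table (the class and method names are illustrative and not part of the patch):

```java
import java.time.Duration;

// Illustrative helper (not part of the patch): before/after speedup ratio.
final class ImportSpeedup {
    // Ratio of the old duration to the new one; larger means faster.
    static double speedup(Duration before, Duration after) {
        return (double) before.toMillis() / after.toMillis();
    }
}
```

With the reported figures, `speedup(Duration.ofMinutes(6), Duration.ofMinutes(2))` gives 3 for smalldb and `speedup(Duration.ofHours(3), Duration.ofMinutes(10))` gives 18 for largedb.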


Thanks,

Ashutosh Mestry



Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-03 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219749
---




repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 384 (patched)


Can you avoid this null check? Consider initializing 'entityChangeNotifier' 
to a no-op implementation.
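
A minimal sketch of what such a no-op default could look like. The interface and class names below are hypothetical and simplified; they are not the actual Atlas types:

```java
import java.util.List;

// Hypothetical, simplified notifier interface; the real Atlas interface differs.
interface ChangeNotifier {
    void onEntitiesAdded(List<String> guids);
}

// No-op implementation: a safe default that does nothing.
final class NopChangeNotifier implements ChangeNotifier {
    @Override
    public void onEntitiesAdded(List<String> guids) {
        // intentionally empty
    }
}

// Hypothetical mapper that never holds a null notifier.
final class MapperSketch {
    private final ChangeNotifier notifier;

    MapperSketch(ChangeNotifier notifier) {
        // Fall back to the no-op once, so call sites need no null check.
        this.notifier = (notifier != null) ? notifier : new NopChangeNotifier();
    }

    int create(List<String> guids) {
        notifier.onEntitiesAdded(guids); // always safe to call
        return guids.size();
    }
}
```

The single null check moves into the constructor, and every call site can invoke the notifier unconditionally.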


- Sarath Subramanian



Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-03 Thread Nikhil Bonte

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219746
---


Ship it!




Ship It!

- Nikhil Bonte



Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-03 Thread Nixon Rodrigues

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219732
---


Ship it!




Ship It!

- Nixon Rodrigues



Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-02 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

(Updated March 3, 2020, 5:13 a.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include:
- Modified approach for getting zip file size during migration import.


Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320


Repository: atlas


Description
---

**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within 
_AtlasImportRequest_.
- Existing import implementation continues to function as before. This is 
maintained for backward compatibility.
- New implementation supports an additional, more memory-efficient zip format 
(_ZipDirect_). This drastically reduces the memory requirement during import.
- The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of 
_JanusGraph_, thereby achieving high ingest rates.
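
The general shape of the producer-consumer approach can be sketched as follows. This is a generic illustration built on `java.util.concurrent`, not the Atlas PC framework; the class name, the `STOP` sentinel, and the use of `batchSize` to size the queue are assumptions made for the example:

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: a generic producer-consumer loop, not the Atlas PC framework.
final class ConcurrentIngestSketch {
    // Unique sentinel instance that tells a worker to stop.
    private static final String STOP = new String("STOP");

    // Feeds the entities to numWorkers consumers; returns how many were processed.
    static int ingest(List<String> entities, int numWorkers, int batchSize) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(numWorkers * batchSize);
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(numWorkers);

        for (int i = 0; i < numWorkers; i++) {
            pool.submit(() -> {
                try {
                    for (String item = queue.take(); item != STOP; item = queue.take()) {
                        processed.incrementAndGet(); // stand-in for creating one entity
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            for (String entity : entities) {
                queue.put(entity);               // producer blocks when the queue is full
            }
            for (int i = 0; i < numWorkers; i++) {
                queue.put(STOP);                 // one sentinel per worker
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
        }
        return processed.get();
    }
}
```

A bounded queue keeps the producer from reading far ahead of the workers, which is one way to keep memory use flat while several threads ingest in parallel.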

_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25
    }
}
```
Support for ZipDirect format:
_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25,
        "format": "zipDirect",
        "migration": "true"
    }
}
```


**CURL**
```
curl -v -X POST -u admin:admin \
  -H "Content-Type: multipart/form-data" \
  -H "Cache-Control: no-cache" \
  -F request=@./import-options.json \
  -F data=@./Default-3-pre.zip \
  http://localhost:21000/api/atlas/admin/import
```


Diffs (updated)
-

  
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
  repository/src/main/java/org/apache/atlas/repository/impexp/AuditsWriter.java 55990f780
  repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
  repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
  repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
  repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java fdf117a25
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION


Diff: https://reviews.apache.org/r/71025/diff/9/

Changes: https://reviews.apache.org/r/71025/diff/8-9/


Testing
---

**Unit tests**
Existing tests.

**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1292

**Volume tests**
- Measure performance with large data.

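+----------+----------+----------+------------------------+
| File     | Before   | After    | Configuration          |
+----------+----------+----------+------------------------+
| smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
| (2.2 MB) |          |          |                        |
+----------+----------+----------+------------------------+
| largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |
| (40 MB)  |          |          |                        |
+----------+----------+----------+------------------------+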


Thanks,

Ashutosh Mestry



Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-03-02 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

(Updated March 2, 2020, 6:57 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include:
- Reduced the size of the patch by breaking it into smaller implementations.


Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320


Repository: atlas


Description
---

**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within 
_AtlasImportRequest_.
- Existing import implementation continues to function as before. This is 
maintained for backward compatibility.
- New implementation supports an additional, more memory-efficient zip format 
(_ZipDirect_). This drastically reduces the memory requirement during import.
- The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of 
_JanusGraph_, thereby achieving high ingest rates.

_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25
    }
}
```
Support for ZipDirect format:
_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25,
        "format": "zipDirect",
        "migration": "true"
    }
}
```
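
On the receiving side, options like these have to be read with sensible fallbacks. A minimal sketch, assuming the options arrive as a plain key/value map; the class name, helper names, and any default values passed in are hypothetical and are not the Atlas defaults:

```java
import java.util.Map;

// Illustrative only: reads the option keys shown above from a plain map.
final class ImportOptionsSketch {
    // Returns the integer value of an option, or defaultValue if absent or malformed.
    static int intOption(Map<String, ?> options, String key, int defaultValue) {
        Object value = options.get(key);
        if (value == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(value.toString().trim());
        } catch (NumberFormatException e) {
            return defaultValue; // fall back rather than fail the import
        }
    }

    // True only when the option is present and equals "true" (case-insensitive).
    static boolean boolOption(Map<String, ?> options, String key) {
        Object value = options.get(key);
        return value != null && Boolean.parseBoolean(value.toString().trim());
    }

    // True when the request asks for the ZipDirect format.
    static boolean isZipDirect(Map<String, ?> options) {
        return "zipDirect".equalsIgnoreCase(String.valueOf(options.get("format")));
    }
}
```

Parsing through `toString()` lets the same helper accept both the numeric `8` and the string `"true"` styles that appear in the request above.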


**CURL**
```
curl -v -X POST -u admin:admin \
  -H "Content-Type: multipart/form-data" \
  -H "Cache-Control: no-cache" \
  -F request=@./import-options.json \
  -F data=@./Default-3-pre.zip \
  http://localhost:21000/api/atlas/admin/import
```


Diffs (updated)
-

  
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
  repository/src/main/java/org/apache/atlas/repository/impexp/AuditsWriter.java 55990f780
  repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
  repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
  repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java fdf117a25
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/StatusReporter.java PRE-CREATION


Diff: https://reviews.apache.org/r/71025/diff/7/

Changes: https://reviews.apache.org/r/71025/diff/6-7/


Testing
---

**Unit tests**
Existing tests.

**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1292

**Volume tests**
- Measure performance with large data.

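+----------+----------+----------+------------------------+
| File     | Before   | After    | Configuration          |
+----------+----------+----------+------------------------+
| smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
| (2.2 MB) |          |          |                        |
+----------+----------+----------+------------------------+
| largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |
| (40 MB)  |          |          |                        |
+----------+----------+----------+------------------------+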


Thanks,

Ashutosh Mestry



Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-02-19 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

(Updated Feb. 20, 2020, 4:53 a.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include:
- Refactored classes into separate files.
- New StatusReporting class.


Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320


Repository: atlas


Description
---

**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within 
_AtlasImportRequest_.
- Existing import implementation continues to function as before. This is 
maintained for backward compatibility.
- New implementation supports an additional, more memory-efficient zip format 
(_ZipDirect_). This drastically reduces the memory requirement during import.
- The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of 
_JanusGraph_, thereby achieving high ingest rates.
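
The description does not spell out how _ZipDirect_ lowers memory use. One common way to keep memory flat when reading an archive is to stream entries one at a time instead of loading them all up front; the sketch below illustrates that general technique with `java.util.zip` and is an assumption, not the _ZipSourceDirect_ implementation. All names in it are hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.BiConsumer;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Illustrative only: streams zip entries one by one; not the Atlas ZipSourceDirect.
final class ZipStreamingSketch {
    // Visits each entry in order; only the current entry's content is held in memory.
    static int forEachEntry(byte[] zipBytes, BiConsumer<String, String> visitor) {
        int count = 0;
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                // readAllBytes() stops at the end of the current entry.
                String content = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                visitor.accept(entry.getName(), content);
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Builds a tiny in-memory zip so the sketch can be exercised without files.
    static byte[] sampleZip(String... names) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(bytes)) {
            for (String name : names) {
                out.putNextEntry(new ZipEntry(name));
                out.write(("{\"name\":\"" + name + "\"}").getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

In a real import the input would be a file or request stream rather than a byte array; the byte array here only keeps the example self-contained.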

_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25
    }
}
```
Support for ZipDirect format:
_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25,
        "format": "zipDirect",
        "migration": "true"
    }
}
```


**CURL**
```
curl -v -X POST -u admin:admin \
  -H "Content-Type: multipart/form-data" \
  -H "Cache-Control: no-cache" \
  -F request=@./import-options.json \
  -F data=@./Default-3-pre.zip \
  http://localhost:21000/api/atlas/admin/import
```


Diffs (updated)
-

  
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 1a0d0ccea
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 0b3ede93f
  intg/src/main/java/org/apache/atlas/pc/WorkItemConsumer.java 9ba4bf4e3
  intg/src/main/java/org/apache/atlas/pc/WorkItemManager.java a7ba67cb0
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
  repository/src/main/java/org/apache/atlas/repository/impexp/AuditsWriter.java 55990f780
  repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 27001e3a9
  repository/src/main/java/org/apache/atlas/repository/impexp/ZipExportFileNames.java 351b47536
  repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java ca0bc415c
  repository/src/main/java/org/apache/atlas/repository/patches/UniqueAttributePatch.java 2b5811919
  repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java fdf117a25
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/StatusReporter.java PRE-CREATION
  repository/src/test/java/org/apache/atlas/repository/impexp/ImportServiceTest.java c14850f43
  repository/src/test/java/org/apache/atlas/repository/impexp/MigrationImportTest.java PRE-CREATION
  repository/src/test/java/org/apache/atlas/repository/impexp/StatusReporterTest.java PRE-CREATION
  repository/src/test/java/org/apache/atlas/repository/impexp/ZipDirectTest.java PRE-CREATION
  repository/src/test/java/org/apache/atlas/repository/impexp/ZipFileResourceTestUtils.java 0ffc3d595
  repository/src/test/resources/zip-direct-1.zip PRE-CREATION
  repository/src/test/resources/zip-direct-2.zip PRE-CREATION


Diff: https://reviews.apache.org/r/71025/diff/6/

Changes: https://reviews.apache.org/r/71025/diff/5-6/


Testing
---

**Unit tests**
Existing tests.

Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-02-17 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

(Updated Feb. 17, 2020, 5:54 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include:
- Additional documentation.
- Support for new zip file format.


Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320


Repository: atlas


Description (updated)
---

**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within 
_AtlasImportRequest_.
- Existing import implementation continues to function as before. This is 
maintained for backward compatibility.
- New implementation supports an additional, more memory-efficient zip format 
(_ZipDirect_). This drastically reduces the memory requirement during import.
- The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of 
_JanusGraph_, thereby achieving high ingest rates.

_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25
    }
}
```
Support for ZipDirect format:
_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25,
        "format": "zipDirect",
        "migration": "true"
    }
}
```


**CURL**
```
curl -v -X POST -u admin:admin \
  -H "Content-Type: multipart/form-data" \
  -H "Cache-Control: no-cache" \
  -F request=@./import-options.json \
  -F data=@./Default-3-pre.zip \
  http://localhost:21000/api/atlas/admin/import
```


Diffs (updated)
-

  
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 0b3ede93f
  intg/src/main/java/org/apache/atlas/pc/WorkItemConsumer.java 9ba4bf4e3
  intg/src/main/java/org/apache/atlas/pc/WorkItemManager.java a7ba67cb0
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
  repository/src/main/java/org/apache/atlas/repository/impexp/AuditsWriter.java 55990f780
  repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 27001e3a9
  repository/src/main/java/org/apache/atlas/repository/impexp/ZipExportFileNames.java 351b47536
  repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java ca0bc415c
  repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java c536f3b86
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java fdf117a25
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 746193188
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
  repository/src/test/java/org/apache/atlas/repository/impexp/ImportServiceTest.java c14850f43
  repository/src/test/java/org/apache/atlas/repository/impexp/MigrationImportTest.java PRE-CREATION
  repository/src/test/java/org/apache/atlas/repository/impexp/ZipDirectTest.java PRE-CREATION
  repository/src/test/java/org/apache/atlas/repository/impexp/ZipFileResourceTestUtils.java 0ffc3d595
  repository/src/test/resources/zip-direct-1.zip PRE-CREATION
  repository/src/test/resources/zip-direct-2.zip PRE-CREATION


Diff: https://reviews.apache.org/r/71025/diff/5/

Changes: https://reviews.apache.org/r/71025/diff/4-5/


Testing
---

**Unit tests**
Existing tests.

**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1292

**Volume tests**
- Measure performance with large data.

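+----------+--------+---------+------------------------+
| File     | Before | After   | Configuration          |
+----------+--------+---------+------------------------+
| smalldb  | 6 min  | 2 min   | Shards: 4, Threads: 8  |
| (2.2 MB) |        |         |                        |
+----------+--------+---------+------------------------+
| largedb  | 3 hrs  | 10 mins | Shards: 4, Threads: 16 |
| (40 MB)  |        |         |                        |
+----------+--------+---------+------------------------+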

Re: Review Request 71025: Import Service: Support Concurrent Ingest

2020-01-23 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

(Updated Jan. 23, 2020, 5:30 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include:
- Rebased with latest version.
- Added new method that uses batch size to commit.
- Refactoring.


Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320


Repository: atlas


Description
---

**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within 
_AtlasImportRequest_.

_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25
    }
}
```

**CURL**
```
curl -v -X POST -u admin:admin \
  -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" \
  -F request=@./import-options.json -F data=@./Default-3-pre.zip \
  http://localhost:21000/api/atlas/admin/import
```
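
The `import-options.json` file passed as `request=@./import-options.json` can be generated rather than written by hand. Below is a minimal Python sketch, assuming only the `numWorkers` and `batchSize` options shown in this description; the helper names `build_import_options` and `write_import_options` are hypothetical and not part of Atlas.

```python
import json


def build_import_options(num_workers=8, batch_size=25):
    """Return the request body shown above as a plain dict."""
    return {"options": {"numWorkers": num_workers, "batchSize": batch_size}}


def write_import_options(path="import-options.json", **kwargs):
    """Write the options to the file that curl uploads as the 'request' part."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(build_import_options(**kwargs), fh, indent=4)
    return path
```

Calling `write_import_options(num_workers=16)` would produce a file suitable for the curl command above with a different worker count.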


Diffs (updated)
-

  
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 0b3ede93f
  intg/src/main/java/org/apache/atlas/pc/WorkItemConsumer.java 9ba4bf4e3
  intg/src/main/java/org/apache/atlas/pc/WorkItemManager.java a7ba67cb0
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
  repository/src/main/java/org/apache/atlas/repository/impexp/AuditsWriter.java 55990f780
  repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 928c70dba
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 25284e92f
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8


Diff: https://reviews.apache.org/r/71025/diff/4/

Changes: https://reviews.apache.org/r/71025/diff/3-4/


Testing
---

**Unit tests**
Existing tests.

**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1292

**Volume tests**
- Measure performance with large data.

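+----------+--------+---------+------------------------+
| File     | Before | After   | Configuration          |
+----------+--------+---------+------------------------+
| smalldb  | 6 min  | 2 min   | Shards: 4, Threads: 8  |
| (2.2 MB) |        |         |                        |
+----------+--------+---------+------------------------+
| largedb  | 3 hrs  | 10 mins | Shards: 4, Threads: 16 |
| (40 MB)  |        |         |                        |
+----------+--------+---------+------------------------+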


Thanks,

Ashutosh Mestry



Re: Review Request 71025: Import Service: Support Concurrent Ingest

2019-07-24 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

(Updated July 24, 2019, 11:05 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include:
- Changed the signature for accepting results to use an interface instead of a 
concrete class.
- Switched to _ConcurrentLinkedQueue_ to reduce locking during queue updates.
- Updated performance metrics after the changes.
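
The first two items can be illustrated with a small Python sketch. It is an analogy rather than the Atlas code: `ResultSink`, `QueueSink`, `worker`, and `collect` are hypothetical names, and `collections.deque` (whose `append` is thread-safe) stands in for Java's `ConcurrentLinkedQueue`.

```python
import threading
from collections import deque
from typing import Protocol


class ResultSink(Protocol):
    """Hypothetical interface: anything that can accept a result."""

    def add(self, result: str) -> None: ...


class QueueSink:
    """Collects results in a deque; appends need no explicit lock."""

    def __init__(self) -> None:
        self.results = deque()

    def add(self, result: str) -> None:
        self.results.append(result)


def worker(sink: ResultSink, start: int, count: int) -> None:
    # The worker depends only on the interface, not on QueueSink itself.
    for i in range(start, start + count):
        sink.add(f"guid-{i}")


def collect(num_workers: int = 4, per_worker: int = 100) -> QueueSink:
    sink = QueueSink()
    threads = [
        threading.Thread(target=worker, args=(sink, n * per_worker, per_worker))
        for n in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sink
```

Because workers see only the interface, the collecting structure can be swapped without touching the worker code.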


Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320


Repository: atlas


Description
---

**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within 
_AtlasImportRequest_.

_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25
    }
}
```

**CURL**
```
curl -v -X POST -u admin:admin \
  -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" \
  -F request=@./import-options.json -F data=@./Default-3-pre.zip \
  http://localhost:21000/api/atlas/admin/import
```


Diffs (updated)
-

  
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 613a714ff
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 0b3ede93f
  intg/src/main/java/org/apache/atlas/pc/WorkItemConsumer.java 9ba4bf4e3
  intg/src/main/java/org/apache/atlas/pc/WorkItemManager.java a7ba67cb0
  repository/src/main/java/org/apache/atlas/repository/impexp/AuditsWriter.java 9bf30f116
  repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 3ded79842
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 2f330c093


Diff: https://reviews.apache.org/r/71025/diff/3/

Changes: https://reviews.apache.org/r/71025/diff/2-3/


Testing (updated)
---

**Unit tests**
Existing tests.

**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1292

**Volume tests**
- Measure performance with large data.

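+----------+--------+---------+------------------------+
| File     | Before | After   | Configuration          |
+----------+--------+---------+------------------------+
| smalldb  | 6 min  | 2 min   | Shards: 4, Threads: 8  |
| (2.2 MB) |        |         |                        |
+----------+--------+---------+------------------------+
| largedb  | 3 hrs  | 10 mins | Shards: 4, Threads: 16 |
| (40 MB)  |        |         |                        |
+----------+--------+---------+------------------------+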
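
For reference, the speedup implied by the table can be worked out directly. The Python sketch below uses only the Before and After figures above, converted to minutes; the `speedup` helper is illustrative.

```python
def speedup(before_min, after_min):
    """Ratio of the old import time to the new import time."""
    return before_min / after_min


# (before, after) in minutes, taken from the volume-test table.
timings = {"smalldb": (6, 2), "largedb": (3 * 60, 10)}

for name, (before, after) in timings.items():
    print(f"{name}: {speedup(before, after):.0f}x faster")
```

This gives a 3x improvement for smalldb and 18x for largedb, keeping in mind that the two rows use different thread counts.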


Thanks,

Ashutosh Mestry



Re: Review Request 71025: Import Service: Support Concurrent Ingest

2019-07-24 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

(Updated July 24, 2019, 4:58 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include:
- Added pre-commit build details.


Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320


Repository: atlas


Description
---

**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within 
_AtlasImportRequest_.

_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25
    }
}
```

**CURL**
```
curl -v -X POST -u admin:admin \
  -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" \
  -F request=@./import-options.json -F data=@./Default-3-pre.zip \
  http://localhost:21000/api/atlas/admin/import
```


Diffs
-

  
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 613a714ff
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 0b3ede93f
  repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 3ded79842
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 2f330c093


Diff: https://reviews.apache.org/r/71025/diff/2/


Testing (updated)
---

**Unit tests**
Existing tests.

**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1292

**Volume tests**
- Measure performance with large data.

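+----------+--------+---------+------------------------+
| File     | Before | After   | Configuration          |
+----------+--------+---------+------------------------+
| smalldb  | 6 min  | 3 min   | Shards: 4, Threads: 8  |
| (2.2 MB) |        |         |                        |
+----------+--------+---------+------------------------+
| largedb  | 3 hrs  | 20 mins | Shards: 4, Threads: 16 |
| (40 MB)  |        |         |                        |
+----------+--------+---------+------------------------+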


Thanks,

Ashutosh Mestry



Re: Review Request 71025: Import Service: Support Concurrent Ingest

2019-07-23 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

(Updated July 23, 2019, 9:02 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include:
- Prevent the case where an entity with the same uniqueAttribute is created by 
multiple threads.
- Updated result handling and reduced the payload generated by results.
- Refactored for clarity.
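
One way to picture the first item is a check-and-create step guarded by a lock, so that concurrent workers holding the same unique attribute value end up with a single entity. The Python sketch below is an illustration under that assumption; `UniqueCreateGuard` and `get_or_create` are hypothetical names, and the actual change may use a different mechanism.

```python
import threading


class UniqueCreateGuard:
    """Ensure only one thread creates the entity for a given unique value."""

    def __init__(self):
        self._lock = threading.Lock()
        self._guids = {}  # unique attribute value -> id of the created entity

    def get_or_create(self, unique_value, create):
        # One lock across check-and-create keeps the sketch simple;
        # a per-key lock would allow more concurrency.
        with self._lock:
            if unique_value not in self._guids:
                self._guids[unique_value] = create(unique_value)
            return self._guids[unique_value]
```

Every caller passing the same `unique_value` receives the id produced by the first call, and `create` runs only once per value.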


Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320


Repository: atlas


Description
---

**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within 
_AtlasImportRequest_.

_AtlasImportRequest_
```
{
    "options": {
        "numWorkers": 8,
        "batchSize": 25
    }
}
```

**CURL**
```
curl -v -X POST -u admin:admin \
  -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" \
  -F request=@./import-options.json -F data=@./Default-3-pre.zip \
  http://localhost:21000/api/atlas/admin/import
```


Diffs (updated)
-

  
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 613a714ff
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 0b3ede93f
  repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 3ded79842
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 2f330c093


Diff: https://reviews.apache.org/r/71025/diff/2/

Changes: https://reviews.apache.org/r/71025/diff/1-2/


Testing
---

**Unit tests**
Existing tests.

**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.

**Volume tests**
- Measure performance with large data.

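+----------+--------+---------+------------------------+
| File     | Before | After   | Configuration          |
+----------+--------+---------+------------------------+
| smalldb  | 6 min  | 3 min   | Shards: 4, Threads: 8  |
| (2.2 MB) |        |         |                        |
+----------+--------+---------+------------------------+
| largedb  | 3 hrs  | 20 mins | Shards: 4, Threads: 16 |
| (40 MB)  |        |         |                        |
+----------+--------+---------+------------------------+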


Thanks,

Ashutosh Mestry



Re: Review Request 71025: Import Service: Support Concurrent Ingest

2019-07-10 Thread Nikhil Bonte

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review216479
---


Ship it!




Ship It!

- Nikhil Bonte


On July 9, 2019, 4:42 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71025/
> ---
> 
> (Updated July 9, 2019, 4:42 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3320
> https://issues.apache.org/jira/browse/ATLAS-3320
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> - Use existing producer-consumer (PC) framework.
> - Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
> - Add support for configuring number of workers and batch size within 
> _AtlasImportRequest_.
> 
> _AtlasImportRequest_
> ```
> {
>     "options": {
>         "numWorkers": 8,
>         "batchSize": 25
>     }
> }
> ```
> 
> **CURL**
> ```
> curl -v -X POST -u admin:admin \
>   -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" \
>   -F request=@./import-options.json -F data=@./Default-3-pre.zip \
>   http://localhost:21000/api/atlas/admin/import
> ```
> 
> 
> Diffs
> -
> 
>   
>   graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 499e8d1af
>   intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 0b3ede93f
>   intg/src/main/resources/atlas-log4j.xml 4f74c2abb
>   repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 3ded79842
>   repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 2f330c093
> 
> 
> Diff: https://reviews.apache.org/r/71025/diff/1/
> 
> 
> Testing
> ---
> 
> **Unit tests**
> Existing tests.
> 
> **Functional tests**
> - Verified import for pre-1.0 and post-1.0 exported ZIP files.
> 
> **Volume tests**
> - Measure performance with large data.
> 
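> +----------+--------+---------+------------------------+
> | File     | Before | After   | Configuration          |
> +----------+--------+---------+------------------------+
> | smalldb  | 6 min  | 3 min   | Shards: 4, Threads: 8  |
> | (2.2 MB) |        |         |                        |
> +----------+--------+---------+------------------------+
> | largedb  | 3 hrs  | 20 mins | Shards: 4, Threads: 16 |
> | (40 MB)  |        |         |                        |
> +----------+--------+---------+------------------------+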
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Review Request 71025: Import Service: Support Concurrent Ingest

2019-07-08 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
---

Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320


Repository: atlas


Description
---

**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within 
_AtlasImportRequest_.
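
The approach above can be sketched in Python as a generic producer-consumer loop with a configurable worker count and batch size. This is an illustration only, not the Atlas implementation: `run_import`, the sentinel object, and the list standing in for the graph store are hypothetical.

```python
import queue
import threading


def run_import(entities, num_workers=8, batch_size=25):
    """One producer feeds a bounded queue; num_workers consumers drain it.

    Each consumer "commits" after batch_size items. The commit is simulated
    by appending the batch to a shared list.
    """
    work = queue.Queue(maxsize=num_workers * batch_size)
    committed = []                   # stands in for the graph store
    commit_lock = threading.Lock()
    stop = object()                  # sentinel telling a worker to exit

    def consume():
        batch = []
        while True:
            item = work.get()
            if item is stop:
                break
            batch.append(item)
            if len(batch) >= batch_size:
                with commit_lock:
                    committed.append(list(batch))
                batch.clear()
        if batch:                    # flush the final partial batch
            with commit_lock:
                committed.append(list(batch))

    workers = [threading.Thread(target=consume) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for entity in entities:          # producer: blocks while the queue is full
        work.put(entity)
    for _ in workers:                # one sentinel per worker
        work.put(stop)
    for w in workers:
        w.join()
    return committed
```

The bounded queue keeps the producer from running far ahead of the workers, which limits how many pending entities are held in memory at once.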


Diffs
-

  
  graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 499e8d1af
  intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 0b3ede93f
  intg/src/main/resources/atlas-log4j.xml 4f74c2abb
  repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 3ded79842
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 2f330c093


Diff: https://reviews.apache.org/r/71025/diff/1/


Testing
---

**Unit tests**
Existing tests.

**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.

**Volume tests**
- Measure performance with large data.

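+----------+--------+---------+------------------------+
| File     | Before | After   | Configuration          |
+----------+--------+---------+------------------------+
| smalldb  | 6 min  | 3 min   | Shards: 4, Threads: 8  |
| (2.2 MB) |        |         |                        |
+----------+--------+---------+------------------------+
| largedb  | 3 hrs  | 20 mins | Shards: 4, Threads: 16 |
| (40 MB)  |        |         |                        |
+----------+--------+---------+------------------------+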


Thanks,

Ashutosh Mestry