-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57495/
-----------------------------------------------------------

Review request for atlas, Madhan Neethiraj and Sarath Subramanian.


Bugs: ATLAS-1646
    https://issues.apache.org/jira/browse/ATLAS-1646


Repository: atlas


Description
-------

**Background**
The existing implementation of the Export REST API uses *ByteArrayOutputStream* 
to hold the output during zip file creation. This puts pressure on memory when 
handling large data. Also, the data transfer does not start until the entire 
export is done. This is less than ideal for performance.

**Solution**
- Pass *ServletOutputStream* to *ZipSink*.
  - This improves memory usage, as memory is no longer held by 
*ByteArrayOutputStream*. 
  - Removes the additional copy from *ByteArrayOutputStream* to 
*ServletOutputStream*.
  - Simplifies *ZipSink*.
- Clear internal data structures after the operation completes.
  - This helps free memory, though only modestly. There is some 
improvement in large transfers.
- *ExportService.ExportContext.guidsToProcess* changed from *List* to *Set*, 
which removes the sequential lookup.
- Data transfer from server to client starts much sooner. The client is able to 
interrupt the transfer if needed.
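
The streaming idea above can be sketched roughly as follows. This is a minimal illustration, not the actual *ZipSink* from the patch: the class name `StreamingZipSink`, the `add` method, and the `.json` entry naming are hypothetical. It accepts any `OutputStream`, so in a servlet the caller would pass the response's output stream instead of an in-memory buffer.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch; names and entry layout are assumptions, not the Atlas code.
class StreamingZipSink implements AutoCloseable {
    private final ZipOutputStream zip;

    // 'target' is written to directly; nothing is accumulated in a byte array first.
    StreamingZipSink(OutputStream target) {
        this.zip = new ZipOutputStream(target);
    }

    // Writes one zip entry named "<name>.json" and flushes it to the target stream.
    void add(String name, String json) throws IOException {
        zip.putNextEntry(new ZipEntry(name + ".json"));
        zip.write(json.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
        zip.flush(); // hand completed entries to the underlying stream right away
    }

    // Finishes the zip archive and closes the underlying stream.
    @Override
    public void close() throws IOException {
        zip.close();
    }
}
```

Because each entry is written and flushed as it is produced, the receiving side can start reading before the export finishes, and memory held by the sink does not grow with the size of the export.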


Diffs
-----

  intg/src/main/java/org/apache/atlas/model/impexp/AtlasExportResult.java 
e6a967e 
  webapp/src/main/java/org/apache/atlas/web/resources/AdminResource.java 
31a4cf9 
  webapp/src/main/java/org/apache/atlas/web/resources/ExportService.java 
c1891e0 
  webapp/src/main/java/org/apache/atlas/web/resources/ZipSink.java 2e4cb01 


Diff: https://reviews.apache.org/r/57495/diff/1/


Testing
-------

Profiled using *jmap* & *Eclipse MAT*, verified using *YourKit*.

Verified *FetchTypes*: *full* and *connected*.

Memory usage: Stays constant under prolonged use. Verified over ~3 hrs of 
continuous runs using medium and large database exports.

Performance improvement:

Date | File Size | No. of Entities | Duration (mins) |
-----|-----------|-----------------|-----------------|
3/08 |   180 MB  |          202930 |              22 |
3/09 |   180 MB  |          202930 |              19 |

About 14% improvement (22 mins down to 19 mins).


Thanks,

Ashutosh Mestry
