-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57495/#review168686
-----------------------------------------------------------


Ship it!

- Madhan Neethiraj


On March 10, 2017, 4:09 a.m., Ashutosh Mestry wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57495/
> -----------------------------------------------------------
> 
> (Updated March 10, 2017, 4:09 a.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-1503
>     https://issues.apache.org/jira/browse/ATLAS-1503
> 
> 
> Repository: atlas
> 
> 
> Description
> -------
> 
> **Background**
> The existing implementation of the Export REST API uses *ByteArrayOutputStream* 
> while creating the output zip file. This puts pressure on memory when handling 
> large data. Also, the data transfer does not start until the entire export is 
> done. Both effects hurt performance.
> 
> **Solution**
> - Passing *ServletOutputStream* to *ZipSink*.
>   - This improves memory usage as memory does not get held up by 
> *ByteArrayOutputStream*. 
>   - Removes the additional copy from *ByteArrayOutputStream* to 
> *ServletOutputStream*.
>   - Simplifies *ZipSink*.
> - Clear internal data structures after operation completion.
>   - This helps a little in freeing the memory used. There is some 
> improvement in large transfers.
> - *ExportService.ExportContext.guidsToProcess*: replaced the sequential 
> *List* lookup with a *Set* lookup.
> - Data transfer from server to client starts much sooner. Client is able to 
> interrupt the progress if needed.
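
For readers of this thread, the streaming idea in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in the diff: `StreamingZipSink`, `add`, and `countEntries` are hypothetical names, and a `ByteArrayOutputStream` stands in for the servlet response stream only so that the sketch runs outside a container.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical stand-in for ZipSink: entries go straight to the caller's stream.
final class StreamingZipSink implements AutoCloseable {
    private final ZipOutputStream zip;

    StreamingZipSink(OutputStream target) {
        // In a servlet, target would be response.getOutputStream() (assumption);
        // the sink itself keeps no intermediate byte buffer.
        this.zip = new ZipOutputStream(target);
    }

    // Adds one entry (for example, one entity serialized as JSON) to the archive.
    void add(String entryName, String json) throws IOException {
        zip.putNextEntry(new ZipEntry(entryName));
        zip.write(json.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
        zip.flush(); // hand the finished entry's bytes to the underlying stream
    }

    @Override
    public void close() throws IOException {
        zip.close(); // writes the zip central directory and closes the target
    }

    // Demo helper: counts the entries of a zip archive held in memory.
    static int countEntries(byte[] zipBytes) throws IOException {
        int count = 0;
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            while (in.getNextEntry() != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the response stream, used here only to inspect the result.
        ByteArrayOutputStream fakeResponse = new ByteArrayOutputStream();
        try (StreamingZipSink sink = new StreamingZipSink(fakeResponse)) {
            sink.add("entity-1.json", "{\"guid\":\"1\"}");
            sink.add("entity-2.json", "{\"guid\":\"2\"}");
        }
        System.out.println("entries written: " + countEntries(fakeResponse.toByteArray()));
    }
}
```

Because each entry is written to the target as soon as it is closed, the client can start receiving data before the export finishes, which is the behavior the last bullet describes.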
> 
> 
> Diffs
> -----
> 
>   intg/src/main/java/org/apache/atlas/model/impexp/AtlasExportResult.java 
> e6a967e 
>   webapp/src/main/java/org/apache/atlas/web/resources/AdminResource.java 
> 31a4cf9 
>   webapp/src/main/java/org/apache/atlas/web/resources/ExportService.java 
> c1891e0 
>   webapp/src/main/java/org/apache/atlas/web/resources/ZipSink.java 2e4cb01 
> 
> 
> Diff: https://reviews.apache.org/r/57495/diff/2/
> 
> 
> Testing
> -------
> 
> Profiled using *jmap* & *Eclipse MAT*, verified using *YourKit*.
> 
> Verified: *FetchTypes* viz. *full* and *connected*.
> 
> Memory usage: Stays constant on prolonged use. Verified ~3 hrs of continuous 
> runs using medium and large database exports.
> 
> Performance improvement:
> Date | File Size | No. of Entities | Duration (mins) |
> -----|-----------|-----------------|-----------------|
> 3/02 |   180 MB  |          202930 |              29 |
> 3/08 |   180 MB  |          202930 |              22 |
> 3/09 |   180 MB  |          202930 |              19 |
> 
> About 15% improvement with list & set combined data structures.
> About 30% improvement by eliminating use of *ByteArrayOutputStream*.
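
A note on the "list & set combined" structure, for anyone reading without the diff open: one plausible shape is a list that preserves processing order paired with a set that answers membership checks. The sketch below only illustrates that pattern; `GuidQueue` and its methods are hypothetical names and are not taken from *ExportService*.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical work queue: the List keeps processing order,
// the Set answers "was this guid already queued?" without scanning the List.
final class GuidQueue {
    private final List<String> ordered = new ArrayList<>();
    private final Set<String> seen = new HashSet<>();
    private int cursor = 0; // index of the next guid to hand out

    // Returns true if the guid was newly queued, false if it was already known.
    boolean add(String guid) {
        if (!seen.add(guid)) { // hash lookup instead of List.contains()
            return false;
        }
        ordered.add(guid);
        return true;
    }

    boolean hasNext() {
        return cursor < ordered.size();
    }

    // Returns the next guid in insertion order; it stays in 'seen',
    // so adding it again later is rejected.
    String next() {
        return ordered.get(cursor++);
    }

    // Releases both structures once the operation has completed.
    void clear() {
        ordered.clear();
        seen.clear();
        cursor = 0;
    }

    public static void main(String[] args) {
        GuidQueue queue = new GuidQueue();
        queue.add("guid-a");
        queue.add("guid-b");
        queue.add("guid-a"); // duplicate, ignored
        while (queue.hasNext()) {
            System.out.println(queue.next());
        }
        queue.clear();
    }
}
```

With a plain list, every duplicate check walks the whole list, so the cost grows with the number of entities already queued; the set keeps that check at roughly constant time.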
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>
