---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57495/#review168686
---
Ship it!
- Madhan Neethiraj
On March 10, 2017, 4:09 a.m., Ashutosh Mestry wrote:
>
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57495/
> ---
>
> (Updated March 10, 2017, 4:09 a.m.)
>
>
> Review request for atlas, Madhan Neethiraj and Sarath Subramanian.
>
>
> Bugs: ATLAS-1503
> https://issues.apache.org/jira/browse/ATLAS-1503
>
>
> Repository: atlas
>
>
> Description
> ---
>
> **Background**
> The existing implementation of the Export REST API uses a
> *ByteArrayOutputStream* during output zip file creation. This puts pressure
> on memory when handling large data. Also, the data transfer does not start
> until the entire export is done. This is less than ideal for performance.
>
> **Solution**
> - Pass *ServletOutputStream* to *ZipSink*.
>   - This improves memory usage, as memory is no longer held up by
>     *ByteArrayOutputStream*.
>   - Removes the additional copy from *ByteArrayOutputStream* to
>     *ServletOutputStream*.
>   - Simplifies *ZipSink*.
> - Clear internal data structures after the operation completes.
>   - This helps, though not much, with freeing up used memory. There is some
>     improvement in large transfers.
> - *ExportService.ExportContext.guidsToProcess*: replaced sequential lookup in
>   a *List* with lookup in a *Set*.
> - Data transfer from server to client starts much sooner. The client is able
>   to interrupt the transfer if needed.
>
>
> Diffs
> ---
>
>   intg/src/main/java/org/apache/atlas/model/impexp/AtlasExportResult.java e6a967e
>   webapp/src/main/java/org/apache/atlas/web/resources/AdminResource.java 31a4cf9
>   webapp/src/main/java/org/apache/atlas/web/resources/ExportService.java c1891e0
>   webapp/src/main/java/org/apache/atlas/web/resources/ZipSink.java 2e4cb01
>
>
> Diff: https://reviews.apache.org/r/57495/diff/2/
>
>
> Testing
> ---
>
> Profiled using *jmap* & *Eclipse MAT*, verified using *YourKit*.
>
> Verified *FetchTypes*: *full* and *connected*.
>
> Memory usage: Stays constant under prolonged use. Verified over ~3 hrs of
> continuous runs with medium and large database exports.
>
> Performance improvement:
> Date | File Size | No. of Entities | Duration |
> ---|---|---|---|
> 3/02 | 180 MB | 202930 | 29 mins |
> 3/08 | 180 MB | 202930 | 22 mins |
> 3/09 | 180 MB | 202930 | 19 mins |
>
> About 15% improvement with list & set combined data structures.
> About 30% improvement by eliminating use of *ByteArrayOutputStream*.
>
>
> Thanks,
>
> Ashutosh Mestry
>
>
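
For readers following the quoted request: the core idea in the Solution section, handing the response stream to the zip writer instead of buffering the archive in a *ByteArrayOutputStream*, can be sketched roughly as below. This is an illustrative sketch only, not the code under review. The class name `StreamingZipSink` and its `add` method are hypothetical; only `java.util.zip` and `java.io` classes from the JDK are used, and the destination is assumed to be any `OutputStream` (in a servlet it would presumably be the response's output stream).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical stand-in for ZipSink; NOT the actual Atlas class.
final class StreamingZipSink implements AutoCloseable {
    private final ZipOutputStream zip;

    // The caller supplies the destination stream, so the sink never holds
    // the whole archive in memory.
    StreamingZipSink(OutputStream destination) {
        this.zip = new ZipOutputStream(destination);
    }

    // Writes one named entry, then flushes so the bytes written so far are
    // pushed to the destination instead of waiting for the export to finish.
    void add(String entryName, String json) throws IOException {
        zip.putNextEntry(new ZipEntry(entryName));
        zip.write(json.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
        zip.flush();
    }

    // Finishes the archive (writes the central directory) and closes the
    // destination stream.
    @Override
    public void close() throws IOException {
        zip.close();
    }
}
```

With this shape the only buffering left is whatever the destination stream itself does, which is consistent with the constant memory usage reported in the Testing section, though the actual patch may be structured differently.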