> On Jan. 7, 2021, 6:17 a.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/repository/audit/HBaseBasedAuditRepository.java
> > Lines 302 (patched)
> > <https://reviews.apache.org/r/73119/diff/2/?file=2243934#file2243934line302>
> >
> >     Does use of second scan result in significant performance improvements 
> > i.e. avoiding population of event.details and event.entity during the first 
> > scan? If not, it will be simper to move #320 - #326 to #293 and avoid this 
> > second scan.

Yes, there is significant performance improvement with the two scan approach. 
The first scan is a full scan and it is done with limited columns. The second 
scan is a limited row scan with all the columns. If we want to avoid the second 
scan then the first scan has to be a full scan with all the columns which would 
be very expensive. On a setup with 1 million audit entries, spread across a 
thousand entities, the two scan approach was 5X faster than the single scan 
approach.


> On Jan. 7, 2021, 6:17 a.m., Madhan Neethiraj wrote:
> > webapp/src/main/java/org/apache/atlas/web/rest/EntityREST.java
> > Lines 802 (patched)
> > <https://reviews.apache.org/r/73119/diff/2/?file=2243937#file2243937line802>
> >
> >     Why not add 'sortBy' query-param to existing REST API method, instead 
> > of adding a new method? Also, consider adding 'sortOrder' query-param - 
> > with default value as 'asc'.

Added the 'sortorder' as a query-param in the new API. 
There are fundamental differences between the two APIs
 - Offset in the existing rest API is achieved by startkey parameter where 
hbase row key is expected. New API takes offset in the form of an integer.
 - Performance and mechanism are different. New API is done with in-memory sort 
and therefore is 10X slower than the old one. If we mix the APIs it would be 
very difficult and confusing to present the benchmarking.
 - Old API support filter by column, but the new one does not.

Also, to differentiate the behavior based on parameters would be very confusing 
from docs perspective.


- Deep


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73119/#review222416
-----------------------------------------------------------


On Jan. 6, 2021, 7:06 p.m., Deep Singh wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73119/
> -----------------------------------------------------------
> 
> (Updated Jan. 6, 2021, 7:06 p.m.)
> 
> 
> Review request for atlas, Ashutosh Mestry, Madhan Neethiraj, Sarath 
> Subramanian, and Sidharth Mishra.
> 
> 
> Bugs: ATLAS-4094
>     https://issues.apache.org/jira/browse/ATLAS-4094
> 
> 
> Repository: atlas
> 
> 
> Description
> -------
> 
> Created a new entity audit API with Sorting functionality.
> 
> /api/atlas/v2/entity/{entity-guid}/audit/sortby/{sort-column}?offset=2&count=5
> 
> sort-column may have 3 values "user", "action", or "timestamp".
> 
> HBase does not support query with sorted results. To support this API 
> inmemory sort has to be performed.
> Audit entry can potentially have entire entity dumped into it. Loading entire 
> audit entries for an entity can be memory intensive. Therefore we load audit 
> entries with limited columns first, perform sort on this light weight list, 
> then get the relevant section by removing offsets and reducing to limits. 
> With this reduced list we create MultiRowRangeFilter and then again scan the 
> table to get all the columns from the table this time.
> 
> 
> Diffs
> -----
> 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/CassandraBasedAuditRepository.java
>  8a453fd43 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditRepository.java
>  07784d1c4 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/HBaseBasedAuditRepository.java
>  9fca74470 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/InMemoryEntityAuditRepository.java
>  900df0205 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/NoopEntityAuditRepository.java
>  ef9e259ea 
>   webapp/src/main/java/org/apache/atlas/web/rest/EntityREST.java 0d6d0c845 
> 
> 
> Diff: https://reviews.apache.org/r/73119/diff/2/
> 
> 
> Testing
> -------
> 
> Manual testing was done.
> 
> 
> Thanks,
> 
> Deep Singh
> 
>

Reply via email to