-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221351
-----------------------------------------------------------


intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 67 (patched)
<https://reviews.apache.org/r/72666/#comment310241>

    incrementCapacityBy is already an instance member (line #29 above), so it is unnecessary to pass it as a parameter to the methods at #67 and #79. Please review and update.


intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 73 (patched)
<https://reviews.apache.org/r/72666/#comment310242>

    This assumes that access is done in strictly ascending order, without any gaps. This is not a good assumption for a generic/reusable class implementation. It breaks with the following access pattern:

        FixedBufferList<MyObject> list = new FixedBufferList<>(10, 1);

        list.getForUpdate(0);
        list.getForUpdate(15);

    Please review my earlier comment and update to remove this assumption.

    Also, I suggest collapsing the following methods into one, named ensureCapacity():
     - request()
     - ensureCapacity(): it is not obvious what the return value of this method is, and it is critical for the caller at #74. All this noise and confusion can be avoided by simply collapsing this method into the previous method at #67.
     - instantiateItems(): this method is not callable from any other context, as it simply "adds()" to the buffer


webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
Line 168 (original), 175 (patched)
<https://reviews.apache.org/r/72666/#comment310243>

    sendNotifications() can be called from the same thread multiple times, for example when a transaction involves create/update/delete of entities; i.e. a graph transaction can call more than one of the following methods:
     - onEntitiesAdded()
     - onEntitiesUpdated()
     - onEntitiesDeleted()
     - onEntitiesPurged()
     - onClassificationsAdded()
     - onClassificationsUpdated()
     - onClassificationsDeleted()

    Each call would end up calling EntityNotificationSender.send() with the notification objects, which in turn stores the objects in a list.
    The second call in the same transaction will end up overwriting the EntityNotificationV2 instances in the list buffer, which will cause the notification objects stored by EntityNotificationSender.send() to be updated as well. This will result in incorrect notifications being sent. Please review this carefully and update.


- Madhan Neethiraj


On July 24, 2020, 5:18 a.m., Ashutosh Mestry wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> -----------------------------------------------------------
>
> (Updated July 24, 2020, 5:18 a.m.)
>
>
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
>
>
> Bugs: ATLAS-3878
>     https://issues.apache.org/jira/browse/ATLAS-3878
>
>
> Repository: atlas
>
>
> Description
> -------
>
> **Background**
> See JIRA for details.
>
> *Analysis*
> Using memory profiling tools, it was observed that a large number of notification objects were created. These stayed in memory and were later promoted to a higher generation, thereby taking even longer to be collected.
>
> **Approach**
> Using the fixed-buffer approach to address the problem of creating a large number of small objects.
>
> New *FixedBufferList*
> This is an encapsulation over *ArrayList*. During initial allocation, the list is populated with default values. Features:
> - Values are set on these pre-allocated objects by first doing a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulated list. This uses the state within the class to fetch the right length for the returned list.
>
> New *NamedFixedBufferList*
> Maintains a per-thread *FixedBufferList*. This is necessary since the list is now part of the class's state.
>
> Modified *EntityAuditListenerV2*
> Uses the new classes.
>
> Modified *EntityNotificationListener*
> Uses the new classes.
>
> **Verification**
> - Using the test setup, memory usage was observed over a period of 24 hrs.
> - Memory usage and object allocation were observed using a memory profiler.
>
>
> Diffs
> -----
>
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java PRE-CREATION
>   repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java 79527acfa
>   webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java a677b315c
>
>
> Diff: https://reviews.apache.org/r/72666/diff/8/
>
>
> Testing
> -------
>
> **Unit testing**
> Unit tests added for the new classes.
>
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using a custom script ingesting 100M entities.
>
> Memory usage stayed between 0 and 5% during the 24 hr period.
>
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements in a batch improve commit time and audit write time).
> - Throughput: ~1.2 M entities per hour, without any out-of-memory error.
>
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2067/
>
>
> Thanks,
>
> Ashutosh Mestry
>
>
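
For reference, the review comment on FixedBufferList line 73 asks for a single ensureCapacity() that does not assume gap-free, ascending access. The following is only a minimal sketch of that idea, not the class from the patch: the class name GapTolerantBuffer, the Supplier-based factory, the capacity() accessor, and the (factory, initialCapacity, incrementCapacityBy) constructor are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for FixedBufferList; names and constructor shape are assumptions.
class GapTolerantBuffer<T> {
    private final Supplier<T> factory;          // creates the pre-allocated default items
    private final int         incrementCapacityBy;
    private final List<T>     buffer = new ArrayList<>();
    private int               length = 0;       // highest index handed out so far, plus one

    GapTolerantBuffer(Supplier<T> factory, int initialCapacity, int incrementCapacityBy) {
        if (incrementCapacityBy <= 0) {
            throw new IllegalArgumentException("incrementCapacityBy must be positive");
        }
        this.factory             = factory;
        this.incrementCapacityBy = incrementCapacityBy;
        ensureCapacity(initialCapacity);
    }

    // Returns the pre-allocated item at index, growing the buffer first if needed.
    T getForUpdate(int index) {
        ensureCapacity(index + 1);
        length = Math.max(length, index + 1);
        return buffer.get(index);
    }

    // View over the slots handed out so far (indexes 0 .. length-1).
    List<T> toList() {
        return buffer.subList(0, length);
    }

    int capacity() {
        return buffer.size();
    }

    // Single growth method: uses the instance member incrementCapacityBy and
    // keeps adding items until the requested size is covered, so a jump from
    // index 0 to index 15 is handled the same way as sequential access.
    private void ensureCapacity(int required) {
        while (buffer.size() < required) {
            for (int i = 0; i < incrementCapacityBy; i++) {
                buffer.add(factory.get());
            }
        }
    }

    public static void main(String[] args) {
        GapTolerantBuffer<StringBuilder> list = new GapTolerantBuffer<>(StringBuilder::new, 10, 1);
        list.getForUpdate(0).append("first");
        list.getForUpdate(15).append("sixteenth");   // gap between index 0 and 15
        System.out.println(list.capacity());         // prints 16
    }
}
```

In this sketch the growth logic lives in one private method with no return value, so callers never have to interpret a result, and the increment is read from the instance field rather than passed as a parameter.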

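
The overwriting concern raised for EntityNotificationListenerV2 can be reproduced in isolation. The sketch below is a simplified model, not Atlas code: Notification, fill(), and collect() are hypothetical names, a one-slot static list plays the role of the reused buffer, and a local list plays the role of the objects retained by the sender. Copying each item before storing it is shown only as one possible mitigation.

```java
import java.util.ArrayList;
import java.util.List;

class ReusedBufferAliasingDemo {
    // Hypothetical mutable notification; stands in for EntityNotificationV2.
    static final class Notification {
        String operation;
        String guid;

        Notification() { }

        Notification(Notification other) {   // copy constructor
            this.operation = other.operation;
            this.guid      = other.guid;
        }

        @Override
        public String toString() {
            return operation + ":" + guid;
        }
    }

    // Pre-allocated buffer whose single item is reused on every call.
    private static final List<Notification> BUFFER = new ArrayList<>();
    static {
        BUFFER.add(new Notification());
    }

    // Overwrites slot 0 in place and returns a view over the reused object.
    static List<Notification> fill(String operation, String guid) {
        Notification n = BUFFER.get(0);
        n.operation = operation;
        n.guid      = guid;
        return BUFFER.subList(0, 1);
    }

    // Simulates two listener calls in one transaction; 'pending' models the
    // list the sender keeps until the transaction completes.
    static List<Notification> collect(boolean copyBeforeStoring) {
        List<Notification> pending = new ArrayList<>();
        String[][] calls = { { "ENTITY_CREATE", "guid-1" }, { "ENTITY_UPDATE", "guid-2" } };

        for (String[] call : calls) {
            for (Notification n : fill(call[0], call[1])) {
                pending.add(copyBeforeStoring ? new Notification(n) : n);
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        System.out.println(collect(false)); // prints [ENTITY_UPDATE:guid-2, ENTITY_UPDATE:guid-2]
        System.out.println(collect(true));  // prints [ENTITY_CREATE:guid-1, ENTITY_UPDATE:guid-2]
    }
}
```

Without the copy, both stored entries point at the same reused object, so the first notification is silently replaced by the second. The copy avoids that but reintroduces per-notification allocation, which is the cost the fixed-buffer approach was meant to remove, so the trade-off would need to be weighed against the patch's goals.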