Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-23 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/
---

(Updated July 24, 2020, 5:18 a.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include: Addressed review comments.


Bugs: ATLAS-3878
https://issues.apache.org/jira/browse/ATLAS-3878


Repository: atlas


Description
---

**Background**
See JIRA for details.

*Analysis* Using memory profiling tools, it was observed that a large number of 
notification objects were created. These stayed in memory and were later 
promoted to a higher generation, thereby taking even longer to be collected.

**Approach**
Using the fixed-buffer approach to address the problem of creating a large number 
of small objects.

New *FixedBufferList* This is an encapsulation over *ArrayList*. During initial 
allocation, the list is populated with default values. Features:
- Values are set on these pre-allocated objects by first doing a 
*get* on the element and then assigning values to it.
- *toList* fetches the sub-list from the encapsulating list. This uses the 
state within the class to determine the right length for the returned list.

New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
necessary since the list is now part of the class's state.
Modified *EntityAuditListenerV2* Uses the new classes.
Modified *EntityNotificationListenerV2* Uses the new classes.
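
The fixed-buffer idea described above can be illustrated with a minimal sketch. This is not the class from the patch: the name `FixedBufferSketch`, the `Supplier`-based item factory, and the methods `next()` and `capacity()` are assumptions made for illustration only. It shows objects being allocated once, handed out for update, and reused after `toList()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of a fixed-buffer list; names and API are illustrative.
class FixedBufferSketch<T> {
    private final ArrayList<T> buffer = new ArrayList<>();
    private final Supplier<T>  itemFactory; // assumption: items are created by a factory
    private final int          incrementBy; // how many slots to add when the buffer is full
    private int                length = 0;  // number of slots handed out since the last toList()

    FixedBufferSketch(Supplier<T> itemFactory, int initialCapacity, int incrementBy) {
        if (incrementBy <= 0) {
            throw new IllegalArgumentException("incrementBy must be positive");
        }
        this.itemFactory = itemFactory;
        this.incrementBy = incrementBy;
        ensureCapacity(initialCapacity);
    }

    // Returns the pre-allocated object in the next free slot so the caller can assign values to it.
    T next() {
        ensureCapacity(length + 1);
        return buffer.get(length++);
    }

    // Returns the used slots and marks every slot as free again.
    // The copy is a choice of this sketch; the objects themselves are not re-created.
    List<T> toList() {
        List<T> ret = new ArrayList<>(buffer.subList(0, length));
        length = 0;
        return ret;
    }

    int capacity() {
        return buffer.size();
    }

    // Grows the buffer in steps of incrementBy, pre-populating every new slot.
    private void ensureCapacity(int capacity) {
        while (buffer.size() < capacity) {
            for (int i = 0; i < incrementBy; i++) {
                buffer.add(itemFactory.get());
            }
        }
    }
}
```

Because slots are reused, an object returned by `next()` after a `toList()` call still carries whatever values were assigned to it earlier, so callers are expected to overwrite every field they rely on.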

**Verification**
- Using the test setup, memory usage was observed over a period of 24 hrs. 
- Memory usage and object allocation were observed using a memory profiler.


Diffs
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
 79527acfa 
  
webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
 a677b315c 


Diff: https://reviews.apache.org/r/72666/diff/7/


Testing (updated)
---

**Unit testing**
Unit tests added for the new classes.

**Volume testing**
Setup:
- Node: Threads 40, Core: 40, Allocated Memory: 12 GB
- Multiple Kafka queues ingesting data.
- Bulk entity creation using custom script ingesting 100M entities.

Memory usage stayed between 0 and 5% during the 24 hr period.

With:
- Workers: 64
- Batch size: 50 (fewer elements per batch improve commit time and audit write 
time).
- Throughput: ~1.2 M entities per hour, without out-of-memory errors.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2067/


Thanks,

Ashutosh Mestry



Re: Review Request 72646: ATLAS-3876 : Relationship Search API not showing correct approximateCount

2020-07-23 Thread Pinal Shah


> On July 23, 2020, 10:50 p.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/discovery/EntityDiscoveryService.java
> > Line 607 (original), 616 (patched)
> > 
> >
> > what if out/in edges size is a lot, do you bring everything into memory 
> > just to get size? 
> > 
> > Do you need entire edges in the list or maybe we can maintain a counter 
> > and increment?

Yes Sarath, you are right, but the approximate count is the count without 
limit/offset.
I brought the edges into memory to keep only the 'ACTIVE' ones when the 
excludeDeleted flag is set.
The other way is to fire another query with an 'ACTIVE' filter.
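
The counter-based alternative raised in the review can be sketched as follows. This is an illustration only: `ActiveEdgeCounter` and its use of plain state strings in place of graph edges are hypothetical, and the real code would iterate over edge objects instead.

```java
import java.util.Iterator;

// Hypothetical stand-in for edge iteration; only the edge state matters here.
class ActiveEdgeCounter {
    // Counts matching edges while iterating, without collecting them into a list.
    static long countEdges(Iterator<String> edgeStates, boolean excludeDeleted) {
        long count = 0;
        while (edgeStates.hasNext()) {
            String state = edgeStates.next();
            // When excludeDeleted is set, only edges in state "ACTIVE" are counted.
            if (!excludeDeleted || "ACTIVE".equals(state)) {
                count++;
            }
        }
        return count;
    }
}
```

Memory use stays constant regardless of the number of edges, at the cost of still visiting every edge once.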


> On July 23, 2020, 10:50 p.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/discovery/EntityDiscoveryService.java
> > Lines 682 (patched)
> > 
> >
> > Constants.GUID_PROPERTY_KEY).value() can be null (mostly not, but some 
> > bad vertices). Consider checking for null value of guid.
> > 
> > Also, since you already have the vertex, do we need the additional 
> > 'endVerticesGuid' list? Can we extract 'guid' from Vertex 'v' and add it to 
> > 'resultList' in the same loop? 
> > 
> > resultList.add(entityRetriever.toAtlasEntityHeader(endVertexGuid, 
> > attributes));
> > 
> > (or)
> > 
> > Maybe map Vertex to AtlasVertex and directly call:
> > 
> > resultList.add(entityRetriever.toAtlasEntityHeader(atlasVertex, 
> > attributes));

I have added (v != null && 
v.property(Constants.GUID_PROPERTY_KEY).isPresent()) for the null check.
Correct, we don't need the extra list; I will address it.


- Pinal


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72646/#review221343
---


On July 6, 2020, 9:14 a.m., Pinal Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72646/
> ---
> 
> (Updated July 6, 2020, 9:14 a.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3876
> https://issues.apache.org/jira/browse/ATLAS-3876
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Issue:**
> The relationship API doesn't provide an approximate count of the related entities in 
> the response.
> 
> **Workaround:**
> Get the total count of entities related to the given entity, irrespective of the 
> offset/limit.
> 
> 
> Also, this patch includes an **improvement in the time taken to fetch related 
> entities**.
> Average time taken for the API to search relationship entities having **5000 
> end vertices** with limit **500**:
> Before: 9 seconds
> After applying this patch: 3 seconds
> 
> 
> Diffs
> -
> 
>   
> repository/src/main/java/org/apache/atlas/discovery/EntityDiscoveryService.java
>  4b9564295 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
>  863a00350 
>   repository/src/main/java/org/apache/atlas/util/SearchPredicateUtil.java 
> 5069d78c8 
> 
> 
> Diff: https://reviews.apache.org/r/72646/diff/2/
> 
> 
> Testing
> ---
> 
> Manually tested
> Precommit : https://builds.apache.org/job/PreCommit-ATLAS-Build-Test/2011 
> (Failed in Impala build)
> 
> 
> Thanks,
> 
> Pinal Shah
> 
>



Re: Review Request 72567: ATLAS-3782 : Support NOT_CONTAINS operator in basic search

2020-07-23 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72567/#review221344
---


Ship it!




Ship It!

- Sarath Subramanian


On June 30, 2020, 6:09 a.m., Pinal Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72567/
> ---
> 
> (Updated June 30, 2020, 6:09 a.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3782
> https://issues.apache.org/jira/browse/ATLAS-3782
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Issue:**
> The operator 'SearchParameters.Operator.NOT_CONTAINS' is defined and 
> implemented in SearchProcessors. It would allow a search for entities that do 
> not contain a given string in a specified attribute (e.g. exclude entities 
> from search that contain 'temp' in the qualified name). 
> 
> **WorkAround:**
> JanusGraph doesn't allow the NOT_CONTAINS operator, so we will handle this 
> in memory.
> BasicSearch generates queries via three modes:
> 1. Index query -> NOT_CONTAINS will not be supported
> 2. InMemory Predicates -> NOT_CONTAINS will be supported, already handled 
> #123 SearchProcessor
> 3. Graph query -> NOT_CONTAINS will not be supported
> 
> As index and graph queries will not support the not_contains operator, we need 
> to apply a filter (inMemoryPredicate) after either the index or graph query.
> 
> To support the above, I have modified ClassificationSearchProcessor:
> + For both cases, index as well as graph, added typeNamePredicate and 
> attributePredicate
> + Added these predicates after the query
> - Removed gremlinQuery block
> 
> **Operator Value:**
> It can be either "not_contains" or "NOT_CONTAINS"
> 
> **Note:**
> As part of this jira, not_contains is also added in quick search.
> 
> 
> Diffs
> -
> 
>   
> graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasSolrQueryBuilder.java
>  6c06a3cbe 
>   
> repository/src/main/java/org/apache/atlas/discovery/ClassificationSearchProcessor.java
>  9c72cd4a2 
>   repository/src/main/java/org/apache/atlas/discovery/SearchProcessor.java 
> c9a605355 
>   repository/src/test/java/org/apache/atlas/BasicTestSetup.java 8b98b3990 
>   
> repository/src/test/java/org/apache/atlas/discovery/AtlasDiscoveryServiceTest.java
>  PRE-CREATION 
>   
> repository/src/test/java/org/apache/atlas/discovery/BasicSearchClassificationTest.java
>  9b16e919d 
>   
> repository/src/test/java/org/apache/atlas/discovery/EntitySearchProcessorTest.java
>  b7ce97845 
> 
> 
> Diff: https://reviews.apache.org/r/72567/diff/4/
> 
> 
> Testing
> ---
> 
> Added testcases
> Precommit : https://builds.apache.org/job/PreCommit-ATLAS-Build-Test/1999
> 
> 
> Thanks,
> 
> Pinal Shah
> 
>
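
The in-memory filtering described in the request above (run the index or graph query first, then apply a NOT_CONTAINS predicate to its result) can be sketched roughly as below. The class `NotContainsFilterSketch`, the map-based rows, and the treatment of missing attributes are assumptions for illustration and are not taken from the patch.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch: rows are plain attribute maps rather than graph vertices.
class NotContainsFilterSketch {
    // Builds a predicate that keeps rows whose attribute does not contain the given text.
    // Assumption: a missing or non-string attribute counts as "does not contain".
    static Predicate<Map<String, Object>> notContains(String attribute, String text) {
        return row -> {
            Object value = row.get(attribute);
            return !(value instanceof String) || !((String) value).contains(text);
        };
    }

    // Applies the predicate to the result of an earlier (index or graph) query.
    static List<Map<String, Object>> filter(List<Map<String, Object>> queryResult,
                                            Predicate<Map<String, Object>> predicate) {
        return queryResult.stream().filter(predicate).collect(Collectors.toList());
    }
}
```

One consequence of filtering after the query is that a page of results can shrink below the requested limit, so the caller may need to fetch further pages to fill it.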



Re: Review Request 72646: ATLAS-3876 : Relationship Search API not showing correct approximateCount

2020-07-23 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72646/#review221343
---




repository/src/main/java/org/apache/atlas/discovery/EntityDiscoveryService.java
Line 607 (original), 616 (patched)


what if out/in edges size is a lot, do you bring everything into memory 
just to get size? 

Do you need entire edges in the list or maybe we can maintain a counter and 
increment?



repository/src/main/java/org/apache/atlas/discovery/EntityDiscoveryService.java
Lines 682 (patched)


Constants.GUID_PROPERTY_KEY).value() can be null (mostly not, but some bad 
vertices). Consider checking for null value of guid.

Also, since you already have the vertex, do we need the additional 
'endVerticesGuid' list? Can we extract 'guid' from Vertex 'v' and add it to 
'resultList' in the same loop? 

resultList.add(entityRetriever.toAtlasEntityHeader(endVertexGuid, 
attributes));

(or)

Maybe map Vertex to AtlasVertex and directly call:

resultList.add(entityRetriever.toAtlasEntityHeader(atlasVertex, 
attributes));


- Sarath Subramanian


On July 6, 2020, 2:14 a.m., Pinal Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72646/
> ---
> 
> (Updated July 6, 2020, 2:14 a.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3876
> https://issues.apache.org/jira/browse/ATLAS-3876
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Issue:**
> The relationship API doesn't provide an approximate count of the related entities in 
> the response.
> 
> **Workaround:**
> Get the total count of entities related to the given entity, irrespective of the 
> offset/limit.
> 
> 
> Also, this patch includes an **improvement in the time taken to fetch related 
> entities**.
> Average time taken for the API to search relationship entities having **5000 
> end vertices** with limit **500**:
> Before: 9 seconds
> After applying this patch: 3 seconds
> 
> 
> Diffs
> -
> 
>   
> repository/src/main/java/org/apache/atlas/discovery/EntityDiscoveryService.java
>  4b9564295 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
>  863a00350 
>   repository/src/main/java/org/apache/atlas/util/SearchPredicateUtil.java 
> 5069d78c8 
> 
> 
> Diff: https://reviews.apache.org/r/72646/diff/2/
> 
> 
> Testing
> ---
> 
> Manually tested
> Precommit : https://builds.apache.org/job/PreCommit-ATLAS-Build-Test/2011 
> (Failed in Impala build)
> 
> 
> Thanks,
> 
> Pinal Shah
> 
>



Re: Review Request 72703: Import Service: UpdateVertexGuid Now Makes Updates to AtlasEntityWithExtInfo

2020-07-23 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72703/#review221341
---


Ship it!




Ship It!

- Sarath Subramanian


On July 23, 2020, 11:29 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72703/
> ---
> 
> (Updated July 23, 2020, 11:29 a.m.)
> 
> 
> Review request for atlas, Nikhil Bonte, Nixon Rodrigues, and Sarath 
> Subramanian.
> 
> 
> Bugs: ATLAS-3902
> https://issues.apache.org/jira/browse/ATLAS-3902
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> (Modified) *RegularImport.updateVertexGuid* Updated method to handle 
> *AtlasEntityWithExtInfo*.
> 
> 
> Diffs
> -
> 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java
>  3f7e86167 
> 
> 
> Diff: https://reviews.apache.org/r/72703/diff/1/
> 
> 
> Testing
> ---
> 
> **Functional**
> - On a cluster with Atlas, perform import using REST calls with 
> *stocks-1.zip* and *stocks-2.zip*.
> 
> ```
> curl -g -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H 
> "Cache-Control: no-cache" -F data=@./stocks-1.zip 
> "http://localhost:21000/api/atlas/admin/import"
> 
> ```
> 
> 
> File Attachments
> 
> 
> Stocks-1
>   
> https://reviews.apache.org/media/uploaded/files/2020/07/23/ae57756a-71dc-4cf4-8ae7-70270c14ae08__stocks-1.zip
> Stocks-2
>   
> https://reviews.apache.org/media/uploaded/files/2020/07/23/b0813b02-e44f-40a8-b52c-2dd620d067d9__stocks-2.zip
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72698: ATLAS-3875: Introduce sample project for AtlasClient

2020-07-23 Thread Jyoti Singh


> On July 22, 2020, 7:24 p.m., Sidharth Mishra wrote:
> > atlas-examples/src/main/java/org/apache/atlas/AtlasClientBaseExample.java
> > Lines 63 (patched)
> > 
> >
> > char[] is always preferred over String for passwords - see 
> > https://stackoverflow.com/questions/8881291/why-is-char-preferred-over-string-for-passwords
> > 
> > Please check if we can add a new constructor at client v2 and do this; otherwise 
> > let's file a jira and track this for client v2.

This needs to be changed in a separate ticket, as it also requires a change in the client.
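
For context on the reviewer's point, here is a minimal sketch of why a `char[]` is often preferred: unlike an immutable `String`, the array can be wiped once it is no longer needed. The `Authenticator` interface and `login` method are hypothetical and do not correspond to any existing Atlas client API.

```java
import java.util.Arrays;

// Hypothetical sketch; Authenticator stands in for whatever call consumes the credentials.
class PasswordHandlingSketch {
    interface Authenticator {
        boolean authenticate(String user, char[] password);
    }

    // Passes the password on and overwrites the array afterwards,
    // so the secret does not linger in memory until garbage collection.
    static boolean login(Authenticator auth, String user, char[] password) {
        try {
            return auth.authenticate(user, password);
        } finally {
            Arrays.fill(password, '\0');
        }
    }
}
```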


- Jyoti


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72698/#review221321
---


On July 23, 2020, 6:10 p.m., Jyoti Singh wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72698/
> ---
> 
> (Updated July 23, 2020, 6:10 p.m.)
> 
> 
> Review request for atlas, Ashutosh Mestry, Madhan Neethiraj, Sarath 
> Subramanian, and Sidharth Mishra.
> 
> 
> Bugs: ATLAS-3875
> https://issues.apache.org/jira/browse/ATLAS-3875
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Using this project, users can get an idea of how to integrate with Atlas using 
> AtlasClient. This helps the user understand the basic REST functionality 
> of Atlas, such as
> 
> - EntityRest
> - TypeDefRest
> - DiscoveryRest
> - LineageRest
> - GlossaryRest
> 
> 
> Diffs
> -
> 
>   atlas-examples/pom.xml PRE-CREATION 
>   atlas-examples/sample-app/README.md PRE-CREATION 
>   atlas-examples/sample-app/pom.xml PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/DiscoveryExample.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/EntityExample.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/GlossaryExample.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/LineageExample.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/SampleApp.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/SampleAppConstants.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/TypeDefExample.java
>  PRE-CREATION 
>   atlas-examples/sample-app/src/main/resources/atlas-application.properties 
> PRE-CREATION 
>   pom.xml 5e0442ae5 
> 
> 
> Diff: https://reviews.apache.org/r/72698/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jyoti Singh
> 
>



Re: Review Request 72698: ATLAS-3875: Introduce sample project for AtlasClient

2020-07-23 Thread Jyoti Singh


> On July 22, 2020, 7:24 p.m., Sidharth Mishra wrote:
> > atlas-examples/src/main/java/org/apache/atlas/AtlasClientBaseExample.java
> > Lines 140 (patched)
> > 
> >
> > Please use Console.readPassword instead

Not using this as it will throw exceptions in IDE.
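
One possible middle ground, sketched here under assumptions, is to use the console when one is attached and fall back to a plain line read otherwise, since `System.console()` can return `null` when a program is launched from an IDE. The class `PasswordPromptSketch` and its method are hypothetical and not part of the patch.

```java
import java.io.BufferedReader;
import java.io.Console;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper: prefers Console.readPassword, falls back to a reader.
class PasswordPromptSketch {
    static char[] readPassword(Console console, BufferedReader fallback, String prompt) {
        if (console != null) {
            // Input is not echoed; readPassword may return null at end of stream.
            char[] typed = console.readPassword("%s", prompt);
            return typed == null ? new char[0] : typed;
        }
        // No console attached (for example inside an IDE): input will be echoed.
        System.out.print(prompt);
        try {
            String line = fallback.readLine();
            return line == null ? new char[0] : line.toCharArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would typically pass `System.console()` and a `BufferedReader` wrapped around `System.in`.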


- Jyoti


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72698/#review221321
---


On July 23, 2020, 6:10 p.m., Jyoti Singh wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72698/
> ---
> 
> (Updated July 23, 2020, 6:10 p.m.)
> 
> 
> Review request for atlas, Ashutosh Mestry, Madhan Neethiraj, Sarath 
> Subramanian, and Sidharth Mishra.
> 
> 
> Bugs: ATLAS-3875
> https://issues.apache.org/jira/browse/ATLAS-3875
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Using this project, users can get an idea of how to integrate with Atlas using 
> AtlasClient. This helps the user understand the basic REST functionality 
> of Atlas, such as
> 
> - EntityRest
> - TypeDefRest
> - DiscoveryRest
> - LineageRest
> - GlossaryRest
> 
> 
> Diffs
> -
> 
>   atlas-examples/pom.xml PRE-CREATION 
>   atlas-examples/sample-app/README.md PRE-CREATION 
>   atlas-examples/sample-app/pom.xml PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/DiscoveryExample.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/EntityExample.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/GlossaryExample.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/LineageExample.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/SampleApp.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/SampleAppConstants.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/TypeDefExample.java
>  PRE-CREATION 
>   atlas-examples/sample-app/src/main/resources/atlas-application.properties 
> PRE-CREATION 
>   pom.xml 5e0442ae5 
> 
> 
> Diff: https://reviews.apache.org/r/72698/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jyoti Singh
> 
>



Re: Review Request 72695: Optional configuration to support locks on JanusGraph to ensure data consistency.

2020-07-23 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72695/#review221338
---




repository/src/main/java/org/apache/atlas/repository/patches/ConcurrentPatchProcessor.java
Line 39 (original), 39 (patched)


We will need to add another *JavaPatch* to update existing data.


- Ashutosh Mestry


On July 20, 2020, 9:48 p.m., Damian Warszawski wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72695/
> ---
> 
> (Updated July 20, 2020, 9:48 p.m.)
> 
> 
> Review request for atlas, Ashutosh Mestry, Bolke de Bruin, madhan, and Sarath 
> Subramanian.
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Optional configuration to support locks on JanusGraph to ensure data 
> consistency.
> 
> JanusGraph is eventually consistent by default, which is efficient but results 
> in duplicates when a race condition occurs.
> 
> 
> Reference to jira 
> https://issues.apache.org/jira/projects/ATLAS/issues/ATLAS-3398
> 
> 
> Diffs
> -
> 
>   
> graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/AtlasGraphManagement.java
>  fca789027 
>   
> graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphManagement.java
>  6ef9cb76c 
>   
> graphdb/janus/src/test/java/org/apache/atlas/repository/graphdb/janus/AbstractGraphDatabaseTest.java
>  35004157f 
>   
> graphdb/janus/src/test/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusDatabaseTest.java
>  5cd55093e 
>   intg/src/main/java/org/apache/atlas/ApplicationProperties.java e662c8fae 
>   
> repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
>  e35f3594f 
>   
> repository/src/main/java/org/apache/atlas/repository/patches/ConcurrentPatchProcessor.java
>  5a9ac2abe 
>   
> repository/src/main/java/org/apache/atlas/repository/patches/UniqueAttributePatch.java
>  d3111f110 
> 
> 
> Diff: https://reviews.apache.org/r/72695/diff/1/
> 
> 
> Testing
> ---
> 
> Not possible to reproduce the error on a local machine. Enabled locking on our 
> dev env; it has not introduced any regression.
> 
> 
> Thanks,
> 
> Damian Warszawski
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-23 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221337
---




repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
Lines 409 (patched)


should we reset/clear these values for EntityAuditEventV2? It might have 
remnant values when reused again?
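
The concern about remnant values can be illustrated with a small sketch. `ReusableEventSketch` and its fields are hypothetical stand-ins, not the actual EntityAuditEventV2 class; the point is only that a reused object should be cleared before new values are assigned.

```java
// Hypothetical stand-in for a reusable event object.
class ReusableEventSketch {
    String entityId;
    String details;
    long   timestamp;

    // Clears every field so nothing from a previous use leaks into the next one.
    void clear() {
        entityId  = null;
        details   = null;
        timestamp = 0L;
    }

    // Resets the object first, then assigns only the values known for this use.
    static ReusableEventSketch reuse(ReusableEventSketch event, String entityId, long timestamp) {
        event.clear();
        event.entityId  = entityId;
        event.timestamp = timestamp;
        return event;
    }
}
```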


- Sarath Subramanian


On July 23, 2020, 11:48 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 23, 2020, 11:48 a.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that a large number of 
> notification objects were created. These stayed in memory and were later 
> promoted to a higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating a large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, the list is populated with default values. Features:
> - Values are set on these pre-allocated objects by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to determine the right length for the returned list.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part of the class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modified *EntityNotificationListenerV2* Uses the new classes.
> 
> **Verification**
> - Using the test setup, memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation were observed using a memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
>   
> webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
>  a677b315c 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/7/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements per batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour, without out-of-memory errors.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-23 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221333
---




intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 51 (patched)


this assumes that the elements are accessed in ascending order, since 
getAndIncrementLength() simply increments the length by 1. I suggest getting rid 
of getAndIncrementLength() and updating getForUpdate() as below:

  public T getForUpdate(int index) {
    ensureCapacity(index + 1);

    T ret = buffer.get(index);

    // update length only if the accessed index is beyond the current length
    if (this.length <= index) {
      this.length = index + 1;
    }

    return ret;
  }



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 55 (patched)


To make it easier to read, I suggest getting rid of method 
resetCurrentLength() and updating toList() as below:

  public List<T> toList(boolean resetList) {
    List<T> ret = this.buffer.subList(0, this.length);

    if (resetList) {
      this.length = 0;
    }

    return ret;
  }



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 60 (patched)


- is the 'incrementCapacityBy' argument needed, given this is available in 
instance member this.incrementCapacityBy?
- consider renaming request() => ensureCapacity()
- consider folding adjustCapacity(), instantiateItems() into this method, 
as below:

  private void ensureCapacity(int capacity) {
    if (capacity > this.buffer.size()) {
      int currCapacity = this.buffer.size();
      int newCapacity  = currCapacity + incrementCapacityBy;

      while (newCapacity < capacity) {
        newCapacity += incrementCapacityBy;
      }

      this.buffer.ensureCapacity(newCapacity);

      // initialize new entries
      for (int i = currCapacity; i < newCapacity; i++) {
        this.buffer.add(itemClass.newInstance());
      }
    }
  }



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 71 (patched)


adjustCapacity() => ensureCapacity()



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 87 (patched)


Consider moving instantiateItems() into the adjustCapacity() implementation; 
this will spare callers of adjustCapacity() from having to call instantiateItems() as well.



intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java
Lines 44 (patched)


Why does this block need to be under synchronized? No state is updated in 
this block.



intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java
Lines 33 (patched)


verifyPeriodicPurge() - the method name doesn't seem to relate to the 
implementation. Please review and update.



repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
Lines 90 (patched)


FixedBufferListAccessor class doesn't seem to add much value. Consider the 
following simpler usage:

  private static final ThreadLocal<FixedBufferList<EntityAuditEventV2>> AUDIT_EVENTS_BUFFER =
      ThreadLocal.withInitial(() -> new FixedBufferList<>(FIXED_BUFFER_INITIAL_SIZE_DEFAULT, FIXED_BUFFER_INCREMENT_DEFAULT));

Note the use of 'static' above; fixedBufferListAccessor can also be 
eliminated. You might consider a simple accessor method in 
EntityAuditListenerV2:

  private FixedBufferList<EntityAuditEventV2> getAuditEventsBuffer() {
    return AUDIT_EVENTS_BUFFER.get();
  }

Same applies for EntityNotificationListenerV2 as well.
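
As a self-contained illustration of the per-thread pattern suggested above, the following sketch uses a plain `ArrayList<String>` in place of the FixedBufferList from the patch; `PerThreadBufferSketch` is a hypothetical name. Each thread lazily receives its own buffer the first time it calls `buffer()`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: an ArrayList stands in for the per-thread buffer type.
class PerThreadBufferSketch {
    // withInitial runs the supplier once per thread, on that thread's first get().
    private static final ThreadLocal<List<String>> BUFFER = ThreadLocal.withInitial(ArrayList::new);

    // Returns the calling thread's own buffer; no synchronization is needed
    // because no other thread can reach this instance through the ThreadLocal.
    static List<String> buffer() {
        return BUFFER.get();
    }
}
```

With long-lived pooled threads, each buffer lives as long as its thread does, which is what makes the pre-allocated objects reusable across batches.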


- Madhan Neethiraj


On July 23, 2020, 6:48 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 23, 2020, 6:48 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that a large number of 
> notification objects were created. These stayed in memory and were later 
> promoted to a higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the 

Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-23 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/
---

(Updated July 23, 2020, 6:48 p.m.)


Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and 
Sarath Subramanian.


Changes
---

Updates include: Addressed review comments.


Bugs: ATLAS-3878
https://issues.apache.org/jira/browse/ATLAS-3878


Repository: atlas


Description
---

**Background**
See JIRA for details.

*Analysis* Using memory profiling tools, it was observed that a large number of 
notification objects were created. These stayed in memory and were later 
promoted to a higher generation, thereby taking even longer to be collected.

**Approach**
Using the fixed-buffer approach to address the problem of creating a large number 
of small objects.

New *FixedBufferList* This is an encapsulation over *ArrayList*. During initial 
allocation, the list is populated with default values. Features:
- Values are set on these pre-allocated objects by first doing a 
*get* on the element and then assigning values to it.
- *toList* fetches the sub-list from the encapsulating list. This uses the 
state within the class to determine the right length for the returned list.

New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
necessary since the list is now part of the class's state.
Modified *EntityAuditListenerV2* Uses the new classes.
Modified *EntityNotificationListenerV2* Uses the new classes.

**Verification**
- Using the test setup, memory usage was observed over a period of 24 hrs. 
- Memory usage and object allocation were observed using a memory profiler.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
PRE-CREATION 
  intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
 79527acfa 
  
webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
 a677b315c 


Diff: https://reviews.apache.org/r/72666/diff/7/

Changes: https://reviews.apache.org/r/72666/diff/6-7/


Testing
---

**Unit testing**
Unit tests added for the new classes.

**Volume testing**
Setup:
- Node: Threads 40, Core: 40, Allocated Memory: 12 GB
- Multiple Kafka queues ingesting data.
- Bulk entity creation using custom script ingesting 100M entities.

Memory usage stayed between 0 and 5% during the 24 hr period.

With:
- Workers: 64
- Batch size: 50 (fewer elements per batch improve commit time and audit write 
time).
- Throughput: ~1.2 M entities per hour, without out-of-memory errors.

**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/


Thanks,

Ashutosh Mestry



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-23 Thread Ashutosh Mestry via Review Board


> On July 23, 2020, 5:59 p.m., Sarath Subramanian wrote:
> > intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
> > Lines 33 (patched)
> > 
> >
> > ArrayList => List

I would prefer to keep it buffer since this implements the *fixed buffer* 
algorithm to handle the problem.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221322
---


On July 22, 2020, 4:16 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 22, 2020, 4:16 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part of the class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modified *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation were observed using a memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
>   
> webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
>  a677b315c 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/6/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements per batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour, with no out-of-memory errors.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



Review Request 72703: Import Service: UpdateVertexGuid Now Makes Updates to AtlasEntityWithExtInfo

2020-07-23 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72703/
---

Review request for atlas, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.


Bugs: ATLAS-3902
https://issues.apache.org/jira/browse/ATLAS-3902


Repository: atlas


Description
---

**Approach**
(Modified) *RegularImport.updateVertexGuid* Updated method to handle 
*AtlasEntityWithExtInfo*.


Diffs
-

  
  repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java 3f7e86167


Diff: https://reviews.apache.org/r/72703/diff/1/


Testing
---

**Functional**
- On a cluster with Atlas, perform import using REST calls with *stocks-1.zip* 
and *stocks-2.zip*.

```
curl -g -X POST -u admin:admin -H "Content-Type: multipart/form-data" \
  -H "Cache-Control: no-cache" -F data=@./stocks-1.zip \
  "http://localhost:21000/api/atlas/admin/import"
```


File Attachments


Stocks-1
  
https://reviews.apache.org/media/uploaded/files/2020/07/23/ae57756a-71dc-4cf4-8ae7-70270c14ae08__stocks-1.zip
Stocks-2
  
https://reviews.apache.org/media/uploaded/files/2020/07/23/b0813b02-e44f-40a8-b52c-2dd620d067d9__stocks-2.zip


Thanks,

Ashutosh Mestry



Re: Review Request 72698: ATLAS-3875: Introduce sample project for AtlasClient

2020-07-23 Thread Jyoti Singh


> On July 22, 2020, 7:24 p.m., Sidharth Mishra wrote:
> > atlas-examples/src/main/java/org/apache/atlas/AtlasClientBaseExample.java
> > Lines 86 (patched)
> > 
> >
> > Instead of commenting like typedef then entity examples etc. it would 
> > be good to move these to separate private functions. Please refer - 
> > 
> > FUNCTIONS SHOULD DO ONE THING. THEY SHOULD DO IT WELL. THEY SHOULD DO 
> > IT ONLY (more details - 
> > https://learning.oreilly.com/library/view/clean-code/9780136083238/chapter03.html#ch3)
> > 
> > 
> > https://softwareengineering.stackexchange.com/questions/137941/should-a-method-do-one-thing-and-be-good-at-it

We need to delete the type and the entity at the end, so I am not moving these 
into separate functions.


- Jyoti


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72698/#review221321
---


On July 23, 2020, 6:10 p.m., Jyoti Singh wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72698/
> ---
> 
> (Updated July 23, 2020, 6:10 p.m.)
> 
> 
> Review request for atlas, Ashutosh Mestry, Madhan Neethiraj, Sarath 
> Subramanian, and Sidharth Mishra.
> 
> 
> Bugs: ATLAS-3875
> https://issues.apache.org/jira/browse/ATLAS-3875
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> Using this project, users can get an idea of how to integrate with Atlas using 
> AtlasClient. This helps the user understand the basic REST functionality 
> of Atlas, such as
> 
> - EntityRest
> - TypeDefRest
> - DiscoveryRest
> - LineageRest
> - GlossaryRest
> 
> 
> Diffs
> -
> 
>   atlas-examples/pom.xml PRE-CREATION 
>   atlas-examples/sample-app/README.md PRE-CREATION 
>   atlas-examples/sample-app/pom.xml PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/DiscoveryExample.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/EntityExample.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/GlossaryExample.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/LineageExample.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/SampleApp.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/SampleAppConstants.java
>  PRE-CREATION 
>   
> atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/TypeDefExample.java
>  PRE-CREATION 
>   atlas-examples/sample-app/src/main/resources/atlas-application.properties 
> PRE-CREATION 
>   pom.xml 5e0442ae5 
> 
> 
> Diff: https://reviews.apache.org/r/72698/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jyoti Singh
> 
>



Re: Review Request 72698: ATLAS-3875: Introduce sample project for AtlasClient

2020-07-23 Thread Jyoti Singh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72698/
---

(Updated July 23, 2020, 6:10 p.m.)


Review request for atlas, Ashutosh Mestry, Madhan Neethiraj, Sarath 
Subramanian, and Sidharth Mishra.


Changes
---

adding code review change


Bugs: ATLAS-3875
https://issues.apache.org/jira/browse/ATLAS-3875


Repository: atlas


Description
---

Using this project, users can get an idea of how to integrate with Atlas using 
AtlasClient. This helps the user understand the basic REST functionality of 
Atlas, such as

- EntityRest
- TypeDefRest
- DiscoveryRest
- LineageRest
- GlossaryRest


Diffs (updated)
-

  atlas-examples/pom.xml PRE-CREATION 
  atlas-examples/sample-app/README.md PRE-CREATION 
  atlas-examples/sample-app/pom.xml PRE-CREATION
  atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/DiscoveryExample.java PRE-CREATION
  atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/EntityExample.java PRE-CREATION
  atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/GlossaryExample.java PRE-CREATION
  atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/LineageExample.java PRE-CREATION
  atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/SampleApp.java PRE-CREATION
  atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/SampleAppConstants.java PRE-CREATION
  atlas-examples/sample-app/src/main/java/org/apache/atlas/examples/sampleapp/TypeDefExample.java PRE-CREATION
  atlas-examples/sample-app/src/main/resources/atlas-application.properties PRE-CREATION
  pom.xml 5e0442ae5 


Diff: https://reviews.apache.org/r/72698/diff/3/

Changes: https://reviews.apache.org/r/72698/diff/2-3/


Testing
---


Thanks,

Jyoti Singh



Re: Review Request 72666: Notification: Solution to Memory Build-up

2020-07-23 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72666/#review221322
---




intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 33 (patched)


ArrayList => List



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 59 (patched)


request => initializeBuffer()



intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java
Lines 105 (patched)


getActualTypeArguments() might return empty array. consider checking length
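
A defensive lookup along these lines might look like the sketch below. It is 
only an illustration: the class name *TypeArgumentSketch* and the method 
*firstTypeArgument* are hypothetical and do not come from the patch; only the 
java.lang.reflect calls are standard.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical helper; shows a guarded use of getActualTypeArguments().
class TypeArgumentSketch {
    // Returns the first generic type argument of the object's superclass,
    // or null when it cannot be determined.
    static Class<?> firstTypeArgument(Object instance) {
        Type superType = instance.getClass().getGenericSuperclass();
        if (!(superType instanceof ParameterizedType)) {
            return null; // superclass is not parameterized (or there is none)
        }
        Type[] args = ((ParameterizedType) superType).getActualTypeArguments();
        if (args.length == 0 || !(args[0] instanceof Class)) {
            return null; // empty array, or the argument is not a concrete class
        }
        return (Class<?>) args[0];
    }
}
```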



repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
Lines 97 (patched)


do we need to create a new instance and pass to the constructor? The 
constructor just needs the class.

consider updating constructor of FixedBufferListAccessor to take class:

this.fixedBufferListAccessor = new 
FixedBufferListAccessor<>(EntityAuditEventV2FixedList.class);

same for line EntityNotificationListenerV2.java line#94
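
As a rough sketch of what a class-taking constructor could look like, assuming 
the list type has a no-argument constructor (the name 
*PerThreadAccessorSketch* is hypothetical and this is not the 
FixedBufferListAccessor from the patch):

```java
// Hypothetical accessor: creates one instance of listClass per thread, on first use.
class PerThreadAccessorSketch<L> {
    private final ThreadLocal<L> perThread;

    PerThreadAccessorSketch(Class<L> listClass) {
        this.perThread = ThreadLocal.withInitial(() -> {
            try {
                // Assumption: listClass exposes an accessible no-argument constructor.
                return listClass.getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot instantiate " + listClass, e);
            }
        });
    }

    // Returns the instance owned by the calling thread.
    L get() {
        return perThread.get();
    }
}
```

With this shape the caller passes only the class and no throwaway instance is 
created at the call site.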


- Sarath Subramanian


On July 22, 2020, 9:16 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72666/
> ---
> 
> (Updated July 22, 2020, 9:16 a.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, 
> and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3878
> https://issues.apache.org/jira/browse/ATLAS-3878
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Background**
> See JIRA for details.
> 
> *Analysis* Using memory profiling tools, it was observed that large number of 
> notification objects were created. These stayed in memory and later were 
> promoted to higher generation, thereby taking even longer to be collected.
> 
> **Approach**
> Using the fixed-buffer approach to address the problem of creating large 
> number of small objects.
> 
> New *FixedBufferList* This is an encapsulation over *ArrayList*. During 
> initial allocation, list is populated with default values. Features:
> - Setting of values to these pre-allocated objects is achieved by first doing 
> a *get* on the element and then assigning values to it.
> - *toList* fetches the sub-list from the encapsulating list. This uses the 
> state within the class to fetch the right length for the returning array.
> 
> New *NamedFixedBufferList* Maintains a per-thread *FixedBufferList*. This is 
> necessary since the list is now part of the class's state.
> Modified *EntityAuditListenerV2* Uses the new classes.
> Modified *EntityNotificationListener* Uses the new classes.
> 
> **Verification**
> - Using the test setup, memory usage was observed over a period of 24 
> hrs. 
> - Memory usage and object allocation were observed using a memory profiler.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 2c007ca01 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferList.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/utils/FixedBufferListAccessor.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListAccessorTest.java 
> PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/utils/FixedBufferListTest.java 
> PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/audit/EntityAuditListenerV2.java
>  79527acfa 
>   
> webapp/src/main/java/org/apache/atlas/notification/EntityNotificationListenerV2.java
>  a677b315c 
> 
> 
> Diff: https://reviews.apache.org/r/72666/diff/6/
> 
> 
> Testing
> ---
> 
> **Unit testing**
> Unit tests added for the new classes.
> 
> **Volume testing**
> Setup:
> - Node: Threads 40, Core: 40, Allocated Memory: 12 GB
> - Multiple Kafka queues ingesting data.
> - Bulk entity creation using custom script ingesting 100M entities.
> 
> Memory usage stayed between 0 and 5% during the 24 hr period.
> 
> With:
> - Workers: 64
> - Batch size: 50 (fewer elements per batch improve commit time and audit write 
> time).
> - Throughput: ~1.2 M entities per hour, with no out-of-memory errors.
> 
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/2035/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>



[jira] [Updated] (ATLAS-3902) Import Service: Importing Data With Differing GUIDs for Same Unique Attributes Causes Errors in Certain Cases

2020-07-23 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-3902:
---
Summary: Import Service: Importing Data With Differing GUIDs for Same 
Unique Attributes Causes Errors in Certain Cases  (was: Import Service: 
Importing Data With Differing GUIDs for Same Unique Attributes Causes Errors)

> Import Service: Importing Data With Differing GUIDs for Same Unique 
> Attributes Causes Errors in Certain Cases
> -
>
> Key: ATLAS-3902
> URL: https://issues.apache.org/jira/browse/ATLAS-3902
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: 2.0.0, trunk, 2.1.0
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk, 2.1.0
>
>
> *Background*
> Consider the scenario where 2 clusters containing Atlas are set up to be 
> synchronized using Atlas' export and import APIs. Suppose the source Atlas has 
> changes where a table is dropped and re-created with the same name. The table's 
> entity within Atlas will get a new GUID but will continue to have the same 
> _qualifiedName_.
> This case is handled within the Import API.
> However, the case that is not handled is performing a similar update on the 
> table's storage descriptor.
> *Steps to Duplicate*
>  # Create a schema within Hive containing database, tables, columns and 
> views. Atlas will reflect the changes. Perform export. Generate _s1.zip_.
>  # Drop schema.
>  # Re-create the same schema within Hive. Perform export. Generate _s2.zip_.
>  # Clear Atlas database.
>  # Import _s1.zip_. Observe _application.log_.
>  # Import _s2.zip_. Observe _application.log_. During import, the log will show 
> messages like '_GUID Updated: Entity..._'
> _Expected result:_ Import should succeed with messages indicating changes to 
> the entity's GUID.
> _Actual result_: Import fails with errors indicating schema violation 
> (_AtlasSchemaViolation_)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ATLAS-3902) Import Service: Importing Data With Differing GUIDs for Same Unique Attributes Causes Errors

2020-07-23 Thread Ashutosh Mestry (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Mestry updated ATLAS-3902:
---
Description: 
*Background*

Consider the scenario where 2 clusters containing Atlas are set up to be 
synchronized using Atlas' export and import APIs. Suppose the source Atlas has 
changes where a table is dropped and re-created with the same name. The table's 
entity within Atlas will get a new GUID but will continue to have the same 
_qualifiedName_.

This case is handled within the Import API.

However, the case that is not handled is performing a similar update on the 
table's storage descriptor.

*Steps to Duplicate*
 # Create a schema within Hive containing database, tables, columns and views. 
Atlas will reflect the changes. Perform export. Generate _s1.zip_.
 # Drop schema.
 # Re-create the same schema within Hive. Perform export. Generate _s2.zip_.
 # Clear Atlas database.
 # Import _s1.zip_. Observe _application.log_.
 # Import _s2.zip_. Observe _application.log_. During import, the log will show 
messages like '_GUID Updated: Entity..._'

_Expected result:_ Import should succeed with messages indicating changes to 
the entity's GUID.

_Actual result_: Import fails with errors indicating schema violation 
(_AtlasSchemaViolation_)

  was:
*Background*

*Steps to Duplicate*
 # Create a schema within Hive containing database, tables, columns and views. 
Atlas will reflect the changes. Perform export. Generate _s1.zip_.
 # Drop schema.
 # Re-create the same schema within Hive. Perform export. Generate _s2.zip_.
 # Clear Atlas database.
 # Import _s1.zip_. Observe _application.log_.
 # Import _s2.zip_. Observe _application.log_. During import, the log will show 
messages like '_GUID Updated: Entity..._'

_Expected result:_ Import should succeed with messages indicating changes to 
the entity's GUID.

_Actual result_: Import fails with errors indicating schema violation 
(_AtlasSchemaViolation_)


> Import Service: Importing Data With Differing GUIDs for Same Unique 
> Attributes Causes Errors
> 
>
> Key: ATLAS-3902
> URL: https://issues.apache.org/jira/browse/ATLAS-3902
> Project: Atlas
>  Issue Type: Bug
>  Components:  atlas-core
>Affects Versions: 2.0.0, trunk, 2.1.0
>Reporter: Ashutosh Mestry
>Assignee: Ashutosh Mestry
>Priority: Major
> Fix For: trunk, 2.1.0
>
>
> *Background*
> Consider the scenario where 2 clusters containing Atlas are set up to be 
> synchronized using Atlas' export and import APIs. Suppose the source Atlas has 
> changes where a table is dropped and re-created with the same name. The table's 
> entity within Atlas will get a new GUID but will continue to have the same 
> _qualifiedName_.
> This case is handled within the Import API.
> However, the case that is not handled is performing a similar update on the 
> table's storage descriptor.
> *Steps to Duplicate*
>  # Create a schema within Hive containing database, tables, columns and 
> views. Atlas will reflect the changes. Perform export. Generate _s1.zip_.
>  # Drop schema.
>  # Re-create the same schema within Hive. Perform export. Generate _s2.zip_.
>  # Clear Atlas database.
>  # Import _s1.zip_. Observe _application.log_.
>  # Import _s2.zip_. Observe _application.log_. During import, the log will show 
> messages like '_GUID Updated: Entity..._'
> _Expected result:_ Import should succeed with messages indicating changes to 
> the entity's GUID.
> _Actual result_: Import fails with errors indicating schema violation 
> (_AtlasSchemaViolation_)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ATLAS-3902) Import Service: Importing Data With Differing GUIDs for Same Unique Attributes Causes Errors

2020-07-23 Thread Ashutosh Mestry (Jira)
Ashutosh Mestry created ATLAS-3902:
--

 Summary: Import Service: Importing Data With Differing GUIDs for 
Same Unique Attributes Causes Errors
 Key: ATLAS-3902
 URL: https://issues.apache.org/jira/browse/ATLAS-3902
 Project: Atlas
  Issue Type: Bug
  Components:  atlas-core
Affects Versions: 2.1.0, 2.0.0, trunk
Reporter: Ashutosh Mestry
Assignee: Ashutosh Mestry
 Fix For: trunk, 2.1.0


*Background*

*Steps to Duplicate*
 # Create a schema within Hive containing database, tables, columns and views. 
Atlas will reflect the changes. Perform export. Generate _s1.zip_.
 # Drop schema.
 # Re-create the same schema within Hive. Perform export. Generate _s2.zip_.
 # Clear Atlas database.
 # Import _s1.zip_. Observe _application.log_.
 # Import _s2.zip_. Observe _application.log_. During import, the log will show 
messages like '_GUID Updated: Entity..._'

_Expected result:_ Import should succeed with messages indicating changes to 
the entity's GUID.

_Actual result_: Import fails with errors indicating schema violation 
(_AtlasSchemaViolation_)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [atlas] insertmike opened a new pull request #105: Fixed small typo on type system page

2020-07-23 Thread GitBox


insertmike opened a new pull request #105:
URL: https://github.com/apache/atlas/pull/105


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (ATLAS-3901) AD user default role

2020-07-23 Thread theo11 (Jira)


[ 
https://issues.apache.org/jira/browse/ATLAS-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163244#comment-17163244
 ] 

theo11 commented on ATLAS-3901:
---

Thank you for your response. If I understand you correctly, I can bind user to 
a group like:

 "userRoles": {

"theoad":  [ "DATA_STEWARD" ]

},

But of course this requires additional effort to maintain the list manually. Is 
there any example of how to sync AD users to a group automatically?

> AD user default role
> 
>
> Key: ATLAS-3901
> URL: https://issues.apache.org/jira/browse/ATLAS-3901
> Project: Atlas
>  Issue Type: Bug
>Reporter: theo11
>Priority: Major
>
> Hello,
> I'm having trouble setting up AD users so that they are correctly bound to the 
> DATA_STEWARD role. Login works correctly, but the user has no permissions (for 
> relationships etc.).
> The required property in atlas-application.properties is set as follows:
> atlas.authentication.method.ldap.ad.default.role=DATA_STEWARD
> All roles are defaults from atlas-simple-authz-policy.json. There are no 
> related error entries in the Atlas log.
> Could you advise?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)