[jira] [Commented] (ATLAS-3114) Issue with concurrent bulk inserts for entities

2019-04-06 Thread Madhan Neethiraj (JIRA)


[ 
https://issues.apache.org/jira/browse/ATLAS-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16811492#comment-16811492
 ] 

Madhan Neethiraj commented on ATLAS-3114:
-

[~ayushmnnit] - the attached file [^Error.txt] doesn't contain any error 
messages. Can you please check the Atlas server logs for errors? A few 
observations from reviewing [^model.json] and [^AtlasClientV2Test.java]:

- relationship-type rdbms_db_table would add the following 
relationship-attributes:
-- 'rdbms_tables' to entity-type rdbms_db
-- mandatory attribute 'table' to entity-type rdbms_table <== perhaps the end 
name should be 'db'?
- The createTable() call doesn't set the mandatory attribute 'table'. This 
should have resulted in an error. Can you please verify that the model attached 
here is the one you use in your runs?
- relationship-type rdbms_table_column would add the following 
relationship-attributes:
-- 'columns' to entity-type rdbms_table
-- mandatory attribute 'column' to entity-type rdbms_column <== perhaps the end 
name should be 'table'?
- The createColumn() call doesn't set the mandatory attribute 'column'. This 
should have resulted in an error. Can you please verify that the model attached 
here is the one you use in your runs?
- If the 'column' attribute is set while creating a column, there is no need to 
update the table entity with attribute ATTRIBUTE_COLUMNS.
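
To make the point about end names concrete, here is a minimal sketch in plain 
Java with no Atlas dependency. It only models the rule stated above: the name 
given to each end of a relationship-type becomes a relationship-attribute on 
that end's own entity-type. The classes RelationshipEndSketch and 
RelationshipAttributesSketch are made up for illustration and are not part of 
the Atlas API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one end of a relationship definition; not an Atlas class.
final class RelationshipEndSketch {
    final String entityType; // entity-type this end belongs to
    final String name;       // end name, which becomes an attribute on entityType

    RelationshipEndSketch(String entityType, String name) {
        this.entityType = entityType;
        this.name = name;
    }
}

final class RelationshipAttributesSketch {
    // Maps each entity-type to the relationship-attributes the given ends add to it.
    static Map<String, List<String>> attributesAdded(RelationshipEndSketch... ends) {
        Map<String, List<String>> byType = new LinkedHashMap<>();
        for (RelationshipEndSketch end : ends) {
            byType.computeIfAbsent(end.entityType, k -> new ArrayList<>()).add(end.name);
        }
        return byType;
    }

    public static void main(String[] args) {
        // End names as described for the attached model: rdbms_table gets 'table'.
        System.out.println(attributesAdded(
                new RelationshipEndSketch("rdbms_db", "rdbms_tables"),
                new RelationshipEndSketch("rdbms_table", "table")));
        // End name as suggested above: rdbms_table gets 'db' instead.
        System.out.println(attributesAdded(
                new RelationshipEndSketch("rdbms_db", "rdbms_tables"),
                new RelationshipEndSketch("rdbms_table", "db")));
    }
}
```

With the suggested end name, a table entity would point at its database through 
'db', which reads more naturally than a table having an attribute named 'table'.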

> Issue with concurrent bulk inserts for entities
> ---
>
> Key: ATLAS-3114
> URL: https://issues.apache.org/jira/browse/ATLAS-3114
> Project: Atlas
>  Issue Type: Bug
>Reporter: Ayush Nigam
>Assignee: chaitali borole
>Priority: Major
> Attachments: AtlasClientV2Test.java, Error.txt, model.json
>
>
> We have a model in which tables have an attribute 'columns', to which we 
> attach the list of object ids for all columns once these are created. We are 
> using the clientV2 Java APIs.
> We do a bulk operation for the columns and parallelize across tables.
> Sometimes the bulk creation of columns appears successful, i.e. Atlas doesn't 
> throw any exception, but some columns are reported as created and some as 
> updated, even though none of the columns existed before. Some entities are 
> also silently skipped during creation, without any exception being thrown.
> To sum up, the issue occurs with concurrent bulk create/update calls. 
> Concurrent single-entity create/update calls work.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ATLAS-3114) Issue with concurrent bulk inserts for entities

2019-04-06 Thread Ayush Nigam (JIRA)


[ 
https://issues.apache.org/jira/browse/ATLAS-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16811499#comment-16811499
 ] 

Ayush Nigam commented on ATLAS-3114:


Hi [~madhan.neethiraj], thanks for looking into the issue. We persist entities 
top-down: we first create the top entities, then the bottom ones, and later 
attach the bottom entities to the top one and update the top one. So 
rdbms_source and rdbms_db were created successfully, and we see issues only 
when we try to attach referenced entities, i.e. columns to a table, and then 
update the table. So for now we can focus on tables and columns only.

Our assumption is that bulk upserts sometimes fail silently under concurrent 
requests. We never faced this issue with single-entity requests, although those 
were concurrent too.

You can see the error in Error.txt at line 127:

AtlasStjava.util.concurrent.ExecutionException: org.apache.atlas.AtlasServiceException: Metadata service API com.intuit.idf.dataregistry.atlas.AtlasClientV2$API_V2@30b914f1 failed with status 404 (Not Found) Response Body ({"errorCode":"ATLAS-404-00-00A","errorMessage":"Referenced entity AtlasObjectId{guid='null', typeName='rdbms_column', uniqueAttributes={qualifiedName:pool-1-thread-3:Table5:Column81}} is not found"})
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at com.intuit.idf.dataportal.alationbridge.AtlasClientV2Test.main(AtlasClientV2Test.java:69)
Caused by: org.apache.atlas.AtlasServiceException: Metadata service API com.intuit.idf.dataregistry.atlas.AtlasClientV2$API_V2@30b914f1 failed with status 404 (Not Found) Response Body ({"errorCode":"ATLAS-404-00-00A","errorMessage":"Referenced entity AtlasObjectId{guid='null', typeName='rdbms_column', uniqueAttributes={qualifiedName:pool-1-thread-3:Table5:Column81}} is not found"})
    at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:395)
    at org.apache.atlas.AtlasBaseClient.callAPIWithResource(AtlasBaseClient.java:323)
    at org.apache.atlas.AtlasBaseClient.callAPI(AtlasBaseClient.java:211)
    at com.intuit.idf.dataregistry.atlas.AtlasClientV2.createEntity(AtlasClientV2.java:547)
    at com.intuit.idf.dataportal.alationbridge.AtlasClientV2Test.lambda$0(AtlasClientV2Test.java:60)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

 

This is just the part of the code where we are facing problems; the top 
entities rdbms_source and rdbms_db are already created successfully for us.

1) Explained below

2) We have attached just the part of the code where we are facing issues. Also, 
does 'table' become a mandatory attribute just by being named as an end in a 
relationship? Your point is correct, but should the code fail because of a 
wrong endDef name when the type is correct?

3) Does naming an endDef in a relationship make the attribute mandatory? I have 
not faced this issue before, hence the question. (Same as above.)

4) In line 59 of the code you can see that we are setting the mandatory 
attribute 'columns'.

5) Yes, but we are following a top-down approach rather than a bottom-up one: 
we first create a table with the mandatory attribute 'columns' as an empty 
list, then bulk-create the columns, attach them to the table, and update the 
table.
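
The top-down flow described in point 5 can be sketched as follows. This is a 
minimal illustration, not the attached test and not the Atlas client: 
FakeEntityStore is a hypothetical in-memory stand-in, and the names 
createTable, createColumns, attachColumns and load are assumptions made for the 
example. It does not reproduce the reported behaviour; it only shows the order 
of operations per table and a check that the bulk call reported every column as 
created before the table is updated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical in-memory stand-in for the metadata store; NOT the Atlas client API.
final class FakeEntityStore {
    private final Map<String, List<String>> tableColumns = new ConcurrentHashMap<>();
    private final Map<String, Boolean> columns = new ConcurrentHashMap<>();

    void createTable(String table) {
        tableColumns.put(table, new ArrayList<>()); // 'columns' starts as an empty list
    }

    // Bulk upsert: returns the qualified names that were newly created.
    List<String> createColumns(List<String> qualifiedNames) {
        List<String> created = new ArrayList<>();
        for (String qn : qualifiedNames) {
            if (columns.putIfAbsent(qn, Boolean.TRUE) == null) {
                created.add(qn);
            }
        }
        return created;
    }

    // Fails if any referenced column does not exist, like the 404 in the trace above.
    void attachColumns(String table, List<String> qualifiedNames) {
        for (String qn : qualifiedNames) {
            if (!columns.containsKey(qn)) {
                throw new IllegalStateException("Referenced entity " + qn + " is not found");
            }
        }
        tableColumns.put(table, new ArrayList<>(qualifiedNames));
    }

    int columnCount(String table) {
        return tableColumns.get(table).size();
    }
}

final class TopDownLoadSketch {
    // One table: create it, bulk-create its columns, verify the response, attach.
    static int loadTable(FakeEntityStore store, String table, int columnsPerTable) {
        store.createTable(table);
        List<String> names = new ArrayList<>();
        for (int c = 0; c < columnsPerTable; c++) {
            names.add(table + ":Column" + c);
        }
        List<String> created = store.createColumns(names);
        if (created.size() != names.size()) {
            throw new IllegalStateException(table + ": expected " + names.size()
                    + " created columns, got " + created.size());
        }
        store.attachColumns(table, names);
        return created.size();
    }

    // Tables are processed in parallel; returns the total number of columns created.
    static int load(FakeEntityStore store, int tables, int columnsPerTable, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int t = 0; t < tables; t++) {
                final String table = "Table" + t;
                Callable<Integer> task = () -> loadTable(store, table, columnsPerTable);
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                try {
                    total += f.get(); // surfaces any failure from the worker thread
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        FakeEntityStore store = new FakeEntityStore();
        System.out.println("columns created: " + load(store, 5, 100, 4));
    }
}
```

Comparing the created count against the requested count for every bulk call is 
one way to turn a silent partial result into an explicit failure on the client 
side.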



Re: Review Request 70304: Improvements to PC Framework

2019-04-06 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70304/#review214442
---


Ship it!





- Sarath Subramanian


On April 1, 2019, 3:15 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70304/
> ---
> 
> (Updated April 1, 2019, 3:15 p.m.)
> 
> 
> Review request for atlas, Madhan Neethiraj, Nixon Rodrigues, and Sarath 
> Subramanian.
> 
> 
> Bugs: ATLAS-3090
> https://issues.apache.org/jira/browse/ATLAS-3090
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> 
> - Additional methods added to _WorkItemManager_ and _WorkItemConsumer_.
> - Added ability to return results from consumers.
> - Added ability to restart tasks if they are done. 
> 
> **Description**
> - _getResults_: fetches results from consumers.
> - _drain_: waits until existing tasks are completed.
> - _checkAndProduce_: adds tasks only after adding to executor.
> 
> 
> Diffs
> -
> 
>   graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/migration/JsonNodeProcessManager.java fb1e68448 
>   intg/src/main/java/org/apache/atlas/pc/WorkItemConsumer.java df2cb67dd 
>   intg/src/main/java/org/apache/atlas/pc/WorkItemManager.java 8ac6f115d 
>   intg/src/test/java/org/apache/atlas/pc/WorkItemConsumerTest.java 6c88b9e6b 
>   intg/src/test/java/org/apache/atlas/pc/WorkItemConsumerWithResultsTest.java PRE-CREATION 
>   intg/src/test/java/org/apache/atlas/pc/WorkItemManagerWithResultsTest.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/70304/diff/3/
> 
> 
> Testing
> ---
> 
> **Unit tests**
> New tests added.
> 
> **Pre-commit Build**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/999/
> 
> 
> Thanks,
> 
> Ashutosh Mestry
> 
>
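
For readers unfamiliar with the pattern, the _getResults_ and _drain_ behaviour 
listed in the review request can be pictured with a small sketch. This is not 
the code under review and does not use the org.apache.atlas.pc classes: 
WorkManagerSketch and its produce method are hypothetical, and the sketch is 
built on a plain java.util.concurrent ExecutorService, which is an assumption 
made to keep it short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// Hypothetical sketch; not the org.apache.atlas.pc WorkItemManager.
final class WorkManagerSketch<T, R> {
    private final ExecutorService executor;
    private final Function<T, R> consumer;
    private final LinkedBlockingQueue<R> results = new LinkedBlockingQueue<>();
    private final List<Future<?>> pending = new ArrayList<>();

    WorkManagerSketch(int numWorkers, Function<T, R> consumer) {
        // Daemon threads, so a manager that is never shut down cannot keep the JVM alive.
        this.executor = Executors.newFixedThreadPool(numWorkers, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        this.consumer = consumer;
    }

    // Queue one work item; its result is stored for a later getResults() call.
    synchronized void produce(T item) {
        pending.add(executor.submit(() -> {
            results.add(consumer.apply(item));
        }));
    }

    // Block until every item produced so far has been processed.
    synchronized void drain() {
        for (Future<?> f : pending) {
            try {
                f.get();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            }
        }
        pending.clear();
    }

    // Remove and return all results collected so far (order is not guaranteed).
    List<R> getResults() {
        List<R> out = new ArrayList<>();
        results.drainTo(out);
        return out;
    }

    void shutdown() {
        executor.shutdown();
    }

    public static void main(String[] args) {
        WorkManagerSketch<Integer, Integer> manager = new WorkManagerSketch<>(4, x -> x * x);
        for (int i = 1; i <= 10; i++) {
            manager.produce(i);
        }
        manager.drain(); // all ten items are done once this returns
        int sum = 0;
        for (int r : manager.getResults()) {
            sum += r;
        }
        System.out.println("sum of squares: " + sum);
        manager.shutdown();
    }
}
```

Because the executor is kept alive after drain returns, the same manager can 
accept further items afterwards, which is one possible reading of "restart 
tasks if they are done".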