[jira] [Commented] (ATLAS-2816) Allow ignoring relationship in EntityGraphRetriever for FullTextMapperV2

2018-08-15 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/ATLAS-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581936#comment-16581936
 ] 

ASF subversion and git services commented on ATLAS-2816:


Commit 24830e6bc8624ae6d4abbe4f9037f0de8ad57cd0 in atlas's branch 
refs/heads/master from [~chengbing.liu]
[ https://git-wip-us.apache.org/repos/asf?p=atlas.git;h=24830e6 ]

ATLAS-2816: Allow ignoring relationship in EntityGraphRetriever for FullTextMapperV2

Signed-off-by: apoorvnaik 


> Allow ignoring relationship in EntityGraphRetriever for FullTextMapperV2
> 
>
> Key: ATLAS-2816
> URL: https://issues.apache.org/jira/browse/ATLAS-2816
> Project: Atlas
> Issue Type: Bug
> Affects Versions: 1.0.0
> Reporter: Chengbing Liu
> Assignee: Apoorv Naik
> Priority: Major
> Attachments: ATLAS-2816.01.patch, ATLAS-2816.02.patch
>
>
> We encountered a problem when using the Hive bridge in production. One database 
> has 5000+ tables. Importing the first table takes only tens of milliseconds, 
> but each import gets slower as more tables are added. In the end, it takes 
> 1~2 seconds to import one table.
> After investigation, we realized that it is not necessary for 
> {{FullTextMapperV2}} to retrieve all the relationships of the database each 
> time a table is imported. Because of this, the time complexity of importing a 
> whole database grows to O(n^2), where n is the number of tables.
> We propose adding a parameter to the constructor of {{EntityGraphRetriever}}: 
> {{ignoreRelationship}}. When it is set to true, {{mapVertexToAtlasEntity}} will 
> skip the {{mapRelationshipAttributes}} call. Since {{FullTextMapperV2}} does 
> not use the relationship attributes of the entity, this can save a lot of time 
> when importing entities with a large number of relations.
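
The proposal and the O(n^2) estimate in the description can be illustrated with a small simulation. This is only a sketch under stated assumptions: {{RetrieverSketch}}, {{mapVertexToEntity}} and {{simulateImport}} are hypothetical stand-ins, not the real Atlas classes or the code in the patch, and one "unit of work" is assumed per relationship mapped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for EntityGraphRetriever; NOT the real Atlas class.
class RetrieverSketch {
    private final boolean ignoreRelationship;
    private long relationshipsMapped = 0;

    RetrieverSketch(boolean ignoreRelationship) {
        this.ignoreRelationship = ignoreRelationship;
    }

    // Mirrors the proposed behaviour: relationship mapping is skipped
    // when the flag is set. Plain attributes would be mapped either way.
    void mapVertexToEntity(List<String> relationships) {
        if (!ignoreRelationship) {
            mapRelationshipAttributes(relationships);
        }
    }

    // Assumption: mapping costs one unit of work per relationship edge.
    private void mapRelationshipAttributes(List<String> relationships) {
        relationshipsMapped += relationships.size();
    }

    long relationshipsMapped() {
        return relationshipsMapped;
    }

    // Simulates importing `tables` tables into one database, where the
    // database entity (with every table attached so far) is re-read
    // after each import.
    static long simulateImport(int tables, boolean ignoreRelationship) {
        RetrieverSketch retriever = new RetrieverSketch(ignoreRelationship);
        List<String> databaseTables = new ArrayList<>();
        for (int i = 0; i < tables; i++) {
            databaseTables.add("table_" + i);
            retriever.mapVertexToEntity(databaseTables);
        }
        return retriever.relationshipsMapped();
    }

    public static void main(String[] args) {
        // 1 + 2 + ... + 5000 = 5000 * 5001 / 2
        System.out.println(simulateImport(5000, false)); // prints 12502500
        System.out.println(simulateImport(5000, true));  // prints 0
    }
}
```

With the flag off, the work is n(n+1)/2 relationship visits for n tables, which is where the quadratic growth comes from. With the flag on, the per-table cost no longer depends on how many tables the database already has.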



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ATLAS-2816) Allow ignoring relationship in EntityGraphRetriever for FullTextMapperV2

2018-08-10 Thread Chengbing Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/ATLAS-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575843#comment-16575843
 ] 

Chengbing Liu commented on ATLAS-2816:
--

Thanks [~apoorvnaik] for review! Uploaded a new patch.



[jira] [Commented] (ATLAS-2816) Allow ignoring relationship in EntityGraphRetriever for FullTextMapperV2

2018-08-09 Thread Apoorv Naik (JIRA)


[ 
https://issues.apache.org/jira/browse/ATLAS-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575740#comment-16575740
 ] 

Apoorv Naik commented on ATLAS-2816:


One suggestion: use the followReferences flag instead of hardcoding the 
ignoreRelationship param. This would make it easier to toggle if a certain 
deployment scenario wants the relationship details to be captured in the 
entityText.
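
One reading of this suggestion is that the ignore flag is derived from followReferences rather than passed as a literal. The sketch below assumes a hypothetical property key and a hypothetical helper; neither is taken from the patch or from the Atlas configuration, so check the real names before relying on them.

```java
import java.util.Properties;

// Hypothetical helper; NOT part of Atlas.
class FullTextToggleSketch {
    // Hypothetical key used only for this illustration.
    static final String FOLLOW_REFERENCES_KEY = "fulltext.followReferences";

    // Relationships are ignored unless the deployment opts in to
    // following references (default assumed to be false here).
    static boolean shouldIgnoreRelationships(Properties config) {
        boolean followReferences =
                Boolean.parseBoolean(config.getProperty(FOLLOW_REFERENCES_KEY, "false"));
        return !followReferences;
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        System.out.println(shouldIgnoreRelationships(config)); // prints true

        config.setProperty(FOLLOW_REFERENCES_KEY, "true");
        System.out.println(shouldIgnoreRelationships(config)); // prints false
    }
}
```

The returned value would then be what gets passed as the ignoreRelationship constructor argument, so a deployment that wants relationship details in the entityText only has to flip one setting.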

 

HTH



[jira] [Commented] (ATLAS-2816) Allow ignoring relationship in EntityGraphRetriever for FullTextMapperV2

2018-08-09 Thread Chengbing Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/ATLAS-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575686#comment-16575686
 ] 

Chengbing Liu commented on ATLAS-2816:
--

[~apoorvnaik], I just found that ATLAS-2815 removes 
{{mapRelationshipAttributes(entityVertex, entity)}} and then adds it back, 
which looks like an accidental change.
I will provide a patch based on the latest code today.



[jira] [Commented] (ATLAS-2816) Allow ignoring relationship in EntityGraphRetriever for FullTextMapperV2

2018-08-09 Thread Apoorv Naik (JIRA)


[ 
https://issues.apache.org/jira/browse/ATLAS-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575327#comment-16575327
 ] 

Apoorv Naik commented on ATLAS-2816:


Please submit a patch or create a review on reviewboard with suggested changes.
