[ https://issues.apache.org/jira/browse/ATLAS-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pinal Shah updated ATLAS-4882:
------------------------------
    Description:
*Issue:*
Export during ingestion fails, logging "Found 0 entities".
"Ingestion" here means Atlas is consuming messages.

*When is the issue seen?*
It occurs when there is a huge amount of data in the backend and Atlas is consuming messages linked to the entity whose export is running.

*Analysis to find root cause:*
 * When there is a huge amount of data in the backend, export FAILS.
 * When there is a huge amount of data in the backend but fewer tables under it, export also FAILS.
 * If background consumption stops, export PASSES.
 * If consumption is of different entities than those requested in the export, export PASSES.
 * The export query that finds the starting object uses the query below, in which the has clause that checks for the property is expensive:
{code:java}
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid'){code}
 - has('__guid') queries Solr: [(35x_t <> null)]:vertex_index
 - Time taken, as recorded in the Solr logs:
{code:java}
2024-06-14 02:38:56.218 INFO (qtp1158676965-19) [c:vertex_index s:shard1 r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request [vertex_index_shard1_replica_n1] webapp=/solr path=/select params={q=*:*&_stateVer_=vertex_index:12&fl=id&start=0&fq=35x_t:*+&rows=500000&wt=javabin&version=2} hits=1681928 status=0 QTime=4227
2024-06-14 02:40:23.945 INFO (qtp1158676965-16) [c:vertex_index s:shard1 r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request [vertex_index_shard1_replica_n1] webapp=/solr path=/select params={q=*:*&_stateVer_=vertex_index:12&fl=id&start=500000&fq=35x_t:*+&rows=500000&wt=javabin&version=2} hits=1682086 status=0 QTime=787
2024-06-14 02:41:37.703 INFO (qtp1158676965-14) [c:vertex_index s:shard1 r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request [vertex_index_shard1_replica_n1] webapp=/solr path=/select params={q=*:*&_stateVer_=vertex_index:12&fl=id&start=1000000&fq=35x_t:*+&rows=500000&wt=javabin&version=2} hits=1682216 status=0 QTime=1962
2024-06-14 02:42:20.715 INFO (qtp1158676965-20) [c:vertex_index s:shard1 r:core_node2 x:vertex_index_shard1_replica_n1] o.a.s.c.S.Request [vertex_index_shard1_replica_n1] webapp=/solr path=/select params={q=*:*&_stateVer_=vertex_index:12&fl=id&start=1500000&fq=35x_t:*+&rows=500000&wt=javabin&version=2} hits=1682363 status=0 QTime=4465
{code}
 - Running the same query through the Gremlin shell while ingestion is happening does not fail.
 - Time taken for the above Gremlin query in code during ingestion          : 214825ms
 - Time taken for the above Gremlin query in the Gremlin shell during ingestion : 104641ms
 - Time taken for the above Gremlin query with no ingestion                 : 181682ms

The root cause is still unknown.

*Workaround:*
 - Remove the .has('__guid') clause from the query below; without it the query is very quick.
{code:java}
g.V().has('__typeName','hive_db').has('Referenceable.qualifiedName','db6@cm').has('__guid').values('__guid'){code}


> Export/Import: Export exits with "Found 0 entities"
> ----------------------------------------------------
>
>                 Key: ATLAS-4882
>                 URL: https://issues.apache.org/jira/browse/ATLAS-4882
>             Project: Atlas
>          Issue Type: Bug
>          Components: atlas-core
>            Reporter: Pinal Shah
>            Assignee: Pinal Shah
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
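
The Solr request log entries quoted in the description can be summarized to see how much of the reported query time is spent inside Solr itself. The following is a minimal sketch, not part of Atlas or Solr: the class name SolrPageStats and its methods are hypothetical helpers, and the sample lines are the four /select entries from the description, shortened to the part starting at path=/select.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an Atlas or Solr class): extracts paging and timing
// fields from Solr request log lines and sums the Solr-side QTime.
class SolrPageStats {

    // One parsed /select request.
    static final class Page {
        final long start;    // paging offset (start=)
        final long rows;     // page size (rows=)
        final long hits;     // total matches reported by Solr (hits=)
        final long qtimeMs;  // Solr-side query time in ms (QTime=)

        Page(long start, long rows, long hits, long qtimeMs) {
            this.start = start;
            this.rows = rows;
            this.hits = hits;
            this.qtimeMs = qtimeMs;
        }
    }

    // The fields appear in this order in the quoted log lines.
    private static final Pattern PAGE = Pattern.compile(
            "\\bstart=(\\d+).*?\\brows=(\\d+).*?\\bhits=(\\d+).*?\\bQTime=(\\d+)");

    // The four requests from the description, shortened to the relevant tail.
    static final List<String> SAMPLE = Arrays.asList(
            "path=/select params={q=*:*&_stateVer_=vertex_index:12&fl=id&start=0&fq=35x_t:*+&rows=500000&wt=javabin&version=2} hits=1681928 status=0 QTime=4227",
            "path=/select params={q=*:*&_stateVer_=vertex_index:12&fl=id&start=500000&fq=35x_t:*+&rows=500000&wt=javabin&version=2} hits=1682086 status=0 QTime=787",
            "path=/select params={q=*:*&_stateVer_=vertex_index:12&fl=id&start=1000000&fq=35x_t:*+&rows=500000&wt=javabin&version=2} hits=1682216 status=0 QTime=1962",
            "path=/select params={q=*:*&_stateVer_=vertex_index:12&fl=id&start=1500000&fq=35x_t:*+&rows=500000&wt=javabin&version=2} hits=1682363 status=0 QTime=4465");

    // Parses every line that contains the four fields; other lines are skipped.
    static List<Page> parse(List<String> lines) {
        List<Page> pages = new ArrayList<>();
        for (String line : lines) {
            Matcher m = PAGE.matcher(line);
            if (m.find()) {
                pages.add(new Page(
                        Long.parseLong(m.group(1)),
                        Long.parseLong(m.group(2)),
                        Long.parseLong(m.group(3)),
                        Long.parseLong(m.group(4))));
            }
        }
        return pages;
    }

    // Sum of Solr-side query times across all parsed requests.
    static long totalQTimeMs(List<Page> pages) {
        long total = 0;
        for (Page p : pages) {
            total += p.qtimeMs;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Page> pages = parse(SAMPLE);
        for (Page p : pages) {
            System.out.println("start=" + p.start + " rows=" + p.rows
                    + " hits=" + p.hits + " QTime=" + p.qtimeMs + "ms");
        }
        System.out.println("total QTime = " + totalQTimeMs(pages) + "ms");
    }
}
```

For the four quoted requests the summed QTime is 11441ms, against 214825ms reported for the whole query in code during ingestion. This suggests, without proving it, that most of the elapsed time is spent outside Solr's own query execution, for example in fetching and processing the 500000-row pages. The hits value also grows between pages (1681928 to 1682363), which is consistent with ingestion adding vertices while the paged scan is running.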