[jira] [Commented] (ATLAS-4738) Dynamic Index Recovery issues and improvements
[ https://issues.apache.org/jira/browse/ATLAS-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786965#comment-17786965 ]

ASF subversion and git services commented on ATLAS-4738:

Commit 52aaf934bc0a3a6d079fe7b1722fe49b16b3f666 in atlas's branch refs/heads/branch-2.0 from Radhika Kundam
[ https://gitbox.apache.org/repos/asf?p=atlas.git;h=52aaf934b ]

ATLAS-4738: Dynamic Index Recovery issues and improvements

Signed-off-by: radhikakundam
(cherry picked from commit 882954e2969d6018d2bc8977f9dc18a6e3d1d5ce)

> Dynamic Index Recovery issues and improvements
> ----------------------------------------------
>
>          Key: ATLAS-4738
>          URL: https://issues.apache.org/jira/browse/ATLAS-4738
>      Project: Atlas
>   Issue Type: Improvement
>   Components: atlas-core
>     Reporter: Radhika Kundam
>     Assignee: Radhika Kundam
>     Priority: Major
>
> 1. Even though there are no issues with SOLR, index recovery starts on every Atlas startup because 1970-01-01T00:00:00Z is used as the default recovery start time. When no index recovery data is available, the default recovery start time should instead be derived from the TTL, i.e. (current time - tx.log TTL). The SOLR tx.log TTL is configured as 10 days by default.
> 2. The custom start time configuration atlas.graph.index.recovery.start.time does not work because it conflicts with JanusGraph's internal configuration. This config should be renamed to *atlas.index.recovery.start.time*.
> 3. In the existing architecture, when SOLR goes down the start time is recorded with a buffer of the SOLR health-check frequency, i.e. (current time - retry time of the SOLR health check). To cover possible corner cases, the buffer is doubled, so the next recovery start time becomes (current time - 2 * retry time).
> 4. Added REST support for index recovery, which gives customers the flexibility to start index recovery on demand and will be a valuable feature.
>    a. Get index recovery timing details
>       API: GET api/atlas/v2/indexrecovery/
>       Response:
>       {
>         "Latest start time":    "2023-03-25T20:34:59.704Z",  => start time of the recent/upcoming index recovery
>         "On-demand start time": "2023-03-24T06:17:23.656Z",  => start time of index recovery requested through the REST API
>         "Previous start time":  "2023-03-25T20:24:51.583Z"   => start time of the previous index recovery
>       }
>    b. Start index recovery at a specific time
>       API: POST api/atlas/v2/indexrecovery/start?startTime=2023-03-24T06:17:23.656Z
>       Response:
>       On success                => 200 OK
>       On empty startTime        => 400 Bad Request
>       On index recovery failure => 500 Internal Server Error

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
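The two endpoints described in item 4 could be exercised with a minimal Python sketch along these lines. The endpoint URL and credentials below are assumptions for illustration, not part of the report; the request-builder functions are hypothetical helper names.

```python
import base64
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint and credentials -- adjust to your deployment.
ATLAS_URL = "http://localhost:21000"
ATLAS_CREDS = ("admin", "admin")


def recovery_status_request():
    """Request spec for: GET api/atlas/v2/indexrecovery/"""
    return ("GET", "/api/atlas/v2/indexrecovery/", None)


def recovery_start_request(start_time):
    """Request spec for: POST api/atlas/v2/indexrecovery/start?startTime=..."""
    if not start_time:
        # The server answers 400 Bad Request for an empty startTime;
        # reject it client-side as well.
        raise ValueError("startTime must not be empty")
    return ("POST", "/api/atlas/v2/indexrecovery/start",
            {"startTime": start_time})


def send(spec, base_url=ATLAS_URL, creds=ATLAS_CREDS):
    """Issue the request; raises urllib.error.HTTPError on 400/500 responses."""
    method, path, params = spec
    url = base_url + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    token = base64.b64encode(":".join(creds).encode()).decode()
    req = urllib.request.Request(
        url, method=method, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(req) as resp:
        body = resp.read().decode()
        return json.loads(body) if body else None


# Usage (needs a live Atlas server):
#   details = send(recovery_status_request())
#   send(recovery_start_request("2023-03-24T06:17:23.656Z"))
```

Splitting request construction from sending keeps the URL and query-string logic testable without a running server.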
[jira] [Commented] (ATLAS-4738) Dynamic Index Recovery issues and improvements
[ https://issues.apache.org/jira/browse/ATLAS-4738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786964#comment-17786964 ]

ASF subversion and git services commented on ATLAS-4738:

Commit 882954e2969d6018d2bc8977f9dc18a6e3d1d5ce in atlas's branch refs/heads/master from Radhika Kundam
[ https://gitbox.apache.org/repos/asf?p=atlas.git;h=882954e29 ]

ATLAS-4738: Dynamic Index Recovery issues and improvements

Signed-off-by: radhikakundam

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (ATLAS-4769) Duplicate Relationships
[ https://issues.apache.org/jira/browse/ATLAS-4769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786703#comment-17786703 ]

Toshiki Fukasawa commented on ATLAS-4769:

Please let me know if there are any mistakes in the steps to reproduce. Is this use case not expected?

> Duplicate Relationships
> -----------------------
>
>               Key: ATLAS-4769
>               URL: https://issues.apache.org/jira/browse/ATLAS-4769
>           Project: Atlas
>        Issue Type: Bug
>        Components: atlas-core
> Affects Versions: 2.2.0, 2.3.0
>       Environment: Apache Atlas Python client: 0.0.11
>                    Operating System: Rocky Linux 8.5
>          Reporter: Toshiki Fukasawa
>          Priority: Major
>       Attachments: simple-test.py
>
> When entities with the same qualifiedName and the same relationship are registered multiple times, unexpected behavior occurs. Specifically, even though only one entity is registered, the relationshipAttributes section contains multiple instances of the same entity. This inconsistency arises when different entities are registered in between the repeated registrations of the same entity.
>
> Expected Behavior:
> The Relationships section should accurately reflect the number of entities registered with the specified qualifiedName and relationship. Duplicate registrations should not result in multiple instances of the same entity being recorded in the relationshipAttributes section.
>
> Steps to Reproduce:
> 1. Register the "dataA" entity with the qualifiedName "dataA_q" as a relationship of the Process entity.
> 2. Register the "dataB" entity with the qualifiedName "dataB_q" as a relationship of the Process entity.
> 3. Register the "dataC" entity with the qualifiedName "dataC_q" as a relationship of the Process entity.
> 4. Register the "dataB" entity with the same qualifiedName as a relationship of the Process entity again.
> 5. Register the "dataC" entity with the same qualifiedName as a relationship of the Process entity again.
> 6. Observe the Relationships section.
> In version 2.3.0, duplicate relationships occur even when the registration order is A->B->A->B.
>
> Reproducible program:
> I have created a program that demonstrates the issue: simple-test.py
> Running this will result in:
> {noformat}
> Recorded DataSet Entities
> {"typeName": "DataSet", "attributes": {"qualifiedName": "dataA_q", "name": "dataA"}, "guid": "471398f8-679b-4015-a60f-28bd3b4e315f", "status": "ACTIVE", "displayText": "dataA", "classificationNames": [], "classifications": [], "meaningNames": [], "meanings": null, "isIncomplete": false, "labels": []}
> {"typeName": "DataSet", "attributes": {"qualifiedName": "dataB_q", "name": "dataB"}, "guid": "bb6e6d93-38d8-4ae5-ac1d-3e5be880401e", "status": "ACTIVE", "displayText": "dataB", "classificationNames": [], "classifications": [], "meaningNames": [], "meanings": null, "isIncomplete": false, "labels": []}
> {"typeName": "DataSet", "attributes": {"qualifiedName": "dataC_q", "name": "dataC"}, "guid": "1026e5c9-acab-42a4-a4fe-ea6f08e4c81f", "status": "ACTIVE", "displayText": "dataC", "classificationNames": [], "classifications": [], "meaningNames": [], "meanings": null, "isIncomplete": false, "labels": []}
>
> Recorded relationshipAttributes of the Process
> {'guid': '471398f8-679b-4015-a60f-28bd3b4e315f', 'typeName': 'DataSet', 'entityStatus': 'ACTIVE', 'displayText': 'dataA', 'relationshipType': 'dataset_process_inputs', 'relationshipGuid': '4e270e39-f592-473f-af5a-4976f7252428', 'relationshipStatus': 'DELETED', 'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}
> {'guid': 'bb6e6d93-38d8-4ae5-ac1d-3e5be880401e', 'typeName': 'DataSet', 'entityStatus': 'ACTIVE', 'displayText': 'dataB', 'relationshipType': 'dataset_process_inputs', 'relationshipGuid': 'bbe2c759-fc53-42f9-8d7a-6c2966ba25e1', 'relationshipStatus': 'DELETED', 'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}
> {'guid': 'bb6e6d93-38d8-4ae5-ac1d-3e5be880401e', 'typeName': 'DataSet', 'entityStatus': 'ACTIVE', 'displayText': 'dataB', 'relationshipType': 'dataset_process_inputs', 'relationshipGuid': 'b71cb1f3-7cc5-4b53-a35f-659d73b97c15', 'relationshipStatus': 'DELETED', 'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}
> {'guid': '1026e5c9-acab-42a4-a4fe-ea6f08e4c81f', 'typeName': 'DataSet', 'entityStatus': 'ACTIVE', 'displayText': 'dataC', 'relationshipType': 'dataset_process_inputs', 'relationshipGuid': 'cae464a4-50bc-4acb-b5e2-73bb9b2b9f75', 'relationshipStatus': 'DELETED', 'relationshipAttributes': {'typeName': 'dataset_process_inputs'}}
> {'guid': '1026e5c9-acab-42a4-a4fe-ea6f08e4c81f', 'typeName': 'DataSet', 'entityStatus': 'ACTIVE', 'displayText': 'dataC', 'relationshipType': 'dataset_process_inputs', 'relationshipGuid': '5323ffcb-b996-4437-83fb-810d304fb45b',
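The reproduction steps above could be sketched as the payloads a script like the attached simple-test.py would post to Atlas's v2 entity REST API. This is a sketch under assumptions: the Process name/qualifiedName ("proc"/"proc_q") and the use of raw entity payloads rather than the Python client are illustrative, not taken from the report.

```python
# Payload builders for POST api/atlas/v2/entity (AtlasEntityWithExtInfo shape).
# DataSet names and qualifiedNames follow the report; the Process identifiers
# below are hypothetical.

def dataset_entity(name, qualified_name):
    """Payload registering a DataSet entity keyed by its qualifiedName."""
    return {"entity": {"typeName": "DataSet",
                       "attributes": {"name": name,
                                      "qualifiedName": qualified_name}}}


def process_entity(name, qualified_name, input_qnames):
    """Payload registering a Process whose inputs reference DataSets by
    uniqueAttributes, so re-registration resolves to the existing entities
    instead of creating new ones."""
    inputs = [{"typeName": "DataSet",
               "uniqueAttributes": {"qualifiedName": qn}}
              for qn in input_qnames]
    return {"entity": {"typeName": "Process",
                       "attributes": {"name": name,
                                      "qualifiedName": qualified_name,
                                      "inputs": inputs,
                                      "outputs": []}}}


# Steps 1-6 of the report: dataB and dataC are registered twice, with other
# entities registered in between, which is when the duplicates appear.
registration_order = [("dataA", "dataA_q"), ("dataB", "dataB_q"),
                      ("dataC", "dataC_q"), ("dataB", "dataB_q"),
                      ("dataC", "dataC_q")]
payloads = [(dataset_entity(n, q), process_entity("proc", "proc_q", [q]))
            for n, q in registration_order]
```

Each pair would be POSTed in order to a running Atlas instance; the report's output shows the second registrations of dataB and dataC each leaving an extra (DELETED) dataset_process_inputs relationship behind.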