Re: Review Request 73076: Deferred Actions Implementation

2021-04-13 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated April 13, 2021, 4:28 p.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: 
- Modified implementation of EntityGraphRetriever.traverseImpactedVertices to 
avoid using recursion. This was a result of testing done that caused 
StackOverflow exception when testing with 5K propagations.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.

**REST APIs**
+-++--+--+
| api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
fetched. |
|   |  |  | If list of guids supplied is empty all 
tasks |
|   |  |  | that are in PENDING state are fetched. 
If list   |
|   |  |  | guids is specified, tasks corresponding 
to that guid are fetched.|
+---+--+--+--+
|api/admin/task |DELETE| guids| List of guids to be deleted. No action 
is taken if list of guids |
|   |  |  | supplied is empty or a Task guid does 
not exist. |
+---+--+--+--+

**New Updates**
- Separate log file for tasks.
- TaskDef now has _startTime_ and _endTime_ to indicating start and end of task.
- Refactored parameter passing during task creation. It now accepts Map instead 
of Object array. Each task creates its own set of parameters.
- Reduced use of TASKS_ENABLE flag checks.
- Added _AtlasEntity.getPendingTasks()_ that indicidate presence of tasks for a 
given entity.


Diffs (updated)
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 771287f75 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java 4d8c94894 
  intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 27c7f7391 
  intg/src/main/java/org/apache/atlas/type/AtlasTypeRegistry.java 4c7f8c681 
  intg/src/main/java/org/apache/atlas/type/Constants.java 7ee852086 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
 d0ffb853f 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/DeleteHandlerDelegate.java
 1aaea64d7 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/DeleteHandlerV1.java
 7b2e2d371 
  

Re: Review Request 73076: Deferred Actions Implementation

2021-04-08 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated April 8, 2021, 11:26 p.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include:
- Now using tasks for propagating tasks during entity creation.
- Additional refactoring. Removed dead code.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.

**REST APIs**
+-++--+--+
| api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
fetched. |
|   |  |  | If list of guids supplied is empty all 
tasks |
|   |  |  | that are in PENDING state are fetched. 
If list   |
|   |  |  | guids is specified, tasks corresponding 
to that guid are fetched.|
+---+--+--+--+
|api/admin/task |DELETE| guids| List of guids to be deleted. No action 
is taken if list of guids |
|   |  |  | supplied is empty or a Task guid does 
not exist. |
+---+--+--+--+

**New Updates**
- Separate log file for tasks.
- TaskDef now has _startTime_ and _endTime_ to indicating start and end of task.
- Refactored parameter passing during task creation. It now accepts Map instead 
of Object array. Each task creates its own set of parameters.
- Reduced use of TASKS_ENABLE flag checks.
- Added _AtlasEntity.getPendingTasks()_ that indicidate presence of tasks for a 
given entity.


Diffs (updated)
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 771287f75 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java 4d8c94894 
  intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 27c7f7391 
  intg/src/main/java/org/apache/atlas/type/AtlasTypeRegistry.java 4c7f8c681 
  intg/src/main/java/org/apache/atlas/type/Constants.java 7ee852086 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
 d0ffb853f 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/DeleteHandlerDelegate.java
 1aaea64d7 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/DeleteHandlerV1.java
 7b2e2d371 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/HardDeleteHandlerV1.java
 c241e2371 
  

Re: Review Request 73076: Deferred Actions Implementation

2021-03-12 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated March 13, 2021, 1 a.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: 
- Addressed review comments.
- Modified approach for using _GraphTransactionInterceptor_. The factory now 
adds itself to _TaskManagement_. Removed _MethodInvocationAdapter_.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.

**REST APIs**
+-++--+--+
| api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
fetched. |
|   |  |  | If list of guids supplied is empty all 
tasks |
|   |  |  | that are in PENDING state are fetched. 
If list   |
|   |  |  | guids is specified, tasks corresponding 
to that guid are fetched.|
+---+--+--+--+
|api/admin/task |DELETE| guids| List of guids to be deleted. No action 
is taken if list of guids |
|   |  |  | supplied is empty or a Task guid does 
not exist. |
+---+--+--+--+

**New Updates**
- Separate log file for tasks.
- TaskDef now has _startTime_ and _endTime_ to indicating start and end of task.
- Refactored parameter passing during task creation. It now accepts Map instead 
of Object array. Each task creates its own set of parameters.
- Reduced use of TASKS_ENABLE flag checks.
- Added _AtlasEntity.getPendingTasks()_ that indicidate presence of tasks for a 
given entity.


Diffs (updated)
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 771287f75 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java 4d8c94894 
  intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 27c7f7391 
  intg/src/main/java/org/apache/atlas/type/AtlasTypeRegistry.java 4c7f8c681 
  intg/src/main/java/org/apache/atlas/type/Constants.java 7ee852086 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
 d0ffb853f 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
 8208d11c2 
  

Re: Review Request 73076: Deferred Actions Implementation

2021-03-12 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated March 12, 2021, 8:29 p.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: 
- Refactored the new MethodInvocationAdapter.
- Refactored _ClassificationPropagationTasks_.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.

**REST APIs**
+-++--+--+
| api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
fetched. |
|   |  |  | If list of guids supplied is empty all 
tasks |
|   |  |  | that are in PENDING state are fetched. 
If list   |
|   |  |  | guids is specified, tasks corresponding 
to that guid are fetched.|
+---+--+--+--+
|api/admin/task |DELETE| guids| List of guids to be deleted. No action 
is taken if list of guids |
|   |  |  | supplied is empty or a Task guid does 
not exist. |
+---+--+--+--+

**New Updates**
- Separate log file for tasks.
- TaskDef now has _startTime_ and _endTime_ to indicating start and end of task.
- Refactored parameter passing during task creation. It now accepts Map instead 
of Object array. Each task creates its own set of parameters.
- Reduced use of TASKS_ENABLE flag checks.
- Added _AtlasEntity.getPendingTasks()_ that indicidate presence of tasks for a 
given entity.


Diffs (updated)
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 771287f75 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java 4d8c94894 
  intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 27c7f7391 
  intg/src/main/java/org/apache/atlas/type/AtlasTypeRegistry.java 4c7f8c681 
  intg/src/main/java/org/apache/atlas/type/Constants.java 7ee852086 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
 d0ffb853f 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
 8208d11c2 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java
 fe8699dca 
  

Re: Review Request 73076: Deferred Actions Implementation

2021-03-11 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated March 12, 2021, 6:41 a.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: 
- Addressed review comments.
- Addressed the problem of postTransaction notifications not getting sent as 
GraphTransaction interceptor was not being called.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.

**REST APIs**
+-++--+--+
| api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
fetched. |
|   |  |  | If list of guids supplied is empty all 
tasks |
|   |  |  | that are in PENDING state are fetched. 
If list   |
|   |  |  | guids is specified, tasks corresponding 
to that guid are fetched.|
+---+--+--+--+
|api/admin/task |DELETE| guids| List of guids to be deleted. No action 
is taken if list of guids |
|   |  |  | supplied is empty or a Task guid does 
not exist. |
+---+--+--+--+

**New Updates**
- Separate log file for tasks.
- TaskDef now has _startTime_ and _endTime_ to indicating start and end of task.
- Refactored parameter passing during task creation. It now accepts Map instead 
of Object array. Each task creates its own set of parameters.
- Reduced use of TASKS_ENABLE flag checks.
- Added _AtlasEntity.getPendingTasks()_ that indicidate presence of tasks for a 
given entity.


Diffs (updated)
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 771287f75 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java 4d8c94894 
  intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 27c7f7391 
  intg/src/main/java/org/apache/atlas/type/AtlasTypeRegistry.java 4c7f8c681 
  intg/src/main/java/org/apache/atlas/type/Constants.java 7ee852086 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
 d0ffb853f 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
 8208d11c2 
  

Re: Review Request 73076: Deferred Actions Implementation

2021-03-11 Thread Ashutosh Mestry via Review Board


> On March 11, 2021, 8:52 p.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
> > Lines 2817 (patched)
> > 
> >
> > this will end up re-adding existing pending tasks to the list again.
> > 
> > Consider adding only the new taskGuid:
> > 
> > AtlasGraphUtilsV2.addEncodedProperty(entityVertex, 
> > PENDING_TASKS_PROPERTY_KEY, taskGuid);

Not sure how to do this, current implementation works fine.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222668
---


On March 11, 2021, 6:49 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated March 11, 2021, 6:49 a.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> **REST APIs**
> +-++--+--+
> | api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
> fetched. |
> |   |  |  | If list of guids supplied is empty 
> all tasks |
> |   |  |  | that are in PENDING state are 
> fetched. If list   |
> |   |  |  | guids is specified, tasks 
> corresponding to that guid are fetched.|
> +---+--+--+--+
> |api/admin/task |DELETE| guids| List of guids to be deleted. No 
> action is taken if list of guids |
> |   |  |  | supplied is empty or a Task guid does 
> not exist. |
> +---+--+--+--+
> 
> **New Updates**
> - Separate log file for tasks.
> - TaskDef now has _startTime_ and _endTime_ to indicating start and end of 
> task.
> - Refactored parameter passing during task creation. It now accepts Map 
> instead of Object array. Each task creates its own set of parameters.
> - Reduced use of TASKS_ENABLE flag checks.
> - Added _AtlasEntity.getPendingTasks()_ that indicidate presence of tasks for 
> a given entity.
> 
> 
> Diffs
> -
> 
>   common/src/main/java/org/apache/atlas/repository/Constants.java 771287f75 
>   distro/src/conf/atlas-log4j.xml 7df963eb2 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
>   

Re: Review Request 73076: Deferred Actions Implementation

2021-03-11 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222669
---




repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Line 2007 (original), 2083 (patched)


consider reverting back this renaming (2007 and 2011) for better clarity



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
Lines 54 (patched)


consider renaming atlasTask => task



repository/src/main/java/org/apache/atlas/tasks/TaskRegistry.java
Lines 163 (patched)


atlasTask => task


- Sarath Subramanian


On March 10, 2021, 10:49 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated March 10, 2021, 10:49 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> **REST APIs**
> +-++--+--+
> | api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
> fetched. |
> |   |  |  | If list of guids supplied is empty 
> all tasks |
> |   |  |  | that are in PENDING state are 
> fetched. If list   |
> |   |  |  | guids is specified, tasks 
> corresponding to that guid are fetched.|
> +---+--+--+--+
> |api/admin/task |DELETE| guids| List of guids to be deleted. No 
> action is taken if list of guids |
> |   |  |  | supplied is empty or a Task guid does 
> not exist. |
> +---+--+--+--+
> 
> **New Updates**
> - Separate log file for tasks.
> - TaskDef now has _startTime_ and _endTime_ to indicating start and end of 
> task.
> - Refactored parameter passing during task creation. It now accepts Map 
> instead of Object array. Each task creates its own set of parameters.
> - Reduced use of TASKS_ENABLE flag checks.
> - Added _AtlasEntity.getPendingTasks()_ that indicidate presence of tasks for 
> a given entity.
> 
> 
> Diffs
> -
> 
>   common/src/main/java/org/apache/atlas/repository/Constants.java 771287f75 
>   distro/src/conf/atlas-log4j.xml 7df963eb2 
>   

Re: Review Request 73076: Deferred Actions Implementation

2021-03-11 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222668
---




repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2817 (patched)


this will end up re-adding existing pending tasks to the list again.

Consider adding only the new taskGuid:

AtlasGraphUtilsV2.addEncodedProperty(entityVertex, 
PENDING_TASKS_PROPERTY_KEY, taskGuid);


- Sarath Subramanian


On March 10, 2021, 10:49 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated March 10, 2021, 10:49 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> **REST APIs**
> +-++--+--+
> | api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
> fetched. |
> |   |  |  | If list of guids supplied is empty 
> all tasks |
> |   |  |  | that are in PENDING state are 
> fetched. If list   |
> |   |  |  | guids is specified, tasks 
> corresponding to that guid are fetched.|
> +---+--+--+--+
> |api/admin/task |DELETE| guids| List of guids to be deleted. No 
> action is taken if list of guids |
> |   |  |  | supplied is empty or a Task guid does 
> not exist. |
> +---+--+--+--+
> 
> **New Updates**
> - Separate log file for tasks.
> - TaskDef now has _startTime_ and _endTime_ to indicating start and end of 
> task.
> - Refactored parameter passing during task creation. It now accepts Map 
> instead of Object array. Each task creates its own set of parameters.
> - Reduced use of TASKS_ENABLE flag checks.
> - Added _AtlasEntity.getPendingTasks()_ that indicidate presence of tasks for 
> a given entity.
> 
> 
> Diffs
> -
> 
>   common/src/main/java/org/apache/atlas/repository/Constants.java 771287f75 
>   distro/src/conf/atlas-log4j.xml 7df963eb2 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
>   intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java 
> 4d8c94894 
>   intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 

Re: Review Request 73076: Deferred Actions Implementation

2021-03-10 Thread Ashutosh Mestry via Review Board


> On March 9, 2021, 6:27 a.m., Madhan Neethiraj wrote:
> > intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java
> > Lines 184 (patched)
> > 
> >
> > - getTaskVertex() => getAdditionalInfo()?
> > - @JsonIgnore on additionalInfo: is it not necessary to 
> > serialize/deserialize additionalInfo?

I have removed this field and now using an explicit call to fetch vertex using 
guid.


> On March 9, 2021, 6:27 a.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java
> > Lines 152 (patched)
> > 
> >
> > Is this commit() only for changes to task status? Or does this include 
> > entity/classfication updates too? In its later, then wouldn't graph DB 
> > status be incorrect in case of Exception i.e. block at #137?

This is only for task changes.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222660
---


On March 8, 2021, 6:19 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated March 8, 2021, 6:19 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> **REST APIs**
> +-++--+--+
> | api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
> fetched. |
> |   |  |  | If list of guids supplied is empty 
> all tasks |
> |   |  |  | that are in PENDING state are 
> fetched. If list   |
> |   |  |  | guids is specified, tasks 
> corresponding to that guid are fetched.|
> +---+--+--+--+
> |api/admin/task |DELETE| guids| List of guids to be deleted. No 
> action is taken if list of guids |
> |   |  |  | supplied is empty or a Task guid does 
> not exist. |
> +---+--+--+--+
> 
> **New Updates**
> - Separate log file for tasks.
> - TaskDef now has _startTime_ and _endTime_ to indicating start and end of 
> task.
> - Refactored parameter passing during task creation. It now accepts Map 
> instead of Object array. Each task creates its own set of parameters.
> - Reduced use of TASKS_ENABLE flag 

Re: Review Request 73076: Deferred Actions Implementation

2021-03-10 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated March 11, 2021, 6:49 a.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include:
- Addressed review comments.
- Simplified _TaskExecutor_ implementation.
- Additional unit tests.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.

**REST APIs**
+-++--+--+
| api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
fetched. |
|   |  |  | If list of guids supplied is empty all 
tasks |
|   |  |  | that are in PENDING state are fetched. 
If list   |
|   |  |  | guids is specified, tasks corresponding 
to that guid are fetched.|
+---+--+--+--+
|api/admin/task |DELETE| guids| List of guids to be deleted. No action 
is taken if list of guids |
|   |  |  | supplied is empty or a Task guid does 
not exist. |
+---+--+--+--+

**New Updates**
- Separate log file for tasks.
- TaskDef now has _startTime_ and _endTime_ to indicating start and end of task.
- Refactored parameter passing during task creation. It now accepts Map instead 
of Object array. Each task creates its own set of parameters.
- Reduced use of TASKS_ENABLE flag checks.
- Added _AtlasEntity.getPendingTasks()_ that indicidate presence of tasks for a 
given entity.


Diffs (updated)
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 771287f75 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java 4d8c94894 
  intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 27c7f7391 
  intg/src/main/java/org/apache/atlas/type/AtlasTypeRegistry.java 4c7f8c681 
  intg/src/main/java/org/apache/atlas/type/Constants.java 7ee852086 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
 d0ffb853f 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
 8208d11c2 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java
 fe8699dca 
  

Re: Review Request 73076: Deferred Actions Implementation

2021-03-08 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222660
---




intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java
Lines 99 (patched)


It will help to add following comment for pendingTasks field:
  private Set pendingTasks; // read-only field i.e. value provided 
is ignored during entity create/update



intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java
Lines 544 (patched)


equals() and hashCode() need not take pendingTasks value into account.



intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java
Lines 184 (patched)


- getTaskVertex() => getAdditionalInfo()?
- @JsonIgnore on additionalInfo: is it not necessary to 
serialize/deserialize additionalInfo?



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Line 145 (original), 164 (patched)


Consider injecting taskManagement in the constructor. And mark it as final. 
Its value shouldn't depend on DEFERRED_ACTION_ENABLED.

And update setTasksUseFlag() from:
  setTasksUseFlag(TaskManagement taskManagement)
to:
  setTasksUseFlag(boolean useTasks)



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2053 (patched)


impacted-vertices list depends on classification being propagated - right? 
Some edges might explicitly not propagate a classification. Please review how 
this is handled? Perhaps replace:
 entityRetriever.getImpactedVerticesV2(entityVertex)
with:
 entityRetriever.getImpactedVerticesV2(entityVertex, null, 
classificationVertexId);



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2800 (patched)


Is this necessary here? Locks should be obtained much earlier in the 
add/update/remove classification sequence. Please review if attempting to 
obtain lock here can cause dead lock if 2 threads are holding lock on one 
entity each, and waiting to get lock on other entity.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2811 (patched)


Instead of re-adding exising pendingTasks, consider adding only taskGuid - 
the new task.
  AtlasGraphUtilsV2.addEncodedProperty(entityVertex, 
PENDING_TASKS_PROPERTY_KEY, taskGuid);



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2814 (patched)


The locks is supposed to be released after commit is performed; here it is 
released before commit. How does this lock help?

Please review similar usage in removePendingTaskFromEntityVertex() as well.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2838 (patched)


Why would taskManagement be null, if it is injected during construction?



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
Lines 1668 (patched)


ret might already be a set. In that case, creation of HasSet<> can be 
avoided.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
Lines 90 (patched)


Should this be rollback() in case of failures i.e. block at #85?



repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java
Lines 42 (patched)


consumerThread is initialized only during construction - though in a 
private method called from constructor. Consider marking consumerThread as 
final, by moving the assignment in the private method to constructor.



repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java
Lines 152 (patched)


Is this commit() only for changes to task status? Or does this include 
entity/classfication updates too? In its later, then wouldn't graph DB status 
be incorrect in case of Exception i.e. block at #137?



repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
Lines 57 (patched)


TaskRegistry must be injected for annotations on its methods to work - like 
GraphTransaction. Please review and update.




Re: Review Request 73076: Deferred Actions Implementation

2021-03-08 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222659
---


Ship it!




Ship It!

- Sarath Subramanian


On March 8, 2021, 10:19 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated March 8, 2021, 10:19 a.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> **REST APIs**
> +-++--+--+
> | api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
> fetched. |
> |   |  |  | If list of guids supplied is empty 
> all tasks |
> |   |  |  | that are in PENDING state are 
> fetched. If list   |
> |   |  |  | guids is specified, tasks 
> corresponding to that guid are fetched.|
> +---+--+--+--+
> |api/admin/task |DELETE| guids| List of guids to be deleted. No 
> action is taken if list of guids |
> |   |  |  | supplied is empty or a Task guid does 
> not exist. |
> +---+--+--+--+
> 
> **New Updates**
> - Separate log file for tasks.
> - TaskDef now has _startTime_ and _endTime_ to indicating start and end of 
> task.
> - Refactored parameter passing during task creation. It now accepts Map 
> instead of Object array. Each task creates its own set of parameters.
> - Reduced use of TASKS_ENABLE flag checks.
> - Added _AtlasEntity.getPendingTasks()_ that indicidate presence of tasks for 
> a given entity.
> 
> 
> Diffs
> -
> 
>   common/src/main/java/org/apache/atlas/repository/Constants.java 771287f75 
>   distro/src/conf/atlas-log4j.xml 7df963eb2 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
>   intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java 
> 4d8c94894 
>   intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 27c7f7391 
>   intg/src/main/java/org/apache/atlas/type/AtlasTypeRegistry.java 4c7f8c681 
>   intg/src/main/java/org/apache/atlas/type/Constants.java 7ee852086 
>   intg/src/test/resources/atlas-application.properties 7e74d5107 
>   repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
> 86b369fc9 
>   
> 

Re: Review Request 73076: Deferred Actions Implementation

2021-03-08 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated March 8, 2021, 6:19 p.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: Addressed review comments.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.

**REST APIs**
+-++--+--+
| api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
fetched. |
|   |  |  | If list of guids supplied is empty all 
tasks |
|   |  |  | that are in PENDING state are fetched. 
If list   |
|   |  |  | guids is specified, tasks corresponding 
to that guid are fetched.|
+---+--+--+--+
|api/admin/task |DELETE| guids| List of guids to be deleted. No action 
is taken if list of guids |
|   |  |  | supplied is empty or a Task guid does 
not exist. |
+---+--+--+--+

**New Updates**
- Separate log file for tasks.
- TaskDef now has _startTime_ and _endTime_ to indicating start and end of task.
- Refactored parameter passing during task creation. It now accepts Map instead 
of Object array. Each task creates its own set of parameters.
- Reduced use of TASKS_ENABLE flag checks.
- Added _AtlasEntity.getPendingTasks()_ that indicidate presence of tasks for a 
given entity.


Diffs (updated)
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 771287f75 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java 4d8c94894 
  intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 27c7f7391 
  intg/src/main/java/org/apache/atlas/type/AtlasTypeRegistry.java 4c7f8c681 
  intg/src/main/java/org/apache/atlas/type/Constants.java 7ee852086 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
 d0ffb853f 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
 8208d11c2 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  

Re: Review Request 73076: Deferred Actions Implementation

2021-03-03 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222654
---


Fix it, then Ship it!





intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java
Lines 98 (patched)


'pendingTasks'is never used?



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2781 (patched)


why createAndQueueTask() need both 'entityVertex' and 'entityGuid'. 
Consider passing only 'entityVertex' and retrieve 'entityGuid' from 
entityVertex to avoid duplicates in methods params.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2795 (patched)


'entityGuid' and 'entityVertex' doesn't need to be duplicated in methods 
params. Fetch entityGuid from entityVertex.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
Lines 78 (patched)


remove task guid from entity vertex after task completion?

entityGraphMapper.removePendingTaskFromEntityVertex(entityGuid, 
getTaskGuid())



repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
Lines 199 (patched)


REPORT_FREQUENCEY => REPORT_FREQUENCY


- Sarath Subramanian


On Feb. 25, 2021, 10:08 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Feb. 25, 2021, 10:08 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> **REST APIs**
> +-++--+--+
> | api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
> fetched. |
> |   |  |  | If list of guids supplied is empty 
> all tasks |
> |   |  |  | that are in PENDING state are 
> fetched. If list   |
> |   |  |  | guids is specified, tasks 
> corresponding to that guid are fetched.|
> +---+--+--+--+
> |api/admin/task |DELETE| guids| List of guids to be deleted. No 
> action is taken if list of guids |
> |   |  |  | supplied is empty or a Task guid does 
> not exist. |
> 

Re: Review Request 73076: Deferred Actions Implementation

2021-02-25 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated Feb. 26, 2021, 6:08 a.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: 
- Addressed failing Basic Search tests due to new fields.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.

**REST APIs**
+-++--+--+
| api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
fetched. |
|   |  |  | If list of guids supplied is empty all 
tasks |
|   |  |  | that are in PENDING state are fetched. 
If list   |
|   |  |  | guids is specified, tasks corresponding 
to that guid are fetched.|
+---+--+--+--+
|api/admin/task |DELETE| guids| List of guids to be deleted. No action 
is taken if list of guids |
|   |  |  | supplied is empty or a Task guid does 
not exist. |
+---+--+--+--+

**New Updates**
- Separate log file for tasks.
- TaskDef now has _startTime_ and _endTime_ to indicating start and end of task.
- Refactored parameter passing during task creation. It now accepts Map instead 
of Object array. Each task creates its own set of parameters.
- Reduced use of TASKS_ENABLE flag checks.
- Added _AtlasEntity.getPendingTasks()_ that indicidate presence of tasks for a 
given entity.


Diffs (updated)
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 61abfcaca 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java 4d8c94894 
  intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 27c7f7391 
  intg/src/main/java/org/apache/atlas/type/AtlasTypeRegistry.java 4c7f8c681 
  intg/src/main/java/org/apache/atlas/type/Constants.java 7ee852086 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
 d0ffb853f 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
 8208d11c2 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  

Re: Review Request 73076: Deferred Actions Implementation

2021-02-24 Thread Ashutosh Mestry via Review Board


> On Feb. 19, 2021, 8:58 p.m., Madhan Neethiraj wrote:
> > common/src/main/java/org/apache/atlas/repository/Constants.java
> > Lines 212 (patched)
> > 
> >
> > - TASK_TYPE_PROPERTY_KEY and TASK_TYPE are duplicates
> > - Is TASK_TYPE_NAME needed, given the type name will be stored in 
> > TASK_TYPE

TASK_TYPE_PROPERTY_KEY is the vertex type. Helps with retrieval 
(getAllTaskTypes). TASK_TYPE is the type of task e.g. PROPAGATION_ADD, 
PROPAGATION_DELETE, etc.


> On Feb. 19, 2021, 8:58 p.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java
> > Lines 52 (patched)
> > 
> >
> > It looks like tasks are saved in the store only just before execution - 
> > in TaskConsumer.run(). That could be after a long time from this point. 
> > This increases the chances of loosing the tasks - consider Atlas 
> > shutdown/goes down/becomes passive. Consider saving all tasks to store 
> > before adding to taskList, to minimize the chances of loosing tasks.

Tasks are committed as part of the parent transaction. They are passed on for 
execution once the transaction succeeds. If parent transactions fails, the task 
is removed. If it succeeds and server shuts down, they are retrieved upon 
server startup as part of pending transaction processing.


> On Feb. 19, 2021, 8:58 p.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java
> > Lines 92 (patched)
> > 
> >
> > Task execution is skipped if task.getAdditionalInfo() is null; purpose 
> > of this condition is not clear. Consider adding a comment.

I have updated the name of the method from getAdditionalInformation to 
getTaskVertex.


> On Feb. 19, 2021, 8:58 p.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java
> > Lines 100 (patched)
> > 
> >
> > When will such tasks (that exceeded retry count) be removed from store?

The tasks won't ever be removed. Only potential solution is to run purge 
operation.


> On Feb. 19, 2021, 8:58 p.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java
> > Lines 123 (patched)
> > 
> >
> > - COMPLETE_WITH_ERRORS => FAILED?
> > - when will failed tasks be removed from the store?

In current logic, never.


> On Feb. 19, 2021, 8:58 p.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
> > Lines 54 (patched)
> > 
> >
> > In case of HA deployment, startInternal() should be deferred until 
> > instanceIsActive() is called. Please refer to NotificationHookConsumer.

I figured out the problem. I have addresed it.


> On Feb. 19, 2021, 8:58 p.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
> > Lines 73 (patched)
> > 
> >
> > In case of HA deployment, instanceIsPassive() should stop processing 
> > queuedTasks, as another instance will start processing soon.

In case of HA, passive instance will not process tasks. This is in line with 
other components.


> On Feb. 19, 2021, 8:58 p.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
> > Lines 122 (patched)
> > 
> >
> > Why only pending tasks? All tasks in the store are to be executed, 
> > right?

That is right. All tasks that are available should be run.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222619
---


On Feb. 19, 2021, 3:49 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Feb. 19, 2021, 3:49 a.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The 

Re: Review Request 73076: Deferred Actions Implementation

2021-02-24 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated Feb. 25, 2021, 5:27 a.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: 
- Addressed review comments.
- Updated/modified REST APIs.
- Added _Statistics_ class that manages periodic updates to tasks.log.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description (updated)
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.

**REST APIs**
+-++--+--+
| api/admin/tasks   | GET  |  guids   |  (optional) List of guids to be 
fetched. |
|   |  |  | If list of guids supplied is empty all 
tasks |
|   |  |  | that are in PENDING state are fetched. 
If list   |
|   |  |  | guids is specified, tasks corresponding 
to that guid are fetched.|
+---+--+--+--+
|api/admin/task |DELETE| guids| List of guids to be deleted. No action 
is taken if list of guids |
|   |  |  | supplied is empty or a Task guid does 
not exist. |
+---+--+--+--+

**New Updates**
- Separate log file for tasks.
- TaskDef now has _startTime_ and _endTime_ to indicating start and end of task.
- Refactored parameter passing during task creation. It now accepts Map instead 
of Object array. Each task creates its own set of parameters.
- Reduced use of TASKS_ENABLE flag checks.
- Added _AtlasEntity.getPendingTasks()_ that indicidate presence of tasks for a 
given entity.


Diffs (updated)
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 61abfcaca 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/instance/AtlasEntity.java 4d8c94894 
  intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasEntityType.java 27c7f7391 
  intg/src/main/java/org/apache/atlas/type/Constants.java 7ee852086 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
 d0ffb853f 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
 8208d11c2 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  

Re: Review Request 73076: Deferred Actions Implementation

2021-02-19 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222619
---




common/src/main/java/org/apache/atlas/repository/Constants.java
Lines 212 (patched)


- TASK_TYPE_PROPERTY_KEY and TASK_TYPE are duplicates
- Is TASK_TYPE_NAME needed, given the type name will be stored in TASK_TYPE



repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java
Lines 246 (patched)


Can taskManagement be null, given it is injected? Please review.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 159 (patched)


Shouldn't taskManagement be injected, similar to 
GraphTransactionInterceptor? If yes:
 - remove getTaskManagement()
 - mark as final



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 1965 (patched)


task should be added only if propagateTags is true, right? Consider 
replacing this 'if' with:
  if (propagateTags && DEFERRED_ACTION_ENABLED) {



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2035 (patched)


LOG.warn("propagateClassification(entityGuid={}, 
classificationVertexId={}): entity vertex not found", entityGuid, 
classificationVertexId);



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2041 (patched)


- consider changing the log level to debug, to avoid noise in the log file
- LOG.debug("propagateClassification(entityGuid={}, 
classificationVertexId={}): found no entities to propagate the classification", 
entityGuid, classificationVertexId);



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2050 (patched)


- classificationName will be null, if classificationVertex is null; so the 
log message may not be helpful. Consider updating to:
  LOG.warn("propagateClassification(entityGuid={}, 
classificationVertexId={}): classification vertex not found", entityGuid, 
classificationVertexId);
- move #2049 - #2052, immediately after #2045?



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2063 (patched)


- LOG.error("propagateClassification(entityGuid={}, 
classificationVertexId={}): error while propagating classification", 
entityGuid, classificationVertexId, e);



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
Lines 40 (patched)


Is graph.commit() needed in #40, #56 and #72 - given entityGraphMapper 
methods called have @GraphTransaction annotation?



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
Lines 41 (patched)


Is it necessary to explicitly call 
GraphTransactionInterceptor.releaseLock()? Shouldn't graph commit (via 
@GraphTransaction) already handle releasing the lock?

Also, when is the lock obtained?



repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java
Lines 52 (patched)


It looks like tasks are saved in the store only just before execution - in 
TaskConsumer.run(). That could be after a long time from this point. This 
increases the chances of loosing the tasks - consider Atlas shutdown/goes 
down/becomes passive. Consider saving all tasks to store before adding to 
taskList, to minimize the chances of loosing tasks.



repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java
Lines 91 (patched)


getCreateVertex() => getVertex()?
  - is it necessary to create vertex if not found?



repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java
Lines 92 (patched)


Task execution is skipped if task.getAdditionalInfo() is null; purpose of 
this condition is not clear. Consider adding a comment.



repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java
Lines 100 (patched)


When will such tasks (that exceeded retry count) be removed from store?



repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java
Lines 123 (patched)

Re: Review Request 73076: Deferred Actions Implementation

2021-02-18 Thread Ashutosh Mestry via Review Board


> On Feb. 18, 2021, 7:14 a.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
> > Lines 2780 (patched)
> > 
> >
> > the taskDef guid should be removed from 
> > CLASSIFICATION_VERTEX_PENDING_PROPAGATIONS_KEY in classification vertex 
> > once the deferred action is complete. Please review.

Dropping this since the entity vertex is going to be updated.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222609
---


On Feb. 19, 2021, 3:49 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Feb. 19, 2021, 3:49 a.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> **New Updates**
> - Separate log file for tasks.
> - TaskDef now has _startTime_ and _endTime_ to indicating start and end of 
> task.
> - Refactored parameter passing during task creation. It now accepts Map 
> instead of Object array. Each task creates its own set of parameters.
> - Reduced use of TASKS_ENABLE flag checks.
> 
> 
> Diffs
> -
> 
>   common/src/main/java/org/apache/atlas/repository/Constants.java 61abfcaca 
>   distro/src/conf/atlas-log4j.xml 7df963eb2 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
>   intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java PRE-CREATION 
>   intg/src/test/resources/atlas-application.properties 7e74d5107 
>   repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
> 86b369fc9 
>   
> repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
>  d0ffb853f 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
>  2cfcc0bc1 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
>  8208d11c2 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
>  PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
> PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
> PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java 
> PRE-CREATION 
>   

Re: Review Request 73076: Deferred Actions Implementation

2021-02-18 Thread Ashutosh Mestry via Review Board


> On Feb. 18, 2021, 7:52 a.m., Sarath Subramanian wrote:
> > one improvement to add (can be taken up in a follow-up patch) - add the 
> > task id to the entity vertex as well, so from entity details page we will 
> > know how many pending tasks are in process.

I will address this in this patch.


> On Feb. 18, 2021, 7:52 a.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
> > Lines 89 (patched)
> > 
> >
> > this method dispatches all taskDefs, consider renaming to 
> > "dispatchAllTasks"

Dropping this, since this is the place where tasks are added.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222610
---


On Feb. 19, 2021, 3:49 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Feb. 19, 2021, 3:49 a.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> **New Updates**
> - Separate log file for tasks.
> - TaskDef now has _startTime_ and _endTime_ to indicating start and end of 
> task.
> - Refactored parameter passing during task creation. It now accepts Map 
> instead of Object array. Each task creates its own set of parameters.
> - Reduced use of TASKS_ENABLE flag checks.
> 
> 
> Diffs
> -
> 
>   common/src/main/java/org/apache/atlas/repository/Constants.java 61abfcaca 
>   distro/src/conf/atlas-log4j.xml 7df963eb2 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
>   intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java PRE-CREATION 
>   intg/src/test/resources/atlas-application.properties 7e74d5107 
>   repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
> 86b369fc9 
>   
> repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
>  d0ffb853f 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
>  2cfcc0bc1 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
>  8208d11c2 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
>  PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
> PRE-CREATION 
>   

Re: Review Request 73076: Deferred Actions Implementation

2021-02-18 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated Feb. 19, 2021, 3:49 a.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: 
- Addressed review comments (20+).
- Rebased.
- Refactored for simplicity.
- Reduced code.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.

**New Updates**
- Separate log file for tasks.
- TaskDef now has _startTime_ and _endTime_ to indicating start and end of task.
- Refactored parameter passing during task creation. It now accepts Map instead 
of Object array. Each task creates its own set of parameters.
- Reduced use of TASKS_ENABLE flag checks.


Diffs (updated)
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 61abfcaca 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/tasks/AtlasTask.java PRE-CREATION 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
 d0ffb853f 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
 8208d11c2 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
 PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskLogger.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskRegistry.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/TestModules.java 71b0a4a68 
  
repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
 84e9bfa04 
  repository/src/test/java/org/apache/atlas/tasks/TaskExecutorTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskManagementTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskRegistryTest.java 
PRE-CREATION 
  server-api/src/main/java/org/apache/atlas/RequestContext.java 7de3536f4 
  
server-api/src/main/java/org/apache/atlas/listener/ActiveStateChangeHandler.java
 bb7f8fcc4 
  test-tools/src/main/resources/log4j.properties 4db0598ad 
  

Re: Review Request 73076: Deferred Actions Implementation

2021-02-17 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222608
---



Here are comments for partial review; will complete the review in a day. Thanks!


common/src/main/java/org/apache/atlas/repository/Constants.java
Lines 182 (patched)


Instead of adding __pendingPropagations to classification vertices, it will 
be helpful to add __pendingTasks at entity-vertex.

Consider adding following:
  - AtlasEntity.hasPendingTasks flag, so that UI can prompt the user of 
pending-tasks on the entity
  - a REST API to retrieve pending task IDs/details on a given entity

ID of deferred-action should be added to __pendingTasks of the entity on 
which the classification was added/updated/deleted. In case of relationship 
create/update or tag-propagation update it should be added to entity from which 
the propagation needs to be recomputed.



common/src/main/java/org/apache/atlas/repository/Constants.java
Lines 211 (patched)


Consider using a task-specific guid field - like "__taskGuid", to avoid 
potential collision with entity guids.

In fact, consider prefixing all task-vertex properties with "__task_" - 
like:
  - __task_guid
  - __task_type
  - __task_created_time
  - __task_updated_time



distro/src/conf/atlas-log4j.xml
Lines 64 (patched)


please add maxBackupIndex as well - similar to other appenders.



intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java
Lines 39 (patched)


TaskDef => AtlasTask
 - to be inline with AtlasEntity, AtlasClassification, AtlasGlossary, ..



intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java
Lines 198 (patched)


retryCount => attemptCount



intg/src/main/java/org/apache/atlas/utils/AtlasJson.java
Lines 176 (patched)


fromLinkedHashMap() - this method doesn't seem to be used. Please review 
and remove if unused.



repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java
Lines 241 (patched)


Is it not possible to get reference to TaksManagement with @Inject, since 
TaskManagement is a @Component?



repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
Lines 337 (patched)


Only whitespace changes in GraphBackedSearchIndexer? Shouldn't new 
property-keys for AtlasTask be cread here?



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2786 (patched)


Couldn't taskManagement be @Injected, similar to bunch of other members in 
this class?



repository/src/main/java/org/apache/atlas/util/BeanUtilRepo.java
Lines 26 (patched)


This implementation exists in 
webapp/src/main/java/org/apache/atlas/BeanUtil.java as well. Consider moving 
BeanUtil class to repository, to avoid duplication.


- Madhan Neethiraj


On Feb. 17, 2021, 5:55 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Feb. 17, 2021, 5:55 a.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 

Re: Review Request 73076: Deferred Actions Implementation

2021-02-17 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222610
---


Fix it, then Ship it!




one improvement to add (can be taken up in a follow-up patch) - add the task id 
to the entity vertex as well, so from entity details page we will know how many 
pending tasks are in process.


intg/src/main/java/org/apache/atlas/model/instance/AtlasClassification.java
Lines 93 (patched)


nit: setPendingPropagations(other.getPendingPropagations());



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
Lines 102 (patched)


revert changes to this file (not related to deferred action)



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
Lines 353 (patched)


getPendingProagations => getPendingPropagations



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
Lines 50 (patched)


update classification vertex to remove taskDef id from pendingPropagations 
propery here.

same applies for line 69, 87



repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
Lines 89 (patched)


this method dispatches all taskDefs, consider renaming to "dispatchAllTasks"


- Sarath Subramanian


On Feb. 16, 2021, 9:55 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Feb. 16, 2021, 9:55 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> **New Updates**
> - Separate log file for tasks.
> - TaskDef now has _startTime_ and _endTime_ to indicating start and end of 
> task.
> - Refactored parameter passing during task creation. It now accepts Map 
> instead of Object array. Each task creates its own set of parameters.
> - Reduced use of TASKS_ENABLE flag checks.
> 
> 
> Diffs
> -
> 
>   common/src/main/java/org/apache/atlas/repository/Constants.java 61abfcaca 
>   distro/src/conf/atlas-log4j.xml 7df963eb2 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
>   intg/src/main/java/org/apache/atlas/model/instance/AtlasClassification.java 
> 1edf4eeaf 
>   intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
>   intg/src/test/resources/atlas-application.properties 7e74d5107 
>   

Re: Review Request 73076: Deferred Actions Implementation

2021-02-17 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222609
---




repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2780 (patched)


the taskDef guid should be removed from 
CLASSIFICATION_VERTEX_PENDING_PROPAGATIONS_KEY in classification vertex once 
the deferred action is complete. Please review.


- Sarath Subramanian


On Feb. 16, 2021, 9:55 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Feb. 16, 2021, 9:55 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> **New Updates**
> - Separate log file for tasks.
> - TaskDef now has _startTime_ and _endTime_ to indicating start and end of 
> task.
> - Refactored parameter passing during task creation. It now accepts Map 
> instead of Object array. Each task creates its own set of parameters.
> - Reduced use of TASKS_ENABLE flag checks.
> 
> 
> Diffs
> -
> 
>   common/src/main/java/org/apache/atlas/repository/Constants.java 61abfcaca 
>   distro/src/conf/atlas-log4j.xml 7df963eb2 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
>   intg/src/main/java/org/apache/atlas/model/instance/AtlasClassification.java 
> 1edf4eeaf 
>   intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
>   intg/src/test/resources/atlas-application.properties 7e74d5107 
>   repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
> 86b369fc9 
>   
> repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
>  d0ffb853f 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
>  ce58e9aa4 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
>  2cfcc0bc1 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
>  8208d11c2 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
>  PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
> PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
> PRE-CREATION 
>   

Re: Review Request 73076: Deferred Actions Implementation

2021-02-16 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated Feb. 17, 2021, 5:55 a.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: 
- Addressed review comments.
- Minor refactoring.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.

**New Updates**
- Separate log file for tasks.
- TaskDef now has _startTime_ and _endTime_ to indicating start and end of task.
- Refactored parameter passing during task creation. It now accepts Map instead 
of Object array. Each task creates its own set of parameters.
- Reduced use of TASKS_ENABLE flag checks.


Diffs (updated)
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 61abfcaca 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/instance/AtlasClassification.java 
1edf4eeaf 
  intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
 d0ffb853f 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
 ce58e9aa4 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphRetriever.java
 8208d11c2 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
 PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskLogger.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskRegistry.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/util/BeanUtilRepo.java PRE-CREATION 
  
repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
 84e9bfa04 
  repository/src/test/java/org/apache/atlas/tasks/TaskExecutorTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskManagementTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskRegistryTest.java 
PRE-CREATION 
  

Re: Review Request 73076: Deferred Actions Implementation

2021-02-11 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222588
---




common/src/main/java/org/apache/atlas/repository/Constants.java
Lines 97 (patched)


register this new property - 'PENDING_PROPAGATIONS' as SET type in 
GraphBackedSearchIndexer ('createCommonVertexIndex')



intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java
Lines 56 (patched)


consider making 'parameters' of Map type. serialize to string during 
writing to vertex (TaskRegistry.createVertex()) and deserialize when reading 
from vertex (TaskRegistry.toTaskDef()).



intg/src/main/java/org/apache/atlas/type/AtlasType.java
Lines 147 (patched)


remove unused method - 'fromLinkedHashMap', also remove usunsed method 
AtlasJson.fromLinkedHashMap()



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2780 (patched)


for LIST/SET type attributes, use 'AtlasGraphUtilsV2.addEncodedProperty()' 
to add multi-values to vertex. Please review.

for (String pendingPropagationGuid : 
classification.getPendingPropagations()) {
AtlasGraphUtilsV2.addEncodedProperty(classificationVertex, 
PENDING_PROPAGATIONS, classification.getPendingPropagations());
}



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
Lines 30 (patched)


"classificationVertexIds" => "classificationVertexId"


- Sarath Subramanian


On Feb. 11, 2021, 11:12 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Feb. 11, 2021, 11:12 a.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> **New Updates**
> - Separate log file for tasks.
> - TaskDef now has _startTime_ and _endTime_ to indicating start and end of 
> task.
> - Refactored parameter passing during task creation. It now accepts Map 
> instead of Object array. Each task creates its own set of parameters.
> - Reduced use of TASKS_ENABLE flag checks.
> 
> 
> Diffs
> -
> 
>   common/src/main/java/org/apache/atlas/repository/Constants.java 61abfcaca 
>   distro/src/conf/atlas-log4j.xml 7df963eb2 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
>   intg/src/main/java/org/apache/atlas/model/instance/AtlasClassification.java 
> 1edf4eeaf 
>   intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
>   

Re: Review Request 73076: Deferred Actions Implementation

2021-02-11 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated Feb. 11, 2021, 7:12 p.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: 
- Minor updates.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.

**New Updates**
- Separate log file for tasks.
- TaskDef now has _startTime_ and _endTime_ to indicating start and end of task.
- Refactored parameter passing during task creation. It now accepts Map instead 
of Object array. Each task creates its own set of parameters.
- Reduced use of TASKS_ENABLE flag checks.


Diffs (updated)
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 61abfcaca 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/instance/AtlasClassification.java 
1edf4eeaf 
  intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
  intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
 ce58e9aa4 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
 PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskLogger.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskRegistry.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/util/BeanUtilRepo.java PRE-CREATION 
  
repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
 84e9bfa04 
  repository/src/test/java/org/apache/atlas/tasks/TaskExecutorTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskManagementTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskRegistryTest.java 
PRE-CREATION 
  server-api/src/main/java/org/apache/atlas/RequestContext.java 7de3536f4 
  
server-api/src/main/java/org/apache/atlas/listener/ActiveStateChangeHandler.java
 bb7f8fcc4 
  

Re: Review Request 73076: Deferred Actions Implementation

2021-02-10 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated Feb. 11, 2021, 6:37 a.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: 
- Refactoring. Removed class _EntityGraphMapperWithTasks_.
- Simplified individual task methods.
- Addressed review comments.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.

**New Updates**
- Separate log file for tasks.
- TaskDef now has _startTime_ and _endTime_ to indicating start and end of task.
- Refactored parameter passing during task creation. It now accepts Map instead 
of Object array. Each task creates its own set of parameters.
- Reduced use of TASKS_ENABLE flag checks.


Diffs (updated)
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 61abfcaca 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/instance/AtlasClassification.java 
1edf4eeaf 
  intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
  intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
 ce58e9aa4 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
 PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskLogger.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskRegistry.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/util/BeanUtilRepo.java PRE-CREATION 
  
repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
 84e9bfa04 
  repository/src/test/java/org/apache/atlas/tasks/TaskExecutorTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskManagementTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskRegistryTest.java 
PRE-CREATION 
  server-api/src/main/java/org/apache/atlas/RequestContext.java 7de3536f4 
  

Re: Review Request 73076: Deferred Actions Implementation

2021-02-10 Thread Ashutosh Mestry via Review Board


> On Feb. 5, 2021, 7:16 a.m., Madhan Neethiraj wrote:
> > intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java
> > Lines 65 (patched)
> > 
> >
> > Since TaskDef will be serialized/deserialized, consider using 
> > Map as type for additionalInfo, instead of Object.

This has been decored with @JsonIgnore. Hence will not get serialized.


> On Feb. 5, 2021, 7:16 a.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
> > Lines 1926 (patched)
> > 
> >
> > List of entities to propagate must be computed while processing the 
> > entity creation/update; the result of this computation must be stored in 
> > the deferred-action and executed later. Computation of entities to 
> > propagate the classification must not be deferred. Please review and update.

Computing is time consuming in case of large datasets.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222558
---


On Feb. 10, 2021, 6:40 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Feb. 10, 2021, 6:40 a.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> **New Updates**
> - Separate log file for tasks.
> - TaskDef now has _startTime_ and _endTime_ to indicating start and end of 
> task.
> - Refactored parameter passing during task creation. It now accepts Map 
> instead of Object array. Each task creates its own set of parameters.
> - Reduced use of TASKS_ENABLE flag checks.
> 
> 
> Diffs
> -
> 
>   common/src/main/java/org/apache/atlas/repository/Constants.java 61abfcaca 
>   distro/src/conf/atlas-log4j.xml 7df963eb2 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
>   intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
>   intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
>   intg/src/test/resources/atlas-application.properties 7e74d5107 
>   repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
> 86b369fc9 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
>  ce58e9aa4 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
>  2cfcc0bc1 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapperWithTasks.java
>  PRE-CREATION 
>   

Re: Review Request 73076: Deferred Actions Implementation

2021-02-09 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated Feb. 10, 2021, 6:40 a.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updated decription to include improvements with latest patch.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description (updated)
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.

**New Updates**
- Separate log file for tasks.
- TaskDef now has _startTime_ and _endTime_ to indicating start and end of task.
- Refactored parameter passing during task creation. It now accepts Map instead 
of Object array. Each task creates its own set of parameters.
- Reduced use of TASKS_ENABLE flag checks.


Diffs
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 61abfcaca 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
  intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
 ce58e9aa4 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapperWithTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
 PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskLogger.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskRegistry.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/util/BeanUtilRepo.java PRE-CREATION 
  
repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
 84e9bfa04 
  repository/src/test/java/org/apache/atlas/tasks/TaskExecutorTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskManagementTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskRegistryTest.java 
PRE-CREATION 
  server-api/src/main/java/org/apache/atlas/RequestContext.java 7de3536f4 
  

Re: Review Request 73076: Deferred Actions Implementation

2021-02-09 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated Feb. 10, 2021, 6:25 a.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: 
- Addressed review comments.
- Rebased.
- Manual tests.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.


Diffs (updated)
-

  common/src/main/java/org/apache/atlas/repository/Constants.java 61abfcaca 
  distro/src/conf/atlas-log4j.xml 7df963eb2 
  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
  intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
 ce58e9aa4 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapperWithTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
 PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskLogger.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskRegistry.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/util/BeanUtilRepo.java PRE-CREATION 
  
repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
 84e9bfa04 
  repository/src/test/java/org/apache/atlas/tasks/TaskExecutorTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskManagementTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskRegistryTest.java 
PRE-CREATION 
  server-api/src/main/java/org/apache/atlas/RequestContext.java 7de3536f4 
  
server-api/src/main/java/org/apache/atlas/listener/ActiveStateChangeHandler.java
 bb7f8fcc4 
  test-tools/src/main/resources/log4j.properties 4db0598ad 
  webapp/src/main/java/org/apache/atlas/web/resources/AdminResource.java 
e8fc111a6 
  webapp/src/test/java/org/apache/atlas/web/resources/AdminResourceTest.java 
77422b2a5 


Diff: 

Re: Review Request 73076: Deferred Actions Implementation

2021-02-05 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222569
---




intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java
Lines 61 (patched)


Use Date datatype for createdTime and updatedTime - instead of long - 
similar to other REST API classes like AtlasEntity, AtlasBaseTypeDef.



repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java
Lines 240 (patched)


processTasks() => submitDefferedTasks()



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 1926 (patched)


After more thinking, it seems appropriate to compute entities-to-propagate 
during execution of deferred-action, to handle changes to relationships along 
the lineage path. For example: consider addition of a new view; at the time of 
view entity creation,  its source tables may not have all classifications, 
since due to pending deferred tasks. In such case, if entities-to-propagate was 
compuated inline, then some classifications wouldn't be propagated to the view 
entity.

Hence, current approach of computing entities-to-propagate in 
deferred-action looks good!

It will help to record guid of the deferred-action in the classification. 
It will help the user to know that the classification hasn't yet been 
propagated; and UI can render the status appropriately.

  class AtlasClassification {
...

List pendingPropagations;
  }


- Madhan Neethiraj


On Jan. 25, 2021, 11:08 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Jan. 25, 2021, 11:08 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
>   intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
>   intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
>   intg/src/test/resources/atlas-application.properties 7e74d5107 
>   repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
> 86b369fc9 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
>  ce58e9aa4 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
>  2cfcc0bc1 
>   
> 

Re: Review Request 73076: Deferred Actions Implementation

2021-02-05 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222552
---




repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2324 (patched)


'if' check is duplicated in line 2324 and 2323. review and remove 2324



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
Lines 43 (patched)


consider using classificationVertexIds as argument to 
entityGraphMapper.propagateClassifications() and   
entityGraphMapper.updateClassificationsPropagation();

internally this fires a graph query to get the classification vertex which 
can fetched with classification verttex id if passed.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
Lines 83 (patched)


why is 0th element always passed? should it be 
classificationVertexIds.get(i) ?



repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
Lines 133 (patched)


duplicated check. This check is already made in the caller - 
startInternal() review and remove.


- Sarath Subramanian


On Jan. 25, 2021, 3:08 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Jan. 25, 2021, 3:08 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
>   intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
>   intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
>   intg/src/test/resources/atlas-application.properties 7e74d5107 
>   repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
> 86b369fc9 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
>  ce58e9aa4 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
>  2cfcc0bc1 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapperWithTasks.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
>  

Re: Review Request 73076: Deferred Actions Implementation

2021-02-04 Thread Madhan Neethiraj

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222558
---




intg/src/main/java/org/apache/atlas/AtlasConfiguration.java
Lines 79 (patched)


DEFERRED_ACTION_ENABLED("atlas.deferred.action.enabled", true)



intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java
Lines 45 (patched)


"model" package should only have data that is exchanged with Atlas clients 
- via REST API or notifications. Please move these constants out of model 
package, to org.apache.atlas.repository.Constants - along with other vertext 
property constants



intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java
Lines 63 (patched)


Consider using Map as type for parameters.



intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java
Lines 64 (patched)


Consider tracking start and end time of task execution.



intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java
Lines 65 (patched)


Since TaskDef will be serialized/deserialized, consider using Map as type for additionalInfo, instead of Object.



intg/src/main/java/org/apache/atlas/utils/AtlasJson.java
Lines 175 (patched)


the implementation doesn't seem to require/enforce that paramter is a 
linkedHashMap. If this is irrelevant, consider remaing this method to 
fromObject():

  public static  T fromObject(Object obj, Class type) {
T ret = null;

if (obj != null) {
  ret = mapper.convertValue(obj, type);

  if (ret instanceof Struct) {
((Struct) ret).normalize();
  }
}

return ret;
  }
  
Rename AtlasType.fromLinkedHashMap() as well, to AtlasType.fromObject().



repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java
Lines 241 (patched)


Consider removing "!AtlasConfiguration.TASKS_USE_ENABLED.getBoolean()" from 
here, as RequestContext.get().getQueuedTasks() must be empty is 
deferred-actions are disabled. It better to isolate this flag to fewer places.

Also, in what condition would "BeanUtilRepo.getBean(TaskManagement.class)" 
be null? It might be better to initialize this once, instead of calling for 
each transaction.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
Line 119 (original), 122 (patched)


I suggest this logic, of forking on 
AtlasConfiguration.TASKS_USE_ENABLED.getBoolean(), be buried inside 
EntityGraphRetriever - to minimize exposing TASKS_USE_ENABLED.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 1926 (patched)


List of entities to propagate must be computed while processing the entity 
creation/update; the result of this computation must be stored in the 
deferred-action and executed later. Computation of entities to propagate the 
classification must not be deferred. Please review and update.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Line 2060 (original), 2108 (patched)


Move #2118 - #2121 to here:

  if (isPropagationEnabled(classificationVertex)) {
if (AtlasConfiguration.TASKS_USE_ENABLED.getBoolean()) {
  
RequestContext.get().addClassificationVertexId(classificationVertex.getId().toString());

  entityVertices = new ArrayList<>();
} else {
  entityVertices = 
deleteDelegate.getHandler().removeTagPropagation(classificationVertex);

  if (LOG.isDebugEnabled()) {
LOG.debug("Number of propagations to delete -> {}", 
entityVertices.size());
  }
}
  } else {
   entityVertices = new ArrayList<>();
  }



repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
Lines 92 (patched)


Move private methods after all public and protected methods.



repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
Lines 100 (patched)


Consider changing parameters to a Map, instead of an array.



server-api/src/main/java/org/apache/atlas/RequestContext.java
Lines 65 (patched)


its not clear what 'classificationVertexIds' mean in request context. If 
this is about 

Re: Review Request 73076: Deferred Actions Implementation

2021-01-25 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated Jan. 25, 2021, 11:08 p.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: 
- Addressed review comments.
- Minor refactoring.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 08d6c9d4a 
  intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
  intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
 ce58e9aa4 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapperWithTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
 PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/util/BeanUtilRepo.java PRE-CREATION 
  
repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
 84e9bfa04 
  repository/src/test/java/org/apache/atlas/tasks/TaskExecutorTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskManagementTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskRegistryTest.java 
PRE-CREATION 
  server-api/src/main/java/org/apache/atlas/RequestContext.java 7de3536f4 
  
server-api/src/main/java/org/apache/atlas/listener/ActiveStateChangeHandler.java
 bb7f8fcc4 
  webapp/src/main/java/org/apache/atlas/web/resources/AdminResource.java 
b20b40474 
  webapp/src/test/java/org/apache/atlas/web/resources/AdminResourceTest.java 
77422b2a5 


Diff: https://reviews.apache.org/r/73076/diff/12/

Changes: https://reviews.apache.org/r/73076/diff/11-12/


Testing
---

**Unit tests**
New tests.

**Manual tests**
In-progress.

**Volume test**
Pending.

**Pre-commit build**
https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/281/


Thanks,

Ashutosh Mestry



Re: Review Request 73076: Deferred Actions Implementation

2021-01-21 Thread Jayendra Parab

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222507
---




repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
Lines 39 (patched)


Can we make ARGUMENT_INDEX_CLASSIFICATION_NAMES and 
ARGUMENT_INDEX_VERTEX_IDS as final as well?



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
Lines 58 (patched)


Can we store the error message in graphdb whenever we store the status as 
COMPLETE_WITH_ERRORS?


- Jayendra Parab


On Jan. 20, 2021, 11:14 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Jan. 20, 2021, 11:14 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 396aad06f 
>   intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
>   intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
>   intg/src/test/resources/atlas-application.properties 7e74d5107 
>   repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
> 86b369fc9 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
>  244072289 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
>  2cfcc0bc1 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapperWithTasks.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
>  PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
> PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
> PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java 
> PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java 
> PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/util/BeanUtilRepo.java 
> PRE-CREATION 
>   
> repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
>  84e9bfa04 
>   

Re: Review Request 73076: Deferred Actions Implementation

2021-01-20 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated Jan. 20, 2021, 11:14 p.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: 
- Addressed review comments.
- Rebased with latest.
- Addressed the problem with /admin/task get API (thanks to Jayendra).


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 396aad06f 
  intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
  intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
 244072289 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapperWithTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
 PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/util/BeanUtilRepo.java PRE-CREATION 
  
repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
 84e9bfa04 
  repository/src/test/java/org/apache/atlas/tasks/TaskExecutorTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskManagementTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskRegistryTest.java 
PRE-CREATION 
  server-api/src/main/java/org/apache/atlas/RequestContext.java 216ba08d3 
  
server-api/src/main/java/org/apache/atlas/listener/ActiveStateChangeHandler.java
 bb7f8fcc4 
  webapp/src/main/java/org/apache/atlas/web/resources/AdminResource.java 
b20b40474 
  webapp/src/test/java/org/apache/atlas/web/resources/AdminResourceTest.java 
77422b2a5 


Diff: https://reviews.apache.org/r/73076/diff/11/

Changes: https://reviews.apache.org/r/73076/diff/10-11/


Testing
---

**Unit tests**
New tests.

**Manual tests**
In-progress.

**Volume test**
Pending.

**Pre-commit build**
https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/281/



Re: Review Request 73076: Deferred Actions Implementation

2021-01-19 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated Jan. 19, 2021, 6:31 p.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include:
- Addressed review comments.
- Removed duplicated lines due to merge problem.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java ea9f26d47 
  intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
  intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
 244072289 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapperWithTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
 PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/util/BeanUtilRepo.java PRE-CREATION 
  
repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
 84e9bfa04 
  repository/src/test/java/org/apache/atlas/tasks/TaskExecutorTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskManagementTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskRegistryTest.java 
PRE-CREATION 
  server-api/src/main/java/org/apache/atlas/RequestContext.java 216ba08d3 
  
server-api/src/main/java/org/apache/atlas/listener/ActiveStateChangeHandler.java
 bb7f8fcc4 
  webapp/src/main/java/org/apache/atlas/web/resources/AdminResource.java 
b20b40474 
  webapp/src/test/java/org/apache/atlas/web/resources/AdminResourceTest.java 
77422b2a5 


Diff: https://reviews.apache.org/r/73076/diff/10/

Changes: https://reviews.apache.org/r/73076/diff/9-10/


Testing
---

**Unit tests**
New tests.

**Manual tests**
In-progress.

**Volume test**
Pending.

**Pre-commit build**
https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/281/


Thanks,

Ashutosh Mestry



Re: Review Request 73076: Deferred Actions Implementation

2021-01-19 Thread Ashutosh Mestry via Review Board


> On Jan. 10, 2021, 9:43 a.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
> > Lines 2037 (patched)
> > 
> >
> > consider adding tags to impacted vertices in batches (25 or 50) and 
> > committing the transaction, batch commits will save time rather than 
> > committing a single transaction with large changes.

I am planning to change this as part of a follow-up patch.


> On Jan. 10, 2021, 9:43 a.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapperWithTasks.java
> > Lines 191 (patched)
> > 
> >
> > In line 191 and 203, we pass classificationNames/classifications in 
> > addition to classifications vertexIds.
> > 
> > Since classification vertexIds is available, we may not need to pass 
> > classificationNames or Classifications. These can be mapped from 
> > classificationVertex using entityGraphRetriever.

I had the implementaiton this way. I realized that this becomes problematic if 
classificaitons are added and removed. It better to fetch the IDs when we get 
to schedule the task.


> On Jan. 10, 2021, 9:43 a.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
> > Lines 83 (patched)
> > 
> >
> > this line always passes 0th classificationVertexId instead of i'th:
> > 
> > classificationVertexIds.get(0) => classificationVertexIds.get(i)

This is simply a mechanism for providing arguments. Right now, only 1 argument 
is being passed, hence the index is always set to 0.


> On Jan. 10, 2021, 9:43 a.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java
> > Lines 103 (patched)
> > 
> >
> > taskDef status here is always set to 'COMPLETE' irrespective of the 
> > outcome of task.run()
> > 
> > task.run() should return the status of the execution and it should be 
> > set to taskDef. Please review.

task.run can potentially throw an exception, in that case the status will be 
updated by the exception handlers. Status will be set to complete only on 
successful completion of the task. Hope this makes sense.


> On Jan. 10, 2021, 9:43 a.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
> > Lines 263 (patched)
> > 
> >
> > consider soft deleting the task vertex. It will be useful to check the 
> > status of the completed or failed tasks. We can purge the task vertices 
> > periodically (every hour, every day) or if some threshold is reached to 
> > avoid proliferations of these vertices.

In my testing I realized a lot of task data gets generated. This will cause a 
huge prolifiaration of task vertexs that are complete. Hence I am vertices of 
tasks that succeed. There is entry in log files to track the start and done 
status of tasks.


> On Jan. 10, 2021, 9:43 a.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
> > Lines 267 (patched)
> > 
> >
> > why update status of taskVertex and hard delete it?

The vertex holding the status will be available (not deleted) only if it fails.


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222428
---


On Jan. 12, 2021, 6:34 a.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Jan. 12, 2021, 6:34 a.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a 

Re: Review Request 73076: Deferred Actions Implementation

2021-01-12 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222458
---




repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
Lines 371 (patched)


line 371-377 can be removed? since this is replaced by pause()



webapp/src/main/java/org/apache/atlas/web/resources/AdminResource.java
Lines 753 (patched)


API duplicated? line 746/753


- Sarath Subramanian


On Jan. 11, 2021, 10:34 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Jan. 11, 2021, 10:34 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single queue for processing tasks in a way 
> that is non-blocking.
> - Uses task type and the associated TaskFactory to construct concrete task.
> 
> Changes to _EntityGraphMapper_
> There are 3 new methods added that use portion of existing classification 
> propagation logic and create a separate method.
> 
> New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
> All calls that operate without tasks get channeled to the existing 
> implementation. The ones involving tasks are modified to use the new task 
> framework.
> 
> **Concurrent Changes to Classifications**
> Via REST APIs and operations from web UI, it is possible to make changes to 
> the same entity. This situation will be more pronounced in the new setup.
> 
> The existing framework has 
> _GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
> ReentrantLock on an entity that is identified by the supplied GUID. This will 
> cause a lock on the entity and all the impacted vertices. This has potential 
> for adversely impacting scalability.
> 
> 
> Diffs
> -
> 
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java ea9f26d47 
>   intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
>   intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
>   intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
>   intg/src/test/resources/atlas-application.properties 7e74d5107 
>   repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
> 86b369fc9 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
>  244072289 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
>  2cfcc0bc1 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapperWithTasks.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
>  PRE-CREATION 
>   
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
>  PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
> PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
> PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java 
> PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java 
> PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/util/BeanUtilRepo.java 
> PRE-CREATION 
>   
> repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
>  84e9bfa04 
>   repository/src/test/java/org/apache/atlas/tasks/TaskExecutorTest.java 
> PRE-CREATION 
>   

Re: Review Request 73076: Deferred Actions Implementation

2021-01-11 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated Jan. 12, 2021, 6:34 a.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: Addressed review comments.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java ea9f26d47 
  intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
  intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
  intg/src/test/resources/atlas-application.properties 7e74d5107 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
 244072289 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapperWithTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
 PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/util/BeanUtilRepo.java PRE-CREATION 
  
repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
 84e9bfa04 
  repository/src/test/java/org/apache/atlas/tasks/TaskExecutorTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskManagementTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskRegistryTest.java 
PRE-CREATION 
  server-api/src/main/java/org/apache/atlas/RequestContext.java 216ba08d3 
  
server-api/src/main/java/org/apache/atlas/listener/ActiveStateChangeHandler.java
 bb7f8fcc4 
  webapp/src/main/java/org/apache/atlas/web/resources/AdminResource.java 
b20b40474 
  webapp/src/test/java/org/apache/atlas/web/resources/AdminResourceTest.java 
77422b2a5 


Diff: https://reviews.apache.org/r/73076/diff/9/

Changes: https://reviews.apache.org/r/73076/diff/8-9/


Testing
---

**Unit tests**
New tests.

**Manual tests**
In-progress.

**Volume test**
Pending.

**Pre-commit build**
https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/281/


Thanks,

Ashutosh Mestry



Re: Review Request 73076: Deferred Actions Implementation

2021-01-10 Thread Sarath Subramanian

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/#review222428
---




repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2037 (patched)


consider adding tags to impacted vertices in batches (25 or 50) and 
committing the transaction, batch commits will save time rather than committing 
a single transaction with large changes.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 2324 (patched)


include this if condition along with existing if condition in previous 
line(2323). This will avoid additional nested indentation



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapperWithTasks.java
Lines 191 (patched)


In line 191 and 203, we pass classificationNames/classifications in 
addition to classifications vertexIds.

Since classification vertexIds is available, we may not need to pass 
classificationNames or Classifications. These can be mapped from 
classificationVertex using entityGraphRetriever.



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
Lines 20 (patched)


nit: remove unused import



repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
Lines 83 (patched)


this line always passes 0th classificationVertexId instead of i'th:

classificationVertexIds.get(0) => classificationVertexIds.get(i)



repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java
Lines 103 (patched)


taskDef status here is always set to 'COMPLETE' irrespective of the outcome 
of task.run()

task.run() should return the status of the execution and it should be set 
to taskDef. Please review.



repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
Lines 55 (patched)


taskTypeTaskFactoryMap => taskTypeFactoryMap



repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
Lines 263 (patched)


consider soft deleting the task vertex. It will be useful to check the 
status of the completed or failed tasks. We can purge the task vertices 
periodically (every hour, every day) or if some threshold is reached to avoid 
proliferations of these vertices.



repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java
Lines 267 (patched)


why update status of taskVertex and hard delete it?



repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
Lines 369 (patched)


nit: refactor sleep/pause to a separate method.



webapp/src/main/java/org/apache/atlas/web/resources/AdminResource.java
Lines 750 (patched)


deleting a task should not be exposed through REST API (even as admin). 
tasks which needs to be deleted should be handled probably through a utility 
(similar to repair index)


- Sarath Subramanian


On Dec. 16, 2020, 3:42 p.m., Ashutosh Mestry wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73076/
> ---
> 
> (Updated Dec. 16, 2020, 3:42 p.m.)
> 
> 
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, 
> Nixon Rodrigues, and Sarath Subramanian.
> 
> 
> Bugs: ATLAS-3919
> https://issues.apache.org/jira/browse/ATLAS-3919
> 
> 
> Repository: atlas
> 
> 
> Description
> ---
> 
> **Approach**
> Model: _TaskDef_
> - Definition of task.
> - Contains status of the task for querying.
> 
> _AbstractTask_
> - Takes TaskDef as input and calls run.
> - The concrete tasks has logic for doing real work.
> 
> _TaskFactory_
> 
> - Creates tasks given task type.
> - Enumerates all the tasks that are supported.
> 
> _TaskManagement_
> 
> - Component/Service. Constructed after all existing components are 
> constructed.
> - Provides a simple interface for adding tasks.
> - Processes tasks that are in pending state.
> - Supports HA mode, where the tasks are not executed on passive instance.
> 
> _TaskManagement.Registry_
> 
> - CRUD operations with TaskDef vertex.
> 
> _TaskManagement.Processor_
> - Provides mechanism for provides single 

Re: Review Request 73076: Deferred Actions Implementation

2020-12-16 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated Dec. 16, 2020, 11:42 p.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: 
- Additional unit tests.
- Refactoring for task arguments.


Summary (updated)
-

Deferred Actions Implementation


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

Changes to _EntityGraphMapper_
There are 3 new methods added that use portion of existing classification 
propagation logic and create a separate method.

New class _EntityGraphMapperWithTasks_ acts as a proxy for EntityGraphMapper. 
All calls that operate without tasks get channeled to the existing 
implementation. The ones involving tasks are modified to use the new task 
framework.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.


Diffs (updated)
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java ea9f26d47 
  intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
  intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
 244072289 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapperWithTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
 PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/util/BeanUtilRepo.java PRE-CREATION 
  
repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
 84e9bfa04 
  repository/src/test/java/org/apache/atlas/tasks/TaskExecutorTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskManagementTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskRegistryTest.java 
PRE-CREATION 
  server-api/src/main/java/org/apache/atlas/RequestContext.java 216ba08d3 
  
server-api/src/main/java/org/apache/atlas/listener/ActiveStateChangeHandler.java
 bb7f8fcc4 
  webapp/src/main/java/org/apache/atlas/web/resources/AdminResource.java 
b20b40474 
  webapp/src/test/java/org/apache/atlas/web/resources/AdminResourceTest.java 
77422b2a5 


Diff: https://reviews.apache.org/r/73076/diff/7/

Changes: https://reviews.apache.org/r/73076/diff/6-7/


Testing
---

**Unit tests**
New tests.

**Manual tests**
In-progress.

**Volume test**
Pending.

**Pre-commit build**
https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/279/


Thanks,

Ashutosh Mestry



Re: Review Request 73076: Deferred Actions Implementation: Work In Progress

2020-12-15 Thread Ashutosh Mestry via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73076/
---

(Updated Dec. 16, 2020, 5:18 a.m.)


Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon 
Rodrigues, and Sarath Subramanian.


Changes
---

Updates include: Updated design approach.


Bugs: ATLAS-3919
https://issues.apache.org/jira/browse/ATLAS-3919


Repository: atlas


Description (updated)
---

**Approach**
Model: _TaskDef_
- Definition of task.
- Contains status of the task for querying.

_AbstractTask_
- Takes TaskDef as input and calls run.
- The concrete tasks has logic for doing real work.

_TaskFactory_

- Creates tasks given task type.
- Enumerates all the tasks that are supported.

_TaskManagement_

- Component/Service. Constructed after all existing components are constructed.
- Provides a simple interface for adding tasks.
- Processes tasks that are in pending state.
- Supports HA mode, where the tasks are not executed on passive instance.

_TaskManagement.Registry_

- CRUD operations with TaskDef vertex.

_TaskManagement.Processor_
- Provides mechanism for provides single queue for processing tasks in a way 
that is non-blocking.
- Uses task type and the associated TaskFactory to construct concrete task.

**Concurrent Changes to Classifications**
Via REST APIs and operations from web UI, it is possible to make changes to the 
same entity. This situation will be more pronounced in the new setup.

The existing framework has 
_GraphTransactionInterceptor.lockObjectAndReleasePostCommit_. This provides 
ReentrantLock on an entity that is identified by the supplied GUID. This will 
cause a lock on the entity and all the impacted vertices. This has potential 
for adversely impacting scalability.


Diffs
-

  intg/src/main/java/org/apache/atlas/AtlasConfiguration.java ea9f26d47 
  intg/src/main/java/org/apache/atlas/model/tasks/TaskDef.java PRE-CREATION 
  intg/src/main/java/org/apache/atlas/type/AtlasType.java b30b483e0 
  intg/src/main/java/org/apache/atlas/utils/AtlasJson.java abeddf640 
  repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java 
86b369fc9 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java
 b80e42eed 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
 2cfcc0bc1 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapperWithTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagateTaskFactory.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationPropagationTasks.java
 PRE-CREATION 
  
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/tasks/ClassificationTask.java
 PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/AbstractTask.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskExecutor.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskFactory.java PRE-CREATION 
  repository/src/main/java/org/apache/atlas/tasks/TaskManagement.java 
PRE-CREATION 
  repository/src/main/java/org/apache/atlas/util/BeanUtilRepo.java PRE-CREATION 
  
repository/src/test/java/org/apache/atlas/repository/tagpropagation/ClassificationPropagationTest.java
 84e9bfa04 
  repository/src/test/java/org/apache/atlas/tasks/TaskExecutorTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskManagementTest.java 
PRE-CREATION 
  repository/src/test/java/org/apache/atlas/tasks/TaskRegistryTest.java 
PRE-CREATION 
  server-api/src/main/java/org/apache/atlas/RequestContext.java 32ffddde1 
  
server-api/src/main/java/org/apache/atlas/listener/ActiveStateChangeHandler.java
 bb7f8fcc4 
  webapp/src/main/java/org/apache/atlas/web/resources/AdminResource.java 
b20b40474 


Diff: https://reviews.apache.org/r/73076/diff/6/


Testing (updated)
---

**Unit tests**
New tests.

**Manual tests**
In-progress.

**Volume test**
Pending.

**Pre-commit build**
https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/279/


Thanks,

Ashutosh Mestry