> On Feb. 2, 2017, 10:08 p.m., Nate Cole wrote:
> > Have you tried this patch on a large cluster with success?

Yes. 
Verified on 1000 node cluster that the patch does not regress big operations. 
Executed Stop Services and Start Services API call which gernerated around 9000 
tasks and compared request completion time for these operations.

I have also mentioned this in "Testing Done" section.


> On Feb. 2, 2017, 10:08 p.m., Nate Cole wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/actionmanager/HostRoleStatus.java,
> >  line 186
> > <https://reviews.apache.org/r/53686/diff/7/?file=1608270#file1608270line186>
> >
> >     See other comment, either make this non-static on the status enum or 
> > just use !isCompletedState() directly.

Fixed with new revision of the patch


> On Feb. 2, 2017, 10:08 p.m., Nate Cole wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/controller/internal/CalculatedStatus.java,
> >  line 310
> > <https://reviews.apache.org/r/53686/diff/7/?file=1608273#file1608273line310>
> >
> >     nit: Can remove redundant types here, and a lot of other places in this 
> > patch:
> >     
> >     Map<StatusType, Map<HostRoleStatus, Integer>> counters = new 
> > HashMap<>();
> >     List<String> list = new ArrayList<>
> >     
> >     etc etc

Fixed with new revision of the patch


> On Feb. 2, 2017, 10:08 p.m., Nate Cole wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/events/listeners/tasks/TaskStatusListener.java,
> >  lines 215-216
> > <https://reviews.apache.org/r/53686/diff/7/?file=1608277#file1608277line215>
> >
> >     Potentially doing db work to multiple stages.  What are the 
> > implications of failure here?  Should it be @Transactional (if yes, 
> > remember the method cannot be public).
> >     
> >     Also, if some of them can be updated, and others not, can they be 
> > reconciled to "try again"?

Patch is genrally marking entire workflow of updating tasks, updating multiple 
stage and updating multiple requests as transactional.
This is with an intention that we want hostrolecommand status and its 
corresponding stage and request status to be always consistent in the database.

I believe we don't require this method to be marked as transactional because 
status for one stage should be independent of status of another stage. 

If some of them are updated and others are not then on subsequent reporting of 
task status, the ones which did not get updated will be correctly updated


> On Feb. 2, 2017, 10:08 p.m., Nate Cole wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/events/listeners/tasks/TaskStatusListener.java,
> >  lines 260-262
> > <https://reviews.apache.org/r/53686/diff/7/?file=1608277#file1608277line260>
> >
> >     Same as other comment.  Transactional?  Partial updates?

As Answered above, status of a request should be independent of status of any 
other request. 
If some of them are updated and others are not then on subsequent reporting of 
task status, the ones which did not get updated will be correctly updated


> On Feb. 2, 2017, 10:08 p.m., Nate Cole wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/events/publishers/TaskEventPublisher.java,
> >  lines 35-38
> > <https://reviews.apache.org/r/53686/diff/7/?file=1608278#file1608278line35>
> >
> >     This means that task events will happen on the same thread they came in 
> > on.  Is that ok?  We can still make this an async bus with one thread so it 
> > won't hold up processing of large amounts of Stages and HostRoleCommands.

Yes. This is intentional. we don't want hostrolecommand status to be updated on 
one thread and respective stage/request status to be updated on another thread. 
This can lead to DB inconsistency in terms of partial failures. 

With this patch workflow is:

function() {
  1. update hostrolecommand status using HostRoleCommadDAO#merge function. This 
function is marked as Transactional
  2. HostRoleCommadDAO#merge calls merge function for HRC entity and also 
publishes an event to update stage/request status
}

Thus the entire flow from merging status for HostRoleCommand Entity (before 
event is published) to merging status for corresponding RequestEntity(called by 
event subscriber method) is under a single transactional boundary. 
The transactional boundary extending from publisher method to subscriber 
methods is possible because we are using synchronous event bus.

On the other hand we are using hashmaps (activeTasksMap, activeStageMap and 
activeRequestMap) for storing status of non-completed request, it's stages and 
it's hostrolecommands. I belive that should be fast and not cause any 
performance regression even on large clusters.


> On Feb. 2, 2017, 10:08 p.m., Nate Cole wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java,
> >  lines 647-650
> > <https://reviews.apache.org/r/53686/diff/7/?file=1608279#file1608279line647>
> >
> >     Since you're deferring to mergeWithoutPublishEvent() that is marked 
> > @Transactional already, this method may not need those annotations.
> >     
> >     @Jonathan for input if it's required for both methods.

This method is calling mergeWithoutPublishEvent() and also 
publishTaskUpdateEvent()

We want both of them to be under same transaction, so if merging of 
stage/request status fails in the subscriber method of the taskupdate event 
then rollback happens for hostrolecommand status being updated from 
HostRoleCommandDAO#mergeWithoutPublishEvent method.


- Jaimin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53686/#review164025
-----------------------------------------------------------


On Feb. 6, 2017, 11:09 p.m., Jaimin Jetly wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53686/
> -----------------------------------------------------------
> 
> (Updated Feb. 6, 2017, 11:09 p.m.)
> 
> 
> Review request for Ambari, Jonathan Hurley, Nate Cole, Sumit Mohanty, and Sid 
> Wagle.
> 
> 
> Bugs: AMBARI-18868
>     https://issues.apache.org/jira/browse/AMBARI-18868
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Stage and Request status should be persisted in the database.
> 
> upgrading to ambari-3.0.0 should add status for all present stages and 
> request for the cluster.
> 
> 
> Diffs
> -----
> 
>   
> ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionDBAccessorImpl.java
>  7837a7b 
>   
> ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ActionScheduler.java
>  dabcb98 
>   
> ambari-server/src/main/java/org/apache/ambari/server/actionmanager/Request.java
>  31e11c1 
>   
> ambari-server/src/main/java/org/apache/ambari/server/actionmanager/Stage.java 
> 4a05b32 
>   
> ambari-server/src/main/java/org/apache/ambari/server/controller/internal/CalculatedStatus.java
>  3c415df 
>   
> ambari-server/src/main/java/org/apache/ambari/server/events/TaskCreateEvent.java
>  PRE-CREATION 
>   ambari-server/src/main/java/org/apache/ambari/server/events/TaskEvent.java 
> PRE-CREATION 
>   
> ambari-server/src/main/java/org/apache/ambari/server/events/TaskUpdateEvent.java
>  PRE-CREATION 
>   
> ambari-server/src/main/java/org/apache/ambari/server/events/listeners/tasks/TaskStatusListener.java
>  PRE-CREATION 
>   
> ambari-server/src/main/java/org/apache/ambari/server/events/publishers/TaskEventPublisher.java
>  PRE-CREATION 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/dao/HostRoleCommandDAO.java
>  02c4091 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/dao/RequestDAO.java 
> 1c4d0a3 
>   ambari-server/src/main/java/org/apache/ambari/server/orm/dao/StageDAO.java 
> d2f899f 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/entities/HostRoleCommandEntity.java
>  74271b9 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/entities/RequestEntity.java
>  7944d21 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/entities/StageEntity.java
>  f9c8810 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/entities/StageEntityPK.java
>  9ca0470 
>   
> ambari-server/src/main/java/org/apache/ambari/server/upgrade/UpgradeCatalog300.java
>  4f90ef3 
>   ambari-server/src/main/resources/Ambari-DDL-Derby-CREATE.sql b79c945 
>   ambari-server/src/main/resources/Ambari-DDL-MySQL-CREATE.sql 1c502bc 
>   ambari-server/src/main/resources/Ambari-DDL-Oracle-CREATE.sql c6d4ad0 
>   ambari-server/src/main/resources/Ambari-DDL-Postgres-CREATE.sql 1be87bb 
>   ambari-server/src/main/resources/Ambari-DDL-SQLAnywhere-CREATE.sql abe48e8 
>   ambari-server/src/main/resources/Ambari-DDL-SQLServer-CREATE.sql 169a464 
>   
> ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionDBAccessorImpl.java
>  177ac70 
>   
> ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java
>  ade625a 
>   
> ambari-server/src/test/java/org/apache/ambari/server/alerts/AmbariPerformanceRunnableTest.java
>  7b1a5a2 
>   
> ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UpgradeResourceProviderTest.java
>  a0701b6 
>   
> ambari-server/src/test/java/org/apache/ambari/server/controller/internal/UpgradeSummaryResourceProviderTest.java
>  619e367 
>   
> ambari-server/src/test/java/org/apache/ambari/server/events/listeners/tasks/TaskStatusListenerTest.java
>  PRE-CREATION 
>   
> ambari-server/src/test/java/org/apache/ambari/server/state/ConfigHelperTest.java
>  b1c10f5 
>   
> ambari-server/src/test/java/org/apache/ambari/server/state/cluster/ClusterEffectiveVersionTest.java
>  9d339e2 
>   
> ambari-server/src/test/java/org/apache/ambari/server/state/services/RetryUpgradeActionServiceTest.java
>  ed95b0b 
>   
> ambari-server/src/test/java/org/apache/ambari/server/upgrade/UpgradeCatalog300Test.java
>  d7979e8 
> 
> Diff: https://reviews.apache.org/r/53686/diff/
> 
> 
> Testing
> -------
> 
> Verified manually on a cluster by making api requests and upgrading ambari.
> Add unit tests.
> Verified that the patch does not break any existing unit tests on dev box. 
> Jenkins job overall unit test result pending..
> Verified on 1000 node cluster that the patch does not regress big operations. 
> Executed Stop Services and Start Services API call which gernerated around 
> 9000 tasks and compared  request completion time for these operations. There 
> was a minor performance gain with the patch. As part of 
> https://issues.apache.org/jira/browse/AMBARI-18889, I will look if we can use 
> the request status and stage status to further enhance performance.
> 
> 
> Thanks,
> 
> Jaimin Jetly
> 
>

Reply via email to