> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/messages/messages.proto, line 65
> > <https://reviews.apache.org/r/7212/diff/1/?file=159109#file159109line65>
> >
> >     Move type closer to the enum, and make it  '= 1' (it's the thing that 
> > differentiates the type of message you have).

done


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.hpp, line 19
> > <https://reviews.apache.org/r/7212/diff/1/?file=159110#file159110line19>
> >
> >     Why not match the filename? s/STATUSUPDATES/STATUS_UPDATES/

done


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.hpp, line 58
> > <https://reviews.apache.org/r/7212/diff/1/?file=159110#file159110line58>
> >
> >     Use an Option (and default it to none) if you're going to optionally 
> > send a message based on executorPID == UPID().

done


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.cpp, line 39
> > <https://reviews.apache.org/r/7212/diff/1/?file=159111#file159111line39>
> >
> >     If you used the slave's PID rather than self (line 243) then the slave 
> > could dispatch to statusUpdateAcknowledgement which would give the added 
> > benefit of pushing any errors in recording the status update 
> > acknowledgement back to the slave to take action. It's more work on the 
> > slave side, but it enables capturing those errors fairly cleanly. I'm happy 
> > to discuss this more in detail.

agreed. i also like the added benefit of symmetry.

passing slave pid via initialize(), so that we can inject different kind of 
SUMs (mock/no disk checkpointer) into the slave.


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.cpp, line 83
> > <https://reviews.apache.org/r/7212/diff/1/?file=159111#file159111line83>
> >
> >     If the slave removed the executor from it's data structures, and the 
> > slave dispatches to the SUM, how is this case possible? This sounds like it 
> > should be more of a CHECK than a LOG.

I dont' remember the scenario that needed me to have this log instead of check. 
Reverting this to check for now. I will update it later if the scenario is 
caught in tests.


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.cpp, line 102
> > <https://reviews.apache.org/r/7212/diff/1/?file=159111#file159111line102>
> >
> >     We might need to rethink the 'acknowledged' set, since it grows 
> > (without bound) with the number of tasks.

Discussed offline. The acknowledged set doesnt grow unbounded.


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.cpp, line 147
> > <https://reviews.apache.org/r/7212/diff/1/?file=159111#file159111line147>
> >
> >     Is this just your guess about how this will work? Or is there code that 
> > actually uses this? If so, I'd love to see that review too please.

This is how it is done in the slave. I was in 2 minds regarding where this 
recovery should happen. I went with all the recovery logic being inside the 
slave. Open to discussion.

I will send out that review (integration with he slave) next.


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.cpp, line 136
> > <https://reviews.apache.org/r/7212/diff/1/?file=159111#file159111line136>
> >
> >     How about having sendStatusUpdate return the timeout (since delay 
> > returns a Timer from which you can call Timer::timeout)?

done


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.cpp, line 187
> > <https://reviews.apache.org/r/7212/diff/1/?file=159111#file159111line187>
> >
> >     s/no  associated/no associated/

hawk eye!


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.cpp, line 206
> > <https://reviews.apache.org/r/7212/diff/1/?file=159111#file159111line206>
> >
> >     Just to be pedantic, you need to do this AFTER you write it to disk.

:)


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.cpp, line 248
> > <https://reviews.apache.org/r/7212/diff/1/?file=159111#file159111line248>
> >
> >     return delay(...).timeout();

done


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.cpp, line 282
> > <https://reviews.apache.org/r/7212/diff/1/?file=159111#file159111line282>
> >
> >     The update can have a large 'data' chunk. Any reason not to just write 
> > the UUID so that we don't have to write the data chunk twice?

good point. fixed


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.cpp, line 318
> > <https://reviews.apache.org/r/7212/diff/1/?file=159111#file159111line318>
> >
> >     Better yet, do a get which should return an Option.

even better, just killed this function in favor of inlining it :)


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.cpp, line 344
> > <https://reviews.apache.org/r/7212/diff/1/?file=159111#file159111line344>
> >
> >     Again, Option seems best here. And delete after you remove from the 
> > data structure please. ;)

How would that avoid double look up here?


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.cpp, line 123
> > <https://reviews.apache.org/r/7212/diff/1/?file=159111#file159111line123>
> >
> >     You could return the error through the future and let the slave commit 
> > suicide.

agreed. done


> On Oct. 15, 2012, 5:01 p.m., Benjamin Hindman wrote:
> > src/slave/status_updates_manager.hpp, line 47
> > <https://reviews.apache.org/r/7212/diff/1/?file=159110#file159110line47>
> >
> >     There needs to be a high-level comment here describing the semantics of 
> > receiving (from an executor, when is the executor ACKed?), recording (what 
> > about frameworks that don't want the penalty of recording?), and sending 
> > status updates (how long are they resent?). Likewise, describing a few of 
> > the data structures here such as what a "status update stream" is, how we 
> > "name" it, how long it sticks around, etc.

done. Let me know how it reads.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7212/#review12429
-----------------------------------------------------------


On Sept. 21, 2012, 6:20 p.m., Vinod Kone wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7212/
> -----------------------------------------------------------
> 
> (Updated Sept. 21, 2012, 6:20 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Ben Mahler.
> 
> 
> Description
> -------
> 
> This is Status Update Manager's API and implementation.
> 
> Haven't included the integration (into slave) and tests yet.
> 
> Rebased off latest trunk
> 
> 
> Diffs
> -----
> 
>   src/Makefile.am df8696920fd48968907270decbef3dda0803f80a 
>   src/messages/messages.proto 2a0321cb3d9e3f6499e2108a3b21516a3bd18d2f 
>   src/slave/status_updates_manager.hpp PRE-CREATION 
>   src/slave/status_updates_manager.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/7212/diff/
> 
> 
> Testing
> -------
> 
> make check.
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>

Reply via email to