> On Feb. 8, 2015, 2:06 p.m., Benjamin Hindman wrote:
> > src/launcher/fetcher.cpp, line 388
> > <https://reviews.apache.org/r/30774/diff/1/?file=857926#file857926line388>
> >
> >     Kill '} else {'.

That would be wrong. Then the CHECK would fire every time.


> On Feb. 8, 2015, 2:06 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/fetcher.cpp, line 75
> > <https://reviews.apache.org/r/30774/diff/1/?file=857931#file857931line75>
> >
> >     This error message is a bit little wierd since it's not that it can't 
> > be removed, it's that we can determine it's 'realpath'. Also, why not use 
> > that for the actual os::rmdir below?

This was accidental merge mess when I combined and rebased patches. Sorry! 
Cleaned up now.


> On Feb. 8, 2015, 2:06 p.m., Benjamin Hindman wrote:
> > src/slave/containerizer/fetcher.cpp, line 323
> > <https://reviews.apache.org/r/30774/diff/1/?file=857931#file857931line323>
> >
> >     Were you explicitly trying to do this asynchronously? Since it could be 
> > an expensive operation (filesystem) I'm totally for it but just wanted to 
> > confirm.

Yes, any file operation needs to be async! I have sometimes encountered bad 
blocks on hard disks and experienced that when that happens any file operation 
that touches one may stall for minutes. Besides, we could be using a 
network-attached/mounted file system with arbitrarily long timeouts. Etc.


> On Feb. 8, 2015, 2:06 p.m., Benjamin Hindman wrote:
> > src/tests/fetcher_cache_tests.cpp, line 220
> > <https://reviews.apache.org/r/30774/diff/1/?file=857937#file857937line220>
> >
> >     This should occur in the HttpServer but then you should have an 
> > HttpServerProcess. When does process terminate? How are you sure that 
> > libprocess is not calling something like 'visit' after the memory for this 
> > class has been cleaned up!!!

I added this comment to the class to clarify what the intent is here:
// A minimal HTTP server, not intended as an actor, just reusing what
// is already implemented somewhere to serve some HTTP requests for
// file downloads, plus counting how many requests are made.

There is an explicit terminate() call before Shutdown() in every test that uses 
this. I will add this to the fixture when creating one for this kind of test.

BTW, this is similar to this stretch of code in process_tests.cpp:

class FileServer : public Process<FileServer>
{
public:
  explicit FileServer(const string& _path)
    : path(_path) {}

  virtual void initialize()
  {
    provide("", path);
  }

  const string path;
};
...
  FileServer server(path);
  PID<FileServer> pid = spawn(server);

...
  terminate(server);
  wait(server);
---


Should I add wait() after terminate() in my tests?


> On Feb. 8, 2015, 2:06 p.m., Benjamin Hindman wrote:
> > src/tests/fetcher_cache_tests.cpp, line 242
> > <https://reviews.apache.org/r/30774/diff/1/?file=857937#file857937line242>
> >
> >     I don't like the mixture of actors and threads here. Even if this is 
> > only in the tests, it sets a bad precedent that we're willing compromise on 
> > the integrity of the assumptions you have when executing within an actor.
> >     
> >     There is also a lot of expectations shared between the implementation 
> > here and the tests themselves. For example, 'pause' is meant to be used in 
> > order to enable queuing up multiple pending cache entry downloads.
> >     
> >     The abstraction I think you want here is something that lets you 
> > control what happens whenever we get a 'visit' callback, and to do that we 
> > really need to mock 'visit'. If you had a mocked 'visit' then you can 
> > explicitly control "how", and most importantly, "when" each 'visit' call 
> > occurs.

The HttpServer here is not an "actor". I am just reusing some code to produce 
an HttpServer as quickly as possible.

I think the problem I did miss is to establish that tasks have in fact queued 
up. Rather than just observing this when debugging. I may still need pausing 
the HTTP request handling then. Or at least some HTTP server to test the 
net::-related code paths in mesos-fetcher.

I don't see how a mocked method would have more control over when the fetcher 
does anything than an overridden method.


> On Feb. 8, 2015, 2:06 p.m., Benjamin Hindman wrote:
> > src/tests/fetcher_cache_tests.cpp, line 438
> > <https://reviews.apache.org/r/30774/diff/1/?file=857937#file857937line438>
> >
> >     There is a ton of state, expectations, control flow requirements, etc, 
> > throughout these tests. This makes them very hard to follow, and thus hard 
> > to understand if you're really testing the things you want to test (i.e., 
> > concurrent requests, etc). While it's probably more code, I'd rather have 
> > seen each test case testing exactly what you wanted to test rather than a 
> > bunch of functions pulled out to cover "commonality". It makes it very 
> > difficult for me to determine if certain test situations are covered!
> >     
> >     For example, in this function you await tasks if in "concurrent" mode. 
> > Yet, I'm pretty sure tasks aren't going to run in parallel unless there are 
> > enough resources to do so, and tracking down the size of the tasks and how 
> > many resources were launched with the slave is so non-local that it 
> > requires looking all over this source file and after doing so I'm not 
> > convinced that anything can actually run in parallel since the default 
> > slave has 2 CPUs and each task takes 1 CPU (so maybe 2 tasks run in 
> > parallel but not all of them?).
> >     
> >     Why not be more explicit in the tests, even if it's more code, so that 
> > we're convinced we're testing the things we want to test? My guess is that 
> > there will still be things we can pull out and abstract after having done 
> > that.


> On Feb. 8, 2015, 2:06 p.m., Benjamin Hindman wrote:
> > src/tests/fetcher_cache_tests.cpp, lines 132-135
> > <https://reviews.apache.org/r/30774/diff/1/?file=857937#file857937line132>
> >
> >     There is only "checkpointing" mode now, so don't bother trying to test 
> > both cases.

I tried removing this distinction and tests failed, both the recovery test and 
the others that don't test it. Keeping the distinction for now.


- Bernd


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30774/#review71575
-----------------------------------------------------------


On Feb. 8, 2015, 11:34 a.m., Bernd Mathiske wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30774/
> -----------------------------------------------------------
> 
> (Updated Feb. 8, 2015, 11:34 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and 
> Timothy Chen.
> 
> 
> Bugs: MESOS-2057, MESOS-2069, MESOS-2070, MESOS-2072, and MESOS-2074
>     https://issues.apache.org/jira/browse/MESOS-2057
>     https://issues.apache.org/jira/browse/MESOS-2069
>     https://issues.apache.org/jira/browse/MESOS-2070
>     https://issues.apache.org/jira/browse/MESOS-2072
>     https://issues.apache.org/jira/browse/MESOS-2074
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Replaces all other reviews for the fetcher cache except those related to 
> stout: 30006, 30033, 30034, 30036, 30037, 30039, 30124, 30173, 30614, 30616, 
> 30618, 30621, 30626. See descriptions of those.
> 
> 
> Diffs
> -----
> 
>   include/mesos/fetcher/fetcher.proto 
> facb87b92bf3194516f636dcc348e136af537721 
>   include/mesos/mesos.proto 3a2921dff856d37455593dcbf7340aa537997d35 
>   src/Makefile.am 0fe4809542a7d23785a221901947771320b43d52 
>   src/hdfs/hdfs.hpp 968545d9af896f3e72e156484cc58135405cef6b 
>   src/launcher/fetcher.cpp 5b2d86d1867b25bf71461fd73df91b089002325b 
>   src/slave/constants.hpp 7717abd4f9a3bd6fdca6af2364864e374ce2e056 
>   src/slave/containerizer/docker.hpp 70114dcef7a416ae904e95eb9ce365bcd866aebc 
>   src/slave/containerizer/docker.cpp 813ddf8c98622748b2da1b739bf122387abc4c79 
>   src/slave/containerizer/fetcher.hpp 
> bfd98dbe16e2bd5df3e2c8e9b10e303654f33446 
>   src/slave/containerizer/fetcher.cpp 
> 6e6bce08d76bb8a5813c905e3ffeff9b2411fd6d 
>   src/slave/containerizer/mesos/containerizer.hpp 
> b3aafe4efa9f3469d7a3fd39243ad66b46d6a54d 
>   src/slave/containerizer/mesos/containerizer.cpp 
> fa40d47aee7803833bcde6cce1e86a21d7cf27d0 
>   src/slave/flags.hpp f6033355d129f0013d39dd053455c936596bf159 
>   src/slave/slave.cpp fff2d725fe49eee984d9151cfb2131202c47994f 
>   src/tests/docker_containerizer_tests.cpp 
> 8b212d4e6ed5a179ebadce1bdbbf3edde87d706c 
>   src/tests/fetcher_cache_tests.cpp PRE-CREATION 
>   src/tests/fetcher_tests.cpp fcbf7ad912f0b1b3d106a75a9dcb8213a0c69c07 
>   src/tests/mesos.hpp 83a369968ab2403fa341829ac5d11f7243095190 
>   src/tests/mesos.cpp 21a405366f56c963611324076efe775f85b9d9f7 
>   src/tests/mock_hadoop.sh PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/30774/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>

Reply via email to