[
http://issues.apache.org/jira/browse/NUTCH-116?page=comments#action_12332546 ]
Paul Baclace commented on NUTCH-116:
------------------------------------
Doug,
Thanks for the quick response.
1. Should BLOCKREPORT_INTERVAL and DATANODE_STARTUP_PERIOD be removed from
FSConstants entirely?
I see two choices here: (1) have these constants remain as defaults used when
override properties are not used (the normal case for these settings) or (2)
remove them and have the defaults be literal constants in statements scattered
around. My preference is (1) because (2) allows for multiple, different
literal constant defaults when a setting is retrieved in more than one code
block.
2.A. Regarding Server.java and switching all notify() to notifyAll(): I'm glad
you spotted that. I was debugging a non-liveliness problem and should have
circled back and justified the dogmatic use of notifyAll(). Looking at it
again, two of them are waiting on a private object and that is perfect
candidate for notify(). The wait and notify between join() and stop() should
be notifyAll() because it is waiting on a non-private object. I revised this
and tightened the join() to be a proper while (running) { wait()} which avoids
the hazard of "spurious wakeup" in Posix threads (as noted in Effective Java by
Joshua Bloch).
2.B. I eliminated a change that removed ' synchronized (FSNamesystem.this) '
from
LeaseMonitor thinking it was unnecessary. Looking at it again, I am not sure
why
is is necessary to synchronize on the enclosing scope FSNamesystem.this, but it
would
ensure that many file operations could not begin until the lease renewals were
finished.
It seems that method synchronization was used throughout for conservative
safety.
3. "Changes to comments, documentation, logging, etc. are better contributed as
separate patches". I separated out changes not strictly required for TestNDFS
in an earlier patch, but I did not attempt to do this at a granularity below
the file level when these human-understanding changes co-occur with ones that
are functionally necessary because it is unwieldy to separate out the edits,
and test separately, etc. Additionally, some logging changes like giving
numbers to Server Handler Thread names make it much easier to determine when a
daemon is properly shutdown.
Paul
> TestNDFS a JUnit test specifically for NDFS
> -------------------------------------------
>
> Key: NUTCH-116
> URL: http://issues.apache.org/jira/browse/NUTCH-116
> Project: Nutch
> Type: Test
> Components: fetcher, indexer, searcher
> Versions: 0.8-dev
> Reporter: Paul Baclace
> Attachments: TestNDFS.java, required_by_TestNDFS.patch,
> required_by_TestNDFS_v2.patch
>
> TestNDFS is a JUnit test for NDFS using "pseudo multiprocessing" (or more
> strictly, pseudo distributed) meaning all daemons run in one process and
> sockets are used to communicate between daemons.
> The test permutes various block sizes, number of files, file sizes, and
> number of datanodes. After creating 1 or more files and filling them with
> random data, one datanode is shutdown, and then the files are verfified.
> Next, all the random test files are deleted and we test for leakage
> (non-deletion) by directly checking the real directories corresponding to the
> datanodes still running.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira