[ http://issues.apache.org/jira/browse/NUTCH-116?page=all ]

Paul Baclace updated NUTCH-116:
-------------------------------

    Attachment: required_by_TestNDFS_v2.patch

Change notes revised for patch required_by_TestNDFS_v2.patch, which supersedes
required_by_TestNDFS.patch:

src/java/org/apache/nutch/ipc/Server.java
  Set thread names to make it possible to follow the logging output and to know
  when a proper shutdown has completed. Server.stop() and Server.join() now use
  the safer notifyAll() instead of notify(), since the wait condition is on a
  public object; added comments that clarify the actual implementation.
  Tightened join() to be a proper while (running) { wait(); } loop, which avoids
  the hazard of "spurious wakeup" in POSIX threads (as noted in Effective Java
  by Joshua Bloch).
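
  The guarded-wait idiom described above can be sketched as follows. This is
  only an illustration: StoppableService and joinReturnsAfterStop() are
  hypothetical names, not the actual Server code from the patch.

```java
// Sketch only: StoppableService is a hypothetical stand-in, not Nutch's Server.
final class StoppableService {
    private boolean running = true; // guarded by this object's monitor

    /** Blocks the caller until stop() has been called. */
    public synchronized void join() throws InterruptedException {
        // Re-check the condition after every wakeup, because wait() may
        // return spuriously or because of an unrelated notification.
        while (running) {
            wait();
        }
    }

    /** Clears the flag and wakes every thread waiting in join(). */
    public synchronized void stop() {
        running = false;
        notifyAll(); // notify() could wake a single, unrelated waiter
    }

    /** Demo helper: true if a waiting thread leaves join() after stop(). */
    static boolean joinReturnsAfterStop() {
        StoppableService service = new StoppableService();
        // Naming the thread makes it easy to spot in logs and thread dumps.
        Thread waiter = new Thread(() -> {
            try {
                service.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "StoppableService join waiter");
        waiter.start();
        service.stop();
        try {
            waiter.join(5000); // bounded wait so the demo cannot hang
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("join() returned after stop(): " + joinReturnsAfterStop());
    }
}
```

  The loop matters even when stop() is the only notifier: anyone can call
  notify() on a publicly reachable object, so the flag is the only reliable
  signal that shutdown has happened.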

src/java/org/apache/nutch/ndfs/DataNode.java
  Improved logging details, added comments, improved an error message,
  refactored reusable code into makeInstanceForDir(), added toString(),
  added properties ndfs.blockreport.intervalMsec and ndfs.datanode.startupMsec
  to allow the override of BLOCKREPORT_INTERVAL and DATANODE_STARTUP_PERIOD,
  respectively, in order to speed up TestNDFS runs (otherwise a run would take
  an hour).
  These FSConstants fields are worth keeping as default values when a property
  is not set, so that the lookup idiom is:
    conf.getLong("ndfs.datanode.startupMsec", DATANODE_STARTUP_PERIOD);
  instead of:
    conf.getLong("ndfs.datanode.startupMsec", 1000*60*10);
  When a property lookup occurs in more than one place, it is best to have the
  default value come from FSConstants rather than have multiple, possibly 
  different, literal values as the default.
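
  A minimal sketch of that idiom follows, using java.util.Properties as a
  stand-in for the Nutch configuration object. StartupConfig and its getLong()
  helper are hypothetical and only imitate the shape of the lookup; the
  constant's value is the 1000*60*10 shown above.

```java
import java.util.Properties;

// Sketch only: java.util.Properties stands in for the Nutch configuration.
final class StartupConfig {
    /** Single shared default, playing the role of an FSConstants field. */
    static final long DATANODE_STARTUP_PERIOD = 10L * 60 * 1000;

    /** Hypothetical helper: returns the property as a long, or the default. */
    static long getLong(Properties conf, String name, long defaultValue) {
        String value = conf.getProperty(name);
        return value == null ? defaultValue : Long.parseLong(value.trim());
    }

    /** Every caller goes through the same constant, so defaults cannot drift. */
    static long startupMsec(Properties conf) {
        return getLong(conf, "ndfs.datanode.startupMsec", DATANODE_STARTUP_PERIOD);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(startupMsec(conf)); // property unset: shared default

        conf.setProperty("ndfs.datanode.startupMsec", "2000");
        System.out.println(startupMsec(conf)); // override, e.g. for a fast test run
    }
}
```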

src/java/org/apache/nutch/ndfs/FSDataset.java
  added toString() methods used in logging elsewhere.

src/java/org/apache/nutch/ndfs/FSNamesystem.java
  Changed chooseTarget() to behave as commented rather than as implemented. The
  comment says it forbids picking a target on the same host, but the code was
  using host:port as the basis of comparison, so different ports on the same
  host would appear to be different hosts. This mistake was probably the result
  of DatanodeInfo.getName() really returning host:port, not just the hostname
  that the method name implies (DatanodeInfo.getHost() removes the port number).
  Added property test.ndfs.same.host.targets.allowed, which allows target
  datanode selection to use the same host (the same host:port is never allowed).
  TestNDFS uses host:port comparison, and normal operation just uses 'host' to
  better distribute replicas.
  Simplified a chooseTarget() conditional which was redundantly
  checking against forbidden1, forbidden2 and the just-constructed
  forbiddenMachines containing the union of forbidden1 and forbidden2:

          if ((forbidden1 == null || ! forbidden1.contains(node)) &&
              (forbidden2 == null || ! forbidden2.contains(node)) &&
              (! forbiddenMachines.contains(node.getName()))) {
   
  The following:
     forbidden1.contains(node) == forbiddenMachines.contains(node.getName()) 
  is always true and uses host:port for the comparison.
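
  The difference between the two comparisons can be sketched as below.
  TargetFilter, hostOf() and isAllowed() are hypothetical names used only for
  illustration; they are not the FSNamesystem code, and the sketch assumes that
  node names are plain "host:port" strings.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch only: hypothetical helper, not the actual chooseTarget() logic.
final class TargetFilter {
    /** Strips ":port" from a "host:port" name; returns the input if no colon. */
    static String hostOf(String name) {
        int colon = name.indexOf(':');
        return colon < 0 ? name : name.substring(0, colon);
    }

    /**
     * Decides whether a candidate may be chosen as a target.
     * The same host:port is always rejected; the same host is rejected
     * unless sameHostAllowed is set (as a test on a single machine would).
     */
    static boolean isAllowed(String candidate, Set<String> forbiddenNames,
                             boolean sameHostAllowed) {
        if (forbiddenNames.contains(candidate)) {
            return false; // identical host:port
        }
        if (sameHostAllowed) {
            return true; // only the host:port comparison applies
        }
        String host = hostOf(candidate);
        for (String name : forbiddenNames) {
            if (hostOf(name).equals(host)) {
                return false; // different port, same machine
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> forbidden = new HashSet<>();
        forbidden.add("node1:7000");
        System.out.println(isAllowed("node1:7001", forbidden, false));
        System.out.println(isAllowed("node1:7001", forbidden, true));
        System.out.println(isAllowed("node2:7000", forbidden, false));
    }
}
```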

  Added logging for previously silent errors, emitted more info in some log
  messages, changed LOG.info() to LOG.warning(), added javadoc comments.

src/java/org/apache/nutch/ndfs/NameNode.java
  Added a way to stop the daemon for JUnit testing, added javadoc comments,
  renamed offerService() to join() to better indicate what the method
  really does, added property ndfs.namenode.handler.count to adjust the
  number of handlers to speed up testing, changed access of some fields
  from package to private (protected is also reasonable) to indicate quickly
  how self-contained the class is when studying the code.


> TestNDFS a JUnit test specifically for NDFS
> -------------------------------------------
>
>          Key: NUTCH-116
>          URL: http://issues.apache.org/jira/browse/NUTCH-116
>      Project: Nutch
>         Type: Test
>   Components: fetcher, indexer, searcher
>     Versions: 0.8-dev
>     Reporter: Paul Baclace
>  Attachments: TestNDFS.java, required_by_TestNDFS.patch, 
> required_by_TestNDFS_v2.patch
>
> TestNDFS is a JUnit test for NDFS using "pseudo multiprocessing" (or more 
> strictly, pseudo distributed) meaning all daemons run in one process and 
> sockets are used to communicate between daemons.  
> The test permutes various block sizes, number of files, file sizes, and 
> number of datanodes.  After creating 1 or more files and filling them with 
> random data, one datanode is shut down, and then the files are verified. 
> Next, all the random test files are deleted and we test for leakage 
> (non-deletion) by directly checking the real directories corresponding to the 
> datanodes still running.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
