Hi,

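After today's big update, invertlinks no longer seems to work when a linkdb doesn't exist yet, because fs.exists checks the wrong directory (linkdb/ rather than linkdb/current). The lock file is created under linkdb/ before that check, so linkdb/ presumably always exists by then and the merge step is attempted even on a first run, when there is no linkdb/current to merge with.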

A simple patch is attached.
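
For anyone who wants to see the effect without a Hadoop setup, here is a minimal sketch that mimics the situation with java.nio.file instead of Hadoop's FileSystem. The class name, the helper methods and the ".locked" / "current" values are assumptions made for illustration only; the real constants are LOCK_NAME and CURRENT_NAME in LinkDb. It also assumes that creating the lock file leaves linkdb/ in place, which is what the sketch does explicitly.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class LinkDbExistsSketch {
    // Assumed values, for illustration only; see LOCK_NAME and CURRENT_NAME in LinkDb.
    static final String LOCK_NAME = ".locked";
    static final String CURRENT_NAME = "current";

    /** Mimics a first run: linkdb/ contains only the lock file, no current/ yet. */
    static Path createLockedLinkDb(Path parent) throws IOException {
        Path linkDb = parent.resolve("linkdb");
        Files.createDirectories(linkDb);
        Files.createFile(linkDb.resolve(LOCK_NAME));
        return linkDb;
    }

    /** The check before the patch: is linkdb/ there at all? */
    static boolean oldCheck(Path linkDb) {
        return Files.exists(linkDb);
    }

    /** The check after the patch: is there a linkdb/current to merge with? */
    static boolean newCheck(Path linkDb) {
        return Files.exists(linkDb.resolve(CURRENT_NAME));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("linkdb-sketch");
        Path linkDb = createLockedLinkDb(tmp);
        System.out.println("old check (linkdb/): " + oldCheck(linkDb));
        System.out.println("new check (linkdb/current): " + newCheck(linkDb));
    }
}
```

On a fresh directory the old check succeeds because of the lock file's parent directory, while the new check only succeeds once linkdb/current actually exists.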

--
Doğacan Güney
Index: src/java/org/apache/nutch/crawl/LinkDb.java
===================================================================
--- src/java/org/apache/nutch/crawl/LinkDb.java	(revision 490745)
+++ src/java/org/apache/nutch/crawl/LinkDb.java	(working copy)
@@ -212,6 +212,7 @@
   public void invert(Path linkDb, Path[] segments, boolean normalize, boolean filter, boolean force) throws IOException {
 
     Path lock = new Path(linkDb, LOCK_NAME);
+    Path currentLinkDb = new Path(linkDb, CURRENT_NAME);
     FileSystem fs = FileSystem.get(getConf());
     LockUtil.createLockFile(fs, lock, force);
     if (LOG.isInfoEnabled()) {
@@ -233,14 +234,14 @@
       LockUtil.removeLockFile(fs, lock);
       throw e;
     }
-    if (fs.exists(linkDb)) {
+    if (fs.exists(currentLinkDb)) {
       if (LOG.isInfoEnabled()) {
         LOG.info("LinkDb: merging with existing linkdb: " + linkDb);
       }
       // try to merge
       Path newLinkDb = job.getOutputPath();
       job = LinkDb.createMergeJob(getConf(), linkDb, normalize, filter);
-      job.addInputPath(new Path(linkDb, CURRENT_NAME));
+      job.addInputPath(currentLinkDb);
       job.addInputPath(newLinkDb);
       try {
         JobClient.runJob(job);
_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
