[ https://issues.apache.org/jira/browse/NUTCH-1446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18064226#comment-18064226 ]

ASF GitHub Bot commented on NUTCH-1446:
---------------------------------------

lewismc commented on code in PR #905:
URL: https://github.com/apache/nutch/pull/905#discussion_r2907415018


##########
src/java/org/apache/nutch/indexer/IndexerOutputFormat.java:
##########
@@ -40,32 +33,67 @@ public RecordWriter<Text, NutchIndexAction> getRecordWriter(
     Configuration conf = context.getConfiguration();
     final IndexWriters writers = IndexWriters.get(conf);
 
-    String name = getUniqueFile(context, "part", "");
-    writers.open(conf, name);
+    // open writers (no temporary file output anymore)
+    writers.open(conf, "index");
     LOG.info(writers.describe());
 
     return new RecordWriter<Text, NutchIndexAction>() {
 
       @Override
       public void close(TaskAttemptContext context) throws IOException {
-        // do the commits once and for all the reducers in one go
-        boolean noCommit = conf
-            .getBoolean(IndexerMapReduce.INDEXER_NO_COMMIT, false);
+
+        boolean noCommit =
+            conf.getBoolean(IndexerMapReduce.INDEXER_NO_COMMIT, false);
+
         if (!noCommit) {
           writers.commit();
         }
+
         writers.close();
       }
 
       @Override
       public void write(Text key, NutchIndexAction indexAction)
           throws IOException {
+
         if (indexAction.action == NutchIndexAction.ADD) {
           writers.write(indexAction.doc);
+
         } else if (indexAction.action == NutchIndexAction.DELETE) {
           writers.delete(key.toString());
         }
       }
     };
   }
-}
+
+  @Override
+  public void checkOutputSpecs(JobContext context)
+      throws IOException, InterruptedException {
+    // No output specs required since we don't write files
+  }
+
+  @Override
+  public OutputCommitter getOutputCommitter(TaskAttemptContext context)
+      throws IOException, InterruptedException {
+
+    return new OutputCommitter() {
+
+      @Override
+      public void setupJob(JobContext jobContext) {}
+
+      @Override
+      public void setupTask(TaskAttemptContext taskContext) {}
+
+      @Override
+      public boolean needsTaskCommit(TaskAttemptContext taskContext) {
+        return false;
+      }
+
+      @Override
+      public void commitTask(TaskAttemptContext taskContext) {}
+
+      @Override
+      public void abortTask(TaskAttemptContext taskContext) {}
+    };

Review Comment:
   @shishir-kuet can you address this issue? Thank you.





> Port NUTCH-1444 to trunk (Indexing should not create temporary files)
> ---------------------------------------------------------------------
>
>                 Key: NUTCH-1446
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1446
>             Project: Nutch
>          Issue Type: Bug
>            Reporter: Ferdy
>            Priority: Major
>             Fix For: 1.23
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
