[GitHub] keith-turner commented on a change in pull request #743: New MapReduce API

GitBox Thu, 01 Nov 2018 08:13:23 -0700

keith-turner commented on a change in pull request #743: New MapReduce API
URL: https://github.com/apache/accumulo/pull/743#discussion_r230074967


 ##########
 File path: hadoop-mapreduce/src/main/java/org/apache/accumulo/hadoop/mapred/AccumuloFileOutputFormat.java
 ##########
 @@ -16,184 +16,48 @@
  */
 package org.apache.accumulo.hadoop.mapred;
 
-import java.io.IOException;
-
-import org.apache.accumulo.core.client.rfile.RFile;
-import org.apache.accumulo.core.client.rfile.RFileWriter;
-import org.apache.accumulo.core.client.sample.SamplerConfiguration;
-import org.apache.accumulo.core.client.summary.Summarizer;
 import org.apache.accumulo.core.client.summary.SummarizerConfiguration;
-import org.apache.accumulo.core.conf.AccumuloConfiguration;
-import org.apache.accumulo.core.conf.Property;
 import org.apache.accumulo.core.data.Key;
-import org.apache.accumulo.core.data.Value;
-import org.apache.accumulo.hadoopImpl.mapreduce.lib.ConfiguratorBase;
-import org.apache.accumulo.hadoopImpl.mapreduce.lib.FileOutputConfigurator;
-import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
+import org.apache.accumulo.hadoop.mapreduce.FileOutputInfo;
+import org.apache.accumulo.hadoopImpl.mapred.AccumuloFileOutputFormatImpl;
 import org.apache.hadoop.mapred.FileOutputFormat;
 import org.apache.hadoop.mapred.JobConf;
-import org.apache.hadoop.mapred.RecordWriter;
-import org.apache.hadoop.mapred.Reporter;
-import org.apache.hadoop.util.Progressable;
-import org.apache.log4j.Logger;
 
 /**
  * This class allows MapReduce jobs to write output in the Accumulo data file format.<br>
  * Care should be taken to write only sorted data (sorted by {@link Key}), as this is an important
  * requirement of Accumulo data files.
  *
  * <p>
- * The output path to be created must be specified via
- * {@link AccumuloFileOutputFormat#setOutputPath(JobConf, Path)}. This is inherited from
- * {@link FileOutputFormat#setOutputPath(JobConf, Path)}. Other methods from
- * {@link FileOutputFormat} are not supported and may be ignored or cause failures. Using other
- * Hadoop configuration options that affect the behavior of the underlying files directly in the
- * Job's configuration may work, but are not directly supported at this time.
+ * The output path to be created must be specified via {@link #setInfo(Job, FileOutputInfo)} using
+ * {@link FileOutputInfo#builder()}.outputPath(path). For all available options see
+ * {@link FileOutputInfo#builder()}
+ * <p>
+ * Methods inherited from {@link FileOutputFormat} are not supported and may be ignored or cause
+ * failures. Using other Hadoop configuration options that affect the behavior of the underlying
+ * files directly in the Job's configuration may work, but are not directly supported at this time.
 
 Review comment:
   Needs a `@since` tag in the class Javadoc.
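
   For illustration only, here is a rough sketch of where a `@since` tag usually sits in a Javadoc block. The class `SinceTagExample`, its method `introducedIn`, and the version `2.0.0` are all hypothetical placeholders, not taken from this PR; the actual version should be whichever release first ships the new API.

```java
/**
 * Hypothetical stand-in for the class under review; only the Javadoc layout matters here.
 *
 * <p>
 * Block tags such as {@code @since} go after the main description.
 *
 * @since 2.0.0
 */
public class SinceTagExample {

  /**
   * Returns the release in which this (hypothetical) class first appeared.
   *
   * @return the assumed version string; a placeholder, not a confirmed Accumulo release
   * @since 2.0.0
   */
  public static String introducedIn() {
    return "2.0.0";
  }

  public static void main(String[] args) {
    // Print the placeholder version so the sketch can be run end to end.
    System.out.println(introducedIn());
  }
}
```

   The tag is placed at class level here because the comment is attached to the class Javadoc; whether individual methods also need it depends on the project's conventions.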

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
