[jira] [Commented] (ORC-52) Add support for mapreduce InputFormat and OutputFormat

ASF GitHub Bot (JIRA) Tue, 31 May 2016 15:19:24 -0700

    [ https://issues.apache.org/jira/browse/ORC-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308758#comment-15308758 ]


ASF GitHub Bot commented on ORC-52:
-----------------------------------

Github user wagnermarkd commented on a diff in the pull request:

    https://github.com/apache/orc/pull/27#discussion_r65273378
  
    --- Diff: java/mapreduce/src/java/org.apache.orc.mapreduce/OrcOutputFormat.java ---
    @@ -0,0 +1,83 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.orc.mapreduce;
    +
    +import org.apache.hadoop.conf.Configuration;
    +import org.apache.hadoop.fs.Path;
    +import org.apache.hadoop.io.NullWritable;
    +import org.apache.hadoop.io.Writable;
    +import org.apache.hadoop.mapreduce.OutputCommitter;
    +import org.apache.hadoop.mapreduce.RecordWriter;
    +import org.apache.hadoop.mapreduce.TaskAttemptContext;
    +import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    +import org.apache.hadoop.util.ReflectionUtils;
    +import org.apache.orc.CompressionKind;
    +import org.apache.orc.OrcConf;
    +import org.apache.orc.OrcFile;
    +import org.apache.orc.TypeDescription;
    +import org.apache.orc.Writer;
    +
    +import java.io.IOException;
    +
    +/**
    + * An ORC output format that satisfies the org.apache.hadoop.mapreduce API.
    + */
    +public class OrcOutputFormat<V extends Writable>
    +    extends FileOutputFormat<NullWritable, V> {
    +  private static final String EXTENSION = ".orc";
    +  // This is useful for unit tests or local runs where you don't need the
    +  // output committer.
    +  public static final String SKIP_TEMP_DIRECTORY =
    +      "orc.mapreduce.output.skip-temporary-directory";
    +
    +  @Override
    +  public RecordWriter<NullWritable, V>
    +       getRecordWriter(TaskAttemptContext taskAttemptContext
    +                       ) throws IOException {
    +    Configuration conf = taskAttemptContext.getConfiguration();
    +    Path filename = getDefaultWorkFile(taskAttemptContext, EXTENSION);
    +    Writer writer = OrcFile.createWriter(filename,
    +        OrcFile.writerOptions(conf)
    --- End diff --
    
    Same question as for the reader options: can we pull all this boilerplate
    into the writerOptions method?
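
    To illustrate the kind of refactoring being asked about, here is a minimal,
    self-contained sketch. It is not the ORC API: SketchWriterOptions, its
    fromConf factory, and the "sketch.*" keys and defaults are hypothetical
    stand-ins, and java.util.Properties stands in for the Hadoop Configuration.
    The idea is that the options factory reads each setting from the
    configuration itself, so the output format does not have to repeat that
    code at every call site.

```java
import java.util.Properties;

// Hypothetical stand-in for a writer-options builder; NOT the real org.apache.orc API.
final class SketchWriterOptions {
  // Illustrative configuration keys, assumed for this sketch only.
  static final String COMPRESS_KEY = "sketch.compress";
  static final String STRIPE_SIZE_KEY = "sketch.stripe.size";

  private String compression;
  private long stripeSize;

  private SketchWriterOptions() {}

  // The factory reads every configurable value once, falling back to defaults,
  // so callers do not need per-call-site boilerplate.
  static SketchWriterOptions fromConf(Properties conf) {
    SketchWriterOptions opts = new SketchWriterOptions();
    opts.compression = conf.getProperty(COMPRESS_KEY, "ZLIB");
    opts.stripeSize =
        Long.parseLong(conf.getProperty(STRIPE_SIZE_KEY, "67108864"));
    return opts;
  }

  // Explicit setters remain available for callers that need to override
  // a value taken from the configuration.
  SketchWriterOptions compress(String kind) {
    this.compression = kind;
    return this;
  }

  SketchWriterOptions stripeSize(long bytes) {
    this.stripeSize = bytes;
    return this;
  }

  String getCompression() {
    return compression;
  }

  long getStripeSize() {
    return stripeSize;
  }
}
```

    With a factory shaped like this, a caller would only chain the settings
    that cannot come from the configuration, for example
    SketchWriterOptions.fromConf(conf).stripeSize(1024L).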


> Add support for mapreduce InputFormat and OutputFormat
> ------------------------------------------------------
>
>                 Key: ORC-52
>                 URL: https://issues.apache.org/jira/browse/ORC-52
>             Project: Orc
>          Issue Type: Improvement
>          Components: Java, MapReduce
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 1.1.0
>
>
> We have the mapred InputFormat and OutputFormat, but we need one for the 
> newer API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
