[ https://issues.apache.org/jira/browse/TAJO-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633507#comment-14633507 ]
ASF GitHub Bot commented on TAJO-1464:
--------------------------------------
Github user hyunsik commented on a diff in the pull request:
https://github.com/apache/tajo/pull/579#discussion_r34988886
--- Diff: tajo-storage/tajo-storage-hdfs/src/main/java/org/apache/tajo/storage/orc/ORCScanner.java ---
@@ -0,0 +1,328 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.tajo.storage.orc;
+
+import com.google.protobuf.InvalidProtocolBufferException;
+import org.apache.commons.logging.Log;
+import org.apache.commons.logging.LogFactory;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.tajo.catalog.Schema;
+import org.apache.tajo.catalog.TableMeta;
+import org.apache.tajo.common.TajoDataTypes;
+import org.apache.tajo.conf.TajoConf;
+import org.apache.tajo.datum.*;
+import org.apache.tajo.exception.UnsupportedException;
+import org.apache.tajo.plan.expr.EvalNode;
+import org.apache.tajo.storage.FileScanner;
+import org.apache.tajo.storage.StorageConstants;
+import org.apache.tajo.storage.Tuple;
+import org.apache.tajo.storage.VTuple;
+import org.apache.tajo.storage.fragment.Fragment;
+import com.facebook.presto.orc.*;
+import com.facebook.presto.orc.metadata.OrcMetadataReader;
+import org.apache.tajo.storage.thirdparty.orc.HdfsOrcDataSource;
+import org.apache.tajo.util.datetime.DateTimeUtil;
+import org.joda.time.DateTimeZone;
+
+import java.io.IOException;
+import java.util.HashSet;
+import java.util.Set;
+
+/**
+ * OrcScanner for reading ORC files
+ */
+public class ORCScanner extends FileScanner {
+ private static final Log LOG = LogFactory.getLog(ORCScanner.class);
+ private OrcRecordReader recordReader;
+ private Vector [] vectors;
+ private int currentPosInBatch = 0;
+ private int batchSize = 0;
+
+ public ORCScanner(Configuration conf, final Schema schema, final TableMeta meta, final Fragment fragment) {
+ super(conf, schema, meta, fragment);
+ }
+
+ private Vector createOrcVector(TajoDataTypes.DataType type) {
+ switch (type.getType()) {
+ case INT1: case INT2: case INT4: case INT8:
+ case UINT1: case UINT2: case UINT4: case UINT8:
+ case INET4:
+ case TIMESTAMP:
--- End diff --
Does ORC use the long type for timestamp, date, and INET4 values?
As far as I know, Tajo represents INET4 and date values as integers
and timestamp values as longs. So, if you use LongVector for all of those
types, it won't be compatible.
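To illustrate the width concern in the comment above, here is a minimal, self-contained Java sketch. It does not use the Tajo or Presto ORC APIs: the class name, `ColumnType`, and `toTajoWidth` are hypothetical, and the premise that the reader hands back every integral column as a 64-bit long is an assumption made only for illustration.

```java
public class VectorWidthSketch {
    // Hypothetical stand-in for the column types discussed in the review;
    // this is NOT the real TajoDataTypes.Type enum.
    public enum ColumnType { INT8, INET4, DATE, TIMESTAMP }

    /**
     * Converts a raw 64-bit value (assumed to come from a long-based vector)
     * to the width the comment says Tajo expects:
     * int for DATE and INET4, long for TIMESTAMP and INT8.
     */
    public static Object toTajoWidth(ColumnType type, long raw) {
        switch (type) {
            case DATE:
                // Fail loudly instead of silently truncating an out-of-range value.
                return Math.toIntExact(raw);
            case INET4:
                // Assumption: the long holds an unsigned 32-bit address,
                // so keeping the low 32 bits preserves it.
                return (int) raw;
            case TIMESTAMP:
            case INT8:
                return raw;
            default:
                throw new IllegalArgumentException("Unhandled type: " + type);
        }
    }

    public static void main(String[] args) {
        Object date = toTajoWidth(ColumnType.DATE, 16636L);
        Object ts = toTajoWidth(ColumnType.TIMESTAMP, 1437350400000000L);
        Object addr = toTajoWidth(ColumnType.INET4, 3232235521L); // 192.168.0.1

        System.out.println(date.getClass().getSimpleName());
        System.out.println(ts.getClass().getSimpleName());
        System.out.println(Integer.toUnsignedLong((Integer) addr));
    }
}
```

The point of the sketch is only that a single long-based vector can back all of these types if the scanner narrows per type when it builds the tuple; whether the actual reader behaves this way is the question the comment asks.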
> Add ORCFileScanner to read ORCFile table
> ----------------------------------------
>
> Key: TAJO-1464
> URL: https://issues.apache.org/jira/browse/TAJO-1464
> Project: Tajo
> Issue Type: Sub-task
> Components: Storage
> Affects Versions: 0.10.0
> Reporter: Dongjoon Hyun
> Assignee: Jongyoung Park
> Fix For: 0.11.0
>
> Attachments: TAJO-1464.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)