massdosage commented on a change in pull request #843: InputFormat support for 
Iceberg
URL: https://github.com/apache/incubator-iceberg/pull/843#discussion_r403967401
 
 

 ##########
 File path: mr/src/test/java/org/apache/iceberg/mr/TestIcebergInputFormat.java
 ##########
 @@ -0,0 +1,242 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.mr;
+
+import com.google.common.collect.FluentIterable;
+import com.google.common.collect.ImmutableMap;
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Locale;
+import java.util.function.Function;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.mapreduce.InputSplit;
+import org.apache.hadoop.mapreduce.Job;
+import org.apache.hadoop.mapreduce.RecordReader;
+import org.apache.hadoop.mapreduce.TaskAttemptContext;
+import org.apache.hadoop.mapreduce.TaskAttemptID;
+import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl;
+import org.apache.iceberg.DataFile;
+import org.apache.iceberg.DataFiles;
+import org.apache.iceberg.FileFormat;
+import org.apache.iceberg.Files;
+import org.apache.iceberg.PartitionSpec;
+import org.apache.iceberg.Schema;
+import org.apache.iceberg.StructLike;
+import org.apache.iceberg.Table;
+import org.apache.iceberg.TableProperties;
+import org.apache.iceberg.TestHelpers.Row;
+import org.apache.iceberg.avro.Avro;
+import org.apache.iceberg.catalog.Catalog;
+import org.apache.iceberg.catalog.TableIdentifier;
+import org.apache.iceberg.data.RandomGenericData;
+import org.apache.iceberg.data.Record;
+import org.apache.iceberg.data.avro.DataWriter;
+import org.apache.iceberg.data.parquet.GenericParquetWriter;
+import org.apache.iceberg.hadoop.HadoopCatalog;
+import org.apache.iceberg.hadoop.HadoopTables;
+import org.apache.iceberg.io.FileAppender;
+import org.apache.iceberg.parquet.Parquet;
+import org.apache.iceberg.types.Types;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import static org.apache.iceberg.types.Types.NestedField.required;
+
+
+@RunWith(Parameterized.class)
+public class TestIcebergInputFormat {
 
 Review comment:
   We've had similar issues in our branch, where we are trying to get the Hive 
InputFormat to work. Hive 2.3.6 requires Guava 11.0.2; if a newer Guava version 
is on the classpath, Hive is unable to use the InputFormat due to exceptions 
similar to the one above. So we have to remove Guava as an exposed dependency 
from all Iceberg artifacts that appear on the Hive classpath. The only way 
we've managed to get it to work is by doing the following:
   
   * Alter every Iceberg module that uses Guava to shade and relocate it (which 
IMHO is a good thing to do anyway so external users of Iceberg can use their 
own versions of Guava).
   * Depend on the shaded version of these modules from iceberg-mr.
   * Remove Guava from `versions.props` so that different subprojects can 
depend on different versions of it.
   * The Guava version that then gets used in iceberg-mr is the transitive one 
from Hive 2.3.6 (in this case) which is Guava 11.0.2.
   
   You can see these changes here: 
https://github.com/ExpediaGroup/incubator-iceberg/blob/078a06ddd78d08648127d8b2e8dc41e0febf7f49/build.gradle
We're not Gradle experts, so hopefully there is an easier way to do all of this, 
but I think the general steps outlined above will still be required.
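
   When debugging this kind of conflict, it can help to confirm which jar a 
Guava class is actually resolved from on a given classpath. The following is 
only a minimal sketch, not part of the linked changes: the class name 
`ClasspathProbe` and the method `locate` are hypothetical, and the default 
class name is just one of the Guava classes this test imports.

```java
import java.security.CodeSource;
import java.util.Optional;

// Hypothetical diagnostic helper (not part of Iceberg or Hive).
public class ClasspathProbe {

  /**
   * Returns the location (jar or directory URL) a class is loaded from,
   * or empty if the class cannot be loaded or reports no code source
   * (as can happen for classes loaded by the bootstrap class loader).
   */
  static Optional<String> locate(String className) {
    try {
      // Load without initializing, so static initializers are not run.
      Class<?> clazz = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
      CodeSource source = clazz.getProtectionDomain().getCodeSource();
      if (source == null || source.getLocation() == null) {
        return Optional.empty();
      }
      return Optional.of(source.getLocation().toString());
    } catch (ClassNotFoundException | LinkageError e) {
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    // Defaults to a Guava class imported by this test; pass another name to probe it.
    String name = args.length > 0 ? args[0] : "com.google.common.collect.FluentIterable";
    System.out.println(name + " -> " + locate(name).orElse("not found on this classpath"));
  }
}
```

   Running this with the same classpath that Hive uses should show whether the 
class comes from Hive's transitive Guava jar, from a relocated (shaded) copy 
under a different package name, or from some other jar that leaked onto the 
classpath.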
   
   Ultimately, I think this issue is going to have to be solved for both 
InputFormats, so any changes that would allow different versions of Guava to be 
used would be great.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
