1.6.0rc1 is pretty old, have you tried with 1.6.0?

On Tue, May 5, 2015 at 9:31 AM, Wei Yan <[email protected]> wrote:
> Hi,
>
> I have run into a problem using AvroParquetInputFormat in my MapReduce job.
> The input files were written with two different schema versions. One field
> is "int" in v1 and "long" in v2. The exception:
>
> Exception in thread "main" parquet.schema.IncompatibleSchemaModificationException: can not merge type optional int32 a into optional int64 a
>     at parquet.schema.PrimitiveType.union(PrimitiveType.java:513)
>     at parquet.schema.GroupType.mergeFields(GroupType.java:359)
>     at parquet.schema.GroupType.union(GroupType.java:341)
>     at parquet.schema.GroupType.mergeFields(GroupType.java:359)
>     at parquet.schema.MessageType.union(MessageType.java:138)
>     at parquet.hadoop.ParquetFileWriter.mergeInto(ParquetFileWriter.java:497)
>     at parquet.hadoop.ParquetFileWriter.mergeInto(ParquetFileWriter.java:470)
>     at parquet.hadoop.ParquetFileWriter.getGlobalMetaData(ParquetFileWriter.java:446)
>     at parquet.hadoop.ParquetInputFormat.getSplits(ParquetInputFormat.java:429)
>     at parquet.hadoop.ParquetInputFormat.getSplits(ParquetInputFormat.java:412)
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:589)
>
> I'm using Parquet 1.5, and it looks like "int" cannot be merged with "long". I
> tried 1.6.0rc1 and set "parquet.strict.typing", but that did not help.
>
> Is there any way to solve this problem, such as automatically converting
> "int" to "long", instead of rewriting all the data with the same schema
> version?
>
> Thanks,
> Wei

-- 
Alex Levenson
@THISWILLWORK
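
For illustration only: the automatic "int" to "long" conversion asked about in the quoted message amounts to a widening rule applied while the per-file schemas are merged. Below is a minimal, self-contained sketch of such a rule. The class, enum, and method names are hypothetical, and this is not Parquet's API or its actual merge logic; it only shows the difference between a strict merge, which rejects the mismatch as in the stack trace, and a lenient merge that widens.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WideningMergeSketch {
    // Hypothetical stand-in for the two primitive types from the question.
    enum Primitive { INT32, INT64 }

    // Merge the type of one field as seen in two schema versions.
    // strict == true  : any mismatch is an error (the behavior in the trace).
    // strict == false : assumed lenient rule, widen the field to INT64.
    static Primitive union(Primitive a, Primitive b, boolean strict) {
        if (a == b) {
            return a;
        }
        if (strict) {
            throw new IllegalStateException("can not merge type " + b + " into " + a);
        }
        return Primitive.INT64;
    }

    // Merge two flat schemas field by field; fields present in only one
    // version are carried over unchanged.
    static Map<String, Primitive> mergeFields(Map<String, Primitive> v1,
                                              Map<String, Primitive> v2,
                                              boolean strict) {
        Map<String, Primitive> merged = new LinkedHashMap<>(v1);
        for (Map.Entry<String, Primitive> e : v2.entrySet()) {
            merged.merge(e.getKey(), e.getValue(), (x, y) -> union(x, y, strict));
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Primitive> v1 = new LinkedHashMap<>();
        v1.put("a", Primitive.INT32);
        Map<String, Primitive> v2 = new LinkedHashMap<>();
        v2.put("a", Primitive.INT64);
        System.out.println(mergeFields(v1, v2, false)); // prints {a=INT64}
    }
}
```

Calling `mergeFields(v1, v2, true)` on the same inputs throws, which mirrors the failure reported above. Whether a given Parquet release behaves like the lenient branch is not something this sketch establishes.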
