Thanks for the reply.
I am now using ...
Writable[] values = new Writable[2];
values[0] = new Text("abc");
values[1] = new IntWritable(24);
ArrayWritable value = new ArrayWritable(Writable.class, values);

List<String> columnNames = new ArrayList<String>();
columnNames.add("name");
columnNames.add("age");
List<TypeInfo> columnTypes = TypeInfoUtils.getTypeInfosFromTypeString("string,int");
TypeInfo rowTypeInfo = TypeInfoFactory.getStructTypeInfo(columnNames, columnTypes);

writer.write(new ParquetHiveRecord(value, (StructObjectInspector) objInspector));
The above code works for string, long, double, float, int, boolean, and date types, but I am getting the exception below for datetime and decimal:
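(A hedged note, not from the original thread: Hive's type-string grammar uses Hive type names, so the two problem columns would normally be declared as `timestamp` rather than `datetime`, and `decimal(precision,scale)` rather than a raw `int32`. A type string covering the failing types might look like:)

```
string,int,timestamp,decimal(2,0)
```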
message basket {
required int32 b (DECIMAL(2,0));
}
log4j:WARN No appenders could be found for logger (org.apache.hadoop.conf.Configuration.deprecation).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.RuntimeException: Parquet record is malformed: parquet.column.values.dictionary.DictionaryValuesWriter$PlainIntegerDictionaryValuesWriter
	at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.write(DataWritableWriter.java:64)
	at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:59)
	at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriteSupport.write(DataWritableWriteSupport.java:31)
	at parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:121)
	at parquet.hadoop.ParquetWriter.write(ParquetWriter.java:258)
	at ParquetTestWriter.main(ParquetTestWriter.java:107)
Caused by: java.lang.UnsupportedOperationException: parquet.column.values.dictionary.DictionaryValuesWriter$PlainIntegerDictionaryValuesWriter
	at parquet.column.values.ValuesWriter.writeBytes(ValuesWriter.java:95)
	at parquet.column.values.fallback.FallbackValuesWriter.writeBytes(FallbackValuesWriter.java:162)
	at parquet.column.impl.ColumnWriterV2.write(ColumnWriterV2.java:157)
	at parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.addBinary(MessageColumnIO.java:346)
	at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writePrimitive(DataWritableWriter.java:302)
	at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeValue(DataWritableWriter.java:106)
	at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.writeGroupFields(DataWritableWriter.java:89)
	at org.apache.hadoop.hive.ql.io.parquet.write.DataWritableWriter.write(DataWritableWriter.java:60)
	... 5 more
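(An editorial sketch in plain Java, not the Hive writer itself: per the Parquet format spec, DECIMAL with precision <= 9 can be stored in an int32 holding the unscaled value, so a schema of `required int32 b (DECIMAL(2,0))` expects integer writes. The trace above shows an `addBinary` call reaching the int32 dictionary writer, whose `writeBytes` is unsupported, i.e. the value being handed over is binary while the column is int32.)

```java
import java.math.BigDecimal;

// Illustration of the int32-backed DECIMAL encoding the schema above declares:
// the stored int is the unscaled value, and the scale comes from the schema.
public class DecimalAsInt32 {
    // Encode a decimal as its unscaled int32 value for the given scale.
    static int encode(BigDecimal d, int scale) {
        return d.setScale(scale).unscaledValue().intValueExact();
    }

    // Decode an int32 unscaled value back to a decimal.
    static BigDecimal decode(int unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    public static void main(String[] args) {
        System.out.println(encode(new BigDecimal("24"), 0));  // 24
        System.out.println(encode(new BigDecimal("3.5"), 1)); // 35
        System.out.println(decode(35, 1));                    // 3.5
    }
}
```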
Please help!
-----Original Message-----
From: Mohammad Islam [mailto:[email protected]]
Sent: Saturday, October 31, 2015 6:50 AM
To: [email protected]
Subject: Re: Write a list in parquet using JAVA api
Prematurely sent ...
Adding on Ryan's comment: sometimes it seems confusing to understand how Parquet and the other object models work. A relevant link:
http://grepalex.com/2014/05/13/parquet-file-format-and-object-model/
Regards,
Mohammad
On Friday, October 30, 2015 6:18 PM, Mohammad Islam <[email protected]>
wrote:
On Thursday, October 29, 2015 10:02 AM, Ryan Blue <[email protected]>
wrote:
Hi Manisha,
The main recommendation I have is to not use the
org.apache.parquet.example.* classes. Those are an example of how to implement
an object model, not classes that can or should be used in an application that
reads or writes Parquet data.
The best thing is to use one of the real object models, like Avro or Thrift.
That way you get the option of using row-oriented or column-oriented storage in
your application without translating between object models.
rb
On 10/29/2015 01:46 AM, Manisha Sethi wrote:
> Hi All,
>
> I am trying to write a list in parquet using the below code, but something is
> going wrong.
>
> MessageType schema = MessageTypeParser.parseMessageType(
>     "message basket { required group myList (LIST) {"
>     + " repeated group list { required float listfloat; } } }");
> ParquetWriter<Group> writer = new ParquetWriter<Group>(outDirPath,
>     new GroupWriteSupport() {
>         @Override
>         public WriteContext init(Configuration configuration) {
>             if (configuration.get(GroupWriteSupport.PARQUET_EXAMPLE_SCHEMA) == null) {
>                 configuration.set(GroupWriteSupport.PARQUET_EXAMPLE_SCHEMA,
>                     schema.toString());
>             }
>             return super.init(configuration);
>         }
>     }, CompressionCodecName.SNAPPY, 256 * 1024 * 1024, 100 * 1024);
> GroupWriteSupport.setSchema(schema, config);
> SimpleGroupFactory f = new SimpleGroupFactory(schema);
> writer.write(f.newGroup().append("listfloat", (float) 2.8)
>     .append("listfloat", 3.3f));
>
>
> It's not working. Exception:
> log4j:WARN No appenders could be found for logger
> (org.apache.hadoop.conf.Configuration.deprecation).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more
> info.
> Exception in thread "main" parquet.io.InvalidRecordException: listfloat not
> found in message basket {
> required group myList (LIST) {
> repeated group list {
> required float listfloat;
> }
> }
> }
>
> at parquet.schema.GroupType.getFieldIndex(GroupType.java:147)
> at parquet.example.data.Group.add(Group.java:39)
> at parquet.example.data.Group.append(Group.java:107)
> at ParquetTestWriter.main(ParquetTestWriter.java:90)
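(An editorial aside, modeled in plain Java rather than the Parquet API: `GroupType.getFieldIndex` resolves only a group's *direct* children. The message's only direct field is `myList`; `listfloat` lives two groups deeper, inside the repeated `list` group, so appending it on the top-level group cannot find it and raises `InvalidRecordException`.)

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal model of the failing field lookup: each group maps
// field name -> child subtree (null for a primitive leaf), and
// lookup, like GroupType.getFieldIndex, sees direct children only.
public class FieldLookup {
    // Inner repeated group: list { required float listfloat; }
    static Map<String, Object> listGroup() {
        Map<String, Object> list = new LinkedHashMap<>();
        list.put("listfloat", null); // primitive leaf
        return list;
    }

    // Top-level message: basket { myList (LIST) { list { listfloat } } }
    static Map<String, Object> basket() {
        Map<String, Object> myList = new LinkedHashMap<>();
        myList.put("list", listGroup()); // repeated group
        Map<String, Object> basket = new LinkedHashMap<>();
        basket.put("myList", myList);    // LIST group
        return basket;
    }

    // Analogue of GroupType.getFieldIndex: direct children only.
    static boolean hasDirectField(Map<String, Object> group, String name) {
        return group.containsKey(name);
    }

    public static void main(String[] args) {
        // Appending "listfloat" at the message level fails ...
        System.out.println(hasDirectField(basket(), "listfloat"));    // false
        // ... because it is a direct child of the inner "list" group.
        System.out.println(hasDirectField(listGroup(), "listfloat")); // true
    }
}
```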
>
>
>
> Appreciate the response!
>
> Manisha
>
--
Ryan Blue
Software Engineer
Cloudera, Inc.