On 03/25/2015 10:09 AM, Richard, Tyler wrote:
Something like this? This is the source for the head command, a very simple
reader.
https://github.com/Parquet/parquet-mr/blob/master/parquet-tools/src/main/java/parquet/tools/command/HeadCommand.java
________________________________________
From: manish agarwal [manishagarwa...@gmail.com]
Sent: Tuesday, March 24, 2015 9:56 PM
To: dev@parquet.incubator.apache.org
Subject: Re: reader is a bit complex
Actually, I wanted to write a Java utility that reads columns for me so
that I can process those fields as per my requirements. Hence I was looking
for some examples.
The first thing to do is to choose how you want to represent the data,
the objects that you want Parquet to pass back to your program. This
determines the variant you end up using to do the read, like
parquet-avro, parquet-thrift, or parquet-protobuf. If you don't know
which one you want, I recommend Avro because it's pretty flexible and
you can use basically all the same code with Avro files instead of
Parquet if you need to.
It might be tempting to go without an object model because object models
seem like add-ons, but going without one is a much bigger commitment. I
also do NOT
recommend using the "example" object model (it is what it says: an
example) or the "simple" model from parquet-tools. When in doubt, use
parquet-avro.
Once you know that, then it's just a matter of instantiating your reader:
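    // file is assumed to be an org.apache.hadoop.fs.Path for the Parquet file
    AvroParquetReader<GenericRecord> reader =
        new AvroParquetReader<GenericRecord>(file);
    try {
      GenericRecord nextRecord;
      while ((nextRecord = reader.read()) != null) {
        // process nextRecord
      }
    } finally {
      // close the reader even if processing throws
      reader.close();
    }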
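To make the "process nextRecord" step a little more concrete, here is a rough sketch of the same read-until-null loop that pulls one column out of each record by name. It runs without Parquet on the classpath, so everything in it is a stand-in: RecordSource, ReadLoopSketch, sampleSource, and the "name"/"age" fields are hypothetical names made up for illustration, and a plain Map plays the role of the record. With parquet-avro you would look fields up on the GenericRecord with get("fieldName") instead.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Parquet reader: read() returns the next
// record, or null once the input is exhausted. Not part of any Parquet API.
interface RecordSource<T> extends AutoCloseable {
  T read();

  @Override
  void close();
}

class ReadLoopSketch {

  // Builds one in-memory record; a Map stands in for an Avro GenericRecord.
  static Map<String, Object> row(String name, int age) {
    Map<String, Object> r = new LinkedHashMap<String, Object>();
    r.put("name", name);
    r.put("age", age);
    return r;
  }

  // In-memory source used only so this sketch runs without Parquet.
  static RecordSource<Map<String, Object>> sampleSource() {
    List<Map<String, Object>> rows = new ArrayList<Map<String, Object>>();
    rows.add(row("alice", 30));
    rows.add(row("bob", 25));
    final Iterator<Map<String, Object>> it = rows.iterator();
    return new RecordSource<Map<String, Object>>() {
      @Override
      public Map<String, Object> read() {
        return it.hasNext() ? it.next() : null;
      }

      @Override
      public void close() {
        // nothing to release for an in-memory source
      }
    };
  }

  // Read until null, keep only the "name" column, and close the source
  // when done (try-with-resources closes it even if processing throws).
  static List<String> collectNames(RecordSource<Map<String, Object>> source) {
    List<String> names = new ArrayList<String>();
    try (RecordSource<Map<String, Object>> reader = source) {
      Map<String, Object> next;
      while ((next = reader.read()) != null) {
        names.add((String) next.get("name"));
      }
    }
    return names;
  }

  public static void main(String[] args) {
    System.out.println(collectNames(sampleSource())); // prints [alice, bob]
  }
}
```

The only design choice worth noting is try-with-resources in place of an explicit close() call, so the reader is released even when the per-record processing fails partway through.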
I hope that helps!
rb
--
Ryan Blue
Software Engineer
Cloudera, Inc.