On 03/25/2015 10:09 AM, Richard, Tyler wrote:
Something like this? This is the source for the head command, a very simple 
reader.

https://github.com/Parquet/parquet-mr/blob/master/parquet-tools/src/main/java/parquet/tools/command/HeadCommand.java

________________________________________
From: manish agarwal [manishagarwa...@gmail.com]
Sent: Tuesday, March 24, 2015 9:56 PM
To: dev@parquet.incubator.apache.org
Subject: Re: reader is a bit complex

Actually, I wanted to write a Java utility that reads columns so that I can
then process those fields as per my requirements. Hence I was looking
for some examples.

The first thing to do is to choose how you want to represent the data, the objects that you want Parquet to pass back to your program. This determines the variant you end up using to do the read, like parquet-avro, parquet-thrift, or parquet-protobuf. If you don't know which one you want, I recommend Avro because it's pretty flexible and you can use basically all the same code with Avro files instead of Parquet if you need to.

It might be tempting to go without an object model because they seem like add-ons, but that's a much bigger commitment. I also do NOT recommend using the "example" object model (it is what it says: an example) or the "simple" model from parquet-tools. When in doubt, use parquet-avro.

Once you know that, then it's just a matter of instantiating your reader:

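  // file is an org.apache.hadoop.fs.Path pointing at the Parquet file
  AvroParquetReader<GenericRecord> reader =
      new AvroParquetReader<GenericRecord>(file);
  try {
    GenericRecord nextRecord;
    // read() returns null once there are no more records
    while ((nextRecord = reader.read()) != null) {
      // process nextRecord
    }
  } finally {
    // release the underlying file handle even if processing throws
    reader.close();
  }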
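
Since the original question was about pulling out columns, here is a rough sketch of that step built on the same read-until-null loop. To keep it runnable without Parquet or Avro on the classpath, it uses stand-ins that are assumptions for illustration only: a hypothetical RecordReader interface in place of AvroParquetReader, and a plain Map in place of GenericRecord. The class, method, and field names (ColumnReadSketch, readColumn, "id", "name") are made up. With parquet-avro, the matching call on each record would be GenericRecord.get(String).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ColumnReadSketch {

    /** Hypothetical stand-in for a Parquet reader: read() returns null at end of input. */
    interface RecordReader {
        Map<String, Object> read();
    }

    /** Builds an in-memory reader so the sketch runs without any Parquet file. */
    static RecordReader inMemoryReader(List<Map<String, Object>> records) {
        final Iterator<Map<String, Object>> it = records.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Drains the reader and collects one named column from every record. */
    static List<Object> readColumn(RecordReader reader, String column) {
        List<Object> values = new ArrayList<>();
        Map<String, Object> next;
        // Same loop shape as above: keep reading until read() returns null.
        while ((next = reader.read()) != null) {
            values.add(next.get(column));
        }
        return values;
    }

    /** Helper that builds a two-field record; the field names are illustrative. */
    static Map<String, Object> record(Object id, Object name) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("id", id);
        fields.put("name", name);
        return fields;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows =
            Arrays.asList(record(1, "alice"), record(2, "bob"));
        // prints [alice, bob]
        System.out.println(readColumn(inMemoryReader(rows), "name"));
    }
}
```

Collecting a whole column into a list is only reasonable for small files. For anything large you would process each value inside the loop instead of accumulating it.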

I hope that helps!

rb

--
Ryan Blue
Software Engineer
Cloudera, Inc.
