To project columns, pass an include array in the options to
Reader.rows. The array holds one flag per column id in the schema, and
only the columns flagged true are read. It looks like this:
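// options is an OrcFile.ReaderOptions, e.g. OrcFile.readerOptions(conf)
// for a Hadoop Configuration named conf
Reader reader = OrcFile.createReader(new Path(filename), options);
TypeDescription schema = reader.getSchema();
// one flag per column id in the schema, all false by default
boolean[] include = new boolean[schema.getMaximumId() + 1];
// select only the first column to read
TypeDescription col0 = schema.getChildren().get(0);
// flag col0 and every column nested under it
for (int c = col0.getId(); c <= col0.getMaximumId(); ++c) {
  include[c] = true;
}
RecordReader rows = reader.rows(new Reader.Options().include(include));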
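
The loop from getId() to getMaximumId() works because column ids are
assigned in pre-order, so a column and everything nested under it occupy
one contiguous range of ids. Here is a minimal sketch of that numbering
that runs without the ORC jars. TypeNode, assignIds and includeFor are
hypothetical stand-ins made up for illustration, not part of the ORC
API; only the loop itself mirrors the snippet above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IncludeSketch {
    // Hypothetical stand-in for TypeDescription; not an ORC class.
    static final class TypeNode {
        final String name;
        final List<TypeNode> children = new ArrayList<>();
        int id;     // plays the role of getId()
        int maxId;  // plays the role of getMaximumId()

        TypeNode(String name, TypeNode... kids) {
            this.name = name;
            children.addAll(Arrays.asList(kids));
        }

        // Assign ids in pre-order starting at startId.
        // Returns the next unused id.
        int assignIds(int startId) {
            id = startId;
            int next = startId + 1;
            for (TypeNode child : children) {
                next = child.assignIds(next);
            }
            maxId = next - 1;
            return next;
        }
    }

    // Same loop as above: flag the chosen top-level column and
    // every column nested under it.
    static boolean[] includeFor(TypeNode root, int childIndex) {
        boolean[] include = new boolean[root.maxId + 1];
        TypeNode col = root.children.get(childIndex);
        for (int c = col.id; c <= col.maxId; ++c) {
            include[c] = true;
        }
        return include;
    }

    public static void main(String[] args) {
        // struct<a:int, b:struct<c:int,d:string>, e:double>
        // ids: root=0, a=1, b=2, c=3, d=4, e=5
        TypeNode root = new TypeNode("root",
            new TypeNode("a"),
            new TypeNode("b", new TypeNode("c"), new TypeNode("d")),
            new TypeNode("e"));
        root.assignIds(0);
        // Project column b, which covers ids 2 through 4.
        System.out.println(Arrays.toString(includeFor(root, 1)));
        // prints [false, false, true, true, true, false]
    }
}
```

Selecting a nested struct this way flags its children too, which is why
the loop runs over a range of ids instead of setting a single entry.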
.. Owen
On Thu, Jun 23, 2016 at 5:59 PM, Kavinder Dhaliwal <[email protected]>
wrote:
> Hi,
>
> I am new to the ORC library and am looking for an example of how to read
> ORC files directly through the Java API. Specifically, how to project
> columns through the RecordReader. I have taken a look at the example at
> https://orc.apache.org/docs/core-java.html but don't know how to actually
> extract a single row from the inner loop. The Hive ORC RecordReader
> interface has a .next() method which I don't see available in the
> org.apache.orc interface.
>
> I appreciate the help and apologize for my ignorance
>
> Kavinder
>