I'm assuming that you are using the Java rather than the C++ side of the project.
What you want is org.apache.orc:orc-core, which includes the protobuf classes as org.apache.orc.OrcProto. That jar depends on org.apache.hive:hive-storage-api, which comes from Hive and defines the vectorized API.

The ORC project also releases a variant using the "nohive" classifier. It incorporates the storage-api and protobuf libraries into orc-core and shades them so that they do not conflict with Hive. This allows projects that already depend on a particular version of Hive to use ORC's "nohive" variant without a conflict.

orc-core provides the vectorized API, which is very efficient and does not create any objects in the inner loop. If you want an easier API with OrcStruct, you will want to use the orc-mapreduce jar.

.. Owen

On Mon, Feb 12, 2018 at 1:53 PM, Matt Burgess <mattyb...@apache.org> wrote:

> Hi all, sorry if this is a n00b question or has been answered, I
> looked in the mailing list archives and could not find anything.
>
> I'm trying to bring Apache ORC into Apache NiFi as basically a
> third-party library, to support a processor that writes data as ORC. A
> version of this processor already exists, but it uses Hive 1.2.1 which
> has hive-orc. Now that Apache ORC is its own project, and we're
> upgrading the Hive processors in NiFi to Hive 2.x (and 3.x), I'd like
> to add a version of the processor that uses the current version of
> Apache ORC (for Java).
>
> However, when I bring in org.apache.orc:orc:1.4.3 as a Maven
> dependency, it is trying to find a JAR with those coordinates, when it
> is only published as a POM (even when I set the type of artifact as
> "pom"). If instead I bring in orc-core, I don't have access to
> orc-proto, etc.
>
> What's the approach for bringing in all the necessary dependencies to
> be able to use (including subclassing) ORC classes?
>
> Thanks in advance,
> Matt
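
P.S. For reference, here is a rough pom.xml sketch of the dependency choices described in the reply at the top. The 1.4.3 version is simply the one mentioned in the question and is an assumption about which release you are targeting; the classifier line is left commented out because it only applies if you need to avoid a clash with an existing Hive dependency.

```xml
<!-- Sketch only: 1.4.3 is taken from the question; substitute the ORC release you target. -->
<dependencies>
  <!-- Vectorized API plus org.apache.orc.OrcProto; brings in hive-storage-api transitively. -->
  <dependency>
    <groupId>org.apache.orc</groupId>
    <artifactId>orc-core</artifactId>
    <version>1.4.3</version>
    <!-- Optional: pick the variant that bundles storage-api and protobuf
         so they do not conflict with a Hive version already on the classpath. -->
    <!-- <classifier>nohive</classifier> -->
  </dependency>

  <!-- Only needed if you prefer the OrcStruct-based API. -->
  <dependency>
    <groupId>org.apache.orc</groupId>
    <artifactId>orc-mapreduce</artifactId>
    <version>1.4.3</version>
  </dependency>
</dependencies>
```

Depending on orc-core rather than the parent org.apache.orc:orc coordinates should also sidestep the missing-JAR lookup, since the parent is published only as a POM.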