I'm assuming that you are using the Java rather than the C++ side of the
project.

What you want is org.apache.orc:orc-core, which includes the protobuf class
as org.apache.orc.OrcProto.
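For example, the Maven dependency would look like this (1.4.3 is the version mentioned below; substitute whatever is current):

```xml
<dependency>
  <groupId>org.apache.orc</groupId>
  <artifactId>orc-core</artifactId>
  <version>1.4.3</version>
</dependency>
```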

That jar depends on org.apache.hive:hive-storage-api, which comes from Hive
and defines the vectorized API.

The ORC project also releases a variant using the "nohive" classifier. It
incorporates the storage-api and protobuf libraries into orc-core and
shrouds them so that they do not conflict with Hive. This allows projects
that already depend on a particular version of Hive to use ORC's "nohive"
variant without a conflict.
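To pick up the shaded variant, you add the classifier to the same coordinates, roughly like this:

```xml
<dependency>
  <groupId>org.apache.orc</groupId>
  <artifactId>orc-core</artifactId>
  <version>1.4.3</version>
  <classifier>nohive</classifier>
</dependency>
```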

The orc-core jar provides the vectorized API, which is very efficient and
does not create any objects in the inner loop. If you want an easier API
based on OrcStruct, you will want to use the orc-mapreduce jar.
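As a rough sketch of the vectorized write path (the file name, schema, and values here are made up for illustration; assumes orc-core and its Hadoop dependencies are on the classpath):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.orc.OrcFile;
import org.apache.orc.Reader;
import org.apache.orc.TypeDescription;
import org.apache.orc.Writer;

public class OrcVectorizedDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path("demo.orc");

    // Define the schema and create the writer.
    TypeDescription schema =
        TypeDescription.fromString("struct<x:bigint,y:bigint>");
    Writer writer = OrcFile.createWriter(path,
        OrcFile.writerOptions(conf).setSchema(schema));

    // Fill the column vectors directly -- no per-row objects are
    // allocated in this loop.
    VectorizedRowBatch batch = schema.createRowBatch();
    LongColumnVector x = (LongColumnVector) batch.cols[0];
    LongColumnVector y = (LongColumnVector) batch.cols[1];
    for (long r = 0; r < 10000; r++) {
      int row = batch.size++;
      x.vector[row] = r;
      y.vector[row] = r * 3;
      if (batch.size == batch.getMaxSize()) {
        writer.addRowBatch(batch);
        batch.reset();
      }
    }
    if (batch.size != 0) {
      writer.addRowBatch(batch);
    }
    writer.close();

    // Read back the file footer to confirm the row count.
    Reader reader = OrcFile.createReader(path, OrcFile.readerOptions(conf));
    System.out.println("rows=" + reader.getNumberOfRows());
  }
}
```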

.. Owen


On Mon, Feb 12, 2018 at 1:53 PM, Matt Burgess <mattyb...@apache.org> wrote:

> Hi all, sorry if this is a n00b question or has been answered, I
> looked in the mailing list archives and couldn't find anything.
>
> I'm trying to bring Apache ORC into Apache NiFi as basically a
> third-party library, to support a processor that writes data as ORC. A
> version of this processor already exists, but it uses Hive 1.2.1 which
> has hive-orc. Now that Apache ORC is its own project, and we're
> upgrading the Hive processors in NiFi to Hive 2.x (and 3.x), I'd like
> to add a version of the processor that uses the current version of
> Apache ORC (for Java).
>
> However, when I bring in org.apache.orc:orc:1.4.3 as a Maven
> dependency, it is trying to find a JAR with those coordinates, when it
> is only published as a POM (even when I set the type of artifact as
> "pom"). If instead I bring in orc-core, I don't have access to
> orc-proto, etc.
>
> What's the approach for bringing in all the necessary dependencies to
> be able to use (including subclassing) ORC classes?
>
> Thanks in advance,
> Matt
>
