Hi Ryan.

If you need to build this from scratch, you may want to look at the standalone
converter example in the Apache ORC repository.

    - https://github.com/apache/orc/blob/master/java/tools/src/java/org/apache/orc/tools/convert/ConvertTool.java

Although it doesn't support Avro, there are CsvReader and JsonReader
in the same directory, so you could implement an AvroReader along the same lines.

    - https://github.com/apache/orc/blob/master/java/tools/src/java/org/apache/orc/tools/convert/CsvReader.java
    - https://github.com/apache/orc/blob/master/java/tools/src/java/org/apache/orc/tools/convert/JsonReader.java
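
One piece an AvroReader would likely need is a mapping from the Avro record
schema to an ORC schema. The sketch below only illustrates that step and is
not code from the ORC repository. The class AvroToOrcSchema and its methods
are hypothetical names. It covers Avro primitive types only, and it takes
field names and Avro type names as plain strings so that it runs without Avro
or ORC on the classpath. The string it builds is in the form that ORC's
TypeDescription.fromString accepts.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Apache ORC or Apache Avro.
public class AvroToOrcSchema {

    // Maps an Avro primitive type name to the corresponding ORC type name.
    // Complex types (record, array, map, union) are left out of this sketch.
    public static String toOrcType(String avroType) {
        switch (avroType) {
            case "boolean": return "boolean";
            case "int":     return "int";
            case "long":    return "bigint";   // Avro long is a 64-bit integer
            case "float":   return "float";
            case "double":  return "double";
            case "bytes":   return "binary";
            case "string":  return "string";
            default:
                throw new IllegalArgumentException(
                        "Unsupported Avro type in this sketch: " + avroType);
        }
    }

    // Builds an ORC struct schema string from (field name -> Avro type name),
    // keeping the iteration order of the given map.
    public static String toOrcSchema(Map<String, String> avroFields) {
        StringJoiner joiner = new StringJoiner(",", "struct<", ">");
        for (Map.Entry<String, String> field : avroFields.entrySet()) {
            joiner.add(field.getKey() + ":" + toOrcType(field.getValue()));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the field order of the (assumed) Avro record.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "long");
        fields.put("name", "string");
        fields.put("score", "double");
        System.out.println(toOrcSchema(fields));
        // prints: struct<id:bigint,name:string,score:double>
    }
}
```

In a real reader you would walk the actual Avro Schema object instead of a
map of strings, and also copy each record's values into ORC column vectors,
as CsvReader and JsonReader do for their own formats.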

Alternatively, you can use existing software or converter tools.
For example, you can simply dockerize Apache Spark 3.0.0 on a JDK 11
Docker image and use it. The full JDK 11 image (openjdk:11) is 627MB.
If you use 11-jre-slim (`204MB`) as the base image instead,
the final Docker image (Apache Spark 3.0.0 + JDK 11) will be 500MB.

Bests,
Dongjoon.


On Wed, Jul 15, 2020 at 1:51 PM Ryan Schachte <coderyanschac...@gmail.com>
wrote:

> I'm writing a standalone Java process and interested in converting the
> consumed Avro messages to ORC. I've seen a plethora of examples of writing
> to ORC, but the conversion to ORC from Avro is what I can't seem to find a
> lot of examples of.
>
> This is just a standard Java process running inside of a container.
>
