Re: Trevni status

Doug Cutting Fri, 01 Mar 2013 12:48:33 -0800

Ken,

Trevni is alive and well.  It's been included in the last two Avro
releases, 1.7.3 and 1.7.4.  There were some good improvements to
Trevni in 1.7.4, so use that version if possible.

Patches that add Trevni support to Pig and Hive are available:

  https://issues.apache.org/jira/browse/PIG-3015
  https://issues.apache.org/jira/browse/HIVE-3585

If you already use Avro, then Trevni is easy to incorporate.  For
MapReduce jobs, a program that previously wrote Avro output can write
Trevni instead by simply changing the OutputFormat.  Similarly, to
read Trevni input in MapReduce, change the InputFormat and specify a
subset schema that omits the fields you don't need (i.e., a
projection).
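
To illustrate why a subset schema pays off with a column-oriented
file, here is a minimal sketch in plain Java.  It is a toy model, not
Trevni's or Hadoop's API: the class ProjectionSketch and its methods
are hypothetical names made up for this example.  It assumes each
field is stored as its own column, so a read that asks for only some
fields never touches the remaining columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy column store used only to illustrate projection.
// Hypothetical class: this is NOT Trevni's API.
class ProjectionSketch {
    // One list of values per field, so each column is stored contiguously.
    private final Map<String, List<Object>> columns = new LinkedHashMap<>();
    private int rowCount = 0;
    private int columnsRead = 0;

    ProjectionSketch(String... fieldNames) {
        for (String name : fieldNames) {
            columns.put(name, new ArrayList<>());
        }
    }

    // Append one record; values are given in the same order as the field names.
    void append(Object... values) {
        if (values.length != columns.size()) {
            throw new IllegalArgumentException("expected " + columns.size() + " values");
        }
        int i = 0;
        for (List<Object> column : columns.values()) {
            column.add(values[i++]);
        }
        rowCount++;
    }

    // Rebuild records that contain only the fields named in the subset schema.
    // Columns that are not requested are never scanned.
    List<Map<String, Object>> read(List<String> subsetSchema) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (int r = 0; r < rowCount; r++) {
            rows.add(new LinkedHashMap<>());
        }
        for (String field : subsetSchema) {
            List<Object> column = columns.get(field);
            if (column == null) {
                throw new IllegalArgumentException("unknown field: " + field);
            }
            columnsRead++;
            for (int r = 0; r < rowCount; r++) {
                rows.get(r).put(field, column.get(r));
            }
        }
        return rows;
    }

    // Number of columns scanned so far, a stand-in for I/O cost.
    int columnsRead() {
        return columnsRead;
    }

    public static void main(String[] args) {
        ProjectionSketch store = new ProjectionSketch("id", "name", "payload");
        store.append(1, "a", "large blob 1");
        store.append(2, "b", "large blob 2");
        // Project away "payload": only two of the three columns are scanned.
        System.out.println(store.read(Arrays.asList("id", "name")));
        System.out.println("columns read: " + store.columnsRead());
    }
}
```

In a real job the subset schema would be an Avro schema handed to the
input format rather than a list of field names; the list here only
stands in for that idea.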

So it shouldn't be hard to use Trevni with Cascading.  If you work on
this, please let us know how it goes.

Cheers,

Doug

On Fri, Mar 1, 2013 at 12:37 PM, Ken Krugler
<[email protected]> wrote:
> Hi all,
>
> Any input as to the status of Trevni?
>
> I'm researching column-oriented file formats that aren't tightly coupled to
> specific platforms - this precludes ORCFile, for example.
>
> CIF seemed interesting, but IBM hasn't released the code. And Trevni seems
> to be a reasonable open source implementation of what they describe.
>
> But I hadn't heard much about Trevni recently, or if anybody is using it for
> real work.
>
> I see it mentioned in conjunction with Impala, but it sounds like that's on
> the roadmap versus being available yet.
>
> For context, I'm looking into using a column store to speed up Cascading
> workflows.
>
> Thanks,
>
> -- Ken
>
> PS - resending to user@, since the first email to dev@ seems to have
> disappeared…maybe I'm no longer getting dev@ emails?
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Cassandra & Solr
>
