Ken,

Trevni is alive and well. It's been included in the last two Avro releases, 1.7.3 and 1.7.4. There were some good improvements to Trevni in 1.7.4, so use that version if possible.

Patches that add Trevni support to Pig and Hive are available:

https://issues.apache.org/jira/browse/PIG-3015
https://issues.apache.org/jira/browse/HIVE-3585

If you already use Avro then Trevni's easy to incorporate. For MapReduce jobs, you can write Trevni output from a program that produced Avro before by simply changing the OutputFormat. Similarly, to read Trevni input in MapReduce simply change the InputFormat and specify a subset schema (deleting fields you don't need, i.e., projecting).

So it shouldn't be hard to use Trevni with Cascading. If you work on this, please let us know how it goes.

Cheers,

Doug

On Fri, Mar 1, 2013 at 12:37 PM, Ken Krugler <[email protected]> wrote:
> Hi all,
>
> Any input as to the status of Trevni?
>
> I'm researching column-oriented file formats that aren't tightly coupled to
> specific platforms - this precludes ORCFile, for example.
>
> CIF seemed interesting, but IBM hasn't released the code. And Trevni seems
> to be a reasonable open source implementation of what they describe.
>
> But I hadn't heard much about Trevni recently, or if anybody is using it for
> real work.
>
> I see it mentioned in conjunction with Impala, but it sounds like that's on
> the roadmap versus being available yet.
>
> For context, I'm looking into using a column store to speed up Cascading
> workflows.
>
> Thanks,
>
> -- Ken
>
> PS - resending to user@, since the first email to dev@ seems to have
> disappeared…maybe I'm no longer getting dev@ emails?
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Cassandra & Solr
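
To illustrate the projection point in the reply (reading with a subset schema so that unneeded fields are skipped), here is a toy sketch in plain Java. It does not use the Trevni or Avro APIs: the class ColumnProjectionSketch, its methods, and the field names url, status and body are all hypothetical, made up for this illustration. It only shows why a column-oriented layout lets a reader touch just the columns it asks for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy column-oriented table (hypothetical, not Trevni): one value list per field. */
final class ColumnProjectionSketch {
    // Field name -> that field's values for every row, in row order.
    private final Map<String, List<Object>> columns = new LinkedHashMap<>();
    private int rowCount = 0;
    // Number of stored values touched by read() calls so far.
    private long valuesRead = 0;

    ColumnProjectionSketch(List<String> fieldNames) {
        for (String name : fieldNames) {
            columns.put(name, new ArrayList<>());
        }
    }

    /** Splits one record across the per-field columns. */
    void append(Map<String, Object> record) {
        for (Map.Entry<String, List<Object>> e : columns.entrySet()) {
            e.getValue().add(record.get(e.getKey()));
        }
        rowCount++;
    }

    /** Rebuilds rows from only the projected fields; other columns are never touched. */
    List<Map<String, Object>> read(List<String> projectedFields) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            rows.add(new LinkedHashMap<>());
        }
        for (String name : projectedFields) {
            List<Object> column = columns.get(name);
            if (column == null) {
                throw new IllegalArgumentException("Unknown field: " + name);
            }
            for (int i = 0; i < rowCount; i++) {
                rows.get(i).put(name, column.get(i));
                valuesRead++;
            }
        }
        return rows;
    }

    long valuesRead() {
        return valuesRead;
    }

    /** Helper that builds one example record (field names are illustrative). */
    static Map<String, Object> record(String url, int status, String body) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("url", url);
        r.put("status", status);
        r.put("body", body);
        return r;
    }

    public static void main(String[] args) {
        ColumnProjectionSketch table =
            new ColumnProjectionSketch(Arrays.asList("url", "status", "body"));
        table.append(record("http://a.example", 200, "large page body"));
        table.append(record("http://b.example", 404, "another large body"));

        // The "subset schema": the full field list with body deleted.
        List<Map<String, Object>> rows = table.read(Arrays.asList("url", "status"));
        System.out.println(rows);
        // 2 rows x 2 projected fields = 4 values touched, out of 6 stored.
        System.out.println("values read: " + table.valuesRead());
    }
}
```

In a real job the same idea would be expressed by handing the input format a reader schema that lists only the wanted fields, rather than by calling a method like read() above.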
