Hey Lukas,

I am not quite following your point. What do you mean by "add option to use
own compiled class or dynamic message"?

Chen
On Sun, Oct 26, 2014 at 7:31 PM, lukas nalezenec <[email protected]>
wrote:

> Hi,
> You are right, I will add option to use own compiled class or dynamic
> message.
>
> Lukas
>
> On Sun, Oct 26, 2014 at 8:27 PM, Chen Song <[email protected]> wrote:
>
> > Hi,
> >
> > I am new to Parquet and we have a complicated use case in which we want to
> > adopt Parquet as our storage format.
> >
> > Current:
> >
> >    - The data is stored in Sequence files as Protobuf.
> >    - We have map reduce jobs to write the data. Hive tables were created
> >    with Protobuf Serde using elephant-bird so people can query the data via
> >    Hive.
> >    - We enhanced elephant-bird to add our own serializer so one can write
> >    data into a table via Hive, and the data is stored in Sequence files as
> >    Protobuf.
> >
> >
> > Future:
> > We want to use Parquet as the underlying storage format without losing the
> > Protobuf abstraction at the application layer. After a bit of research and
> > practice, I have a few questions.
> >
> >    - Say a Hive table is created as a Parquet table, and data is written
> >    via Hive.
> >       - If I want to read data in map reduce jobs as Protobuf records, can I
> >       use ProtoParquetInputFormat in
> >       https://github.com/Parquet/parquet-mr/blob/master/parquet-protobuf/src/main/java/parquet/proto/ProtoParquetInputFormat.java
> >       ?
> >       After looking at the API, it doesn't seem possible to specify the
> >       Protobuf class for the input path. Instead, ProtoParquetInputFormat
> >       derives the class from the footer of the underlying data. Is it fair
> >       to say ProtoParquetInputFormat will only read data written
> >       by ProtoParquetOutputFormat? Is there a way to work around this?
> >       - If not, is there any out-of-the-box Hive output format I can use
> >       to piggyback on ProtoParquetOutputFormat?
> >    - If data is written by a map reduce job with ProtoParquetOutputFormat,
> >    will read queries in Hive work automatically?
> >
> > Thanks a lot in advance. Any suggestions would be appreciated.
> >
> > --
> > Chen Song
> >
>



-- 
Chen Song
