Re: Thrift?

Min Zhou Fri, 14 Sep 2012 20:18:20 -0700

Protobuf +1

Even though Google hasn't officially open-sourced its internal RPC
based on protobuf, there are lots of third-party implementations for
almost all of the popular languages. After all, protobuf gives me a better
user experience than Thrift does :)


Thanks,
Min


On Sat, Sep 15, 2012 at 9:56 AM, Clark Yang (杨卓荦) <[email protected]> wrote:

> protobuf +1
> I don't think it is a standard problem. Protobuf has already shown a great
> many benefits and success in many open source projects. It's widely used,
> and there are few better alternatives, I think.
>
> BTW, I have posted the first comment of the first jira.
>
> Cheers,
> Zhuoluo (Clark) Yang
>
>
>
> 2012/9/15 Constantine Peresypkin <[email protected]>
>
> > I really have no idea how one can estimate telco traffic.
> > But I highly doubt that you can fruitfully compare the reliability of an
> > internal-only protocol (same implementation, easy to enforce
> > compatibility) to an interoperable one.
> >
> > On Sat, Sep 15, 2012 at 1:41 AM, Ryan Rawson <[email protected]> wrote:
> >
> > > I didn't say I was the one making the argument...
> > >
> > > Google has probably put > 10^24 bytes of data through protobuf in
> > > multiple implementations (e.g. serialization on disk and on-wire RPC).
> > > That is a low estimate.
> > >
> > > I'd be interested in hearing how 20 years of telco protocol traffic
> > > might compare to 10 years of Google's usage of protobuf.  Exponential
> > > curve and all of that.
> > >
> > >
> > >
> > >
> > >
> > > On Fri, Sep 14, 2012 at 3:36 PM, Constantine Peresypkin
> > > <[email protected]> wrote:
> > > > More battle-tested than a more than 20-year-old standard used in almost
> > > > every telecom protocol that exists nowadays?
> > > > I think your statement is a little on the "too bold" side. :)
> > > >
> > > > On Sat, Sep 15, 2012 at 1:30 AM, Ryan Rawson <[email protected]> wrote:
> > > >
> > > >> Funny thing, given how much use protobuf has been put through, I think
> > > >> one could make the argument it's more battle-tested than ASN.1 ...
> > > >>
> > > >> On Fri, Sep 14, 2012 at 3:24 PM, Constantine Peresypkin
> > > >> <[email protected]> wrote:
> > > >> > Protobuf is an attempt to make ASN.1 more developer-friendly (not a bad
> > > >> > attempt).
> > > >> > It's simpler, has far fewer features, is easier to implement and has a
> > > >> > compact encoding.
> > > >> > But on the other hand it's a non-standard "reinvented wheel" (they could
> > > >> > just have done a "better than PER" encoding for ASN.1), and AFAIK it has
> > > >> > no support for the new and shiny Google encodings, like "group varint".
> > > >> > All in all, in the current situation it seems a better choice than ASN.1,
> > > >> > not to mention something even more vague and non-standard like Thrift.
> > > >> >
> > > >> > On Sat, Sep 15, 2012 at 12:38 AM, Ryan Rawson <[email protected]> wrote:
> > > >> >
> > > >> >> Thanks for that, Ted.
> > > >> >>
> > > >> >> Correct - an internal wire format doesn't mean 'Drill only supports
> > > >> >> protobuf-encoded data'.
> > > >> >>
> > > >> >> Part of the reason to favor protobuf is that a lot of people in the
> > > >> >> broader 'big data' community are building a lot of experience with it.
> > > >> >> Hadoop and HBase are both moving to/have moved to protobuf on the wire.
> > > >> >> Being able to leverage this expertise is valuable.
> > > >> >>
> > > >> >> There is a JIRA in Hadoop-land where someone did a deep-dive
> > > >> >> 'bake-off' between Thrift, protobuf and Avro.  The ultimate choice was
> > > >> >> protobuf for a number of reasons.  If people want to redo the
> > > >> >> analysis, I'd like to see it in the context of THAT analysis (e.g. why
> > > >> >> the assumptions there are not the same for Drill)... if anything it'd
> > > >> >> give a concrete form to what can be a mire.
> > > >> >>
> > > >> >> For what it's worth, I've had many discussions along these lines with
> > > >> >> a variety of people, including committers on Thrift, and the consensus
> > > >> >> is that both are good choices.
> > > >> >>
> > > >> >> -ryan
> > > >> >>
> > > >> >> On Fri, Sep 14, 2012 at 2:31 PM, Ted Dunning <[email protected]> wrote:
> > > >> >> > I think that it is important to ask a few questions leading up to a
> > > >> >> > decision here.
> > > >> >> >
> > > >> >> > The first is a (rhetorical) show of hands about how many people believe
> > > >> >> > that there are no serious performance or expressivity killers when
> > > >> >> > comparing alternative serialization frameworks.  As far as I know,
> > > >> >> > performance differences are not massive (and protobufs is one of the
> > > >> >> > leaders in any case) and the expressivity differences are essentially
> > > >> >> > nil.  If somebody feels that there is a serious show-stopper with any
> > > >> >> > option, they should speak.
> > > >> >> >
> > > >> >> > The second is to ask the sense of the community on whether progress
> > > >> >> > or perfection in this decision is more important to the project.  My
> > > >> >> > guess is that almost everybody would prefer to see progress as long as
> > > >> >> > the technical choice is not subject to some horrid missing bit.
> > > >> >> >
> > > >> >> > The final question is whether it is reasonable to go along with
> > > >> >> > protobufs given that several very experienced engineers prefer it and
> > > >> >> > would like to produce code based on it.  If the first two questions are
> > > >> >> > answered to the effect that protobufs is about as good as we will find
> > > >> >> > and that progress trumps small differences, then it seems that following
> > > >> >> > this preference of Jason and Ryan for protobufs might be a reasonable
> > > >> >> > thing to do.
> > > >> >> >
> > > >> >> > The question of an internal wire format, btw, does not constrain the
> > > >> >> > project relative to external access.  I think it is important to support
> > > >> >> > JDBC and ODBC and whatever is in common use for querying.  For external
> > > >> >> > access the question is quite different.  Whereas for the internal format
> > > >> >> > consensus around a single choice has large benefits, the external format
> > > >> >> > choice is nearly the opposite.  For an external format, limiting
> > > >> >> > ourselves to a single choice seems like a bad idea and increasing the
> > > >> >> > audience seems like a better choice.
> > > >> >> >
> > > >> >> > On Fri, Sep 14, 2012 at 12:44 PM, Ryan Rawson <[email protected]> wrote:
> > > >> >> >
> > > >> >> >> Hi folks,
> > > >> >> >>
> > > >> >> >> I just commented on this first JIRA.  Here is my text:
> > > >> >> >>
> > > >> >> >> This issue has been hashed over a lot in the Hadoop projects. There
> > > >> >> >> was work done to compare Thrift vs Avro vs protobuf. The conclusion
> > > >> >> >> was to use protobuf.
> > > >> >> >>
> > > >> >> >> Prior to this move, there had been a lot of noise about pluggable RPC
> > > >> >> >> transports, and whatnot. It held up adoption of a backwards-compatible
> > > >> >> >> serialization framework for a long time. The problem ended up being
> > > >> >> >> the analysis paralysis, rather than any specific implementation
> > > >> >> >> problem. In other words, the problem was a LACK of implementation rather
> > > >> >> >> than actual REAL problems.
> > > >> >> >>
> > > >> >> >> Based on this experience, I'd strongly suggest adopting protobuf and
> > > >> >> >> moving on. Forget about pluggable RPC implementations; the complexity
> > > >> >> >> doesn't deliver benefits. The benefit of protobuf is that it's the RPC
> > > >> >> >> format for Hadoop and HBase, which allows Drill to draw on the broad
> > > >> >> >> experience of those communities, who need to implement high-performance
> > > >> >> >> backwards-compatible RPC serialization.
> > > >> >> >>
> > > >> >> >> ====
> > > >> >> >>
> > > >> >> >> Expanding a bit, I've looked into this issue a lot, and there are very
> > > >> >> >> few significant concrete reasons to choose protobuf vs Thrift.  A tiny
> > > >> >> >> percent faster at this or that, etc.  I'd strongly suggest protobuf
> > > >> >> >> for the expanded community.  There is no particular Apache imperative
> > > >> >> >> that Apache projects re-use libraries.  Use what makes sense for your
> > > >> >> >> project.
> > > >> >> >>
> > > >> >> >> As regards Avro, it's a fine serialization format for long-term
> > > >> >> >> data retention, but the complexities that exist to enable that make it
> > > >> >> >> non-ideal for an RPC.  I know of no one who uses AvroRPC in any form.
> > > >> >> >>
> > > >> >> >> -ryan
> > > >> >> >>
> > > >> >> >> On Tue, Sep 4, 2012 at 12:30 PM, Tomer Shiran <[email protected]> wrote:
> > > >> >> >> > We plan to propose the architecture and interfaces in the next couple
> > > >> >> >> > of weeks, which will make it easy to divide the project into clear
> > > >> >> >> > building blocks. At that point it will be easier to start contributing
> > > >> >> >> > different data sources, data formats, operators, query languages, etc.
> > > >> >> >> >
> > > >> >> >> > Contributions are done in the usual Apache way. It's best to open a
> > > >> >> >> > JIRA and then post a patch so that others can review it and a
> > > >> >> >> > committer can then check it in.
> > > >> >> >> >
> > > >> >> >> > On Tue, Sep 4, 2012 at 12:23 PM, Chandan Madhesia <[email protected]> wrote:
> > > >> >> >> >
> > > >> >> >> >> Hi
> > > >> >> >> >>
> > > >> >> >> >> What is the process to become a contributor to Drill?
> > > >> >> >> >>
> > > >> >> >> >> Regards
> > > >> >> >> >> chandan
> > > >> >> >> >>
> > > >> >> >> >> On Tue, Sep 4, 2012 at 9:51 PM, Ted Dunning <[email protected]> wrote:
> > > >> >> >> >>
> > > >> >> >> >> > Suffice it to say that if *you* think it is important enough to
> > > >> >> >> >> > implement and maintain, then the group shouldn't say nay.  The
> > > >> >> >> >> > consensus stuff should only block things that break something else.
> > > >> >> >> >> > Additive features that are highly maintainable (or which come with
> > > >> >> >> >> > commitments) shouldn't generally be blocked.
> > > >> >> >> >> >
> > > >> >> >> >> > On Tue, Sep 4, 2012 at 9:14 AM, Michael Hausenblas <
> > > >> >> >> >> > [email protected]> wrote:
> > > >> >> >> >> >
> > > >> >> >> >> > > Good. Feel free to put me down for that, if the group as a whole
> > > >> >> >> >> > > thinks that (supporting Thrift) makes sense.
> > > >> >> >> >> > >
> > > >> >> >> >> >
> > > >> >> >> >>
> > > >> >> >> >
> > > >> >> >> >
> > > >> >> >> >
> > > >> >> >> > --
> > > >> >> >> > Tomer Shiran
> > > >> >> >> > Director of Product Management | MapR Technologies | 650-804-8657
> > > >> >> >>
> > > >> >>
> > > >>
> > >
> >
>



-- 
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com
