On Mon, Apr 18, 2016 at 4:57 PM, Devaraj Das <[email protected]> wrote:
> Your mail is very timely, Stack. Since I have been involved in this work
> for some time now, let me try replying to this email...
>
> Vamsi and his team are working with us (Hortonworks) to get a C++ HBase
> client implementation up and running. We were discussing internally that we
> should reach out to the Dev community now, discuss, and get help on having
> these two implementations (Elliott's and Vamsi's) converge, if possible.
>

+1 Let's not have two C++ clients or, to put it another way, I don't think we
can check in two C++ clients. We will only confuse our users (thrift1 vs
thrift2 redux!)

> Elliott is right - since he already had something in the works at
> Facebook, we thought it'd be good to start with that. One of the major
> things that existed in the patch Elliott put up was the use of Buck as the
> way to build the C++ library. We thought that Make is a much more common
> tool for building C++ and we went with that (and granted we should have
> discussed this aspect earlier in the dev cycle).
>

Yes, to both of the above.

...

> 1. Be able to do async RPC. Vamsi's current implementation does sync. I
> know Vamsi is already looking at that aspect. Just to be clear, in my
> mind this doesn't qualify as a blocker - since we don't have an async
> client API to support yet.
>

See org.apache.hadoop.hbase.ipc.AsyncRpcClient (HBASE-12684 and the still
open HBASE-13784 where we'd surface an async API). Also consider asynchbase,
which many consider a superior client to our own. The other arguments --
sync on async is easier to do than async on sync and scaling/threads -- make
sense to me.

> 2. Switch to C++ ways of configuring the hbase client configuration. This
> is something I am really not sure about. By going this route, we'd have to
> be able to manage two different ways of configuring things - one for Java
> and another for C++.
> This will lead to unnecessary duplication of configs
> and such (and the deployment tools would now have to be aware of a new
> way of configuring C++ clients). But we can take a look at making this
> configuration method pluggable if it makes sense.
> 3. Use Facebook's Folly instead of POCO for the RPC layer implementation.
> This is under consideration. Maybe, if we decide to have one implementation
> going forward, this would be an area of active collaboration.
> 4. Use of Facebook's Buck build system. This I already talked about above.
>
> Elliott, regarding your concern as to who would support the large code
> drop... we did talk about breaking the patch up into smaller ones if it
> makes sense for reviews and such. I personally would like to avoid having
> multiple implementations of the C++ client, and would like to see how we
> can work together... I think Vamsi has already addressed the other concerns
> to do with Copyright headers, etc. Vamsi can add more color wherever
> needed...
>

The discussion here seems overdue but seems to be making good progress now.

Thanks,
St.Ack

>
> ________________________________________
> From: Elliott Clark <[email protected]>
> Sent: Monday, April 18, 2016 3:38 PM
> To: [email protected]
> Subject: Re: What's going on? Two C++ clients being developed at the
> moment?
>
> Yeah, there are currently two different implementation efforts ongoing.
>
> I started working on a cpp client a while ago. Then there was some interest
> in working on the cpp client from other parties. So I put some of the
> implementation up. Things stayed there for a while. Then interest surged
> again. Vamsi had been working on some code away from jira's.
>
> As I see it right now, the cpp client is a great place to learn from our
> mistakes. Async client is needed; it's simpler and cleaner. Retrofitting
> async on top of a synchronous implementation leads to a very large mess (I
> give you AsyncProcess).
> So I've been working on a fully async client using
> Boost, Folly and Wangle. These are the libraries that power thrift at
> Facebook. So I have some good faith in them being very fast and well
> maintained. Folly and Wangle together allow for a network layer with very
> few copies. I've provided a convenience docker file that has all libraries
> needed. This allows everyone that's building to build vs the exact same
> versions.
>
> Vamsi et al have created something that works now. It's able to connect and
> send some commands. It's synchronous, building on the poco library. It
> builds using autotools.
>
>
> For me I have a few concerns around the other implementation:
>
> * Who will support it? The HBase community has not had good luck with large
> code drops from people who are not running the code every day.
> * With a sync client it has been very hard to keep a clean code base. Why
> start with that when there's a way forward that doesn't?
> * Poco: It's a library that I haven't heard of and I don't know the
> scale/testing of it.
> * There's code with other people's copyrights on the headers. For me this
> is just a no-go. Importing code that has questions about who wrote what is
> just a recipe to have Apache's lawyers get upset. Dima raised some points
> that some things look to be gnu licensed.
> * It uses XML for configuration of a native lib. That's something that is
> VERY strange. I don't know another client lib that does that.
>
>
> For the async implementation there are still some things that need to be
> cleaned up:
> * Some people would like to use a build system other than buck. That's
> fine, I think a cmake file from anyone who wants to add one would be a nice
> addition.
> * There's still more work to go. Right now we can connect, send the header,
> send a request header, and serialize across the request body. Getting the
> response isn't there, and locating things in meta isn't done.
>
> On Mon, Apr 18, 2016 at 2:56 PM, Stack <[email protected]> wrote:
>
> > Correct me if I am wrong, but it seems like there are two (different?) C++
> > clients underway? There is the work by Vamsi Mohan V S Thattikota that is
> > going on in HBASE-15534 and then there is what seems like a different
> > effort over in HBASE-14850 C++ client implementation by the mighty Elliott.
> >
> > Whats up? We going to carry two c++ clients? Work together?
> >
> > Thanks,
> > St.Ack
