I agree with Gary here. What would "common" be anyway? Just curious.
Also, Flume is using Thrift, as do quite a few here who use it as the
gateway server to access HBase. Could we get some reports from those who
have used it on whether they are happy with the Thrift RPC? It seems like
they are, but maybe we should look into that option carefully.

I am also wary of using REST. Especially when you handle very small
payloads, performance is driven by the protocol overhead.

Lars

On Wed, Jun 1, 2011 at 12:47 AM, Gary Helmling <[email protected]> wrote:

> On Tue, May 31, 2011 at 1:22 PM, Stack <[email protected]> wrote:
>
>> On Mon, May 30, 2011 at 9:55 PM, Eric Yang <[email protected]> wrote:
>> > Maven modularization could be enhanced to give a structure that looks
>> > like this:
>> >
>> > Super POM
>> > +- common
>> > +- shell
>> > +- master
>> > +- region-server
>> > +- coprocessor
>> >
>> > The software is basically grouped by process type (role of the
>> > process) plus a shared library.
>>
>> I'd change the list above. shell should be client, and perhaps master
>> and regionserver should both be inside a single 'server' submodule.
>> We need to add security in there. Perhaps we'd have submodules for
>> thrift, avro, and rest (and perhaps a rest war file)? (Is this too many
>> submodules? I suppose once we are submodularized, adding new ones
>> is trivial. It's the initial move to submodules that is painful.)
>
> I'd be in favor of starting simply as well. Something like:
>
> - common
> - client
> - server
> - security
>
> or even combine the "common" bits into "client". I agree thrift, avro,
> and rest would make perfect module candidates as well, but I don't feel
> particularly strongly about them myself. I also don't really see the
> coprocessor framework as a separate module. It's more like part of the
> server infrastructure.
>
> HTTP/REST is one good option to have (among many) as an application
> interface to HBase. But I'm skeptical of its applicability as an internal
> RPC transport. Personally, I think we need a well-defined (but still
> performant) serialization format to better support cross-version operation
> and alternate clients such as asynchbase. The actual RPC framework we use
> (from Hadoop) may not be perfect, but it's seen a lot of profiling, and
> its threading model seems to perform pretty well for HBase workloads with
> long-lived connections.
>
> The current framework also continues to evolve, with some recent effort
> to work in asynchronous handling on the server side. And in addition, we
> have full support for security via Kerberos and token-based DIGEST-MD5
> authentication in a separate branch. I'm personally not really interested
> in repeating the work to incorporate security over a new HTTP-based
> stack. I think I'd need some convincing that an HTTP transport would
> perform better than what we have. I'm more inclined to go an evolutionary
> route in improving our current stack.
>
> --gh
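For concreteness, here is a minimal sketch of the parent ("super") POM that
Gary's suggested layout would imply. The module names mirror his list; the
groupId, artifactId, and version are illustrative placeholders, not taken
from the actual HBase build:

  <project xmlns="http://maven.apache.org/POM/4.0.0">
    <modelVersion>4.0.0</modelVersion>
    <groupId>org.apache.hbase</groupId>   <!-- placeholder coordinates -->
    <artifactId>hbase</artifactId>
    <version>0.92.0-SNAPSHOT</version>    <!-- placeholder version -->
    <!-- 'pom' packaging makes this an aggregator: it builds the
         submodules listed below rather than producing a jar itself -->
    <packaging>pom</packaging>

    <modules>
      <module>common</module>    <!-- shared utilities -->
      <module>client</module>    <!-- client-facing API -->
      <module>server</module>    <!-- master + regionserver -->
      <module>security</module>  <!-- Kerberos / DIGEST-MD5 support -->
    </modules>
  </project>

Combining the "common" bits into "client", as Gary floats, would simply
drop the first <module> entry.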
