On Mon, May 3, 2010 at 10:03 AM, Mayan Moudgill <[email protected]> wrote:
> The idea of marshalling to strings seems somewhat counter-productive; after
> all, you're marshalling the data using Thrift, which then gets sent to a
> server, and demarshalls it. Now, on top of that you're adding another layer
> of marshalling.

Understood; that's why, ideally, I'd rather have Thrift handle this for me
transparently and more efficiently.

> If, however, you're encoding the data for demarshalling at the server, it
> sounds like you want a different RPC framework. For instance, do you really
> need the version flexibility that is provided by Thrift? Are your types
> fixed at source & destination? Do you need a leaner transport? In fact, why
> did you pick Thrift in the first place?

Yes, I want to version my service interfaces. I chose Thrift because we
already use it for other services -- in the spirit of consistency and
minimizing the number of RPC frameworks we use.

>> In this case, I'm actually not looking for something optimal in terms of
>> efficiency. The data structures I'm passing in are small and the services
>> I'm calling are coarse-grained so the transport+marshalling costs should
>> be relatively insignificant compared to what happens in the service.
>
> Have you actually measured this? Why do you think that this might be the
> case?

No, but the service I'm calling is IO+CPU intensive so it's a safe assumption
that any data marshalling will be only a fraction of I/O and processing costs.

> I understand that, but being sloppy about performance can lead to real pain.
> Consider the case where an efficient implementation fits in a single CPU.
> Being a little sloppy means that you have to go to a multi-threaded
> implementation, but within a chip. Being a lot sloppy means that you may
> have to go to a distributed implementation....

Ugh? I'm talking about serializing simple data types like int32 and int64
to/from strings. I think you're getting carried away ;)

Mind you, the service itself is already a multi-threaded and distributed
implementation -- it provides query processing for a "big data"
multi-dimensional cube.

alex
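
For illustration only, here is a minimal Python sketch of the kind of int32/int64 string round-trip discussed in this thread. The helper names `encode_int` and `decode_int` are hypothetical and are not part of Thrift or of the service described above; the sketch assumes plain decimal text as the string encoding.

```python
# Hypothetical helpers (not a Thrift API): round-trip fixed-width signed
# integers through decimal strings, with explicit range checks.

def _bounds(bits: int) -> tuple[int, int]:
    """Return the (min, max) range of a signed integer with `bits` bits."""
    if bits not in (32, 64):
        raise ValueError("only int32 and int64 are handled in this sketch")
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def encode_int(value: int, bits: int = 64) -> str:
    """Serialize an integer to decimal text after checking it fits in `bits`."""
    lo, hi = _bounds(bits)
    if not lo <= value <= hi:
        raise OverflowError(f"{value} does not fit in int{bits}")
    return str(value)


def decode_int(text: str, bits: int = 64) -> int:
    """Parse decimal text back to an integer and check it fits in `bits`."""
    value = int(text)  # raises ValueError if the text is not an integer
    lo, hi = _bounds(bits)
    if not lo <= value <= hi:
        raise OverflowError(f"{text!r} does not fit in int{bits}")
    return value


if __name__ == "__main__":
    original = 2 ** 40 + 7
    wire = encode_int(original, bits=64)
    assert decode_int(wire, bits=64) == original
    print(wire)
```

The range checks are the only part that goes beyond `str()` and `int()`; they are there because Python integers are unbounded, so the width would otherwise be lost on the string side.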
