Thrift lists are mapped to plain lists on purpose, for convenience. Better performance would be nice, but turning them into *Buffers would be a major usability hit. I recommended binary in place of your list<byte> because a list of bytes is typically just a byte array. That advice does not carry over to the other element types.
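To make the list<byte> versus binary distinction concrete, here is a minimal plain-Java sketch. It does not use Thrift at all; the class and method names (PayloadSketch, asBoxedList, asBinary, toArray) are made up for illustration. It assumes list<byte> surfaces in Java as a List<Byte> and binary as a ByteBuffer over a byte[], as described in the quoted message.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: PayloadSketch and its methods are hypothetical names,
// not part of Thrift or of any Thrift-generated code.
final class PayloadSketch {

    // list<byte> style: each element is stored as a boxed java.lang.Byte,
    // so building (and later reading) the list is per-element work.
    static List<Byte> asBoxedList(byte[] raw) {
        List<Byte> list = new ArrayList<>(raw.length);
        for (byte b : raw) {
            list.add(b); // autoboxing: one object reference per byte
        }
        return list;
    }

    // binary style: wrap the existing array; no copy, no per-element work.
    static ByteBuffer asBinary(byte[] raw) {
        return ByteBuffer.wrap(raw);
    }

    // Copy the readable bytes out without moving the buffer's own position.
    static byte[] toArray(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = new byte[64 * 1024]; // largest payload size in the test

        List<Byte> boxed = asBoxedList(raw);
        ByteBuffer binary = asBinary(raw);

        System.out.println("boxed elements:  " + boxed.size());
        System.out.println("binary bytes:    " + binary.remaining());
        System.out.println("shares the array: " + (binary.array() == raw));
    }
}
```

The boxed list has to touch every element individually, while the wrapped buffer hands the whole array over in one step. That is the gap the binary type avoids for byte payloads.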
On Mon, Aug 1, 2011 at 7:57 PM, Gautam Thaker <[email protected]> wrote:
> On 7/30/2011 2:28 AM, Bryan Duxbury wrote:
> > If you want efficient data structure for binary data, use the "binary"
> > thrift data type. It maps to ByteBuffer backed by a byte[] in Java.
> >
>
> I have made some detailed measurement of performance.
>
> typedef list<byte> OctetSeq
> service ATL {
>   i16 octet_thruput(1:OctetSeq payload),
>   i16 octet_thruput2(1:binary payload),
> }
>
> Thus, I have two types of payload and have tested this RPC performance
> using 2 languages.
>
> Java (java_client<-->java_server) and
> cpp (cpp_client<-->cpp_server)
>
> This gives me 4 results in all. The size of payload I have varied from 4
> bytes to 64K bytes. For each call I measure round trip latency. I did 1
> million roundtrips for all but the "Java list<byte>" case, there I did
> 10,000 round trips in order to save time. Client and server are on two
> Dell PC1950 Fedora 12 machines on isolated network, GigEthernet. (Java
> and C++ TCP results are identical.)
>
> As one can see from attached graphic the "list<byte>" payload in java
> is about 100x slower than same thrift RPC being made in C++. Surely it
> seems one should find to efficient mappings possible, no? While in place
> of "list<byte>" one can use "binary", there is no similar substitute for
> "list<i32>" or "list<double>". So it seems that for
> "list<primitive_data_type>" mapping should use either
> ByteBuffer,DoubleBuffer,Int/Long/ShortBuffer, no?
>
> Gautam
