Apologies if this has been discussed before but I didn't see it in the archives.
I see poor performance from any Perl code talking to Cassandra compared to Java. I generally clock a 5-20x speed difference using the raw Thrift API, depending on the number of structures that need to be serialized/deserialized. This is with Perl 5.10 vs. the latest Sun JVM.

I maintain the Net::Cassandra::Easy Perl module, which uses this interface, so I'd like to make it faster. I think any performance improvements would be good for all Thrift users, so I am posting here in the hope of getting some feedback.

It seems to me that one of the problems is the large number of OO method calls, which in Perl are slower than plain function calls. Another is that pack()/unpack() is probably the fastest way to serialize/deserialize data in Perl, but it's not used much. Instead I see values accumulated from the source data one step at a time, which is suboptimal. In Java this makes perfect sense, but in Perl it drags performance down.

Perhaps a good optimization would be to generate the pack/unpack format strings at compilation time, combine them with static function wrappers, and use those instead of multiple OO calls?

Although I am comfortable with Perl, I don't know Thrift well enough to recommend the best approach there. I hope to be helpful with benchmarks and specific optimizations, though.

Thanks
Ted
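
P.S. To make the pack/unpack idea concrete, here is a minimal sketch. It is not taken from the Thrift Perl library. It assumes a hypothetical struct with two fields (1: i32 count, 2: string name), the Thrift binary protocol's field layout (type byte, 16-bit field id, value, then a STOP byte), and fields that always arrive in declaration order. The names write_example and read_example are made up for illustration.

```perl
use strict;
use warnings;

# Thrift binary-protocol type codes for the two field types used here.
use constant { T_I32 => 8, T_STRING => 11 };

# One precomputed format for the whole (hypothetical) struct.
# Per field: C = type byte, n = 16-bit big-endian field id, then the value.
# 'l>' is a signed 32-bit big-endian int, 'N/a*' is a byte string prefixed
# with its 32-bit length, and the trailing 'x' is the one-byte STOP marker.
my $FORMAT = 'C n l> C n N/a* x';

# Serialize the struct with a single pack() call and no method calls.
sub write_example {
    my ($count, $name) = @_;
    return pack($FORMAT, T_I32, 1, $count, T_STRING, 2, $name);
}

# Deserialize with a single unpack() call, discarding the field headers.
# Assumption: both fields are present and in declaration order.
sub read_example {
    my ($bytes) = @_;
    my (undef, undef, $count, undef, undef, $name) = unpack($FORMAT, $bytes);
    return ($count, $name);
}

my $bytes = write_example(-42, 'cassandra');
my ($count, $name) = read_example($bytes);
print "$count $name\n";    # prints "-42 cassandra"
```

A real reader cannot rely on field order or presence, so a fixed format like this would probably only serve as a fast path for writing, or for fixed-width runs of fields on the read side. I have not benchmarked it.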
