Stack,

Thanks for the quick response. Well, the extra layer really kills the performance. The 'hop' is so expensive.
Is there another C/C++ API to try out? I saw there is a JIRA, HBASE-1015, but it has been inactive for a while.

Demai

Stack <[email protected]> wrote:

>Is it because of the 'hop'? Java goes against RS. The thrift C++ goes to a
>thriftserver which hosts a java client and then it goes to the RS?
>St.Ack
>
>On Fri, Mar 6, 2015 at 4:46 PM, Demai Ni <[email protected]> wrote:
>
>> hi, guys,
>>
>> I am trying to get a rough idea about the performance comparison between
>> c++ and java client when access HBase table, and is surprised to find out
>> that Thrift (c++) is 4X slower
>>
>> The performance result is:
>> C++: real *16m11.313s*; user 5m3.642s; sys 2m21.388s
>> Java: real *4m6.012s*; user 0m31.228s; sys 0m8.018s
>>
>>
>> I have a single node HBase(98.6) cluster, with 1X TPCH loaded, and use the
>> largest table : lineitem, which has 6M rows, roughly 600MB data.
>>
>> For c++ client, I used the thrift example provided by hbase-examples, the
>> C++ code looks like:
>>
>> >   std::string t("lineitem");
>> >   int scanner = client.scannerOpenWithScan(t, tscan, dummyAttributes);
>> >   int count = 0;
>> >   ..
>> >   while (true) {
>> >     std::vector<TRowResult> value;
>> >     client.scannerGet(value, scanner);
>> >     if (value.size() == 0) break;
>> >     count ++;
>> >   }
>> >
>> >   std::cout << count << " rows scanned" << std::endl;
>> >
>>
>> For java client is the most simple one:
>>
>> >   HTable table = new HTable(conf, "lineitem");
>> >
>> >   Scan scan = new Scan();
>> >   ResultScanner resScanner;
>> >   resScanner = table.getScanner(scan);
>> >   int count = 0;
>> >   for (Result res : resScanner) {
>> >     count ++;
>> >   }
>> >
>>
>>
>>
>> Since most of the time should be on I/O, I don't expect any significant
>> difference between Thrift(C++) and Java. Any ideas? Many thanks
>>
>> Demai
>>
