bq. when the result line is so much lines

By line, did you mean number of rows?

bq. one table with rowkey as A_B_time, another as B_A_time

In the above case, handling a failed write (to the second table) becomes a bit tricky.

Cheers

On Fri, Feb 5, 2016 at 12:08 AM, Jameson Li <hovlj...@gmail.com> wrote:

> 2016-01-26 2:29 GMT+08:00 Henning Blohm <henning.bl...@zfabrik.de>:
>
> > I am looking for advice on an HBase mass data access optimization
> > problem.
>
> For multi-get and multi-scan:
> In my opinion, multi-get (make less line) can work in realtime query, but
> multi-scan maybe work but it will let server busy easy and affect other
> small-query to a big query time.
> But multi-get's query time will not stable, when one of the region is busy
> the whole time will up.
>
> For realtime and offline:
> watch your real query result, when the result line is so much lines, like
> Mbyte or 10Mbyte, its query time will not so good as milliseconds, because
> of the network trans time. We must reduce the result lines or result sizes
> or result columns. or it is not suit the real-realtime query.
> if actually need so much queries and so much big-size results, suggest to
> work with offline and parallel, but not realtime, because also the server
> network-through will not work (1000M BIT NIC for 2M byte/qps, a server just
> handler 50qps).
>
> if just the query issue (multi-scan and multi-get), I think we can waste
> store to up the query performance, just using an extra table (maybe will
> write twice) and using another schema, eg: one table with rowkey as
> A_B_time, another as B_A_time, when query B%, we just query table rowkey
> B_A_time that just one small-scan, and not need for query table row
> A_B_time with multi_scans.
>
> Hope helpful for U.
>
> --
>
> Thanks & Regards,
> 李剑 Jameson Li
> Focus on Hadoop,Mysql
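
To make the failed-write concern concrete, here is a minimal sketch of the two-table layout (A_B_time and B_A_time) with one possible way to handle a failed second write. It uses plain in-memory Python objects rather than the HBase client, so `FakeTable`, `prefix_scan` and `dual_write` are hypothetical names for illustration only. Retrying and then undoing the first write is just one option, not something the thread prescribes.

```python
from bisect import bisect_left


class FakeTable:
    """In-memory stand-in for a table with sorted row keys (hypothetical)."""

    def __init__(self, fail_on_put=False):
        self.rows = {}
        self.fail_on_put = fail_on_put  # simulate an unavailable table

    def put(self, rowkey, value):
        if self.fail_on_put:
            raise IOError("simulated write failure")
        self.rows[rowkey] = value

    def delete(self, rowkey):
        self.rows.pop(rowkey, None)

    def prefix_scan(self, prefix):
        """Return (rowkey, value) pairs whose key starts with prefix, in key order."""
        keys = sorted(self.rows)
        out = []
        for key in keys[bisect_left(keys, prefix):]:
            if not key.startswith(prefix):
                break
            out.append((key, self.rows[key]))
        return out


def dual_write(primary, secondary, a, b, ts, value, retries=2):
    """Write A_B_time to primary and B_A_time to secondary.

    If the secondary write still fails after the retries, remove the
    primary row again so the two tables do not diverge. In a real system
    this clean-up can fail too, which is the tricky part.
    Returns True only when both writes succeeded.
    """
    key_ab = f"{a}_{b}_{ts}"
    key_ba = f"{b}_{a}_{ts}"
    primary.put(key_ab, value)
    for _ in range(retries + 1):
        try:
            secondary.put(key_ba, value)
            return True
        except IOError:
            continue
    primary.delete(key_ab)  # compensation step (assumes the key was new)
    return False


# Usage: a query for "B%" becomes a single prefix scan on the second table.
t_ab, t_ba = FakeTable(), FakeTable()
dual_write(t_ab, t_ba, "a1", "b1", 1454630880, "v1")
dual_write(t_ab, t_ba, "a2", "b1", 1454630881, "v2")
rows_for_b1 = t_ba.prefix_scan("b1_")

# Failure case: the second table rejects every write.
broken = FakeTable(fail_on_put=True)
ok = dual_write(t_ab, broken, "a3", "b2", 1454630882, "v3")
```

In this sketch a reader never sees a row in one table that is missing from the other once `dual_write` has returned, but there is still a window during the call where only the first table holds the row.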