Re: parallel scanning?

Ted Yu Fri, 05 Feb 2016 04:14:20 -0800

bq. when the result has so many lines

By line, did you mean number of rows?


bq. one table with rowkey as A_B_time, another as B_A_time

In the above case, handling a failed write (to the second table) becomes a
bit tricky.
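
To make the concern concrete, here is a minimal sketch of the two-table
idea with one possible way to handle a failed second write. It is a toy
in-memory model, not the HBase client API: SortedTable, dual_write, replay
and the fail_secondary flag are hypothetical names made up for
illustration, and the "pending" list stands in for whatever durable retry
log a real deployment would need. It assumes fixed-width timestamps so
that rowkeys sort correctly as strings.

```python
import bisect


class SortedTable:
    """Toy stand-in for a table whose rows are kept sorted by rowkey."""

    def __init__(self):
        self.keys = []   # rowkeys in sorted order
        self.rows = {}   # rowkey -> value

    def put(self, rowkey, value):
        if rowkey not in self.rows:
            bisect.insort(self.keys, rowkey)
        self.rows[rowkey] = value

    def prefix_scan(self, prefix):
        """Return all (rowkey, value) pairs whose rowkey starts with prefix."""
        start = bisect.bisect_left(self.keys, prefix)
        out = []
        for key in self.keys[start:]:
            if not key.startswith(prefix):
                break
            out.append((key, self.rows[key]))
        return out


def dual_write(ab_table, ba_table, a, b, ts, value, pending,
               fail_secondary=False):
    """Write A_B_time first, then B_A_time; queue the second write on failure."""
    ab_table.put(f"{a}_{b}_{ts}", value)
    ba_key = f"{b}_{a}_{ts}"
    try:
        if fail_secondary:  # hypothetical switch to simulate a failed write
            raise OSError("simulated failure writing to the second table")
        ba_table.put(ba_key, value)
    except OSError:
        # The two tables now disagree until this entry is replayed.
        pending.append((ba_key, value))


def replay(ba_table, pending):
    """Re-apply queued writes to the second table."""
    while pending:
        rowkey, value = pending.pop(0)
        ba_table.put(rowkey, value)


ab, ba, pending = SortedTable(), SortedTable(), []
dual_write(ab, ba, "a1", "b1", "0001", "x", pending)
dual_write(ab, ba, "a2", "b1", "0002", "y", pending, fail_secondary=True)
rows_before = ba.prefix_scan("b1_")   # misses the failed write
replay(ba, pending)
rows_after = ba.prefix_scan("b1_")    # sees both rows after replay
```

Between the failed write and the replay, a scan on the B_A_time table
returns fewer rows than the A_B_time table actually holds, which is the
tricky part: readers need to tolerate that window, or the writer needs
some way to make the pair of writes look atomic.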

Cheers

On Fri, Feb 5, 2016 at 12:08 AM, Jameson Li <hovlj...@gmail.com> wrote:

> 2016-01-26 2:29 GMT+08:00 Henning Blohm <henning.bl...@zfabrik.de>:
>
> > I am looking for advice on an HBase mass data access optimization
> > problem.
> >
>
> For multi-get and multi-scan:
> In my opinion, multi-get (which returns fewer lines) can work for realtime
> queries. Multi-scan may work too, but it easily keeps the servers busy and
> can slow other small queries down to big-query times.
> However, multi-get's query time will not be stable: when one of the
> regions is busy, the whole request takes longer.
>
> For realtime and offline:
> Look at your actual query results. When a result has so many lines, on the
> order of 1 MB or 10 MB, its query time will not be in milliseconds because
> of the network transfer time. We must reduce the number of result lines,
> the result size, or the result columns; otherwise it is not suitable for
> truly realtime queries.
> If you really need that many queries and such large results, I suggest
> working offline and in parallel rather than in realtime, because the
> server's network throughput will not keep up either (with a 1000 Mbit NIC
> and 2 MB per query, a server can handle only about 50 qps).
>
> If it is only the query issue (multi-scan and multi-get), I think we can
> spend extra storage to improve query performance: use an extra table
> (which may mean writing twice) with another schema. E.g. one table with
> rowkey as A_B_time, another as B_A_time. When querying B%, we just query
> the B_A_time table with a single small scan, and do not need to query the
> A_B_time table with multiple scans.
>
> Hope this helps.
>
>
>
>
> --
>
>
> Thanks & Regards,
> 李剑 Jameson Li
> Focus on Hadoop,Mysql
>
