Re: parallel scanning?

Jameson Li Fri, 05 Feb 2016 21:49:21 -0800

> By line, did you mean number of rows?

Yes, sorry for my poor English.


> In the above case, handling failed write (to the second table) becomes a
> bit tricky.

Yes. But I think write problems are sometimes easier to solve than read
problems, and sometimes we can write the same data twice or several times
without any problem (provided we do not operate on the column timestamp).
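
To illustrate why repeating a write can be harmless, here is a minimal
sketch. It is not the HBase client API: two in-memory sorted maps stand in
for the A_B_time and B_A_time tables, and the names DualTableSketch,
putWithRetry, scanByB and failuresToInject are made up for this example.
In real HBase a repeated put without an explicit timestamp creates a new
cell version, which is why the timestamp premise above matters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for two row-key-ordered tables; NOT the HBase client API.
final class DualTableSketch {
    // Sorted maps model tables whose rows are ordered by row key.
    final TreeMap<String, String> byA = new TreeMap<>(); // row key: A_B_time
    final TreeMap<String, String> byB = new TreeMap<>(); // row key: B_A_time

    // Hypothetical hook: how many writes to the second table should fail.
    int failuresToInject = 0;

    // Writes the same value under both key layouts. The second write may fail.
    void put(String a, String b, long time, String value) {
        byA.put(a + "_" + b + "_" + time, value);
        if (failuresToInject > 0) {
            failuresToInject--;
            throw new IllegalStateException("simulated write failure");
        }
        byB.put(b + "_" + a + "_" + time, value);
    }

    // Retries the whole double write. Re-writing the first table stores the
    // same key and value again, so the retry does not change the result.
    void putWithRetry(String a, String b, long time, String value, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                put(a, b, time, value);
                return;
            } catch (IllegalStateException e) {
                if (attempt >= maxAttempts) {
                    throw e;
                }
            }
        }
    }

    // One small prefix scan on the B_A_time layout answers a "B%" query.
    List<String> scanByB(String b) {
        String prefix = b + "_";
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, String> e : byB.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // keys are sorted, so the prefix range has ended
            }
            rows.add(e.getKey() + "=" + e.getValue());
        }
        return rows;
    }
}
```

After a failed second write, calling putWithRetry again leaves both maps
with exactly one entry per logical row, and scanByB("b1") reads only the
rows whose key starts with "b1_".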




2016-02-05 20:13 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:

> bq. when the result line is so much lines
>
> By line, did you mean number of rows ?
>
> bq. one table with rowkey as A_B_time, another as B_A_time
>
> In the above case, handling failed write (to the second table) becomes a
> bit tricky.
>
> Cheers
>
> On Fri, Feb 5, 2016 at 12:08 AM, Jameson Li <hovlj...@gmail.com> wrote:
>
> > 2016-01-26 2:29 GMT+08:00 Henning Blohm <henning.bl...@zfabrik.de>:
> >
> > > I am looking for advice on an HBase mass data access optimization
> > > problem.
> > >
> >
> > For multi-get and multi-scan:
> > In my opinion, multi-get (returning fewer rows) can work for realtime
> > queries. Multi-scan may work too, but it easily keeps the servers busy and
> > can slow other small queries down to big-query latency.
> > However, multi-get's query time is not stable: when one of the regions is
> > busy, the whole query takes longer.
> >
> > For realtime and offline:
> > Look at your real query results. When a result has very many rows, e.g.
> > 1 MB or 10 MB in total, its query time will not stay in the millisecond
> > range because of the network transfer time. We must reduce the number of
> > result rows, the result size, or the result columns; otherwise it is not
> > suited to truly realtime queries.
> > If you really need that many queries with such big results, I suggest
> > working offline and in parallel rather than in realtime, because the
> > server's network throughput will also be a limit (with a 1000 Mbit NIC and
> > 2 MB per query, one server handles only about 50 qps).
> >
> > If the only issue is the query pattern (multi-scan and multi-get), I think
> > we can trade storage for query performance by using an extra table (which
> > may mean writing twice) with another schema, e.g. one table with rowkey as
> > A_B_time, another as B_A_time. When querying B%, we just query the table
> > with rowkey B_A_time using a single small scan, and do not need to query
> > the table with rowkey A_B_time using multiple scans.
> >
> > Hope this is helpful for you.
> >
> >
> >
> >
> > --
> >
> >
> > Thanks & Regards,
> > 李剑 Jameson Li
> > Focus on Hadoop,Mysql
> >
>



-- 


Thanks & Regards,
李剑 Jameson Li
Focus on Hadoop,Mysql
