Re: parallel scanning?

Stack Mon, 01 Feb 2016 12:20:18 -0800

On Mon, Jan 25, 2016 at 10:29 AM, Henning Blohm <henning.bl...@zfabrik.de>
wrote:


> Hi,
>
> I am looking for advice on an HBase mass data access optimization problem.
>
> In our application all data records stored in HBase have a time dimension
> (as inverted time) and a GUID in the row key. Retrieving a record requires
> issuing a scan with the GUID as prefix.
>
>
So GUID precedes the inverted timestamp?
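
To make the question concrete, here is a minimal sketch of the layout being asked about, assuming the row key is the 16 GUID bytes followed by an 8-byte inverted timestamp. The names row_key and prefix_range, and the choice of Long.MAX_VALUE minus the timestamp as the inversion, are assumptions for illustration and are not taken from Henning's application. Python bytes compare as unsigned, lexicographic byte strings, which matches how HBase orders row keys.

```python
import struct
import uuid

MAX_LONG = 2**63 - 1  # Java's Long.MAX_VALUE, assumed here as the inversion base


def row_key(guid, ts_millis):
    """GUID first, then inverted time: the newest record sorts first per GUID."""
    return guid.bytes + struct.pack(">q", MAX_LONG - ts_millis)


def prefix_range(prefix):
    """Return (start_row, stop_row) covering every key that starts with prefix.

    The stop row is exclusive. An empty stop row means "scan to the end".
    """
    trimmed = prefix.rstrip(b"\xff")  # trailing 0xff bytes cannot be incremented
    if not trimmed:
        return prefix, b""
    return prefix, trimmed[:-1] + bytes([trimmed[-1] + 1])


if __name__ == "__main__":
    guid = uuid.uuid4()
    older = row_key(guid, 1_454_000_000_000)
    newer = row_key(guid, 1_454_000_060_000)
    start, stop = prefix_range(guid.bytes)
    print(newer < older)                  # newer record sorts before older one
    print(start <= newer < stop)          # both fall inside the prefix range
```

With this layout a scan bounded by (start_row, stop_row) and limited to one row returns the most recent record for a GUID. If the timestamp came first instead, the records for one GUID would be scattered across the table and a prefix scan would not apply.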



> In order to get to an entry (there are various access paths) we use a simple
> secondary index that also has a time dimension in the row and so needs a
> scan as well.
>
> For mass updates I am currently seeking ways to improve lookup performance.
>
> I found various discussions and issues on multi-scans (as in multi-Get,
> multi-Delete) but none of it was really helpful in sorting out the most
> promising direction.
>
>
Does the multi-Get not help? The downside is that one slow server slows the
whole query. Is it not parallel enough in its querying?



> Currently I am experimenting with simply parallelizing lookups in chunks
> from the client. That reduces elapsed wait time a bit. It seems though
> that avoiding roundtrips altogether by "scanning in parallel server-side"
> should show much better improvements.
>
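
For reference, the client-side approach described above could look roughly like the following sketch. It assumes the lookups are I/O bound and that a caller-supplied function, here called lookup_chunk (a hypothetical name), takes a list of GUIDs and returns a dict of results for them. Chunk size and worker count are placeholders to tune, not recommendations.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield consecutive slices of items with at most size elements each."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def parallel_lookup(guids, lookup_chunk, chunk_size=100, workers=8):
    """Run lookup_chunk over chunks of guids concurrently and merge the results.

    lookup_chunk is a stand-in for whatever issues the prefix scans for one
    chunk; it must be safe to call from several threads at once.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields per-chunk results in submission order
        for partial in pool.map(lookup_chunk, chunked(guids, chunk_size)):
            results.update(partial)
    return results
```

Each chunk still pays one round trip per scan, so this only overlaps the waiting; it does not remove the round trips.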


How would this work? You'd pass over a list of GUIDs you knew were on a
particular server, then in a coprocessor, we'd do whatever per GUID?
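
The first half of that idea, working out which GUIDs live on which server, can be sketched on the client as below. This is only a model: the region list (start key plus server name) is assumed to have been fetched from the cluster beforehand, and group_by_region is a hypothetical helper, not an HBase API.

```python
from bisect import bisect_right
from collections import defaultdict


def group_by_region(row_prefixes, regions):
    """Group row-key prefixes by the server hosting the region they start in.

    regions is a list of (start_key, server) tuples sorted by start_key, where
    the first region has the empty start key b"". Keys compare as unsigned
    byte strings, as HBase row keys do.
    """
    starts = [start for start, _ in regions]
    groups = defaultdict(list)
    for prefix in row_prefixes:
        # the owning region is the last one whose start key is <= the prefix
        idx = bisect_right(starts, prefix) - 1
        groups[regions[idx][1]].append(prefix)
    return dict(groups)
```

One caveat with this model: if a region split point falls in the middle of one GUID's rows, that GUID's records span two regions, so the per-server work would need to handle a prefix whose rows continue on a neighbouring region.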

St.Ack



> Thanks,
> Henning
>
> --
> Henning Blohm
>
> *ZFabrik Software GmbH & Co. KG*
>
> T:      +49 6227 3984255
> F:      +49 6227 3984254
> M:      +49 1781891820
>
> Lammstrasse 2 69190 Walldorf
>
> henning.bl...@zfabrik.de <mailto:henning.bl...@zfabrik.de>
> Linkedin <http://www.linkedin.com/pub/henning-blohm/0/7b5/628>
> ZFabrik <http://www.zfabrik.de>
> Blog <http://www.z2-environment.net/blog>
> Z2-Environment <http://www.z2-environment.eu>
> Z2 Wiki <http://redmine.z2-environment.net>
>
>
