[ https://issues.apache.org/jira/browse/HBASE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876701#action_12876701 ]
stack commented on HBASE-1845:
------------------------------
bq. ...but how is a MultiGet different from a Scan with a filter? I suppose I
understand from a query execution perspective, but it feels a bit odd to have
different client APIs for grabbing multiple rows.
MultiGet is parallel fetching of many random rows. For small numbers
(hundreds/thousands?) it should run faster than the (usually) single-threaded
full table scan plus filter that only returns those rows that pass the filter.
There is an inflection point at which the scan becomes faster than MultiGet, but
you'd have to be returning a good percentage of the table for that to be the case.
Also, a filter that returns some random set of the table content would have to
carry every wanted row key, I believe, unless there is a pattern to the wanted
row keys that you can extract; it'd be pretty fat w/ payload.
(HBASE-1935 covers scanning in parallel; it has a patch attached.)
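
To make the comparison concrete, here is a rough sketch of the two access
patterns. It is a toy in-memory model, not the HBase client API: the class
MultiGetSketch, its methods, and the "row0007"-style keys are all made up for
illustration, and the round-robin batches only stand in for per-region-server
batches. It shows the shape of the work (parallel point lookups versus one
thread visiting every row with a filter that carries all wanted keys), not
real timings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model only; none of these names are HBase client APIs.
final class MultiGetSketch {

    // Hypothetical table: row key -> value, kept in sorted key order.
    static TreeMap<String, String> buildTable(int rows) {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < rows; i++) {
            table.put(String.format("row%04d", i), "value" + i);
        }
        return table;
    }

    // "Scan plus filter": a single thread visits every row and keeps the
    // wanted ones. The filter has to carry the full set of wanted keys.
    static Map<String, String> scanWithFilter(TreeMap<String, String> table,
                                              Set<String> wanted) {
        Map<String, String> out = new TreeMap<>();
        for (Map.Entry<String, String> e : table.entrySet()) {
            if (wanted.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    // "MultiGet": split the wanted keys into batches (assumed stand-ins for
    // per-region-server batches) and run the point lookups in parallel.
    static Map<String, String> multiGet(TreeMap<String, String> table,
                                        List<String> wanted, int batches) {
        ExecutorService pool = Executors.newFixedThreadPool(batches);
        try {
            List<Future<Map<String, String>>> futures = new ArrayList<>();
            for (int b = 0; b < batches; b++) {
                final int batch = b;
                futures.add(pool.submit(() -> {
                    Map<String, String> part = new TreeMap<>();
                    // Round-robin assignment of keys to this batch.
                    for (int i = batch; i < wanted.size(); i += batches) {
                        String key = wanted.get(i);
                        String value = table.get(key); // read-only lookup
                        if (value != null) {
                            part.put(key, value);
                        }
                    }
                    return part;
                }));
            }
            Map<String, String> out = new TreeMap<>();
            for (Future<Map<String, String>> f : futures) {
                out.putAll(f.get());
            }
            return out;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = buildTable(1000);
        List<String> wanted = Arrays.asList("row0007", "row0420", "row0999");
        Map<String, String> viaGet = multiGet(table, wanted, 4);
        Map<String, String> viaScan = scanWithFilter(table, new HashSet<>(wanted));
        System.out.println("same rows from both paths: " + viaGet.equals(viaScan));
    }
}
```

Both paths return the same rows; the difference is that the scan touches all
1000 rows to find three, while the multi-get only touches the keys it was
asked for.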
> MultiGet, MultiDelete, and MultiPut - batched to the appropriate region
> servers
> -------------------------------------------------------------------------------
>
> Key: HBASE-1845
> URL: https://issues.apache.org/jira/browse/HBASE-1845
> Project: HBase
> Issue Type: New Feature
> Reporter: Erik Holstad
> Fix For: 0.21.0
>
> Attachments: batch.patch, hbase-1845_0.20.3.patch,
> hbase-1845_0.20.5.patch, multi-v1.patch
>
>
> I've started to create a general interface for doing these batch/multi calls
> and would like to get some input and thoughts about how we should handle this
> and what the protocol should look like.
> First naive patch, coming soon.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.