[
https://issues.apache.org/jira/browse/HBASE-26122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bryan Beaudreault resolved HBASE-26122.
---------------------------------------
Fix Version/s: (was: 3.0.0-alpha-2)
(was: 2.6.0)
Release Note: (was: Can now call Get.setMaxResultSize(). When set to a
positive value, the server will return the results when that threshold is met.
This may result in partial results for large rows, so the caller is expected to
handle the case where Result#mayHaveMoreCellsInRow() is true when
setMaxResultSize is used. Possible options include paginating using PageFilter,
reducing the returned data set using other filters, converting the Get to a
Scan (which can take advantage of partial response stitching), or throwing a
non-retryable exception if using this as a guardrail. See below for example
usage in shell:
Create table
hbase:005:0> create 't1', 'f1'
Created table t1
Took 1.1306 seconds
Insert test data
hbase:012:0> put 't1', 'r1', 'f1:c1', 'a'
Took 0.0416 seconds
hbase:014:0> put 't1', 'r1', 'f1:c2', 'b'
Took 0.0059 seconds
hbase:015:0> put 't1', 'r1', 'f1:c3', 'c'
Took 0.0097 seconds
Get without setMaxResultSize, returns full row and mayHaveMoreCellsInRow = false
hbase:037:0> g = Get.new('r1'.to_s.to_java_bytes)
=> #<Java::OrgApacheHadoopHbaseClient::Get:0x11fa11b2>
hbase:038:0> result = @hbase.table('t1',
@shell).instance_variable_get(:@table).get(g)
=> #<Java::OrgApacheHadoopHbaseClient::Result:0x217009bd>
hbase:039:0> result.mayHaveMoreCellsInRow
=> false
hbase:040:0> result.toString
=> "keyvalues={r1/f1:c1/1627498270850/Put/vlen=1/seqid=0,
r1/f1:c2/1627498276326/Put/vlen=1/seqid=0,
r1/f1:c3/1627498280413/Put/vlen=1/seqid=0}"
Get with setMaxResultSize, returns first two columns and mayHaveMoreCellsInRow
= true
hbase:059:0> g = Get.new('r1'.to_s.to_java_bytes).setMaxResultSize(100)
=> #<Java::OrgApacheHadoopHbaseClient::Get:0x5ed88e31>
hbase:060:0> result = @hbase.table('t1',
@shell).instance_variable_get(:@table).get(g)
=> #<Java::OrgApacheHadoopHbaseClient::Result:0x574e4184>
hbase:061:0> result.mayHaveMoreCellsInRow
=> true
hbase:062:0> result.toString
=> "keyvalues={r1/f1:c1/1627498270850/Put/vlen=1/seqid=0,
r1/f1:c2/1627498276326/Put/vlen=1/seqid=0}")
Resolution: Won't Fix
> Limit max result size of individual Gets
> ----------------------------------------
>
> Key: HBASE-26122
> URL: https://issues.apache.org/jira/browse/HBASE-26122
> Project: HBase
> Issue Type: New Feature
> Components: Client, regionserver
> Reporter: Bryan Beaudreault
> Assignee: Bryan Beaudreault
> Priority: Major
>
> Scans have the ability to have a configured max result size, which causes
> them to return a partial result once the limit has been reached. MultiGets
> also can throw MultiActionResultTooLarge if the response size is over a
> configured quota. Neither of these really accounts for a single Get of a
> too-large row. Such too-large Gets can cause substantial GC pressure or worse
> if sent at volume.
> Currently one can work around this by converting the Get to a single-row
> Scan, but this requires a developer either to know about the issue in advance
> and use a Scan upfront, or to wait for the RegionServer to choke on a large
> request and only then rewrite the Get for future requests.
> We should implement the same response size limits for Get as for Scan,
> whereby the server returns a partial result to the client for handling.
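
The partial-result behaviour requested above can be modelled in a few lines of plain Ruby. This is a toy sketch, not HBase code: the names Cell, PartialResult and limited_get are hypothetical, the per-cell sizes are made-up numbers, and the cutoff rule (stop once the accumulated size meets the limit) is an assumption based on the wording "return the results when that threshold is met". Since the issue was resolved as Won't Fix, it only illustrates the idea of a size-capped Get that flags a truncated row.

```ruby
# Toy model only: this does not call or reproduce any HBase API.
# Cell sizes are invented for illustration.
Cell = Struct.new(:qualifier, :size_bytes)
PartialResult = Struct.new(:cells, :may_have_more_cells_in_row)

# Return cells of one row until the accumulated size meets the limit.
# A non-positive limit is treated as "no limit" (assumption).
def limited_get(row_cells, max_result_size)
  return PartialResult.new(row_cells.dup, false) if max_result_size <= 0

  returned = []
  total = 0
  row_cells.each do |cell|
    returned << cell
    total += cell.size_bytes
    # Stop as soon as the threshold is met, even if cells remain.
    break if total >= max_result_size
  end

  # The row is partial when some cells were left behind.
  PartialResult.new(returned, returned.length < row_cells.length)
end

row = [Cell.new('c1', 56), Cell.new('c2', 56), Cell.new('c3', 56)]

full = limited_get(row, -1)     # no limit: whole row
capped = limited_get(row, 100)  # limit met after the second cell

puts full.cells.map(&:qualifier).inspect
puts full.may_have_more_cells_in_row
puts capped.cells.map(&:qualifier).inspect
puts capped.may_have_more_cells_in_row
```

A caller in this model would check may_have_more_cells_in_row and then decide whether to paginate, narrow the request with filters, fall back to a Scan, or fail fast.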
--
This message was sent by Atlassian Jira
(v8.20.1#820001)