bbeaudreault commented on pull request #3565:
URL: https://github.com/apache/hbase/pull/3565#issuecomment-903690738


   Thank you for your comment, Anoop. The problem we face is that we have 
hundreds of engineers using HBase in innumerable ways, on multi-tenant 
clusters. People shift teams, leave, get hired, etc. When you set out to read a 
row from HBase, you don't always know how large that row will be. 
   
   Yes, I agree with you: if someone knew they were going to fetch a large row, 
they should do a Scan. But given the above, people often don't know. This new 
feature acts as a guardrail against causing RegionServer pain in those cases.
   
   The way we've used this feature is that we have our own TableFactory that 
everyone must use to get tables. The returned Table objects are wrapped. When a 
Get is submitted, the wrapper uses this feature to cap the max result size. 
When data is returned, we inspect the Result and throw an exception if it's a 
partial result. 
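
   As a rough illustration of the wrapper pattern described above, here is a 
minimal sketch. Every type in it (`GetRequest`, `RowResult`, `SimpleTable`, 
`GuardedTable`, `PartialResultException`, `fakeBackend`) is a hypothetical 
stand-in invented for this example, not the HBase client API and not our 
actual TableFactory code. It only shows the shape of the guardrail: cap the 
result size on the way in, reject partial results on the way out.

```java
/** Sketch of a size-capping Get wrapper. All types here are hypothetical stand-ins. */
final class GuardedGetSketch {

    /** Stand-in for a Get request (assumption: not the HBase Get class). */
    static final class GetRequest {
        final String row;
        long maxResultSize = -1; // -1 means "no limit requested"

        GetRequest(String row) {
            this.row = row;
        }
    }

    /** Stand-in for a Result: carries the returned size and a partial flag. */
    static final class RowResult {
        final long sizeBytes;
        final boolean partial;

        RowResult(long sizeBytes, boolean partial) {
            this.sizeBytes = sizeBytes;
            this.partial = partial;
        }
    }

    /** Stand-in for the Table interface, reduced to a single get call. */
    interface SimpleTable {
        RowResult get(GetRequest get);
    }

    /** Thrown by the wrapper when the row did not fit under the cap. */
    static final class PartialResultException extends RuntimeException {
        PartialResultException(String message) {
            super(message);
        }
    }

    /** The wrapper a table factory would hand out instead of the raw table. */
    static final class GuardedTable implements SimpleTable {
        private final SimpleTable delegate;
        private final long maxBytes;

        GuardedTable(SimpleTable delegate, long maxBytes) {
            this.delegate = delegate;
            this.maxBytes = maxBytes;
        }

        @Override
        public RowResult get(GetRequest get) {
            // Cap the result size before the request reaches the server.
            get.maxResultSize = maxBytes;
            RowResult result = delegate.get(get);
            // A partial result means the row exceeded the cap: fail loudly.
            if (result.partial) {
                throw new PartialResultException(
                    "Row '" + get.row + "' exceeds " + maxBytes
                        + " bytes; use a Scan or add a filter");
            }
            return result;
        }
    }

    /** Fake backend for the demo: every row has rowSizeBytes and is cut off at the cap. */
    static SimpleTable fakeBackend(long rowSizeBytes) {
        return get -> {
            boolean cut = get.maxResultSize >= 0 && rowSizeBytes > get.maxResultSize;
            return new RowResult(cut ? get.maxResultSize : rowSizeBytes, cut);
        };
    }

    public static void main(String[] args) {
        SimpleTable table = new GuardedTable(fakeBackend(10_000), 1_024);
        try {
            table.get(new GetRequest("big-row"));
        } catch (PartialResultException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

   Throwing on a partial result, rather than silently returning truncated 
data, is what turns the size cap into a visible guardrail for the caller.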
   
   At that point the user can rewrite their query to use a Scan or add a 
filter, etc. We also have an escape hatch for urgent issues to allow them to go 
through, but that is audited. 
   
   That said, I just noticed `hbase.table.max.rowsize`. Surprisingly, that was 
added shortly after our original patch for this, and we hadn't noticed it until 
now. The default value looks too large, but given we've already been enforcing 
this limit, I think we might be able to move to that instead. 



