bbeaudreault commented on a change in pull request #3532:
URL: https://github.com/apache/hbase/pull/3532#discussion_r686269713



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/Get.java
##########
@@ -339,6 +340,21 @@ public Get setFilter(Filter filter) {
     return this;
   }
 
+  /**
+   * Set the maximum result size. The default is -1; this means that no specific
+   * maximum result size will be set for this Get.
+   *
+   * If set to a value greater than zero, the server may respond with a Result where
+   * {@link Result#mayHaveMoreCellsInRow()} is true. The user is required to handle
+   * this case.

Review comment:
       At HubSpot we have a wrapper implementation of Table that all 
downstream users go through. The wrapper enforces that `setMaxResultSize` is 
set to a standard value that we've deemed safe. If a result comes back with 
`mayHaveMoreCellsInRow` set to true, we throw an exception. A team that hits 
this exception can request a temporary allowance, which disables the check. 
In the meantime, they are expected to add a filter to paginate so that they 
stay under the limit.
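
A minimal sketch of the check described above, not our actual wrapper. It uses stand-in types so that it compiles on its own: `PartialAware` mirrors only the `mayHaveMoreCellsInRow()` method of `Result`, and `PartialResultException` and `PartialResultGuard` are hypothetical names made up for illustration.

```java
// Hypothetical stand-in for the single method of Result that the check needs.
interface PartialAware {
  boolean mayHaveMoreCellsInRow();
}

// Hypothetical exception type; the real wrapper's exception is not shown in this thread.
class PartialResultException extends RuntimeException {
  PartialResultException(String message) {
    super(message);
  }
}

// Hypothetical guard applied to every Get result that passes through the wrapper.
final class PartialResultGuard {
  // True when a team has been granted the temporary allowance.
  private final boolean allowPartial;

  PartialResultGuard(boolean allowPartial) {
    this.allowPartial = allowPartial;
  }

  // Returns the result unchanged, or throws if it is partial and no allowance is set.
  <R extends PartialAware> R check(R result) {
    if (!allowPartial && result.mayHaveMoreCellsInRow()) {
      throw new PartialResultException(
          "Get exceeded the max result size; paginate with a filter or request an allowance");
    }
    return result;
  }
}
```

In a real wrapper the guard would presumably be called on the `Result` returned by the underlying `Table.get(...)` before handing it back to the caller.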
   
   This is a little draconian, but we used to have lots of OOM issues due to 
large gets/puts/scans. Another possible solution is to iterate with PageFilter, 
as I did in `testGetPartialResults`. We planned to do something like that 
eventually, but we rolled this out in such a way that exceptions were rare 
enough that we never did the work.
   
   Would you be open to automatic stitching in the future, like we do with 
Scans? I can't do that now, but it might be a reasonable follow-up JIRA.
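
To make the stitching idea concrete, here is a rough sketch of what such a loop might look like. It is only an illustration under stated assumptions, not a proposal for the actual client code: `Chunk` and `GetStitcher` are hypothetical, cells are modelled as strings, and resuming by cell offset is an assumption (a real client would more likely resume from the last cell it received).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical stand-in for one partial Get response.
final class Chunk {
  final List<String> cells;
  final boolean mayHaveMoreCellsInRow;

  Chunk(List<String> cells, boolean mayHaveMoreCellsInRow) {
    this.cells = cells;
    this.mayHaveMoreCellsInRow = mayHaveMoreCellsInRow;
  }
}

final class GetStitcher {
  // Calls fetchFromOffset repeatedly, passing the number of cells received so
  // far, until a response reports that the row has no more cells.
  static List<String> fetchWholeRow(IntFunction<Chunk> fetchFromOffset) {
    List<String> all = new ArrayList<>();
    Chunk chunk;
    do {
      chunk = fetchFromOffset.apply(all.size());
      if (chunk.cells.isEmpty() && chunk.mayHaveMoreCellsInRow) {
        // Guard against looping forever if the server makes no progress.
        throw new IllegalStateException("partial response contained no cells");
      }
      all.addAll(chunk.cells);
    } while (chunk.mayHaveMoreCellsInRow);
    return all;
  }
}
```

Note that a loop like this only bounds the size of each response; the whole row is still assembled in client memory.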
   
   



