junegunn commented on PR #7432: URL: https://github.com/apache/hbase/pull/7432#issuecomment-3494354832
I understand your point.

> Not all users will look deeply into the javadoc

True, and unfortunately, those users will still complain that `Scan#setLimit` doesn't work as expected no matter what. So we should:

- Document the limitation of the method in the javadoc (_"this doesn't work with TableInputFormat"_),
- And in case it's overlooked, print a warning message when serializing a Scan with a limit for an MR job.

In order to do that, we need to introduce an internal version of `ProtobufUtil.toScan` that doesn't print a warning message and use it in `RequestConverter.buildScanRequest`. However, the public `ProtobufUtil.toScan` cannot tell if users are setting the new per-split-limit parameter in their configuration, or they are aware of the limitation; cases where the warning message can feel redundant. Users would then have to manually unset the limit to silence it, which is not ideal, so I'm not entirely sure about adding the warning.

> introduce different meanings when using Scan limit

I thought it was acceptable, because we already have constructs that behave differently in parallel scenarios (e.g. stateful filters like `PageFilter` and `WhileMatchFilter`).

> _This is because the filter is applied separately on different region servers._
> https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/filter/PageFilter.html

So I assumed it was already well-understood that a separate Scan operates per split in such cases. But maybe that's just me.
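
To make the per-split semantics concrete, here is a minimal sketch that deliberately uses no HBase APIs. `PerSplitLimitSketch`, `scanSplit`, and `scanAllSplits` are hypothetical names made up for illustration; each inner list stands in for the rows of one split, and the limit is applied to each split independently, which is roughly how a limit carried by a per-split Scan (or a `PageFilter` evaluated per region server) would behave.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerSplitLimitSketch {

    // Hypothetical stand-in for one scanner over one split:
    // returns at most `limit` rows from that split.
    static List<String> scanSplit(List<String> splitRows, int limit) {
        int n = Math.min(limit, splitRows.size());
        return new ArrayList<>(splitRows.subList(0, n));
    }

    // Applies the same limit to every split independently and
    // concatenates the results, as separate per-split scans would.
    static List<String> scanAllSplits(List<List<String>> splits, int limit) {
        List<String> out = new ArrayList<>();
        for (List<String> split : splits) {
            out.addAll(scanSplit(split, limit));
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> splits = Arrays.asList(
            Arrays.asList("a1", "a2", "a3"),
            Arrays.asList("b1", "b2", "b3"),
            Arrays.asList("c1"));

        // A limit of 2 yields 5 rows in total, not 2.
        System.out.println(scanAllSplits(splits, 2)); // prints [a1, a2, b1, b2, c1]
    }
}
```

In this model the limit bounds the rows per split, so the job as a whole can return up to limit × (number of splits) rows.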
