gerlowskija commented on code in PR #1040:
URL: https://github.com/apache/solr/pull/1040#discussion_r980343005
##########
solr/core/src/java/org/apache/solr/update/processor/IgnoreLargeDocumentProcessorFactory.java:
##########
@@ -59,15 +71,25 @@ public void init(NamedList<?> args) {
public UpdateRequestProcessor getInstance(
SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) {
return new UpdateRequestProcessor(next) {
+
@Override
public void processAdd(AddUpdateCommand cmd) throws IOException {
long docSize =
ObjectSizeEstimator.estimate(cmd.getSolrInputDocument());
if (docSize / 1024 > maxDocumentSize) {
+ handleViolatingDoc(cmd, docSize);
+ } else {
+ super.processAdd(cmd);
+ }
+ }
+
+ private void handleViolatingDoc(AddUpdateCommand cmd, long estimatedSizeBytes) {
+ if (onlyLogErrors) {
+ log.warn("Skipping doc {} bc size {} exceeds limit {}", cmd.getPrintableId(), estimatedSizeBytes / 1024, maxDocumentSize);
Review Comment:
I ended up keeping the estimated size in bytes here, with the "limit" value
still in KB.
It makes sense to keep "limit" in KB, because that's the unit users specify
when configuring this URP. I considered reporting the estimated size in KB as
well, but the integer division needed to convert bytes to KB rounds down, so
the message could show an estimated size equal to the limit, which I worried
would confuse users.
In short, I didn't want us printing log messages that looked like:
> Skipping doc asdf bc size 2kb exceeds limit 2kb
But I've added explicit units to the variables and log messages involved
here, so hopefully that's good enough to address the core of your issue.
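
For illustration, here is a minimal standalone sketch of the rounding problem described above. It is not the PR's code: the class and helper names are hypothetical, the message wording is only an approximation, and it assumes a document is flagged because its byte estimate exceeds the limit expressed in bytes.

```java
public class SizeLogSketch {
    // Hypothetical helper: reports both values in KB. Integer division
    // rounds the byte estimate down, which can hide the difference.
    public static String messageInKb(String docId, long estimatedSizeBytes, long limitKb) {
        return "Skipping doc " + docId + " bc size " + (estimatedSizeBytes / 1024)
            + "kb exceeds limit " + limitKb + "kb";
    }

    // Hypothetical helper: keeps the estimate in bytes and the limit in KB,
    // with explicit units on both values.
    public static String messageInBytes(String docId, long estimatedSizeBytes, long limitKb) {
        return "Skipping doc " + docId + " bc size " + estimatedSizeBytes
            + " bytes exceeds limit " + limitKb + " KB";
    }

    public static void main(String[] args) {
        long limitKb = 2;
        long estimatedSizeBytes = 2500; // above 2 * 1024 = 2048 bytes

        // Prints: Skipping doc asdf bc size 2kb exceeds limit 2kb
        System.out.println(messageInKb("asdf", estimatedSizeBytes, limitKb));

        // Prints: Skipping doc asdf bc size 2500 bytes exceeds limit 2 KB
        System.out.println(messageInBytes("asdf", estimatedSizeBytes, limitKb));
    }
}
```

With the estimate left in bytes, the reader can see that 2500 bytes is larger than a 2 KB limit, even though 2500 / 1024 evaluates to 2.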