[ https://issues.apache.org/jira/browse/SOLR-16428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17610619#comment-17610619 ]
ASF subversion and git services commented on SOLR-16428:
--------------------------------------------------------
Commit b1e742ab2e65dcd1128891fe36a00264ce890d3a in solr's branch
refs/heads/branch_9x from Jason Gerlowski
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=b1e742ab2e6 ]
SOLR-16428: Add "permissive" mode to IgnoreLargeDocumentsProcessorFactory
(#1040)
Prior to this commit, IgnoreLargeDocumentProcessorFactory had only a
single way to handle documents that exceeded its configurable size
limit. The first violation would throw a SolrException, in effect
short-circuiting any remaining documents in the "batch" being processed.
This approach is very dependent on the ordering of docs within a
batch. A 100-doc batch with only 1 "size offender" might index 99 docs
or none at all, depending on where the offender is in the list. That
works well for end users whose clients are built to handle the
resulting 400 response and resubmit their whole batch, but it's not
ideal for every use case.
This commit introduces an alternate approach for handling these
violations: log the ID and size of the offending doc and skip it,
without throwing an exception that would short-circuit the remainder
of the batch.
The desired error handling can be chosen using the URP's new config
parameter 'permissiveMode'. When false (the default), the legacy
behavior of short-circuiting the batch and surfacing a 400 error is
used. When true, the new "just log things out and continue" behavior
is used.
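The difference between the two modes can be sketched in plain Java.
This is a simplified simulation and not the code from the commit: the
class PermissiveModeSketch, the Doc and DocTooLargeException types, the
processBatch helper, and the use of KB as the size unit are all
hypothetical names and assumptions chosen for illustration. Only the
'permissiveMode' flag and the overall behavior come from the
description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the URP's behavior; none of these types come from Solr.
class PermissiveModeSketch {

    /** Minimal document model: an ID plus an estimated size (KB is an assumed unit). */
    static final class Doc {
        final String id;
        final long sizeKb;

        Doc(String id, long sizeKb) {
            this.id = id;
            this.sizeKb = sizeKb;
        }
    }

    /** Plays the role of the SolrException that surfaces as a 400 response. */
    static final class DocTooLargeException extends RuntimeException {
        final List<String> indexedBefore; // docs accepted before the offender was reached

        DocTooLargeException(String message, List<String> indexedBefore) {
            super(message);
            this.indexedBefore = indexedBefore;
        }
    }

    /** Returns the IDs of the docs that were accepted, in batch order. */
    static List<String> processBatch(List<Doc> batch, long limitKb, boolean permissiveMode) {
        List<String> indexed = new ArrayList<>();
        for (Doc doc : batch) {
            if (doc.sizeKb > limitKb) {
                String message = "Doc " + doc.id + " is " + doc.sizeKb
                        + " KB, over the " + limitKb + " KB limit";
                if (!permissiveMode) {
                    // Legacy behavior: the first violation aborts the rest of the batch.
                    throw new DocTooLargeException(message, indexed);
                }
                // Permissive behavior: log the offender, skip it, keep going.
                System.err.println("Skipping: " + message);
                continue;
            }
            indexed.add(doc.id);
        }
        return indexed;
    }

    public static void main(String[] args) {
        List<Doc> batch = List.of(new Doc("a", 10), new Doc("big", 5000), new Doc("c", 20));

        System.out.println("permissive: " + processBatch(batch, 1024, true));

        try {
            processBatch(batch, 1024, false);
        } catch (DocTooLargeException e) {
            System.out.println("strict: rejected after indexing " + e.indexedBefore);
        }
    }
}
```

In the strict case, how many docs get through depends entirely on
where the oversized doc sits in the batch, which is the ordering
sensitivity described above. In the permissive case, every doc under
the limit is accepted regardless of position.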
> IgnoreLargeDocumentsProcessorFactory should have a "permissive" mode
> --------------------------------------------------------------------
>
> Key: SOLR-16428
> URL: https://issues.apache.org/jira/browse/SOLR-16428
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: UpdateRequestProcessors
> Affects Versions: 9.0, main (10.0)
> Reporter: Jason Gerlowski
> Assignee: Jason Gerlowski
> Priority: Minor
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> IgnoreLargeDocumentProcessorFactory has only a single way to handle documents
> that exceed its configurable size limit. The first violation throws a
> SolrException, in effect short-circuiting any remaining documents in the
> "batch" and returning a 400 to the user.
> This is great for end users whose clients are built to handle the resulting
> 400 response, and who can modify and resubmit the batch. But it's not ideal
> for every use-case, especially where "best-effort" indexing is good enough.
> This ticket proposes adding a new "permissive" mode of handling too-large
> documents to ILDPF. Under this new mode, "too-large" documents will be logged
> (and not indexed), but won't cause the entire batch to be aborted or to error out.