[
https://issues.apache.org/jira/browse/SOLR-16363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17623876#comment-17623876
]
Aman Verma commented on SOLR-16363:
-----------------------------------
Hi [~krisden], thanks for taking this up. We are using a custom
UpdateRequestHandler. In our case the incoming field was a solr.TextField whose
index analyzer URL-decodes the string. The decoding failed because the inbound
string sometimes looked like "cmd.exe /c echo jj1 >
%SystemDrive%\temp\jjj.txt". (URLDecoder rejects that input, which is a bug in
our implementation; the resulting stack trace, however, was misleading.) When
/update received this field with a string that could not be URL-decoded,
addDocs() failed.
As our bug fix, we ended up sanitizing the inbound string along the following
lines (i.e., escaping whatever could break URL decoding).
{code:java}
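// Escape any '%' that is not followed by two hex digits.
String s2 = s.replaceAll("%(?![0-9a-fA-F]{2})", "%25");
// Escape '+' so URLDecoder does not turn it into a space.
s2 = s2.replaceAll("\\+", "%2B");
String decodedText = URLDecoder.decode(s2, "utf-8");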
{code}
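For anyone who wants to try the workaround in isolation, here is a minimal, self-contained sketch of the same idea. The class and method names (LenientUrlDecoder, decodeLeniently) are hypothetical and are not part of Solr or of our actual filter; only the two replaceAll calls and URLDecoder.decode come from the snippet above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class LenientUrlDecoder {

    // Hypothetical helper (not a Solr API): pre-escapes input so that
    // URLDecoder.decode does not throw on stray '%' characters.
    public static String decodeLeniently(String s) {
        // Escape any '%' that is not followed by two hex digits.
        String escaped = s.replaceAll("%(?![0-9a-fA-F]{2})", "%25");
        // Escape '+' so URLDecoder does not turn it into a space.
        escaped = escaped.replaceAll("\\+", "%2B");
        try {
            return URLDecoder.decode(escaped, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The problematic input from this comment passes through unchanged.
        String raw = "cmd.exe /c echo jj1 > %SystemDrive%\\temp\\jjj.txt";
        System.out.println(decodeLeniently(raw));
        // A valid escape is still decoded, while a literal '+' is kept.
        System.out.println(decodeLeniently("a%20b+c"));
    }
}
```

One caveat of this approach: a "%" that happens to be followed by two hex digits is still treated as an escape, so input that is not URL-encoded but contains such a sequence will be altered.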
> DirectUpdateHandler2 should not throw UnknownFormatConversionException
> ----------------------------------------------------------------------
>
> Key: SOLR-16363
> URL: https://issues.apache.org/jira/browse/SOLR-16363
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: UpdateRequestProcessors
> Affects Versions: 9.0, 8.11.2, 9.1
> Reporter: Aman Verma
> Assignee: Kevin Risden
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The code that handles IllegalArgumentException while adding a doc to Solr is
> linked below:
> [https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.2/solr/core/src/java/org/apache/solr/update/DirectUpdateHandler2.java#L249|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.2/solr/core/src/java/org/apache/solr/update/DirectUpdateHandler2.java#L250]
> EDIT: Adding the corresponding code for the current main branch:
> [https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/update/DirectUpdateHandler2.java#L314]
>
> This can be problematic if an IllegalArgumentException (of the following form)
> is thrown while processing docs (in my case it came from filters that
> URL-decode a string):
>
> {code:java}
> java.lang.IllegalArgumentException: URLDecoder: Illegal hex characters in
> escape (%) pattern - Error at index 0 in: "sy"{code}
> The _iae.getMessage()_ in this case contains "{*}%){*}", which String.format
> interprets as a format specifier, so it in turn throws
> {code:java}
> java.util.UnknownFormatConversionException: Conversion = ')'
> at java.util.Formatter.checkText(Unknown Source) ~[?:?]
> at java.util.Formatter.parse(Unknown Source) ~[?:?]
> at java.util.Formatter.format(Unknown Source) ~[?:?]
> at java.util.Formatter.format(Unknown Source) ~[?:?]
> at java.lang.String.format(Unknown Source) ~[?:?]
> at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:249){code}
> This particular exception is not caught, so BAD_REQUEST was never returned to
> the client along with the point of failure in the chain.
> The ticket proposes making this more robust, i.e., in this particular
> situation either "%" could be escaped in getMessage() via replaceAll, or
> perhaps another try block could be added?
>
>
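The String.format failure described in the issue can be reproduced outside Solr. The sketch below assumes, as the stack trace suggests, that the exception message ends up inside the format string itself; the class name, method names, and the wording of the format string are illustrative and are not copied from DirectUpdateHandler2. It also shows one possible remedy, which is to pass the message as a "%s" argument so it is never parsed for format specifiers.

```java
import java.util.Locale;
import java.util.UnknownFormatConversionException;

public class FormatMessageDemo {

    // Message text taken from the issue description.
    public static final String IAE_MESSAGE =
        "URLDecoder: Illegal hex characters in escape (%) pattern - Error at index 0 in: \"sy\"";

    // Assumed failure mode: the message is concatenated into the format
    // string, so the "%)" inside it is parsed as a format specifier.
    public static String concatenated(String docId, String message) {
        return String.format(Locale.ROOT,
            "Exception writing document id %s to the index; possible analysis error: " + message,
            docId);
    }

    // One possible remedy: pass the message as an argument, so its
    // contents are inserted verbatim and never parsed.
    public static String asArgument(String docId, String message) {
        return String.format(Locale.ROOT,
            "Exception writing document id %s to the index; possible analysis error: %s",
            docId, message);
    }

    public static void main(String[] args) {
        try {
            concatenated("doc1", IAE_MESSAGE);
        } catch (UnknownFormatConversionException e) {
            System.out.println("String.format failed: " + e);
        }
        System.out.println(asArgument("doc1", IAE_MESSAGE));
    }
}
```

Passing the message as an argument avoids having to escape "%" by hand, which is the alternative suggested in the issue text.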