[ 
https://issues.apache.org/jira/browse/SOLR-18191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18079651#comment-18079651
 ] 

Renato Haeberli commented on SOLR-18191:
----------------------------------------

[~malliaridis]  Do we need it in 10.x, or shall we just merge it into 11?

> Disable lenient json parsing at API level
> -----------------------------------------
>
>                 Key: SOLR-18191
>                 URL: https://issues.apache.org/jira/browse/SOLR-18191
>             Project: Solr
>          Issue Type: Improvement
>          Components: JSON Request API, v2 API
>    Affects Versions: 10.0
>            Reporter: Christos Malliaridis
>            Priority: Major
>              Labels: V2, api, json, pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The current JSON parsing is lenient and allows fields like configset or 
> collection values during creation to be numbers instead of strings. This is 
> problematic for cases where resources are auto-populated with numeric IDs.
> h3. Problematic Use Cases
> The following use cases for creating a configset (they apply to other 
> resources as well) illustrate why lenient input is problematic:
> ||Numeric Input||Example Value||API Response||Notes||
> |Normal Numbers|12345|200 _created_|Simple use case that would make sense; the number is converted to a string in the background|
> |Prefixed / Padded Numbers|012345|500 error: _Invalid numeric value: Leading zeroes not allowed_|Leading zeros are allowed in strings, but not in numbers (number parsing issue)|
> |Prefixed / Padded Numbers as Strings|"012345"|200 _created_|The same number that caused the error before works if provided as a string|
> |Negative Numbers|-12345|400 _Invalid configset: [-12345]. configset names must consist entirely of periods, underscores, hyphens, and alphanumerics as well as not start with a hyphen._|The negative number is converted to a string and fails because it starts with a hyphen|
> |Extremely large numbers|99999...|500 _Number value length (6720) exceeds the maximum allowed (1000, from `StreamReadConstraints.getMaxNumberLength()`)_|Large numbers fail to be parsed as numbers first|
> |Extremely large numbers as strings|"9999..."|200 _created_|Large numbers that previously failed succeed when provided as strings|
> These use cases show that the conversion of numbers to strings often fails 
> and is inconsistent when lenient input is used.
> h3. Proposed Solution
> The solution proposed here is to disable this json parsing behavior in 3 
> steps, of which the first two may be skipped if we consider this a bug. 
> Otherwise, since this is an API change, we should treat it as a breaking 
> change and migrate as proposed below:
>  # Optional: Introduce a flag for enabling the lenient input that is enabled 
> by default in the next 10.x version
>  # Optional: Change the default value from enabled to disabled in the 10.x+1 
> version
>  # Remove the lenient input configuration for json parsing and require 
> correct input types in 11.0
> h3. Documentation Changes
> We should also document the expected types and the validation rules that 
> apply, so that it is clear which inputs are valid in each case.
> h3. Possible Issues
> If we change the behavior for lenient input parsing globally (in case one 
> parser is used for all inputs), we should check which endpoints / fields are 
> affected by that, and document at endpoint level if specific fields are 
> "unexpectedly changing".
> Note also that a global change affects inputs that are expected to be 
> numbers but are provided as strings.
> h3. What is not considered here
> During the tests I noticed that large numbers / large strings as inputs for 
> configset names or collection names can also become problematic when 
> exceeding 256 characters. This however should not yet be considered here, as 
> it is another validation issue that should be addressed separately.
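
The difference between lenient and strict handling described under "Problematic Use Cases" can be sketched roughly as follows. This is only an illustration using Python's standard json module, not Solr's actual parser, so the error types and messages differ from the API responses in the table. The helper read_name, the field name "name", and the lenient parameter are hypothetical; the naming rule is taken from the 400 response quoted above.

```python
import json
import re

# Naming rule quoted in the 400 response: periods, underscores, hyphens,
# and alphanumerics only, and the name must not start with a hyphen.
NAME_PATTERN = re.compile(r"(?!-)[A-Za-z0-9._-]+")


def read_name(body: str, field: str = "name", lenient: bool = False) -> str:
    """Extract a resource name from a JSON request body (hypothetical helper)."""
    # Malformed JSON such as a number with leading zeros fails here,
    # before any type handling takes place.
    value = json.loads(body)[field]

    # Lenient mode: silently coerce numbers to strings.
    # bool is a subclass of int in Python, so it is excluded explicitly.
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if lenient and is_number:
        value = str(value)

    # Strict mode (and anything that is still not a string): reject.
    if not isinstance(value, str):
        raise TypeError(f"{field} must be a JSON string, got {type(value).__name__}")

    if not NAME_PATTERN.fullmatch(value):
        raise ValueError(f"Invalid name: [{value}]")
    return value
```

With lenient=True, a body of {"name": 12345} is accepted, {"name": -12345} fails only at the naming rule, and {"name": 012345} fails while parsing. With the default strict mode, every numeric value is rejected with the same type error, and "012345" as a string is accepted as before.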



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
