Hello, Oleksandr.
This deserves a JIRA issue; please raise one.

On Tue, Oct 15, 2019 at 8:17 PM Oleksandr Drapushko <drapus...@gmail.com>
wrote:

> Hello Community,
>
> I've discovered a data loss bug and couldn't find any mention of it. Please
> confirm that this bug hasn't been reported yet.
>
>
> Description:
>
> If you update non-pre-analyzed fields in a document using atomic
> updates, data in pre-analyzed fields (if there is any) will be lost. The
> bug was discovered in Solr 8.2 and 7.7.2.
>
>
> Steps to reproduce:
>
> 1. Index this document into techproducts
> {
>   "id": "a",
>   "n_s": "s1",
>   "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
> }
>
> 2. Query the document
> {
>   "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
>       {
>         "id":"a",
>         "n_s":"s1",
>         "pre":"Alaska",
>         "_version_":1647475215142223872}]
>   }}
>
> 3. Update using atomic syntax
> {
>   "add": {
>     "doc": {
>       "id": "a",
>       "n_s": {"set": "s2"}
>     }
>   }
> }
>
> 4. Observe the warning in the Solr log
> UI:
> WARN  x:techproducts_shard2_replica_n6  PreAnalyzedField  Error parsing
> pre-analyzed field 'pre'
>
> solr.log:
> WARN  (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8
> x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing
> pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type
> java.lang.String, expected Map
> at org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)
>
> 5. Query the document again
> {
>   "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
>       {
>         "id":"a",
>         "n_s":"s2",
>         "_version_":1647475461695995904}]
>   }}
>
> Result: There is no 'pre' field in the document anymore.
>
>
> My thoughts on it:
>
> 1. Data loss could be prevented if the warning were replaced with an error
> (by re-throwing the exception). Atomic updates for such documents still
> wouldn't work, but the updates would be explicitly rejected.
>
> 2. Solr tries to read the document from the index, merge it with the input
> document, and re-index the result. But the value it reads back for a
> pre-analyzed field is in a different format from the one the parser
> expects, so Solr cannot parse and re-index those fields properly.
>
>
> Thank you,
> Oleksandr
>
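
To make point 2 above concrete, here is a toy Python model of the
read-merge-reindex cycle. It is only a sketch and not Solr's code:
parse_pre_analyzed, atomic_update and the strict flag are hypothetical
names, and the parser is assumed to accept nothing but a JSON object. It
shows how swallowing the parse error drops the field, while re-throwing
(point 1) rejects the update and leaves the stored document alone.

```python
import json


def parse_pre_analyzed(value):
    """Hypothetical stand-in for a JSON pre-analyzed parser: JSON objects only."""
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {value!r}") from exc
    if not isinstance(parsed, dict):
        raise ValueError(f"invalid JSON type {type(parsed).__name__}, expected object")
    return parsed


def atomic_update(stored_doc, update, pre_analyzed_fields, strict=False):
    """Merge 'set' operations into a copy of the stored doc, then re-parse
    the pre-analyzed fields as a re-index step would (assumed behaviour)."""
    merged = dict(stored_doc)
    for field, op in update.items():
        merged[field] = op["set"] if isinstance(op, dict) else op
    for field in pre_analyzed_fields:
        if field not in merged:
            continue
        try:
            parse_pre_analyzed(merged[field])
        except ValueError:
            if strict:
                raise  # reject the whole update; stored_doc is untouched
            del merged[field]  # warn-and-skip: the field silently disappears
    return merged


# The stored value is the plain string from step 2, not the original JSON.
stored = {"id": "a", "n_s": "s1", "pre": "Alaska"}
update = {"id": "a", "n_s": {"set": "s2"}}

lenient = atomic_update(stored, update, ["pre"])  # 'pre' is dropped here
```

With strict=True the same call raises ValueError instead of returning a
document without 'pre', which corresponds to the "reject explicitly"
behaviour suggested in point 1.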


-- 
Sincerely yours
Mikhail Khludnev
