https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=38101

--- Comment #20 from Tomás Cohen Arazi (tcohen) <[email protected]> ---
Created attachment 175261
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=175261&action=edit
Bug 38101: Make ES indexer split big fields into chunks

This patch makes the `_process_mappings()` method split index values into
chunks when their UTF-8 encoding exceeds the allowed maximum of 32766 bytes.
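
For illustration only, the general idea of splitting a value by encoded
byte length can be sketched as follows. This is not the Perl code from the
attachment: it is a minimal Python sketch, the name `split_by_utf8_bytes`
is hypothetical, and it assumes the only requirements are that each chunk
stays within the limit and that no multi-byte character is cut in half.

```python
MAX_TERM_BYTES = 32766  # limit quoted in the Elasticsearch error message


def split_by_utf8_bytes(value, max_bytes=MAX_TERM_BYTES):
    """Split value into pieces whose UTF-8 encoding fits in max_bytes each.

    Hypothetical helper for illustration; not part of Koha.
    """
    if max_bytes < 4:
        # A single UTF-8 character can take up to 4 bytes.
        raise ValueError("max_bytes must be at least 4")
    data = value.encode("utf-8")
    chunks = []
    start = 0
    while start < len(data):
        end = min(start + max_bytes, len(data))
        # If the cut lands on a continuation byte (0b10xxxxxx), back up
        # so the whole character moves to the next chunk.
        while end < len(data) and (data[end] & 0xC0) == 0x80:
            end -= 1
        chunks.append(data[start:end].decode("utf-8"))
        start = end
    return chunks
```

Joining the returned chunks gives back the original value, so no text is
lost by splitting; how the real patch picks chunk boundaries may differ.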

To test:
1. Have KTD running with ES:
   $ ktd --proxy --es7 up -d
2. Perform a search
3. Open the first result for editing
4. Find a cool Wiki page with lots of paragraphs
5. Copy all of the paragraphs and paste them into a 500$a field of the record.
6. Repeat 2
=> FAIL: The record is not found
7. Reindex manually:
   $ ktd --shell
  k$ perl misc/search_tools/rebuild_elasticsearch.pl --biblios --where "biblionumber=3" -v -v
=> FAIL: You get something like:
```
[22229] Committing final records...
One or more ElasticSearch errors occurred when indexing documents at
/kohadevbox/koha/Koha/SearchEngine/Elasticsearch/Indexer.pm line 148.
[22229] There were errors during indexing
Record #3 Document contains at least one immense term in field="note.raw"
(whose UTF8 encoding is longer than the max length 32766), all of which were
skipped.  Please correct the analyzer to not produce such terms.  The prefix of
the first immense term is: '[10, 109, 117, 115, 116, 97, 102, 97, 32, 102, 117,
101, 32, 101, 108, 32, 115, 101, 103, 117, 110, 100, 111, 32, 104, 105, 106,
111, 32, 100]...', original message: bytes can be at most 32766 in length; got
32771 (illegal_argument_exception) : max_bytes_length_exceeded_exception (bytes
can be at most 32766 in length; got 32771)
[22229] Total 1 records indexed
```
8. Apply this patch
9. Repeat 7
=> SUCCESS: No error!
10. Repeat 2
=> SUCCESS: The record is indexed and can be found!
11. Sign off :-D

Signed-off-by: Nick Clemens <[email protected]>
