[ https://issues.apache.org/jira/browse/OAK-5519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Klimetschek updated OAK-5519:
---------------------------------------
    Description: 
If text extraction fails (for example, on a malformed PDF), a blob cannot be 
found in the datastore, or any other error outside the scope of the indexer 
occurs while indexing a single item from the repository, the complete indexing 
(lane) currently halts. Thus one broken item (which may not matter to users at 
all) can block the indexing of other, new content (which might be important to 
users), and fixing it always requires manual intervention.

Instead, the item could be recorded in a known issue list, proper warnings 
issued, and indexing could continue. Maintenance operations should be available 
to reindex these items once the issue is fixed, or the indexer could 
automatically retry after some time.

  was:
If text extraction fails (for example, on a malformed PDF), a blob cannot be 
found in the datastore, or any other error outside the scope of the indexer 
occurs while indexing a single item from the repository, the complete indexing 
(lane) currently halts.

Instead, the item should be recorded in a known issue list, proper warnings 
issued, and indexing should continue. Maintenance operations should be 
available to reindex these items once the issue is fixed, or the indexer could 
automatically retry after some time.


> Skip problematic binaries instead of blocking indexing
> ------------------------------------------------------
>
>                 Key: OAK-5519
>                 URL: https://issues.apache.org/jira/browse/OAK-5519
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: query
>            Reporter: Alexander Klimetschek
>
> If text extraction fails (for example, on a malformed PDF), a blob cannot be 
> found in the datastore, or any other error outside the scope of the indexer 
> occurs while indexing a single item from the repository, the complete 
> indexing (lane) currently halts. Thus one broken item (which may not matter 
> to users at all) can block the indexing of other, new content (which might be 
> important to users), and fixing it always requires manual intervention.
> Instead, the item could be recorded in a known issue list, proper warnings 
> issued, and indexing could continue. Maintenance operations should be 
> available to reindex these items once the issue is fixed, or the indexer 
> could automatically retry after some time.
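
The behaviour proposed above (record the failing item, warn, continue, retry
later) could be sketched roughly as follows. This is only an illustration under
stated assumptions: `ItemIndexer` and `SkippingIndexer` are hypothetical names
invented for this example, items are identified by plain path strings, and none
of it reflects the actual Oak indexer API or implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;

/** Hypothetical callback that indexes one item; not part of the Oak API. */
interface ItemIndexer {
    void index(String path) throws Exception;
}

/** Illustrative wrapper: skips failing items instead of halting the whole run. */
class SkippingIndexer {
    private static final Logger LOG = Logger.getLogger(SkippingIndexer.class.getName());

    private final ItemIndexer delegate;
    // Known issue list: path of the failed item -> reason for the failure.
    private final Map<String, String> knownIssues = new LinkedHashMap<>();

    SkippingIndexer(ItemIndexer delegate) {
        this.delegate = delegate;
    }

    /** Indexes every path; returns how many items were indexed successfully. */
    int indexAll(List<String> paths) {
        int indexed = 0;
        for (String path : paths) {
            if (tryIndex(path)) {
                indexed++;
            }
        }
        return indexed;
    }

    /** Maintenance operation: retry the remembered items, e.g. after a fix. */
    int retryKnownIssues() {
        // Iterate over a copy of the keys, because tryIndex modifies the map.
        return indexAll(new ArrayList<>(knownIssues.keySet()));
    }

    /** Read-only view of the items that are currently skipped. */
    Map<String, String> knownIssues() {
        return Collections.unmodifiableMap(knownIssues);
    }

    private boolean tryIndex(String path) {
        try {
            delegate.index(path);
            knownIssues.remove(path); // the item is healthy (again)
            return true;
        } catch (Exception e) {
            // Remember the item, warn, and let the caller carry on.
            knownIssues.put(path, String.valueOf(e.getMessage()));
            LOG.warning("Skipping " + path + ": " + e.getMessage());
            return false;
        }
    }
}
```

In such a design, `retryKnownIssues()` could be called by an administrator once
the underlying problem is fixed, or from a timer for the automatic retry; which
of the two is preferable is left open by the issue.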



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
