Hi all,

Today we noticed a possible problem with the automatic reindexing of
the database of one of our projects, and we'd like some recommendations
on how to avoid this problem in the future.


First a problem description:
From the user's perspective, search requests suddenly stopped
returning results that were in fact contained in the database.

After a long investigation, we found out that at the root of the
problem was a searchable expression comparing a node to an explicit
xs:date similar to this:

  .//meta-834:dob[. eq xs:date("1958-05-22")]

We noticed that removing the xs:date cast helped, and the search
would return results. This pointed us to the range element indexes.

For this request a range element index of type date was configured.
The logs confirmed that this index is applied:

2017-07-05 08:31:47.434 Info:
/MarkLogic/appservices/search/search-impl.xqy at 7:106: Comparison
contributed date range value constraint: meta-834:dob =
xs:date("1958-05-22")
2017-07-05 08:31:47.434 Info:
/MarkLogic/appservices/search/search-impl.xqy at 7:93: Step 1
predicate 1 contributed 1 constraint: . eq xs:date("1958-05-22")

But it just didn't return the result on this system, while our
parallel UAT environment (with the same installation, database
settings, and data) did.
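
For anyone who wants to reproduce the comparison, here is a minimal
sketch of the kind of check that shows the discrepancy. It is not the
exact query we ran. The namespace URI is a placeholder (the real one is
not shown in this mail), and the second expression is filtered, so it
may be slow on a large database.

```xquery
xquery version "1.0-ml";

(: Placeholder URI: substitute the real namespace bound to meta-834. :)
declare namespace meta-834 = "http://example.com/meta-834";

let $q := cts:element-range-query(
            xs:QName("meta-834:dob"), "=", xs:date("1958-05-22"))
return (
  (: Count resolved from the indexes only, using the date range index. :)
  xdmp:estimate(cts:search(fn:doc(), $q)),

  (: Count obtained by comparing the string value stored in the
     documents, without going through the date range index. :)
  fn:count(fn:collection()[.//meta-834:dob = "1958-05-22"])
)
```

If the first number is lower than the second, the date range index is
missing entries that are present in the documents.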

Checking with cts:element-values:
  cts:element-values(xs:QName("meta-834:dob"))
we saw that some return values were apparently of type text while
others (the newly loaded data) returned dates.
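
A sketch of how the type of each returned value can be listed, assuming
the same placeholder namespace URI as in any other example (the real
URI is not shown in this mail):

```xquery
xquery version "1.0-ml";

(: Placeholder URI: substitute the real namespace bound to meta-834. :)
declare namespace meta-834 = "http://example.com/meta-834";

(: For every value in the lexicon, print its runtime type, the value
   itself, and how often it occurs. :)
for $v in cts:element-values(xs:QName("meta-834:dob"))
return fn:concat(
  fn:string(xdmp:type($v)), "  ",
  fn:string($v), "  ",
  fn:string(cts:frequency($v)))
```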

Fix:
We concluded that the index had probably become inconsistent.
Therefore, we reindexed the database manually, after which the search
worked in all cases again.


Core questions:
We currently do not understand how the index got out of sync and
apparently returned inconsistent results. The reindexer was always
enabled for the database in question. We thought that with such a
setup there should be no need for an explicit reindex at all. Did we
overlook an important maintenance task or setting? Could we simply
have made a common mistake somewhere?

Also, when is a manual reindex necessary at all? Looking at the
Indexing Best Practices guide, it seems that reindexing should
rather be controlled using the reindexer-enabled field.


Our environment:
The settings of our databases are applied initially using the REST
interface ("PUT /manage/v2/forests/{id|name}/properties").
But no changes were made before the problem occurred.
No reindexing was in progress. The reindexer was always enabled.
This particular database is running on MarkLogic Server 8.0-5.5.
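
In case it helps, this is roughly how the reindexer state can be
checked from Query Console. It is only a sketch: the database name
"my-database" is hypothetical, and it assumes that the forest status
reports a "reindexing" element in the forest status namespace.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Assumed namespace of the elements returned by xdmp:forest-status. :)
declare namespace fs = "http://marklogic.com/xdmp/status/forest";

(: Hypothetical database name: replace with the real one. :)
let $db := xdmp:database("my-database")
let $config := admin:get-configuration()
return (
  (: Whether the reindexer is enabled in the database configuration. :)
  admin:database-get-reindexer-enable($config, $db),

  (: Whether each forest of the database currently reports reindexing. :)
  for $forest in xdmp:database-forests($db)
  let $status := xdmp:forest-status($forest)
  return fn:concat(
    xdmp:forest-name($forest), ": reindexing=",
    fn:string($status/fs:reindexing))
)
```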

We couldn't find any hints in the log files of MarkLogic.
From our perspective, the index silently became inconsistent,
resulting in failing searches.

Thanks for your time.

Best regards,
Max
_______________________________________________
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general
