Re: [MarkLogic Dev General] Adding new fields

ville Mon, 13 Oct 2014 23:44:09 -0700

Hi,

The indexes do not include the root element. Unfortunately we need to index an element that exists in all documents - the only thing that differs is the attribute value. (The field index settings are tweaked for specific purposes in each case, as a result of having different full-text search requirements for specific document sets.)

As this is built on top of another product, we need to have the element named as it is there, and the element is found in all documents. When I look at the database status right after adding one such field, I can see that the forests are all reindexing, totalling millions of docs to go. With new tiered hardware this is completed on the order of hours, though it sometimes takes over a day, and with old hardware it took on the order of weeks to add one. Our monitoring also reveals that it really spikes the usable disk bandwidth, so it is definitely working a lot. (My guess is that it selects all the docs with the element, but is not intelligent enough to limit using the attribute value too.)

Indexes that include only elements that can be found in a fraction of documents are not a problem. Is there some indexing option that I can turn on so that ML reindexes only the docs that have a specific attribute value in the given element? Right now it seems only capable of querying the docs that have the element.

This may also be a design issue, but unfortunately I'm unable to make any big changes to the way we do things in the codebase I've inherited.


We're running 7.0-2.3 btw, if that matters.

Ville

------ Original Message ------
From: "Danny Sokolsky" <[email protected]>
To: "MarkLogic Developer Discussion" <[email protected]>
Sent: 14.10.2014 0:41:51
Subject: Re: [MarkLogic Dev General] Adding new fields

Hi Ville,
I don’t know of a way to tell MarkLogic to trust you in this case, and you should not need it to. If you do not have any content to reindex, and if reindexing is enabled, it should not rewrite all of the content. It will query all of the content to see if it needs reindexing, which will not be free but should not be too expensive, but I would not expect a full reindex to happen. In that case you should see some messages in the log about reindexing that database and a little later another message saying it reindexed 0 fragments (in fact, you will see these messages each time the config files change).
You mention your fields are doing includes. I would recommend using paths for your fields instead. Also, make sure your fields are not including the root, as that is almost never the correct way to do it. Are you using 7.0-4 for this? If not, try upgrading.
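A minimal sketch of what a path-based field could look like when created through the Admin API. The database name "Documents", the field name "special-text", and the path /doc/item[@type = 'special'] are all hypothetical placeholders, and whether a predicate on an attribute value is accepted as a field path on a given release is an assumption that should be checked (for example with cts:valid-index-path) before relying on it:

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
    at "/MarkLogic/admin.xqy";

(: Hypothetical names throughout: substitute the real database,
   field name, element, and attribute value. :)
let $config := admin:get-configuration()
let $dbid   := xdmp:database("Documents")

(: One path with weight 1.0; the attribute predicate is an assumption. :)
let $path   := admin:database-field-path("/doc/item[@type = 'special']", 1.0)

(: A field defined by paths rather than by include/exclude elements. :)
let $field  := admin:database-path-field("special-text", $path)

let $config := admin:database-add-field($config, $dbid, $field)
return admin:save-configuration($config)
```

Saving the configuration still causes the database to check whether existing content needs reindexing, as described above.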
-Danny
From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
Sent: Monday, October 13, 2014 12:58 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Adding new fields



Hi,
When developing applications with ML as the database, we need to add new indexes regularly to deliver new features. We often (probably 95% of the time) add new indexes that will not hit any content currently in the database, but that we know eventually will when new content is added.
As we have terabytes / millions of docs of content, these reindex operations can be costly and take considerable time to run.
So finally to the question: given that we're adding a new field that has one include, it seems that ML goes through all documents in the database (the include limits by element and attribute value) - is there a way to tell ML that hey, we know, and we take the responsibility, that the database currently does not have any content that needs to be reindexed, so even though the database-wide "reindexer enable" is on, please do not do any reindexing for this field?
Would it work to toggle reindexer enable off while adding the fields, and then toggle it back on? What about new documents added while the reindexer is off? (We don't have the luxury to stop writes at any given time.)
Ville

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
