Hi Folks,
I'm not up to speed on some of the latest innovations and practices in MarkLogic, and rather than dig and try to figure it all out I figured I'd save a little time and ask.

Here's what I would really like to do. Given mixed content such as:

    <full-name> <surname>James</surname>, <first-name>John</first-name></full-name>

I would like to create a range index on full-name, surname, and first-name without having to create a separate full-name element that contains no sub-elements. That way I can do the following things:

1. Obtain a searchable lexicon of full names for search purposes.
2. Provide a master database and schema from which derivative documents can be extracted that use less granular elements (in this case <full-name> without <surname> and <first-name>), so that I can use an element range index on <full-name> in the master database to analyze and/or normalize any and all variations of <full-name>.
3. Further analyze aberrant <full-name> forms to develop enhanced parsing algorithms that obtain the surname and first name (and, for that matter, middle names, prefixes, and suffixes, as one would have in a real-world scenario rather than in this limited example).

Thanks for any help with this!

Tim Meagher
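
P.S. To make the goal concrete, here is a rough sketch in plain Python (standard library only, no MarkLogic API) of the values I would hope each index could expose for the sample above. The helper names `string_value` and `name_parts` are made up for illustration, and the whitespace collapsing is my own assumption about how the full-name value should be normalized. It says nothing about how MarkLogic actually indexes mixed content.

```python
import xml.etree.ElementTree as ET

# The mixed-content sample from the message above.
SAMPLE = (
    "<full-name> <surname>James</surname>, "
    "<first-name>John</first-name></full-name>"
)


def string_value(elem):
    """Join all descendant text of elem, then collapse runs of whitespace.

    Joining the text nodes mirrors the XPath string value of an element;
    the whitespace collapsing is an extra, assumed normalization step.
    """
    return " ".join("".join(elem.itertext()).split())


def name_parts(xml_text):
    """Return the value each of the three desired indexes would hold."""
    root = ET.fromstring(xml_text)
    return {
        "full-name": string_value(root),
        "surname": string_value(root.find("surname")),
        "first-name": string_value(root.find("first-name")),
    }


parts = name_parts(SAMPLE)
for name, value in parts.items():
    print(f"{name}: {value}")
```

The point is that the full-name value comes from the same element that carries the surname and first-name children, with no duplicate flat element in the document.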
