Hi Anne,

   The usual way to do this is with a range index.  Range indexes contain the 
unique values, conveniently sorted, that occur for a given element (or 
element/attribute) throughout the database.  They are very fast and are 
dynamically updated as content changes so they always reflect the actual set of 
values.  Range indexes are especially useful when the unique set of values is 
small (a category, a country name, etc.).  This makes them well suited to 
populating drop-down menus and the like.

   There are two typical use cases for this scenario, and the code is basically 
the same for both:

1) List all the unique values that exist in the corpus, such as a rating (like 
1-10), country code, publication year, language, etc.  This is useful for 
generating search facets by offering each of the unique values found in a 
search as a further drill-down constraint.

2) Offer a list of choices, some of which may not exist in the content yet.  
This would be useful for a registration form, for example, to populate a list 
of choices (such as the dreaded “where did you hear about us?” question).  As 
you describe, these could be placed in a reference data document that is 
updated as needed.

   Both would be implemented by configuring a range index on the relevant 
element, and then just using cts:element-values() to fetch the sorted values.  
For reference data list documents, you might put them in a named collection and 
use cts:collection-query() to constrain the values if the same element name 
might also appear elsewhere.
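
   As a rough sketch of both cases (not tested against your setup): this 
assumes a string element range index is already configured on myns:country 
using the database's default collation, that the reference documents are in 
the "CountryList" collection you mention, and that the namespace URI below 
is a placeholder for your real one.

```xquery
xquery version "1.0-ml";

(: Hypothetical namespace URI; substitute the one bound to myns in your content. :)
declare namespace myns = "http://example.com/myns";

(: Use case 1: every distinct country value found anywhere in the database.
   Needs an element range index on myns:country. :)
let $all-countries := cts:element-values(xs:QName("myns:country"))

(: Use case 2: only the values that come from the reference data documents,
   by constraining the lexicon lookup to the "CountryList" collection. :)
let $ref-countries :=
  cts:element-values(
    xs:QName("myns:country"),
    (),                                   (: no start value :)
    ("ascending"),                        (: options :)
    cts:collection-query("CountryList")   (: constraining query :)
  )

(: Build the drop-down from the reference list and report how many distinct
   values are actually in use across the whole database. :)
return
  <select name="country" data-values-in-use="{ fn:count($all-countries) }">{
    for $c in $ref-countries
    return <option>{ $c }</option>
  }</select>
```

   If the index is configured with a non-default collation, you would need to 
pass a matching "collation=..." option in the options argument.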

   As for single vs. multiple documents, with a range index it doesn’t really 
matter, since using the index to get a list of values works the same regardless 
of where the values came from (fetching range index values does not touch the 
original document).  So you could place all these ref data values in one big 
document, one document per topic, or many individual documents.

   For this use case the choice of layout has no performance impact, so you 
should choose the structure that’s easiest and least error prone.  There can 
be considerations when you’re using range indexes to optimize searches for 
specific documents.  In that case you generally want only one indexed value per 
document (like a type or category code) because it makes composing multiple 
queries easier.  But if you’re just after the values, it doesn’t matter.

   Hope that helps.

---
Ron Hitchens {[email protected]}  +44 7879 358212

> On Mar 3, 2017, at 12:35 PM, Anne Taylor <[email protected]> wrote:
> 
> Hi all,
>  
> My colleagues and I are looking for a recommendation of how best to store 
> what can be considered semi-static lists in MarkLogic.  These are the kind of 
> lists that would be used to populate the dropdown lists on the front end web 
> site.  For example, a list of countries, languages, and also in our case a 
> fixed list of crops, all of which can be used to associate as tags for a 
> document when it is uploaded.  Is there a standard, accepted way to do this 
> in MarkLogic or generally in XML data modelling?  When a user uploads a new 
> document any associated tags will then be stored in that document. 
>  
> The options we’ve considered are:
>  
> 1.  We have one document which contains lists of the type:
> <myns:countries>
> <myns:country>Afghanistan</myns:country>
> <myns:country>Aland Islands</myns:country>
> <myns:country>Albania</myns:country>
> <etc>
>  
> <myns:languages>
>         <myns:language>Arabic</myns:language>
>         <myns:language>Bambara</myns:language>
>         <myns:language>Bariba</myns:language>
> <etc>
>  
> <myns:crops>
>         <myns:crop>Apple</myns:crop>
>         <myns:crop>Banana</myns:crop>
>         <myns:crop>Cocoa</myns:crop>
>  
> 2. Similar to above, but each type is stored in its own document
>  
> 3.  Each individual item is stored in its own document, and a collection is 
> added to help with filtering the relevant document, so we have a collection 
> “CountryList” and a collection “LanguageList” or similar.  This seems to fit 
> best with the recommendation of “one document is one record” when we try to 
> convert our relational database thinking into document style, but makes lots 
> of very small document fragments.  This also gives each value a related URI.
>  
> The lists may be updated occasionally, but not on a frequent basis.
>  
> We’d find it very useful to hear anyone else’s experience and recommendations.
>  
> Many thanks,
> Anne
>  
> _______________________________________________
> General mailing list
> [email protected] <mailto:[email protected]>
> Manage your subscription at: 
> http://developer.marklogic.com/mailman/listinfo/general 
> <http://developer.marklogic.com/mailman/listinfo/general>
