Re: [MarkLogic Dev General] Hierarchical facet support / examples in MarkLogic?

Jason Hunter Thu, 13 May 2010 23:24:53 -0700

You have a few ways to do this, depending on your requirements.

Option 1: Mark up the documents with elements holding the leaf node values and 
other elements holding the parent values.  This bakes your taxonomy into the 
documents.  Danny explained this one nicely.

Option 2: Mark up the documents with just the (assumed unique) leaf node 
values.  Maintain a separate declarative document with the hierarchy showing 
how the leaf node values fit together.  Perhaps that's more useful.  You'll do 
your query and quickly fetch all the leaf node values, and when you want to 
show facets above the leaf nodes just do some coalescing math.  The performance 
should be good.

As an example, if you're modeling a biological taxonomy, you can quickly find 
the distinct animals matching any query along with a count for each, and then 
if you want to show mammals vs reptiles you walk that list of distinct animal 
matches and use your declarative document to total up how many you have of 
each.  Use the MarkLogic "map" API and I expect this will be very fast, even 
for thousands of distinct animals, which is probably more than you have in 
your case.
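
To make the coalescing math concrete, here is a minimal sketch in plain Python 
rather than XQuery.  It only shows the roll-up logic, not any MarkLogic call.  
The PARENT table, the roll_up name, and the counts are all made up for 
illustration; in practice the leaf counts would come from your query and the 
child-to-parent table from your hierarchy document.

```python
from collections import Counter

# Hypothetical hierarchy document, flattened to child -> parent.
PARENT = {
    "cobra": "snake",
    "python": "snake",
    "snake": "reptile",
    "gecko": "lizard",
    "lizard": "reptile",
    "otter": "mammal",
}

def roll_up(leaf_counts, parent=PARENT):
    """Add each leaf's count to every ancestor of that leaf."""
    totals = Counter()
    for leaf, count in leaf_counts.items():
        node = parent.get(leaf)
        while node is not None:
            totals[node] += count
            node = parent.get(node)
    return dict(totals)

# Made-up leaf counts standing in for the values a query would return.
leaf_counts = {"cobra": 12, "python": 3, "gecko": 5, "otter": 7}
facets = roll_up(leaf_counts)
```

Note that this sums per-leaf counts, so a document tagged with two leaves 
under the same parent would be counted twice at the parent level.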

If you want to limit a query to a certain parent node (e.g. reptiles), you'd 
use an or-query over its leaf nodes.  That's how the thesaurus works in 
essence.  You don't want many thousands of expanded values though.  So...
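
For a modest hierarchy, the expansion step might look like the sketch below 
(plain Python again, logic only).  The CHILDREN table and the leaves_under 
name are hypothetical; the resulting list is what you would feed into the 
or-query, one term per leaf value.

```python
def leaves_under(node, children):
    """Return the leaf values beneath node (node itself if it is a leaf)."""
    kids = children.get(node, [])
    if not kids:
        return [node]
    leaves = []
    for kid in kids:
        leaves.extend(leaves_under(kid, children))
    return leaves

# Hypothetical hierarchy document, flattened to parent -> children.
CHILDREN = {
    "reptile": ["snake", "lizard"],
    "snake": ["cobra", "python"],
    "lizard": ["gecko"],
}

# Leaf values to expand a "reptile" constraint into.
reptile_terms = leaves_under("reptile", CHILDREN)
```

The length of that list is the number of or-query terms, which is why a 
parent with many thousands of leaves becomes a problem.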

Option 3: Put the taxonomy hierarchy into a single string.  Perhaps you'd have 
"reptile/snake/cobra" or something.  This is similar to the option above but 
bakes the hierarchy into the documents again, which is perhaps mentally 
simpler and has some query performance perks.  For any given query you can get 
the distinct list of matching strings and easily do the math (again probably 
using map) for how many results have values starting with reptile vs starting 
with mammal.
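
The math for the path-string variant could be sketched like this (plain 
Python, logic only).  The "/" separator follows the example string above; the 
counts_at_depth name and the sample counts are made up for illustration.

```python
from collections import Counter

def counts_at_depth(path_counts, depth=1):
    """Sum counts by the first `depth` segments of each path string."""
    totals = Counter()
    for path, count in path_counts.items():
        prefix = "/".join(path.split("/")[:depth])
        totals[prefix] += count
    return dict(totals)

# Made-up distinct strings and counts standing in for a query's results.
path_counts = {
    "reptile/snake/cobra": 12,
    "reptile/lizard/gecko": 5,
    "mammal/mustelid/otter": 7,
}

top_level = counts_at_depth(path_counts)            # reptile vs mammal
second_level = counts_at_depth(path_counts, depth=2)
```

The same function gives you facets at any level, with no separate hierarchy 
document to consult.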

You can also then really easily limit your query to "reptile" by using a 
word-query or range-query against this field.  If terms repeat in different 
places you can use an initial anchor word and a phrase search to make sure 
you're left-anchored.
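
To show why left-anchoring matters, here is a small sketch of the matching 
rule itself (plain Python; it is not how MarkLogic evaluates a word or phrase 
query, just the condition you want to hold).  The is_under name and the sample 
paths are hypothetical.

```python
def is_under(path, ancestor):
    """True if path equals ancestor or sits beneath it, on a segment boundary."""
    return path == ancestor or path.startswith(ancestor + "/")

# Made-up values: only the first is really a reptile.
paths = [
    "reptile/snake/cobra",
    "mammal/reptile-eater/mongoose",
    "reptiles-misc/unknown",
]

matches = [p for p in paths if is_under(p, "reptile")]
```

An unanchored match on "reptile" would pick up the second value, and a bare 
string-prefix match would pick up the third.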

If these approaches don't sound suitable, maybe you can give more details about 
your use case, the taxonomy, the performance needs, and the size of your corpus.

If one sounds suitable and you get stuck making it happen, let me know.

And maybe someone else has a good Option 4.  :)

-jh-

On May 13, 2010, at 5:14 PM, Ramon Felciano wrote:

> Hi –
>  
> I just attended my first MarkLogic user conference and liked the demos I saw, 
> especially those that demonstrated the ability to build a faceted search 
> application fully within MarkLogic. I’m looking to build a similar 
> search-and-browse application for a document collection that is organized 
> using tags from a very large hierarchical controlled vocabulary, and would 
> like to use these tags as the basis for the faceted navigation. I was 
> planning to use Lucene/Solr, but am now wondering whether I could do this 
> largely within MarkLogic. I’m getting stuck on how to auto-generate the 
> facets within the UI.
>  
> Are there any examples showing how to dynamically construct *hierarchical* 
> faceted UIs all within ML (e.g. using XQuery)?
>  
> Thanks,
>  
> Ramon
>  
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general

