We are evaluating MarkLogic for indexing and storing content, and have come
across a use case that doesn't seem to map well to the MarkLogic indexing
model.
I'd like to describe the part of our data model that applies to this case and
see whether we're overlooking something.
Our primary requirement for indexing revolves around custom tags that we allow
clients to associate with objects. These custom tags are name/value pairs, and
the values can have various types (string, date, datetime, real, int, etc.).
We need to be able to support fast range queries (that account for data type),
fast ordering, and fast aggregation of distinct values across these tags. Each
of these operations needs to consider the tag name and value and the value's
type.
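To make the three operations concrete, here is a minimal Python sketch of what
we mean. It is an illustration only, not MarkLogic code: the in-memory records,
the tag names ("price", "released"), and the helper functions are all
hypothetical.

```python
from datetime import date

# Hypothetical in-memory tags: (object id, tag name, value type, typed value).
TAGS = [
    (1, "price", "real", 1.5),
    (2, "price", "real", 10.25),
    (3, "price", "real", 2.0),
    (3, "price", "string", "n/a"),   # same tag name, different value type
    (1, "released", "date", date(2011, 1, 1)),
    (2, "released", "date", date(2010, 6, 30)),
]

def values_for(name, vtype):
    """All (object id, value) pairs for one tag name and one value type."""
    return [(oid, v) for oid, n, t, v in TAGS if n == name and t == vtype]

def range_query(name, vtype, low, high):
    """Object ids whose tag value lies in [low, high], compared by type."""
    return sorted(oid for oid, v in values_for(name, vtype) if low <= v <= high)

def ordered(name, vtype):
    """Object ids ordered by the typed tag value."""
    pairs = sorted(values_for(name, vtype), key=lambda pair: pair[1])
    return [oid for oid, _ in pairs]

def distinct_values(name, vtype):
    """Distinct typed values for one tag name and type, in sorted order."""
    return sorted({v for _, v in values_for(name, vtype)})
```

Compared as strings, "10.25" would sort before "2.0", which is why the value's
type has to be part of every one of these operations.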
I believe this would be a nice fit for predefined Range Indexes in MarkLogic
if we had a finite, predetermined set of tag names: we could create a distinct
element for each tag name and predefine a Range Index for each. But since the
set of potential tag names is unlimited, and since a single tag name can be
associated with values of multiple types, we can't really predefine anything.
Based on the documentation we've seen, we might be able to get the
functionality described above using XPath queries against the standard indexes
that MarkLogic builds when importing an XML document. Our concern is that,
without Range Indexes, this would not scale: we need fast performance across a
large number of objects, each of which would have a large number of tags.
Is there some way to work around this with Range Indexes?
An example fragment of data:
<item>
  <tagList>
    <dateTag name="attrName1">20110101</dateTag>
    <stringTag name="attrName2">ABC</stringTag>
    <realTag name="attrName3">1.123</realTag>
    <stringTag name="attrName3">DEF</stringTag>
  </tagList>
</item>
Note: we would need dateTag values to have type date, stringTag values to have
type string, and realTag values to have type real for purposes of filtering,
sorting, etc.
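
As a rough illustration of the typing we have in mind, the following Python
sketch reads the fragment above and converts each value according to its
element name. This is not MarkLogic code. The converter table is hypothetical,
and it assumes that dateTag values use a yyyymmdd layout, as the sample value
suggests.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

FRAGMENT = """
<item>
  <tagList>
    <dateTag name="attrName1">20110101</dateTag>
    <stringTag name="attrName2">ABC</stringTag>
    <realTag name="attrName3">1.123</realTag>
    <stringTag name="attrName3">DEF</stringTag>
  </tagList>
</item>
"""

# Hypothetical mapping from element name to a value converter.
# Assumption: dateTag values are written as yyyymmdd.
CONVERTERS = {
    "dateTag": lambda text: datetime.strptime(text, "%Y%m%d").date(),
    "stringTag": str,
    "realTag": float,
}

def typed_tags(xml_text):
    """Return (tag name, element name, typed value) in document order."""
    root = ET.fromstring(xml_text)
    return [
        (el.get("name"), el.tag, CONVERTERS[el.tag](el.text))
        for el in root.iter()
        if el.tag in CONVERTERS
    ]

tags = typed_tags(FRAGMENT)
```

Note that attrName3 appears twice, once as a real and once as a string, so any
index would have to key on the tag name and the value type together.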
Thanks,
Faron