Thank you, Harry.

Rushabh Mehta
Tata Consultancy Services
Website: http://www.tcs.com

At first glance, I would recommend storing each piece of metadata in its own element. You can always format it as XHTML later if needed. The advantage is in faceting and search: if everything is in a <p> element, it will be hard to make your search specific.

I might also suggest that your content and metadata be stored in one document, perhaps something like...

  <imported-document>
    <content>
      [xml of document that is generated in CPF workflow]
    </content>
    <metadata>
      <source-file>my-file-imported-through-cpf.xlsx</source-file>
      <document-type>xlsx</document-type>
      <country>United States</country>
      <region>Western</region>
      <business>Pharmaceutical Sales</business>
      <source>...</source>
      ...
    </metadata>
  </imported-document>

You can also separate the metadata from the content, but either way, this type of structure supports facets and filters that make your search more powerful.

If your search is ALWAYS going to be keyword-driven and only against the content, then your approach is fine. If you want to be able to develop faceted search, filtering by metadata, etc., then a data structure like the one I illustrated will support that better and make it very easy to do in MarkLogic.

Hope this helps,
Harry

To: [email protected]
From: Rushabh M/HYD/TCS
Date: 06/12/2013 07:30PM
Subject: Text search and facets on Metadata

Hello All,

I am taking my first steps with MarkLogic and working on a POC in which several thousand documents (Word, PDF, Excel, PowerPoint) need to be loaded into MarkLogic along with the necessary metadata (country, region, line of business, market, ...). Users will run a text search on the document content, and the results should display the list of matching documents along with facets based on the metadata.

My approach is to use MarkLogic CPF to convert the Word, PDF, PowerPoint, and Excel files to XHTML documents, and to store the metadata as elements in each XHTML document's properties. In the XHTML, the content is stored in <p> elements. I plan to run an element query on <p> and fetch the associated metadata from the properties document.

Please suggest a better approach if you have one, and let me know how to define constraints on the metadata elements so that facets can be displayed.

Thanks in advance,
Rushabh
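
On the question of constraints for facets: one possible sketch, assuming the metadata ends up in plain, un-namespaced elements such as <country>, <region>, and <business> (as in Harry's example above), is a MarkLogic Search API options node with one range constraint per element. The constraint names here are illustrative, and the sketch assumes that a string element range index with a matching collation has already been configured for each element; the namespace, type, and collation would need adjusting to the actual setup.

```xml
<!-- Sketch only. Assumes string element range indexes already exist on
     country, region, and business, using the collation shown below.
     Constraint names are illustrative choices, not required values. -->
<options xmlns="http://marklogic.com/appservices/search">
  <constraint name="country">
    <range type="xs:string" collation="http://marklogic.com/collation/" facet="true">
      <element ns="" name="country"/>
    </range>
  </constraint>
  <constraint name="region">
    <range type="xs:string" collation="http://marklogic.com/collation/" facet="true">
      <element ns="" name="region"/>
    </range>
  </constraint>
  <constraint name="business">
    <range type="xs:string" collation="http://marklogic.com/collation/" facet="true">
      <element ns="" name="business"/>
    </range>
  </constraint>
</options>
```

Passed as the options argument to search:search, options along these lines should let a query string such as sales country:"United States" combine a keyword search with a metadata filter, and the response should carry facet values and counts for each constraint.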
_______________________________________________ General mailing list [email protected] http://developer.marklogic.com/mailman/listinfo/general
