If you have your own dictionary or list of terms, you could use it with cts:highlight().
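As a rough illustration of the dictionary approach, here is a minimal sketch. The term list, the sample <para> content, and the <entity> wrapper element are all made up for the example; only cts:highlight() and cts:word-query() are real MarkLogic built-ins.

```xquery
xquery version "1.0-ml";

(: Hypothetical term list; in practice this might be read from a document. :)
let $terms := ("Saint Paul", "Minneapolis")

(: Sample content to enrich; the element name is an assumption for this sketch. :)
let $doc :=
  <para>The conference moves from Minneapolis to Saint Paul next year.</para>

return
  cts:highlight(
    $doc,
    cts:word-query($terms, "case-insensitive"),
    (: $cts:text is bound to the matched text for each hit :)
    <entity>{ $cts:text }</entity>
  )
```

This wraps each dictionary match in an <entity> element inline. Note that it is plain term matching: it will not by itself decide whether "St." means "Saint" or "Street".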
http://docs.marklogic.com/cts:highlight

Also, for a free third-party enrichment tool, I like OpenCalais.
http://www.opencalais.com/documentation/calais-web-service-api/api-metadata/entity-index-and-definitions

It identifies Entities and, of more interest, Facts and Events. The service has gotten pretty powerful over time, is free (usage limits: 50k calls/day, 4 reqs/sec), and is accessible through a REST API (xdmp:http-get()).

Bonus: MarkLogic Server ships with a sample Calais pipeline. The code associated with the pipeline can be found at:
\MarkLogic\Modules\MarkLogic\samples\calais-enrich

This code will give you a jump start on calling the service. There's a ton of info in the Calais response and our sample only identifies entities, so hack accordingly. The power of our sample is that it enriches entities inline (awesome), but to do this the code calls the service more than once per document/node. So be aware: with the sample code, 50k Calais API calls != 50k documents processed.

Hope this helps,
Pete

From: [email protected] [mailto:[email protected]] On Behalf Of Abhishek53 S
Sent: Wednesday, November 21, 2012 10:42 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Semantic Analysis Using XQuery

Hi All,

As per my understanding, semantic analysis of content is only possible using a third-party enrichment engine such as TEMIS LUXID (which already has a built-in pipeline in MarkLogic). Can we build some XQuery-based capability (any existing API, research paper, or algorithm) to provide the same functionality? For example:

St. Paul means Saint Paul
Paul st. means Paul street

Any suggestions will be highly appreciated!

Thanks,
Abhishek Srivastav
Tata Consultancy Services
Cell: +91-9883389968
Mailto: [email protected]
Website: http://www.tcs.com
