If you have your own dictionary/list of terms, you could use that with 
cts:highlight()

http://docs.marklogic.com/cts:highlight
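
A minimal sketch of the dictionary approach, assuming a small in-memory term list. The sample paragraph, the term list, and the <term> wrapper element are made up for illustration; only cts:highlight(), cts:word-query(), and the $cts:text variable come from the server API.

```xquery
xquery version "1.0-ml";

(: Hypothetical sample content and term list, for illustration only. :)
let $doc :=
  <para>The parade in Saint Paul ends at Paul Street.</para>
let $terms := ("Saint Paul", "Paul Street")

(: cts:word-query accepts a sequence of strings and matches any of them;
   each matched span is bound to $cts:text inside the third argument. :)
return
  cts:highlight(
    $doc,
    cts:word-query($terms, "case-insensitive"),
    <term>{ $cts:text }</term>
  )
```

In a real application the term list would more likely be loaded from a document in the database than hard-coded as it is here.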

Also, for a free third-party enrichment tool, I like OpenCalais.

http://www.opencalais.com/documentation/calais-web-service-api/api-metadata/entity-index-and-definitions

It identifies entities and, of more interest, facts and events.  The service
has gotten pretty powerful over time, is free (usage limits: 50k calls/day,
4 reqs/sec), and is accessible through a REST API (for example, via
xdmp:http-get()).
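
As a rough sketch of calling a REST service from XQuery, the snippet below issues a GET with xdmp:http-get() and checks the status code. The endpoint URL and the error name are placeholders, not the real Calais endpoint; the actual service may require a different HTTP method, extra headers, and an API key, so check the Calais documentation before adapting this.

```xquery
xquery version "1.0-ml";

(: Placeholder URL: substitute the real service endpoint. :)
let $endpoint := "http://example.com/enrich"

(: xdmp:http-get returns a sequence: item 1 is the response metadata
   element (status code, message, headers), item 2 is the response body. :)
let $response :=
  xdmp:http-get(
    $endpoint,
    <options xmlns="xdmp:http">
      <timeout>30</timeout>
    </options>
  )
return
  if ($response[1]/*:code = 200)
  then $response[2]
  else
    (: ENRICH-FAILED is a made-up error name for this sketch. :)
    fn:error(
      xs:QName("ENRICH-FAILED"),
      fn:concat("HTTP status ", $response[1]/*:code)
    )
```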

Bonus: MarkLogic Server ships with a sample Calais pipeline.  The code 
associated with the pipeline can be found at:

\MarkLogic\Modules\MarkLogic\samples\calais-enrich

This code will give you a jump start with what you need to start calling the 
service.

There's a ton of info in the Calais response, and our sample only identifies
entities, so hack accordingly.

The power of our sample is that it enriches entities inline (awesome), but
to do this the code calls the service more than once per document/node, so
be aware: with the sample code, 50k Calais API calls != 50k documents
processed.

Hope this helps,
Pete


From: [email protected] 
[mailto:[email protected]] On Behalf Of Abhishek53 S
Sent: Wednesday, November 21, 2012 10:42 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Semantic Analysis Using XQuery

Hi All,

As per my understanding, semantic analysis of content is only possible using
a third-party enrichment engine like TEMIS LUXID (which already has a built-in
pipeline in MarkLogic).

Can we build some XQuery-based capability [any existing API/research
paper/algorithm] to provide the same functionality?

e.g.

St. Paul means Saint Paul

Paul st. means Paul street


Any suggestions will be highly appreciated!
Thanks
Abhishek Srivastav
Tata Consultancy Services