Hi Paul,
To search PDF and Office documents, you first need to convert them to XML. Converting them within MarkLogic Server requires a license key that supports the conversion option. You are correct that you can load the documents as binary without the conversion option, but XPath and cts:search are of no use against a binary document (the query will run, but it will not return any results).

You can also use a third-party solution to perform the conversion, either before you load the documents into MarkLogic or by sending them out to a web service after they are loaded.

You can use CPF with any MarkLogic Server license, but the built-in PDF/Word conversion functions require a different license key.

-Danny

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Paul Vanderveen
Sent: Tuesday, November 04, 2008 6:35 AM
To: [email protected]
Subject: [MarkLogic Dev General] PDF Full Text Search

I am able to load PDF documents as binary content and retrieve them, but they are not searchable. Do I need to use CPF to convert them to HTML in order to search them? I had thought that PDF, Word, etc., could be indexed in MarkLogic. We are starting our development with the trial license without CPF features. Do we need to upgrade to enterprise to be able to search PDFs?

-Paul
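As a rough illustration of the third-party route mentioned in the reply above (converting before loading), here is a minimal Python sketch. It assumes the text has already been extracted from the PDF, page by page, by some external tool; the extraction step and the load into MarkLogic are not shown. The function name `pages_to_xml` and the `document`/`page` element names are hypothetical, chosen only for this example, and are not a MarkLogic convention.

```python
import xml.etree.ElementTree as ET


def pages_to_xml(source_uri, pages):
    """Wrap already-extracted page text in a simple XML document.

    source_uri: where the original binary PDF lives (hypothetical URI).
    pages: list of strings, one per page, produced by an external
           PDF text extractor (assumed, not shown here).
    """
    root = ET.Element("document", {"source": source_uri})
    for number, text in enumerate(pages, start=1):
        page = ET.SubElement(root, "page", {"n": str(number)})
        # Collapse runs of whitespace left over from the PDF layout.
        page.text = " ".join(text.split())
    # ElementTree escapes characters such as & and < when serializing.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(pages_to_xml("/pdf/report.pdf", ["First  page\ntext", "Second page & more"]))
```

The resulting XML string could then be loaded alongside the binary PDF, so that searches run against the XML while the original file is still available for retrieval.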
