Hi Paul,

To search PDF and Office documents, you first need to convert them to
XML.  To do that conversion within MarkLogic Server, you need a license
key that supports the conversion option.

You are correct that you can load the documents as binary without the
conversion option, but XPath and cts:search will not return any results
against a binary document (the queries run, they just match nothing).
You can also use a third-party solution to perform the conversion,
either before you load the documents into MarkLogic or by sending them
out to a web service after they are loaded.  You can use CPF with any
MarkLogic Server license, but the built-in PDF/Word conversion
functions require a different license key.
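
As a rough illustration of the "convert before you load" route, here is
a minimal Python sketch.  It assumes some external tool has already
extracted the text of each PDF page; the function name
wrap_extracted_text and the element names "document" and "page" are
made up for this example and are not anything MarkLogic requires.  It
only wraps the text in well-formed XML that could then be loaded as an
XML document instead of a binary one.

```python
import xml.etree.ElementTree as ET


def wrap_extracted_text(source_name, pages):
    """Wrap already-extracted page text in a simple XML document.

    The element and attribute names are hypothetical; pick whatever
    structure suits your own queries.
    """
    root = ET.Element("document", {"source": source_name})
    for number, text in enumerate(pages, start=1):
        page = ET.SubElement(root, "page", {"n": str(number)})
        page.text = text  # ElementTree escapes &, < and > on output
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder text standing in for the output of a PDF text extractor
    print(wrap_extracted_text("report.pdf", ["First page.", "Terms & conditions."]))
```

Once the content is stored as XML rather than binary, XPath and
cts:search have text to match against.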

 

-Danny

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Paul
Vanderveen
Sent: Tuesday, November 04, 2008 6:35 AM
To: [email protected]
Subject: [MarkLogic Dev General] PDF Full Text Search

 

I am able to load PDF documents as binary content and retrieve them, but
they are not searchable.  Do I need to use CPF to convert them to HTML
in order to search them?  I had thought that PDF, Word, etc., could be
indexed in MarkLogic.  We are starting our development with the trial
license without CPF features.  Do we need to upgrade to enterprise to be
able to search PDFs?

 

-Paul

 

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
