In addition to the suggested CPF (Content Processing Framework), if you want to extract the metadata and text yourself, perhaps to create your own searchable proxy document, take a look at xdmp:document-filter:
https://docs.marklogic.com/guide/search-dev/binary-document-metadata

On Mon, Jul 7, 2014 at 9:44 AM, <[email protected]> wrote:
> Hi,
>
> when i tried inserting a pdf into ML, it was stored as a binary.
> But the problem is i am not able to do any text search out of this pdf.
> since text files are searchable i tried inserting the pdf as a text file
> by specifying "auto" encoding in xdmp:document-load. But when i ran a
> search i did not get results.
>
> Please suggest me a way to store pdf files inside ML(possibly using ML
> API's) whose content can be searched the way xml and text files are
> searched.
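
To make the proxy-document idea a bit more concrete, here is a minimal sketch, assuming the PDF has already been loaded as a binary. Both URIs are made up for illustration, and the filter needs to be available on your installation, so treat this as a starting point rather than a tested recipe:

```xquery
xquery version "1.0-ml";

(: Hypothetical URIs, for illustration only. :)
let $pdf-uri   := "/docs/example.pdf"
let $proxy-uri := "/docs/example.pdf.xhtml"

(: xdmp:document-filter extracts metadata and text from the binary
   and returns them as XHTML. This fails if no document exists at
   $pdf-uri. :)
let $filtered := xdmp:document-filter(fn:doc($pdf-uri))

(: Store the XHTML as an ordinary XML document so that it is indexed
   like any other XML content. :)
return xdmp:document-insert($proxy-uri, $filtered)
```

Once that has committed, a separate query along the lines of cts:search(fn:doc(), cts:word-query("some term")) should match the proxy document, and you can map its URI back to the original PDF. The search has to run as its own request, because an update is not visible to a search inside the same statement.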
