Re: [MarkLogic Dev General] searching pdf content

Christopher Hamlin Mon, 07 Jul 2014 07:40:41 -0700

In addition to the suggested CPF (Content Processing Framework) approach,
if you want to extract the metadata and text yourself, perhaps to create
your own searchable proxy document, take a look at xdmp:document-filter:


https://docs.marklogic.com/guide/search-dev/binary-document-metadata
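A minimal sketch of the proxy-document idea, assuming the PDF is already
stored as a binary. Both URIs are hypothetical and chosen only for
illustration. xdmp:document-filter returns XHTML containing the extracted
metadata and text, which can then be stored as an ordinary XML document:

```xquery
xquery version "1.0-ml";

(: Hypothetical URIs, for illustration only. :)
let $pdf-uri   := "/docs/example.pdf"
let $proxy-uri := "/docs/example.pdf.xhtml"

(: Filter the stored binary; the result is XHTML holding the
   extracted metadata and text of the PDF. :)
let $filtered := xdmp:document-filter(fn:doc($pdf-uri))

(: Store the XHTML as a separate, searchable proxy document. :)
return xdmp:document-insert($proxy-uri, $filtered)
```

Once the proxy is inserted, a word search such as
cts:search(fn:doc(), cts:word-query("some term")) should be able to match
it, and the proxy URI can be mapped back to the original binary. The naming
scheme linking the two is an assumption here, not something MarkLogic
imposes.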


On Mon, Jul 7, 2014 at 9:44 AM, <[email protected]> wrote:

>  Hi,
>
> When I tried inserting a PDF into ML, it was stored as a binary, and I
> am not able to run any text search against it. Since text files are
> searchable, I tried inserting the PDF as a text file by specifying "auto"
> encoding in xdmp:document-load, but my search returned no results.
>
> Please suggest a way to store PDF files in ML (possibly using ML APIs)
> so that their content can be searched the way XML and text files are
> searched.
>
>

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
