2009/12/16 Maurizio Pillitu <[email protected]>:
> Hi everyone,
> I'm trying to use the PDFExtractor (using Hippo Repository 1.2.15); I've
> added to my (default) extractors.xml the following:
>
> ....
> <extractor classname="org.apache.slide.extractor.PDFExtractor"
> uri="/files/default.preview/binaries" content-type="application/pdf"/>
> .....
>
> then I dropped a Google Docs generated PDF file (attached) in
> /files/default.preview/binaries (via WebDAV); I see the repository logging
> some interesting bits (attached) as if the extraction process went fine, but
> I can't see the extracted data; I'd have expected a WebDAV property attached
> to the file, but nothing shows up; this is the list of properties related
> with the PDF file (using DAVExplorer)
>
> getlastmodified DAV: Wed, 16 Dec 2009 09:38:35 GMT
> displayname DAV: this_is_my_title.pdf
> modificationdate DAV: 2009-12-16T09:38:35Z
> UID DAV: 96da71317f000001004b0bbb796bcb32
> supportedlock DAV:
> getcontenttype DAV: application/pdf
> getcontentlength DAV: 5078
> resourcetype DAV:
> getcontentlanguage DAV: en
> getetag DAV: ada3fdca64b1fd70a3d7b2ed66b3e68b
> lockdiscovery DAV:
> source DAV:
> creationdate DAV: 2009-12-16T09:38:35Z
>
>
> I feel like I'm missing something on how the PDFExtractor works; I've looked
> for some documentation or specific configurations, but I couldn't find
> anything interesting.
>
> Any hints?
> TIA
>  mau
>
> Met vriendelijke groet,
> --
> Maurizio Pillitu - 0031 (0)615655668

Hey Mau,

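The PDFExtractor doesn't set properties. It's just a full-text indexer
for PDF files, so the extracted text goes into the search index and
never shows up as a WebDAV property on the file.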
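
If you want to check that the text was actually indexed, a full-text
search is the place to look rather than the property list. Below is a
minimal sketch in Python that only builds the body of a WebDAV SEARCH
request (DAV:basicsearch with a DAV:contains clause, as defined in
RFC 5323). It rests on assumptions: that your repository answers SEARCH
requests, and that DAV:contains is served from the index the extractor
feeds. The function name build_contains_search and the search word
"title" are made up for illustration; the scope path is the one from
your extractors.xml.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"


def build_contains_search(scope_href: str, phrase: str) -> bytes:
    """Build a DAV:basicsearch body asking for resources whose
    content contains `phrase` (hypothetical helper, not a Hippo API)."""
    ET.register_namespace("D", DAV_NS)

    def dav(name: str) -> str:
        # Qualify an element name with the DAV: namespace.
        return "{%s}%s" % (DAV_NS, name)

    root = ET.Element(dav("searchrequest"))
    basic = ET.SubElement(root, dav("basicsearch"))

    # Which properties to return for each hit.
    select = ET.SubElement(basic, dav("select"))
    prop = ET.SubElement(select, dav("prop"))
    ET.SubElement(prop, dav("displayname"))

    # Where to search: the collection the extractor is configured for.
    from_el = ET.SubElement(basic, dav("from"))
    scope = ET.SubElement(from_el, dav("scope"))
    ET.SubElement(scope, dav("href")).text = scope_href
    ET.SubElement(scope, dav("depth")).text = "infinity"

    # Full-text condition on the resource content.
    where = ET.SubElement(basic, dav("where"))
    ET.SubElement(where, dav("contains")).text = phrase

    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    body = build_contains_search("/files/default.preview/binaries", "title")
    print(body.decode("utf-8"))
```

The printed XML would be sent as the body of a SEARCH request with
Content-Type text/xml using whatever WebDAV client you prefer. If the
PDF comes back as a hit for a word that only occurs inside the
document, the extraction worked.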


Jasha Joachimsthal

[email protected] - [email protected]

www.onehippo.com
Amsterdam - Hippo B.V. Oosteinde 11 1017 WT Amsterdam +31(0)20-5224466
San Francisco - Hippo USA Inc. 185 H Street, suite B, Petaluma CA
94952 +1 (707) 7734646
********************************************
Hippocms-dev: Hippo CMS development public mailinglist

Searchable archives can be found at:
MarkMail: http://hippocms-dev.markmail.org
Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
