Jasha,

It works! =]

Thanks!

Wilson
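
For readers finding this thread in the archives: once the extractor quoted below writes publicatieDatum on the PDFs, the sort can be expressed in the DASL query itself. What follows is only a sketch of a DAV:basicsearch request with a DAV:orderby clause, as defined for WebDAV SEARCH. The scope path /files/default.preview and the selected properties are assumptions for illustration, and which parts of basicsearch the Hippo repository accepts has not been verified here. The property name and namespace must match the extractor configuration exactly.

```xml
<!-- Sketch only: scope path and selected properties are assumed, not taken from a real project -->
<d:searchrequest xmlns:d="DAV:" xmlns:cms="http://hippo.nl/cms/1.0">
  <d:basicsearch>
    <d:select>
      <d:prop>
        <d:displayname/>
        <cms:publicatieDatum/>
      </d:prop>
    </d:select>
    <d:from>
      <d:scope>
        <d:href>/files/default.preview</d:href>
        <d:depth>infinity</d:depth>
      </d:scope>
    </d:from>
    <!-- Newest first; the same property name must exist on news, Word and PDF items -->
    <d:orderby>
      <d:order>
        <d:prop><cms:publicatieDatum/></d:prop>
        <d:descending/>
      </d:order>
    </d:orderby>
  </d:basicsearch>
</d:searchrequest>
```

Because the value is stored in the fixed-width yyyyMMddHHmm format, plain string ordering of the property matches chronological order.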


2008/7/16 Jasha Joachimsthal <[EMAIL PROTECTED]>:

> What about using the HippoLastmodifiedExtractor [1]? I haven't tested it,
> but I guess it would fit your needs.
>
> [1]
> http://hippocms.org/display/CMS/4.+Hippo+Repository+Configure+Extractors#4.HippoRepositoryConfigureExtractors-l.hippo.slide.extractor.HippoLastmodifiedExtr.
> ..
>
> <extractor classname="nl.hippo.slide.extractor.HippoLastmodifiedExtractor"
>            uri="/files/default.preview/binaries"
>            content-type="application/pdf">
>    <configuration>
>       <instruction property="publicatieDatum"
>                    namespace="http://hippo.nl/cms/1.0"
>                    outputFormat="yyyyMMddHHmm"/>
>    </configuration>
> </extractor>
>
> Jasha
>
>
> -----Original message-----
> From: [EMAIL PROTECTED] on behalf of Ard Schrijvers
> Sent: Wed 16-7-2008 18:12
> To: Hippo CMS development public mailinglist
> Subject: RE: [HippoCMS-dev] PDF extractor
>
> Hello,
>
> This is currently not supported. If you take a look at Apache Tika, you
> might see how to extract a date from a PDF. If it cannot be done there,
> then I think you cannot extract a date from a PDF at all, but I assume it
> should be possible.
>
> ard
>
> > Hi Jasha,
> > Thanks for your reply.
> > I need this date only so that the results of the DASL query can be
> > sorted; filling in the date manually is really not an option.
> > Is there a way, when uploading the PDF, to take the current date
> > and automatically set it on the PDF file as a specific property?
> >
> > Thank you,
> >
> > Wilson
> >
> >
> > 2008/7/16 Jasha Joachimsthal <[EMAIL PROTECTED]>:
> >
> > > The PDF extractor only indexes the text content of the PDF. Some other
> > > extractors can also set a property on a document, which is either a
> > > static value or a value based on some XPath in your XML document.
> > > Since binaries like PDFs don't get published, you won't have a
> > > publicationDate. It is possible to set properties by hand from the CMS,
> > > like the caption. In the properties.xml you use for assets you can add:
> > > <Property>
> > >    <Name>publicatieDatum</Name>
> > >    <DisplayName>Publicatie datum</DisplayName>
> > >    <Namespace>http://hippo.nl/cms/1.0</Namespace>
> > >    <NamespacePrefix>cms</NamespacePrefix>
> > >    <Datatype>date</Datatype>
> > >  </Property>
> > >
> > > You'll get a date field when you click on the PDF. A sample
> > > properties.xml can be found in src/cocoon/types/collection
> > >
> > > Jasha
> > >
> > > -----Original message-----
> > > From: [EMAIL PROTECTED] on behalf of Wilson de Paula Pedro Junior
> > > Sent: Wed 16-7-2008 11:00
> > > To: Hippo CMS development public mailinglist
> > > Subject: [HippoCMS-dev] PDF extractor
> > >
> > > Hi guys,
> > >
> > > I hope someone can help me with this one.
> > > We have a DASL query that is used to search news articles, PDFs and
> > > Word documents.
> > > The result set must be sorted by date. News articles and Word documents
> > > already have a publicationDate property that I can sort on,
> > > but the PDFs don't. Does anybody know how I can use the extractor to
> > > extract a PDF's creation date and set it as the publicationDate
> > > property in the http://hippo.nl/cms/1.0 namespace?
> > > The property must have the same name on all three item types for
> > > the DASL query to work.
> > >
> > > I have tried:
> > >
> > >  <extractor classname="org.apache.slide.extractor.PDFExtractor"
> > >             uri="/files/default.preview/binaries"
> > >             content-type="application/pdf">
> > >    <configuration>
> > >      <instruction property="publicatiedatum"
> > >                   namespace="http://hippo.nl/cms/1.0"
> > >                   summary-information="4"/>
> > >    </configuration>
> > >  </extractor>
> > >
> > > But I have no idea if I can use summary-information here.
> > >
> > >
> > > I have also tried to use the ConstantExtractor to set the property
> > > from a DAV property:
> > >
> > >  <extractor classname="nl.hippo.slide.extractor.ConstantExtractor"
> > >             uri="/files/project.preview/binaries"
> > >             content-type="application/pdf">
> > >    <configuration>
> > >      <instruction property="publicatiedatum"
> > >                   namespace="http://hippo.nl/cms/1.0"
> > >                   value="DAV:name"/>
> > >    </configuration>
> > >  </extractor>
> > >
> > > Thanks in advance.
> > >
> > > Wilson
> > > ********************************************
> > > Hippocms-dev: Hippo CMS development public mailinglist
> > >
> > > Searchable archives can be found at:
> > > MarkMail: http://hippocms-dev.markmail.org
> > > Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
