Jasha, It works! =]
Thanks!

Wilson

2008/7/16 Jasha Joachimsthal <[EMAIL PROTECTED]>:
> What about using the HippoLastmodifiedExtractor [1]? Haven't tested it but
> I guess it would fit your needs.
>
> [1]
> http://hippocms.org/display/CMS/4.+Hippo+Repository+Configure+Extractors#4.HippoRepositoryConfigureExtractors-l.hippo.slide.extractor.HippoLastmodifiedExtr...
>
> <extractor classname="nl.hippo.slide.extractor.HippoLastmodifiedExtractor"
>     uri="/files/default.preview/binaries"
>     content-type="application/pdf">
>   <configuration>
>     <instruction property="publicatieDatum"
>         namespace="http://hippo.nl/cms/1.0" outputFormat="yyyyMMddHHmm"/>
>   </configuration>
> </extractor>
>
> Jasha
>
>
> -----Original message-----
> From: [EMAIL PROTECTED] on behalf of Ard Schrijvers
> Sent: Wed 16-7-2008 18:12
> To: Hippo CMS development public mailinglist
> Subject: RE: [HippoCMS-dev] PDF extractor
>
> Hello,
>
> Currently it is not supported, but if you take a look at Apache Tika,
> you might see how to extract a date from a PDF. If it is not done
> there, then I think you cannot extract a date from a PDF, but I assume
> it should be possible.
>
> ard
>
> > Hi Jasha,
> > Thanks for your reply.
> > I need this date just so the results can be sorted with the DASL query;
> > filling in this date manually is really not an option.
> > Is there a way, when uploading the PDF, to get the current date
> > and automatically set it on the PDF file as a specific property?
> >
> > Thank you,
> >
> > Wilson
> >
> >
> > 2008/7/16 Jasha Joachimsthal <[EMAIL PROTECTED]>:
> >
> > > The PDF extractor only indexes the text content of the PDF. Some other
> > > extractors can also set a property on a document which is either a
> > > static value or a value based on some xpath in your xml document.
> > > Since binaries like PDFs don't get published, you won't have a
> > > publicationDate. It is possible to set properties by hand from the CMS
> > > like the caption.
> > > In the properties.xml you use for assets you can add
> > > <Property>
> > >   <Name>publicatieDatum</Name>
> > >   <DisplayName>Publicatie datum</DisplayName>
> > >   <Namespace>http://hippo.nl/cms/1.0</Namespace>
> > >   <NamespacePrefix>cms</NamespacePrefix>
> > >   <Datatype>date</Datatype>
> > > </Property>
> > >
> > > You'll get a date field when you click on the PDF. A sample
> > > properties.xml can be found in src/cocoon/types/collection
> > >
> > > Jasha
> > >
> > > -----Original message-----
> > > From: [EMAIL PROTECTED] on behalf of Wilson de Paula
> > > Pedro Junior
> > > Sent: Wed 16-7-2008 11:00
> > > To: Hippo CMS development public mailinglist
> > > Subject: [HippoCMS-dev] PDF extractor
> > >
> > > Hi guys,
> > >
> > > I hope someone can help me with this one.
> > > We have a DASL query which is used to search news articles, PDFs and
> > > Word documents.
> > > The result set must be sorted by date. News articles and Word documents
> > > already have a publicationDate property that I can sort on,
> > > but the PDFs don't. Does anybody know how I can use the extractor to
> > > extract a PDF's creation date and set it as the property publicationDate
> > > in the http://hippo.nl/cms/1.0 namespace?
> > > The property must have the same name on all three item types in order
> > > for the DASL query to work.
> > >
> > > I have tried:
> > >
> > > <extractor classname="org.apache.slide.extractor.PDFExtractor"
> > >     uri="/files/default.preview/binaries"
> > >     content-type="application/pdf">
> > >   <configuration>
> > >     <instruction property="publicatiedatum"
> > >         namespace="http://hippo.nl/cms/1.0" summary-information="4"/>
> > >   </configuration>
> > > </extractor>
> > >
> > > But I have no idea if I can use summary-information here.
> > >
> > > And I have tried to use the ConstantExtractor to set the property
> > > from a DAV property:
> > >
> > > <extractor classname="nl.hippo.slide.extractor.ConstantExtractor"
> > >     uri="/files/project.preview/binaries"
> > >     content-type="application/pdf">
> > >   <configuration>
> > >     <instruction property="publicatiedatum"
> > >         namespace="http://hippo.nl/cms/1.0" value="DAV:name" />
> > >   </configuration>
> > > </extractor>
> > >
> > > Thanks in advance.
> > >
> > > Wilson

********************************************
Hippocms-dev: Hippo CMS development public mailinglist

Searchable archives can be found at:
MarkMail: http://hippocms-dev.markmail.org
Nabble: http://www.nabble.com/Hippo-CMS-f26633.html
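
For reference, once an extractor fills publicatieDatum on the PDFs, the sort described in this thread could be expressed as a DASL basicsearch request along the following lines. This is only a sketch under assumptions, not the query actually used here: the scope href is borrowed from the extractor uri in the thread and may not match your repository layout, selecting DAV:displayname is just an example, and descending order is a guess at what a news listing wants. With an outputFormat of yyyyMMddHHmm the stored values are fixed-width and ordered from year down to minute, so they should sort chronologically even when compared as plain strings.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a WebDAV SEARCH (DASL basicsearch) request body. -->
<D:searchrequest xmlns:D="DAV:" xmlns:cms="http://hippo.nl/cms/1.0">
  <D:basicsearch>
    <D:select>
      <D:prop>
        <!-- Example selection; pick whatever properties you need. -->
        <D:displayname/>
        <cms:publicatieDatum/>
      </D:prop>
    </D:select>
    <D:from>
      <D:scope>
        <!-- Assumption: adjust to the collection you actually search. -->
        <D:href>/files/default.preview</D:href>
        <D:depth>infinity</D:depth>
      </D:scope>
    </D:from>
    <D:where>
      <!-- Only return resources that have the date property set. -->
      <D:is-defined>
        <D:prop><cms:publicatieDatum/></D:prop>
      </D:is-defined>
    </D:where>
    <D:orderby>
      <D:order>
        <D:prop><cms:publicatieDatum/></D:prop>
        <!-- Assumption: newest first. -->
        <D:descending/>
      </D:order>
    </D:orderby>
  </D:basicsearch>
</D:searchrequest>
```

Note that the property name is case-sensitive, so it has to be spelled identically (publicatieDatum here) in the extractor instruction, in properties.xml and in the query.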
