Hi Peter,

Describe your method!
Cheers,
Alec Taylor

On Thu, Nov 10, 2011 at 1:51 AM, Peter A. Kerzum <[email protected]> wrote:
> Hi!
>
> We use an approach based on character properties to extract a meaningful
> title from the document text. Metadata usually stores the filename in the
> title field.
>
> --
> Peter
>
> On Wednesday 09 November 2011 16:16:14 Alec Taylor wrote:
>> On Wed, Nov 9, 2011 at 10:37 PM, Albert Astals Cid <[email protected]> wrote:
>> > On Wednesday, 9 November 2011, Alec Taylor wrote:
>> >> Incorrect, all getDocInfo tells you is what the meta info says, it
>> >> doesn't analyse the actual document, whereas my pdftopdf will update
>> >> the metadata with the appropriate info after PDF analysis
>> >
>> > Please do not top post, it makes reading e-mail incredibly hard.
>> >
>> > And no, it is not incorrect: if the metadata does not have a title, then
>> > the document does not have a title as defined per the spec.
>> >
>> > Albert
>>
>> But maybe the document doesn't have a title, because it was grabbed
>> from scanning the book, then OCRing it. So what I will facilitate is
>> the generation of proper metadata (+ more) from a current PDF lacking
>> such.
>>
>> So if the document does have a title, my pdftopdf tool will find it,
>> and add it to the metadata.
>>
>> I will contribute pdftopdf to poppler.
>> _______________________________________________
>> poppler mailing list
>> [email protected]
>> http://lists.freedesktop.org/mailman/listinfo/poppler
