Thank you for your advice, David. I'll certainly try this as well!

Van

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of David Sewell
Sent: Tuesday, July 28, 2009 5:37 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] PDF conversion trial

It's worth comparing ML's PDF-to-XML (and XHTML) conversion against the export facility in Adobe Acrobat 9, if you have it. I've recently been evaluating the two. Neither is perfect, and they differ in exactly where their strengths and weaknesses are.

It is very difficult to get letter-perfect XML/XHTML conversion from PDF if the source is complex, because the underlying PDF data has all sorts of font changes, typographic features, and other things that cause "interference" in the output.

For example, in converting the PDF from a typeset book containing wide angle brackets (U+2329 / U+232A or similar), the Acrobat export consistently captured them with styled <span>s, while the MarkLogic export sometimes captured them and sometimes dropped them or substituted '( )'. On the other hand, MarkLogic correctly normalized the "fi" ligature to "fi", but Acrobat inserted an extra space ("fi ") for no good reason.

MarkLogic's PDF conversion pipelines give you more control over how the output will be structured than Acrobat does.

DS

On Tue, 28 Jul 2009, Baranov, Ivan - Moscow wrote:

> Hi All
>
> I've recently tried to convert PDF to XML using the built-in function
> xdmp:pdf-convert() and discovered that my company's license does not
> allow this. Actually, I have my own converter, so I just wanted to see
> whether ML does it better or faster. Now I'm curious: is there any
> way to acquire such functionality on a trial basis?
>
> Thanks,
> Van

--
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 801079, Charlottesville, VA 22904-4318 USA
Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
Email: [email protected] Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/
