Hello there,

> Do you have examples that could become test cases?
>
> On Sat, Feb 13, 2010 at 2:32 AM, Villu Ruusmann
> <[email protected]> wrote:
>
>> now I see approximately 10 % of my "text extraction" tests failing.
>> The problem is related to incorrect text decoding (e.g. there is
>> gibberish like "b?fi??" instead of text).

I have isolated the case as PDFBOX-619. Everything will be fine if this patch is applied to FontBox 1.0.1-SNAPSHOT.

The discussion about font samples probably merits its own thread. Most of the time I'm digging through copyrighted content (scientific articles) that embeds copyrighted font programs. How would the ASF react if these font programs were extracted, placed under version control and used to develop test cases?

VR
