[ https://issues.apache.org/jira/browse/PDFBOX-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tilman Hausherr updated PDFBOX-3127:
------------------------------------
    Attachment: RAU4G6QMOVRYBISJU7R6MOVZCRFUO7P4-marked-1.png
                RAU4G6QMOVRYBISJU7R6MOVZCRFUO7P4.pdf

> Text with vertical font not extracted correctly
> -----------------------------------------------
>
> Key: PDFBOX-3127
> URL: https://issues.apache.org/jira/browse/PDFBOX-3127
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 1.8.10, 1.8.11, 2.0.0
> Reporter: Tilman Hausherr
> Attachments: RAU4G6QMOVRYBISJU7R6MOVZCRFUO7P4-marked-1.png, RAU4G6QMOVRYBISJU7R6MOVZCRFUO7P4.pdf
>
>
> The attached file has a vertical font, although the text is horizontal.
> Extraction with 1.8:
> {quote}
> NOTI CE OF PUBLI C HEARI NG
> The Sout h Caroli na Depart ment of I nsurance will hol d a publi c
> heari ng i n accordance wit h t he require ments of Secti on 38-3-
> 110? 5? Thursday, April 29, 2010 at The Conf erence and Busi ness
> Cent er at t he Grand Strand Ca mpus of t he Horry- Georgetown
> Techni cal Coll ege, 950 Crabtree Lane, Myrtl e Beach, S. C., 29577
> fro m 5: 30 p. m.-7: 00 p. m. The purpose of t hi s heari ng i s t o provi de
> an opportunity t o di scuss and off er i nput concerni ng t he st atus of t he
> coastal property i nsurance market. The Conf erence Cent er i s l ocat ed
> one mil e sout h of t he Myrtl e Beach I nt ernati onal Airport bet ween
> Hi ghway 17 Busi ness and Hi ghway 17 Bypass. The t el ephone
> nu mber f or t he Conf erence and Busi ness Cent er i s 843-477-2042.
> {quote}
> Extraction with 2.0:
> {quote}
> N O T I C E O F P U B L I C H E A R I N G
>
> T h e S o u t h C a r o l i n a D e p a r t m e n t o f I n s u r a n c e w i l l h o l d a p u b l i c
> h e a r i n g i n a c c o r d a n c e w i t h t h e r e q u i r e m e n t s o f S e c t i o n 3 8 - 3 -
> 1 1 0 ︵5 ︶ T h u r s d a y , A p r i l 2 9 , 2 0 1 0 a t T h e C o n f e r e n c e a n d B u s i n e s s
> C e n t e r a t t h e G r a n d S t r a n d C a m p u s o f t h e H o r r y - G e o r g e t o w n
> T e c h n i c a l C o l l e g e , 9 5 0 C r a b t r e e L a n e , M y r t l e B e a c h , S . C . , 2 9 5 7 7
> f r o m 5 : 3 0 p . m . - 7 : 0 0 p . m . T h e p u r p o s e o f t h i s h e a r i n g i s t o p r o v i d e
> a n o p p o r t u n i t y t o d i s c u s s a n d o f f e r i n p u t c o n c e r n i n g t h e s t a t u s o f t h e
> c o a s t a l p r o p e r t y i n s u r a n c e m a r k e t . T h e C o n f e r e n c e C e n t e r i s l o c a t e d
> o n e m i l e s o u t h o f t h e M y r t l e B e a c h I n t e r n a t i o n a l A i r p o r t b e t w e e n
> H i g h w a y 1 7 B u s i n e s s a n d H i g h w a y 1 7 B y p a s s . T h e t e l e p h o n e
> n u m b e r f o r t h e C o n f e r e n c e a n d B u s i n e s s C e n t e r i s 8 4 3 - 4 7 7 - 2 0 4 2 .
> {quote}
> A brute-force change that uses the correct width, and that works only with
> this file, yields:
> {quote}
> NOTICE OF PUBLIC HEARING
>
> The South Carolina Department of Insurance will hold a public
> hearing in accordance with the requirements of Section 38-3-
> 110 ︵5 ︶ Thursday, April 29, 2010 at The Conference and Business
> Center at the Grand Strand Campus of the Horry-Georgetown
> Technical College, 950 Crabtree Lane, Myrtle Beach, S.C., 29577
> from 5:30 p.m.-7:00 p.m. The purpose of this hearing is to provide
> an opportunity to discuss and offer input concerning the status of the
> coastal property insurance market.
> The Conference Center is located
> one mile south of the Myrtle Beach International Airport between
> Highway 17 Business and Highway 17 Bypass. The telephone
> number for the Conference and Business Center is 843-477-2042.
> {quote}
> The problem is that the PDFTextStreamEngine doesn't work well with vertical
> fonts. The red lines in the attached image show that the size is only half of
> what's needed. It may be related to PDCIDFont.getDefaultPositionVector(), but
> changing that isn't enough.
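
One plausible reading of the 2.0 output, offered as an assumption and not confirmed in this issue, is that a text extractor which places word breaks by comparing gaps between glyphs will see a gap after every glyph once the reported width is only half of the real advance. The sketch below is a simplified, standalone model of that effect. It is not PDFBox code, and the names GlyphGapDemo, Glyph, layout, extract, widthScale and gapTolerance are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration only; this is NOT PDFBox code.
final class GlyphGapDemo {

    /** Hypothetical glyph: its character, start x, and the width the extractor believes it has. */
    static final class Glyph {
        final char c;
        final double x;
        final double reportedWidth;

        Glyph(char c, double x, double reportedWidth) {
            this.c = c;
            this.x = x;
            this.reportedWidth = reportedWidth;
        }
    }

    /**
     * Places each character at a fixed true advance. Spaces move the pen but
     * emit no glyph. reportedWidth = trueAdvance * widthScale, so a widthScale
     * of 0.5 models a width that is half of what is needed.
     */
    static List<Glyph> layout(String text, double trueAdvance, double widthScale) {
        List<Glyph> glyphs = new ArrayList<>();
        double x = 0.0;
        for (char c : text.toCharArray()) {
            if (c != ' ') {
                glyphs.add(new Glyph(c, x, trueAdvance * widthScale));
            }
            x += trueAdvance;
        }
        return glyphs;
    }

    /** Emits a space when the gap after the previous glyph exceeds gapTolerance times its reported width. */
    static String extract(List<Glyph> glyphs, double gapTolerance) {
        StringBuilder sb = new StringBuilder();
        Glyph prev = null;
        for (Glyph g : glyphs) {
            if (prev != null) {
                double expectedEnd = prev.x + prev.reportedWidth;
                if (g.x - expectedEnd > gapTolerance * prev.reportedWidth) {
                    sb.append(' ');
                }
            }
            sb.append(g.c);
            prev = g;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "NOTICE OF PUBLIC HEARING";
        // Correct widths: only real word gaps become spaces.
        System.out.println(extract(layout(text, 10.0, 1.0), 0.3));
        // Halved widths: every glyph appears to be followed by a gap.
        System.out.println(extract(layout(text, 10.0, 0.5), 0.3));
    }
}
```

With the full width the model reproduces the input text, and with the halved width it separates every letter, which resembles the 2.0 result quoted above. The fixed advance of 10 units and the tolerance of 0.3 are arbitrary illustration values.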