[jira] [Commented] (PDFBOX-4648) OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not implemented in PDFBox and will be ignored

2019-09-18 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932100#comment-16932100
 ] 

Tilman Hausherr commented on PDFBOX-4648:
-

No, you would have to use OCR. The problem occurs when creating the PDF. One 
could recreate the ToUnicode table but it would take hours and probably work 
only for that file.
https://stackoverflow.com/questions/39485920/how-to-add-unicode-in-truetype0font-on-pdfbox-2-0-0
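
To illustrate what recreating the ToUnicode table would involve, here is a minimal Java sketch, assuming the glyph-code-to-character pairs have already been worked out by hand (that is the slow, file-specific part). The class name `ToUnicodeWriterSketch`, the method `buildToUnicodeCMap`, and the sample codes in `main` are hypothetical and not taken from the attached file. The sketch only builds the CMap text; it does not attach it to a font in a PDF.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ToUnicodeWriterSketch {

    // Builds the text of a ToUnicode CMap from hand-made (glyph code -> text) pairs.
    static String buildToUnicodeCMap(Map<Integer, String> codeToText) {
        StringBuilder sb = new StringBuilder();
        sb.append("/CIDInit /ProcSet findresource begin\n");
        sb.append("12 dict begin\n");
        sb.append("begincmap\n");
        sb.append("/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def\n");
        sb.append("/CMapName /Adobe-Identity-UCS def\n");
        sb.append("/CMapType 2 def\n");
        sb.append("1 begincodespacerange\n<0000> <FFFF>\nendcodespacerange\n");

        // Sort by code and emit bfchar sections of at most 100 entries each.
        List<Map.Entry<Integer, String>> entries =
                new ArrayList<>(new TreeMap<>(codeToText).entrySet());
        for (int from = 0; from < entries.size(); from += 100) {
            int to = Math.min(from + 100, entries.size());
            sb.append(to - from).append(" beginbfchar\n");
            for (Map.Entry<Integer, String> e : entries.subList(from, to)) {
                sb.append(String.format("<%04X> <", e.getKey()));
                // The target is written as UTF-16BE hex, one 4-digit group per char.
                for (char c : e.getValue().toCharArray()) {
                    sb.append(String.format("%04X", (int) c));
                }
                sb.append(">\n");
            }
            sb.append("endbfchar\n");
        }

        sb.append("endcmap\n");
        sb.append("CMapName currentdict /CMap defineresource pop\n");
        sb.append("end\nend\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical codes, for illustration only.
        Map<Integer, String> pairs = new HashMap<>();
        pairs.put(0x24, "5");
        pairs.put(0x25, "1");
        pairs.put(0x2F, "t");
        System.out.print(buildToUnicodeCMap(pairs));
    }
}
```

Because the codes in a subset font such as ABCDEE+Times New Roman,Bold are specific to that one file, a table built this way would not carry over to other documents.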


> OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not 
> implemented in PDFBox and will be ignored
> -
>
> Key: PDFBOX-4648
> URL: https://issues.apache.org/jira/browse/PDFBOX-4648
> Project: PDFBox
> Issue Type: Improvement
> Components: Text extraction
> Affects Versions: 2.0.4
> Reporter: wanling
> Priority: Major
> Attachments: 5e214f828f164322a6600f183191dda5-Adobe.txt, 
> 5e214f828f164322a6600f183191dda5-PDFBox.txt, 
> 5e214f828f164322a6600f183191dda5.pdf, image-2019-09-12-08-47-32-706.png, 
> image-2019-09-18-05-55-26-771.png
>
>
> No PostScript name information is provided for the font Arial-BoldMT
> OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not 
> implemented in PDFBox and will be ignored
>  No Unicode mapping for CID+47 (47) in font ABCDEE+Times New Roman,Bold
>  
> Adobe extracts the text normally, but PDFBox can't see parts of it (not all). 
> OCI can't see it completely.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Commented] (PDFBOX-4648) OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not implemented in PDFBox and will be ignored

2019-09-17 Thread wanling (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932083#comment-16932083
 ] 

wanling commented on PDFBOX-4648:
-

Do you know any way to solve this problem?






[jira] [Commented] (PDFBOX-4648) OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not implemented in PDFBox and will be ignored

2019-09-17 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932024#comment-16932024
 ] 

Tilman Hausherr commented on PDFBOX-4648:
-

"511tm" is missing both in Adobe and in PDFBox. If you look at the font "F5" in 
PDFDebugger you'll see that the column "Unicode character" is missing.

!image-2019-09-18-05-55-26-771.png!




[jira] [Commented] (PDFBOX-4648) OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not implemented in PDFBox and will be ignored

2019-09-17 Thread wanling (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931969#comment-16931969
 ] 

wanling commented on PDFBOX-4648:
-

Sorry, I got "SLIM CUT". That was a typo when writing here. I was focused on 
'511tm', so I overlooked it.




[jira] [Commented] (PDFBOX-4648) OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not implemented in PDFBox and will be ignored

2019-09-17 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931619#comment-16931619
 ] 

Tilman Hausherr commented on PDFBOX-4648:
-

But you wrote that you got "SLM CUT" (instead of "SLIM CUT"). Or was this a 
typo when writing here? Do you get "SLM CUT" or "SLIM CUT" with text extraction 
from 2.0.16?




[jira] [Commented] (PDFBOX-4648) OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not implemented in PDFBox and will be ignored

2019-09-17 Thread wanling (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931266#comment-16931266
 ] 

wanling commented on PDFBOX-4648:
-

My computer display is the same as yours. So far, no solution has been found.




[jira] [Commented] (PDFBOX-4648) OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not implemented in PDFBox and will be ignored

2019-09-12 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928282#comment-16928282
 ] 

Tilman Hausherr commented on PDFBOX-4648:
-

The squares are Adobe only, so we can't do anything.

The missing "511 TM" is also missing on Adobe text extraction. This is because 
the font has no ToUnicode stream.

"SLIM CUT" appears fine here. Even if I use 2.0.4.

Please try again with 2.0.16, make sure you have a current Java version on your 
computer, then download and run PDFDebugger and look for the font F4 in your 
file. Here's how it looks on my system:

!image-2019-09-12-08-46-39-391.png!
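
A minimal Java sketch of why a font without a ToUnicode stream loses text on extraction, assuming a simplified model in which each glyph code is looked up in a table parsed from a single `beginbfchar` section. The class `ToUnicodeSketch`, its methods, and the sample codes are hypothetical; this is not how PDFBox implements the lookup, only an illustration of the idea behind the "No Unicode mapping for CID+47" warning.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ToUnicodeSketch {

    // One bfchar entry: "<source code> <UTF-16BE target>", both in hex.
    private static final Pattern BFCHAR =
            Pattern.compile("<([0-9A-Fa-f]+)>\\s*<([0-9A-Fa-f]+)>");

    // Parses the first beginbfchar/endbfchar section of a ToUnicode CMap.
    static Map<Integer, String> parseBfChars(String cmap) {
        Map<Integer, String> map = new HashMap<>();
        int start = cmap.indexOf("beginbfchar");
        int end = cmap.indexOf("endbfchar");
        if (start < 0 || end < start) {
            return map; // no table at all: every code will be unmapped
        }
        Matcher m = BFCHAR.matcher(cmap.substring(start + "beginbfchar".length(), end));
        while (m.find()) {
            int code = Integer.parseInt(m.group(1), 16);
            String hex = m.group(2);
            StringBuilder text = new StringBuilder();
            for (int i = 0; i + 4 <= hex.length(); i += 4) {
                text.append((char) Integer.parseInt(hex.substring(i, i + 4), 16));
            }
            map.put(code, text.toString());
        }
        return map;
    }

    // Converts glyph codes to text; codes without a mapping are collected, not guessed.
    static String extract(int[] codes, Map<Integer, String> toUnicode, List<Integer> unmapped) {
        StringBuilder text = new StringBuilder();
        for (int code : codes) {
            String s = toUnicode.get(code);
            if (s == null) {
                unmapped.add(code);
            } else {
                text.append(s);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Hypothetical table and codes, for illustration only.
        Map<Integer, String> table =
                parseBfChars("2 beginbfchar\n<0024> <0035>\n<0025> <0031>\nendbfchar");
        List<Integer> unmapped = new ArrayList<>();
        String text = extract(new int[] {0x24, 0x25, 0x2F}, table, unmapped);
        System.out.println(text + " (unmapped codes: " + unmapped + ")");
    }
}
```

The glyphs still render on screen because drawing needs only the glyph outlines; it is the step from glyph code back to a character that has nothing to work with.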




[jira] [Commented] (PDFBOX-4648) OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not implemented in PDFBox and will be ignored

2019-09-12 Thread wanling (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928263#comment-16928263
 ] 

wanling commented on PDFBOX-4648:
-

I can export it like yours, but some of the words are missing: "511 TM SLIM CUT" 
comes out as "SLM CUT". "511tm" is missing, and it is useful.




[jira] [Commented] (PDFBOX-4648) OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not implemented in PDFBox and will be ignored

2019-09-12 Thread wanling (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16928260#comment-16928260
 ] 

wanling commented on PDFBOX-4648:
-

Thanks for your answer.




[jira] [Commented] (PDFBOX-4648) OpenType Layout tables used in font ABCDEE+Times New Roman,Bold are not implemented in PDFBox and will be ignored

2019-09-11 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927838#comment-16927838
 ] 

Tilman Hausherr commented on PDFBOX-4648:
-

I assume this is a follow-up of PDFBOX-4647. You could have reopened the issue. 
Anyway, I have attached two text extractions, one by PDFBox and one by Adobe. 
What are you missing?
