[jira] [Commented] (PDFBOX-5124) Improperly declared OS2WindowsMetricsTable version (v0 table declared as v3) in embedded font stops parsing with EOFException

2021-03-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298959#comment-17298959
 ] 

ASF subversion and git services commented on PDFBOX-5124:
-

Commit 1887445 from Tilman Hausherr in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1887445 ]

PDFBOX-5123, PDFBOX-5124: gracefully recover from EOF in legacy version 0 
segment

> Improperly declared OS2WindowsMetricsTable version (v0 table declared as v3) 
> in embedded font stops parsing with EOFException
> 
>
> Key: PDFBOX-5124
> URL: https://issues.apache.org/jira/browse/PDFBOX-5124
> Project: PDFBox
>  Issue Type: Bug
>  Components: FontBox
>Affects Versions: 2.0.22
>Reporter: Gábor Stefanik
>Priority: Major
> Attachments: PDFBOX-5124-new.txt, PDFBOX-5124-old.txt, 
> PDFBOX-5124.pdf-1-new.png, PDFBOX-5124.pdf-1-old.png, 
> SZAMLA-20190417-20190012706-ININET-BroadBitHungary-11646-HUF.pdf
>
>
> The attached document contains an incorrectly versioned 
> OS2WindowsMetricsTable. It's a version 0 table, but claims to be version 3. 
> Due to this, when we try to parse the new fields introduced in newer 
> versions, we hit an EOFException.
> Since this issue does occur in the wild, PDFBox should tolerate it, e.g. by 
> catching the EOFException and resetting the "version" variable to the highest 
> version that doesn't have the missing fields. (Note that the version 
> constants PDFBox checks against are wrong, but that's PDFBOX-5123.)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Commented] (PDFBOX-5124) Improperly declared OS2WindowsMetricsTable version (v0 table declared as v3) in embedded font stops parsing with EOFException

2021-03-10 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298957#comment-17298957
 ] 

ASF subversion and git services commented on PDFBOX-5124:
-

Commit 1887444 from Tilman Hausherr in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1887444 ]

PDFBOX-5123, PDFBOX-5124: gracefully recover from EOF in legacy version 0 
segment




[jira] [Commented] (PDFBOX-5124) Improperly declared OS2WindowsMetricsTable version (v0 table declared as v3) in embedded font stops parsing with EOFException

2021-03-09 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298521#comment-17298521
 ] 

Tilman Hausherr commented on PDFBOX-5124:
-

The release build (not the release itself) is planned for Monday. I'm running 
the regression tests to be sure that nothing bad happens. If I have the time to 
think about it, I'll also make a change for "version -1", i.e. an EOF at 
usLastCharIndex.




[jira] [Commented] (PDFBOX-5124) Improperly declared OS2WindowsMetricsTable version (v0 table declared as v3) in embedded font stops parsing with EOFException

2021-03-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298517#comment-17298517
 ] 

ASF subversion and git services commented on PDFBOX-5124:
-

Commit 1887405 from Tilman Hausherr in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1887405 ]

PDFBOX-5123, PDFBOX-5124: gracefully recover from EOF




[jira] [Commented] (PDFBOX-5124) Improperly declared OS2WindowsMetricsTable version (v0 table declared as v3) in embedded font stops parsing with EOFException

2021-03-09 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298519#comment-17298519
 ] 

ASF subversion and git services commented on PDFBOX-5124:
-

Commit 1887406 from Tilman Hausherr in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1887406 ]

PDFBOX-5123, PDFBOX-5124: gracefully recover from EOF




[jira] [Commented] (PDFBOX-5124) Improperly declared OS2WindowsMetricsTable version (v0 table declared as v3) in embedded font stops parsing with EOFException

2021-03-09 Thread Gábor Stefanik (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298398#comment-17298398
 ] 

Gábor Stefanik commented on PDFBOX-5124:


Unfortunately, we have no control over what kind of messed-up junk other PDF 
authoring tools put out. And as long as such junk continues to circulate, 
PDFBox will come across it, and users will blame PDFBox for rendering it 
"wrong", especially when other PDF readers render it correctly.

This is not like with HTML, where back in 2004, Firefox devs could rightfully 
say, "no, it's Internet Explorer that's rendering that page wrong". HTML can be 
fixed after it's originally made; with PDF, it's a lot more difficult.




[jira] [Commented] (PDFBOX-5124) Improperly declared OS2WindowsMetricsTable version (v0 table declared as v3) in embedded font stops parsing with EOFException

2021-03-09 Thread Michael Klink (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298383#comment-17298383
 ] 

Michael Klink commented on PDFBOX-5124:
---

{quote}Since this issue does occur in the wild, PDFBox should tolerate it{quote}
There are so many issues occurring in the wild, is that really a reason to hail 
them?




[jira] [Commented] (PDFBOX-5124) Improperly declared OS2WindowsMetricsTable version (v0 table declared as v3) in embedded font stops parsing with EOFException

2021-03-09 Thread Gábor Stefanik (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17297988#comment-17297988
 ] 

Gábor Stefanik commented on PDFBOX-5124:


No, that's exactly what I was expecting. The "slight glyph difference" is 
likely due to actually using the embedded font, and not falling back to 
Liberation Sans.




[jira] [Commented] (PDFBOX-5124) Improperly declared OS2WindowsMetricsTable version (v0 table declared as v3) in embedded font stops parsing with EOFException

2021-03-08 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17297866#comment-17297866
 ] 

Tilman Hausherr commented on PDFBOX-5124:
-

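I did a quick test with this change (the real change will do proper logging and 
catch EOFException specifically):
{code}
if (version >= 1)
{
    try
    {
        // these two fields only exist in version 1 and later
        codePageRange1 = data.readUnsignedInt();
        codePageRange2 = data.readUnsignedInt();
    }
    catch (IOException ex)
    {
        // table ended early: fall back to version 0, the highest version
        // that doesn't have the missing fields, and stop parsing here
        version = 0;
        ex.printStackTrace();
        return;
    }
}
{code}
Text extraction shows no differences; rendering shows a slight glyph 
difference. Were you expecting more than this?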
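
For readers who want to try the recovery idea outside FontBox, here is a 
minimal, self-contained sketch. It is not the committed PDFBox code: the class 
name Os2TailSketch, its fields, and the use of java.io.DataInputStream in 
place of FontBox's own data stream are assumptions made for illustration. It 
only covers the two version 1 fields discussed above and downgrades the 
version to 0 when the data ends early.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the version-dependent tail of an OS/2 table parser.
final class Os2TailSketch
{
    long codePageRange1;   // only present from version 1 on
    long codePageRange2;   // only present from version 1 on
    int effectiveVersion;  // version that the available bytes actually support

    // Reads the version >= 1 fields from the bytes that follow the version 0
    // part, downgrading to version 0 if the data ends before they are complete.
    static Os2TailSketch parse(int declaredVersion, byte[] tail)
    {
        Os2TailSketch table = new Os2TailSketch();
        table.effectiveVersion = declaredVersion;
        DataInputStream data = new DataInputStream(new ByteArrayInputStream(tail));
        if (declaredVersion >= 1)
        {
            try
            {
                // readInt() is signed; mask to get the unsigned 32-bit value
                table.codePageRange1 = data.readInt() & 0xFFFFFFFFL;
                table.codePageRange2 = data.readInt() & 0xFFFFFFFFL;
            }
            catch (EOFException ex)
            {
                // table is shorter than its declared version implies:
                // discard any partial read and treat it as version 0
                table.codePageRange1 = 0;
                table.codePageRange2 = 0;
                table.effectiveVersion = 0;
            }
            catch (IOException ex)
            {
                // any other I/O failure is not something to recover from here
                throw new UncheckedIOException(ex);
            }
        }
        return table;
    }
}
```

Calling Os2TailSketch.parse(3, new byte[0]) models the attached file's 
situation: the table claims version 3 but has no bytes after the version 0 
fields, so effectiveVersion comes back as 0 instead of the parse failing. 
Catching EOFException separately from other IOExceptions keeps genuine read 
errors visible.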




[jira] [Commented] (PDFBOX-5124) Improperly declared OS2WindowsMetricsTable version (v0 table declared as v3) in embedded font stops parsing with EOFException

2021-03-08 Thread Tilman Hausherr (Jira)


[ 
https://issues.apache.org/jira/browse/PDFBOX-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17297647#comment-17297647
 ] 

Tilman Hausherr commented on PDFBOX-5124:
-

Yeah that makes sense.
