[jira] [Comment Edited] (PDFBOX-4304) Glyph Substitution Table lookup Cache doesn't clear by disabling a feature.

2018-09-04 Thread Aaron Madlon-Kay (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603801#comment-16603801
 ] 

Aaron Madlon-Kay edited comment on PDFBOX-4304 at 9/5/18 1:11 AM:
--

As I recall, when I wrote the GSUB code I put the cache in only partly for 
performance; there was also a correctness component to it.

(This is based on the last time I looked at the code, so I apologize if it's 
not quite accurate.)

When choosing a substitution, there is some dependence on context:

# The incoming Unicode character's script influences the choice of LangSys, 
which determines the available features
# Many Unicode characters have ambiguous script ("Default") which means we have 
to consider the surrounding text (it would be better to be able to set a 
default script/language for the whole document, but such a setting doesn't 
exist at the moment)
# The place where the script is determined 
({{GlyphSubstitutionTable.selectScriptTag}}) can't see the actual surrounding 
text, so it can only guess based on the last known-valid script used
# This means that without a cache to ensure consistency, an ambiguous-script 
character may be substituted differently throughout the document
# (I'm fuzzy on this point) I thought that, because of the way the Unicode map 
is created ({{PDCIDFontType2Embedder.buildToUnicodeCMap}}), there was some 
need for a one-to-one correspondence

{quote}I think I've now understood the first part. You are asking that the 
cache be reset when features are disabled or enabled; that makes sense, as long 
as getUnsubstitution() isn't used "too late".{quote}

I agree that resetting the cache makes sense, but I am also mindful that the 
font needs to be stateless, per the issues [~jahewson] found with my initial 
implementation. Unfortunately I haven't had time to see what changes were made 
to remove the statefulness.
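
To make the consistency argument in the list above concrete, here is a toy model in Java. It is only a sketch under assumptions, not the FontBox code: {{ScriptGuessSketch}}, {{scriptOf}}, {{guessUncached}} and {{guessCached}} are made-up names, only the Arabic and Latin scripts are modelled, and everything else is treated as ambiguous.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only; none of these names exist in FontBox.
class ScriptGuessSketch {
    static final String AMBIGUOUS = "DFLT"; // stand-in for an ambiguous ("Default") script

    private String lastKnownScript = AMBIGUOUS;
    private final Map<Integer, String> cache = new HashMap<>();

    // Hypothetical classifier: only Arabic and Latin are modelled.
    static String scriptOf(int codePoint) {
        switch (Character.UnicodeScript.of(codePoint)) {
            case ARABIC: return "arab";
            case LATIN:  return "latn";
            default:     return AMBIGUOUS;
        }
    }

    // No cache: an ambiguous character inherits the last known-valid script.
    String guessUncached(int codePoint) {
        String script = scriptOf(codePoint);
        if (!script.equals(AMBIGUOUS)) {
            lastKnownScript = script;
            return script;
        }
        return lastKnownScript;
    }

    // With a cache: the first decision made for a code point is reused.
    String guessCached(int codePoint) {
        return cache.computeIfAbsent(codePoint, this::guessUncached);
    }

    public static void main(String[] args) {
        ScriptGuessSketch uncached = new ScriptGuessSketch();
        uncached.guessUncached('a');
        System.out.println(uncached.guessUncached('(')); // follows Latin text
        uncached.guessUncached(0x0628);                  // Arabic letter Beh
        System.out.println(uncached.guessUncached('(')); // follows Arabic text
    }
}
```

In this model the uncached guess for the same parenthesis changes with the preceding text, whereas {{guessCached}} keeps returning whatever was decided first.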


> Glyph Substitution Table lookup Cache doesn't clear by disabling a feature.
> ---
>
> Key: PDFBOX-4304
> URL: https://issues.apache.org/jira/browse/PDFBOX-4304
> Project: PDFBox
>  Issue Type: Bug
>  Components: FontBox
>Affects Versions: 2.0.11
>Reporter: Ali Safe
>Priority: Major
> Attachments: FDK_aban.ttf
>
>
> When I used GlyphSubstitutionTable to find the substituted gid for a 
> specific glyph that has 3 substitution forms, I got the same gid for 
> all three forms.
> The font is a Persian font that has 3 substituted forms for some of its 
> glyphs. I enabled the 'init', 'medi' and 'fina' features one by one and then 
> disabled them, but all of them gave me the same result.
> When I looked at the GlyphSubstitutionTable class and its getSubstitution(gid, 
> scriptTags, enabledFeatures) method, I saw a lookupCache that is first 
> checked for the gid only; if the gid is present the cached result is returned, 
> and only if it is not in lookupCache is the other parsing and calculation done. 
> I think this cache must be cleared every time features are disabled or 
> enabled. Also, the cache lookup must be keyed on all three of the function's 
> input arguments, because they affect the 

[jira] [Commented] (PDFBOX-4304) Glyph Substitution Table lookup Cache doesn't clear by disabling a feature.

2018-09-04 Thread Aaron Madlon-Kay (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603801#comment-16603801
 ] 

Aaron Madlon-Kay commented on PDFBOX-4304:
--

As I recall, when I wrote the GSUB code I put the cache in only partly for 
performance; there was also a correctness component to it.

(This is based on the last time I looked at the code, so I apologize if it's 
not quite accurate.)

When choosing a substitution, there is some dependence on context:

# The incoming Unicode character's script influences the choice of LangSys, 
which determines the available features
# Many Unicode characters have ambiguous script ("Default") which means we have 
to consider the surrounding text (it would be better to be able to set a 
default script/language for the whole document, but such a setting doesn't 
exist at the moment)
# The place where the script is determined 
({{GlyphSubstitutionTable.selectScriptTag}}) can't see the actual surrounding 
text, so it can only guess based on the last known-valid script used
# This means that without a cache to ensure consistency, an ambiguous-script 
character may be substituted differently throughout the document
# (I'm fuzzy on this point:) I thought that, because of the way the Unicode map 
is created ({{PDCIDFontType2Embedder.buildToUnicodeCMap}}), there was some 
need for a one-to-one correspondence

{quote}I think I've now understood the first part. You are asking that the 
cache be reset when features are disabled or enabled; that makes sense, as long 
as getUnsubstitution() isn't used "too late".{quote}

I agree that resetting the cache makes sense, but I am also mindful that the 
font needs to be stateless, per the issues [~jahewson] found with my initial 
implementation. Unfortunately I haven't had time to see what changes were made 
to remove the statefulness.

> Glyph Substitution Table lookup Cache doesn't clear by disabling a feature.
> ---
>
> Key: PDFBOX-4304
> URL: https://issues.apache.org/jira/browse/PDFBOX-4304
> Project: PDFBox
>  Issue Type: Bug
>  Components: FontBox
>Affects Versions: 2.0.11
>Reporter: Ali Safe
>Priority: Major
> Attachments: FDK_aban.ttf
>
>
> When I used GlyphSubstitutionTable to find the substituted gid for a 
> specific glyph that has 3 substitution forms, I got the same gid for 
> all three forms.
> The font is a Persian font that has 3 substituted forms for some of its 
> glyphs. I enabled the 'init', 'medi' and 'fina' features one by one and then 
> disabled them, but all of them gave me the same result.
> When I looked at the GlyphSubstitutionTable class and its getSubstitution(gid, 
> scriptTags, enabledFeatures) method, I saw a lookupCache that is first 
> checked for the gid only; if the gid is present the cached result is returned, 
> and only if it is not in lookupCache is the other parsing and calculation done. 
> I think this cache must be cleared every time features are disabled or 
> enabled. Also, the cache lookup must be keyed on all three of the function's 
> input arguments, because they affect the result of the calculation. At the 
> very least, lookupCache must be keyed on gid and enabledFeatures. 
> Also, when more than one feature is enabled, the lookup cache maps each gid to 
> only one substituted glyph, but in many languages there is more than one 
> substitution form for some glyphs. When I enable more than one feature, only 
> the last enabled feature takes effect. 
> I used this code and attached the mentioned font file...
> // Persian Beh Letter with code 1576 in the font
> // Enable init feature
> ttf.enableGsubFeature("init");
> CmapLookup cMapLookupInit = ttf.getUnicodeCmapLookup();
> int glyphIdInit = cMapLookupInit.getGlyphId(1576);
> ttf.disableGsubFeature("init");
> // Enable medi feature
> ttf.enableGsubFeature("medi");
> CmapLookup cMapLookupMedi = ttf.getUnicodeCmapLookup();
> int glyphIdMedi = cMapLookupMedi.getGlyphId(1576);
> ttf.disableGsubFeature("medi");
> // Now glyphIdMedi and glyphIdInit have the same value...
> 
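
The reporter's suggestion of keying the cache on all three inputs could be sketched roughly as follows. This is an illustration under assumptions rather than the FontBox implementation: the class {{FeatureAwareCacheSketch}}, its {{Key}} type, {{addSubstitution}} and the parameter types are hypothetical, the per-feature tables stand in for parsed GSUB data, and script tags are carried in the key but otherwise ignored.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Sketch only; not the FontBox GlyphSubstitutionTable.
class FeatureAwareCacheSketch {
    // Hypothetical composite cache key covering all three lookup inputs.
    static final class Key {
        final int gid;
        final List<String> scriptTags;
        final List<String> enabledFeatures;

        Key(int gid, List<String> scriptTags, List<String> enabledFeatures) {
            this.gid = gid;
            this.scriptTags = new ArrayList<>(scriptTags);           // defensive copies
            this.enabledFeatures = new ArrayList<>(enabledFeatures);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return gid == k.gid && scriptTags.equals(k.scriptTags)
                    && enabledFeatures.equals(k.enabledFeatures);
        }

        @Override
        public int hashCode() {
            return Objects.hash(gid, scriptTags, enabledFeatures);
        }
    }

    // Stand-in for parsed GSUB data: feature tag -> (gid -> substitute gid).
    private final Map<String, Map<Integer, Integer>> tables = new HashMap<>();
    private final Map<Key, Integer> lookupCache = new HashMap<>();

    void addSubstitution(String feature, int gid, int substitute) {
        tables.computeIfAbsent(feature, f -> new HashMap<>()).put(gid, substitute);
    }

    int getSubstitution(int gid, List<String> scriptTags, List<String> enabledFeatures) {
        Key key = new Key(gid, scriptTags, enabledFeatures);
        return lookupCache.computeIfAbsent(key, k -> lookup(gid, enabledFeatures));
    }

    // The first enabled feature with an entry for the gid wins.
    private int lookup(int gid, List<String> enabledFeatures) {
        for (String feature : enabledFeatures) {
            Map<Integer, Integer> table = tables.get(feature);
            if (table != null && table.containsKey(gid)) {
                return table.get(gid);
            }
        }
        return gid; // no substitution applies
    }
}
```

With such a key, enabling 'init' and then 'medi' yields separate cache entries, so the cache would not need clearing when features change. It does not by itself address the reverse lookup ({{getUnsubstitution}}) or the statefulness concerns raised in the comment.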



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Commented] (PDFBOX-4106) Vertical text creation

2018-05-10 Thread Aaron Madlon-Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16471307#comment-16471307
 ] 

Aaron Madlon-Kay commented on PDFBOX-4106:
--

Hi John. Thanks for raising these concerns, and sorry for having gotten us into 
this mess.

I am fine with any API changes deemed necessary. As for functionality, the only 
non-obvious thing that I want to mention is that it's important to be able to 
selectively enable/disable {{vrt2}}. The reason is that the substitutions 
performed by {{vrt2}} are not always more desirable than {{vert}}, and it is 
common to want to use only {{vert}} substitutions with a font that supports 
both.

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: Bug
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Assignee: Tilman Hausherr
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Fix For: 2.0.9, 3.0.0 PDFBox
>
> Attachments: 
> 0001-Add-OpenTypeScript-class-to-get-OT-script-tags-for-c.patch, 
> 0002-Optimize-Unicode-script-storage-and-lookup.patch, 
> 0003-Parse-GSUB-table.patch, 
> 0004-Abstract-cmap-lookup-into-an-interface.patch, 
> 0005-Implement-GSUB-substitution-on-TrueTypeFont.patch, 
> 0006-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch, 
> 0007-Add-factory-methods-for-loading-TTF-as-vertical-font.patch, 
> 0008-Implement-vertical-metrics-support-when-embedding-subsetting.patch, 
> FIX-0001-PDFBOX-4106-Remove-early-outs-leading-to-spurious-wa.patch, 
> FIX-0002-PDFBOX-4106-Document-GlyphSubstitutionTable-public-m.patch, 
> FIX-0003-PDFBOX-4106-Correct-deltaGlyphID-data-size.patch, 
> FIX-0004-PDFBOX-4106-Remove-unnecessary-vertical-displacement.patch, 
> FIX-0005-PDFBOX-4106-Remove-duplicate-DW2-creation.patch, 
> FIX-0006-PDFBOX-4106-Fix-non-embedded-vertical-font-rendering.patch, 
> FIX-0007-PDFBOX-4106-Fix-incorrect-parsing-of-W2-first-format.patch, 
> FIX-0008-PDFBOX-4106-Rename-misleading-field.patch, 
> FIX-0009-PDFBOX-4106-Allow-retrieving-vmtx-topSideBearing.patch, 
> FIX-0010-PDFBOX-4106-Correct-vmtx-embedding-for-proportional-.patch, 
> sample_code.txt, vertical.pdf
>
>
> I needed to output vertical Japanese text, but was stymied by several 
> limitations:
> * No API to load a TTF as Identity-V encoding
> * No support for 'vert' glyph substitution
> * No support for vertical metrics ('vhea' and 'vmtx' tables are parsed but 
> not used at all)
> I have attached a series of patches that implement the above features. 
> Highlights:
> * The GSUB glyph substitution table is parsed (limitation: type 1 lookups 
> only; this is sufficient for many features including 'vert'/'vrt2' vertical 
> glyph substitution)
> * Cmap lookup makes use of GSUB when features are enabled on a TTF
> * 'vhea' and 'vmtx' metrics are applied to PDCIDFont when appropriate, and 
> are embedded/subsetted correctly through the DW2/W2 CIDFont dictionary
> * An API has been added for loading a TTF as a vertical font, setting 
> Identity-V encoding and enabling 'vert'/'vrt2' substitution
> Each patch could approximately be split out into a separate ticket, if 
> desired.
> Also attached is some sample code that exercises these patches and 
> illustrates the effect of vertical glyph positioning. The sample output PDF 
> is also attached.






[jira] [Created] (PDFBOX-4113) Debugger file open dialog has incorrect filter on Mac

2018-02-17 Thread Aaron Madlon-Kay (JIRA)
Aaron Madlon-Kay created PDFBOX-4113:


 Summary: Debugger file open dialog has incorrect filter on Mac
 Key: PDFBOX-4113
 URL: https://issues.apache.org/jira/browse/PDFBOX-4113
 Project: PDFBox
  Issue Type: Bug
  Components: Swing GUI
Affects Versions: 2.0.8, 3.0.0 PDFBox
Reporter: Aaron Madlon-Kay
 Attachments: 0001-Fix-open-dialog-file-filter-on-Mac.patch

The file open dialog for Mac in the PDFDebugger tool has the file filter set up 
incorrectly. Instead of filtering on the filename, it is filtering on the 
directory name. Thus you can open any file in a _directory_ whose name ends 
with {{.pdf}}, but nothing else.

See also: [{{FilenameFilter.accept}} 
Javadoc|https://docs.oracle.com/javase/7/docs/api/java/io/FilenameFilter.html#accept(java.io.File,%20java.lang.String)]
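
For illustration, the mix-up can be reproduced with two filters. This is a reconstruction under assumptions, not the actual PDFDebugger code: {{PdfFilterSketch}}, {{BUGGY}} and {{FIXED}} are hypothetical names. The only real API used is {{FilenameFilter.accept(File dir, String name)}}, where {{dir}} is the containing directory and {{name}} is the file name.

```java
import java.io.File;
import java.io.FilenameFilter;

// Reconstruction for illustration; not the PDFDebugger source.
class PdfFilterSketch {
    // Mistaken variant: tests the containing directory's name.
    static final FilenameFilter BUGGY = (dir, name) -> dir.getName().endsWith(".pdf");

    // Intended variant: tests the file's own name.
    static final FilenameFilter FIXED = (dir, name) -> name.endsWith(".pdf");

    public static void main(String[] args) {
        File docs = new File("docs");
        File oddDir = new File("archive.pdf"); // a directory that happens to end in .pdf

        System.out.println(BUGGY.accept(docs, "report.pdf"));  // false: the PDF is rejected
        System.out.println(BUGGY.accept(oddDir, "notes.txt")); // true: a non-PDF is accepted
        System.out.println(FIXED.accept(docs, "report.pdf"));  // true
        System.out.println(FIXED.accept(oddDir, "notes.txt")); // false
    }
}
```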






[jira] [Commented] (PDFBOX-4106) Vertical text creation

2018-02-17 Thread Aaron Madlon-Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368214#comment-16368214
 ] 

Aaron Madlon-Kay commented on PDFBOX-4106:
--

[~tilman] OK, sounds good. I have attached the fix patches, prefixed with 
{{FIX}}.

I added a test for proportional fonts that does indeed fail prior to the final 
patch.

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: 
> 0001-Add-OpenTypeScript-class-to-get-OT-script-tags-for-c.patch, 
> 0002-Optimize-Unicode-script-storage-and-lookup.patch, 
> 0003-Parse-GSUB-table.patch, 
> 0004-Abstract-cmap-lookup-into-an-interface.patch, 
> 0005-Implement-GSUB-substitution-on-TrueTypeFont.patch, 
> 0006-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch, 
> 0007-Add-factory-methods-for-loading-TTF-as-vertical-font.patch, 
> 0008-Implement-vertical-metrics-support-when-embedding-subsetting.patch, 
> FIX-0001-PDFBOX-4106-Remove-early-outs-leading-to-spurious-wa.patch, 
> FIX-0002-PDFBOX-4106-Document-GlyphSubstitutionTable-public-m.patch, 
> FIX-0003-PDFBOX-4106-Correct-deltaGlyphID-data-size.patch, 
> FIX-0004-PDFBOX-4106-Remove-unnecessary-vertical-displacement.patch, 
> FIX-0005-PDFBOX-4106-Remove-duplicate-DW2-creation.patch, 
> FIX-0006-PDFBOX-4106-Fix-non-embedded-vertical-font-rendering.patch, 
> FIX-0007-PDFBOX-4106-Fix-incorrect-parsing-of-W2-first-format.patch, 
> FIX-0008-PDFBOX-4106-Rename-misleading-field.patch, 
> FIX-0009-PDFBOX-4106-Allow-retrieving-vmtx-topSideBearing.patch, 
> FIX-0010-PDFBOX-4106-Correct-vmtx-embedding-for-proportional-.patch, 
> sample_code.txt, vertical.pdf






[jira] [Updated] (PDFBOX-4106) Vertical text creation

2018-02-17 Thread Aaron Madlon-Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Madlon-Kay updated PDFBOX-4106:
-
Attachment: 
    FIX-0001-PDFBOX-4106-Remove-early-outs-leading-to-spurious-wa.patch
    FIX-0002-PDFBOX-4106-Document-GlyphSubstitutionTable-public-m.patch
    FIX-0003-PDFBOX-4106-Correct-deltaGlyphID-data-size.patch
    FIX-0004-PDFBOX-4106-Remove-unnecessary-vertical-displacement.patch
    FIX-0005-PDFBOX-4106-Remove-duplicate-DW2-creation.patch
    FIX-0006-PDFBOX-4106-Fix-non-embedded-vertical-font-rendering.patch
    FIX-0007-PDFBOX-4106-Fix-incorrect-parsing-of-W2-first-format.patch
    FIX-0008-PDFBOX-4106-Rename-misleading-field.patch
    FIX-0009-PDFBOX-4106-Allow-retrieving-vmtx-topSideBearing.patch
    FIX-0010-PDFBOX-4106-Correct-vmtx-embedding-for-proportional-.patch

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: 
> 0001-Add-OpenTypeScript-class-to-get-OT-script-tags-for-c.patch, 
> 0002-Optimize-Unicode-script-storage-and-lookup.patch, 
> 0003-Parse-GSUB-table.patch, 
> 0004-Abstract-cmap-lookup-into-an-interface.patch, 
> 0005-Implement-GSUB-substitution-on-TrueTypeFont.patch, 
> 0006-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch, 
> 0007-Add-factory-methods-for-loading-TTF-as-vertical-font.patch, 
> 0008-Implement-vertical-metrics-support-when-embedding-subsetting.patch, 
> FIX-0001-PDFBOX-4106-Remove-early-outs-leading-to-spurious-wa.patch, 
> FIX-0002-PDFBOX-4106-Document-GlyphSubstitutionTable-public-m.patch, 
> FIX-0003-PDFBOX-4106-Correct-deltaGlyphID-data-size.patch, 
> FIX-0004-PDFBOX-4106-Remove-unnecessary-vertical-displacement.patch, 
> FIX-0005-PDFBOX-4106-Remove-duplicate-DW2-creation.patch, 
> FIX-0006-PDFBOX-4106-Fix-non-embedded-vertical-font-rendering.patch, 
> FIX-0007-PDFBOX-4106-Fix-incorrect-parsing-of-W2-first-format.patch, 
> FIX-0008-PDFBOX-4106-Rename-misleading-field.patch, 
> FIX-0009-PDFBOX-4106-Allow-retrieving-vmtx-topSideBearing.patch, 
> FIX-0010-PDFBOX-4106-Correct-vmtx-embedding-for-proportional-.patch, 
> sample_code.txt, vertical.pdf






[jira] [Commented] (PDFBOX-4106) Vertical text creation

2018-02-16 Thread Aaron Madlon-Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368124#comment-16368124
 ] 

Aaron Madlon-Kay commented on PDFBOX-4106:
--

[~tilman] {quote}No, I don't have a better solution. I've created a test that 
fails if somebody in the future isn't careful enough.{quote}

I have the best solution: That fixup code is unnecessary; I added it before 
implementing subsetting, which I now understand obviates any need for fixing up 
at that point in the code. I have prepared a patch to remove it.

{quote}Can you explain the "Trying to un-substitute a never-before-seen gid" 
message one gets when using mode 3 in the example, i.e. disabling the two 
features? Why is this a "bad" thing / is it a bad thing?{quote}

As I intended it, trying to unsubstitute a gid that has never been substituted 
before indicates that something is going wrong: substitution is not really 
reversible, so without a cache you can't know the original for a given 
substitute. However, as you pointed out, due to the early outs in 
{{getSubstitution}} you will get spurious warnings. So I have removed the early 
outs so lookups are always cached, and added some documentation.

(What *really* should be done is that whoever called {{getSubstitution}} should 
keep a map of originals-to-substitutes, and {{getUnsubstitution}} and even 
{{CmapLookup.getCharCodes}} should probably never be called. Or, 
{{CmapSubtable}} needs to know about GSUB when it builds its gid-to-charcode 
map. All this seemed like too big of a change to tackle just yet.)
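
The idea of recording every lookup so that it can be reversed later might look like the sketch below. It is a simplified model, not the patched {{GlyphSubstitutionTable}}: {{SubstitutionRecordSketch}}, {{addSubstitution}} and the {{warnings}} counter are hypothetical, and the table is a plain map standing in for a type 1 (single substitution) lookup.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model; names other than the two lookup methods are made up.
class SubstitutionRecordSketch {
    private final Map<Integer, Integer> table = new HashMap<>();   // stand-in GSUB data
    private final Map<Integer, Integer> reverse = new HashMap<>(); // substitute -> original
    int warnings = 0;

    void addSubstitution(int gid, int substitute) {
        table.put(gid, substitute);
    }

    int getSubstitution(int gid) {
        int substitute = table.getOrDefault(gid, gid);
        // Recorded even when the gid maps to itself (no early out),
        // so a later reverse lookup of an unchanged gid does not warn.
        reverse.put(substitute, gid);
        return substitute;
    }

    int getUnsubstitution(int sgid) {
        Integer original = reverse.get(sgid);
        if (original == null) {
            warnings++;
            System.err.println("Trying to un-substitute a never-before-seen gid: " + sgid);
            return sgid;
        }
        return original;
    }
}
```

In this model the warning fires only for a gid that never passed through {{getSubstitution}}, which is the situation the message was meant to flag.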

I am thinking about how best to test the existing changes and my new fixes. In 
the meantime, I wanted to ask: Are you OK with this ticket exploding with more 
patch attachments? Or might you prefer to grab from my [git 
branch|https://github.com/amake/pdfbox/tree/PDFBOX-4106]?

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: 
> 0001-Add-OpenTypeScript-class-to-get-OT-script-tags-for-c.patch, 
> 0002-Optimize-Unicode-script-storage-and-lookup.patch, 
> 0003-Parse-GSUB-table.patch, 
> 0004-Abstract-cmap-lookup-into-an-interface.patch, 
> 0005-Implement-GSUB-substitution-on-TrueTypeFont.patch, 
> 0006-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch, 
> 0007-Add-factory-methods-for-loading-TTF-as-vertical-font.patch, 
> 0008-Implement-vertical-metrics-support-when-embedding-subsetting.patch, 
> sample_code.txt, vertical.pdf






[jira] [Commented] (PDFBOX-4106) Vertical text creation

2018-02-16 Thread Aaron Madlon-Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16367522#comment-16367522
 ] 

Aaron Madlon-Kay commented on PDFBOX-4106:
--

[~tilman] Thanks for the fixes.

I have just discovered that the vertical metrics handling is wrong for some 
proportional fonts. I am working on a fix.

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: 
> 0001-Add-OpenTypeScript-class-to-get-OT-script-tags-for-c.patch, 
> 0002-Optimize-Unicode-script-storage-and-lookup.patch, 
> 0003-Parse-GSUB-table.patch, 
> 0004-Abstract-cmap-lookup-into-an-interface.patch, 
> 0005-Implement-GSUB-substitution-on-TrueTypeFont.patch, 
> 0006-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch, 
> 0007-Add-factory-methods-for-loading-TTF-as-vertical-font.patch, 
> 0008-Implement-vertical-metrics-support-when-embedding-subsetting.patch, 
> sample_code.txt, vertical.pdf






[jira] [Comment Edited] (PDFBOX-4106) Vertical text creation

2018-02-15 Thread Aaron Madlon-Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366501#comment-16366501
 ] 

Aaron Madlon-Kay edited comment on PDFBOX-4106 at 2/16/18 1:19 AM:
---

[~tilman]
{quote}Re patch 6:
I removed "protected" thus making these fields package local, so we don't 
expose these previously private fields.

I'm wondering whether this commit is the best solution: from what I see, it 
"undoes" and/or "improves" something that is done in 
CIDFont.readVerticalDisplacements(). Had this been wrong before? Or was it 
wrong only for type 2 and not for type 0? If so, should there be an abstract 
method in the base class and a method in the subclasses?{quote}

Thanks for pointing this out. I am also not so happy with it; allow me to 
explain:

# Loading a TTF file as a vertical font: the {{PDCIDFont}} constructor is 
called as {{super}} from the {{PDCIDFontType2}} constructor; there we read 
vertical displacements from the CIDFont dictionary (DW2, W2 entries). In the 
case of a type 2 font being loaded from a file, there will be no DW2 or W2 so 
the result is that it loads nothing.
# Back in {{PDCIDFontType2}}, we must fix up the vertical displacement data 
based on the TTF. That is what {{fixVerticalDisplacements}} is doing.
# Vertical metrics are weird in that the TTF 'vhea' and 'vmtx' tables are never 
used by PDF readers; instead that info must be translated into DW2 and W2 
entries. After (2) the in-memory font is inconsistent in that it has vertical 
displacement data loaded but not present in the CIDFont dictionary. Thus we 
write the loaded data back into the CIDFont; that's 
{{freezeVerticalPositions}}. If we don't do this then e.g. when rendering to 
bitmap the font gets re-loaded from memory and the vertical positions will be 
wrong.

I wanted to simply override {{readVerticalDisplacements}}, but it would take 
significant changes to the initialization of both classes: we need the {{ttf}} 
and {{cid2gid}} fields to be initialized before we can fix up the 
displacements, but {{readVerticalDisplacements}} is called from the {{super}} 
constructor so these won't be initialized yet.
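
The initialization-order problem is general Java behaviour and can be shown with a stripped-down example. The class names below ({{BaseFontSketch}}, {{Type2FontSketch}}) and the {{String}} stand-in for the TTF are hypothetical; they only mirror the shape of {{PDCIDFont}} and {{PDCIDFontType2}}.

```java
// Hypothetical stand-ins that mirror the shape of PDCIDFont / PDCIDFontType2.
class BaseFontSketch {
    BaseFontSketch() {
        // Runs before any subclass constructor body has assigned its fields.
        readVerticalDisplacements();
    }

    void readVerticalDisplacements() {
        // The base class would read DW2/W2 from the dictionary here.
    }
}

class Type2FontSketch extends BaseFontSketch {
    private final String ttf;       // stand-in for the TrueTypeFont field
    boolean ttfSeenDuringSuper;     // no initializer, so the value set below survives

    Type2FontSketch(String ttf) {
        super();                    // calls the override below while this.ttf is still null
        this.ttf = ttf;
    }

    @Override
    void readVerticalDisplacements() {
        ttfSeenDuringSuper = (ttf != null);
    }

    String ttf() {
        return ttf;
    }

    public static void main(String[] args) {
        Type2FontSketch font = new Type2FontSketch("font.ttf");
        System.out.println(font.ttfSeenDuringSuper); // false
        System.out.println(font.ttf());              // font.ttf
    }
}
```

The override runs during {{super()}}, when {{ttf}} is still {{null}}, which is why an overridden {{readVerticalDisplacements}} could not use the {{ttf}} and {{cid2gid}} fields without restructuring the constructors.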

I chose to manipulate the vertical data fields directly rather than duplicate 
them and add getters because there's some extensive position-getting logic in 
the parent class that it seems I would need to duplicate.

If you have an idea for a better solution please let me know.


> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New 

[jira] [Commented] (PDFBOX-4106) Vertical text creation

2018-02-15 Thread Aaron Madlon-Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366501#comment-16366501
 ] 

Aaron Madlon-Kay commented on PDFBOX-4106:
--

[~tilman]
{quote}Re patch 6:
I removed "protected" thus making these fields package local, so we don't 
expose these previously private fields.

I'm wondering whether this commit is the best solution: from what I see, it 
"undoes" and/or "improves" something that is done in 
CIDFont.readVerticalDisplacements(). Had this been wrong before? Or was it 
wrong only for type 2 and not for type 0? If so, should there be an abstract 
method in the base class and a method in the subclasses?{quote}

Thanks for pointing this out. I am also not so happy with it; allow me to 
explain:

# Loading a TTF file as a vertical font: the {{PDCIDFont}} constructor is 
called as {{super}} from the {{PDCIDFontType2}} constructor; there we read 
vertical displacements from the CIDFont dictionary (DW2, W2 entries). In the 
case of a type 2 font being loaded from a file, there will be no DW2 or W2 so 
the result is that it loads nothing.
# Back in {{PDCIDFontType2}}, we must fix up the vertical displacement data 
based on the TTF. That is what {{fixVerticalDisplacements}} is doing.
# Vertical metrics are weird in that the TTF 'vhea' and 'vmtx' tables are never 
used by PDF readers; instead that info must be translated into DW2 and W2 
tables. After (2) the in-memory font is inconsistent in that it has vertical 
displacement data loaded but not present in the CIDFont dictionary. We write 
the loaded data back into the CIDFont; that's {{freezeVerticalPositions}}. If 
we don't do this then e.g. when rendering to bitmap the font gets re-loaded 
from memory and the vertical positions will be wrong.
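
The translation in step 3 can be sketched as below. This is only an illustration, not the code from the patch: the class and method names are hypothetical, and the formulas assume the usual conventions (PDF glyph space of 1000 units per em, a vertical advance of minus the 'vmtx' advance height, and a position vector at half the horizontal advance and at the glyph's yMax plus its top side bearing).

```java
// Hypothetical helper, not part of PDFBox: derives one W2 entry from TTF metrics.
final class VerticalMetricsSketch {

    /**
     * Returns {w1y, vx, vy} scaled to 1000 units per em.
     * All inputs are in font design units as read from the TTF tables.
     */
    static long[] w2Entry(int unitsPerEm, int advanceWidth, int advanceHeight,
                          int topSideBearing, int yMax) {
        double scale = 1000.0 / unitsPerEm;
        // Vertical writing advances downwards, hence the negative sign.
        long w1y = Math.round(-advanceHeight * scale);
        // Assumed: the vertical origin sits at the horizontal centre of the glyph.
        long vx = Math.round(advanceWidth * scale / 2.0);
        // Assumed: the vertical origin is the top of the glyph plus its top side bearing.
        long vy = Math.round((yMax + topSideBearing) * scale);
        return new long[] {w1y, vx, vy};
    }
}
```

Glyphs whose values match the font-wide default would be covered by DW2 and need no W2 entry of their own.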

I wanted to simply override {{readVerticalDisplacements}}, but it would take 
significant changes to the initialization of both classes: we need the {{ttf}} 
and {{cid2gid}} fields to be initialized before we can fix up the 
displacements, but {{readVerticalDisplacements}} is called from the {{super}} 
constructor so these won't be initialized yet.
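
The initialization-order problem described above is general Java behaviour and can be reproduced with a small sketch. The class names below are hypothetical stand-ins for {{PDCIDFont}} and {{PDCIDFontType2}}, and the string field stands in for the {{ttf}} reference; none of this is PDFBox code.

```java
// Hypothetical stand-in for the parent class (PDCIDFont in the discussion).
class BaseFont {
    BaseFont() {
        // Runs before any subclass field has been assigned.
        readVerticalDisplacements();
    }

    void readVerticalDisplacements() {
        // The real method reads DW2/W2; nothing to do in this sketch.
    }
}

// Hypothetical stand-in for the subclass (PDCIDFontType2 in the discussion).
class TtfBackedFont extends BaseFont {
    private final String ttf;
    // No initializers here: an initializer would run after super() and
    // overwrite whatever the overridden method recorded.
    String seenDuringSuper;
    String seenAfterFixUp;

    TtfBackedFont(String ttf) {
        super();                      // calls the override below while ttf is still null
        this.ttf = ttf;
        fixVerticalDisplacements();   // safe: the field is assigned by now
    }

    @Override
    void readVerticalDisplacements() {
        seenDuringSuper = String.valueOf(ttf);
    }

    private void fixVerticalDisplacements() {
        seenAfterFixUp = String.valueOf(ttf);
    }
}
```

An override of the method invoked by the parent constructor only ever sees the unassigned field, which is why a separate fix-up step after {{super}} returns is the less invasive option.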

I chose to manipulate the vertical data fields directly rather than duplicate 
them and add getters because there's some extensive position-getting logic in 
the parent class that it seems I would need to duplicate.

If you have an idea for a better solution please let me know.

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: 
> 0001-Add-OpenTypeScript-class-to-get-OT-script-tags-for-c.patch, 
> 0002-Optimize-Unicode-script-storage-and-lookup.patch, 
> 0003-Parse-GSUB-table.patch, 
> 0004-Abstract-cmap-lookup-into-an-interface.patch, 
> 0005-Implement-GSUB-substitution-on-TrueTypeFont.patch, 
> 0006-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch, 
> 0007-Add-factory-methods-for-loading-TTF-as-vertical-font.patch, 
> 0008-Implement-vertical-metrics-support-when-embedding-subsetting.patch, 
> sample_code.txt, vertical.pdf
>
>
> I needed to output vertical Japanese text, but was stymied by several 
> limitations:
> * No API to load a TTF as Identity-V encoding
> * No support for 'vert' glyph substitution
> * No support for vertical metrics ('vhea' and 'vmtx' tables are parsed but 
> not used at all)
> I have attached a series of patches that implement the above features. 
> Highlights:
> * The GSUB glyph substitution table is parsed (limitation: type 1 lookups 
> only; this is sufficient for many features including 'vert'/'vrt2' vertical 
> glyph substitution)
> * Cmap lookup makes use of GSUB when features are enabled on a TTF
> * 'vhea' and 'vmtx' metrics are applied to PDCIDFont when appropriate, and 
> are embedded/subsetted correctly through the DW2/W2 CIDFont dictionary
> * An API has been added for loading a TTF as a vertical font, setting 
> Identity-V encoding and enabling 'vert'/'vrt2' substitution
> Each patch could approximately be split out into a separate ticket, if 
> desired.
> Also attached is some sample code that exercises these patches and 
> illustrates the effect of vertical glyph positioning. The sample output PDF 
> is also attached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Commented] (PDFBOX-4106) Vertical text creation

2018-02-14 Thread Aaron Madlon-Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363654#comment-16363654
 ] 

Aaron Madlon-Kay commented on PDFBOX-4106:
--

Actually, when removing the {{UnicodeScript}} stuff I found that a full 
reimplementation based on the actual Unicode 10 Scripts.txt provides 
significantly better coverage than JDK7's, which I guess is based on Unicode 6. 
So I think the JDK7-based implementation can be thrown out, but I do have it in 
git if you really want it.

I have deleted my previous patches and uploaded a new set with the changes 
discussed earlier. However I am getting an internal error when I try to upload 
patch 8; here is a link to it on GitHub:
https://github.com/amake/pdfbox/commit/bf799ddf73db4f23cede0d8564221e1caa388223.patch

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: 
> 0001-Add-OpenTypeScript-class-to-get-OT-script-tags-for-c.patch, 
> 0002-Optimize-Unicode-script-storage-and-lookup.patch, 
> 0003-Parse-GSUB-table.patch, 
> 0004-Abstract-cmap-lookup-into-an-interface.patch, 
> 0005-Implement-GSUB-substitution-on-TrueTypeFont.patch, 
> 0006-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch, 
> 0007-Add-factory-methods-for-loading-TTF-as-vertical-font.patch, 
> sample_code.txt, vertical.pdf
>
>



[jira] [Updated] (PDFBOX-4106) Vertical text creation

2018-02-14 Thread Aaron Madlon-Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Madlon-Kay updated PDFBOX-4106:
-
Attachment: 0001-Add-OpenTypeScript-class-to-get-OT-script-tags-for-c.patch
0002-Optimize-Unicode-script-storage-and-lookup.patch
0003-Parse-GSUB-table.patch
0004-Abstract-cmap-lookup-into-an-interface.patch
0005-Implement-GSUB-substitution-on-TrueTypeFont.patch
0006-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch
0007-Add-factory-methods-for-loading-TTF-as-vertical-font.patch

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: 
> 0001-Add-OpenTypeScript-class-to-get-OT-script-tags-for-c.patch, 
> 0002-Optimize-Unicode-script-storage-and-lookup.patch, 
> 0003-Parse-GSUB-table.patch, 
> 0004-Abstract-cmap-lookup-into-an-interface.patch, 
> 0005-Implement-GSUB-substitution-on-TrueTypeFont.patch, 
> 0006-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch, 
> 0007-Add-factory-methods-for-loading-TTF-as-vertical-font.patch, 
> sample_code.txt, vertical.pdf
>
>



[jira] [Updated] (PDFBOX-4106) Vertical text creation

2018-02-14 Thread Aaron Madlon-Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Madlon-Kay updated PDFBOX-4106:
-
Attachment: (was: 
0006-Implement-vertical-metrics-support-when-embedding-su.patch)

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: sample_code.txt, vertical.pdf
>
>



[jira] [Updated] (PDFBOX-4106) Vertical text creation

2018-02-14 Thread Aaron Madlon-Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Madlon-Kay updated PDFBOX-4106:
-
Attachment: (was: 
0003-Implement-GSUB-substitution-on-TrueTypeFont.patch)

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: sample_code.txt, vertical.pdf
>
>



[jira] [Updated] (PDFBOX-4106) Vertical text creation

2018-02-14 Thread Aaron Madlon-Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Madlon-Kay updated PDFBOX-4106:
-
Attachment: (was: 
0005-Add-factory-methods-for-loading-TTF-as-vertical-font.patch)

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: sample_code.txt, vertical.pdf
>
>



[jira] [Updated] (PDFBOX-4106) Vertical text creation

2018-02-14 Thread Aaron Madlon-Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Madlon-Kay updated PDFBOX-4106:
-
Attachment: (was: 
0004-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch)

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: sample_code.txt, vertical.pdf
>
>



[jira] [Updated] (PDFBOX-4106) Vertical text creation

2018-02-14 Thread Aaron Madlon-Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Madlon-Kay updated PDFBOX-4106:
-
Attachment: (was: 0001-Parse-GSUB-table.patch)

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: 0003-Implement-GSUB-substitution-on-TrueTypeFont.patch, 
> 0004-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch, 
> 0005-Add-factory-methods-for-loading-TTF-as-vertical-font.patch, 
> 0006-Implement-vertical-metrics-support-when-embedding-su.patch, 
> sample_code.txt, vertical.pdf
>
>



[jira] [Updated] (PDFBOX-4106) Vertical text creation

2018-02-14 Thread Aaron Madlon-Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Madlon-Kay updated PDFBOX-4106:
-
Attachment: (was: 0002-Abstract-cmap-lookup-into-an-interface.patch)

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: 0003-Implement-GSUB-substitution-on-TrueTypeFont.patch, 
> 0004-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch, 
> 0005-Add-factory-methods-for-loading-TTF-as-vertical-font.patch, 
> 0006-Implement-vertical-metrics-support-when-embedding-su.patch, 
> sample_code.txt, vertical.pdf
>
>



[jira] [Commented] (PDFBOX-4106) Vertical text creation

2018-02-13 Thread Aaron Madlon-Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363453#comment-16363453
 ] 

Aaron Madlon-Kay commented on PDFBOX-4106:
--

ICLA is submitted.

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: 0001-Parse-GSUB-table.patch, 
> 0002-Abstract-cmap-lookup-into-an-interface.patch, 
> 0003-Implement-GSUB-substitution-on-TrueTypeFont.patch, 
> 0004-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch, 
> 0005-Add-factory-methods-for-loading-TTF-as-vertical-font.patch, 
> 0006-Implement-vertical-metrics-support-when-embedding-su.patch, 
> sample_code.txt, vertical.pdf
>
>



[jira] [Commented] (PDFBOX-4106) Vertical text creation

2018-02-13 Thread Aaron Madlon-Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363421#comment-16363421
 ] 

Aaron Madlon-Kay commented on PDFBOX-4106:
--

[~tilman] Thanks for the comments.

{quote}I looked at the first patch; are you aware that UnicodeScript is a JDK7 
class? This would mean your improvement can't be used in PDFBox 2.*, only in 3. 
That one has no planned release date, it could be several months or several 
years.{quote}

I was aware of everything but the timeline for version 3. I have prepared an 
alternative implementation that can run on Java 6. Should I simply replace my 
patches?

{quote}I see you made an effort in patch 5 to make it non-breaking, but I 
suspect that in patch 4 the use of the new CmapLookup interface would be a 
breaking change, isn't it? That would be another reason it couldn't be used in 
2.*.{quote}

Yes, the way I added {{CmapLookup}} would be breaking. What if I did the 
following instead:
# Add new {{getUnicodeCmapLookup}} getters to {{TrueTypeFont}} that return 
{{CmapLookup}}
# Deprecate the existing {{getUnicodeCmap}} getters
# Almost all uses of {{CmapSubtable}} in other classes are local variables or 
private members that don't leak outside of the class; for these we simply move 
to {{CmapLookup}}
# The one exception is {{TrueTypeEmbedder}}, where it is a protected member. 
Here we mark the member deprecated and add {{CmapLookup}} as a separate member, 
which we use in all cases.
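
A minimal sketch of steps 1 and 2, assuming simplified stand-ins for the real classes: the type and getter names follow the proposal above, but the bodies, the map-backed subtable and the single-method interface are illustrative assumptions rather than the FontBox implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the proposed interface.
interface CmapLookup {
    int getGlyphId(int codePoint);
}

// Simplified stand-in for the existing concrete class, which now implements the interface.
class CmapSubtable implements CmapLookup {
    private final Map<Integer, Integer> codeToGid;

    CmapSubtable(Map<Integer, Integer> codeToGid) {
        this.codeToGid = new HashMap<>(codeToGid);
    }

    @Override
    public int getGlyphId(int codePoint) {
        Integer gid = codeToGid.get(codePoint);
        return gid == null ? 0 : gid;   // 0 is the conventional "missing glyph" id
    }
}

// Simplified stand-in for TrueTypeFont showing the non-breaking migration.
class TrueTypeFontSketch {
    private final CmapSubtable unicodeCmap;

    TrueTypeFontSketch(CmapSubtable unicodeCmap) {
        this.unicodeCmap = unicodeCmap;
    }

    /** Existing getter: kept so current callers still compile, but discouraged. */
    @Deprecated
    CmapSubtable getUnicodeCmap() {
        return unicodeCmap;
    }

    /** New getter: exposes only the interface, so a GSUB-aware lookup can be swapped in later. */
    CmapLookup getUnicodeCmapLookup() {
        return unicodeCmap;
    }
}
```

Callers that only need glyph ids move to the new getter, while the old signature stays source- and binary-compatible until it can be removed in a major release.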

{quote}The sample code (...) is too difficult for an average user. I'd prefer 
to have something on a high level, e.g. an option in an additional vertical 
loader, or a method. The "natural" setting should be the default. (What would a 
newbie "vertical font user" expect? (2) or (3) in your sample PDF ?){quote}

Actually that is somewhat intentional: if you are using a vertical font you 
would always want vertical glyph substitution as far as I know (at least for 
Japanese, you always want (2) and never (3)). I added the 
{{disableGsubFeature}} method to complement the {{enable}} method, on the off 
chance that a power user (who would surely know which GSUB features they 
wanted) needed more control. 99% of the time you would not be finagling 
features manually at all.
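
As a toy model of what enabling and disabling a feature means for a type 1 (single substitution) lookup, consider the sketch below. The class is hypothetical and keeps substitutions in plain maps; only the two method names mirror the {{enable}}/{{disableGsubFeature}} methods mentioned above, and the glyph ids are made up.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model, not the FontBox implementation.
final class GsubToggleSketch {
    private final Map<String, Map<Integer, Integer>> lookupsByFeature = new HashMap<>();
    private final Set<String> enabledFeatures = new LinkedHashSet<>();

    /** Registers a one-to-one glyph substitution under a feature tag such as "vert". */
    void addSingleSubstitution(String feature, int fromGid, int toGid) {
        lookupsByFeature.computeIfAbsent(feature, k -> new HashMap<>()).put(fromGid, toGid);
    }

    void enableGsubFeature(String feature) {
        enabledFeatures.add(feature);
    }

    void disableGsubFeature(String feature) {
        enabledFeatures.remove(feature);
    }

    /** Returns the substituted glyph id, or the input when no enabled feature applies. */
    int substitute(int gid) {
        for (String feature : enabledFeatures) {
            Map<Integer, Integer> lookup = lookupsByFeature.get(feature);
            if (lookup != null && lookup.containsKey(gid)) {
                return lookup.get(gid);
            }
        }
        return gid;
    }
}
```

In this model a vertical-font loader would simply enable "vert" (or "vrt2") up front, so ordinary callers never touch the feature set.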

I will submit the ICLA shortly.

> Vertical text creation
> --
>
> Key: PDFBOX-4106
> URL: https://issues.apache.org/jira/browse/PDFBOX-4106
> Project: PDFBox
>  Issue Type: New Feature
>  Components: FontBox, Parsing, Writing
>Reporter: Aaron Madlon-Kay
>Priority: Major
>  Labels: embed, gsub, parsing, vertical
> Attachments: 0001-Parse-GSUB-table.patch, 
> 0002-Abstract-cmap-lookup-into-an-interface.patch, 
> 0003-Implement-GSUB-substitution-on-TrueTypeFont.patch, 
> 0004-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch, 
> 0005-Add-factory-methods-for-loading-TTF-as-vertical-font.patch, 
> 0006-Implement-vertical-metrics-support-when-embedding-su.patch, 
> sample_code.txt, vertical.pdf
>
>



[jira] [Created] (PDFBOX-4106) Vertical text creation

2018-02-12 Thread Aaron Madlon-Kay (JIRA)
Aaron Madlon-Kay created PDFBOX-4106:


 Summary: Vertical text creation
 Key: PDFBOX-4106
 URL: https://issues.apache.org/jira/browse/PDFBOX-4106
 Project: PDFBox
  Issue Type: New Feature
  Components: FontBox, Parsing, Writing
Reporter: Aaron Madlon-Kay
 Attachments: 0001-Parse-GSUB-table.patch, 
0002-Abstract-cmap-lookup-into-an-interface.patch, 
0003-Implement-GSUB-substitution-on-TrueTypeFont.patch, 
0004-Use-vhea-vmtx-to-fix-vertical-displacements-in-PCIDF.patch, 
0005-Add-factory-methods-for-loading-TTF-as-vertical-font.patch, 
0006-Implement-vertical-metrics-support-when-embedding-su.patch, 
sample_code.txt, vertical.pdf

I needed to output vertical Japanese text, but was stymied by several 
limitations:
* No API to load a TTF as Identity-V encoding
* No support for 'vert' glyph substitution
* No support for vertical metrics ('vhea' and 'vmtx' tables are parsed but not 
used at all)

I have attached a series of patches that implement the above features. 
Highlights:
* The GSUB glyph substitution table is parsed (limitation: type 1 lookups only; 
this is sufficient for many features including 'vert'/'vrt2' vertical glyph 
substitution)
* Cmap lookup makes use of GSUB when features are enabled on a TTF
* 'vhea' and 'vmtx' metrics are applied to PDCIDFont when appropriate, and are 
embedded/subsetted correctly through the DW2/W2 CIDFont dictionary
* An API has been added for loading a TTF as a vertical font, setting 
Identity-V encoding and enabling 'vert'/'vrt2' substitution

Each patch could be split out into its own ticket, more or less, if desired.

Also attached is some sample code that exercises these patches and illustrates 
the effect of vertical glyph positioning. The sample output PDF is also 
attached.
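
As a rough illustration of the type 1 (single substitution) lookup mentioned 
above, here is a minimal sketch. It is not the code from the attached patches: 
the class name, method names, and glyph IDs are all hypothetical, and a plain 
map stands in for the parsed GSUB data. It only shows the idea that an enabled 
feature maps one glyph ID to another and leaves uncovered glyphs unchanged.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the patch code: models a GSUB type 1 lookup
// as a plain map from an input glyph ID to its substitute glyph ID.
public class SingleSubstitutionSketch {
    private final Map<Integer, Integer> vertSubstitutions = new HashMap<>();
    private boolean vertEnabled = false;

    // Register one substitution pair (the glyph IDs used here are made up).
    public void addSubstitution(int glyphId, int substituteId) {
        vertSubstitutions.put(glyphId, substituteId);
    }

    public void setVertEnabled(boolean enabled) {
        vertEnabled = enabled;
    }

    // Return the substitute when the feature is on and the glyph is covered;
    // otherwise return the glyph ID unchanged.
    public int substitute(int glyphId) {
        if (!vertEnabled) {
            return glyphId;
        }
        return vertSubstitutions.getOrDefault(glyphId, glyphId);
    }

    public static void main(String[] args) {
        SingleSubstitutionSketch gsub = new SingleSubstitutionSketch();
        gsub.addSubstitution(100, 2000); // horizontal form -> vertical form
        System.out.println(gsub.substitute(100)); // feature off
        gsub.setVertEnabled(true);
        System.out.println(gsub.substitute(100)); // feature on, glyph covered
        System.out.println(gsub.substitute(101)); // feature on, glyph not covered
    }
}
```

A real lookup would also have to select the script and language system before 
choosing features; that part is deliberately left out of this sketch.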






[jira] [Commented] (PDFBOX-3281) HTML output wrongly specifies UTF-16 in header

2016-03-21 Thread Aaron Madlon-Kay (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205800#comment-15205800
 ] 

Aaron Madlon-Kay commented on PDFBOX-3281:
--

Snooping around the source, I find that the UTF-16 charset really is hard-coded:

https://github.com/apache/pdfbox/blob/trunk/tools/src/main/java/org/apache/pdfbox/tools/PDFText2HTML.java#L75

> HTML output wrongly specifies UTF-16 in header
> --
>
> Key: PDFBOX-3281
> URL: https://issues.apache.org/jira/browse/PDFBOX-3281
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 2.0.0
> Environment: OS X 10.11.4, Java 1.8.0_73-b02
> Reporter: Aaron Madlon-Kay
> Attachments: testdoc.html, testdoc.pdf
>
>
> When running the command line {{ExtractText}} with the {{-html}} flag, the 
> output file always has the following meta tag specifying UTF-16 regardless of 
> the actual output encoding:
> {code:html}
> 
> {code}
> This causes editors that respect the meta tag (emacs, etc.) to garble the 
> file content.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Updated] (PDFBOX-3281) HTML output wrongly specifies UTF-16 in header

2016-03-21 Thread Aaron Madlon-Kay (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Madlon-Kay updated PDFBOX-3281:
-
Attachment: testdoc.pdf
testdoc.html

Attached are a test PDF and the result of extracting it via:
{code}
java -jar pdfbox-app-2.0.0.jar ExtractText -html testdoc.pdf 
{code}







[jira] [Created] (PDFBOX-3281) HTML output wrongly specifies UTF-16 in header

2016-03-21 Thread Aaron Madlon-Kay (JIRA)
Aaron Madlon-Kay created PDFBOX-3281:


Summary: HTML output wrongly specifies UTF-16 in header
Key: PDFBOX-3281
URL: https://issues.apache.org/jira/browse/PDFBOX-3281
Project: PDFBox
Issue Type: Bug
Components: Text extraction
Affects Versions: 2.0.0
Environment: OS X 10.11.4, Java 1.8.0_73-b02
Reporter: Aaron Madlon-Kay


When running the command line {{ExtractText}} with the {{-html}} flag, the 
output file always has the following meta tag specifying UTF-16 regardless of 
the actual output encoding:

{code:html}

{code}

This causes editors that respect the meta tag (emacs, etc.) to garble the file 
content.
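
One way a fix might look is to derive the charset in the meta tag from the 
encoding actually used for the output. The following is only a sketch under 
that assumption: the class and method names are hypothetical, and this is not 
the PDFBox code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: build the meta tag from the real output encoding
// instead of a hard-coded charset name.
public class HtmlHeaderSketch {

    // Charset.name() returns the canonical name, e.g. "UTF-8".
    static String metaTag(Charset outputEncoding) {
        return "<meta http-equiv=\"Content-Type\" content=\"text/html; charset="
                + outputEncoding.name() + "\">";
    }

    public static void main(String[] args) {
        System.out.println(metaTag(StandardCharsets.UTF_8));
        System.out.println(metaTag(StandardCharsets.UTF_16));
    }
}
```

With this approach the header and the bytes written to the file cannot 
disagree, as long as the same Charset object is passed to both the writer and 
the header builder.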


