[ 
https://issues.apache.org/jira/browse/PDFBOX-5808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17841201#comment-17841201
 ] 

Fabrice Calafat commented on PDFBOX-5808:
-----------------------------------------

Thank you for all the updates! Your solution for CompoundCharacterTokenizer is 
indeed much better.

I was looking at [this 
test|https://github.com/apache/pdfbox/blob/58fb817db797ac9674214467a6db2208f00502ed/fontbox/src/test/java/org/apache/fontbox/ttf/gsub/CompoundCharacterTokenizerTest.java#L155],
 the output is _100, which I guess is OK since the split happening in 
org.apache.fontbox.ttf.gsub.GlyphArraySplitterRegexImpl#convertGlyphIdsToList 
will cover it.

Do you think the output should still respect the _<glyph id>_ format?
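
To illustrate why an output such as _100 without a trailing separator may be harmless, here is a minimal sketch. It assumes the downstream split simply breaks on the underscore and drops empty tokens; it is not the actual GlyphArraySplitterRegexImpl code, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class GlyphIdSplitSketch {

    // Hypothetical helper: turns "_12_34_" style strings into glyph ids.
    // Assumption: empty tokens caused by leading or trailing separators are skipped.
    public static List<Integer> toGlyphIds(String compound) {
        List<Integer> ids = new ArrayList<>();
        for (String token : compound.split("_")) {
            if (!token.isEmpty()) {
                ids.add(Integer.parseInt(token));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Both forms yield the same single glyph id under this assumption.
        System.out.println(toGlyphIds("_100"));
        System.out.println(toGlyphIds("_100_"));
    }
}
```

Under that assumption both strings produce the same list, so the question is mostly one of consistency in the tokenizer's output rather than correctness.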

> Add support for GSUB Lookup Type 3
> ----------------------------------
>
>                 Key: PDFBOX-5808
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5808
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: FontBox
>    Affects Versions: 3.0.2 PDFBox
>            Reporter: Fabrice Calafat
>            Priority: Major
>         Attachments: screenshot-1.png, screenshot-2.png
>
>
> Add support for the lookup type 3, Alternate Substitution when handling GSUB:
> [https://learn.microsoft.com/en-us/typography/opentype/spec/gsub#AS]
> The first available substitution glyph can be used (as done in other 
> libraries)
>  
> Also, the current implementation of CompoundCharacterTokenizer doesn't 
> account for collision in ligatures
> For example, if a font supports ligatures for _att_ and _en_, the current 
> implementation will not tokenize the word _attention_ properly. This is 
> because the regex implementation doesn't allow for a proper split.
>  
> I'll open a proposed implementation for the above
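
The _att_ / _en_ case from the quoted description can be illustrated with a longest-match-first scan. This is only a sketch under stated assumptions: it works on plain characters instead of glyph ids for readability, it is not the implementation proposed for PDFBOX-5808, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LigatureTokenizerSketch {

    // Splits text into ligature tokens and the plain runs between them.
    // At each position the longest ligature that matches is taken, so
    // adjacent ligatures such as "att" followed by "en" are both found.
    public static List<String> tokenize(String text, List<String> ligatures) {
        List<String> sorted = new ArrayList<>(ligatures);
        sorted.sort((a, b) -> b.length() - a.length()); // longest first

        List<String> tokens = new ArrayList<>();
        StringBuilder plain = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            String match = null;
            for (String ligature : sorted) {
                if (text.startsWith(ligature, i)) {
                    match = ligature;
                    break;
                }
            }
            if (match == null) {
                // No ligature starts here: extend the current plain run.
                plain.append(text.charAt(i));
                i++;
            } else {
                // Flush any pending plain run, then emit the ligature.
                if (plain.length() > 0) {
                    tokens.add(plain.toString());
                    plain.setLength(0);
                }
                tokens.add(match);
                i += match.length();
            }
        }
        if (plain.length() > 0) {
            tokens.add(plain.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [att, en, tion]
        System.out.println(tokenize("attention", Arrays.asList("att", "en")));
    }
}
```

Scanning position by position avoids depending on a separator shared between two neighbouring matches, which is one way back-to-back ligatures can be missed by a regex split.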



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
