It seems we should look at this. According to my mailer, I apparently never saw or read the previous email in which you identified a problem with PangoGlyphLayout; otherwise we would likely have looked at it when you sent it, as I agree this sounds like a major performance issue.

-phil.

On 3/18/18, 4:19 AM, Itai wrote:
Hello,

In hopes of getting this bug fixed, I have made changes to `PangoGlyphLayout` so that it only allocates the FT2 FontMap once, and uses a `PlatformImpl.FinishListener` to unref it when the JavaFX platform exits. Attached is the modified version of the file. In my personal tests, depending on the hardware used, layout times speed up by between 2x and 10x, and scrolling large Lists/Tables feels as snappy with CTL languages as with any other language. For anyone wanting to test it, remember this bug only affects Linux, and only CTL languages (such as Arabic, Farsi, Hebrew and Hindi).
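
The allocate-once, release-at-exit pattern described above can be sketched in plain Java as follows. This is only an illustration, not the attached patch: `SharedFontMap`, `allocate`, and the two counters are hypothetical stand-ins for the native font map handle, and `dispose()` stands in for whatever a `PlatformImpl.FinishListener` would call to unref it.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: allocate an expensive resource once and release it at shutdown. */
final class SharedFontMap {
    // Counters exist only so the sketch can show how often each step happens.
    static final AtomicInteger allocations = new AtomicInteger();
    static final AtomicInteger releases = new AtomicInteger();

    // Hypothetical stand-in for a native pointer; 0 means "not allocated".
    private static long handle;

    private SharedFontMap() {}

    /** Returns the shared handle, allocating it on first use only. */
    static synchronized long get() {
        if (handle == 0) {
            handle = allocate();
        }
        return handle;
    }

    /** Releases the handle; intended to be called once when the platform exits. */
    static synchronized void dispose() {
        if (handle != 0) {
            releases.incrementAndGet(); // stand-in for unref'ing the native object
            handle = 0;
        }
    }

    // Hypothetical allocation: returns a non-zero fake handle.
    private static long allocate() {
        return allocations.incrementAndGet();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            get(); // simulates one layout call needing the font map
        }
        System.out.println("allocations = " + allocations.get());
        dispose();
        System.out.println("releases = " + releases.get());
    }
}
```

In this sketch a thousand simulated layout calls share one allocation, whereas the original code path would have allocated a thousand times.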

Regards,
Itai.

On Sun, Jan 8, 2017 at 11:39 PM, Itai <itai...@gmail.com> wrote:

    I think I have found two problems. The first, and probably most
    critical one, is that a new PangoFontMap is created for every call
    of PangoGlyphLayout#layout. It is not entirely clear from the
    Pango documentation what the lifetime or intended usage of a
    PangoFontMap is, but I have found this comment in [1]:

    "But note that a PangoFontMap is a big expensive object. So, you
    *really* want to be using only one for your entire program.
    Frequently calling pango_ft2_font_map_new() is going kill
    the performance of your application."

    This seems to imply PangoFontMap is intended as a global (per
    display?) font cache. Indeed, creating only one PangoFontMap seems
    to improve performance drastically, although I'm not sure what is
    the best way to handle this object (i.e. when and how it should be
    re-used and freed), as it should probably (?) be held for the
    entire lifetime of the JavaFX application.

    The second problem is probably less significant, but could still
    theoretically hurt performance - it has to do with the usage of
    g_list_nth_data, which as per [2] has O(N) complexity, and is
    called once per item in the list, which yields O(N^2) complexity.
    Replacing it with linked-list traversal with g_list_next should
    reduce this back to O(N).
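
The difference between the two access patterns can be illustrated with a small Java analogue. This is a sketch under stated assumptions, not the Pango/GLib code: `ListWalk`, `Node`, and `nthData` are hypothetical names, with `nthData` mimicking how `g_list_nth_data` walks from the head on every call, and `steps` counting how many links are followed.

```java
/** Sketch: indexed access on a linked list vs. one pointer-following pass. */
final class ListWalk {
    static final class Node {
        final int value;
        final Node next;
        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    // Number of "next" links followed since the last reset.
    static long steps;

    /** Analogue of g_list_nth_data: walks from the head on every call. */
    static int nthData(Node head, int n) {
        Node cur = head;
        for (int i = 0; i < n; i++) {
            cur = cur.next;
            steps++;
        }
        return cur.value;
    }

    /** Visits every element by index: quadratic number of link steps. */
    static long sumByIndex(Node head, int length) {
        long sum = 0;
        for (int i = 0; i < length; i++) {
            sum += nthData(head, i);
        }
        return sum;
    }

    /** Visits every element by following links once: linear number of steps. */
    static long sumByTraversal(Node head) {
        long sum = 0;
        for (Node cur = head; cur != null; cur = cur.next) {
            sum += cur.value;
            steps++;
        }
        return sum;
    }

    /** Builds the list 0, 1, ..., length - 1. */
    static Node build(int length) {
        Node head = null;
        for (int i = length - 1; i >= 0; i--) {
            head = new Node(i, head);
        }
        return head;
    }

    public static void main(String[] args) {
        Node head = build(1000);
        steps = 0;
        sumByIndex(head, 1000);
        System.out.println("indexed steps   = " + steps);
        steps = 0;
        sumByTraversal(head);
        System.out.println("traversal steps = " + steps);
    }
}
```

Both methods compute the same sum; only the number of links followed differs, growing quadratically for the indexed version and linearly for the traversal.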

    I hope this information is clear enough. As I said I lack the
    overall understanding of the JavaFX platform to know where and how
    to manage a global object, such as PangoFontMap should apparently
    be, so I refrain from posting any patch that I know would be wrong.

    Regards,
    Itai.

    [1]:
    https://mail.gnome.org/archives/gtk-list/2005-April/msg00105.html
    [2]:
    https://developer.gnome.org/programming-guidelines/stable/glist.html.en

    On Sun, Jan 8, 2017 at 8:08 PM, Itai <itai...@gmail.com> wrote:

        Thank you for the link, it's an interesting read indeed!

        I wasn't really skipping layout, just using the much simpler
        layout used by Latin scripts, but you are correct that this
        will break for anything more complex - this has nothing to do
        with BiDi though, more to do with complex layout elements
        (like diacritic or cantillation marks for Hebrew, or general
        Arabic/Farsi text).
        Indeed, this can't be a general solution, but I guess I was
        driven by frustration.

        I have tried some more configurations though, and found that
        on Windows the loss of performance is much less noticeable,
        which seems to mean that the problem is either:
        1. Pango is inherently slow / inherently slow when laying out
        BiDi text.
        2. JavaFX uses Pango in a sub-optimal / redundant way.
        3. The JNI / native calls to Pango are done in a sub-optimal way.

        Option 1 can be easily debunked, as general Gnome/GTK
        applications run as smoothly with BiDi text as with Latin /
        LTR text.
        For options 2 and 3 I guess some more digging into the code
        must be done. My understanding is that JNI calls are not
        likely to incur performance loss to such a degree, unless very
        large amounts of memory are copied back and forth between Java
        and native code, so I'll start by reading into the Pango
        documentation and understanding the logic of PangoGlyphLayout.

        If you have any input on this or believe my assumptions or
        conclusions are wrong I'd be glad to hear. I realize you are
        all busy with the upcoming 9 release, so I'll try to get as
        detailed a result as I can.

        Regards,
        Itai.

        On Wed, Jan 4, 2017 at 8:44 PM, Phil Race
        <philip.r...@oracle.com> wrote:

            You can't skip layout just because it is bidi ..
            and by bidi here you apparently mean Hebrew.
            This may appear to work but may not always work even
            for Hebrew, and will be a disaster for Arabic.

            Here is a web page which talks about OTL (OpenType Layout)
            for Hebrew :
            https://www.microsoft.com/typography/OpenTypeDev/hebrew/intro.htm
            I can't say offhand why this might be exclusive to FX.
            That test case would be handy.
            So this needs more analysis even if you found a way to
            limit this specifically to Latin+Hebrew.

            -phil.


            On 01/04/2017 10:32 AM, Itai wrote:
            A quick-and-dirty thing I hacked up just now, which seems
            to improve the performance drastically, is something like:

            if (complex but not bidi) {
               use GlyphLayout.
            } else if (bidi) {
               use java.text.Bidi.reorderVisually to get visual glyph
               order, then use same implementation as non-bidi
               non-complex layout
            } else {
               ...
            }

            Very minimal tests show it working correctly, and
            performance is 8-10 times faster (on par with non-bidi
            text).
            Do you think this solution makes sense? Can you see any
            obvious pitfalls?
            If it seems OK I'll try some more tests and then work it
            into something clean enough to submit as a patch suggestion.


            On Wed, Jan 4, 2017 at 7:48 PM, Itai <itai...@gmail.com> wrote:

                Thanks for replying.
                I think I understand what you're saying about the
                cache. As for complexity - I'm mostly working with
                text which is only in Hebrew, which isn't complex as
                far as I understand the definition (no glyph "fusing"
                as in Arabic or Farsi). I can work with minor
                performance drops, but the same window taking more
                than 10 times as long to show when it has Hebrew
                labels is a lot more than minor - and this is exclusive
                to JavaFX, so it's not like this problem is unsolvable.

                Perhaps the caching is indeed not the correct
                solution, but maybe there can be a way to simplify
                the layout in non-complex BiDi cases? Or optimize
                PangoGlyphLayout.layout?

                Thank you again for replying, I really hope this
                issue can see some improvement.

                On Wed, Jan 4, 2017 at 7:26 PM, Philip Race
                <philip.r...@oracle.com> wrote:

                    The cache is a heuristic optimisation and whether
                    it helps depends on how well that cache is used.
                    It is a time-space trade-off and I'd expect it to
                    show up as helping more in micro-benchmarks or
                    text-intensive benchmarks which use the same text
                    broken in the same way.
                    Complex text layout is inherently slower and if
                    you are doing a lot of it .. it will be slow .. and
                    unless it is repeated a cache won't help.
                    During start-up I'd *expect* that there isn't a
                    lot of re-use going on.

                    You would need to profile how often the same
                    text (and attributes) are passed through this code.
                    If you could provide us a test case we could
                    examine it too.
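
One rough way to do the profiling suggested above is to count how often an identical request recurs. The sketch below is an assumption-laden illustration, not JavaFX code: `ReuseCounter` and its key of text, font name, and font size are hypothetical, and a real key would need to include every attribute that affects shaping.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: estimate how often identical layout requests repeat. */
final class ReuseCounter {
    // Hypothetical key: text plus the attributes assumed to affect shaping.
    private final Map<List<Object>, Integer> seen = new HashMap<>();
    private int total;

    /** Records one layout request. */
    void record(String text, String fontName, double fontSize) {
        seen.merge(List.of(text, fontName, fontSize), 1, Integer::sum);
        total++;
    }

    /**
     * Fraction of requests that repeated an earlier key. This is the best
     * hit rate an unbounded cache could have achieved on the same requests.
     */
    double repeatRatio() {
        return total == 0 ? 0.0 : (total - seen.size()) / (double) total;
    }

    public static void main(String[] args) {
        ReuseCounter counter = new ReuseCounter();
        String hebrew = "\u05E9\u05DC\u05D5\u05DD";
        counter.record(hebrew, "Sans", 12.0);
        counter.record(hebrew, "Sans", 12.0);
        counter.record(hebrew, "Sans", 12.0);
        counter.record("hello", "Sans", 12.0);
        System.out.println("repeat ratio = " + counter.repeatRatio());
    }
}
```

A ratio near zero during start-up would support the expectation that a cache helps little there, while a high ratio while scrolling would suggest caching is worth examining.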

                    If it were a real use case, then we'd move on to
                    examine the feasibility of caching ...

                    -phil.



                    On 1/4/17, 9:19 AM, Itai wrote:

                        Recently JDK-8129582 [1] started really
                        affecting me, with startup speed
                        and overall responsiveness becoming really bad.

                        Digging into it, I have found most time is
                        wasted in
                        com.sun.javafx.text.GlyphLayout.layout (as
                        represented by PangoGlyphLayout
                        on my Linux machine), which in turn is called
                        by com.sun.javafx.text.PrismTextLayout.shape,
                        which has:

                             if (run.isComplex()) {
                                 /* Use GlyphLayout to shape complex text */
                                 layout.layout(run, font, strike, chars);
                             } else {
                                 ...
                                 if (layoutCache == null) {
                                     ...
                                 } else {
                                     ...
                                 }
                             }

                        which to my very naive reading seems as if
                        while non-complex (with all BiDi
                        text considered complex) glyph runs are
                        cached, complex runs are never
                        cached, which forces re-calculation every time.

                        I'm trying to read and understand this part
                        better, but could it be
                        possible that this is the issue? How feasible
                        would it be to have a layout
                        cache for complex runs, or at least
                        non-complex BiDi runs?

                        Thanks,
                        Itai.

                        [1]:
                        https://bugs.openjdk.java.net/browse/JDK-8129582






