Re: [Freetype-devel] Re: GSOC - Distance Fields

2020-06-20 Thread Werner LEMBERG

> Systems without an FPU are vastly less common than they were 20
> years ago.  They still exist, and it is a defendable position to want
> FreeType to continue to work on those systems, [...]

Honestly, I think this is an issue even today – just think of the
Internet of Things stuff, which demands that even the sheets of my
toilet paper can communicate somehow...

>   - I have a *very* hard time imagining any system that has a
> programmable GPU, but no FPU.  As such, I find it completely
> nonsensical to ban using float for the SDF generation.

Here you definitely have a point.

> I strongly advise that you reconsider this.

We discussed that already, didn't we?

> And many other decisions that seem to be stuck in 20 years ago.  I'm
> working on writing a full assessment of FreeType as a project and
> will share in a new thread when that is ready.

Thanks for that.  I fully agree that a modern redesign of FreeType
would be a good thing.

> In the meantime, I would like to see Anuj's time be spent in producing a
> **solid** SDF implementation, instead of fighting barriers that are
> not technically justified.

I beg to differ.  A GSoC project is not only about implementing one
thing in one way.  It is an opportunity to learn.  Anuj is now seeing
both sides of the fixed-point mathematics medal, so to say; as soon as
he has mastered the project he will know *exactly* when to use it and
when to avoid it.

Deriving an alternative implementation using standard floating point
mathematics will be an easy exercise then, AFAICS.


Werner


Re: [Freetype-devel] Re: GSOC - Distance Fields

2020-06-20 Thread Werner LEMBERG
>>> Also I'm surprised that you haven't put the get_min_distance
>>> code for each edge type into a function.  Would you prefer that I
>>> comment re these on github?
>>
>> I don't think that will be necessary.  I will fix that while adding
>> in FreeType.  That repository is just for testing and I might
>> delete it in the future.
> 
> I understand.  I find that using functions generously makes one's code
> develop and flow better.

I agree with Behdad.  Using functions, even for stuff that gets called
only once, greatly enhances readability of code, among other things.
So please proceed as you plan :-)


Werner



Re: Logging Library-GSOC

2020-06-20 Thread Behdad Esfahbod
On Sat, Jun 13, 2020 at 10:59 PM Werner LEMBERG  wrote:

>
> > I find it a *very* bad idea to have code in FreeType that would
> > write to a file.  Specially bad if that can be controlled by an
> > env-var.
>
> Interesting.  Please explain.  Do you fear security issues?
>

Sorry, forgot to reply to this.  Yes, security.  I'll give you a couple of
examples:

  - On a security-sensitive system, even an extra open file handle is an
extra tool an attacker can use to mount their attack.  Read this epic story:

https://threatpost.com/pinkiepie-strikes-again-compromises-google-chrome-pwnium-contest-hack-box-101012/77098/
https://blog.chromium.org/2012/05/tale-of-two-pwnies-part-1.html
https://blog.chromium.org/2012/06/tale-of-two-pwnies-part-2.html

  - A few years ago there was an attempted attack on ChromeOS.  Some
ChromeOS UI components were using GTK+, which uses Pango.  Pango used to
have a config file.  The default search path for the config file looked in
certain directories in the user's home directory, which is writable by the
user.  It could also be controlled via an environment variable.  The config
file could reconfigure where the Pango module map was loaded from.  The Pango
module map pointed to binary modules to load for shaping certain scripts.
Pango would dlopen() such modules, look for specifically named symbols,
and call those.  That's already problematic, because it means that an
attacker who finds a way to write files to the user's home directory would be
able to make the system UI load custom code.  On top of that, note that
on Linux / ELF, as well as most other systems, binary modules can
execute code at the runtime linking stage.  That means such an attack could
run arbitrary code in processes that just linked to GTK+ and initialized it
but did not even display any text.
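
To make the last point concrete, here is a minimal sketch of how code in a
binary can run before the program does anything with it.  It assumes GCC or
Clang (the `constructor' attribute is a compiler extension, not standard C),
and the names `module_init', `module_code_ran' and `use_module' are made up
for illustration; this is not Pango or GTK+ code.

```c
#include <assert.h>

/* Set by the constructor below; nothing in the program calls it explicitly. */
static int module_code_ran = 0;

/* GCC/Clang extension: the loader runs this before main() for an
   executable, or during dlopen() for a shared object. */
__attribute__((constructor))
static void module_init(void)
{
  module_code_ran = 1;  /* an attacker's module could do anything here */
}

/* Hypothetical entry point that a client would call much later. */
static void use_module(void)
{
  /* By the time any caller gets here, the constructor has already run. */
  assert(module_code_ran == 1);
}
```

The same mechanism applies to a module loaded with dlopen(): its constructors
run inside the dlopen() call, before the caller looks up a single symbol.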

So when I hear about a system, no matter what it is, that will write to
file, controlled via env-vars, and configured through a config file, my
immediate response is:

  - That sounds like a hell of a lot of security vectors, for...
  - A feature that is not even justified to begin with!  FreeType clients
want to read text.  None of this contributes to any of that.

While you may be tempted to argue why it won't be a problem for FreeType
(for example because it will be disabled by default), I accept that.  But
that is in conflict with what Armin's stance is on this.  I'll expand in
the other thread.  I'm still hopeful to be able to write that today.


Re: Logging Library-GSOC

2020-06-20 Thread Behdad Esfahbod
On Sat, Jun 13, 2020 at 10:59 PM Werner LEMBERG  wrote:

>
> > I find it a *very* bad idea to have code in FreeType that would
> > write to a file.  Specially bad if that can be controlled by an
> > env-var.
>
> Interesting.  Please explain.  Do you fear security issues?
>
> > I still think what's desired can be done best by just revamping and
> > writing custom code in FreeType itself.
>
> Yes, this is another possibility.  We haven't yet decided how to
> proceed; Priyesh is still in the investigation phase.


Those words are not consistent with your actions.  I'll reply in the other
thread, which I specifically started about this topic earlier.

-- 
behdad
http://behdad.org/


Re: [Freetype-devel] Re: GSOC - Distance Fields

2020-06-20 Thread Behdad Esfahbod
Hi Anuj,

On Tue, Jun 16, 2020 at 9:43 PM Anuj Verma  wrote:

> Hello Behdad,
>
> > First, let me congratulate you.  This is a very thorough and impressive
> > piece of work for such a short time period.
>
> Thank you for that. Viktor's paper did help me a lot while writing the
> code. I guess you have already read his paper,
> but in case anyone is interested in reading it check this out:
> https://github.com/Chlumsky/msdfgen. It contains all
> the relevant information for generating SDF from outlines. Although I'm
> not using the full potential of the paper currently.
>

I have.  In 2018 Nicolas Rougier and I presented a mini-course at SIGGRAPH
that covered all the SDF-based experiments of the past 10 / 15 years.  If
you haven't reviewed those, I suggest you do.  Unfortunately there's no
video and the slides are low resolution:


https://www.slideshare.net/NicolasRougier1/siggraph-2018-digital-typography

In particular, I advise that you read the 2018 Lengyel paper, which is
what's in sluglibrary.com:

  http://www.terathon.com/i3d2018_lengyel.pdf

Because if you follow my line of reasoning in my previous messages and the
rest of this message, I'm advising that you implement what basically will
be Lengyel's algorithm on the CPU.


> > - I highly suggest you stick to float internally [...]
>
> I still think `float' is a better option for generating SDF, especially in
> lower resolution glyphs where fixed-point produces
> kind of straight lines instead of smooth curves (which is not noticeable if
> you look at it briefly). But my concern is that
> FreeType doesn't use floats at the moment and I don't think it will be a
> good idea to add support for floats just for the
> sake of this project.
> It's a tradeoff between one thing or the other and I can't decide which
> would be the best considering the current state of the
> library. I would say that I'm inclined towards using fixed-point integers
> just because FreeType doesn't use floats.
> Also, why doesn't FreeType use floats? Is it just because of platforms
> which don't have a floating point type? or are there
> more reasons? This question has been on my mind for quite some time.
>

I addressed that in my reply to Werner a few minutes ago.



> > Have you measured performance?  I'm fairly sure the float can be made
> > both more robust and faster.
>
> I just did, here are the results with compiler optimization turned on
> using chrono library:
>
> A) Line Segment: ~0.026 microseconds
> B) Conic Bezier: ~0.168 microseconds
> C) Cubic Bezier: ~0.469 microseconds
>
> [I have also attached the gprof output in case you are interested. Note
> that the gprof output is without any compiler optimization]
> To compare it to fixed-point check here:
> https://lists.nongnu.org/archive/html/freetype-devel/2020-06/msg00095.html
>

So basically you already showed that using fixed is slower and introduces
artifacts.  To me that was unnecessary as both were very well-understood to
anyone who has worked in this field.  But now that you have, I strongly
advise you stick to float.
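
As a small illustration of the precision side of this argument (it says
nothing about speed), the sketch below squares a small value in a 16.16
fixed-point format.  The type `fixed16_16' and the helpers `to_fixed',
`from_fixed' and `fixed_mul' are hypothetical names written for this note;
they truncate instead of rounding and are not FreeType's own helpers, which
may round differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16.16 type: 16 integer bits, 16 fraction bits. */
typedef int32_t fixed16_16;

/* Convert to 16.16, truncating toward zero (assumption: no rounding). */
static fixed16_16 to_fixed(double v)
{
  assert(v > -32768.0 && v < 32768.0);
  return (fixed16_16)(v * 65536.0);
}

static double from_fixed(fixed16_16 v)
{
  return (double)v / 65536.0;
}

/* Multiply two 16.16 values using a 64-bit intermediate product. */
static fixed16_16 fixed_mul(fixed16_16 a, fixed16_16 b)
{
  return (fixed16_16)(((int64_t)a * (int64_t)b) / 65536);
}
```

With these helpers, `to_fixed(0.001)' is 65 and `fixed_mul' of that value
with itself is 0, whereas the same product in float is a small but non-zero
number.  Squared terms of small offsets vanish in this way, which is one
plausible source of the flattened curves mentioned above.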


> > - Your Newton-Raphson is solid and your performance numbers look
> > amazing.  I think you should stick with this approach instead of
> > subdividing.
> > As was suggested by others, do experiment with Raphson on your quadratic
> > as well.
>
> Yes, will try to use Raphson on quadratic, although I don't think it will
> be better than solving the cubic equation. And I will stick to it
> until I find something even faster.
>

I take that back for now.  It might be workable, but stick with what you
have.  We can find other ways to make your cubic-solving faster.  More
about my position-reversal on Raphson below.



> > * Currently you abandon as soon as factor falls outside of [0,1].
> > That's wrong.  Factor might go out but come back in in the next iteration.
>
> I was doing that initially, but I saw that the factor goes `out' and when
> it comes back `in' it has the same value as the previous `in' value.
> This causes 2 extra passes of a fairly expensive iteration, so I decided
> to break if it goes outside the range. But, yeah I will see if
> that is wrong and decide accordingly after further testing.
>

I thought about that a lot and did some research.  This is a good summary:


https://math.stackexchange.com/questions/2432348/what-is-stopping-criteria-for-newtons-method/2433475#2433475

Basically, if the solution is pulled out of the [0,1] range, you can clamp
it.  If it keeps getting pulled out again right after clamping, you can
discard it.  That post also suggests that what you are seeing was caused by
your fixed-point limitations.  Were you testing with float or fixed?  This
is the part I'm referring to:
  "Occasionally, it is helpful to remember that Newton's method exhibits
one sided convergence in the limit, i.e. if the root is simple, then the
residuals f(x_n) eventually have the same sign. Deviation from this
behavior indicates that you are pushing against the limitations imposed by
floating point arithmetic."
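
The clamp-then-discard rule can be sketched as follows for a conic
(quadratic) Bezier, using double precision.  This is only an illustration
under stated assumptions: the names `Vec2', `conic_dist_sq' and
`nearest_t_on_conic' are invented here and are not taken from Anuj's code,
and the final comparison against the two endpoints is an extra safeguard
added for this sketch, since Newton's method only finds stationary points
and depends on the starting guess.

```c
#include <assert.h>
#include <math.h>

typedef struct { double x, y; } Vec2;

/* Squared distance from p to the conic B(t) = a*t^2 + b*t + p0. */
static double conic_dist_sq(Vec2 a, Vec2 b, Vec2 p0, Vec2 p, double t)
{
  double dx = (a.x * t + b.x) * t + p0.x - p.x;
  double dy = (a.y * t + b.y) * t + p0.y - p.y;
  return dx * dx + dy * dy;
}

/* Newton-Raphson on f(t) = (B(t) - p) . B'(t), starting from t in [0,1]. */
static double nearest_t_on_conic(Vec2 p0, Vec2 p1, Vec2 p2, Vec2 p,
                                 double t, int max_iter)
{
  Vec2 a = { p0.x - 2.0 * p1.x + p2.x, p0.y - 2.0 * p1.y + p2.y };
  Vec2 b = { 2.0 * (p1.x - p0.x), 2.0 * (p1.y - p0.y) };
  int clamped_before = 0;
  int i;

  assert(max_iter > 0 && t >= 0.0 && t <= 1.0);

  for (i = 0; i < max_iter; i++) {
    double dx  = (a.x * t + b.x) * t + p0.x - p.x;   /* B(t) - p  */
    double dy  = (a.y * t + b.y) * t + p0.y - p.y;
    double d1x = 2.0 * a.x * t + b.x;                /* B'(t)     */
    double d1y = 2.0 * a.y * t + b.y;
    double f   = dx * d1x + dy * d1y;
    /* f'(t) = B'.B' + (B - p).B'', with B'' = 2a */
    double df  = d1x * d1x + d1y * d1y + 2.0 * (dx * a.x + dy * a.y);
    double next;
    int clamped = 0;

    if (fabs(df) < 1e-12)
      break;                          /* flat spot: no usable Newton step */
    next = t - f / df;
    if (next < 0.0)      { next = 0.0; clamped = 1; }
    else if (next > 1.0) { next = 1.0; clamped = 1; }

    if (clamped && clamped_before) {  /* pulled out twice in a row: give up */
      t = next;
      break;
    }
    clamped_before = clamped;
    if (fabs(next - t) < 1e-12) {     /* converged */
      t = next;
      break;
    }
    t = next;
  }

  /* Safeguard (an addition in this sketch): an endpoint may still be closer. */
  if (conic_dist_sq(a, b, p0, p, 0.0) < conic_dist_sq(a, b, p0, p, t))
    t = 0.0;
  if (conic_dist_sq(a, b, p0, p, 1.0) < conic_dist_sq(a, b, p0, p, t))
    t = 1.0;
  return t;
}
```

For the symmetric arch with control points (-1,0), (0,2), (1,0) and the
query point (0,3), a start at t = 0.3 converges to the apex at t = 0.5.  For
a query point such as (3,-1), which lies beyond the end of the curve, the
iteration does not settle and the endpoint comparison returns t = 1.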


Re: [Freetype-devel] Re: GSOC - Distance Fields

2020-06-20 Thread Behdad Esfahbod
Werner!

On Wed, Jun 17, 2020 at 12:23 AM Werner LEMBERG  wrote:

> > Also, why doesn't FreeType use floats?  Is it just because of
> > platforms which don't have a floating point type?
>
> Yes.  The original intention of FreeType was to provide support for
> embedded devices, which usually are systems that have CPUs with very
> limited capabilities and a tiny amount of memory.  This goal hasn't
>

Reminds me of Colbert on Bush: "The greatest thing about this man is that
he's steady.  You know where he stands.  He believes the same thing
Wednesday that he believed on Monday, no matter what happened Tuesday.
Events can change; this man's beliefs never will."

FreeType was designed in the 90s.  Back then there were embedded systems
that did not have an FPU.  There also existed embedded systems that did not
allow a library to have static writable data segment.  Both of those
limitations were ingrained into FreeType design.  Both are *still* actively
defended.  In the meantime the world has changed...

Systems without an FPU are vastly less common than they were 20 years ago.
They still exist, and it is a defendable position to want FreeType to continue
to work on those systems, however:

  - Compilers and kernels have stepped up to provide floating-point
emulation libraries which work transparently to the client code.  That is,
introducing limited use of float in FreeType is by no means an impediment
to those building the library for systems without an FPU.  Even if that were
not true, the SDF module can be easily disabled.

  - I have a *very* hard time imagining any system that has a programmable
GPU, but no FPU.  As such, I find it completely nonsensical to ban using
float for the SDF generation.

In HarfBuzz we started from the same position, exactly because of precedent
in FreeType.  But when we got to variable fonts, we acknowledged the shift
in the scene and just used float internally.  Broke nothing... And sure
there are users of FreeType who won't ever want HarfBuzz.  I'm not ruling
those use-cases out categorically.  But my arguments above address those
situations as well.

I strongly advise that you reconsider this.  And many other decisions that
seem to be stuck in 20 years ago.  I'm working on writing a full assessment
of FreeType as a project and will share it in a new thread when that is
ready.  In the meantime, I would like to see Anuj's time be spent in producing
a **solid** SDF implementation, instead of fighting barriers that are not
technically justified.

b



-- 
behdad
http://behdad.org/


Re: [Freetype-devel] Re: GSOC - Distance Fields

2020-06-20 Thread Behdad Esfahbod
On Tue, Jun 16, 2020 at 9:54 PM Anuj Verma  wrote:

>
> > Also I'm surprised that you haven't put the get_min_distance code
> > for each edge type into a function.
> > Would you prefer that I comment re these on github?
>
> I don't think that will be necessary. I will fix that while adding in
> FreeType. That repository is just for
> testing and I might delete it in the future.
>

I understand.  I find that using functions generously makes one's code
develop and flow better.  In this case I wanted to quickly test your
Newton-Raphson in isolation.  Sure, I know how to isolate it. :)

-- 
behdad
http://behdad.org/


Re: [Freetype-devel] Re: GSOC - Distance Fields

2020-06-20 Thread Behdad Esfahbod
On Wed, Jun 17, 2020 at 7:22 PM Alexei Podtelezhnikov 
wrote:

> Hi Anuj,
>
> Please let me finish my thoughts below...
>
> >> Each curved segment has a large number of neighboring grid points,
> >> each of which has a unique nearest projection on the curve. The curve
> >> is naturally sampled by these projection points a very large number of
> >> times and quite uniformly. Therefore, why not divide the curve into a
> >> large number of  segments to begin with and then just find whatever
> >> point is close to each grid? It could be a lot faster to find the
> >> distance this way.
>
> ...
> It is at this point I am asking why not just split the curve using De
> Casteljau's algorithm recursively a large number of times and
> calculate the distance field for a slightly jagged line to begin with.
>

Because then instead of one curve you have tens of tiny lines to walk
over.  The speed doesn't work.


> The distance field will do the magic regardless and thread the
> boundary smoothly through the grid...
>
> On Wed, Jun 17, 2020 at 1:08 AM Anuj Verma  wrote:
> > I guess this is similar to the Euclidean distance transform algorithm.
> > http://webstaff.itn.liu.se/~stegu/JFA/Danielsson.pdf
>
> No, I do not think this is it.
>
> > As I said before I will not leave out this option, I will try to
> implement this
> > and then we can compare the performance.
> [skip]
> > I don't find anything offending in your suggestions.
>
> ;)
>
>

-- 
behdad
http://behdad.org/


Re: [Freetype-devel] Re: GSOC - Distance Fields

2020-06-20 Thread Alexei Podtelezhnikov
On Sat, Jun 20, 2020 at 7:38 AM Anuj Verma  wrote:
> Secondly, I did try subdividing the conic curves; to start, I simply divided
> them into equal parts and used that to generate the SDF. I saw that it
> requires at least 32 divisions to produce a decent SDF, which in itself is
> quite a bit slower than simply solving the cubic equation (and there can be
> many curves in the glyph).
> Moreover, I overlooked the fact that around the corners there is a corner
> check involved, which is done to determine the sign correctly around corners.
> So, subdividing the curve also increases that cost, which is ~0.13
> microseconds for each pixel around the corner.
>
> Looking at this I now think that it's not worth splitting the curve into
> lines for generating the SDF.

Hi Anuj,

Thank you for the analysis. What you describe makes sense if you:

foreach gridpoint
   foreach curve or line
      do work

Then, of course, you increase the inner loop by subdividing. I
actually think that it is faster to:

foreach line
   for proximal gridpoints
      do work

where proximal is at most 8 grid units away. The rest is truncated
(clamped) at 8. I am choosing 8 because this is probably enough (or
not).

As you walk along the (subdivided) path, you can optimize and update
distance for the points ahead (along the line) only, without looking
behind as those distances increase. The sign is tentative and flips on
updating the grid depending on whether it is to the right or to the left
of the line. You sort of sweep the grid proximal to the path. There is
another optimization possible: as you move along the subdivided curve,
you can update only the grid points in an "orthogonal/normal" sector
roughly the size of the small turn corner, which is rather small.

Does it make sense?

Alexei



Re: [Freetype-devel] Re: GSOC - Distance Fields

2020-06-20 Thread Anuj Verma
Hello Alexei,

First thing, here is the result of all the curves after using square
distances:

A) Line Segment: ~0.10 microseconds
B) Conic Curves: ~0.75 microseconds
C) Cubic Curves: ~0.71 microseconds

For comparison, the previous result with `FT_Vector_Length':

A) Line Segment: ~0.32 microseconds
B) Conic Bezier: ~1.08 microseconds
C) Cubic Bezier: ~1.25 microseconds

Secondly, I did try subdividing the conic curves; to start, I simply divided
them into equal parts and used that to generate the SDF. I saw that it
requires at least 32 divisions to produce a decent SDF, which in itself is
quite a bit slower than simply solving the cubic equation (and there can be
many curves in the glyph).
Moreover, I overlooked the fact that around the corners there is a corner
check involved, which is done to determine the sign correctly around corners.
So, subdividing the curve also increases that cost, which is ~0.13
microseconds for each pixel around the corner.

Looking at this I now think that it's not worth splitting the curve into
lines for generating the SDF. But I think it can be used for optimization: we
can use a coarse grid and subdivided curves to quickly check which line is
closer to the coarse grid, and then for the pixels in the coarse grid we only
check those curves. This will also not require the corner check since we are
only interested in absolute values. I think this can be faster but can't say
anything for sure without profiling.

One more thing, shouldn't it be `minmum level of warnings.' ?
http://git.savannah.gnu.org/cgit/freetype/freetype2.git/tree/src/smooth/ftgrays.c#n441

Thanks,
Anuj


RE: Logging Library-GSOC

2020-06-20 Thread armin
Hi Priyesh,

>> In previous mails, Armin suggested to move some of the FreeType's
>> logging functionality to the external logger but according to my
>> analysis, none of the external logging libraries that I have
>> explored exactly matches the logging architecture of FreeType
>> (i.e. logging based on debug level of components and debug levels of
>> trace calls).  According to me, this is not possible with dlg
>> library [...]
>
> This is not a problem at all.

Don't worry about that -- I was just thinking out loud about things we
could do :)

> (For Armin){
> I have looked into log4c, [...] but I am not sure about it on windows [...]
> }

I do like that notation style ;)  Apart from that, I for sure have not used it
on Windows; don't worry about that either and thanks for double-checking! :)

As you are now moving into the implementation phase of your project, I would
quickly like to recall my message from about a month ago:
https://lists.nongnu.org/archive/html/freetype-devel/2020-05/msg00165.html
Don't worry about exact naming, and we might need more functions and/or more
arguments, but to outline the general idea:  I would very much appreciate it if
we could find a way to wrap FT logging into a public interface that works
something along these lines.  So basically there's a default log callback
implemented internally that listens to environment variables (and/or other,
more dynamic inputs) and then there's a way to override this callback from the
outside by providing (a) custom function(s).  All details are obviously up for
discussion :)
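
To give the idea a rough shape, here is a minimal sketch of such a callback
interface.  Everything in it is hypothetical and chosen only for
illustration: the `demo_' names, the `DEMO_LOG_LEVEL' environment variable
and the argument list are assumptions and not an existing or proposed
FreeType API.  The default callback only writes to stderr.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical callback type: level, component name, message, user data. */
typedef void (*demo_log_fn)(int level, const char *component,
                            const char *msg, void *user);

/* Default callback: prints to stderr if DEMO_LOG_LEVEL allows the level. */
static void demo_default_log(int level, const char *component,
                             const char *msg, void *user)
{
  const char *env = getenv("DEMO_LOG_LEVEL");  /* assumed variable name */
  int max_level = env ? atoi(env) : 0;

  (void)user;
  if (level <= max_level)
    fprintf(stderr, "[%s:%d] %s\n", component, level, msg);
}

static demo_log_fn demo_log_cb = demo_default_log;
static void *demo_log_user = NULL;

/* Install a custom callback; passing NULL restores the default one. */
static void demo_log_set_callback(demo_log_fn fn, void *user)
{
  demo_log_cb = fn ? fn : demo_default_log;
  demo_log_user = fn ? user : NULL;
}

/* What the library's trace macros would end up calling internally. */
static void demo_log(int level, const char *component, const char *msg)
{
  assert(component != NULL && msg != NULL);
  demo_log_cb(level, component, msg, demo_log_user);
}

/* Example client callback: counts messages and prints nothing. */
static void demo_count_log(int level, const char *component,
                           const char *msg, void *user)
{
  (void)level; (void)component; (void)msg;
  ++*(int *)user;
}
```

A client that wants to route messages into its own logger would call
`demo_log_set_callback' once at start-up; a client that does nothing gets
the environment-controlled default.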