Re: Overlap oversampling

2020-07-05 Thread Nikolaus Waxweiler
I have 0 idea actually. I think macOS triggers different rendering 
automatically when it encounters any OVERLAP_* thingy.







Re: Overlap oversampling

2020-07-05 Thread Alexei Podtelezhnikov
Hi Nikolaus 

> Random thought: Test with Cascadia Code 
> (https://github.com/microsoft/cascadia-code/releases/tag/v2007.01). The 
> (variable) font is very Lego-like in its construction.

You have seen a lot of these fonts. How often do variable fonts set 
OVERLAP_SIMPLE or OVERLAP_COMPOUND even though it is not required? I wonder if 
those flags are respected widely enough to trigger FT_OUTLINE_OVERLAP. What 
would be the equivalent on the CFF2 side?

Thanks,
Alexei




Re: Overlap oversampling

2020-07-05 Thread Nikolaus Waxweiler
Random thought: Test with Cascadia Code 
(https://github.com/microsoft/cascadia-code/releases/tag/v2007.01). The 
(variable) font is very Lego-like in its construction.







Re: Overlap oversampling

2020-07-03 Thread Alexei Podtelezhnikov
Hi guys,

After some more thinking I committed the oversampling implementation.
It requires the new FT_OUTLINE_OVERLAP flag, so that this method is only
used deliberately.

Alexei

PS: To quickly test it use this one-liner:

diff --git a/src/smooth/ftsmooth.c b/src/smooth/ftsmooth.c
index b32629205..e6d5d04f4 100644
--- a/src/smooth/ftsmooth.c
+++ b/src/smooth/ftsmooth.c
@@ -494,7 +494,7 @@
     if ( mode == FT_RENDER_MODE_NORMAL ||
          mode == FT_RENDER_MODE_LIGHT  )
     {
-      if ( outline->flags & FT_OUTLINE_OVERLAP )
+      if ( outline->flags & FT_OUTLINE_OVERLAP || 1 )
         error = ft_smooth_raster_overlap( render, outline, bitmap );
       else
       {



Re: Overlap oversampling

2020-06-29 Thread Alexei Podtelezhnikov
David,

On Mon, Jun 29, 2020 at 6:58 PM David Turner wrote:
> So, I could have a deep look at the patches here. They're pretty neat. I'll 
> just recommend documenting the subtle computations in ft_smooth_slow_spans() 
> a little better, and avoid branches altogether, by using bit twiddling to 
> perform saturated addition instead (removing branches from loops is always 
> best for performance). I.e. something like the following:

I completely agree with sum | -(sum>>8). Another idea is sum - (sum>>8).
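
For reference, here is a minimal sketch of the two branch-free variants
mentioned above. The helper names sat_or and sat_sub are hypothetical and
not part of FreeType. The sketch assumes that each sub-pixel contributes to
a target pixel at most once, so that the sum never exceeds 256 in the
subtraction variant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not FreeType API. */

/* OR variant: if bit 8 of `sum` is set, -( sum >> 8 ) is all ones, */
/* so the OR forces the low byte to 255.  Valid for sum <= 511.     */
static uint8_t
sat_or( unsigned int  sum )
{
  assert( sum <= 511 );
  return (uint8_t)( sum | -( sum >> 8 ) );
}

/* Subtraction variant: maps 256 to 255 and leaves 0..255 unchanged. */
/* Assumption: `sum` never exceeds 256.                              */
static uint8_t
sat_sub( unsigned int  sum )
{
  assert( sum <= 256 );
  return (uint8_t)( sum - ( sum >> 8 ) );
}
```

The OR form tolerates any sum up to 511, while the subtraction form relies
on 256 being the largest possible value.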

> Can you tell me how to actually test that the code works as expected though?

The proof-of-concept patch is set up to replace normal rendering with
oversampling when SCALE is 2 or 4. I was using this font:
https://github.com/adobe-fonts/source-serif-pro/tree/release/VAR

I suggested FT_RENDER_MODE_SLOW to explicitly discourage its use for
good fonts. We can do FT_RENDER_MODE_OVERSAMPLE instead. Werner also
suggested using OVERLAP_SIMPLE and OVERLAP_COMPOUND flags to trigger
this mode but they may be unset or unavailable. I think this mode
should be explicit.

Regards,
Alexei



Re: Overlap oversampling

2020-06-29 Thread David Turner
So, I could have a deep look at the patches here. They're pretty neat. I'll
just recommend documenting the subtle computations in
ft_smooth_slow_spans() a little better, and avoid branches altogether, by
using bit twiddling to perform saturated addition instead (removing
branches from loops is always best for performance). I.e. something like
the following:

  /* This function averages inflated spans in direct rendering mode.
   * It assumes that coverage spans are rendered in a SCALE*SCALE
   * inflated pixel space, and computes the contribution of each
   * span 'sub-pixel' to the target bitmap's pixel.  I.e.:
   *
   *  If (x, y) are pixel coordinates in inflated space, then
   *  (xt := x/SCALE, yt := y/SCALE) are the pixel coordinates in the
   *  target bitmap, where '/' denotes integer division.
   *
   *  Let's define GRIDSIZE := SCALE * SCALE.  If `c` is the 8-bit
   *  coverage for (x, y) in inflated space, then its contribution to
   *  (xt, yt) would be ct := c // GRIDSIZE, where '//' denotes division
   *  of real numbers (i.e. without truncation to a lower fixed or
   *  floating point precision).
   *
   *  Since these can only be stored in 8-bit target bitmap pixels,
   *  there are at least two ways to approximate the sum:
   *
   *  1) Compute `ct := FLOOR(c // GRIDSIZE)`, which means that if all
   *     pixels in inflated space have full coverage (i.e. value 255),
   *     then their contribution sum will be
   *     GRIDSIZE * FLOOR(255 / GRIDSIZE), which is 252 (for SCALE == 2)
   *     or 240 (for SCALE == 4).
   *
   *     A later pass will be needed to scale the values to the 0..255
   *     range.
   *
   *  2) Compute `ct := ROUND(c // GRIDSIZE)`, in which case the total
   *     contribution sum may reach 256 for both `SCALE == 2` and
   *     `SCALE == 4`, which cannot be stored in an 8-bit pixel byte of
   *     the target bitmap.  To deal with this, perform saturated
   *     arithmetic to ensure that the value never goes over 255.  This
   *     avoids an additional rescaling step, and is implemented below.
   */
  static void
  ft_smooth_slow_spans( int             y,
                        int             count,
                        const FT_Span*  spans,
                        TOrigin*        target )
  {
    unsigned char*  dst = target->origin - ( y / SCALE ) * target->pitch;
    unsigned int    x;


    for ( ; count--; spans++ )
    {
      /* Round the span coverage to its share of one target pixel. */
      unsigned int  coverage = ( spans->coverage + GRIDSIZE / 2 ) / GRIDSIZE;


      for ( x = 0; x < spans->len; x++ )
      {
        /* The following performs a saturated addition of d[0] + coverage. */
        unsigned char*  d   = &dst[( spans->x + x ) / SCALE];
        unsigned int    sum = d[0] + coverage;


        d[0] = (FT_Byte)( sum | -( sum >> 8 ) );
      }
    }
  }

Here's a Compiler Explorer link that compares the two implementations.


Can you tell me how to actually test that the code works as expected though?

Thanks

- David



Re: Overlap oversampling

2020-06-23 Thread David Turner
On Tue, Jun 23, 2020 at 05:42, Alexei Podtelezhnikov wrote:

> Hi again,
>
> The oversampling is implemented through inflating the outline and then
> averaging the increased number of cells using the FT_RASTER_FLAG_DIRECT
> mechanism. The first two patches set the stage by splitting the code
> paths for LCD rendering out of the way and trying
> FT_RASTER_FLAG_DIRECT for FT_RENDER_MODE_LCD. The third one implements
> oversampling by replacing the normal rendering with oversampling if
> SCALE is 2 or 4 (as opposed to 1). Again the proposal is to have it as
> FT_RENDER_MODE_SLOW eventually. The slightly complicated averaging of
> cells is due to 255/4+255/4+255/4+255/4 = 252 instead of 255, so we
> have to do rounding, yet avoid overflowing.
>
> Comments?
>
> Alexei

Thanks, I'll take a look at your patches.

However, please don't call it FT_RENDER_MODE_SLOW. The fact that it is slow
is an implementation detail, and we could very well replace this with a
different algorithm in the future (maybe slow, maybe not). So something
like FT_RENDER_MODE_OVERLAPPED_OUTLINES seems more appropriate, since it
describes why you would want to use this mode, instead of what its
performance profile is :-)


Re: Overlap oversampling

2020-06-22 Thread Alexei Podtelezhnikov
Hi again,

The oversampling is implemented through inflating the outline and then
averaging the increased number of cells using the FT_RASTER_FLAG_DIRECT
mechanism. The first two patches set the stage by splitting the code
paths for LCD rendering out of the way and trying
FT_RASTER_FLAG_DIRECT for FT_RENDER_MODE_LCD. The third one implements
oversampling by replacing the normal rendering with oversampling if
SCALE is 2 or 4 (as opposed to 1). Again the proposal is to have it as
FT_RENDER_MODE_SLOW eventually. The slightly complicated averaging of
cells is due to 255/4+255/4+255/4+255/4 = 252 instead of 255, so we
have to do rounding, yet avoid overflowing.
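
The arithmetic can be checked with a small sketch. The function names
sum_truncated and sum_rounded are hypothetical and only model the sum of
SCALE*SCALE equal contributions to one target pixel. They are not taken
from the patches.

```c
#include <assert.h>

/* Hypothetical helpers for illustration; not FreeType API. */

/* Total when each of the scale*scale cells contributes */
/* coverage / gridsize with truncating integer division. */
static unsigned int
sum_truncated( unsigned int  coverage,
               unsigned int  scale )
{
  unsigned int  gridsize = scale * scale;


  assert( scale > 0 );
  return gridsize * ( coverage / gridsize );
}

/* Total when each contribution is rounded to nearest instead. */
static unsigned int
sum_rounded( unsigned int  coverage,
             unsigned int  scale )
{
  unsigned int  gridsize = scale * scale;


  assert( scale > 0 );
  return gridsize * ( ( coverage + gridsize / 2 ) / gridsize );
}
```

With full coverage (255), truncation gives 252 for SCALE 2 and 240 for
SCALE 4, while rounding gives 256 in both cases, which no longer fits in
one byte. That is why the rounded sum needs protection against overflow.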

Comments?

Alexei


0002-DIRECT.patch
Description: Binary data


0001-SPLIT.patch
Description: Binary data


0003-SLOW.patch
Description: Binary data


Overlap oversampling

2020-06-22 Thread Alexei Podtelezhnikov
Hi everybody,

This is a proof of concept for oversampling to decrease artifacts in
rendering overlaps. The attached images show how the artifact visible
at the top of B, E, F, T decreases as we increase oversampling from
1x1, to 2x2, to 4x4. The price is doubling and quadrupling the
rendering time. The proposal is to have it available as
FT_RENDER_MODE_SLOW specifically for fonts that need it.
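
As a rough illustration of the idea only, the sketch below averages each
scale x scale block of an already rendered, inflated coverage buffer into
one target pixel. It is a plain box filter applied after the fact, not the
code from the patches, and the name box_downsample and the tightly packed
row-major buffer layout are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for illustration; not FreeType API.           */
/* `src` holds ( w * scale ) x ( h * scale ) 8-bit coverage values,  */
/* `dst` receives w x h values; both are row-major, tightly packed.  */
static void
box_downsample( const unsigned char*  src,
                unsigned char*        dst,
                size_t                w,
                size_t                h,
                size_t                scale )
{
  size_t  grid = scale * scale;
  size_t  x, y, sx, sy;


  assert( scale > 0 );

  for ( y = 0; y < h; y++ )
    for ( x = 0; x < w; x++ )
    {
      unsigned int  sum = 0;


      /* Accumulate the whole block in a wide integer first. */
      for ( sy = 0; sy < scale; sy++ )
        for ( sx = 0; sx < scale; sx++ )
          sum += src[( y * scale + sy ) * ( w * scale ) + x * scale + sx];

      /* Divide once, rounding to nearest; the result fits in a byte. */
      dst[y * w + x] = (unsigned char)( ( sum + grid / 2 ) / grid );
    }
}
```

Summing in a wide integer and dividing once avoids the per-cell rounding
problem, at the cost of keeping the whole inflated buffer in memory.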

I am seeking feedback on whether 2x2 is good enough or 4x4 is necessary. I
will also reply with a set of patches so you can try it and comment on it.

Regards,
Alexei