Re: [ft-devel] Known problem with FT_StreamRec::base being non-NULL also for file-based streams?

2010-03-11 Thread mpsuzuki
On Wed, 10 Mar 2010 03:50:17 -0500
Behdad Esfahbod beh...@behdad.org wrote:
Well, in short, all the hb_blob_t in HarfBuzz is about communicating to
harfbuzz what it can do with the memory backing the font file.  There are
three different cases we are interested in:

  - The memory is read-only; harfbuzz will make a copy if it needs to modify it.

  - The memory is writable and it is ok to write to it.  harfbuzz will not
make a copy.

  - The memory is read-only, but can be made writable using mprotect() or
similar (win32, ...) functionality.

Currently the hb-ft glue layer assumes that font data is mmap()ed or is
otherwise mprotect()able.

Thanks. I understand that what HB needs to know is whether it
can modify the memory image in place, or whether it should copy
the memory image before modifying it.

I have just posted a proof of concept for finding out who allocated
the buffer in an FT_Stream, but it is not the best solution for this task.

Werner and I agreed that the easiest way to guarantee whether
the buffer was mmap()ed or malloc()ed is to prepare the font
image on the HB/Pango side. However, glancing at Pango, I guess
there may be some delay between the call to FT_New_Face() and
the creation of the HB blob.
Is the unwritable font image duplicated into a writable buffer
for all faces? Or does the duplication happen only when the
first modification is attempted (to fix an OpenType bug at runtime)?

If the latter scenario is correct, then when Pango creates
the FT_Face object it cannot know whether duplication will
occur later, so my proposal (preparing the font image on the
HB/Pango side) would cause unwanted memory consumption
by loading all faces into writable memory. That would not be
a good idea.

# In the Pango library, once PangoFT2Font->face is created,
# must it never be changed? If replacing the face is
# permitted, I would first create an unwritable face
# from the mmap()ed image, then replace it with a writable face
# backed by a malloc()-and-read() image when Pango/HB tries to
# modify it. Pango has an API that exposes PangoFT2Font->face to
# the client (pango_ft2_font_get_face()), but it is classified as
# a deprecated interface. I hope the Pango library is moving
# toward hiding the raw FT_Face object from Pango clients.

 This fails, for example, when:

  - Font data is in ROM.  In this case mprotect() will fail and harfbuzz will
make a copy of the memory.  Not a huge problem.

Indeed. If FT2 has successfully mmap()ed a read-only font file,
mprotect() will fail.

  - FreeType malloc()ed the font data.  In this case, mprotect() is not
necessary and will probably affect memory beyond the font data (since mprotect
works on whole pages).

Umm, I think calling mprotect() on malloc()ed memory gives
an unspecified result.

http://www.opengroup.org/onlinepubs/95399/functions/mprotect.html
The behavior of this function is unspecified
if the mapping was not established by a call to mmap().

To avoid such ambiguity, we should know whether the buffer
is mmap()ed or malloc()ed before calling mprotect(). Am I
misunderstanding?
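
To illustrate the "whole pages" point, here is a minimal sketch of the arithmetic involved. The function name page_span() is hypothetical and belongs to neither FreeType nor HarfBuzz. It computes the page-aligned range that a protection change on [addr, addr + len) would have to cover, since implementations may require the address passed to mprotect() to be a multiple of the page size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute the page-aligned range that a
   protection change on [addr, addr + len) would actually cover.
   page_size must be a power of two. */
static void page_span(uintptr_t addr, size_t len, size_t page_size,
                      uintptr_t *start, size_t *span)
{
  uintptr_t mask = (uintptr_t)page_size - 1;
  uintptr_t end;

  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  assert(start != NULL && span != NULL);

  *start = addr & ~mask;                 /* round down to page start */
  end    = (addr + len + mask) & ~mask;  /* round up to page end     */
  *span  = (size_t)(end - *start);
}
```

With 4096-byte pages, a 32-byte buffer at address 0x1010 yields a span of 4096 bytes starting at 0x1000, so every other allocation sharing that page would have its protection changed as well. This is why knowing the origin of the buffer matters before touching its protection.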

  - Font data is coming from the user.  In this case it may not be desirable
to modify the data.

Indeed. Could you tell me which function is used to push
user-provided font data? Is it in cairo layer?

Adding an API to FT_Stream to be able to detect the above cases, especially the
user-provided data, would be useful.

Thank you again for the comment about adding a new API.

As I've sketched, it is possible to get detailed info about an
FT_Stream object. My current sketch is huge, and I have
a few issues to discuss before improving the FT_Stream
object further.

Another idea is to add arguments to FT_Open_Face() that
select one of 3 scenarios for font loading.

1) only mmap() is tried.
2) only malloc() + read() is tried.
3) mmap() is tried, then, malloc() + read() is tried (current behaviour)

By using 1) and 2), HB/Pango can tell exactly whether the
buffer is mmap()ed or malloc()ed. I will post patches for FT2
and Pango for further discussion. Please wait a few days...
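
The three scenarios can be sketched as a small POSIX loader. This is only an illustration of the idea: every name here (load_policy, buf_origin, font_image, load_font_image, free_font_image) is hypothetical, and FreeType does not provide such an interface. The point is that when the caller picks policy 1) or 2), the origin of the buffer is known by construction.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* All names below are hypothetical; FreeType has no such API. */
typedef enum { LOAD_MMAP_ONLY, LOAD_MALLOC_ONLY, LOAD_MMAP_THEN_MALLOC } load_policy;
typedef enum { ORIGIN_NONE, ORIGIN_MMAP, ORIGIN_MALLOC } buf_origin;

typedef struct {
  unsigned char *base;
  size_t         size;
  buf_origin     origin;  /* records how `base` was obtained */
} font_image;

/* Load a file according to `policy`.  Returns 0 on success, -1 on failure. */
static int load_font_image(const char *path, load_policy policy, font_image *out)
{
  struct stat st;
  size_t size, done = 0;
  int fd;

  assert(path != NULL && out != NULL);
  out->base = NULL; out->size = 0; out->origin = ORIGIN_NONE;

  fd = open(path, O_RDONLY);
  if (fd < 0)
    return -1;
  if (fstat(fd, &st) != 0 || st.st_size <= 0) { close(fd); return -1; }
  size = (size_t)st.st_size;

  if (policy != LOAD_MALLOC_ONLY) {
    void *p = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p != MAP_FAILED) {
      out->base = p; out->size = size; out->origin = ORIGIN_MMAP;
      close(fd);
      return 0;
    }
    if (policy == LOAD_MMAP_ONLY) { close(fd); return -1; }
  }

  /* malloc() + read() path: scenario 2), or the fallback of scenario 3). */
  out->base = malloc(size);
  while (out->base != NULL && done < size) {
    ssize_t n = read(fd, out->base + done, size - done);
    if (n <= 0)
      break;
    done += (size_t)n;
  }
  close(fd);
  if (out->base == NULL || done != size) {
    free(out->base);
    out->base = NULL;
    return -1;
  }
  out->size = size; out->origin = ORIGIN_MALLOC;
  return 0;
}

/* Release the buffer with the call that matches its origin. */
static void free_font_image(font_image *img)
{
  if (img->origin == ORIGIN_MMAP)
    munmap(img->base, img->size);
  else if (img->origin == ORIGIN_MALLOC)
    free(img->base);
  img->base = NULL; img->size = 0; img->origin = ORIGIN_NONE;
}
```

A client that wants an mprotect()able image would ask for LOAD_MMAP_ONLY and fall back to its own handling on failure. With LOAD_MMAP_THEN_MALLOC it has to inspect the origin field afterwards, which is exactly the information the current behaviour does not expose.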

Regards,
mpsuzuki


___
Freetype-devel mailing list
Freetype-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/freetype-devel


Re: [ft-devel] Known problem with FT_StreamRec::base being non-NULL also for file-based streams?

2010-03-10 Thread Behdad Esfahbod
On 03/05/2010 02:26 AM, mpsuz...@hiroshima-u.ac.jp wrote:
 
 Checking the source code, I wonder if I should also check for (face_flags &
 FT_FACE_FLAG_EXTERNAL_STREAM) to detect whether it's an mmapped stream or the
 user provided it (and hence we cannot mprotect).  The docs say: Don't read or
 test this flag.
 
 Please tell me in more detail what information is needed
 at that point. I do not yet understand the idea of a blob
 in HarfBuzz.

Well, in short, all the hb_blob_t in HarfBuzz is about communicating to
harfbuzz what it can do with the memory backing the font file.  There are
three different cases we are interested in:

  - The memory is read-only; harfbuzz will make a copy if it needs to modify it.

  - The memory is writable and it is ok to write to it.  harfbuzz will not
make a copy.

  - The memory is read-only, but can be made writable using mprotect() or
similar (win32, ...) functionality.


HarfBuzz only makes changes to the font data if it detects corrupt fonts.  The
changes are NOT meant to be written back to the font file.
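
The three cases above amount to a copy-on-write decision. Below is a minimal sketch of that decision, not HarfBuzz's implementation: the names mem_mode, blob, try_make_writable and blob_get_writable are hypothetical, and the mprotect() attempt is replaced by a stub that always fails so the example stays portable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names that only mirror the three cases listed above. */
typedef enum {
  MEM_READONLY,                   /* must copy before modifying        */
  MEM_WRITABLE,                   /* may be modified in place          */
  MEM_READONLY_MAY_MAKE_WRITABLE  /* may try mprotect() before copying */
} mem_mode;

typedef struct {
  unsigned char *data;
  size_t         length;
  mem_mode       mode;
  int            owned;  /* nonzero once data points to our own copy */
} blob;

/* Stand-in for the mprotect() attempt; it always reports failure here. */
static int try_make_writable(blob *b)
{
  (void)b;
  return 0;
}

/* Return a pointer that is safe to write to, or NULL if the copy fails. */
static unsigned char *blob_get_writable(blob *b)
{
  unsigned char *copy;

  assert(b != NULL && b->data != NULL);

  if (b->mode == MEM_WRITABLE)
    return b->data;

  if (b->mode == MEM_READONLY_MAY_MAKE_WRITABLE && try_make_writable(b)) {
    b->mode = MEM_WRITABLE;
    return b->data;
  }

  /* Read-only, or the protection change failed: duplicate the data. */
  copy = malloc(b->length ? b->length : 1);
  if (copy == NULL)
    return NULL;
  memcpy(copy, b->data, b->length);
  b->data  = copy;
  b->mode  = MEM_WRITABLE;
  b->owned = 1;
  return b->data;
}
```

Because the repaired data lives either in a private copy or in memory that was made writable in place, nothing in this scheme writes the changes back to the font file.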

Currently the hb-ft glue layer assumes that font data is mmap()ed or is
otherwise mprotect()able.  This fails, for example, when:

  - Font data is in ROM.  In this case mprotect() will fail and harfbuzz will
make a copy of the memory.  Not a huge problem.

  - FreeType malloc()ed the font data.  In this case, mprotect() is not
necessary and will probably affect memory beyond the font data (since mprotect
works on whole pages).

  - Font data is coming from the user.  In this case it may not be desirable
to modify the data.

Adding an API to FT_Stream to be able to detect the above cases, especially the
user-provided data, would be useful.

Thanks,
behdad




Re: [ft-devel] Known problem with FT_StreamRec::base being non-NULL also for file-based streams?

2010-03-06 Thread Werner LEMBERG

 To guarantee that the memory buffer is obtained by mmap() in FT2,
 including previous releases, the most stable way would be to mmap() in
 the FT2 client and pass the memory image to FT2.

To me, this sounds reasonable.  However, you probably have special
constraints...


Werner




Re: [ft-devel] Known problem with FT_StreamRec::base being non-NULL also for file-based streams?

2010-03-04 Thread mpsuzuki
On Mon, 01 Mar 2010 22:29:18 -0500
Behdad Esfahbod beh...@behdad.org wrote:

On 03/01/2010 09:18 PM, mpsuz...@hiroshima-u.ac.jp wrote:
 How about using
 
 if ( face->stream->read == NULL )
 
 instead of
 
 if ( face->stream->base != NULL )

Yes, that's what I'm planning to do instead.

Thanks!

Checking the source code, I wonder if I should also check for (face_flags &
FT_FACE_FLAG_EXTERNAL_STREAM) to detect whether it's an mmapped stream or the
user provided it (and hence we cannot mprotect).  The docs say: Don't read or
test this flag.

Please tell me in more detail what information is needed
at that point. I do not yet understand the idea of a blob
in HarfBuzz.

Although the name FT_FACE_FLAG_EXTERNAL_STREAM suggests that
the flag is fixed just before the FT_Face object is created,
FreeType2 sometimes copies (part of) the content of the buffer
passed by the client into an internally allocated buffer, and
clears the flag.

In builds/unix/ftsystem.c, if mmap() of the specified file
fails, FT2 allocates a buffer and copies the content of the
file into it. In both scenarios (mmap() succeeded, and mmap()
failed), FT_FACE_FLAG_EXTERNAL_STREAM is not changed. So it is
impossible to tell whether a stream with
( FT_FACE_FLAG_EXTERNAL_STREAM == 0 ) and ( read == NULL ) is
mmap()-ed or malloc()-ed.

# Worse scenarios can be found with MacOS-specific fonts.
# If an FT2 client makes a memory image of an sfnt-wrapped PS
# Type1 font and passes it to FT2, FT2 allocates a new buffer
# internally and copies only the TYP1 table. Then
# FT_FACE_FLAG_EXTERNAL_STREAM is turned off.

I think the request for info about the origin of the font memory
image is reasonable, but current FT2 is designed to provide
an abstract FT_Stream object and does not expect FT2 clients
to touch the internals of FT_Stream, even though some internals
are exposed. As you know, FT2 clients can receive an FT_Stream
object, but cannot create one through the public API.

# I'm not sure if 2 public APIs FT_Stream_OpenLZW() and
# FT_Stream_OpenGzip() are useful for FT2 clients.

To guarantee that the memory buffer is obtained by mmap() in FT2,
including previous releases, the most stable way would be to
mmap() in the FT2 client and pass the memory image to FT2.

If you don't want such client-side mmap(), it is possible
to detect whether the stream is mmap()-ed, but the code would
be quite complex.

--
1) create a temporary writable font.

2) open the temporary font.

3) to make sure the stream is memory-based,
   check that FT_Stream->read is NULL.

4) to make sure the memory is mmap()-ed,
   modify the content of the temporary font and
   check that the change is reflected in the memory image.

5) remember the address of FT_Stream->close as
   the address of ft_close_stream_by_munmap().

6) destroy the FT_Face object and erase the temporary
   font.
--
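
Step 4) of the procedure above can be sketched in isolation with plain POSIX calls. The helper names map_shared() and image_tracks_file() are hypothetical. Note one assumption: the sketch maps the file with MAP_SHARED, because POSIX leaves it unspecified whether later changes to the file are visible through a MAP_PRIVATE mapping, so the result of this probe against a buffer mapped by a library depends on the flags that library used and on the platform.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map the first `size` bytes of a file read-only.
   Returns NULL on failure. */
static unsigned char *map_shared(const char *path, size_t size)
{
  void *p;
  int fd = open(path, O_RDONLY);

  if (fd < 0)
    return NULL;
  p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
  close(fd);  /* the mapping stays valid after close() */
  return p == MAP_FAILED ? NULL : (unsigned char *)p;
}

/* Hypothetical probe: flip the first byte of the file and see whether
   `image` (assumed to hold the file content from offset 0) follows.
   Returns 1 if the change is visible, 0 if not, -1 on I/O error.
   The original byte is written back before returning. */
static int image_tracks_file(const char *path, const unsigned char *image)
{
  const volatile unsigned char *p = image;  /* force fresh reads */
  unsigned char old, probe;
  int fd, tracks;

  assert(path != NULL && image != NULL);
  old   = p[0];
  probe = (unsigned char)(old ^ 0xFF);

  fd = open(path, O_WRONLY);
  if (fd < 0)
    return -1;
  if (write(fd, &probe, 1) != 1) { close(fd); return -1; }
  tracks = (p[0] == probe);

  /* Restore the original first byte. */
  if (lseek(fd, 0, SEEK_SET) != 0 || write(fd, &old, 1) != 1) {
    close(fd);
    return -1;
  }
  close(fd);
  return tracks;
}
```

A buffer filled by malloc() + read() never follows the file, so the probe returns 0 for it. Even in this reduced form the check needs a writable temporary file and is sensitive to mapping flags.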

I think HarfBuzz developers don't want to insert such ugly
code.

In summary, it is quite difficult to check whether the stream
is mmap()-ed with existing releases of FT2. If a new API were
added in a future release of FT2, would it help HarfBuzz?

Regards,
mpsuzuki

P.S.
I once worked on a sample implementation of a new service
that gets the pathname from an FT_Stream. A similar approach
could be used to provide a way to check whether the memory of
the stream was internally mmap()-ed.




Re: [ft-devel] Known problem with FT_StreamRec::base being non-NULL also for file-based streams?

2010-03-02 Thread mpsuzuki
On Tue, 2 Mar 2010 09:30:39 +0200
Tor Lillqvist t...@iki.fi wrote:

 Indeed. If you know which function makes the final
 FT_Stream_EnterFrame() call and exposes the internal base
 value to the FT2 client, please let me know.

Sorry, I don't. As the enter/exit functions are called dozens of times
before something odd happens and the exit function isn't called (and
then later the Pango function is called), it would take quite some
time to debug... Maybe next weekend. A time-machine in the debugger
would be nice;)

Thanks, I will try to reproduce the issue myself.
If you have more important tasks, please do them first.
Later I may have to ask about your (or the original bug
reporter's) environment in detail.

Regards,
mpsuzuki




Re: [ft-devel] Known problem with FT_StreamRec::base being non-NULL also for file-based streams?

2010-03-02 Thread Ralph Giles
On Mon, Mar 1, 2010 at 11:30 PM, Tor Lillqvist t...@iki.fi wrote:

 Sorry, I don't. As the enter/exit functions are called dozens of times
 before something odd happens and the exit function isn't called (and
 then later the Pango function is called), it would take quite some
 time to debug... Maybe next weekend. A time-machine in the debugger
 would be nice;)

GDB 7.x has something like this, for some targets:

http://sourceware.org/gdb/current/onlinedocs/gdb/Reverse-Execution.html

I don't know if Windows is one of those.

 -r




[ft-devel] Known problem with FT_StreamRec::base being non-NULL also for file-based streams?

2010-03-01 Thread Tor Lillqvist
Hi,

I noticed a problem in the Pango FreeType backend on Windows, where it
would see a FT_StreamRec with a non-NULL base field, and thus assume
it is a memory-based stream as the comment in ftsystem.h promises (and
that base points to the whole font file mapped into memory).

I did some debugging, but did not really understand the root cause.
Anyway, what seems to be going on is that FreeType in
FT_Stream_EnterFrame() sets base to non-NULL, and then in a
corresponding FT_Stream_ExitFrame() it is set back to NULL. Presumably
this is supposed to all happen just internally, and the calling code
outside of FreeType is not supposed to see a FT_StreamRec with
non-NULL base? These functions are called repeatedly in pairs,
occasionally several enter followed by one exit, hmm. But then at
one point the exit function isn't called, and a non-NULL base
escapes into the upper levels. Any idea what could be happening? Have
there been problems like this reported for other platforms where
memory-mapped font files aren't used?
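
The behaviour described above can be modelled with a small stand-in. This is a simplified model and not FreeType's code: mock_stream is a hypothetical struct that keeps only the two fields under discussion, and the enter/exit functions imitate only the effect on base reported in this message. It shows why a test on base can give a false positive between an enter and its exit, while a test on read (as suggested elsewhere in this thread) does not change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for a stream record; NOT FreeType's FT_StreamRec. */
typedef struct mock_stream mock_stream;
typedef unsigned long (*mock_read_fn)(mock_stream *s, unsigned long offset,
                                      unsigned char *buf, unsigned long count);

struct mock_stream {
  unsigned char *base;  /* whole file for memory streams, else frame buffer */
  mock_read_fn   read;  /* NULL for memory-based streams */
};

/* Placeholder read callback that marks a stream as file-based. */
static unsigned long dummy_read(mock_stream *s, unsigned long offset,
                                unsigned char *buf, unsigned long count)
{
  (void)s; (void)offset; (void)buf;
  return count;
}

/* Model of entering a frame: a file-based stream gets a temporary buffer. */
static int mock_enter_frame(mock_stream *s, size_t count)
{
  assert(s != NULL && count > 0);
  if (s->read != NULL) {
    s->base = malloc(count);
    if (s->base == NULL)
      return -1;
  }
  return 0;
}

/* Model of leaving a frame: the temporary buffer is dropped again. */
static void mock_exit_frame(mock_stream *s)
{
  if (s->read != NULL) {
    free(s->base);
    s->base = NULL;
  }
}

/* The two candidate tests for "is this a memory-based stream?". */
static int memory_based_by_base(const mock_stream *s) { return s->base != NULL; }
static int memory_based_by_read(const mock_stream *s) { return s->read == NULL; }
```

In this model a file-based stream observed between mock_enter_frame() and mock_exit_frame() passes the base test even though it is not memory-based, which matches the symptom seen from Pango.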

Actually, it would be fairly trivial to add support for memory mapped
font files on Windows, too. The Win32 API for that is a bit more
verbose, but not really more complex, than the Unix mmap() & co. (But
that doesn't make the problem, if it is a problem in FreeType, go away
for platforms that are neither Windows nor Unix.)

--tml




Re: [ft-devel] Known problem with FT_StreamRec::base being non-NULL also for file-based streams?

2010-03-01 Thread Behdad Esfahbod
On 03/01/2010 09:18 PM, mpsuz...@hiroshima-u.ac.jp wrote:
 How about using
 
 if ( face->stream->read == NULL )
 
 instead of
 
 if ( face->stream->base != NULL )

Yes, that's what I'm planning to do instead.

Checking the source code, I wonder if I should also check for (face_flags &
FT_FACE_FLAG_EXTERNAL_STREAM) to detect whether it's an mmapped stream or the
user provided it (and hence we cannot mprotect).  The docs say: Don't read or
test this flag.

behdad




Re: [ft-devel] Known problem with FT_StreamRec::base being non-NULL also for file-based streams?

2010-03-01 Thread Tor Lillqvist
 Indeed. If you know which function makes the final
 FT_Stream_EnterFrame() call and exposes the internal base
 value to the FT2 client, please let me know.

Sorry, I don't. As the enter/exit functions are called dozens of times
before something odd happens and the exit function isn't called (and
then later the Pango function is called), it would take quite some
time to debug... Maybe next weekend. A time-machine in the debugger
would be nice;)

--tml

