Hi Fang,
not bad for a first try. We are talking about several different concepts
here (flash format, text layout, geometry, ...), so it's easy to mix
them up. I occasionally do, since international text support is
a very complicated matter :)
For starters it's very helpful to understand the flash file format
a little. That is our target. To give a very basic overview, in
flash you have two things. The first is a definition of formatted
geometry (for example a circle filled with the color blue). If you only
define a geometry in a flash file, nothing will be visible. For that you
need the second, a reference to that geometry. So if you would like
to have a flash file with hundreds of blue circles, you only need
to define a blue circle once and reference it a hundred times.
When you reference it you are able to transform the geometry.
This means you can individually translate, rotate and scale it
before you display it.
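
To make the define/reference idea a bit more concrete, here is a minimal sketch. All names in it (Movie, Transform, Placement, defineShape, placeShape, makeCircles) are made up for illustration and are not the real flash tags or our exporter classes; rotation is left out to keep it short.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of a flash file (not the real tag layout).
struct Transform {
    double scale = 1.0;  // uniform scale factor
    double dx = 0.0;     // translation in x
    double dy = 0.0;     // translation in y
};

struct Placement {
    std::uint16_t shapeId;  // which definition is shown
    Transform transform;    // how this particular reference is shown
};

struct Movie {
    std::uint16_t nextId = 1;
    std::vector<std::uint16_t> definitions;  // ids of the defined shapes
    std::vector<Placement> placements;       // what is actually visible

    // Define a geometry once. Nothing is visible yet.
    std::uint16_t defineShape() {
        definitions.push_back(nextId);
        return nextId++;
    }

    // Reference an existing definition with its own transformation.
    void placeShape(std::uint16_t id, const Transform& t) {
        placements.push_back(Placement{id, t});
    }
};

// Build a movie that shows `count` circles from a single definition.
Movie makeCircles(int count) {
    Movie movie;
    const std::uint16_t circle = movie.defineShape();
    for (int i = 0; i < count; ++i) {
        Transform t;
        t.dx = 10.0 * i;  // every reference gets its own position
        movie.placeShape(circle, t);
    }
    return movie;
}
```

Calling makeCircles(100) gives one definition and a hundred placements that all carry the same shape id.
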
There is a special case for text in flash. You can have a glyph
table that contains all glyphs of one font that is used in the
flash file. That table is defined once. Each text then references that
glyph table, and the characters in the text are stored as indices into
that table. This is how we do it for Latin text that is written
from left to right (like this mail).
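
As a rough illustration of the glyph table idea, a text can be stored as indices into a shared table like this. GlyphTable, indexOf and encodeText are hypothetical names, and a plain char stands in for the real glyph outline.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical glyph table: one entry per distinct character that is used.
struct GlyphTable {
    std::vector<char> glyphs;  // a char stands in for the glyph outline

    // Return the index of `c`, adding it to the table on first use.
    std::size_t indexOf(char c) {
        for (std::size_t i = 0; i < glyphs.size(); ++i) {
            if (glyphs[i] == c) return i;
        }
        glyphs.push_back(c);
        return glyphs.size() - 1;
    }
};

// Store a text as indices into the shared glyph table.
std::vector<std::size_t> encodeText(GlyphTable& table, const std::string& text) {
    std::vector<std::size_t> indices;
    for (char c : text) {
        indices.push_back(table.indexOf(c));
    }
    return indices;
}
```

The text "abba" ends up as four indices, but the table only holds the two glyphs "a" and "b".
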
For right to left (for example Hebrew) or complex text we export
the geometry for complete text runs. Now in the flash project,
Writer::Impl_writeText() gets a text run as input. That can be
more than a single character and less than the whole text of a
textbox. A text run is a contiguous part of the whole text that
has the same formatting and is placed on the same line. Look
at the following:
    My Dear,
    thats great!
Imagine that "Dear" is bold and the rest is plain. This gives us
the four text runs "My ", "Dear", "," and "thats great!".
Now this is Latin text, but if we just imagine it's complex text
(for example Simplified Chinese) then Writer::Impl_writeText()
will be called four times and will define & reference four
PolyPolygons that it gets from OutputDevice::GetTextOutline().
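
To show how the four runs come about, here is a small sketch that splits a text into runs whenever the formatting or the line changes. StyledChar, splitIntoRuns and append are hypothetical helpers for this mail only; the real layout code is far more involved and only "bold" is modelled as formatting here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-character information, reduced to what matters here.
struct StyledChar {
    char ch;
    bool bold;  // stands in for the complete formatting
    int line;   // index of the line the character is placed on
};

// Split the text into runs of equal formatting on the same line.
std::vector<std::string> splitIntoRuns(const std::vector<StyledChar>& text) {
    std::vector<std::string> runs;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const bool startsNewRun = i == 0
            || text[i].bold != text[i - 1].bold
            || text[i].line != text[i - 1].line;
        if (startsNewRun) runs.emplace_back();
        runs.back().push_back(text[i].ch);
    }
    return runs;
}

// Helper to build the input: append `s` with one formatting on one line.
void append(std::vector<StyledChar>& text, const std::string& s,
            bool bold, int line) {
    for (char c : s) text.push_back(StyledChar{c, bold, line});
}
```

Feeding it "My " (plain), "Dear" (bold) and "," (plain) on line 0, plus "thats great!" (plain) on line 1, gives exactly the four runs from the example.
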
As you can see, the first text run "My " would contain at least
two glyphs, one for "M" and one for "y". But you get only one
PolyPolygon. Since a single glyph can be constructed from multiple
polygons (for example "i" needs at least two), we can not
assume that the PolyPolygon contains one Polygon for each glyph.
Now my idea was to cache the PolyPolygon so the define is only
exported if the same PolyPolygon was not yet exported.
The cache would contain the PolyPolygon for comparison and the
shape id we get when we define a geometry in the flash file.
So caching would work something like this:
    for each PolyPolygon in the cache
        if that PolyPolygon is the same as the PolyPolygon from the
        current text run, use that flash id.
    if no PolyPolygon was found in the cache then
        export the PolyPolygon from the current text run to flash
        put that PolyPolygon into the cache together with
        the new flash id.
    reference the found or the newly created flash id.
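
In C++ the steps above could look roughly like the following sketch. Point, Polygon and PolyPolygon here are simple stand-ins, not the classes from the tools project, and ShapeCache, idFor and defineShape are hypothetical names. The actual export is replaced by a counter.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Simple stand-ins for the real geometry classes (assumption, not tools API).
struct Point { long x; long y; };
inline bool operator==(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.y;
}
using Polygon = std::vector<Point>;
using PolyPolygon = std::vector<Polygon>;

class ShapeCache {
public:
    // Return the flash id for `shape`; define it only if it is not cached yet.
    std::uint16_t idFor(const PolyPolygon& shape) {
        for (const auto& entry : entries_) {
            if (entry.first == shape) return entry.second;  // cache hit
        }
        const std::uint16_t id = defineShape(shape);  // cache miss: export
        entries_.emplace_back(shape, id);
        return id;
    }

    // How many defines were written so far.
    int definesWritten() const { return defines_; }

private:
    // Placeholder for the real export; it only hands out a new id.
    std::uint16_t defineShape(const PolyPolygon&) {
        ++defines_;
        return nextId_++;
    }

    std::vector<std::pair<PolyPolygon, std::uint16_t>> entries_;
    std::uint16_t nextId_ = 1;
    int defines_ = 0;
};
```

The linear search mirrors the pseudocode. With many cached shapes a hash over the points would probably be the better choice, but that is a later optimization.
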
Now this has the drawback that we would cache text run geometry,
not single glyph geometry. As you can imagine, single glyphs would
cache better than whole text runs.
I talked with Herbert Duerr about this (he is our text and font guru)
and he told me that we can use another method from the vcl output
device.
Instead of using OutputDevice::GetTextOutline(), which gives us
one PolyPolygon for the complete text run, we should use
OutputDevice::GetTextOutlines(). Now that will give us a vector
with one PolyPolygon for each glyph. Now we can add a cache for
each PolyPolygon in that vector.
One pitfall here is that the PolyPolygons for the glyphs are already
translated. Take the text run "aa" as an example. Although we have the
same glyph twice, the two PolyPolygons from the returned vector are
not identical. That's because the second "a" is translated so that its
position is after the first "a". So for caching we first need to
remove the translation from the PolyPolygon. (I'm very sorry,
but yes this all is very complicated :)
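
One possible way to remove the translation, purely as a sketch: move every glyph outline so that its bounding box starts at (0, 0) and remember the offset for the reference. This assumes that the same glyph always produces the same outline apart from the translation. Point, Normalized and removeTranslation are hypothetical names, and integer coordinates are used so the comparison is exact (with doubles a small tolerance might be needed).

```cpp
#include <algorithm>
#include <vector>

// Simple stand-ins for the real geometry classes (assumption, not tools API).
struct Point { long x; long y; };
inline bool operator==(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.y;
}
using Polygon = std::vector<Point>;
using PolyPolygon = std::vector<Polygon>;

struct Normalized {
    PolyPolygon shape;  // outline moved so its bounding box starts at (0, 0)
    Point offset;       // translation to apply again when referencing it
};

// Split a translated glyph outline into a normalized shape and an offset.
Normalized removeTranslation(const PolyPolygon& glyph) {
    Normalized result{glyph, Point{0, 0}};
    bool first = true;
    // Find the top left corner of the bounding box.
    for (const Polygon& poly : glyph) {
        for (const Point& p : poly) {
            if (first) { result.offset = p; first = false; }
            result.offset.x = std::min(result.offset.x, p.x);
            result.offset.y = std::min(result.offset.y, p.y);
        }
    }
    // Move all points so that this corner ends up at (0, 0).
    for (Polygon& poly : result.shape) {
        for (Point& p : poly) {
            p.x -= result.offset.x;
            p.y -= result.offset.y;
        }
    }
    return result;
}
```

For the run "aa" both outlines then normalize to the same shape, so the cache can match them, and only the two offsets differ.
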
Now a second thing that Herbert suggested to me. There is a version
of OutputDevice::GetTextOutlines() that uses the new B2DPolyPolygon
from the basegfx project instead of the old PolyPolygon from the
tools project. The advantage of the B2DPolyPolygon is that it works
with doubles and still contains the perfect curves for the text.
The old PolyPolygon (that is currently used all over the flash export)
is not as smooth.
So another step that we could take is to add export of B2DPolyPolygon to
the flash export; that would increase the rendering quality.
After that hunk of information, now to your questions
FangYaqiong wrote:
> 1. Each letter or text is exported as a polygon, something like a
> graphics which is combined by lines or curves. The implement of the
> polygon which we want to translate to is in vcl and we can call for it
> in filter.
Basically yes.
> 2. The difference between latin and non latin is that each character of
> latin has the same polygon to present, but the polygon which is the
> presentation of the non latin character is different in different position.
It's more like this: in Latin each character always has the same glyph.
For non-Latin text it can, but need not, have the same glyph.
> So another problem is that we need to export the glyph of each latin
> character only once, and if we export it at second time we can get the
> glyph from a cache.
Yes. Furthermore, we currently only export the glyphs we use. So there
already is a cache for Latin glyphs, and since there is a one-to-one
relation from character to glyph for Latin text, the character is used
to access the glyph in the cache.
> If we want to export the glyph of a non latin
> character, we should export it each time we use it. So we should build
> up a cache for non latin character. And the flow chart is as follows.
We currently export it each time we use it. The wording here should be
"We should only export each glyph once if we use it, not each time it
is used"
> 3. What you have done is appending the export of the non latin character
> and what remained to be done is that a cache for the glyphs of the non
> latin characters is necessary. Is that right?
Yes, that's correct.
> And I still have some questions to this task.
> 1. Have you built up a CWS for this project and how can I get the source
> code from the CVS?
No, there is neither a CWS nor an issue for this yet. So feel free to
write one and create a CWS yourself.
> 2. What's the meaning of "right to left case"?
This mail is written left to right. For example Hebrew or Arabic write
their letters right to left. Then Chinese is Top to Bottom I think.
I don't know if there is a Bottom to Top case but I bet there is one.
> 3. Why the glyph of each non latin character is different in different
> position but the glyph of each latin character is the same?
Don't ask me. I was born with only 26 characters and only 3 "Umlaute".
I don't know why other languages write characters differently, depending
on the word they're in, the position or maybe if the writer is male or
female. Maybe Herbert can give some examples, I will ask him.
> 4. There are too many kinds of glyphs to present the non latin as the
> glyph of each non latin character is different in different position, is
> it need to build a cache to store all these glyphs?
See above, the glyph can be different, but it need not be. So for complex
text it is possible that there are more glyphs than characters. But surely
the number of different glyphs is finite.
> 5. Is the source code you tell me in the last email is added in
> filter\source\flash\swfiwriter1.cxx?
Yes. I'm not sure in which version it was added, but src680m202 should
contain it. If not, you can update it from cvs.
> 6. Whether the glyph of the non latin character should be generated
> every time we export or it should be exported each time it is used. What
> I understood is described as a flow chart and if it is wrong please let
> me know.
Ok about the flow chart:
"Whether is laton or non latin(right to left case)"
Just a comment here: if you use flow charts you should make the
condition clear. This sentence confused me because I could not tell which
is the yes and which the no branch. So I would suggest using something like
"character is latin and not right to left".
The box "If the glyph for the character is in the cache" has three
outputs, what's the third doing? :) As it is, this is not a flow chart but
more a kind of org chart.
So as I understand it, that flow chart is a very basic overview of what
we do now. The flaw is that at the end it is not the glyph that is exported
but a flash text chunk that references the glyphs (or currently just a
PolyPolygon for complex text, which is not stored as a flash text chunk).
The cached Latin glyphs are exported later in the code, when the fonts
are exported.
Regards,
Christian