On Mon, Jun 09, 2003 at 11:16:27PM +0900, Matthew Huggett wrote:
> 
> >Recently, I've made the 'unwise' decision to start studying Japanese next
> >year,

Unwise? Only if you don't really want to do it, or if you are laboring
under illusions--left over from the 80s--that it will guarantee you a
lucrative and glamorous career in international trade ;-)

But anyway, I am also interested in using ConTeXt for Japanese, and
would be glad to contribute what I can to this effort.

> I asked about Japanese a while back.  Hans requested more information on 
> encodings, fonts, etc.  I don't know enough about these things or 
> ConTeXt to know what is needed exactly.

I don't know much about ConTeXt internals, but do know something about
"these things," so I may be able to help. Was Hans' request on the
mailing list? If you know when it was posted, perhaps I can look it up.

> Typesetting Japanese could be more complicated than Chinese because of 
> the concurrent use of four writing systems:

On Mon, Jun 09, 2003 at 06:33:49PM +0200, Tim 't Hart wrote:
> 
> Unicode wasn't that popular because Unix-like operating systems used EUC as
> encoding, and Microsoft used their own invented Shift-JIS encoding.

There were also cultural/political reasons, with perhaps a touch of Not
Invented Here syndrome. But that's a different story.

> So there
> is still a lot of digital text out there written in these encodings, and a
> lot of tools still use it. But I think that if you want to write new texts,
> using Unicode shouldn't be a problem for most users. I guess that most
> editors supporting Asian encodings also make it possible to save in UTF-8. I
> think nowadays it's easier to find a Unicode enabled editor than it is to
> find a Shift-JIS/EUC editor! (Well, on Windows anyway...).

Yes, recent Windows versions (starting with NT 4.0 in the business
series, and ... not sure ... ME? in the consumer series) use some form
of Unicode as their base encoding, so I think it is now the norm for
Windows text editors to support UTF-8 ... I'm pretty sure TextPad does,
for example.

> Since ConTeXt
> already supports UTF-8, I don't see a reason to make things more difficult
> than they already are by writing text in other encodings.

On the face of it that makes sense. But I don't think it's safe to make
a blanket assumption that the text in a ConTeXt document will originate
with the creator of the document, or that it will be newly written.
Also, UTF-8 support is still a bit half-baked on Unix/Linux systems.
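
For existing text in the older encodings, one option is to convert it
before it ever reaches ConTeXt. Here is a minimal sketch in Python,
assuming you already know which encoding the source uses. The function
name to_utf8 is made up for illustration; shift_jis and euc_jp are
standard Python codec names.

```python
def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode legacy Japanese text (e.g. Shift-JIS or EUC-JP) as UTF-8."""
    # Decode strictly, so bytes that are invalid in the assumed encoding
    # raise UnicodeDecodeError rather than being dropped or replaced.
    text = data.decode(source_encoding, errors="strict")
    return text.encode("utf-8")


sample = "日本語の組版"
for codec in ("shift_jis", "euc_jp"):
    legacy = sample.encode(codec)          # simulate a legacy-encoded file
    assert to_utf8(legacy, codec).decode("utf-8") == sample
```

This does not guess the encoding; a wrong guess can still decode
without an error and produce garbage, so the result needs checking.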

> When I look at the source of the Chinese module, the most difficult part for
> me to understand is the part about font encoding, the enco-chi.tex file, and
> the use of \defineuclass in that file. I guess it has to do something with
> mapping the written text to the font.

Most likely. I might be able to glean something useful from that file.
I'll take a look when I can find the time.

> I guess that if you want to make a proper Japanese module, you'll need to
> support JIS or Shift-JIS encoded fonts.

This would be a good idea for Type 1 font support. It seems to me that
almost all recent Japanese TrueType fonts have a Unicode cmap table.

> But on the other hand, maybe we
> don't need to support that since there are a lot of Japanese Unicode fonts
> available. I use WinXP, and there we have msmincho.ttc and msgothic.ttc,
> which are both Unicode fonts.

Can PDFTeX handle TTC files? I know ttf2afm/ttf2pk can process them, but
the 2 or 3 times I have tried to include a Japanese TTC font directly in
a PDFTeX document, I was never able to make it work.
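
For reference, a TTC file bundles several fonts behind a single header,
so any tool reading one has to pick a member font first. The sketch
below, in Python, only lists where each member font starts. It says
nothing about what PDFTeX does internally, and the function name
ttc_font_offsets is made up for illustration.

```python
import struct


def ttc_font_offsets(data: bytes) -> list:
    """Return the byte offset of each font in a TrueType Collection."""
    # TTC header (big-endian): 4-byte tag 'ttcf', 16-bit major and minor
    # version, 32-bit font count, then one 32-bit offset per font.
    tag, _major, _minor, num_fonts = struct.unpack_from(">4sHHI", data, 0)
    if tag != b"ttcf":
        raise ValueError("not a TrueType Collection")
    # The offset array starts right after the 12-byte fixed header.
    return list(struct.unpack_from(f">{num_fonts}I", data, 12))
```

Reading the first few hundred bytes of a file such as msmincho.ttc and
passing them to this function should be enough to see how many fonts
the collection holds.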

> And Cyberbit is a Unicoded font as well. Commercially available fonts by
> Dynalab (Dynafont Japanese TrueType collection is quite cheap and very good)
> are also Unicode fonts. Again, I don't think we should make it difficult for
> ourselves by trying to support non-Unicode fonts while unicoded Japanese
> fonts are easy to use and widely available.

Well, it can be done in stages. I think that any serious attempt to
support Japanese in ConTeXt should encompass all common encodings. But
I don't see anything wrong with starting out Unicode-only.

> > Typesetting Japanese could be more complicated than Chinese because of
> > the concurrent use of four writing systems 
> 
> The fact that Japanese uses four writing systems is not really a problem.

Maybe it's not a big problem. But it is certainly more complex than
Chinese, since there is a mixture of proportional and fixed-width
characters, and the presence of Kana and Romaji complicates the
line-breaking rules.

> > I guess I need to track down a few sample documents.  I tried to turn up 
> > some info on Japanese typesetting rules but had no luck.

What would a good sample consist of? I can probably find something.

> The only info I got is from Ken Lunde's CJKV book, where he mentions some
> rules about CJK line breaking.

Yes, Lunde is good, but he doesn't go into enough detail to serve as an
implementor's guide. I've also searched for more info on this subject;
my impression is that besides Lunde's books there is really nothing
available in English. I could probably make some sense out of the
Japanese works that are available, but it would take up much more time
than I have.

> With the ConTeXt example that I posted yesterday, I am already able to write
> Japanese in UTF-8, use a Unicoded Japanese font in ConTeXt, and get Japanese
> output. I hope the hard part is already behind me! :-) The only thing that
> still puzzles me is how I can add interglyph space so that TeX can break the
> lines. If someone can help, I would really appreciate it!

Sorry, no idea. But it sounds like you've made an admirable effort so
far. I was working along similar lines a couple of years ago, but was
never able to produce anything useful. Guess you're a better TeXnician
than I.

-- 
Matt Gushee                 When a nation follows the Way,
Englewood, Colorado, USA    Horses bear manure through
[EMAIL PROTECTED]           its fields;
http://www.havenrock.com/   When a nation ignores the Way,
                            Horses bear soldiers through
                                its streets.
                                
                            --Lao Tzu (Peter Merel, trans.)