On Sun, Nov 14, 2010 at 02:11:41PM +0100, Xavier R wrote:
> Also, I want to translate DCSS in french (a huge work, I know, but im ready
> :] )
> is this possible/viable/cool/etc... ?

I don't know what Raphael wrote since that was in some exceptionally heathen
language :p, but here's my take:

Translating Crawl would require two things before the actual translation:
* Unicode support (or actually, any charset above ASCII)
* parametrizing strings

The former is in the works, by me, currently paused.  The reason for
postponing this project is that actual gameplay features need both bug and
balance testing, while technical stuff needs to be tested only for bugs.
I may resume this effort if there's a reason to do so, of course.
The problem that needs to be handled is: a typical Unix user will operate
using Unicode encoded in UTF-8, and expect it to be used everywhere.  A
Windows user connecting to a server like CAO typically uses CP437, since
that is what roguelikes have traditionally used.  A local French-speaking
Windows user will have his system in CP1252.  Crawl would
need to handle all of these and convert when needed.  Fortunately, this work
is more than halfway done already.
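
To make the conversion problem concrete, here is a minimal sketch in Python
(not Crawl's actual C++ code).  It assumes text is kept as Unicode internally
and converted only at the terminal boundary; the helper names to_terminal,
from_terminal and transcode are made up for this illustration.

```python
# Sketch only: hypothetical helpers, not part of Crawl.
# Assumption: text is Unicode internally and is converted at the edges.

def to_terminal(text, charset):
    """Encode internal text for a terminal using the given charset.

    Characters the charset cannot represent are replaced with '?'.
    """
    return text.encode(charset, errors="replace")


def from_terminal(data, charset):
    """Decode bytes received from a terminal into internal text."""
    return data.decode(charset, errors="replace")


def transcode(data, source_charset, target_charset):
    """Convert bytes from one terminal charset to another."""
    return to_terminal(from_terminal(data, source_charset), target_charset)


if __name__ == "__main__":
    message = "Vous êtes blessé"
    # The three charsets named above: Unix, CAO via Windows, local French Windows.
    for charset in ("utf-8", "cp437", "cp1252"):
        print(charset, to_terminal(message, charset))
```

The same accented letter ends up as different bytes in each charset, and
a letter missing from the target (for example Polish "ł" in CP437) has to
degrade to something, here a question mark.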

The second part would require coming up with a way to build messages that
meet a given language's grammar.  For example:
"The orc reflects the poison arrow off its shield!"

In English, the variable parts are: "The", "orc", "poison arrow", "its",
"shield".  There aren't that many dependencies; all the logic is:
* does the actor require an article?  Not if he has a proper name.
* what is the actor's gender?

Our code is more complex due to capitalization needed, but I have an idea
how to simplify that[1].  With no heed for capitalization, a parametrized
message would be:

"$ARTICLE(definite, %1) %1 reflects the %2 off $PRONOUN(possessive, %1) %3!"
(of course, the actual markup would be much terser)

$ARTICLE(x, y) would provide an article of kind x for word y -- for the
definite kind, this can be:
* upgrading "a"/"an" to "the" ("an orc" becomes "the orc")
* "the" ("the royal jelly")
* null article ("Agnes")
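
As a rough illustration of the English logic above, here is a Python sketch
(not Crawl code).  The Actor class, its fields, and the function names are
all assumptions made for this example; capitalization is ignored, as in the
parametrized message.

```python
# Sketch only: hypothetical names, not Crawl's real data structures.
from dataclasses import dataclass

POSSESSIVE = {"male": "his", "female": "her", "neuter": "its"}


@dataclass
class Actor:
    name: str                  # bare name without an article, e.g. "orc"
    gender: str = "neuter"     # "male", "female" or "neuter"
    proper_name: bool = False  # proper names such as "Agnes" take no article


def article(kind, actor):
    """Rough equivalent of $ARTICLE(kind, actor)."""
    if actor.proper_name:
        return ""              # null article
    if kind == "definite":
        return "the"
    # Naive a/an choice based on the first letter only.
    return "an" if actor.name[0].lower() in "aeiou" else "a"


def pronoun(kind, actor):
    """Rough equivalent of $PRONOUN(kind, actor); only possessives here."""
    if kind != "possessive":
        raise ValueError("only possessive pronouns are sketched")
    return POSSESSIVE[actor.gender]


def reflect_message(actor, missile, shield):
    """Build the example message, skipping empty (null) articles."""
    parts = [article("definite", actor), actor.name, "reflects the",
             missile, "off", pronoun("possessive", actor), shield]
    return " ".join(part for part in parts if part) + "!"


if __name__ == "__main__":
    print(reflect_message(Actor("orc"), "poison arrow", "shield"))
    print(reflect_message(Actor("Agnes", "female", True),
                          "poison arrow", "shield"))
```

The two questions from the list (does the actor need an article, and what
is its gender) are the only inputs the message depends on.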

This may be overkill for English, but things get more complex in
inflected languages -- for example in Polish, a verb changes according to
the actor's grammatical gender -- "odbił" masculine but "odbiła" feminine.
Unlike other text games I've played, Crawl has no group monsters, so at least
grammatical number is not an issue for living creatures -- only for items.

The Polish translation would be:
"$NOUN(%1, nominative, singular) $VERB("reflect", $GENDER(%1)) $NOUN(%2,
accusative, singular) od swej $NOUN(%3, genitive, singular)!"

$VERB(x, y) would provide the form of verb x appropriate for gender y.
$NOUN(x, y, z) would take word x, place it in case y and number z.
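
A table-driven sketch of how $NOUN, $VERB and $GENDER could be resolved, in
Python.  Everything here is an assumption for illustration: the function
names, the lookup tables, and the Polish noun forms, which were entered by
hand for this example (only "odbił"/"odbiła" come from the text above).  A
real translation would need full declension data or rules.

```python
# Sketch only: hand-written lookup tables, hypothetical function names.

GENDERS = {"orc": "masculine", "Agnes": "feminine"}

# (word, case, number) -> inflected Polish form (illustrative entries)
NOUN_FORMS = {
    ("orc", "nominative", "singular"): "ork",
    ("Agnes", "nominative", "singular"): "Agnes",
    ("poison arrow", "accusative", "singular"): "zatrutą strzałę",
    ("shield", "genitive", "singular"): "tarczy",
}

# (verb, gender) -> past-tense form agreeing with the actor's gender
VERB_FORMS = {
    ("reflect", "masculine"): "odbił",
    ("reflect", "feminine"): "odbiła",
}


def gender(word):
    """Rough equivalent of $GENDER(word)."""
    return GENDERS[word]


def noun(word, case, number):
    """Rough equivalent of $NOUN(word, case, number)."""
    return NOUN_FORMS[(word, case, number)]


def verb(word, actor_gender):
    """Rough equivalent of $VERB(word, gender)."""
    return VERB_FORMS[(word, actor_gender)]


def reflect_message_pl(actor, missile, shield):
    """Fill the Polish template from the message above."""
    return "{} {} {} od swej {}!".format(
        noun(actor, "nominative", "singular"),
        verb("reflect", gender(actor)),
        noun(missile, "accusative", "singular"),
        noun(shield, "genitive", "singular"),
    )


if __name__ == "__main__":
    print(reflect_message_pl("orc", "poison arrow", "shield"))
    print(reflect_message_pl("Agnes", "poison arrow", "shield"))
```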


The actual markup would probably use one-letter symbols, omit parentheses
unless needed, and guess missing arguments -- usually a verb needs to match
the noun before it, and the number is typically singular.  Different
pronouns/cases/etc could just use separate symbols.



After coming up with such a syntax, the biggest part would be going through
all the source and replacing all strings this way.  If marked in a
consistent way (like, _("foo") as gettext uses), we would be able to run
some automated tool and gather a list of messages that do need to be
translated.
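
To show what such an automated tool might do, here is a toy extractor in
Python.  It assumes strings are marked as _("...") on a single line; the
function name and the sample source are made up.  gettext ships a real tool
for this job (xgettext), so this is only meant to illustrate the idea.

```python
# Sketch only: a toy stand-in for a real extraction tool.
import re

# Matches _("...") where _ is not the tail of a longer identifier.
# Escaped characters inside the string (such as \") are kept as written.
MARKED_STRING = re.compile(r'(?<![A-Za-z0-9_])_\(\s*"((?:[^"\\]|\\.)*)"\s*\)')


def extract_messages(source):
    """Return marked strings in order of first appearance, without duplicates."""
    messages = []
    for match in MARKED_STRING.finditer(source):
        text = match.group(1)
        if text not in messages:
            messages.append(text)
    return messages


if __name__ == "__main__":
    sample = (
        'mpr(_("You feel sick."));\n'
        'mpr("debug only");\n'
        'mpr(_("You feel sick."));\n'
        'mpr(_( "The door opens." ));\n'
    )
    for message in extract_messages(sample):
        print(message)
```

Unmarked strings (the "debug only" line in the sample) are skipped, which
is the point of marking strings consistently.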


Then comes step three: actual translation.  This includes:
* messages
* verbs/etc used in the above
* monsters, items, etc
* speech and descriptions

The last part is already in a database, so it requires no code changes.
Just loads and loads of actual translating.


In other words, unlike translating some random desktop program, translating
Crawl would be a huge task.  If you are still interested, say so and we'll
help you.



[1]. Since all translatable strings would need to go through a new separate
function, its semantics may be different from what we have.  Since 99.9% of
messages are capitalized -- or perhaps even 100%, as I haven't found an
example to the contrary -- the function may always capitalize its input unless
specifically directed not to.  This would remove the need for manually
capitalizing pronouns and names at the start of a sentence.
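
A sketch of such a function, in Python for brevity.  The name message and
its capitalize parameter are hypothetical, not an existing Crawl function.

```python
# Sketch only: hypothetical wrapper that capitalizes by default.

def message(text, capitalize=True):
    """Return text with its first character upper-cased unless told not to.

    Only the first character is touched, so names later in the sentence
    keep their own capitalization.
    """
    if capitalize and text:
        return text[0].upper() + text[1:]
    return text


if __name__ == "__main__":
    print(message("the orc reflects the poison arrow off its shield!"))
    print(message("its shield gleams as Agnes raises it."))
    print(message("lowercase on purpose", capitalize=False))
```

str.capitalize() is avoided on purpose: it would lower-case the rest of
the string and turn "Agnes" into "agnes" in the middle of a sentence.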

-- 
1KB             // Microsoft corollary to Hanlon's razor:
                //      Never attribute to stupidity what can be
                //      adequately explained by malice.

_______________________________________________
Crawl-ref-discuss mailing list
Crawl-ref-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/crawl-ref-discuss