To: [EMAIL PROTECTED]

On the Jakarta General list, we've been discussing the possibility of introducing an "Internationalization" project into incubation. The consensus seems to be that it should be targeted as a top-level Apache project, independent of both programming language and spoken language, rather than as a Jakarta subproject.
(To anyone on the JG list: I used a blind CC so that this is the only message on [EMAIL PROTECTED] which should be CCd to JG. You can set up message filters on "[i18n]" on both lists to follow the discussions in either place....)

A preliminary organization of the project based on the JG discussions is included in my message below. I don't mind "spearheading" the incubation myself. Is there anyone else interested whom we can add to the list of contributors (see A through F below)? Is there anything else we should consider before requesting entry into incubation?

TIA.

Robert Simpson

-------- Original Message --------
Subject: Re: [i18n] Internationalization subproject sponsor?
Date: Sun, 13 Jul 2003 21:32:36 +0100
From: robert burrell donkin <[EMAIL PROTECTED]>
Reply-To: "Jakarta General List" <[EMAIL PROTECTED]>
To: "Jakarta General List" <[EMAIL PROTECTED]>

On Monday, July 7, 2003, at 01:14 PM, Robert Simpson wrote:

<snip>
> I am surprised there isn't more interest in a common internationalization
> framework within Jakarta. But then I have been assuming that there are
> non-English-speaking "members" in Jakarta, not just "committers" and
> other users of the code.

i think that there are several jakarta members who are not native english speakers. as Tetsuya Kitahata pointed out there are far fewer members than committers and i'm not sure whether there are any jakarta members who are native speakers of non-latin languages.

it takes a lot of energy to spearhead an incubation and it's a big commitment for a member to make. but i don't think that the member would have to come from jakarta (even if that's where those people involved with the product hope that it will end up).

i wonder whether you might have more luck finding a sponsor over in xml-land. since many of their products are multi-language a common i18n framework may be of more pressing importance than here. i also have an idea that there are members whose native languages are non-latin.
i like the idea of an apache wide i18n project along the lines suggested by Tetsuya Kitahata.

- robert

-------- Original Message --------
Subject: Re: [i18n] Internationalization subproject
Date: Sat, 12 Jul 2003 08:55:00 -0400
Reply-To: "Jakarta General List" <[EMAIL PROTECTED]>,[EMAIL PROTECTED]
To: Jakarta General List <[EMAIL PROTECTED]>
References: <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>

WRT Santiago's point about keeping the different translations in sync, the solution is to have each word/phrase (for user interface translations) or each section (for complete documents) identified in the XML with a version number. Then it would be a simple matter to have a program compare the two documents and indicate where the translation needs to be updated. The program could even provide an initial translation of the section via machine translation, to be refined by the human translator.

The XML should also indicate who made each change, and whether the change was prompted by a need to change the document itself (additions to content, for example) or was made as a translation of another version. That way, no particular translation would have to be the "primary" document, and any conflicts could be identified and handled. For example, a Spanish-speaking person could add a missing section to the Spanish translation of a document, and that section could then be translated back into the original and the other translations.

This arrangement could also handle "proposed" additions (the XML equivalent of "I, a Spanish translator, propose to add a new section here"), which could be commented on (ex: "that section would be better placed over there") and/or voted on by translators of other languages, etc....

Am I right in sensing that the Internationalization project would ultimately be targeted as a top-level, multiple-programming-language Apache project?
If so, I think the best approach would be to get the Java support done first, to demonstrate its viability and usefulness. But still, from the start, the intent should be to design with language-independence as the ultimate goal.

So, in summary, the organization of the project would be:

1. code common to both (2) and (3)
  1.1 code
      This would include any code that supports both (2) and (3), such as the code to do comparisons between translations
    1.1.1 any programming-language-neutral stuff (configuration files, XML, etc)
    1.1.2 Java
      1.1.2.1 source code
        1.1.2.1.1 source code contributors (committers)
    1.1.3+ other programming languages, similarly

2. user interface internationalization (words and phrases)
  2.1 code
      This would include the code to generate programming-language-specific resources, and provide access to those resources
    2.1.1 any programming-language-neutral stuff (configuration files, XML, etc)
    2.1.2 Java
      2.1.2.1 source code
        2.1.2.1.1 source code contributors (committers)
      2.1.2.2 resources (translations, generated from XML)
    2.1.3+ other programming languages, similarly
      2.1.3+.1 source code for other programming languages
      2.1.3+.2 resources for other programming languages (translations, generated from XML)
  2.2 language translations (programming-language-neutral)
    2.2.1 any spoken-language-neutral stuff (all-language distribution files, JUnit tests for file verification, etc)
    2.2.2 English language translations (initial "source" translations)
      2.2.2.1 XML format
        2.2.2.1.1 English language translators (committers)
      2.2.2.2 English user references
        2.2.2.2.1 XML formatted user reference (generated, XSL-FO?)
        2.2.2.2.2 HTML formatted user reference (generated, possibly with a doclet)
        2.2.2.2.3 PDF formatted user reference (generated, possibly from XML user reference using Apache XML-FOP)
    2.2.3+ other spoken languages, similarly

3. internationalization of complete documents
  3.1 code
      This would include code or tools (possibly making use of other Apache code) to generate specific document file formats
    3.1.1 any programming-language-neutral stuff (configuration files, XML, etc)
    3.1.2 Java
      3.1.2.1 source code
        3.1.2.1.1 source code contributors (committers)
    3.1.3+ other programming languages, similarly
      3.1.3+.1 source code for other programming languages
  3.2 language translations (programming-language-neutral)
    3.2.1 any spoken-language-neutral stuff (all-language distribution files, JUnit tests for file verification, etc)
    3.2.2 English language translations (initial "source" translations)
      3.2.2.1 XML format (based on XSL-FO?)
        3.2.2.1.1 English language translators (committers)
      3.2.2.2 HTML format (generated)
      3.2.2.3 PDF format (generated, possibly using Apache XML-FOP)
      3.2.2.4+ other document file formats (generated)
    3.2.3+ other spoken languages, similarly

The main difference between sections (2) and (3) is in how they are organized. Section (2) is organized primarily by programming language: the programming-language-specific resources are part of the first subsection (2.1), which keeps the second subsection (2.2) programming-language-neutral. Section (3) is organized primarily by spoken language: the programming-language-independent file formats are part of the second subsection (3.2), which keeps them separate from the programming-language-specific stuff in the first subsection (3.1).

I'd be willing to work on the common code and user interface code, and it looks like there is a good starting list for the language translators. Is there anyone willing to drive the second part, the internationalization of complete documents?

I can also update the proposal as indicated above, and then let it be reviewed/modified here, or in CVS somewhere.
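As a rough illustration of the version-number comparison described earlier in this message, here is a minimal Java sketch. Everything specific in it is an assumption made for the example: the element name "section", the attributes "id" and "version", and the class name StaleTranslationFinder are hypothetical, since no XML format has been agreed on in this discussion. It only reports which sections of one document are missing or older in the other; tracking who made a change, and why, is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical sketch: lists sections whose translation lags behind a reference document. */
public class StaleTranslationFinder {

    /** Reads id -> version from every (assumed) <section id="..." version="..."> element. */
    public static Map<String, Integer> readVersions(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList sections = doc.getElementsByTagName("section");
        // LinkedHashMap keeps the sections in document order.
        Map<String, Integer> versions = new LinkedHashMap<>();
        for (int i = 0; i < sections.getLength(); i++) {
            Element section = (Element) sections.item(i);
            versions.put(section.getAttribute("id"),
                    Integer.parseInt(section.getAttribute("version")));
        }
        return versions;
    }

    /** Returns ids of reference sections that are missing or have a lower version in the other document. */
    public static List<String> findStale(String referenceXml, String otherXml) throws Exception {
        Map<String, Integer> reference = readVersions(referenceXml);
        Map<String, Integer> other = readVersions(otherXml);
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : reference.entrySet()) {
            Integer have = other.get(entry.getKey());
            if (have == null || have < entry.getValue()) {
                stale.add(entry.getKey());
            }
        }
        return stale;
    }

    public static void main(String[] args) throws Exception {
        String english = "<doc><section id='intro' version='2'>Welcome</section>"
                + "<section id='install' version='1'>Install</section>"
                + "<section id='faq' version='1'>FAQ</section></doc>";
        String spanish = "<doc><section id='intro' version='1'>Bienvenido</section>"
                + "<section id='install' version='1'>Instalar</section></doc>";
        // "intro" is older in the Spanish document and "faq" is missing from it.
        System.out.println(findStale(english, spanish)); // prints [intro, faq]
    }
}
```

Because neither document is treated as "primary", the same check can be run in both directions: calling findStale(spanish, english) would list sections that were added or updated in the Spanish document first and still need to be carried back into the English one.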
In your replies to the mailing list, please indicate in which of the following ways you might be willing to contribute:

A) committer for code for internationalization of user interface and possibly common code
B) committer for code for internationalization of complete documents and possibly common code
C) language translation (either or both UI or documents)
D) sponsor entry of Java version of Internationalization subproject into Jakarta
E) incorporate internationalization into another Apache/Jakarta sub/project (please specify)
F) none of the above

Robert Simpson

Santiago Gala wrote:
> Robert Simpson wrote:
> > Santiago Gala,
> >
> > As far as document and resource translation goes, I'm not sure if you
> > are referring to machine translation, or human translation. My focus has
> > been on human translation, mainly because machine translation is
> > still pretty far from perfect. There could be APIs for interfaces to
> > various machine translation tools, such as Systransoft, but I think
> > that should be a later, secondary priority. Even if there was
> > support for machine translation, I would prefer that it could be
> > augmented by human proofreading and revision. So it's probably just
> > as easy to let the language translator use whatever machine
> > translation tool s/he prefers.
> >
> > David Taylor has already answered WRT code.
>
> I was thinking mostly about having a "pool" of people who can translate
> and are more or less "cross project". For instance, I can translate
> English to Spanish, and I'm a committer in Jetspeed, but I could also
> translate, say, parts of the Tomcat documents that I'm reading, or some
> XML stuff I'm interested in. Or even docs for Apache modules.
>
> The good part is that it would help the whole community, both WRT
> translation efforts and WRT crosspollination, as these kinds of people
> will "see" beyond their small project(s). Also, it could bring new kinds
> of developers (Today I heard on the radio, coming home, that 72% of
> people in Spain cannot speak *any* foreign language. We are a bad sample
> but in most of Europe, less than 50% of people speak English.)
>
> The problem is that I can't see clearly how to implement such a
> crosscutting service/project, in ways that would not be difficult to
> impossible to manage. Especially since we should keep source control on
> both the original doc and the translations in sync.
>
> Any ideas?
>
> Regards
> --
> Santiago Gala
> High Sierra Technology, S.L. (http://hisitech.com)
> http://memojo.com?page=SantiagoGalaBlog