Re: [NTG-context] Design for Translation

2009-03-12 Thread Mari Voipio

Mari Voipio wrote:
(BTW, if you'd like my 
editing instructions, I have them somewhere in rtf format.)


It turns out I was actually smart enough to add these into the version 
control when I revised the file some time ago. I checked the file and 
there's nothing that couldn't be published, so for now you can fetch the 
pdf version of my editing instructions at 
http://www.kpatents.com/pdf/support/manual_editing_instructions.pdf. 
The file will eventually disappear, but not this week.



These instructions are written for a Windows dummy who's at best 
edited HTML manually and at worst barely manages to open and save a file.



The rtf version of the instructions is available on request, if somebody 
has a use for it. The only thing that I ask for is that I get your 
finished instruction file/presentation/whatever in exchange (pdf is 
fine) as that'll help me improve mine.




Regards,

Mari

___
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : https://foundry.supelec.fr/projects/contextrev/
wiki : http://contextgarden.net
___


Re: [NTG-context] Design for Translation

2009-03-12 Thread Mari Voipio

John Devereux wrote:

First thing to remember is that I started the ConTeXt project many
years (5???) ago and the program *and* its documentation have evolved
a lot since then.


Is there anything in particular new that you think might help?


Compared with the situation a few years back, WinConTeXt MkIV is a 
breeze to install and update. And it works without fiddling, no font 
installation, no funny coding in the format file, nothing. Even the 
Cyrillic version worked when I just got the file encoding correct.


Even if they'll never compile the file, installing WinConTeXt is 
probably the easiest way to get SciTe with correct syntax highlighting. 
Except that you need to set UTF8 as default by yourself and not all 
users can get that far; the more patient ones will, though, if 
instructions are clear (go to menu x, click on y, write or copy to the 
file exactly the following line, save, close SciTe, reopen.)




Daydreaming in an off-topic direction:

This would actually be really handy for people who need to edit .tex but 
never compile: a download-and-click-to-install SciTecumConTeXt package 
that had .tex defaulted to ConTeXt and had appropriate highlighting, but 
nothing else. Encoding could probably also be defaulted to UTF8 at this 
point or the installation could ask whether encoding is Windows standard 
or UTF and then put the default in as needed.
If such a package existed, I could just dump the installation package 
and the files to be edited on a USB stick or CD and (snail)mail the 
whole thing to the person doing the translation. Most Windows users can 
handle the installation bit, if it is like installing for example Adobe 
Acrobat Reader.
I.e. I'm looking for something like the Notepad related free/shareware 
html editors that highlight but don't have much brain otherwise - while 
I use SciTe for my html work, my colleague has been happy with 
Notepad-something-or-other (and that finally rescued me and our code 
from the clutches of FrontPage).




2 out of 7 is not what I was hoping for... I assume these were your
agents or similar (rather than professional translators)? If so that
is how we were hoping to do it too.


Correct. One of those two has some background in programming, so he even 
managed the first chapter with Notepad (or something similar), although he 
whined that it was difficult without highlighting. The other one was an 
elderly consultant who may have dealt with computers in the time when 
text editing/layouting still involved similar coding. Or then he was 
just used to taking on any job that falls his way, at least he managed 
beautifully.


All or most of the others are marketing people who take one look at the 
editing instructions and give up. This happened even in-house where I 
would have been easy to reach for any help and even offered to install 
the system on his computer. No hope (I was reasonably annoyed with this 
one, cutting-and-pasting that language took me like three weeks...).


The funniest (but sad) thing is that after they give up on .tex, I also 
offer to convert the graphics into formats that will be easy to insert 
into Word if they want to do a translation of their own, but they've 
*never* taken me up on that offer yet. Instead they apparently just cut 
and paste the graphics from our official pdf and the result is usually 
not as good quality as it would have been with the stuff I send. The one 
I'm now working on is pretty sad. Although not as sad as the 20 meg 
version I received a year earlier that had tracking on, too (thankfully 
Word2007 now has an easy cleaning function with which I got rid of the 
10 megs of revisions).




I was going to try saving the
tex original as .doc - word seems to open it OK - and then saving
*their* end product as encoded text/UTF8. Has anyone tried this?


I think I tried this with the Russian test file and it worked, but I can 
retry (I've got Word 2003). The very important bit is to choose encoded 
text as save format - it allows you to do plain text + UTF8, but the 
file just isn't saved as UTF8 (confusing or what). Been there, done 
that.
Also, the tip I got here for a character converter, charsc, was pretty 
good. Can't quite figure out the command line version, but I can get it 
to work if I know the original encoding (it couldn't guess at 
Windows-Cyrillic, IsoLatin1 went better). It is downloadable at 
http://www.kalytta.com/tools.php.



Also, a caveat about Word. As my editing instructions say, you really 
have to have all AutoXxx features turned off or your tex code can get 
pretty fishy. In the worst case the autocorrect features do things to 
your parentheses and even in the best case you get to fight with things 
like ... turning into real ellipsis and possibly getting mussed up later 
in conversion. If you can talk your translators into using WordPad 
instead (if they are not up to trying SciTe), it will make your life 
easier in the long run. *you* can still open the resulting 

Re: [NTG-context] Design for Translation

2009-03-12 Thread John Devereux
Mari Voipio mari.voi...@iki.fi writes:

 John Devereux wrote:
 But I would really appreciate any insights anyone may have.

 I've got some experience on this. I'm sure my way is not optimal, but
 at least it is an experience.

Hi Mari, thank you very much for such a long and detailed answer! Your
experience very much reflects how I can see things going.

 First thing to remember is that I started the ConTeXt project many
 years (5???) ago and the program *and* its documentation have evolved
 a lot since then.

Is there anything in particular new that you think might help?

 The other thing is that I didn't expect to have to deal with
 translations. The documentation (and the instrument the manual is for)
 were both supposed to exist in English only. Yeah. Sure. (We don't do
 consumer electronics, so regulations are a bit different than for
 stuff you buy in a shop.)

 So over these years I've dealt with repeated "please send us the
 manual as Word file for translation" queries. Every time I've
 explained in words of one syllable that there is not and will not be a
 Word file, that the distributed manual file has origins in a totally
 different system. We stopped using Word when the file grew so big that
 Word just couldn't cope and when most of the figures to be included
 were pdf anyway and thus easily incorporated into ConTeXt files.
 I always offer to send the potential translators the files and the
 editing instructions and say that I can do a pdf out of the translated
 files any time, for example after each chapter. (BTW, if you'd like my
 editing instructions, I have them somewhere in rtf format.)

Certainly, if it is convenient, thank you.

 The reactions to the above information vary. A South American
 professional translator took the files without whining and turned in
 the translated Spanish text with only *two* messed up codes - which is
 a lot less of a mess than I do when editing. The French gave up
 directly; they supposedly have a Word version of their own of the
 manual and so do the Poles. Three other languages were written in Word
 (or similar) and I had all the fun in cutting-and-pasting the text
 into the ConTeXt files. Italian was first done with cut-and-paste
 method, but then needed so much work that to my surprise they edited
 the ConTeXt files for me with a very good result - that's probably the
 most accurate translation of the whole lot.

2 out of 7 is not what I was hoping for... I assume these were your
agents or similar (rather than professional translators)? If so that
is how we were hoping to do it too.

 The Russian version of our manual is in the works. They wanted to do
 it all by themselves, but I haven't heard anything since I debugged
 their last file (encoding problem, Win-cyrillic to UTF). I hope that
 means everything is under control there... They are basically working
 on a pared-down duplicate system so we can easily exchange files.

 I should add that except for the South American translator and the
 Russians, the other persons are not IT people, nerds or not
 necessarily even that computer literate (if their usage of Word is
 anything to go by). If your translators are used to structural
 coding (html, for example) and especially if they already use suitable
 editors, you'll have a lot less problems.

Not much chance of that I'm afraid. Although there's no reason I could
not tell them to use Scite or similar. I was going to try saving the
tex original as .doc - word seems to open it OK - and then saving
*their* end product as encoded text/UTF8. Has anyone tried this?

 Then the practical aspect. What I had from the beginning is a system
 where each chapter of the manual is a file of its own - makes it much
 easier to handle. Most of the formatting and setups are in the main
 file, so the chapters just contain a list of figures and then the text
 itself. This makes them much easier to edit and handle.

I was going to have a single environment file, which the translators
never see, then a *single* document file. But perhaps separate
chapters would be better... My document is not so big, maybe 30 pages
of text (plenty of screen captures too). So perhaps my document is
like one of your chapters. But there could be several other documents
in the pipeline, so am trying to come up with a workable approach.

 When I started getting the languages, I made subdirectories for them,
 one per language. This is where I put the tex files for that language
 + all the figures that have translated text in them; ConTeXt will look
 first in the same directory and then further afield, so if there's a
 translated figure, it will get used. If not, the figures of the
 English manual are used. That way I don't have to repeat anything that
 has no text in it - and the manuals compile from the beginning, first
 English everywhere and then little by little with translations.

 I have a main format file for each language. This is because of the
 language settings (hyphenation, labels), but 

[NTG-context] Design for Translation

2009-03-11 Thread John Devereux
Hi,

I am wondering how best to go about creating an evolving document that
will need translation.

This is a manual that will probably need to be produced - and
maintained - in around 6 languages.

For what it is worth, I have come up with two approaches, which
follow. 

But I would really appreciate any insights anyone may have.

...


1) Just translate the file
--------------------------

- Obey formatting rules in document tex file so as to most easily
  visually separate commands from text.

- Distribute the files for translation with instructions.

This will work until we need to modify the file, then how to
communicate the modifications? An English diff? 

I think this may be the best solution, but there is also
 
2) Using blocks
---------------

The excursion manual briefly describes an alternative (which I
think may be too cumbersome). The idea would be to totally separate
the text from any tex commands (except for a single type of begin/end
sequence). Conceptually :-


manual-env.tex:

\defineblock[EN,DE,IT]
\setupblock[EN][file=EN]
\setupblock[DE][file=DE]
\setupblock[IT][file=IT]

\doifmode{EN}{\def\lang{EN}}
\doifmode{DE}{\def\lang{DE}}
\doifmode{IT}{\def\lang{IT}}


manual.tex:

\environment manual-env.tex

\useblocks[\lang][installation-1]
\useblocks[\lang][installation-2]

EN.tex:

\beginEN[installation-1]
This is how to install the product type 1
\endEN

\beginEN[installation-2]
This is how to install the product type 2
\endEN

DE.tex:

\beginDE[installation-1]
(german text here)
\endDE

\beginDE[installation-2]
(german text here)
\endDE

IT.tex:

\beginIT[installation-1]
(italian text here)
\endIT

\beginIT[installation-2]
(italian text here)
\endIT



-- 

John Devereux


Re: [NTG-context] Design for Translation

2009-03-11 Thread Mari Voipio

John Devereux wrote:

But I would really appreciate any insights anyone may have.


I've got some experience on this. I'm sure my way is not optimal, but at 
least it is an experience.



First thing to remember is that I started the ConTeXt project many years 
(5???) ago and the program *and* its documentation have evolved a lot 
since then.
The other thing is that I didn't expect to have to deal with 
translations. The documentation (and the instrument the manual is for) 
were both supposed to exist in English only. Yeah. Sure. (We don't do 
consumer electronics, so regulations are a bit different than for stuff 
you buy in a shop.)



So over these years I've dealt with repeated "please send us the manual 
as Word file for translation" queries. Every time I've explained in 
words of one syllable that there is not and will not be a Word file, that 
the distributed manual file has origins in a totally different system. 
We stopped using Word when the file grew so big that Word just couldn't 
cope and when most of the figures to be included were pdf anyway and 
thus easily incorporated into ConTeXt files.
I always offer to send the potential translators the files and the 
editing instructions and say that I can do a pdf out of the translated 
files any time, for example after each chapter. (BTW, if you'd like my 
editing instructions, I have them somewhere in rtf format.)


The reactions to the above information vary. A South American 
professional translator took the files without whining and turned in the 
translated Spanish text with only *two* messed up codes - which is a lot 
less of a mess than I do when editing. The French gave up directly; they 
supposedly have a Word version of their own of the manual and so do the 
Poles. Three other languages were written in Word (or similar) and I had 
all the fun in cutting-and-pasting the text into the ConTeXt files. 
Italian was first done with cut-and-paste method, but then needed so 
much work that to my surprise they edited the ConTeXt files for me with 
a very good result - that's probably the most accurate translation of 
the whole lot.


The Russian version of our manual is in the works. They wanted to do it 
all by themselves, but I haven't heard anything since I debugged their 
last file (encoding problem, Win-cyrillic to UTF). I hope that means 
everything is under control there... They are basically working on a 
pared-down duplicate system so we can easily exchange files.


I should add that except for the South American translator and the 
Russians, the other persons are not IT people, nerds or not necessarily 
even that computer literate (if their usage of Word is anything to go 
by). If your translators are used to structural coding (html, for 
example) and especially if they already use suitable editors, you'll 
have a lot less problems.




Then the practical aspect. What I had from the beginning is a system 
where each chapter of the manual is a file of its own - makes it much 
easier to handle. Most of the formatting and setups are in the main file, 
so the chapters just contain a list of figures and then the text itself. 
This makes them much easier to edit and handle.


When I started getting the languages, I made subdirectories for them, 
one per language. This is where I put the tex files for that language + 
all the figures that have translated text in them; ConTeXt will look 
first in the same directory and then further afield, so if there's a 
translated figure, it will get used. If not, the figures of the English 
manual are used. That way I don't have to repeat anything that has no 
text in it - and the manuals compile from the beginning, first English 
everywhere and then little by little with translations.
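
A minimal sketch of how such a lookup order could be set up, assuming the translated sources sit in a subdirectory such as it/ and the English figures in a sibling directory. The names ../en, manual-it.tex and sensor-overview are hypothetical and not taken from the setup described above; the path is relative to the directory ConTeXt is run from.

```tex
% it/manual-it.tex (hypothetical file name)
% Search order for figures: the current directory first ("local"),
% then the directories listed under "directory" ("global").
\setupexternalfigures
  [location={local,global},
   directory={../en}]

\starttext
% Uses it/sensor-overview.pdf if a translated copy exists,
% otherwise falls back to ../en/sensor-overview.pdf.
\externalfigure[sensor-overview][width=.8\textwidth]
\stoptext
```

With this arrangement only the figures that contain text need a translated copy; everything else is picked up from the English directory.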


I have a main format file for each language. This is because of the 
language settings (hyphenation, labels), but also because the English 
manual is letter size and most translators prefer A4. Sometimes also one 
language only needs small adjustments (like we have no index in 
Italian), so I find it easier to keep all the layout stuff separate one 
language from another. However, the main layout had already been the 
same for two years when the first translation came along, otherwise I 
might move some information (like heading formatting) into a shared 
formatting/setup/layout file for easier changes.
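
As an illustration only, a per-language main file along these lines might look like the sketch below. The file names (manual-it.tex, manual-layout, chapter-installation, chapter-operation) are assumptions made for the example, not the actual files.

```tex
% manual-it.tex: hypothetical main file for the Italian manual
\mainlanguage[it]          % Italian hyphenation and labels
\setuppapersize[A4][A4]    % the English main file would use [letter][letter]

% Shared heading setup could live in a common file that every
% language loads; "manual-layout" is an assumed name.
% \environment manual-layout

\starttext
\input chapter-installation   % one file per chapter, assumed names
\input chapter-operation
% \placeindex                 % omitted here: no index in the Italian manual
\stoptext
```

Keeping the language-specific choices in this one file means the chapter files themselves stay free of layout commands, which is what makes them manageable for translators.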






So, how do I keep all of this up to date?

I don't. Fully. But if I could devote most of my working hours to 
that, maybe I could... Won't happen this decade, I think.


One thing that helps is version control (SVN) that keeps all the files 
and I try to document very carefully in the log files what I've done. As 
I usually check in all the files before leaving work, I still remember 
what was done and the log is reasonably good. SVN also means that I can 
diff with an earlier version of the file and see what changes were made, 
which is also handy.


Another thing is that any changes I can make myself, I'll do all