Hi list, 

here is a multilingual utility for reading email attachments 
that are MS-Word documents. The emphasis is on multilingual: if 
you read only English or other Western languages, you will not 
need it. 

-----------------------------------

DOC2TXT

Reading multilingual email attachments in MS-Word format


Installation 
------------

1. Get the utility CATDOC.EXE from the internet and copy it 
    into a directory in your path

    http://www.ice.ru/~vitus/catdoc/ver-0.9.html

2. Add to MIME.CFG the line

    file/.doc             >TXT|@call doc2txt.bat $1 $2

3. Put the following batch file, DOC2TXT.BAT, in a directory in 
    your path

    @echo off
    cls
    echo Conversion of MS-Word document
    echo ------------------------------
    echo (W)estern Europe - cp1252
    echo (C)entral Europe - cp1250
    echo (R)ussian - cp1251
    echo (U)nicode 16bit
    echo.
    choice /c:wcru
    for %%f in (1 2 3 4) do if errorlevel %%f goto %%f
    goto end
    :1 WE
    catdoc.exe -a -scp1252 -d8859-2 %1 >%2
    goto end
    :2 CE
    catdoc.exe -a -scp1250 -d8859-2 %1 >%2
    goto end
    :3 Russian
    catdoc.exe -a -scp1251 -d8859-2 %1 >%2
    goto end
    :4 Unicode
    catdoc.exe -a -u -d8859-2 %1 >%2
    :end
   
   If you do not use Arachne with the ISO8859-2 fonts, you have to 
   replace the parameter -d8859-2 with another one, e.g. -d8859-1 
   or -dcp1250. 
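
   To avoid editing four lines, the target character set could be 
   kept in one place. This is only a sketch: DEST is a variable 
   name made up for this example, not something catdoc knows about, 
   and it assumes there is free environment space for it. 

       rem DEST is a hypothetical name; set it to match your fonts
       set DEST=8859-2
       rem then write each catdoc line with -d%DEST%, for example:
       catdoc.exe -a -scp1251 -d%DEST% %1 >%2

   Changing to -d8859-1 or -dcp1250 then means changing only the 
   SET line. 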


Usage
-----

When you try to open an email attachment that is (correctly) named 
FILE.DOC, you will be asked to select a Windows codepage. If the 
email comes from the West (Canada, Australia, Britain), answer 'W'; 
if it comes from the East (Russia), press 'R'; if the author is 
from the center (Czech Republic), you will need 'C'; and if you 
suspect the document was saved in the 16-bit Unicode format, type 
'U'. If you are not satisfied with the first result, simply press 
Arachne's hot key 'R' to reload and try again. 


Customization
-------------

Catdoc is highly customizable. You will want to use it not only 
for Arachne. Read the documentation. 
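
Since DOC2TXT.BAT only takes an input file and an output file as 
its two arguments, it should also be usable by hand from the DOS 
prompt. A small sketch, where LETTER.DOC and LETTER.TXT are made-up 
file names: 

    doc2txt LETTER.DOC LETTER.TXT
    type LETTER.TXT | more

Or, skipping the menu, with the same options the batch file uses 
for a Russian document: 

    catdoc.exe -a -scp1251 -d8859-2 LETTER.DOC >LETTER.TXT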


Restrictions
------------

This utility will display only the text of Word documents. At the 
end there will be some garbage output. Catdoc is still under 
development and will hopefully keep up with future changes in the 
DOC file format. DOS binaries are not available for the latest 
(Linux) version. 

-------------------------------------

In the Czech Republic people are even fonder of attaching RTF 
files to their emails. Complaining to them is useless. They will 
not understand what you mean. You *have* to read their attachments. 
The multilingual problem is the same. As far as I know there is no 
DOS utility for RTF2HTM that copes with this problem. 

Does anybody know of an RTF2TXT utility that

- runs on the DOS command line (or even writes to standard output),
- can read the latest RTF specification,
- is able to read the codepage information from the file header,
- produces 8-bit output?

Then it would be easy to apply the above principle to 
a multilingual RTF2TXT. 

Regards
Christof Lange

_______________________________________________

 Christof Lange <[EMAIL PROTECTED]>
 Prokopova 4, 130 00 Praha 3, Czech Republic
 phone: (+420-2) 22 78 18 00 / 22 78 20 02, telefax: 22 78 18 01 
 http://www.volny.cz/cce.zizkov

