Hi list,
here is a utility for reading email attachments that contain
MS-Word documents in several languages. The emphasis is on
multilingual: if you read only English or other Western languages,
you will not need it.
-----------------------------------
DOC2TXT
Reading multilingual email attachments in MS-Word format
Installation
------------
1. Get the utility CATDOC.EXE from the internet and copy it
into a directory in your path:
http://www.ice.ru/~vitus/catdoc/ver-0.9.html
2. Add the following line to MIME.CFG:
file/.doc >TXT|@call doc2txt.bat $1 $2
3. Put the batch file DOC2TXT.BAT in a directory in your path:
@echo off
cls
echo Conversion of MS-Word document
echo ------------------------------
echo (W)estern Europe - cp1252
echo (C)entral Europe - cp1250
echo (R)ussian - cp1251
echo (U)nicode 16bit
echo.
choice /c:wcru
for %%f in (1 2 3 4) do if errorlevel %%f goto %%f
goto end
:1 WE
catdoc.exe -a -scp1252 -d8859-2 %1 >%2
goto end
:2 CE
catdoc.exe -a -scp1250 -d8859-2 %1 >%2
goto end
:3 Russian
catdoc.exe -a -scp1251 -d8859-2 %1 >%2
goto end
:4 Unicode
catdoc.exe -a -u -d8859-2 %1 >%2
:end
If you do not use Arachne with the ISO8859-2 fonts, you have to
replace the parameter -d8859-2 with another one, e.g. -d8859-1 or
-dcp1250.
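
To see why the choice of source codepage matters, here is a small
Python sketch. It is an illustration only and not part of the DOS
setup: the function name convert is made up, and Python's codec names
(cp1250, cp1251, cp1252, iso8859_2) merely stand in for the charsets
that catdoc takes with -s and -d.

```python
# Illustration only: the same bytes read under different Windows codepages.
# Assumption: Python codecs stand in for catdoc's -s (source) and
# -d (destination) charsets; convert() is a hypothetical helper.

def convert(raw: bytes, source: str, dest: str = "iso8859_2") -> bytes:
    """Decode raw bytes from the source codepage and re-encode them for display."""
    text = raw.decode(source, errors="replace")
    # Characters missing from the destination charset become '?'.
    return text.encode(dest, errors="replace")

# A Czech word as it would be stored in a Central European (cp1250) file.
raw = "žluťoučký".encode("cp1250")

if __name__ == "__main__":
    for source in ("cp1252", "cp1250", "cp1251"):
        shown = convert(raw, source).decode("iso8859_2")
        print(source, "->", shown)
```

Only the cp1250 reading gives back the original word; the other two
readings replace some or all of the accented letters. That is the
situation in which you would reload and pick another codepage.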
Usage
-----
When you try to open an email attachment that is (correctly) named
FILE.DOC, you will be asked to select a Windows codepage. If the email
was sent to you from the West (Canada, Australia, Britain),
answer 'W'; if it comes from the East (Russia), press 'R';
if the author is from the center (Czech Republic), you will need 'C';
and if you suspect the document was saved in the 16bit Unicode
format, type 'U'. If you are not satisfied with the first
result, simply press Arachne's hot key 'R' for reload and have
another try.
Customization
-------------
Catdoc is highly customizable, and you will want to use it for more
than just Arachne. Read the documentation.
Restrictions
------------
This utility will display only the text of Word documents. At the end
there will be some garbage output. Catdoc is still under development
and will hopefully keep up with future changes in the DOC file format.
DOS binaries are not available for the latest (Linux) version.
-------------------------------------
In the Czech Republic people like attaching RTF files to their
emails even more. Complaining to them is useless: they will not
understand what you mean. You *have* to read their attachments. The
multilingual problem is the same. As far as I know there is no DOS
RTF2HTM utility that copes with this problem.
Does anybody know of an RTF2TXT utility that
- runs on the DOS command line (or even writes to standard output)
- can read files that follow the latest RTF specification
- is able to read the codepage information from the file header
- produces 8bit code
???
Then it would be easy to apply the above principle to
a multilingual RTF2TXT.
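
As a rough idea of what "reading codepage information from the file
header" could look like, here is a Python sketch. It assumes the RTF
file declares its codepage with the \ansicpg control word, which newer
RTF files carry and older ones may lack. The function name
rtf_codepage and the cp1252 fallback are my own choices for the
example. It covers only the header lookup, not the conversion of RTF
to text.

```python
import re

# Assumption: the header names the codepage as \ansicpgN, e.g. \ansicpg1250.
ANSICPG = re.compile(rb"\\ansicpg(\d+)")

def rtf_codepage(data: bytes, default: str = "cp1252") -> str:
    """Return a codec name such as 'cp1250' for the codepage in the RTF header."""
    # The control word sits near the start of the file, so a short prefix is enough.
    match = ANSICPG.search(data[:1024])
    if match is None:
        return default  # hypothetical fallback when no codepage is declared
    return "cp" + match.group(1).decode("ascii")

# A made-up minimal RTF header for demonstration.
sample = rb"{\rtf1\ansi\ansicpg1250\deff0{\fonttbl{\f0 Arial;}}\f0 text}"

if __name__ == "__main__":
    print(rtf_codepage(sample))
```

With the codepage known from the header, the question with the four
choices would no longer be needed for such files.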
Regards
Christof Lange
_______________________________________________
Christof Lange <[EMAIL PROTECTED]>
Prokopova 4, 130 00 Praha 3, Czech Republic
phone: (+420-2) 22 78 18 00 / 22 78 20 02, telefax: 22 78 18 01
http://www.volny.cz/cce.zizkov