On Wednesday 04 May 2005 16:03, Martin Michlmayr wrote:
> Package: antiword
> Version: 0.35-1
> Severity: normal
> I created a simple document with OpenOffice.Org 1.1.2 and saved it as
> a Word document. When I tried to look at it with antiword, I only got
> the error:
> | I'm afraid the text stream of this file is too small to handle.
> This is both with antiword 0.35 and 0.36.1 and with documents stored
> with OpenOffice.Org as Word 6.0, Word 95 and Word 97/2000/XP. I tried
> the file in MS Word 2003 SP1 on Windows XP and it loaded without any
> problems, so this seems to be a problem in antiword.
> An example file is attached below. It's a fairly simple file -
> basically a bullet list with a number of items.
> (Thanks for antiword, by the way. I'm a text-based user and really
> appreciate not having to load OOo just to view DOC documents people
> send to me.)
This is not a bug; it is a missing feature.
Let me explain.
Inside a Word file the text is stored in a so-called text stream. There are
two possible text streams: a small block text stream and a large block text
stream. The small blocks are 64 bytes in size, the large blocks are 512
bytes in size. Because of the difference in size, Antiword would need two
different methods for reading those two text streams. The method for
reading the small block text stream has not been implemented yet. The
result is that Word files with no large block text stream cannot be read by
Antiword. Such Word files are mostly smaller than about 12 kilobytes and
have less than 1024 bytes of text.
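To illustrate why the two block sizes call for two read methods, here is a
rough sketch in Python. It is not Antiword's code (Antiword is written in C),
and the function names are made up for this example. It also assumes two
details of the usual OLE2 container layout that are not stated above: a
512-byte header precedes the first large block, and the small blocks are
packed back to back inside a container stream that is itself made of large
blocks.

```python
from typing import Tuple

# Block sizes as given in the explanation above.
BIG_BLOCK_SIZE = 512
SMALL_BLOCK_SIZE = 64


def big_block_offset(block_index: int) -> int:
    """File offset of a large block.

    Assumption: a 512-byte header comes before large block 0, so
    block n starts at (n + 1) * 512.
    """
    return (block_index + 1) * BIG_BLOCK_SIZE


def small_block_location(small_index: int) -> Tuple[int, int]:
    """Locate a small block inside its container stream.

    Assumption: small blocks are packed back to back in a container
    stream built from large blocks. Returns a pair:
    (position in the container's chain of large blocks,
     byte offset inside that large block).
    The caller still has to follow the container's chain to turn the
    first number into a real large block index.
    """
    byte_pos = small_index * SMALL_BLOCK_SIZE
    return divmod(byte_pos, BIG_BLOCK_SIZE)
```

A large block can be found with one multiplication. A small block needs an
extra step through the container stream, which is the part a second read
method would have to handle.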
The reason for not implementing the missing feature is simple. Word
documents that use the small block text stream cannot be produced by Word
for Windows (any version), only by Word for Mac and, now, by OpenOffice.
Note that these documents can be read by all versions of Word.
Adri van Os
The Antiword Team [EMAIL PROTECTED]
http://www.winfield.demon.nl/index.html for version 0.36 (16 Oct 2004)