Hi, I have looked at problems with two .abw files that gave me `AbiWord cannot open foo.abw. It appears to be a bogus or invalid document' messages.
The first document was created by importing a .doc file and saving it as .abw with AbiWord 0.99.3. In detail, this (shortened) .abw file produces the error message:

---------------------------------------->8----------
<?xml version="1.0"?>
<!DOCTYPE abiword PUBLIC "-//ABISOURCE//DTD AWML 1.0 Strict//EN" "http://www.abisource.com/awml.dtd">
<abiword xmlns="http://www.abisource.com/awml.dtd" xmlns:awml="http://www.abisource.com/awml.dtd" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:svg="http://www.w3.org/2000/svg" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:math="http://www.w3.org/1998/Math/MathML" xmlns:dc="http://purl.org/dc/elements/1.1/" version="0.99.3" fileformat="1.0" styles="unlocked">
<!-- ===================================================================== -->
<!-- This file is an AbiWord document. -->
<!-- AbiWord is a free, Open Source word processor. -->
<!-- You may obtain more information about AbiWord at www.abisource.com -->
<!-- You should not edit this file by hand. -->
<!-- ===================================================================== -->

<pagesize pagetype="Letter" orientation="portrait" width="8.500000" height="11.000000" units="in" page-scale="1.000000"/>
<section props="page-margin-right:1.2500in; section-restart-value:1; section-space-after:0.0000in; page-margin-header:0.5000in; page-margin-left:1.2500in; page-margin-footer:0.5000in; page-margin-top:1.0000in; page-margin-bottom:1.0000in">
<p props="text-align:left; line-height:1.5; keep-with-next:yes"><c props="lang:de-DE; font-weight:bold; ����p@�[email protected]:Arial">A headline</c></p>
</section>
</abiword>
--------8<------------------------------------------

The significant position is

  `<p... ����p@�[email protected]:Arial ..'
         ^^^^^^^^^^^^^^^^^^

When I remove the part before `font-family:', AbiWord is able to open the document.
The second document was created by typing plain text into AbiWord and saving it as .abw; the following file is an example:

---------------------------------------->8----------
<?xml version="1.0"?>
<!DOCTYPE abiword PUBLIC "-//ABISOURCE//DTD AWML 1.0 Strict//EN" "http://www.abisource.com/awml.dtd">
<abiword xmlns="http://www.abisource.com/awml.dtd" xmlns:awml="http://www.abisource.com/awml.dtd" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:svg="http://www.w3.org/2000/svg" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:math="http://www.w3.org/1998/Math/MathML" xmlns:dc="http://purl.org/dc/elements/1.1/" version="0.99.3" fileformat="1.0" styles="unlocked">
<!-- ===================================================================== -->
<!-- This file is an AbiWord document. -->
<!-- AbiWord is a free, Open Source word processor. -->
<!-- You may obtain more information about AbiWord at www.abisource.com -->
<!-- You should not edit this file by hand. -->
<!-- ===================================================================== -->

<styles>
<s type="P" name="Normal" basedon="" followedby="Current Settings" props="font-family:Times New Roman; margin-top:0pt; font-variant:normal; margin-left:0pt; text-indent:0in; widows:2; font-style:normal; font-weight:normal; text-decoration:none; color:000000; line-height:1.0; text-align:left; margin-bottom:0pt; text-position:normal; margin-right:0pt; bgcolor:transparent; font-size:12pt; font-stretch:normal"/>
</styles>
<pagesize pagetype="A4" orientation="portrait" width="210.000000" height="297.000000" units="mm" page-scale="1.000000"/>
<section props="page-margin-footer:0.5in; page-margin-header:0.5in">
<p style="Normal"><c props="lang:de-DE">A normal Line</c></p>
</section>
</abiword>
--------8<------------------------------------------

Here, a character with ASCII value 31 inside `A normal Line' makes AbiWord show the error message mentioned above. The (longer) original text contained several of these characters.
It looks to me like a mark for a permitted hyphenation point (a soft hyphen), but I don't know whether AbiWord inserted it or whether it was inserted by the person who edited the text (if that is technically possible); some pieces were probably copied by X Window drag-and-drop from a web browser. As in the first case, AbiWord is able to read the file once the characters with ASCII value 31 (octal 037) are removed.

I don't know whether this behaviour should be considered a bug; maybe the current version of AbiWord wouldn't create such invalid files. Do you have an idea why AbiWord fails to read these lines, or why it saved files containing these `invalid' characters?

As Tim mentioned in http://bugzilla.abisource.com/show_bug.cgi?id=1665, it would help if AbiWord reported the position at which it found the invalid content.

--
Michael
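
P.S. Until AbiWord reports the position itself, something like the following could be used to locate the offending bytes from outside. This is only a minimal sketch in Python, not anything taken from the AbiWord sources. It assumes the file is meant to be UTF-8 (the XML declaration above names no encoding) and that `invalid' means "outside the Char production of XML 1.0"; the function name find_invalid_xml_chars is made up for this example.

```python
import re
import sys

# Complement of the XML 1.0 "Char" production: any character matched
# here may not appear in a well-formed XML 1.0 document.
_INVALID_XML_CHAR = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(data: bytes):
    """Return a list of (line, column, code) for each offending character.

    Assumption: the input is intended to be UTF-8. Line and column are
    1-based and counted in characters after decoding.
    """
    # surrogateescape maps every undecodable byte 0xNN to the lone
    # surrogate U+DCNN, which the pattern above rejects as well.
    text = data.decode("utf-8", errors="surrogateescape")
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for match in _INVALID_XML_CHAR.finditer(line):
            code = ord(match.group())
            if 0xDC80 <= code <= 0xDCFF:
                code -= 0xDC00  # recover the original raw byte value
            hits.append((lineno, match.start() + 1, code))
    return hits


if __name__ == "__main__":
    # Usage: python check_abw.py foo.abw [more.abw ...]
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            for line, col, code in find_invalid_xml_chars(handle.read()):
                print(f"{path}:{line}:{col}: invalid byte/character 0x{code:02X}")
```

For the second file this should point at the 0x1F characters; for the first one it should flag those garbage bytes that are not valid UTF-8. Garbage that happens to decode to ordinary printable characters would not be caught this way.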
