At 12:58 am 06/04/2007, David Hislop wrote:
At 07:39 pm 05/04/2007, Ariya Hidayat wrote:
 It's recognising the OLE container and the WP magic and header, but getting
confused somewhere in the parser. It's also not recognising a "normal"
WordPerfect file. If I get a chance over the weekend I'll sort those out.

Could you elaborate on this?
I'd be happy to help once more information is available.

OK: first, I'm a C++ newbie, so please excuse the ignorance. Dealing with the second issue, not recognising a "normal" WordPerfect file:

I've partially solved this issue. Turns out that when readsome() is called, the input buffer really is empty. Changing readsome() to read() allows the file to be read and the WPC magic recognised. It's reading some garbage at the start of the file, so probably not positioning correctly, but at least that's the same problem as the document in an OLE container.
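
For reference, a minimal sketch of the readsome()/read() difference described above. readsome() only returns characters the stream already considers available, and how many that is depends on the implementation, so it can return 0 on a freshly opened file. read() keeps pulling from the underlying source until the count is met or the data runs out. This is not libwpd or libwps code: readBytes is a hypothetical helper name, and it assumes a short read should be treated as end of data rather than as an error.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of libwpd/libwps): read up to `count` bytes
// using read(), which fetches from the underlying source, instead of
// readsome(), which may return 0 when nothing is buffered yet.
std::size_t readBytes(std::istream &in, char *dst, std::size_t count)
{
    assert(dst != nullptr || count == 0);
    in.read(dst, static_cast<std::streamsize>(count));
    // gcount() reports how many bytes the last unformatted read delivered.
    const std::size_t got = static_cast<std::size_t>(in.gcount());
    // A short read sets eofbit and failbit; clear them so that a later
    // seekg() or read() on the same stream is not rejected outright.
    if (got < count)
        in.clear();
    return got;
}
```

A caller would check the return value against the number of bytes it asked for, instead of assuming that a zero from readsome() means the file is empty.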

I've no real idea why readsome() isn't finding data in the buffer here yet it works in wps2text. Possibly something I've done that isn't exactly right. WPDocument::isFileFormatSupported copies the WPSInputStream input to WPXInputStream document if it's not an OLE stream, and does
  header = WPXHeader::constructHeader(document)
Perhaps that's it.

Anyway, now I'll try sorting out the positioning issue.

regards - David


1. WPSFileStream::isOLEStream() works fine when called from WPDocument::isFileFormatSupported. d->buffer << d->file.rdbuf() reads a chunk of the file into d->buffer, and the WP magic is obvious in bytes 2-4 of _Stringbuffer.

2. Problems occur later, when WPDocument::isFileFormatSupported calls WPXHeader::constructHeader. The call to readU8 in the line fileMagic[i] = readU8(input) throws an exception because numBytesRead is zero. Tracing the input->read call to WPSFileStream::read and then into the istream code, the d->file structure appears to have no characters available to read. The statement _Num = _Myios::rdbuf()->in_avail() sets _Num to zero because, deep in the streambuf code's _Gnavail() routine, _IGcount points to a zero value. Judging by the code and its comment, _IGcount stores the number of available elements in the read buffer. Note that I changed the call to readsome() to a call to _Readsome_s() to avoid warnings, but readsome() calls _Readsome_s() directly anyway, so it shouldn't make any difference. I also tried it with readsome(), and the behaviour looked identical.

3. Comparing d->buffer and d->file after step 1 above, d->file doesn't seem to have the same data in its buffer that d->buffer does. Maybe it's not supposed to?
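
One possible explanation for points 1 to 3, offered as an assumption rather than a diagnosis: inserting a stream buffer with operator<< consumes that buffer until end-of-file (or until the insertion fails), so after d->buffer << d->file.rdbuf() the read position of the file would sit at the end and a later readsome() would find nothing available. The sketch below uses plain standard streams as stand-ins for d->file and d->buffer; slurpAndRewind is a hypothetical name, and whether WPSFileStream needs a rewind at this point has not been confirmed.

```cpp
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: copy everything readable from `src` into a string,
// then rewind `src` so that the next reader starts at byte 0 again.
std::string slurpAndRewind(std::istream &src)
{
    std::ostringstream copy;
    copy << src.rdbuf();           // drains src's stream buffer to end-of-file
    src.clear();                   // drop any eof/fail flags from earlier reads
    src.seekg(0, std::ios::beg);   // reposition for whoever reads next
    return copy.str();
}
```

Without the clear() and seekg() calls, the source stream stays positioned at the end, which would match the symptom of in_avail() reporting zero.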

Sorry, all that seems quite obscure, but that's where I've got to.

I probably ought to try this on MinGW to see if it works there. Unfortunately that won't help me, as I'm trying ultimately to "fix" a Windows COM dll.


 The argv[1] would have gone unnoticed, but I tried it first with the --info
argument. I was convinced I had some seriously deep bug, started debugging,
and only got suspicious down in libwps::StorageIO::load(), where the bufsize
= buf.tellg() statement returned 0xFFFFFFFF.

 Should it have tripped out somewhere before then with a non-existent file
error?

Hmm, I guess I need to modify POLE to behave gracefully in case the
file does not exist (or tellg returns a bogus value). But in the
meantime, I think you can add a sanity check before feeding the
filename to the stream.

Yes, I'll check and not pass an obviously bad filename. In the indexer application it shouldn't be a problem: the string with the filename in it has come from a search of the filesystem.
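
A minimal sketch of such a sanity check, using only the standard library. tellg() returns pos_type(-1) when the stream is in a failed state, which presumably is the 0xFFFFFFFF seen above when viewed as an unsigned 32-bit value. fileSizeOrError is a hypothetical helper name, not a POLE or libwps function.

```cpp
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper: return the size of `path` in bytes, or -1 if the
// file cannot be opened or its size cannot be determined.
long long fileSizeOrError(const std::string &path)
{
    // std::ios::ate positions the stream at the end right after opening.
    std::ifstream file(path.c_str(), std::ios::binary | std::ios::ate);
    if (!file.is_open())
        return -1;                       // missing or unreadable file
    const std::streamoff end = file.tellg();
    if (end < 0)
        return -1;                       // tellg() failed and gave pos_type(-1)
    return static_cast<long long>(end);
}
```

Calling this before handing the filename to the storage code lets the caller report a missing file directly instead of allocating a buffer from a bogus size.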
_______________________________________________
Libwpd-devel mailing list
Libwpd-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libwpd-devel
