Great info!

What happens at save time? Isn't the complete uncompressed file needed to
do the ZIP analysis? Is there a way to ZIP on the fly? Does OOo implement
the ZIP algorithm itself, exploiting the fact that it knows from the start
which tags it will use?
(Otherwise I'd expect auto save to be very expensive.)
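
(For what it's worth, the zip format itself doesn't require the whole
uncompressed file up front -- each entry can be deflated while its data is
being produced. A minimal sketch of the idea, using Python's zipfile and a
made-up chunk generator purely as an illustration, not OOo's actual save
code:

    import zipfile

    def generate_xml_chunks():
        # hypothetical stand-in for a serializer that emits the XML piece by piece
        yield b"<office:document-content>"
        for _ in range(3):
            yield b"<table:table-row>...</table:table-row>"
        yield b"</office:document-content>"

    with zipfile.ZipFile("document.ods", "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # the entry is deflated while it is written; the complete uncompressed
        # XML never has to exist in memory or on disk
        with zf.open("content.xml", mode="w") as stream:
            for chunk in generate_xml_chunks():
                stream.write(chunk)

So compressing on the fly is at least possible in principle.)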

/$

2005/12/8, Mathias Bauer <[EMAIL PROTECTED]>:
> Randomthots wrote:
>
> > Bingo!! By jove he's got it! That's what I've been trying to get across
> > unsuccessfully. The size of the tags *does* make a difference if it
> > makes the file so big that it won't fit into RAM anymore. That's what my
> > disc thrashing comment was meant to convey, but I guess folks didn't
> > make the connection.
>
> Sorry for coming in late ;-) but I think that some technical background
> could help to clear things up.
>
> Whatever influence the size of the tags has, it doesn't matter for the
> size on disk, because the content is compressed. Compression definitely
> doesn't care whether the tokens it compresses are 1 or 10 bytes long. In
> a zip-compressed file, 1000 "a" tokens consume nearly the same
> compressed size as 1000 "abcdefghijklmnopqrstuvwxyz" tokens.
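
That's easy to check -- a quick experiment with Python's zlib, purely as an
illustration (not the code OOo uses internally):

    import zlib

    short_tags = b"<a/>" * 1000
    long_tags  = b"<abcdefghijklmnopqrstuvwxyz/>" * 1000

    # both repetitive streams deflate to a tiny fraction of their raw size,
    # and the compressed results end up in the same ballpark
    print(len(short_tags), "->", len(zlib.compress(short_tags, 9)))
    print(len(long_tags),  "->", len(zlib.compress(long_tags, 9)))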
>
> Besides that, OOo never reads the complete file into memory (so the
> file's total size has no influence on speed per se), and it never holds
> a complete uncompressed xml stream in memory either (so the size of an
> uncompressed stream has no direct influence on speed either).
>
> Reading from a compressed stream is highly optimized: only a buffer of a
> certain size is filled by decompressing part of the stream, and the next
> part is decompressed when the end of the buffer is reached. This of
> course only works if no seeking back is needed while reading the stream,
> but that is true for all the xml streams in the file. It is different
> for other streams, such as images, which for this reason are copied
> completely into memory or to a temporary file on disk.
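
The same principle in a few lines of Python, just to illustrate the idea of
a fixed buffer and forward-only reads (the file name is made up and this is
of course not OOo's actual reader):

    import zipfile

    BUF_SIZE = 64 * 1024  # fixed buffer: at most this much decompressed XML exists at once

    total = 0
    with zipfile.ZipFile("document.ods") as zf:
        with zf.open("content.xml") as stream:   # forward-only read, no seeking back
            while True:
                chunk = stream.read(BUF_SIZE)    # inflates just one buffer's worth
                if not chunk:
                    break
                total += len(chunk)              # a real reader hands the chunk to the parser here
    print(total, "bytes of XML processed without ever holding the whole stream")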
>
> Of course the size of any stream has an "indirect" influence on speed
> because it might take longer to process it, but only because a large xml
> stream contains more information than a small one, not because it
> consumes more bytes on disk or in memory.
>
> > Remember, this was a big file. 63,260 rows by 7 columns. That's 442,820
> > instances of the 80 bytes of "taggage" surrounding each cell (35 MB,
> > total) plus the tags at the start of each row (times 63,260) plus all
> > the header information. Apparently with 256 MB RAM I simply ran out of
> > room loading the ods which didn't happen with the csv (or the xls).
> > Honestly, I haven't tried loading it since I stuck another 512 MB in
> > this thing. I'm sure that would make a big difference.
>
> So possibly the size of the Calc document created from the file (the
> memory consumption of Calc itself) caused the swapping you experienced,
> but not the xml content itself, which (as outlined above) is never read
> into memory as a whole. This is what Daniel tried to point out: it's
> Calc itself that consumes the memory, not the bytes of the file.
>
> Best regards,
> Mathias
>
> --
> Mathias Bauer - OpenOffice.org Application Framework Project Lead
> Please reply to the list only, [EMAIL PROTECTED] is a spam sink.
>
>
