Daniel Carrera wrote:

Randomthots wrote:

So if I have two files... same format... but one is twice as big as the other... the bigger file isn't going to take longer to load?


Irrelevant example. The fact that a bigger file loads slower doesn't mean that the fault lies with the size of the tags. Several things increase with the size of the file: the number of elements, the complexity of the tree, the amount of content, etc. All of those can cause a slowdown, and none of them has anything to do with the size of the tags.

I was speaking in general terms. Get away from ODS and XML for a second and consider two files, JPEGs for example. The bigger file will take longer to process simply because it will take more cycles to work your way through it.



Please understand the difference between data and data structures. If you open a CSV file and immediately save it as OpenDocument, you are saving it into a more complex data structure. Just like an n-ary tree is a more complex data structure than a two-dimensional array, regardless of what data you store in them.
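
(For illustration, a minimal Python sketch of the two shapes. The Node class below is invented for the example, not taken from any real code base:)

    # A CSV maps naturally onto a flat two-dimensional structure:
    grid = [
        ["Name", "Qty"],
        ["Bolts", 40],
    ]

    # An OpenDocument body is an n-ary tree: every element is a node
    # carrying a tag, attributes, and any number of children.
    class Node:
        def __init__(self, tag, attrs=None, children=None):
            self.tag = tag                  # element name
            self.attrs = attrs or {}        # style refs, spans, etc.
            self.children = children or []  # nested rows, cells, ...

    doc = Node("sheet", children=[
        Node("row", children=[Node("cell"), Node("cell")]),
    ])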

I looked at the content.xml. Once you got past the namespace declarations and such, the overall structure was pretty much like this:

<sheet>
        <row>
                <cell/>
                <cell/>
                ...
        </row>
        ...
</sheet>
...

Very much like a table structure in HTML. I was sort of surprised that there wasn't any indication of row or cell addresses. And other than the style information, which just took on the defaults anyway, it was hard to see where the XML added much information. I understand that it can hold more information, but the overall architecture will be the same for any spreadsheet. It can't delve off into 16 dimensions, for example, so it can't actually be any deeper than the above.
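
(A side note on the missing addresses: a cell's address is implicit in its position in document order, so nothing needs to store it. A rough Python sketch, using simplified tag names; the real ones are table:table-row and table:table-cell:)

    import xml.etree.ElementTree as ET

    # Toy document with the same shape as the sketch above.
    xml = ("<sheet><row><cell>1</cell><cell>2</cell></row>"
           "<row><cell>3</cell></row></sheet>")

    root = ET.fromstring(xml)
    for r, row in enumerate(root.iter("row")):
        for c, cell in enumerate(row):
            # The address (row r, column c) falls out of
            # element order alone; the file never records it.
            print(r, c, cell.text)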



And are you telling me that the cell, sheet, chart, etc. objects in working memory... the stuff you are actually manipulating when you work with the spreadsheet... aren't the same regardless of the format of the original data file?


I fail to see what this has to do with your argument.

Just that in one case you start with a 2 or 3 MB data file and in the other you start with a 45 MB XML file, but you end up with precisely the same information content to manipulate. Now, after I add a couple of formulas, pretty it up, and draw a graph or two, CSV doesn't work anymore; obviously ODF is capable of representing much more than CSV.



Statistically, it would be unlikely if we were talking about a difference of a couple of MB. But 45 MB is a substantial fraction of 256 MB.


But here's where you're making silly claims. The fact that unzipping the file produces a 45 MB set of XML files doesn't mean that it will actually take up 45 MB when it's loaded into memory. It won't.

It has to if you don't write the unzipped file to disk first. Where else is it going to go?

When you load an XML file into memory, the XML tags are replaced by a pointer structure.

But not until you actually have the unzipped XML to start chewing on.

This goes back to the example of compiled software. When you compile software, variable names are replaced by pointers, and the size of the binary is not affected by the length of the variable names. In a similar way, when you read an XML file, the tags are replaced by pointers, and the length of the XML tags does not affect the size of the data stored in RAM.
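
(A toy Python sketch of the idea, under the assumption that the parser interns tag names; the details vary by parser:)

    # One shared string object per distinct tag name; every element
    # node then stores only a pointer to it, so a verbose tag name
    # costs its length once, not once per element.
    _tag_table = {}

    def intern_tag(name):
        return _tag_table.setdefault(name, name)

    a = intern_tag("table:table-cell")
    b = intern_tag("table:table-cell")
    assert a is b  # same object: two pointers, one copy of the string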

But before you can get to the binary data you have to have the raw XML to process.
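
(Both halves of this are true as far as they go: the parser must see every byte of the raw XML, but a streaming parser only holds a small buffer of it at any moment. A minimal sketch with Python's expat bindings, assuming content.xml has already been extracted:)

    import xml.parsers.expat

    counts = {}

    def start(tag, attrs):
        counts[tag] = counts.get(tag, 0) + 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start

    # Feed the raw XML one chunk at a time; only the current chunk
    # needs to be in memory while the statistics are built up.
    with open("content.xml", "rb") as f:
        for chunk in iter(lambda: f.read(64 * 1024), b""):
            parser.Parse(chunk, False)
    parser.Parse(b"", True)  # end of document

    print(counts)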


Please read up on data structures. Find out what an n-ary tree is and what an array is.

I know what n-ary trees and arrays are. I was working with them (arrays anyway) on what passed for a desktop computer back in '85. 16K of RAM, no hard drive, and just a BASIC interpreter in ROM.


Another question: Is the XML processed in a serial fashion? Is it necessary to hold the entire file in memory to parse it?


In theory it's not necessary, but in practice most content is in the same place (content.xml), which puts a bit of a limit on how you can optimize the parsing. For example, if all you wanted was to extract the author of the document, I could write a program that could get that information lightning fast, regardless of the size of your document. But most of the time that's not what you want; you want to actually load the document contents into the application.
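
(A sketch of why that lookup stays fast: in the standard ODF package layout, document metadata sits in its own small meta.xml member, so content.xml never has to be touched. "big.ods" is a hypothetical file name:)

    import xml.etree.ElementTree as ET
    import zipfile

    DC = "{http://purl.org/dc/elements/1.1/}"

    def author(path):
        # meta.xml is a few KB no matter how large the document is.
        with zipfile.ZipFile(path) as z:
            meta = ET.fromstring(z.read("meta.xml"))
        el = meta.find(".//" + DC + "creator")
        return el.text if el is not None else None

    print(author("big.ods"))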

So you finally admit that the raw XML (content.xml, which is like 99% of this file) has to reside in memory while you build the internal data structure that the program actually uses? That 45 MB has to sit there, with the program walking through it, until you get to the point where you can use it. Only then can it be unloaded from memory.


If I had the time I would. Unfortunately, I have to study for certification exams and wade through some mostly useless labs for Advanced Switching and Network Security classes. You see, I'm not technically illiterate;


What year are you in?

Depends on how you count, I suppose. End of the second year of classes. I attended during the summer of '04, and I'll be done with classes in May. In the end I'll have a Master's in Telecommunications and Information Networking plus Cisco Network Professional, Wireless, and Network Security certs. Bring it on, Verizon! :)


telling me how silly and stupid I am.


I never said you're stupid. I said you said some very silly things.

Still unnecessary and not very nice. For example, Ian made a comment the other day that calendars and email don't have much to do with each other. I could have said that was silly, given that Sunbird, Evolution, and Outlook all have a button or menu item that says "Send Invitation by Email". You know, for people that aren't on your iCal server. But I resisted. Oops... sorry, I guess I just did it, huh?


I'm not sure I like you very much anymore.


My goal in life is not that you like me or dislike me.


Then it's kind of hard to fail in that regard, huh.  ;)

--

Rod

