Daniel Carrera wrote:
Randomthots wrote:
So if I have two files... same format... but one is twice as big as
the other... the bigger file isn't going to take longer to load?
Irrelevant example. The fact that a bigger file loads slower doesn't
mean that the fault lies with the size of the tags. There are several
things that increase with the size of the file: the number of elements,
the complexity of the tree, the amount of content, and so on. All of
those can cause a slowdown, and none of them is related to tag size.
I was speaking in general terms. Get away from ods and xml for a second
and consider two files, jpegs, for example. The bigger file will take
longer to process simply because it will take more cycles to work your
way through it.
Please understand the difference between data and data structures. If
you open a CSV file and immediately save it as OpenDocument, you are
saving it into a more complex data structure, just as an n-ary tree is
a more complex data structure than a two-dimensional array, regardless
of what data you store in them.
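If it helps, here is a rough Python sketch of the same idea (the names
are made up purely for illustration): the same cell values held in a
flat two-dimensional array on one hand, and in a small tree of nodes on
the other.

    # The same data in two data structures.
    # 1. A CSV file maps naturally onto a two-dimensional array:
    #    a list of rows, each row a list of values.
    csv_rows = [
        ["Alice", "42"],
        ["Bob", "7"],
    ]

    # 2. A document format maps onto an n-ary tree: every cell becomes a
    #    node hanging off a row node, which hangs off a sheet node, and
    #    each node can carry extra structure (styles, spans, and so on).
    sheet = {
        "name": "Sheet1",
        "rows": [
            {"cells": [{"value": "Alice"}, {"value": "42"}]},
            {"cells": [{"value": "Bob"}, {"value": "7"}]},
        ],
    }

    # Walking the array is a pair of nested loops; walking the tree
    # means chasing references from sheet to row to cell.
    for row in csv_rows:
        print(row)

    for row in sheet["rows"]:
        print([cell["value"] for cell in row["cells"]])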
I looked at the content.xml. Once you got past the namespace
declarations and such, the overall structure was pretty much like this:
<sheet>
  <row>
    <cell> ... </cell>
    <cell> ... </cell>
    ...
  </row>
  <row>
    ...
  </row>
  ...
</sheet>
<sheet>
  ...
</sheet>
...
Very much like a table structure in HTML. I was sort of surprised that
there wasn't any indication of row or cell addresses. And other than the
style information, which just took on the defaults anyway, it was hard
to see where the XML added much information. I understand that it can
have more information, but the overall architecture will be the same for
any spreadsheet. It can't delve off into 16 dimensions, for example, so
it can't actually be any deeper than the above.
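And once that structure is parsed, what you end up holding is just rows
of cells again. A rough Python sketch (the tag names below are
simplified stand-ins, not the exact namespaced ODF names):

    import xml.etree.ElementTree as ET

    # Simplified stand-in for the row/cell nesting in content.xml; the
    # real file uses namespaced tags such as table:table-row and
    # table:table-cell.
    fragment = """
    <sheet>
      <row><cell>1</cell><cell>2</cell></row>
      <row><cell>3</cell><cell>4</cell></row>
    </sheet>
    """

    sheet = ET.fromstring(fragment)

    # No row or cell addresses anywhere; position is implied by document
    # order, just as in an HTML table.
    rows = [[cell.text for cell in row] for row in sheet]
    print(rows)  # [['1', '2'], ['3', '4']]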
And are you telling me that the cell, sheet, chart, etc. objects in
working memory... the stuff you are actually manipulating when you
work with the spreadsheet... aren't the same regardless of the format
of the original data file?
I fail to see what this has to do with your argument.
Just that in one case you start with a 2 or 3 MB data file and in the
other you start with a 45 MB XML file, but you end up with precisely the
same information content to manipulate. Now after I add a couple of
formulas, pretty it up, and draw a graph or two, CSV doesn't work
anymore; obviously ODF is capable of representing much more than CSV.
It would be unlikely to matter if we were talking about a difference of
a couple of MB. But 45 MB is a substantial fraction of 256 MB.
But here's where you're making silly claims. The fact that unzipping the
file produces a 45 MB set of XML files doesn't mean that when it's
loaded into memory it will actually take up 45 MB. It won't.
It has to if you don't write the unzipped file to disc first. Where else
is it going to go?
When you load an
XML file into memory, XML tags are replaced by a pointer structure.
But not until you actually have the unzipped XML to start chewing on.
This goes back to the example of compiled software. Just as, when you
compile software, variable names are replaced by pointers and the size
of the binary is not affected by the length of the variable names, so
when you read an XML file the tags are replaced by pointers, and the
length of the XML tags does not affect the size of the binary data
stored in RAM.
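A rough way to see this from Python, if you want to poke at it yourself
(sys.getsizeof is only a crude approximation of what a parser really
allocates, so treat this purely as a sketch):

    import sys
    import xml.etree.ElementTree as ET

    short_doc = "<r>" + "<c/>" * 3 + "</r>"
    long_doc = "<r>" + "<a-very-long-descriptive-cell-tag/>" * 3 + "</r>"

    short_root = ET.fromstring(short_doc)
    long_root = ET.fromstring(long_doc)

    # Each parsed node is a small object holding a *reference* to its
    # tag string, not a copy of the text that appeared in the file, so
    # the node objects come out the same size either way.
    print(sys.getsizeof(short_root[0]), sys.getsizeof(long_root[0]))

    # On top of that, the parser typically keeps one copy of each
    # distinct tag name, shared by every node that uses it, much like a
    # variable name disappearing into a symbol table.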
But before you can get to the binary data you have to have the raw XML
to process.
Please read up on data structures. Find out what an n-ary tree is and
what an array is.
I know what n-ary trees and arrays are. I was working with them (arrays
anyway) on what passed for a desktop computer back in '85. 16K of RAM,
no hard drive, and just a BASIC interpreter in ROM.
Another question: Is the XML processed in a serial fashion? Is it
necessary to hold the entire file in memory to parse it?
In theory it's not necessary, but in practice most content is in the
same place (content.xml), which puts a bit of a limit on how you can
optimize the parsing. For example, if all you wanted was to extract the
author of the document, I could write a program that could get that
information lightning fast, regardless of the size of your document. But
most of the time that's not what you want; you want to actually load the
document contents into the application.
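To make the author example concrete, here's roughly how it could look in
Python. The author lives in the tiny meta.xml entry inside the zip, so
you read just that entry and never decompress content.xml at all. (This
is a sketch: example.ods is a made-up filename, and you should
double-check the dc:creator element name against a real file.)

    import zipfile
    import xml.etree.ElementTree as ET

    DC = "{http://purl.org/dc/elements/1.1/}"

    def get_author(path):
        # Open only the small meta.xml entry inside the .ods zip; the
        # big content.xml entry is never even decompressed.
        with zipfile.ZipFile(path) as ods:
            with ods.open("meta.xml") as meta:
                tree = ET.parse(meta)
        creator = tree.find(".//" + DC + "creator")
        return creator.text if creator is not None else None

    print(get_author("example.ods"))

The same streaming idea works on content.xml itself (feed
ods.open("content.xml") to ET.iterparse and process rows as they are
decompressed), but as I said, in practice you usually want the whole
document loaded anyway.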
So you finally admit that the raw XML (content.xml, which is like 99% of
this file) has to reside in memory while you build the internal
data structure that the program actually uses? That 45 MB has to sit
there while the program does whatever it does, walking through it, until
you get to the point where you can use it. Only then can it be unloaded
from memory.
If I had the time I would. Unfortunately, I have to study for
certification exams and wade through some mostly useless labs for
Advanced Switching and Network Security classes. You see I'm not
technically illiterate;
What year are you in?
Depends on how you count, I suppose. End of the second year of classes.
I attended during the summer of '04 and I'll be done with classes in
May. In the end I'll have a Master's in Telecommunications and
Information Networking plus Cisco Network Professional, Wireless, and
Network Security certs. Bring it on Verizon! :)
telling me how silly and stupid I am.
I never said you're stupid. I said you said some very silly things.
Still unnecessary and not very nice. For example, Ian made a comment the
other day that calendars and email don't have much to do with each
other. I could have said that was silly, given that Sunbird, Evolution,
and Outlook all have a button or menu item that says "Send Invitation by
Email". You know, for people that aren't on your iCal server. But I
resisted. Oops... sorry, I guess I just did it, huh?
I'm not sure I like you very much anymore.
My goal in life is not that you like me or dislike me.
Then it's kind of hard to fail in that regard, huh. ;)
--
Rod