gregarican wrote:

> Am I missing something? I don't read where the poster mentioned the
> operation as being CPU intensive. He does mention that the entirety of
> a 10 GB file cannot be loaded into memory. If you discount physical
> swapfile paging and base this assumption on a "normal" PC that might
> have maybe 1 or 2 GB of RAM is his assumption that out of line?
Indeed. The complaint is fairly obvious from the title of the thread.

Now, if the complaint were specifically about the size of the minidom
representation in memory, a more compact representation could be chosen
by using another library. Even so, the file being processed is still
likely to be pretty big, judging from various observations and some
vague estimates:

http://effbot.org/zone/celementtree.htm

For many people, an XML file of, say, 600MB would still be quite a load
on their "home/small business edition" computer if you had to read the
whole thing in and then work on it, even just as a text file.

Of course, approaches which avoid keeping a representation of the whole
document around would be beneficial, and as mentioned previously in a
thread on large XML files, there's always the argument that some kind
of database system should be employed to make querying more efficient
if you can't perform some kind of sequential processing. A sketch of
the sequential approach follows below.
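For what it's worth, the cElementTree page linked above describes an
incremental parsing recipe along these lines. Here's a minimal sketch
of it - the filename "huge.xml", the tag name "record" and the
process() function are just placeholders for whatever your data
actually looks like:

    try:
        from xml.etree.cElementTree import iterparse  # Python 2.5+
    except ImportError:
        from cElementTree import iterparse  # standalone cElementTree

    def process(elem):
        # Placeholder: do whatever you need with one record element.
        pass

    # Ask for start events too, so we can grab the root element.
    context = iterparse("huge.xml", events=("start", "end"))
    context = iter(context)
    event, root = context.next()

    for event, elem in context:
        if event == "end" and elem.tag == "record":
            process(elem)
            root.clear()  # discard elements we've already processed

The root.clear() call is the important part: without it, every
fully-parsed element stays attached to the root and memory still grows
with the size of the file. With it, memory use stays roughly constant,
but of course this only helps if your processing really can be done one
record at a time.

Paul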