XPath looks pretty sweet, anyone used it for similar data sizes?

On 06/01/06, ryanm <[EMAIL PROTECTED]> wrote:
> > I find myself in a situation where I need to build a tool to analyse
> > lots of XML data. Thousands of records containing a lot of strings as
> > well as numericals.
> >
> When I found myself in this situation I did 2 things:
>
> 1. Don't use XML, it is way too heavy for this much data. I found that by
> using a double-delimited or fixed-width data format, the file size was
> reduced by as much as 70%. In the end, I went with fixed width because I
> could parse it faster (by avoiding calling split() thousands of times).
>
> Now, I still used the XML object, but instead of letting it parse the
> file, I overwrote the onData event and used my own parsing function, which
> generated objects directly instead of parsing it out to an XML object.
> Essentially, the XML object just read the data in and dumped it to my
> parsing function.
>
> 2. Don't try to parse it all at once. What I did was dump it all into a
> buffer when it was loaded, and then fire off a parsing function that parsed
> 250 records per frame. I found that number through trial and error; you can
> find your own balance. The important thing was that the application didn't
> stop functioning while the records were being parsed: you could go to other
> areas of the app and use it normally, and when you went to the section that
> required the data, you got a progress bar showing how many records had been
> parsed.
>
> My parsing function was semi-complicated. It took the whole dataset in
> as a string and split it on my record delimiter, and this array became my
> buffer. This way I knew how many records there were to parse, and
> approximately how long it would take to parse them. It then sliced 250
> records off the top of the buffer on every frame and passed them to the
> serialization function, which took them, serialized them, and inserted them
> into my "database" object.
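For anyone who wants to try the buffered approach described above, here is a rough sketch of it. It is written in TypeScript rather than ActionScript, and it is not ryanm's code: the field layout (CHUNK_LAYOUT), the newline record delimiter, and the names parseFixedWidth and ChunkedParser are all made up for illustration. Only the 250-records-per-step figure and the "split once into a buffer, then slice a chunk off the top each frame" idea come from the post.

```typescript
// Hypothetical fixed-width layout: field names and column widths are
// assumptions for this sketch, not taken from the original post.
interface FieldSpec {
  name: string;
  width: number;
}

type ParsedRecord = Record<string, string>;

const CHUNK_LAYOUT: FieldSpec[] = [
  { name: "id", width: 6 },
  { name: "city", width: 12 },
  { name: "stars", width: 1 },
];

// Cut one fixed-width line into an object by column position,
// trimming the padding from each field (no per-field split() call).
function parseFixedWidth(line: string, layout: FieldSpec[]): ParsedRecord {
  const rec: ParsedRecord = {};
  let pos = 0;
  for (const field of layout) {
    rec[field.name] = line.slice(pos, pos + field.width).trim();
    pos += field.width;
  }
  return rec;
}

class ChunkedParser {
  private buffer: string[];
  private layout: FieldSpec[];
  private chunkSize: number;
  readonly total: number;
  readonly parsed: ParsedRecord[] = [];

  constructor(raw: string, layout: FieldSpec[], chunkSize: number = 250) {
    // Split once on the record delimiter (assumed to be "\n" here);
    // the resulting array is the buffer, so the record count is known up front.
    this.buffer = raw.split("\n").filter((line) => line.length > 0);
    this.total = this.buffer.length;
    this.layout = layout;
    this.chunkSize = chunkSize;
  }

  // Parse one chunk off the top of the buffer. Intended to be called once
  // per frame or timer tick. Returns true while records remain.
  step(): boolean {
    const chunk = this.buffer.splice(0, this.chunkSize);
    for (const line of chunk) {
      this.parsed.push(parseFixedWidth(line, this.layout));
    }
    return this.buffer.length > 0;
  }

  // Fraction of records parsed so far, suitable for a progress bar.
  progress(): number {
    return this.total === 0 ? 1 : this.parsed.length / this.total;
  }
}
```

To keep the UI responsive you would call step() from a timer or frame callback and stop when it returns false, reading progress() in between to drive the progress bar. The right chunk size depends on the machine and the record size, so 250 is only a starting point.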
> My parsing function also built several indexes
> while it was parsing the records, to make lookups faster once the database
> was ready. My application was a database of hotels, which were sortable by a
> number of criteria, so the parsing routine looked for those attributes of
> each hotel as it parsed, and when it saw a new value for one of those
> criteria, it made a new entry in the appropriate index for it.
>
> I made very heavy use of the object collection syntax, for example:
>
>     Index["Location"]["USA"]["Texas"]["Dallas"]
>
> ...referred to an array of hotel ids which were in Dallas, Texas, USA,
> which could be used to find a hotel like this:
>
>     // 0 is the first index in the array of ids
>     hotelID = Index["Location"]["USA"]["Texas"]["Dallas"][0];
>     return(Database[hotelID]);
>
> In the end, it took about 5 times as much code to import, parse, and
> index the database as the whole rest of the application, but it worked, it
> was relatively fast, and it met the requirements I was given. I would've
> preferred for it to work from a web server, selecting what I needed from the
> database, but the client required that it work offline from a database that
> shipped with the CD, as well as be able to download an updated database from
> their website, and this was the best solution I could find in Flash that
> worked on both PC and Mac (no 3rd party wrappers). Unfortunately it had to
> parse the whole database every time you ran the app, but it would get the
> newest version from the web if you were online and it gave you the option to
> store it (in an ungodly-sized shared object) if you wanted to.
>
> Anyway, that's how I did it; whether or not it was successful is a
> matter of opinion. ;-)
>
> ryanm
>
> _______________________________________________
> Flashcoders mailing list
> [email protected]
> http://chattyfig.figleaf.com/mailman/listinfo/flashcoders
>
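The nested index quoted above (Index["Location"]["USA"]["Texas"]["Dallas"]) could be built along these lines. Again this is only a sketch in TypeScript, not the original ActionScript: the Hotel fields, and the names addToLocationIndex, buildDatabase and idsIn, are assumptions made for the example. The post only says that an index entry was created whenever a new value was seen while parsing, and that the leaf was an array of hotel ids.

```typescript
// Hypothetical record shape; the real hotel records are not shown in the post.
interface Hotel {
  id: string;
  country: string;
  state: string;
  city: string;
}

// index[country][state][city] -> array of hotel ids
type LocationIndex = Record<string, Record<string, Record<string, string[]>>>;

// Add one hotel to the index, creating each level the first time
// a new country, state or city value is seen.
function addToLocationIndex(index: LocationIndex, hotel: Hotel): void {
  let states = index[hotel.country];
  if (states === undefined) {
    states = {};
    index[hotel.country] = states;
  }
  let cities = states[hotel.state];
  if (cities === undefined) {
    cities = {};
    states[hotel.state] = cities;
  }
  let ids = cities[hotel.city];
  if (ids === undefined) {
    ids = [];
    cities[hotel.city] = ids;
  }
  ids.push(hotel.id);
}

// Build the id -> hotel "database" and the location index in a single pass.
function buildDatabase(hotels: Hotel[]) {
  const database: Record<string, Hotel> = {};
  const index: { Location: LocationIndex } = { Location: {} };
  for (const hotel of hotels) {
    database[hotel.id] = hotel;
    addToLocationIndex(index.Location, hotel);
  }
  return { database, index };
}

// Safe lookup: returns an empty array when any level of the path is missing.
function idsIn(index: LocationIndex, country: string, state: string, city: string): string[] {
  const states = index[country];
  const cities = states === undefined ? undefined : states[state];
  const ids = cities === undefined ? undefined : cities[city];
  return ids === undefined ? [] : ids;
}
```

With that in place, idsIn(index.Location, "USA", "Texas", "Dallas")[0] gives the first hotel id in Dallas, and database[id] returns the record, which mirrors the two-step lookup in the quoted snippet. The helper exists only because indexing a missing level directly would throw.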
--
Jonathan Clarke
1976 Ltd
http://19seventysix.co.uk
e: [EMAIL PROTECTED]
m (UK): +44 773 646 1954
m (Barbados): +1246 259 9475

