Since it's a CD-ROM, you can get away with a lot of preprocessing on the data. Do all the really heavy lifting before you deliver the project, so that the "live" Flash app doesn't have to.
One straightforward thing you can do is parse the entire collection XML, record, for each word, every article it appears in, and dump this list (sorted by word) as a plaintext file. Something like:

aardvark: i302a27, i322a41, i412a2
anchovy: i210a9, i289a31
bezier: i123a4

where each entry has the form i<issueNum>a<articleNum>.

Then, at run time, your Flash app can (relatively) quickly load your index file into a huge sorted list. Finally, when a search term is entered, you can do a quick binary search for the term and find all relevant articles.

You'll probably need to write your preprocessor in another language, since Flash can't write to local files. You could conceivably write your preprocessor using AIR and ActionScript.

Also note that I'm not guaranteeing that a simple list index is best -- I'm just providing one implementation idea off the top of my head that demonstrates the use of preprocessing the data, which I think you must do regardless of your final indexing strategy.

On 2/21/08, Glen Pike <[EMAIL PROTECTED]> wrote:
> The system can use AS3 - as it is a CDROM.
>
> I asked about the data size - at the moment, a sample XML file,
> generated by an automatic tool is about 500k, gulp.
> That means, 6MB per year, 60MB per decade at the moment.
>
> I have asked to see the file, because there may be a lot of rubbish that
> can be eliminated - I hope so..
>
> Glen
>
> Merrill, Jason wrote:
> > First questions to get out of the way is which version of Actionscript
> > and potentially how much data (in k)?
> >
> > Jason Merrill
> > Bank of America
> > GT&O L&LD Solutions Design & Development
> > eTools & Multimedia
> >
> > Bank of America Flash Platform Developer Community
> >
> > Are you a Bank of America associate interested in innovative learning
> > ideas and technologies?
> > Check out our internal GT&O Innovative Learning Blog and & subscribe.
> >
> >>> -----Original Message-----
> >>> From: [EMAIL PROTECTED]
> >>> [mailto:[EMAIL PROTECTED] On Behalf
> >>> Of Glen Pike
> >>> Sent: Thursday, February 21, 2008 10:50 AM
> >>> To: Flash Coders List
> >>> Subject: [Flashcoders] CDROM XML search
> >>>
> >>> Hi,
> >>>
> >>> I have been asked to look at a search facility for a
> >>> CDROM project.
> >>>
> >>> The customer is archiving magazines, 1 a month, for a
> >>> decade per CD and wants a simple search engine.
> >>>
> >>> The magazines will be archived as scanned images plus XML
> >>> data containing page text content.
> >>>
> >>> Loading in an XML file and searching / filtering is
> >>> pretty easy in principle, but I am guessing I may run into
> >>> performance issues as the amount of data is scaled up.
> >>>
> >>> Google is proving fairly useless today, so has anyone had
> >>> much experience of this and have any recommendations.
> >>>
> >>> Thanks
> >>>
> >>> Glen
> >>> --
> >>>
> >>> Glen Pike
> >>> 01736 759321
> >>> www.glenpike.co.uk <http://www.glenpike.co.uk>
>
> --
>
> Glen Pike
> 01736 759321
> www.glenpike.co.uk <http://www.glenpike.co.uk>

--
Cory Petosky : Lead Developer : PUNY
1618 Central Ave NE Suite 130
Minneapolis, MN 55413
Office: 612.216.3924
Mobile: 240.422.9652
Fax: 612.605.9216
http://www.punyentertainment.com
_______________________________________________
Flashcoders mailing list
[email protected]
http://chattyfig.figleaf.com/mailman/listinfo/flashcoders
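
For anyone who wants to try the list-index idea from the top of this message, here is a rough sketch of both halves in Python: a build-time step that turns the XML into the sorted "word: i<issueNum>a<articleNum>, ..." file, and a lookup that binary-searches the loaded list. Everything specific here is an assumption made for illustration: the XML layout (`issue` and `article` elements with a `num` attribute), the function names, and the choice of Python. The real export format has not been shown on the list, and in the shipped app the lookup half would be ActionScript, not Python.

```python
import bisect
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Words are runs of ASCII letters/digits; a real index may need a richer rule.
WORD_RE = re.compile(r"[a-z0-9]+")


def build_index(xml_text):
    """Map each lowercased word to the sorted IDs of the articles containing it.

    Assumes a hypothetical layout:
    <collection><issue num="302"><article num="27">text</article></issue></collection>
    """
    index = defaultdict(set)
    root = ET.fromstring(xml_text)
    for issue in root.iter("issue"):
        for article in issue.iter("article"):
            article_id = "i{}a{}".format(issue.get("num"), article.get("num"))
            text = "".join(article.itertext()).lower()
            for word in WORD_RE.findall(text):
                index[word].add(article_id)
    return {word: sorted(ids) for word, ids in index.items()}


def dump_index(index):
    """Serialize as one 'word: id, id, ...' line per word, sorted by word."""
    return "\n".join(
        "{}: {}".format(word, ", ".join(index[word])) for word in sorted(index)
    )


def load_index(text):
    """Parse the plaintext index into two parallel lists: words and ID lists."""
    words, postings = [], []
    for line in text.splitlines():
        if not line.strip():
            continue
        word, _, ids = line.partition(": ")
        words.append(word)
        postings.append(ids.split(", "))
    return words, postings


def search(words, postings, term):
    """Binary-search the sorted word list; return the article IDs, or []."""
    term = term.lower()
    pos = bisect.bisect_left(words, term)
    if pos < len(words) and words[pos] == term:
        return postings[pos]
    return []


SAMPLE = """<collection>
  <issue num="302"><article num="27">The aardvark and the anchovy</article></issue>
  <issue num="322"><article num="41">Aardvark habits</article></issue>
</collection>"""

if __name__ == "__main__":
    index_text = dump_index(build_index(SAMPLE))  # done once, before delivery
    words, postings = load_index(index_text)      # done at app start-up
    print(search(words, postings, "Aardvark"))    # ['i302a27', 'i322a41']
```

Whatever ordering the preprocessor uses to sort the words has to match the comparison the run-time search uses, otherwise the binary search can miss terms that are in the file. The sketch only handles exact single-word matches; prefix or multi-word search would need more work on top of the same index.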

