Re: [Flashcoders] CDROM XML search

Cory Petosky Thu, 21 Feb 2008 09:37:22 -0800

Since it's a CD-ROM, you can get away with a lot of preprocessing on
the data. Do all the really heavy lifting before you deliver the
project, so that the "live" Flash app doesn't have to.


One straightforward thing you can do is parse the entire collection's
XML, record every article each word appears in, and dump the resulting
list (sorted by word) as a plaintext file. Something like:

aardvark: i302a27, i322a41, i412a2
anchovy: i210a9, i289a31
bezier: i123a4

Each entry in a word's list has the form i<issueNum>a<articleNum>.
Then, at run time, your Flash app can (relatively) quickly load your
index file into a huge sorted list. Finally, when a search term is
entered, you can do a quick binary search for the term and find all
relevant articles.
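
To make the lookup concrete, here is a rough sketch of the load and
binary-search steps. It's written in Python only for illustration; the
real thing would be ported to ActionScript inside the Flash app. The
function names parse_index and lookup are made up for this example,
and it assumes the index file has exactly the "word: id, id" layout
shown above, sorted by word.

```python
from bisect import bisect_left

def parse_index(text):
    """Split the index file into two parallel lists: words and article-ID lists.

    Assumes one 'word: id, id, ...' entry per line, already sorted by word.
    """
    words, postings = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        word, _, ids = line.partition(":")
        words.append(word.strip())
        postings.append([i.strip() for i in ids.split(",") if i.strip()])
    return words, postings

def lookup(words, postings, term):
    """Binary-search the sorted word list; return matching article IDs or []."""
    term = term.strip().lower()
    pos = bisect_left(words, term)
    if pos < len(words) and words[pos] == term:
        return postings[pos]
    return []
```

Note that the binary search only works if the file was sorted with the
same string comparison the run-time code uses, so it's worth
lowercasing everything on both sides.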

You'll probably need to write your preprocessor in another language,
since Flash can't write to local files. Alternatively, you could
conceivably write your preprocessor in ActionScript using AIR, which
does have file system access.
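
As a rough idea of what such a preprocessor could look like, here is a
minimal Python sketch. I haven't seen the actual XML, so the layout it
expects (<issue num="..."> elements containing <article num="...">
elements with the text inside) is purely an assumption, as are the
function names build_index and format_index. Adjust the element and
attribute names to whatever the archiving tool really produces.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Treat runs of ASCII letters and digits as words (an assumption; adjust as needed).
WORD_RE = re.compile(r"[a-z0-9]+")

def build_index(xml_text):
    """Map each lowercase word to the sorted list of article IDs containing it."""
    root = ET.fromstring(xml_text)
    index = defaultdict(set)
    # Hypothetical layout: <issue num="302"><article num="27">text</article></issue>
    for issue in root.iter("issue"):
        for article in issue.iter("article"):
            article_id = "i%sa%s" % (issue.get("num"), article.get("num"))
            text = "".join(article.itertext()).lower()
            for word in WORD_RE.findall(text):
                index[word].add(article_id)
    return {word: sorted(ids) for word, ids in index.items()}

def format_index(index):
    """Render the index as one 'word: id, id' line per word, sorted by word."""
    return "\n".join("%s: %s" % (w, ", ".join(index[w])) for w in sorted(index))
```

You would run this once over the whole collection before mastering the
CD and write the result of format_index() out as the index file.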

Also note that I'm not guaranteeing that a simple list index is best
-- I'm just providing one implementation idea off the top of my head
that demonstrates the use of preprocessing the data, which I think you
must do regardless of your final indexing strategy.

On 2/21/08, Glen Pike <[EMAIL PROTECTED]> wrote:
> The system can use AS3 - as it is a CDROM.
>
>  I asked about the data size - at the moment, a sample XML file,
>  generated by an automatic tool is about 500k, gulp.
>  That means, 6MB per year, 60MB per decade at the moment.
>
>  I have asked to see the file, because there may be a lot of rubbish that
>  can be eliminated - I hope so..
>
>
>  Glen
>
>
>
>
>  Merrill, Jason wrote:
>  > First questions to get out of the way: which version of ActionScript,
>  > and potentially how much data (in KB)?
>  >
>  > Jason Merrill
>  > Bank of America
>  > GT&O L&LD Solutions Design & Development
>  > eTools & Multimedia
>  >>> -----Original Message-----
>  >>> From: [EMAIL PROTECTED]
>  >>> [mailto:[EMAIL PROTECTED] On Behalf
>  >>> Of Glen Pike
>  >>> Sent: Thursday, February 21, 2008 10:50 AM
>  >>> To: Flash Coders List
>  >>> Subject: [Flashcoders] CDROM XML search
>  >>>
>  >>> Hi,
>  >>>
>  >>>    I have been asked to look at a search facility for a
>  >>> CDROM project.
>  >>>
>  >>>    The customer is archiving magazines, 1 a month, for a
>  >>> decade per CD and wants a simple search engine.
>  >>>
>  >>>    The magazines will be archived as scanned images plus XML
>  >>> data containing page text content.
>  >>>
>  >>>    Loading in an XML file and searching / filtering is
>  >>> pretty easy in principle, but I am guessing I may run into
>  >>> performance issues as the amount of data is scaled up.
>  >>>
>  >>>    Google is proving fairly useless today, so has anyone had
>  >>> much experience of this, and do you have any recommendations?
>  >>>
>  >>>    Thanks
>  >>>
>  >>>    Glen
>  >>> --
>  >>>
>  >>> Glen Pike
>  >>> 01736 759321
>  >>> www.glenpike.co.uk <http://www.glenpike.co.uk>
>  >>> _______________________________________________
>  >>> Flashcoders mailing list
>  >>> [email protected]
>  >>> http://chattyfig.figleaf.com/mailman/listinfo/flashcoders
>  >>>
>  >>>
>
>  --
>
>  Glen Pike
>  01736 759321
>  www.glenpike.co.uk <http://www.glenpike.co.uk>
>


-- 
Cory Petosky : Lead Developer : PUNY
1618 Central Ave NE Suite 130
Minneapolis, MN 55413
Office: 612.216.3924
Mobile: 240.422.9652
Fax: 612.605.9216
http://www.punyentertainment.com
_______________________________________________
Flashcoders mailing list
[email protected]
http://chattyfig.figleaf.com/mailman/listinfo/flashcoders
