On the other hand, if you want it to be Unicode-correct, you could use my ElfData plugin's Scan_NextUTF8 function <http://www.elfdata.com/plugin/technicalref/ElfDataMCat12.html#Scan_NextUTF8>.

I wouldn't trust split to be fast on Unicode, especially with large arrays.

How big are the files though? A few MB? KB? Or hundreds of MB? The strategy depends on the size. Assuming "not too big" files, you can just read the entire file in and process it using ElfData.Scan_NextUTF8 :)

Once you get into files that don't comfortably fit in your RAM caches, it's time to start reading in chunks, which requires a more sophisticated approach. The idea is to read chunks small enough that split() is usable on each one again, so you wouldn't need my plugin for that.
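
To make the chunked idea concrete, here is a rough sketch. It is in Python rather than REALbasic, purely for illustration, and the names (iter_fields, chunk_size) are made up for this example; it has nothing to do with how ElfData works internally. It assumes the file is UTF-8. The two things to watch are a multi-byte character cut in half at a chunk boundary and a field that straddles two chunks.

```python
import codecs
import io

def iter_fields(stream, delimiter="\n", chunk_size=64 * 1024):
    """Yield delimiter-separated fields from a binary UTF-8 stream, chunk by chunk."""
    # The incremental decoder holds back a partial multi-byte character
    # until the rest of its bytes arrive in the next chunk.
    decoder = codecs.getincrementaldecoder("utf-8")()
    tail = ""  # incomplete field carried over from the previous chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        text = tail + decoder.decode(chunk)
        parts = text.split(delimiter)  # small enough for split to be cheap
        tail = parts.pop()             # last piece may be incomplete
        for part in parts:
            yield part
    # Flush the decoder; this raises if the data ends mid-character.
    tail += decoder.decode(b"", final=True)
    if tail:
        yield tail

# Tiny chunk size on purpose, so characters and fields straddle chunks.
sample = io.BytesIO("α,β,γδ,end".encode("utf-8"))
fields = list(iter_fields(sample, delimiter=",", chunk_size=3))
```

Note that, unlike a plain split, this sketch drops a trailing empty field when the data ends with the delimiter. In practice you would pick a chunk size that fits comfortably in cache instead of the tiny value used in the demo.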

--
http://elfdata.com/plugin/


