There are some very large XML files (the largest is around 30MB, and 
there are as many as 40 per directory).

They are supposed to be UTF-8, but unfortunately contain some bytes that 
are not valid UTF-8. These should be replaced with the correct UTF-8 
encoding of the intended characters.

(For example, an em dash should become &#8212; or the properly encoded 
U+2014 character, etc.)

Does anybody have experience with treating giant XML files as streams, 
operating on them (ideally in the manner described), and writing the 
corrected version of the file to disk?
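
To make the idea concrete, here is a rough sketch of the kind of streaming 
repair described above, not a tested solution. It assumes the stray bytes 
are Windows-1252 (nothing in the files confirms this), and the names CP1252, 
repair and createUtf8Repair are made up for illustration. Valid UTF-8 
sequences are copied through unchanged, any other byte is re-encoded from 
Windows-1252, and a multi-byte sequence cut off at a chunk boundary is held 
back until the next chunk arrives.

```javascript
'use strict';
const { Transform } = require('node:stream');

// ASSUMPTION: invalid bytes are Windows-1252. These are the bytes in
// 0x80-0x9F whose code points differ from Latin-1; all others map 1:1.
const CP1252 = {
  0x80: 0x20ac, 0x82: 0x201a, 0x83: 0x0192, 0x84: 0x201e, 0x85: 0x2026,
  0x86: 0x2020, 0x87: 0x2021, 0x88: 0x02c6, 0x89: 0x2030, 0x8a: 0x0160,
  0x8b: 0x2039, 0x8c: 0x0152, 0x8e: 0x017d, 0x91: 0x2018, 0x92: 0x2019,
  0x93: 0x201c, 0x94: 0x201d, 0x95: 0x2022, 0x96: 0x2013, 0x97: 0x2014,
  0x98: 0x02dc, 0x99: 0x2122, 0x9a: 0x0161, 0x9b: 0x203a, 0x9c: 0x0153,
  0x9e: 0x017e, 0x9f: 0x0178,
};

// Length of the UTF-8 sequence a lead byte announces, or 0 if the byte
// cannot start a multi-byte sequence.
function seqLength(b) {
  if (b >= 0xc2 && b <= 0xdf) return 2;
  if (b >= 0xe0 && b <= 0xef) return 3;
  if (b >= 0xf0 && b <= 0xf4) return 4;
  return 0;
}

// The second byte has a narrower range after some lead bytes; this rejects
// overlong forms, surrogates, and code points above U+10FFFF.
function secondByteOk(lead, b) {
  if (lead === 0xe0) return b >= 0xa0 && b <= 0xbf;
  if (lead === 0xed) return b >= 0x80 && b <= 0x9f;
  if (lead === 0xf0) return b >= 0x90 && b <= 0xbf;
  if (lead === 0xf4) return b >= 0x80 && b <= 0x8f;
  return b >= 0x80 && b <= 0xbf;
}

// Classifies the multi-byte sequence starting at input[i].
function classify(input, i, need) {
  for (let k = 1; k < need; k += 1) {
    if (i + k >= input.length) return 'incomplete';
    const c = input[i + k];
    const ok = k === 1 ? secondByteOk(input[i], c) : c >= 0x80 && c <= 0xbf;
    if (!ok) return 'bad';
  }
  return 'ok';
}

// Returns the repaired bytes plus any trailing bytes that must wait for the
// next chunk. With flushing = true nothing is held back.
function repair(input, flushing) {
  const parts = [];
  let start = 0; // start of the current run copied through unchanged
  let i = 0;
  while (i < input.length) {
    const b = input[i];
    if (b < 0x80) { i += 1; continue; } // ASCII
    const need = seqLength(b);
    const state = need === 0 ? 'bad' : classify(input, i, need);
    if (state === 'ok') { i += need; continue; }
    if (state === 'incomplete' && !flushing) break;
    // Invalid byte: emit the clean run, then the re-encoded character.
    parts.push(input.subarray(start, i));
    parts.push(Buffer.from(String.fromCodePoint(CP1252[b] ?? b), 'utf8'));
    i += 1;
    start = i;
  }
  parts.push(input.subarray(start, i));
  return { out: Buffer.concat(parts), rest: input.subarray(i) };
}

// Transform stream wrapper; "pending" carries at most three held-back bytes.
function createUtf8Repair() {
  let pending = Buffer.alloc(0);
  return new Transform({
    transform(chunk, _encoding, done) {
      const { out, rest } = repair(Buffer.concat([pending, chunk]), false);
      pending = Buffer.from(rest); // copy so the large buffer can be freed
      done(null, out);
    },
    flush(done) {
      this.push(repair(pending, true).out);
      done();
    },
  });
}
```

It could be wired up with something like 
stream.pipeline(fs.createReadStream(src), createUtf8Repair(), 
fs.createWriteStream(dst), callback), so a 30MB file is never held in memory 
all at once. This works on raw bytes and never parses the XML, and it is 
only a heuristic: a Windows-1252 byte that happens to be followed by bytes 
that look like UTF-8 continuation bytes would be passed through as if it 
were valid.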

Thanks!
