There are some fairly gigantic XML files (the largest is around 30MB, and there are as many as 40 per directory).
They are encoded as UTF-8, but unfortunately contain some bytes that are not valid UTF-8. These should be replaced with the correct UTF-8 characters or character references (an em dash should become &#8212; or —, etc.).

Does anybody have experience with treating giant XML files as streams, operating on them (ideally in the manner described), and writing the corrected version of each file to disk?

Thanks!
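One way this could be approached is sketched below, using only Node's built-in `stream` and `fs` modules. It rests on an assumption the post does not confirm: that the stray bytes are Windows-1252 (where an em dash is the single byte 0x97). The names `Utf8Repair`, `repair`, `decodeStray`, `seqLength`, and `repairFile` are made up for illustration. Well-formed UTF-8 sequences pass through untouched, any byte that does not fit is re-encoded from Windows-1252 to UTF-8, and a multi-byte sequence split across two chunks is held back until the next chunk arrives. Overlong and surrogate encodings are not checked, so treat it as a starting point rather than a full validator.

```javascript
const fs = require('fs');
const { Transform, pipeline } = require('stream');

// Windows-1252 bytes 0x80-0x9F as Unicode code points.
// ASSUMPTION: the invalid bytes in the files are Windows-1252.
const CP1252 = [
  0x20AC, 0x81, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
  0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x8D, 0x017D, 0x8F,
  0x90, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
  0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x9D, 0x017E, 0x0178,
];

// Expected length of a UTF-8 sequence given its lead byte; 0 if not a lead byte.
function seqLength(b) {
  if (b < 0x80) return 1;
  if (b >= 0xC2 && b <= 0xDF) return 2;
  if (b >= 0xE0 && b <= 0xEF) return 3;
  if (b >= 0xF0 && b <= 0xF4) return 4;
  return 0;
}

// Re-encode one stray byte as UTF-8 (0xA0-0xFF match Latin-1 in Windows-1252).
function decodeStray(b) {
  const cp = b >= 0x80 && b <= 0x9F ? CP1252[b - 0x80] : b;
  return Buffer.from(String.fromCodePoint(cp), 'utf8');
}

// Repair one buffer. Returns the fixed bytes plus any incomplete trailing
// sequence that should be retried once more input is available.
function repair(buf, atEnd) {
  const out = [];
  let i = 0;
  let start = 0;
  while (i < buf.length) {
    const n = seqLength(buf[i]);
    let k = 1;
    while (k < n && i + k < buf.length && (buf[i + k] & 0xC0) === 0x80) k++;
    if (n > 0 && k === n) { i += n; continue; }         // well-formed sequence
    if (n > 1 && i + k === buf.length && !atEnd) break;  // cut off by chunk boundary
    out.push(buf.subarray(start, i), decodeStray(buf[i])); // stray byte
    i += 1;
    start = i;
  }
  out.push(buf.subarray(start, i));
  return { fixed: Buffer.concat(out), rest: buf.subarray(i) };
}

class Utf8Repair extends Transform {
  constructor(options) {
    super(options);
    this.pending = Buffer.alloc(0); // at most 3 bytes carried between chunks
  }
  _transform(chunk, _encoding, callback) {
    const { fixed, rest } = repair(Buffer.concat([this.pending, chunk]), false);
    this.pending = Buffer.from(rest);
    callback(null, fixed);
  }
  _flush(callback) {
    callback(null, repair(this.pending, true).fixed);
  }
}

// Stream src through the repair transform into dest; done(err) fires at the end.
function repairFile(src, dest, done) {
  pipeline(fs.createReadStream(src), new Utf8Repair(), fs.createWriteStream(dest), done);
}
```

Calling `repairFile('input.xml', 'output.xml', callback)` (the file names here are placeholders) would process a 30MB file in small chunks without loading it into memory. Writing to a separate output file and renaming it afterwards seems safer than overwriting the source in place. If numeric character references are preferred over literal characters, `decodeStray` could return `Buffer.from('&#' + cp + ';')` instead.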
