I doubt there's anything wrong with the XML, given the company I'm getting the data from. And I checked already that DB2 (the parser) has no issue in this area. And well over 3 million a day are being parsed/inserted just fine.
I just don't think I'm putting the UTF8 back together correctly on some edge case, or there's some other bug somewhere above my pay grade. Is there any npm package I could use other than the standard node.js http client that might already have this resolved?

On Friday, December 28, 2012 9:46:20 AM UTC-5, Ben Noordhuis wrote:
> On Fri, Dec 28, 2012 at 3:08 PM, am_p1 <[email protected]> wrote:
> > I'm using this for sure:
> > response.setEncoding('utf8')
> >
> > but the problem is the chunks can be split more than once and with UTF8
> > strings there doesn't seem to be any character that indicates the buffer
> > was split. I read that JSON responses have the \n you can use but I don't
> > see that anywhere in the XML response I'm receiving.
> >
> > If node.js doesn't put these back together for me, then I need to figure
> > out what characters are at the end of the buffer to indicate a split chunk
> > so I can put them back together myself. Currently I'm looking for some
> > strings in the XML packet to indicate a complete or incomplete chunk but
> > again, it's not working 100%.
>
> You may be looking at two separate issues here.
>
> 1. Partial character sequences. When used as documented,
> stream.setEncoding() takes care of that: if the data chunk ends in a
> partial sequence, it's not emitted until the next chunk arrives.
>
> For the curious, the relevant code is in lib/string_decoder.js.
>
> 2. Partial XML documents. node.js can't help you here, you somehow
> need to track that yourself.
>
> If the server sets a Content-Length header, it's easy: just xml +=
> data until Buffer.byteLength(xml) equals the content length. Caveat
> emptor: repeatedly calling Buffer.byteLength() is not very efficient
> but don't worry about that until later. Make it work first, then make
> it work fast.
>
> If the response is sent using chunked encoding, you probably need to
> parse it with a SAX parser first.
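
For reference, here is a minimal sketch of the two points in the quoted reply, not a definitive implementation. It assumes the server sends a correct Content-Length header and one XML document per response. The helper names decodeChunks and makeBodyCollector are hypothetical, as is the sample XML. It uses the same StringDecoder that setEncoding() relies on, so a character split across chunks is held back until its remaining bytes arrive.

```javascript
const { StringDecoder } = require('string_decoder');

// Hypothetical helper: decode raw Buffer chunks into one string without
// corrupting multi-byte characters that straddle a chunk boundary.
function decodeChunks(chunks) {
  const decoder = new StringDecoder('utf8');
  let text = '';
  for (const chunk of chunks) {
    // write() returns only complete characters and buffers a partial tail.
    text += decoder.write(chunk);
  }
  // end() flushes whatever is still buffered.
  return text + decoder.end();
}

// Hypothetical helper: accumulate decoded chunks until the UTF-8 byte length
// of the accumulated text equals the expected Content-Length.
// Assumption: contentLength comes from a trustworthy Content-Length header.
function makeBodyCollector(contentLength, onComplete) {
  let xml = '';
  return function onData(chunk) {
    xml += chunk;
    // Not efficient to recompute on every chunk, but simple.
    if (Buffer.byteLength(xml, 'utf8') === contentLength) {
      onComplete(xml);
    }
  };
}

// Simulate a response whose 3-byte euro sign is split across three chunks.
const whole = Buffer.from('<a>€</a>', 'utf8');
const parts = [whole.subarray(0, 4), whole.subarray(4, 5), whole.subarray(5)];

// Decoding each chunk on its own damages the split character.
const naive = parts.map((part) => part.toString('utf8')).join('');

// Decoding through StringDecoder reassembles it.
const decoded = decodeChunks(parts);

// Feed the decoded pieces to the collector the way 'data' events would.
const decoder = new StringDecoder('utf8');
let collected = null;
const onData = makeBodyCollector(whole.length, (xml) => { collected = xml; });
for (const part of parts) {
  onData(decoder.write(part));
}

console.log(naive === '<a>€</a>', decoded === '<a>€</a>', collected === '<a>€</a>');
```

With a real response, the collector would be attached after response.setEncoding('utf8'), for example response.on('data', makeBodyCollector(Number(response.headers['content-length']), handleXml)), where handleXml is whatever hypothetical function passes the document on to the parser.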
