I also left out that the console.log output (as UTF-8) of the small XML packet looks perfectly well formed, with no issues at all. That's why I think the problem is at a lower level...
On Friday, December 28, 2012 11:42:53 AM UTC-5, am_p1 wrote:
>
> I doubt there's anything wrong with the XML, given the company I'm getting
> the data from. And I checked already that DB2 (the parser) has no issue in
> this area. And well over 3 million a day are being parsed/inserted just
> fine.
>
> I just don't think I'm putting the UTF8 back together correctly on some
> edge case, or there's some other bug somewhere above my pay grade.
>
> Is there any npm package I could use other than the standard node.js http
> client that might already have this resolved?
>
> On Friday, December 28, 2012 9:46:20 AM UTC-5, Ben Noordhuis wrote:
>>
>> On Fri, Dec 28, 2012 at 3:08 PM, am_p1 wrote:
>> > I'm using this for sure:
>> > response.setEncoding('utf8')
>> >
>> > but the problem is the chunks can be split more than once and with UTF8
>> > strings there doesn't seem to be any character that indicates the buffer was
>> > split. I read that JSON responses have the \n you can use but I don't see
>> > that anywhere in the XML response I'm receiving.
>> >
>> > If node.js doesn't put these back together for me, then I need to figure out
>> > what characters are at the end of the buffer to indicate a split chunk so I
>> > can put them back together myself. Currently I'm looking for some strings in
>> > the XML packet to indicate a complete or incomplete chunk but again, it's
>> > not working 100%.
>>
>> You may be looking at two separate issues here.
>>
>> 1. Partial character sequences. When used as documented,
>> stream.setEncoding() takes care of that: if the data chunk ends in a
>> partial sequence, it's not emitted until the next chunk arrives.
>>
>> For the curious, the relevant code is in lib/string_decoder.js.
>>
>> 2. Partial XML documents. node.js can't help you here, you somehow
>> need to track that yourself.
>>
>> If the server sets a Content-Length header, it's easy: just xml +=
>> data until Buffer.byteLength(xml) equals the content length. Caveat
>> emptor: repeatedly calling Buffer.byteLength() is not very efficient
>> but don't worry about that until later. Make it work first, then make
>> it work fast.
>>
>> If the response is sent using chunked encoding, you probably need to
>> parse it with a SAX parser first.
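For anyone following along, here is a rough sketch of the two points in the quoted reply, written against a current Node.js rather than the 2012 release. It is not the code from this thread. The helper name createBodyCollector is made up for illustration, and it counts bytes from the raw Buffer chunks instead of calling Buffer.byteLength() on the growing string, which is a substitution on my part. It assumes the server sends a Content-Length header and that response.setEncoding() has not been called, so the chunks arrive as Buffers.

```javascript
const { StringDecoder } = require('string_decoder');

// Hypothetical helper (not part of node.js): collects Buffer chunks until
// the number of bytes announced by Content-Length has arrived.
function createBodyCollector(contentLength) {
  // StringDecoder holds back an incomplete multi-byte sequence at the end
  // of a chunk and emits it once the remaining bytes show up.
  const decoder = new StringDecoder('utf8');
  let received = 0;
  let xml = '';
  return {
    // Feed one raw chunk; returns true once the whole body is in.
    push(chunk) {
      received += chunk.length; // byte count, not character count
      xml += decoder.write(chunk);
      return received >= contentLength;
    },
    // Flush anything still buffered and return the decoded document.
    end() {
      return xml + decoder.end();
    },
  };
}

// Demo without a network: split a small document in the middle of the
// three-byte UTF-8 encoding of the euro sign (0xE2 0x82 0xAC).
const doc = Buffer.from('<msg>caf\u00e9 \u20ac5</msg>', 'utf8');
const splitAt = doc.indexOf(0xe2) + 1;

const collector = createBodyCollector(doc.length);
const doneAfterFirst = collector.push(doc.subarray(0, splitAt));
const doneAfterSecond = collector.push(doc.subarray(splitAt));
const result = collector.end();

console.log(doneAfterFirst, doneAfterSecond, result);
```

With a real response you would call collector.push(chunk) from the 'data' handler and hand the result of collector.end() to the XML parser once push() returns true. A chunked response carries no Content-Length, so this sketch does not cover that case.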
