I doubt there's anything wrong with the XML, given the company I'm getting 
the data from, and I've already checked that DB2 (the parser) has no issue 
in this area. Well over 3 million a day are being parsed/inserted just 
fine.

I just don't think I'm putting the UTF8 back together correctly on some 
edge case, or there's some other bug somewhere above my pay grade.

Is there any npm package I could use other than the standard node.js http 
client that might already have this resolved?

On Friday, December 28, 2012 9:46:20 AM UTC-5, Ben Noordhuis wrote:
>
> On Fri, Dec 28, 2012 at 3:08 PM, am_p1 <[email protected]> wrote:
> > I'm using this for sure: 
> > response.setEncoding('utf8') 
> > 
> > but the problem is the chunks can be split more than once and with UTF8
> > strings there doesn't seem to be any character that indicates the buffer
> > was split. I read that JSON responses have the \n you can use but I don't
> > see that anywhere in the XML response I'm receiving.
> >
> > If node.js doesn't put these back together for me, then I need to figure
> > out what characters are at the end of the buffer to indicate a split chunk
> > so I can put them back together myself. Currently I'm looking for some
> > strings in the XML packet to indicate a complete or incomplete chunk but
> > again, it's not working 100%.
>
> You may be looking at two separate issues here. 
>
> 1. Partial character sequences.  When used as documented, 
> stream.setEncoding() takes care of that: if the data chunk ends in a 
> partial sequence, it's not emitted until the next chunk arrives. 
>
> For the curious, the relevant code is in lib/string_decoder.js. 
>
> 2. Partial XML documents.  node.js can't help you here, you somehow 
> need to track that yourself. 
>
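
On point 1 above, here is a small sketch of the behaviour described, using
the string_decoder module directly rather than an HTTP response. The sample
XML and the split position are made up for illustration; the idea is only to
show that a multi-byte character cut across two chunks is held back and then
emitted whole.

```javascript
const { StringDecoder } = require('string_decoder');

// Made-up sample: "é" is two bytes in UTF-8 (0xC3 0xA9).
const bytes = Buffer.from('<name>José</name>', 'utf8');

// Cut the buffer between the two bytes of "é" to simulate a bad chunk split.
const splitAt = bytes.indexOf(0xc3) + 1;
const chunk1 = bytes.subarray(0, splitAt);
const chunk2 = bytes.subarray(splitAt);

const decoder = new StringDecoder('utf8');
const part1 = decoder.write(chunk1);                 // lone 0xC3 is held back
const part2 = decoder.write(chunk2) + decoder.end(); // completed with 0xA9

console.log(JSON.stringify(part1)); // "<name>Jos"
console.log(JSON.stringify(part2)); // "é</name>"
console.log(part1 + part2 === '<name>José</name>'); // true
```

So if the strings still come out damaged with setEncoding('utf8') in place,
the split characters are probably not the cause.
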
> If the server sets a Content-Length header, it's easy: just xml += 
> data until Buffer.byteLength(xml) equals the content length.  Caveat 
> emptor: repeatedly calling Buffer.byteLength() is not very efficient 
> but don't worry about that until later.  Make it work first, then make 
> it work fast. 
>
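
A rough sketch of the Content-Length approach, as I understand it. The
helper name createBodyCollector and the sample document are hypothetical.
As a small variation on the suggestion above, it keeps a running byte count
per chunk instead of re-measuring the whole string each time. It assumes
setEncoding('utf8') is in use (so each chunk holds whole characters), that
the body is valid UTF-8, and that it is not compressed.

```javascript
// Hypothetical helper: collects decoded string chunks until their UTF-8
// byte length reaches the Content-Length value, then calls onComplete once.
function createBodyCollector(contentLength, onComplete) {
  let xml = '';
  let received = 0; // running byte count of everything seen so far
  let done = false;

  return function onData(chunk) {
    if (done) return;
    xml += chunk;
    received += Buffer.byteLength(chunk, 'utf8');
    if (received >= contentLength) {
      done = true;
      onComplete(xml);
    }
  };
}

// Simulated chunks of a made-up document ("€" is three bytes in UTF-8).
const body = '<quote symbol="€">1.25</quote>';
const results = [];
const onData = createBodyCollector(
  Buffer.byteLength(body, 'utf8'),
  (doc) => results.push(doc)
);

onData(body.slice(0, 7));
onData(body.slice(7, 20));
onData(body.slice(20));

console.log(results.length);      // 1
console.log(results[0] === body); // true
```

In a real handler the length would come from
parseInt(response.headers['content-length'], 10) and onData would be passed
to response.on('data', ...).
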
> If the response is sent using chunked encoding, you probably need to 
> parse it with a SAX parser first. 
>
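
For the chunked case, one alternative to a SAX parser (not what is suggested
above, just a sketch) is to buffer the raw chunks and decode once when the
response emits 'end'. This assumes one response carries exactly one XML
document and that it fits in memory. The helper name collectBody is
hypothetical, and a plain EventEmitter stands in for the HTTP response so
the example runs on its own.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: gathers Buffer chunks and decodes them in one go at
// 'end', so a character split across chunks is never decoded in halves.
// Note: do not call setEncoding() on the stream when using this.
function collectBody(readable, callback) {
  const chunks = [];
  readable.on('data', (chunk) => chunks.push(chunk));
  readable.on('error', (err) => callback(err));
  readable.on('end', () => {
    callback(null, Buffer.concat(chunks).toString('utf8'));
  });
}

// Stand-in for a response: emits two chunks that split "ï" (0xC3 0xAF).
const fakeResponse = new EventEmitter();
let collected = null;
collectBody(fakeResponse, (err, xml) => {
  if (err) throw err;
  collected = xml;
});

const raw = Buffer.from('<msg>naïve</msg>', 'utf8');
const cut = raw.indexOf(0xc3) + 1;
fakeResponse.emit('data', raw.subarray(0, cut));
fakeResponse.emit('data', raw.subarray(cut));
fakeResponse.emit('end');

console.log(collected === '<msg>naïve</msg>'); // true
```

With a real response, Node's HTTP parser undoes the chunked transfer
encoding itself, so the 'data' events only carry body bytes and 'end' marks
the end of the whole body.
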

-- 
You received this message because you are subscribed to the Google
Groups "nodejs" group.