The Content-Length is in bytes. Sounds like your client is sending a character count instead.
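
A rough sketch of the mismatch in Node, assuming a made-up JSON body (not your actual request) and a hypothetical headers object. U+2019 is one character but three bytes in UTF-8, so a character count comes up two bytes short for each one, which would fit the last couple of bytes going missing:

```javascript
// Hypothetical body containing U+2019 (right single quote); not the real request.
const body = JSON.stringify({ text: "it\u2019s" });

// String length counts UTF-16 code units, not encoded bytes.
const charCount = body.length;

// Buffer.byteLength reports how many bytes the UTF-8 encoding occupies.
const byteCount = Buffer.byteLength(body, "utf8");

// Hypothetical headers: Content-Length must carry the byte count.
const headers = {
  "Content-Type": "application/json",
  "Content-Length": byteCount,
};

console.log("characters:", charCount, "bytes:", byteCount);
```

If the client sends the character count, the server would stop reading that many bytes before the end of the body, so the closing } never reaches the JSON parser.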

On Wednesday, June 8, 2011, MK <[email protected]> wrote:
> On Wed, 8 Jun 2011 12:35:57 -0400
> Paul Davis <[email protected]> wrote:
>> On Wed, Jun 8, 2011 at 12:32 PM, MK <[email protected]> wrote:
>> > Is there any intention to fix couch's handling of "unusual" unicode
>> > characters? One of the "unusual" characters is the right single
>> > quote (226,128,153) which is a valid utf8 character and also not
>> > very "unusual" IMO.
>
>> What version of CouchDB are you using and what is an actual request
>> look like?
>
> 1.0.2 built a few weeks ago.
>
> I tried to replicate this simply using curl PUT and a copy of the
> request dumped from node, that works okay. Ie, yep, couch deals with
> the multi-byte, and it is in the stdout csv decimal dump.
>
> So I took the csv decimal dump from couch in debug mode, turned it back
> into bytes, and diff'd it with the request.
>
> The difference: the last couple of bytes are not in the couch csv dump,
> such as the closing }, which would make the json invalid. Otherwise it
> is identical to the curl request, which goes through.
>
> Watching the transfer on wireshark, however, couch does receive those
> last few bytes, so *it was not truncated by me or node*.
>
> Go figure.
>
>> A recent check on trunk shows both decoders handle your case fine:
>
> I have no idea what decoders you are referring to. Anyway, for
> posterity, here's the issue:
>
> - Client sends utf8 data to node.
> - Node passes data on to couch via http (Content-type is
> application/x-www-form-urlencoded, identical to that used by curl).
> - Couch rejects data with multi-byte character, csv decimal dump is
> missing bytes that were in the transmission.
>
> But even to me this sounds dubious, considering an identical request
> from curl is fine...all I can say is that what makes a difference is a
> switch with this in node:
>
> case "\u2019": rv += "’";
>
> That's the last thing I do before the PUT. If I leave the multi-byte
> in, there's an issue.
>
> MK
>
> --
> "Enthusiasm is not the enemy of the intellect." (said of Irving Howe)
> "The angel of history[...]is turned toward the past." (Walter Benjamin)
>

--
Mark J. Reed <[email protected]>
