[ 
https://issues.apache.org/jira/browse/COUCHDB-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573190#comment-13573190
 ] 

Paul Joseph Davis commented on COUCHDB-1670:
--------------------------------------------

Jason and Jens are right here, although I do find it a bit surprising that we 
actually have an issue here given how Erlang treats numbers. My only guess is 
that we have a guard for is_integer/1 instead of is_number/1, which would badarg 
on the parsed value (mochijson2 at least would parse that as a float).
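
For illustration only, here is a minimal Python sketch of the failure mode 
guessed at above. It is an analogy, not CouchDB code: parse_seq_strict and 
parse_seq_lenient are hypothetical helpers standing in for an is_integer/1 
guard and an is_number/1 guard, and Python's json module stands in for 
mochijson2.

```python
import json
import numbers

def parse_seq_strict(value):
    # Hypothetical analogue of an is_integer/1 guard: only integers pass.
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    raise TypeError(f"expected an integer, got {type(value).__name__}")

def parse_seq_lenient(value):
    # Hypothetical analogue of an is_number/1 guard: ints and floats pass.
    if isinstance(value, numbers.Real) and not isinstance(value, bool):
        return value
    raise TypeError(f"expected a number, got {type(value).__name__}")

# The two encodings from the issue description.
plain = json.loads('{"source_last_seq": 1234567}')["source_last_seq"]
sci = json.loads('{"source_last_seq": 1.234567e+06}')["source_last_seq"]

print(type(plain).__name__, type(sci).__name__)
print(parse_seq_lenient(sci) == parse_seq_strict(plain))
```

The scientific-notation form decodes to a float, so the strict check rejects 
it even though it compares equal to the integer form.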

A couple of minor comments on the discussion:

[~snej] has it right in that we can't expect that JSON will roundtrip 
byte-for-byte when we have an intermediary translation into an Erlang 
representation. We already rely on that fact so that we can tell people to 
sshh when we mutate number representations.
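
A small sketch of the same effect, using Python's json module in place of an 
Erlang decoder/encoder pair (an assumption made purely for illustration; the 
roundtrip helper is hypothetical). Decoding into native values and encoding 
again preserves the numeric value but not necessarily the bytes.

```python
import json

def roundtrip(text):
    # Decode JSON text into native values, then encode it again.
    return json.dumps(json.loads(text))

# Different spellings of JSON numbers, including the one from the issue.
samples = ["1234567", "1.234567e+06", "1.0", "1.10", "1E2"]

for text in samples:
    out = roundtrip(text)
    marker = "same bytes" if out == text else "different bytes"
    print(f"{text!r} -> {out!r} ({marker})")
```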

[~jhs] is kinda right, but it's not just a series of numerals, though it's not 
much more than "looks like a valid number". While the encoding differences 
aren't quite at the level of whitespace differences, they are definitely within 
what we should tolerate, especially considering what we're using 
them for.

I also have no idea what [~jhs] is talking about with whitespace in the key. If 
there's truth to that then it sounds like a bug and not "merely" a JSON 
encoding difference.

[~jhs] is also quoting Postel's law, which is a crock, and I have spent much time 
trying to quash the influence of that terrible idea in the project. The number 
of times I've gotten pissed trying to remember if it's descending=true or 
reverse=true and checking if I have typos is annoyingly non-zero.

[~wohali] is also right in the generic sense that "since" (hehehe) should not be 
restricted to a numerical value, and if we didn't have what appear to be latent 
bugs based on that assumption, this probably wouldn't even be an issue.

And if y'all want to spend more time on this, start investigating 
round-tripping the value 1.1 through a JSON decoder/encoder pair. I'll be here 
with the tissues when you get to asserting 56-bit rounding precisions with the 
GNU libc strtod assumptions.
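
As a hedged starting point for that investigation, the sketch below uses 
Python's json and decimal modules (chosen for illustration, not because 
CouchDB uses them). It shows that the double decoded from "1.1" is not exactly 
1.1, so whether the text survives a round trip depends on how many digits the 
encoder chooses to print.

```python
import json
from decimal import Decimal

decoded = json.loads("1.1")       # nearest binary double to 1.1
exact = Decimal(decoded)          # exact value the double actually holds

print(exact)                      # many more digits than "1.1"
print(exact == Decimal("1.1"))    # the stored value is not exactly 1.1

# Python prints the shortest digit string that reads back to the same
# double, so the text happens to survive here.
print(json.dumps(decoded))

# A fixed-precision formatter exposes the representation error instead.
print(f"{decoded:.20f}")
```

An encoder that prints a fixed number of digits, or a decoder that rounds 
differently, would produce different text for the same document.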
                
> Replicator crashes if numbers in checkpoint docs are expressed in scientific 
> notation
> -------------------------------------------------------------------------------------
>
>                 Key: COUCHDB-1670
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1670
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Replication
>            Reporter: Jens Alfke
>
> The CouchDB 1.2 replicator process crashes with an Erlang exception when 
> parsing a checkpoint document read back from a remote database, if numbers in 
> the document were JSON-encoded in scientific notation instead of as integers. 
> This includes the properties source_last_seq, end_last_seq, start_last_seq.
> That is, the following encoding works fine:
>     ..., "source_last_seq": 1234567, ...
> whereas this completely-equivalent encoding causes an exception:
>     ..., "source_last_seq": 1.234567e+06, ...
> This issue raised its head as a result of a CouchDB-compatible engine I'm 
> writing (the Couchbase Sync Gateway) which can serve as a passive replication 
> endpoint. It's implemented in Go, and the Go JSON package has the side effect 
> of (a) parsing all JSON numbers into type 'double', and (b) encoding all 
> doubles into JSON using scientific notation if they're more than six digits 
> long. The net effect is that when CouchDB stores a checkpoint into the Sync 
> Adapter's database and then later reads it back, it barfs due to the 
> scientific notation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
