[issue33143] encode UTF-16 generates unexpected results
Anders Rundgren added the comment: Thanx for the superquick response! I really appreciate it. I'm obviously a Python n00b Anders -- ___ Python tracker <https://bugs.python.org/issue33143> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33143] encode UTF-16 generates unexpected results
New submission from Anders Rundgren:

Python 3.5.1 (v3.5.1:37a07cee5969, Dec 6 2015, 01:54:25) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> v = '\u20ac'
>>> print (v)
€
>>> v.encode('utf-16')
b'\xff\xfe\xac '
>>> v.encode('utf-16_be')
b' \xac'

I had expected to get a pair of bytes with 20 AC for the € symbol.

--
components: Unicode
messages: 314443
nosy: anders.rundgren@gmail.com, ezio.melotti, vstinner
priority: normal
severity: normal
status: open
title: encode UTF-16 generates unexpected results
type: behavior
versions: Python 3.5

Python tracker <https://bugs.python.org/issue33143>
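For readers puzzled by the same output: the bytes shown in the session are 20 AC after all. The repr() of a bytes object prints printable ASCII values as characters, so 0x20 appears as a space. A minimal sketch (the variable names are only illustrative) that makes the values visible:

```python
euro = '\u20ac'

big_endian = euro.encode('utf-16-be')
little_endian = euro.encode('utf-16-le')

# repr() of bytes shows printable ASCII as characters, so 0x20 prints as ' '.
print(big_endian)           # b' \xac'

# hex() and list() show the numeric byte values instead.
print(big_endian.hex())     # 20ac
print(little_endian.hex())  # ac20
print(list(big_endian))     # [32, 172]
```

Plain 'utf-16' additionally prepends a byte order mark and uses the platform's native byte order, which is why the first result in the session starts with \xff\xfe.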
[issue26229] Make number serialization ES6/V8 compatible
Anders Rundgren added the comment:

In ES6/V8-compatible implementations, which include "node.js", Chrome, Firefox, Safari and (of course) my Java reference implementation, you can take a cryptographic hash of a JSON object with a predictable result. That is, this request is in no way limited to JCS.

Other solutions to this problem have been to create something like XML's canonicalization, which is much more complex.

The JSON RFC is still valid, it just isn't very useful for people who are interested in security solutions. The predictable property order introduced in ES6 makes a huge difference! Now it is just the number thing left...

The other alternative is dressing your JSON objects in Base64 to maintain a predictable signature, like in IETF's JOSE. I doubt that this is going to be mainstream except for OpenID/OAuth, which JOSE stems from.
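To make the two alternatives concrete, here is a rough sketch using only the standard library. The helper names compact() and digest(), the sample object and the choice of SHA-256 are assumptions for illustration, not part of JCS or JOSE. It hashes a re-serialized object whose property order is preserved, and then shows the Base64 route, which sidesteps serialization by treating the text as opaque.

```python
import base64
import hashlib
import json
from collections import OrderedDict

def compact(text):
    # Parse while keeping the original property order, then re-serialize
    # without insignificant whitespace.
    obj = json.loads(text, object_pairs_hook=OrderedDict)
    return json.dumps(obj, separators=(',', ':'), ensure_ascii=False)

def digest(text):
    # Hash of the re-serialized form (hypothetical helper).
    return hashlib.sha256(compact(text).encode('utf-8')).hexdigest()

a = '{"currency": "USD", "amount": 8600550}'
b = '{ "currency":"USD",\n  "amount":8600550 }'
same_digest = digest(a) == digest(b)   # whitespace does not matter

# The JOSE-style alternative: sign the Base64url text instead of the object.
wrapped = base64.urlsafe_b64encode(a.encode('utf-8')).rstrip(b'=').decode('ascii')
```

This only works as long as numbers serialize identically on every platform, which is the remaining problem this issue is about.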
[issue26229] Make number serialization ES6/V8 compatible
Anders Rundgren added the comment:

An easier fix than mucking around in the pretty complex number serializer code would be adding an "ES6Format" option to the "json.dump*" methods, which could use the supplied conversion code as is.

For JSON parsing in an ES6-compatible way you must use an "OrderedDict" hook option anyway to get the right (=original) property order.
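The "OrderedDict" hook mentioned above is the object_pairs_hook argument of json.loads. A small sketch with a made-up input string:

```python
import json
from collections import OrderedDict

text = '{"z": 1, "a": 2, "m": 3}'

# object_pairs_hook receives the properties in document order.
obj = json.loads(text, object_pairs_hook=OrderedDict)
keys = list(obj)

# Serializing the OrderedDict writes the properties back in the same order.
round_trip = json.dumps(obj, separators=(',', ':'))
print(keys)
print(round_trip)
```

The proposed "ES6Format" option does not exist in the json module; it is only the suggestion made in this comment.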
[issue26241] repr() and str() are identical for floats in 3.5
Anders Rundgren added the comment:

Apparently the docs have changed since 2.7: https://docs.python.org/3.5/tutorial/floatingpoint.html

However, the documentation still "sort of" mentions repr() as the most accurate form which isn't entirely correct since it nowadays is identical to str() for floats.

No big deal, I just thought I was doing something wrong :-)

related: http://bugs.python.org/issue26229
[issue26229] Make number serialization ES6/V8 compatible
Anders Rundgren added the comment:

As I said, the problem is close to fixed in 3.5.

You should not consider the JCS specification as the [sole] target but the ability to create a normalized JSON object, which has many uses including calculating a hash of such objects.

##
# Convert a Python double/float into an ES6/V8 compatible string #
##
def convert2Es6Format(value):
    # Convert double/float to str using the native Python formatter
    pyDouble = str(value)
    pySign = ''
    if pyDouble.find('-') == 0:
        #
        # Save sign separately, it doesn't have any role in the rest
        #
        pySign = '-'
        pyDouble = pyDouble[1:]
    pyExpStr = ''
    pyExpVal = 0
    q = pyDouble.find('e')
    if q > 0:
        #
        # Grab the exponent and remove it from the number
        #
        pyExpStr = pyDouble[q:]
        if pyExpStr[2:3] == '0':
            #
            # Suppress leading zero on exponents
            #
            pyExpStr = pyExpStr[0:2] + pyExpStr[3:]
        pyDouble = pyDouble[0:q]
        pyExpVal = int(pyExpStr[1:])
    #
    # Split number in pyFirst + pyDot + pyLast
    #
    pyFirst = pyDouble
    pyDot = ''
    pyLast = ''
    q = pyDouble.find('.')
    if q > 0:
        pyDot = '.'
        pyFirst = pyDouble[0:q]
        pyLast = pyDouble[q + 1:]
    #
    # Now the string is split into: pySign + pyFirst + pyDot + pyLast + pyExpStr
    #
    if pyLast == '0':
        #
        # Always remove trailing .0
        #
        pyDot = ''
        pyLast = ''
    if pyExpVal > 0 and pyExpVal < 21:
        #
        # Integers are shown as is with up to 21 digits
        #
        pyFirst += pyLast
        pyLast = ''
        pyDot = ''
        pyExpStr = ''
        q = pyExpVal - len(pyFirst)
        while q >= 0:
            q -= 1
            pyFirst += '0'
    elif pyExpVal < 0 and pyExpVal > -7:
        #
        # Small numbers are shown as 0.etc with e-6 as lower limit
        #
        pyLast = pyFirst + pyLast
        pyFirst = '0'
        pyDot = '.'
        pyExpStr = ''
        q = pyExpVal
        while q < -1:
            q += 1
            pyLast = '0' + pyLast
    #
    # The resulting sub-strings are concatenated
    #
    return pySign + pyFirst + pyDot + pyLast + pyExpStr
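For comparison, the same formatting rules can be sketched from the digit/exponent form of the number rather than by string surgery. This is a different technique from the converter above, not a replacement proposed in this issue: it takes the shortest round-trip digits from repr(), splits them with decimal.Decimal, and applies the ES6 Number-to-string cases (plain integers up to 21 digits, plain fractions down to 1e-6, exponent form otherwise). The name es6_number_sketch is hypothetical, and the sketch assumes repr() and V8 agree on the digits, which the 100 million value test reported here suggests.

```python
import math
from decimal import Decimal

def es6_number_sketch(value):
    x = float(value)
    if not math.isfinite(x):
        raise ValueError('NaN and Infinity have no JSON representation')
    if x == 0:
        return '0'  # ES6 prints both 0 and -0 as "0"
    # repr() yields the shortest digits that round-trip to the same double.
    sign, digit_tuple, exponent = Decimal(repr(x)).as_tuple()
    digits = ''.join(str(d) for d in digit_tuple)
    stripped = digits.rstrip('0')
    exponent += len(digits) - len(stripped)
    k = len(stripped)   # number of significant digits
    n = k + exponent    # position of the decimal point
    if k <= n <= 21:
        body = stripped + '0' * (n - k)              # integer, zero padded
    elif 0 < n <= 21:
        body = stripped[:n] + '.' + stripped[n:]     # point inside the digits
    elif -6 < n <= 0:
        body = '0.' + '0' * (-n) + stripped          # small fraction
    else:
        e = n - 1
        mantissa = stripped[0] + ('.' + stripped[1:] if k > 1 else '')
        body = mantissa + 'e' + ('+' if e > 0 else '-') + str(abs(e))
    return ('-' if sign else '') + body

print(es6_number_sketch(10.0))    # 10
print(es6_number_sketch(5e-05))   # 0.00005
print(es6_number_sketch(5e-07))   # 5e-7
```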
[issue26241] repr() and str() are identical for floats in 3.5
New submission from Anders Rundgren:

According to the documentation, repr() and str() are different when it comes to number formatting. A test with 100 million random and selected IEEE 64-bit values returned no differences.

--
components: Interpreter Core
messages: 259244
nosy: anders.rundgren@gmail.com
priority: normal
severity: normal
status: open
title: repr() and str() are identical for floats in 3.5
versions: Python 3.5

Python tracker <http://bugs.python.org/issue26241>
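A scaled-down version of such a test might look like the sketch below. It is not the program used for the report: the seed, the sample count and the use of random 64-bit patterns are assumptions made to keep it short.

```python
import random
import struct

rng = random.Random(26241)   # fixed seed so the run is repeatable
mismatches = 0

for _ in range(100000):
    # Reinterpret 64 random bits as an IEEE 754 double (may be NaN or inf).
    bits = rng.getrandbits(64)
    value = struct.unpack('>d', struct.pack('>Q', bits))[0]
    if repr(value) != str(value):
        mismatches += 1

print(mismatches)
```

On Python 3.5 this prints 0, consistent with the observation in this report.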
[issue26229] Make number serialization ES6/V8 compatible
New submission from Anders Rundgren:

ECMA has in their latest release defined that JSON elements must be ordered during serialization. This is easy to accomplish using Python's OrderedDict. What is less trivial is that numbers have to be formatted in a certain way as well. I have tested 100 million specific and random values and found out that Python 3.5.1 is mathematically identical to ES6/V8 but has some differences in formatting:

IEEE Double        ECMAScript 6/V8    Python 3.5.1

c43211ede4974a35,  -0,                -3.333e+20
c3fce97ca0f21056,  -6000,             -3.3336e+19
c3c7213080c1a6ac,  -3334000,          -3.334e+18
c39280f39a348556,  -333400,           -3.334e+17
c35d9b1f5d20d557,  -33340,            -3.334e+16
c327af4c4a80aaac,  -3334,             -3334.0
bf0179ec9cbd821e,  -0.5,              -3.3335e-05
becbf647612f3696,  -0.03,             -3.e-06
4024,              10,                10.0
,                  0,                 0.0
4014,              5,                 5.0
3f0a36e2eb1c432d,  0.5,               5e-05
3ed4f8b588e368f1,  0.05,              5e-06
3ea0c6f7a0b5ed8d,  5e-7,              5e-07

Why could this be important?
https://github.com/Microsoft/ChakraCore/issues/149

# Python test program
import binascii
import struct
import json

f = open('c:\\es6\\numbers\\es6testfile100m.txt','rb')

l = 0
string = ''

while True:
    byte = f.read(1)
    if len(byte) == 0:
        exit(0)
    if byte == b'\n':
        # A complete "hex,es6string" line has been read
        l = l + 1
        i = string.find(',')
        if i <= 0 or i >= len(string) - 1:
            print('Bad string: ' + str(i))
            exit(0)
        hex = string[:i]
        while len(hex) < 16:
            hex = '0' + hex
        o = dict()
        o['double'] = struct.unpack('>d',binascii.a2b_hex(hex))[0]
        # Strip the leading '{"double": ' and the trailing '}'
        py3Double = json.dumps(o)[11:-1]
        es6Double = string[i + 1:]
        if es6Double != py3Double:
            es6Dpos = es6Double.find('.')
            py3Dpos = py3Double.find('.')
            es6Epos = es6Double.find('e')
            py3Epos = py3Double.find('e')
            if py3Epos > 0:
                py3Exp = int(py3Double[py3Epos + 1:])
            if es6Dpos < 0 and py3Dpos > 0:
                if es6Epos < 0 and py3Epos > 0:
                    py3New = py3Double[:py3Dpos] + py3Double[py3Dpos + 1:py3Epos - len(py3Double)]
                    q = py3Exp - py3Epos + py3Dpos
                    while q >= 0:
                        py3New += '0'
                        q -= 1
                    if py3New != es6Double:
                        print('E1: ' + py3New)
                        exit(0)
                elif py3Epos < 0:
                    py3New = py3Double[:-2]
                    if py3New != es6Double:
                        print('E2: ' + py3New)
                        exit(0)
                else:
                    print('Error: ' + hex + '#' + es6Double + '#' + py3Double)
                    exit(0)
            elif es6Dpos > 0 and py3Dpos > 0 and py3Epos > 0 and es6Epos < 0:
                py3New = py3Double[py3Dpos - 1:py3Dpos] + py3Double[py3Dpos + 1:py3Epos - len(py3Double)]
                q = py3Exp + 1
                while q < 0:
                    q += 1
                    py3New = '0' + py3New
                py3New = py3Double[0:py3Dpos - 1] + '0.' + py3New
                if py3New != es6Double:
                    print('E3: ' + py3New + '#' + es6Double)
                    exit(0)
            elif es6Dpos == py3Dpos and py3Epos > 0 and es6Epos > 0:
                py3New = py3Double[:py3Epos + 2] + str(abs(py3Exp))
                if py3New != es6Double:
                    print('E4: ' + py3New + '#' + es6Double)
                    exit(0)
            elif es6Dpos > 0 and py3Dpos < 0 and py3Epos > 0 and es6Epos < 0:
                py3New = py3Double[:py3Epos - len(py3Double)]
                q = py3Exp + 1
                while q < 0:
                    q += 1
                    py3New = '0' + py3New
                py3New = '0.' + py3New
                if py3New != es6Double:
                    print('E5: ' + py3New + '#' + es6Double)
                    exit(0)
            else:
                print('Unexpected: ' + hex + '#' + es6Double + '#' + py3Double)
                exit(0)
        string = ''
        if l % 1 == 0:
            print(l)
    else:
        string += byte.decode(encoding='UTF-8')

--
components: Interpreter Core
messages: 259105
nosy: anders.rundgren@gmail.com
priority: normal
severity: normal
status: open
title: Make number serialization ES6/V8 compatible
type: enhancement
versions: Python 3.5

Python tracker <http://bugs.python.org/issue26229>
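The central step of the test program, turning a 16-digit hex string into a double and asking the json module how it would serialize it, can be shown in isolation. The helper name double_from_hex and the fully padded hex strings are illustrative assumptions; bytes.fromhex is used here in place of binascii.a2b_hex.

```python
import json
import struct

def double_from_hex(hex_digits):
    # Interpret 16 hex digits as a big-endian IEEE 754 double.
    return struct.unpack('>d', bytes.fromhex(hex_digits))[0]

ten = double_from_hex('4024000000000000')
five = double_from_hex('4014000000000000')

# json.dumps on a bare float gives the same text as inside an object,
# without slicing off the surrounding '{"double": ' and '}'.
print(json.dumps(ten))     # 10.0
print(json.dumps(five))    # 5.0
print(json.dumps(5e-05))   # 5e-05
```

ES6/V8 prints these three values as 10, 5 and 0.00005, which is the formatting difference the table illustrates.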
[issue26191] pip on Windows doesn't honor Case
Anders Rundgren added the comment:

You are right. Pardon me for erring :-(

Thanks for the quick response BTW!

Anders
[issue26191] pip on Windows doesn't honor Case
New submission from Anders Rundgren:

pip install Crypto

Terminates correctly and the package is there as well. Unfortunately the directory is named "crypto" rather than "Crypto", so when I perform

>>>import Crypto

the interpreter fails.

>>>import crypto

seems to work but is incompatible across platforms.

Whether this is a problem with pycrypto or pip is beyond my knowledge of Python.

--
components: Installation
messages: 258887
nosy: anders.rundgren@gmail.com
priority: normal
severity: normal
status: open
title: pip on Windows doesn't honor Case
type: compile error
versions: Python 3.5

Python tracker <http://bugs.python.org/issue26191>
[issue23123] Only READ support for Decimal in json
Anders Rundgren added the comment:

> Antoine Pitrou added the comment:
>
> I won't claim to know/understand the specifics, but "message payload in
> base64" actually sounds reasonable to me, if far from optimal (both from
> readability and space overhead POV) :-).

It is indeed a working solution. I do though think that communities that previously used XML would accept base64-encoded messages. It becomes really messy when applied to counter-signed messages like the following:

{
  "@context": "http://xmlns.webpki.org/wcpp-payment-demo",
  "@qualifier": "AuthData",
  "paymentRequest": {
    "commonName": "Demo Merchant",
    "amount": 8600550,
    "currency": "USD",
    "referenceId": "#100",
    "dateTime": "2014-12-18T13:39:35Z",
    "signature": {
      "algorithm": "RS256",
      "signerCertificate": {
        "issuer": "CN=Merchant Network Sub CA5,C=DE",
        "serialNumber": "1413983542582",
        "subject": "CN=Demo Merchant,2.5.4.5=#1306383936333235,C=DE"
      },
      "certificatePath": [
        "MIIDQzCCAiugAwIBAgIGAUk3_J02M...eMGlY734U3NasQfAhTUhxrdDbphEvsWTc",
        "MIIEPzCCAiegAwIBAgIBBTANBgkqh...gU1IyRGA7IbdHOeDB2RUpsXloU2QKfLrk"
      ],
      "value": "Ny4Qe6FQhd5_qcSc3xiH8Kt7tIZ9Z...9LEjC6_Rulg_G20fGxJ-wzezFpsAGbmuFQg"
    }
  },
  "domainName": "merchant.com",
  "cardType": "SuperCard",
  "pan": "1618342124919252",
  "dateTime": "2014-12-18T13:40:59Z",
  "signature": {
    "algorithm": "RS256",
    "signerCertificate": {
      "issuer": "CN=Mybank Client Root CA1,C=US",
      "serialNumber": "1413983550045",
      "subject": "CN=The Cardholder,2.5.4.5=#13083935363733353232"
    },
    "certificatePath": ["MIIENzCCAh-gAwIBAgIGAUk3_LpdM...IGcN1md5feo5DndNnV8D0UM-oBRkUDDFiWlhCU"],
    "value": "wyUcFcHmvM5ZozZKOEwOQkYic0D7M...S_HbaPGau5KfZjCaksvb5U1lYZaXxP8kAbuGPQ"
  }
}
[issue23123] Only READ support for Decimal in json
Anders Rundgren added the comment:

> Antoine Pitrou added the comment:
>
> "To cope with this potential problem, compliant parsers must preserve the
> original textual representation of properties internally in order to support
> JCS normalization requirements"
>
> That sounds ridiculous. Did someone try to reason with the "IETF guys"? :)

The alternative is either doing what Bob suggested, which is almost the same as writing a new parser, or taking the IETF route and shrouding the message payload in base64.

So all solutions are "by definition" bad :-)

FWIW my super-bad solution has the following compatibility issues:

- Whitespace: None, all parsers can "stringify", right?
- Escaping: None, all parsers MUST do it to follow the JSON spec.
- Property order: A problem in some parsers. If you take a look on stackoverflow, lots of folks request that insertion/reader order should be honored since computers <> humans. Fixed in Python. Works in browsers as well.
- Floating point: an almost useless JSON feature anyway, it doesn't work for crypto-numbers or money. It is "only" a validation problem though. Now fixed in Python.

http://www.ietf.org/mail-archive/web/acme/current/msg00200.html
[issue23123] Only READ support for Decimal in json
Anders Rundgren added the comment:

Ethan Furman added the comment:

> I am not a regular json user, but my impression is the format is
> pretty basic, and we would be overloading it to try and keep numbers
> with three decimal places as Decimal, and anything else as float.
> Isn't json's main purpose to support data exchange between different
> programs of different languages? Not between different Python
> programs?

Right, unfortunately the need to support non-native data types like big decimals, dates and blobs has led to a certain amount of confusion and innovation among JSON tool designers.

I (FWIW) do actually NOT want to extend a single bit from the RFC, I just want serializing to be "non-invasive".

If the parse_float option stays "as is", it seems that both the people who want big (non-standard) numbers and I, who want somewhat non-standard serialization, would be happy. I.e. a documentation snippet would be sufficient as far as I can tell.

Serialization order of objects is apparently a hot topic
https://code.google.com/p/v8/issues/detail?id=164
but Python has no problem with that.
[issue23123] Only READ support for Decimal in json
Anders Rundgren added the comment:

The current JCS validator is only 150 lines and does both RSA and EC signatures:
https://code.google.com/p/openkeystore/source/browse/python/trunk/src/org/webpki/json/JCSValidator.py

My Java-version is much more advanced but this is quite useful anyway.
[issue23123] Only READ support for Decimal in json
Anders Rundgren added the comment:

Bob, I'm not sure I understand why you say that JCS requires *almost* full normalization. Using browsers you can generate fully compliant JCS objects using like 20 lines of javascript/webcrypto (here excluding base64 support). No normalization step is needed.

But sure, the IETF JOSE WG has taken an entirely different approach and requires JSON objects to be serialized and Base64-encoded. Then the Base64 is signed. Boring. And in conflict with complex messaging like:
https://openkeystore.googlecode.com/svn/wcpp-payment-demo/trunk/docs/messages.html#UserAuthorizesTransaction

Thanx anyway, I'm pretty happy with how it works now!

Well, if Decimal didn't manipulate its argument I would be even happier :-) because then there wouldn't even be a hack.
[issue23123] Only READ support for Decimal in json
Anders Rundgren added the comment:

Using simplejson I got it to work!!!
I just wonder what you think of the solution:

import collections
import simplejson as json
from decimal import Decimal

class EnhancedDecimal(Decimal):
    def __str__ (self):
        return self.saved_string

    def __new__(cls, value="0", context=None):
        obj = Decimal.__new__(cls,value,context)
        obj.saved_string = value
        return obj

jsonString = '{"t":6,"h":4.50, "g":"text","j":1.40e450}'
jsonObject = json.loads(jsonString, object_pairs_hook=collections.OrderedDict,parse_float=EnhancedDecimal)
for item in jsonObject:
    print(jsonObject[item])
print(json.dumps(jsonObject))

6
4.50
text
1.40e450
{"t": 6, "h": 4.50, "g": "text", "j": 1.40e450}
[issue23123] Only READ support for Decimal in json
Anders Rundgren added the comment:

Well, I could have insisted on canonicalization of floating-point data but that's so awkward that outlawing such data is a cleaner approach. Since the target for JCS is security- and payment-protocols, I don't think the absence of floating-point support will be a show-stopper. It does though make the IETF folks unhappy.

Another reason for still wanting it to work as currently specified is that it would be nice to have JCS running on three fully compatible platforms, including one which I haven't designed :-)
[issue23123] Only READ support for Decimal in json
Anders Rundgren added the comment:

Bob,
You're right, I have put up a requirement for JSON serializing that may be "over the top".

OTOH, there are (AFAICT...) only two possible solutions:
1. Outlaw floating point data from the plot
2. Insist that serializers conform to the spec

As a pragmatist I have settled on something in between :-)
https://openkeystore.googlecode.com/svn/resources/trunk/docs/jcs.html#Interoperability

I don't think that the overhead in Decimal would be a problem but I'm not a Python platform maintainer so I leave it to you guys.
[issue23123] Only READ support for Decimal in json
Anders Rundgren added the comment:

I guess my particular requirement/wish is unusual (keeping the original textual representation of a floating point number intact) while using Decimal should be fairly universal.

If these things could be combined in a Decimal support option I would (of course) be extremely happy. They do not appear to be in conflict.

Currently I'm a bit bogged down by the crypto-stuff since it is spread over different and incompatible modules, which makes it awkward to create a nice unified RSA/EC solution. I may end up writing a wrapper...
[issue23123] Only READ support for Decimal in json
Anders Rundgren added the comment:

It would be great if I could use a sub-classed Decimal during parsing, but since there doesn't appear to be a way to serialize the result using the "json" package I'm probably stuck with the current "99%" solution.

I have solved this in Java and JavaScript by writing my own JSON stuff
http://webpki.org/papers/keygen2/doc/org/webpki/json/package-summary.html
but that method obviously doesn't scale and I'm a real n00b when it comes to Python although it was more fun than I had expected :-)

A minor patch addressing serialization of Decimal would probably do fine (after sub-classing) and would be generally useful.
[issue23123] Only READ support for Decimal in json
Anders Rundgren added the comment:

I was actually hoping to implement the final part of this:
https://openkeystore.googlecode.com/svn/resources/trunk/docs/jcs.html#Normalization_and_Signature_Validation

It seems that the current Decimal implementation wouldn't save me anyway since it modifies the input :-(

Anyway, floats in JSON have rather little use so maybe my existing Python (PoC) solution will be "good enough":
https://code.google.com/p/openkeystore/source/browse/python/trunk/src/org/webpki/json/JCSValidator.py
[issue23123] Only READ support for Decimal in json
New submission from Anders Rundgren:

import collections
import json
from decimal import Decimal

jsonString = '{"t":6,"h":4.50, "g":"text","j":1.40e450}'
jsonObject = json.loads(jsonString, object_pairs_hook=collections.OrderedDict,parse_float=Decimal)
for item in jsonObject:
    print(jsonObject[item])

6
4.50
text
1.40E+450

Works as expected.

However, as far as I can tell there is no way to get back to the original JSON string, since you have to convert Decimal to str in cls when using json.dumps, which adds "" around the values.

--
components: Extension Modules
messages: 233139
nosy: anders.rundgren@gmail.com
priority: normal
severity: normal
status: open
title: Only READ support for Decimal in json
type: behavior
versions: Python 2.7

Python tracker <http://bugs.python.org/issue23123>
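The quoting problem described above can be reproduced with the standard library alone. The sketch below is written for Python 3 and uses the default=str hook as a stand-in for the "convert Decimal to str" step; the exact mechanism used in the report (a custom cls) is not shown in the source, so this is an assumption about an equivalent route.

```python
import json
from collections import OrderedDict
from decimal import Decimal

json_string = '{"t":6,"h":4.50, "g":"text","j":1.40e450}'
obj = json.loads(json_string, object_pairs_hook=OrderedDict, parse_float=Decimal)

# The json module has no built-in encoding for Decimal.
try:
    json.dumps(obj)
    plain_dump_failed = False
except TypeError:
    plain_dump_failed = True

# Converting each Decimal to str makes it serializable, but as a JSON string.
quoted = json.dumps(obj, default=str)
print(quoted)   # {"t": 6, "h": "4.50", "g": "text", "j": "1.40E+450"}
```

Note that "j" also comes back as 1.40E+450 rather than the original 1.40e450, because Decimal normalizes the exponent notation of its input.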