Two more things I'm struggling to understand in PCF are the STRINGS and
INTEGERS rules:
- [STRINGS] For all JSON string literals in the schema text, replace any
escaped characters (e.g., \uXXXX escapes) with their UTF-8 equivalents.
- [INTEGERS] Eliminate quotes around and any leading zeros in front of
JSON integer literals (which appear in the size attributes of fixed
schemas).
These are clear enough on their faces, but I can't come up with a valid
test case for either one.
For strings, once you apply the [STRIP] rule, there don't seem to be any
parts left that could contain a Unicode escape. Names, for example, have a
very limited set of characters they can contain.
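A minimal sketch of what the JSON layer alone does with an escape, using only
Python's standard json module (no Avro calls). The schema text below is made
up for illustration, and whether an Avro implementation treats it any
differently is not checked here:

```python
import json

# Hypothetical schema text: the name is written with a \u0041 escape,
# which encodes the plain ASCII letter "A".
schema_text = r'{"type": "record", "name": "\u0041bc", "fields": []}'

# The raw text still contains the six-character escape sequence...
has_escape_in_text = "\\u0041" in schema_text

# ...but the parsed value holds the plain character, so anything working
# on the parsed structure never sees the escape.
parsed = json.loads(schema_text)
name = parsed["name"]
print(has_escape_in_text, name)
```

So an escape can spell a character that is allowed in a name, but it is gone
as soon as the text has been through a JSON parser.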
For integers, the allowed field types are literal numbers anyway, so I
don't see how they could have quotes around them, and I'd expect every
language's JSON implementation to remove leading zeros before Avro ever
sees them.
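A quick check of that expectation for Python, again using only the standard
json module. The helper parse_size and the schema texts are hypothetical, and
this says nothing about how the Avro library itself validates the size
attribute:

```python
import json

def parse_size(text):
    """Return the parsed "size" value, or None if the JSON text is rejected."""
    try:
        return json.loads(text)["size"]
    except json.JSONDecodeError:
        return None

# Hypothetical fixed schemas that differ only in how size is written.
plain = parse_size('{"type": "fixed", "name": "F", "size": 16}')
leading_zero = parse_size('{"type": "fixed", "name": "F", "size": 016}')
quoted = parse_size('{"type": "fixed", "name": "F", "size": "16"}')

print(repr(plain), repr(leading_zero), repr(quoted))
```

With Python's parser the leading-zero form is rejected as invalid JSON rather
than normalized, and the quoted form comes back as a string rather than an
integer.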
Can someone help me figure out how to test the Python implementation of PCF
with valid schema test cases for these rules?