Hi folks,

I was having a problem importing JSON data with COPY. Lots of tools export data nicely as one JSON blob per line, which is excellent for importing directly into a JSON/JSONB column for analysis... except when there's an embedded double quote, or anything else that's escaped. COPY handles the escaping, but by the time the escaped character hits the JSON parser it's no longer escaped, which breaks the JSON parsing. That means I need to manipulate the input data to double-escape it; see bug #12320 for an example. Yuck.

I propose this small patch, which simply allows specifying COPY ... ESCAPE without requiring the CSV parser. It will make it much easier for folks to use JSON-formatted export data directly going forward. This seemed like the simplest route.

Usage is simply:

postgres=# copy t1 from '/Users/nok/Desktop/queries.json';
ERROR:  invalid input syntax for type json
DETAIL:  Token "root" is invalid.
CONTEXT:  JSON data, line 1: ...1418066241619 AND <=1418671041621) AND user:"root...
COPY t1, line 3, column bleh: "{"timestamp":"2014-12-15T19:17:32.505Z","duration":7.947,"query":{"query":{"filtered":{"filter":{"qu..."

postgres=# copy t1 from '/Users/nok/Desktop/queries.json' escape '';
COPY 1966

I've included regression tests, and all existing tests pass. This is my first contribution, so be kind to me. :)
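For context, the double-escaping workaround I want to avoid looks roughly like the sketch below: COPY's text format consumes one level of backslash escaping, so every backslash in the raw JSON has to be doubled before import. (This is my own illustration; the function name is made up, and a full solution would also need to handle literal tabs and newlines inside the data.)

```python
def escape_for_copy(line: str) -> str:
    # COPY text format strips one level of backslash escaping on input,
    # so double each backslash to preserve the original JSON verbatim.
    # NOTE: a complete pre-processor would also escape literal tab and
    # newline characters, which COPY treats as delimiters.
    return line.replace("\\", "\\\\")

# A JSON line with an embedded (escaped) double quote:
raw = '{"user":"\\"root\\""}'   # the file contains: {"user":"\"root\""}
print(escape_for_copy(raw))     # what COPY needs: {"user":"\\"root\\""}
```

With the proposed `escape ''` option, this pre-processing pass becomes unnecessary.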
escape-without-csv.patch
Description: Binary data