On 02/21/16 23:10, Tom Lane wrote:

> Another variable is that your answers might depend on what format you
> assume the client is trying to convert from/to.  (It's presumably not
> text JSON, but then what is it?)

This connects tangentially to a question I've been meaning to ask for
a while, since I was looking at the representation of XML.

As far as I can tell, XML is simply stored in its character serialized
representation (very likely compressed, if large enough to TOAST), and
the text in/out methods simply deal in that representation. The
'binary' send/recv methods seem to differ only in possibly using a
different character encoding on the wire.

Now, also as I understand it, there's no requirement that a type even
/have/ binary send/recv methods. Text in/out it always needs, but
send/recv only if they are interesting enough to buy you something.
I'm not sure the XML send/recv really do buy anything. It is not as if
they present the XML in any more structured or tokenized form. If they
buy anything at all, it may be only an extra transcoding that the
other end will probably immediately do in reverse.

So, if that's the situation, is there some other, really simple,
choice for what XML send/recv might usefully do, that would buy more
than what they do now?

Well, PGLZ is in libpgcommon now, right? What if xml send wrote a flag
to indicate compressed or not, and then, if the value is stored
compressed in TOAST, streamed it right out as is, with no expansion on
the server? I could see that being a worthwhile win, /without even
having to devise some XML-specific encoding/. (XML has a big expansion
ratio.)

And, since that idea is not inherently XML-specific ... does the JSONB
representation have the same properties? How about even text or bytea?

The XML question has a related, JDBC-specific part. JDBC presents XML
via interfaces that can deal in Source and Result objects, and these
come in different flavors (DOMSource, an all-in-memory tree; SAXSource
and StAXSource, both streaming tokenized forms; or StreamSource, a
streaming, character-serialized form). Client code can ask for one of
those forms explicitly, or use null to say it doesn't care.

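To make the explicit-versus-null request concrete, here is a rough
sketch in Java. It is not pgjdbc's code: XmlValueSketch is a
hypothetical stand-in for a driver-side value that holds only the
serialized characters, and its getSource method merely mirrors the
shape of java.sql.SQLXML.getSource(Class). Returning a StreamSource
for null is an assumption of this sketch, and checked exceptions are
wrapped only to keep it short.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical stand-in for a driver-side XML value; NOT pgjdbc's code.
final class XmlValueSketch {
    // What the driver has "under the hood": the serialized characters.
    private final String serialized;

    XmlValueSketch(String serialized) {
        this.serialized = serialized;
    }

    // Same shape as java.sql.SQLXML.getSource(Class<T>); null means
    // "the caller doesn't care which flavor it gets".
    <T extends Source> T getSource(Class<T> sourceClass) {
        if (sourceClass == null) {
            // Assumption of this sketch: hand back the cheapest form,
            // a character stream over the text already in hand.
            @SuppressWarnings("unchecked")
            T cheapest = (T) new StreamSource(new StringReader(serialized));
            return cheapest;
        }
        if (StreamSource.class.equals(sourceClass)) {
            return sourceClass.cast(
                    new StreamSource(new StringReader(serialized)));
        }
        if (DOMSource.class.equals(sourceClass)) {
            // Only when explicitly asked: parse into an in-memory tree.
            try {
                Document doc = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder()
                        .parse(new InputSource(new StringReader(serialized)));
                return sourceClass.cast(new DOMSource(doc));
            } catch (Exception e) {
                throw new IllegalStateException("could not parse XML", e);
            }
        }
        // SAXSource and StAXSource are left out of this sketch.
        throw new UnsupportedOperationException(
                "not sketched: " + sourceClass.getName());
    }

    public static void main(String[] args) {
        XmlValueSketch v = new XmlValueSketch("<a><b>hi</b></a>");
        Source dontCare = v.getSource(null);          // caller doesn't care
        DOMSource dom = v.getSource(DOMSource.class); // caller wants a tree
        System.out.println(dontCare.getClass().getSimpleName());
        System.out.println(dom.getNode().getFirstChild().getNodeName());
    }
}
```

A caller that passed null and later wants a tree can still build one
itself from the stream, so nothing is lost by handing back the cheap
form first.
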
In the don't-care case, the driver is expected to choose the form
closest to what it's got under the hood; the client can convert if
necessary, and if it had any other preference, it would have said so.
For pgjdbc, that choice would naturally be the character StreamSource,
because that /is/ the form it's got under the hood, but for reasons
mysterious to me, pgjdbc actually chooses DOMSource in the don't-care
case, and then expends the full effort of turning the serialized
stream it does have into a full in-memory DOM that the client hasn't
asked for and might not even want.

I know this is more a pgjdbc question, but I mention it here just
because it's so much like the what-should-send/recv-do question,
repeated at another level.

-Chap

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers