On 22/02/16 23:23, Álvaro Hernández Tortosa wrote:

On 22/02/16 05:10, Tom Lane wrote:
Heikki Linnakangas <hlinn...@iki.fi> writes:
On 19/02/16 10:10, Álvaro Hernández Tortosa wrote:
Oleg and I discussed recently that a really good addition to a GSoC
item would be to study whether it's convenient to have a binary
serialization format for jsonb over the wire.
Seems a bit risky for a GSoC project. We don't know if a different
serialization format will be a win, or whether we want to do it in the
end, until the benchmarking is done. It's also not clear what we're
trying to achieve with the serialization format: smaller on-the-wire
size, faster serialization in the server, faster parsing in the client,
or what?
Another variable is that your answers might depend on what format you
assume the client is trying to convert from/to.  (It's presumably not
text JSON, but then what is it?)

As I mentioned before, there are many well-known JSON serialization formats, like:

- http://ubjson.org/
- http://cbor.io/
- http://msgpack.org/
- BSON (ok, let's skip that one hehehe)
- http://wiki.fasterxml.com/SmileFormatSpec

Having said that, I'm not sure that risk is a blocking factor here.
History says that a large fraction of our GSoC projects don't result
in a commit to core PG.  As long as we're clear that "success" in this
project isn't measured by getting a feature committed, it doesn't seem
riskier than any other one.  Maybe it's even less risky, because there's
less of the success condition that's not under the GSoC student's control.

I wanted to bring an update here. It looks like someone did the expected benchmark "for us" :)

https://eng.uber.com/trip-data-squeeze/    (thanks Alam for the link)

While this is Uber's own test, I think the conclusions are quite significant: an encoding like MessagePack + zlib requires only 14% of the size and encodes+decodes in 76% of the time of plain text JSON. There are of course other contenders that trade faster encoding for slightly slower decoding and a larger size. But there are very interesting numbers in this benchmark. MessagePack, CBOR and UJSON (all + zlib) look like really good options.
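To get an intuition for the kind of savings being discussed, here is a stdlib-only sketch. It only shows the zlib side of the trade-off (MessagePack/CBOR encoders are third-party packages), and the sample document is made up, not Uber's trip data:

```python
import json
import zlib

# A made-up, repetitive document, loosely shaped like trip metadata
doc = {
    "trips": [
        {"id": i, "status": "completed", "city": "San Francisco",
         "fare": 12.5 + i, "surge": False}
        for i in range(100)
    ]
}

raw = json.dumps(doc).encode("utf-8")   # text JSON as sent on the wire
packed = zlib.compress(raw, level=6)    # zlib-compressed JSON

print(f"raw JSON: {len(raw)} bytes")
print(f"+ zlib:   {len(packed)} bytes "
      f"({100 * len(packed) / len(raw):.0f}% of raw)")
```

A binary encoding such as MessagePack would presumably shrink the pre-compression size further still, by replacing repeated object keys and ASCII-encoded numbers with compact binary tags; that is likely where the remaining gap to the 14% figure in Uber's benchmark comes from.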

So now that we have this data I would like to ask these questions to the community:

- Is this enough, or do we need to perform our own, different benchmarks?

- If this is enough, and given that we weren't selected for GSoC, is there interest in the community to work on this nonetheless?

- Regarding GSoC: it looks to me like we failed to submit in time. Is this what happened, or were we simply not selected? If the former (and no criticism here, just stating a fact), what can we do next year to avoid this happening again? Is anyone "appointed" to take care of it?


Álvaro Hernández Tortosa


Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)