Re: [HACKERS] about google summer of code 2016
On 23/03/16 01:56, Amit Langote wrote:
> On 2016/03/23 9:19, Álvaro Hernández Tortosa wrote:
>> - Regarding GSoC: it looks to me that we failed to submit in time. Is this
>> what happened, or we weren't selected? If the former (and no criticism
>> here, just realizing a fact) what can we do next year to avoid this
>> happening again? Is anyone "appointed" to take care of it?
>
> See Thom's message here:
> http://www.postgresql.org/message-id/CAA-aLv6i3jh1H-5UHb8jSB0gMwA9sg_cqw3=MwddVzr=pxa...@mail.gmail.com
>
> Thanks,
> Amit

OK, read the thread, thanks for the info :)

Álvaro

--
Álvaro Hernández Tortosa
--- 8Kdata

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] about google summer of code 2016
On 2016/03/23 9:19, Álvaro Hernández Tortosa wrote:
> - Regarding GSoC: it looks to me that we failed to submit in time. Is this
> what happened, or we weren't selected? If the former (and no criticism
> here, just realizing a fact) what can we do next year to avoid this
> happening again? Is anyone "appointed" to take care of it?

See Thom's message here:
http://www.postgresql.org/message-id/CAA-aLv6i3jh1H-5UHb8jSB0gMwA9sg_cqw3=MwddVzr=pxa...@mail.gmail.com

Thanks,
Amit
Re: [HACKERS] about google summer of code 2016
On 22/02/16 23:23, Álvaro Hernández Tortosa wrote:
> On 22/02/16 05:10, Tom Lane wrote:
>> Heikki Linnakangas writes:
>>> On 19/02/16 10:10, Álvaro Hernández Tortosa wrote:
>>>> Oleg and I discussed recently that a really good addition to a GSoC
>>>> item would be to study whether it's convenient to have a binary
>>>> serialization format for jsonb over the wire.
>>> Seems a bit risky for a GSoC project. We don't know if a different
>>> serialization format will be a win, or whether we want to do it in the
>>> end, until the benchmarking is done. It's also not clear what we're
>>> trying to achieve with the serialization format: smaller on-the-wire
>>> size, faster serialization in the server, faster parsing in the client,
>>> or what?
>> Another variable is that your answers might depend on what format you
>> assume the client is trying to convert from/to. (It's presumably not
>> text JSON, but then what is it?)
>
> As I mentioned before, there are many well-known JSON serialization
> formats, like:
>
> - http://ubjson.org/
> - http://cbor.io/
> - http://msgpack.org/
> - BSON (ok, let's skip that one hehehe)
> - http://wiki.fasterxml.com/SmileFormatSpec
>
>> Having said that, I'm not sure that risk is a blocking factor here.
>> History says that a large fraction of our GSoC projects don't result
>> in a commit to core PG. As long as we're clear that "success" in this
>> project isn't measured by getting a feature committed, it doesn't seem
>> riskier than any other one. Maybe it's even less risky, because there's
>> less of the success condition that's not under the GSoC student's control.

I wanted to bring an update here. It looks like someone did the expected benchmark "for us" :)

https://eng.uber.com/trip-data-squeeze/ (thanks Alam for the link)

While this is Uber's own test, I think the conclusions are quite significant: an encoding like MessagePack + zlib requires only 14% of the size and encodes+decodes in 76% of the time of JSON. There are of course other contenders that trade better encoding times for slightly slower decoding and bigger size. But there are very interesting numbers in this benchmark. MessagePack, CBOR and UJSON (all + zlib) look like really good options.

So now that we have this data I would like to put these questions to the community:

- Is this enough, or do we need to perform our own, different benchmarks?

- If this is enough, and given that we weren't elected for GSoC, is there interest in the community to work on this nonetheless?

- Regarding GSoC: it looks to me that we failed to submit in time. Is this what happened, or we weren't selected? If the former (and no criticism here, just realizing a fact) what can we do next year to avoid this happening again? Is anyone "appointed" to take care of it?

Álvaro

--
Álvaro Hernández Tortosa
--- 8Kdata
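On the "do we need our own benchmarks" question: a minimal sketch of what a local harness could look like, in case it helps frame the discussion. It only uses the Python standard library, so it compares text JSON against text JSON + zlib; MessagePack, CBOR and friends need third-party libraries and would be plugged in as extra entries in `CANDIDATES`. The sample document, the function names and the number of rounds are all made up for illustration, and the numbers it prints say nothing about what the server or a driver would actually see.

```python
import json
import time
import zlib


def make_sample(n=1000):
    # Hypothetical payload: a list of small, similar records.
    return [
        {"id": i, "city": "Madrid", "fare": 10.5 + i, "tags": ["a", "b"]}
        for i in range(n)
    ]


def json_encode(doc):
    return json.dumps(doc, separators=(",", ":")).encode("utf-8")


def json_decode(blob):
    return json.loads(blob.decode("utf-8"))


def json_zlib_encode(doc):
    return zlib.compress(json_encode(doc))


def json_zlib_decode(blob):
    return json_decode(zlib.decompress(blob))


# (name, encoder, decoder); other formats would be added here.
CANDIDATES = [
    ("json", json_encode, json_decode),
    ("json+zlib", json_zlib_encode, json_zlib_decode),
]


def measure(name, encode, decode, doc, rounds=20):
    """Return (name, encoded size in bytes, seconds for `rounds` round trips)."""
    blob = encode(doc)
    assert decode(blob) == doc  # the round trip must be lossless
    start = time.perf_counter()
    for _ in range(rounds):
        decode(encode(doc))
    return name, len(blob), time.perf_counter() - start


def run(doc=None):
    doc = make_sample() if doc is None else doc
    return [measure(name, enc, dec, doc) for name, enc, dec in CANDIDATES]


if __name__ == "__main__":
    for name, size, secs in run():
        print(f"{name:10s} {size:8d} bytes {secs:.4f} s")
```

Size and time are reported separately on purpose, since the thread has not settled which of the two we are optimizing for.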
Re: [HACKERS] about google summer of code 2016
On 22/02/16 23:34, Tom Lane wrote:
> Álvaro Hernández Tortosa writes:
>> On 22/02/16 05:10, Tom Lane wrote:
>>> Another variable is that your answers might depend on what format you
>>> assume the client is trying to convert from/to. (It's presumably not
>>> text JSON, but then what is it?)
>> As I mentioned before, there are many well-known JSON serialization
>> formats, like:
>> - http://ubjson.org/
>> - http://cbor.io/
>> - http://msgpack.org/
>> - BSON (ok, let's skip that one hehehe)
>> - http://wiki.fasterxml.com/SmileFormatSpec
>
> Ah, the great thing about standards is there are so many to choose from :-(
>
> So I guess part of the GSoC project would have to be figuring out which
> one of these would make the most sense for us to adopt.
>
> regards, tom lane

Yes.

And unless I'm mistaken, there's an int16 to identify the data format. Apart from the chosen format, others could be provided as alternatives, identified by different values of that int16. Or other alternatives (like compressed text JSON). Of course, this may be better suited for a later GSoC project.

Álvaro

--
Álvaro Hernández Tortosa
--- 8Kdata
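For readers not familiar with that int16: in the frontend/backend protocol, format codes travel as an Int16 count followed by that many Int16 values in network byte order, and today only 0 (text) and 1 (binary) are defined. A rough sketch of that encoding follows. The value 2 and its name are purely hypothetical, used only to show where an additional format could be signalled; nothing like it exists in PostgreSQL.

```python
import struct

TEXT, BINARY = 0, 1        # the two format codes the protocol defines today
HYPOTHETICAL_MSGPACK = 2   # NOT a real PostgreSQL format code; illustration only


def pack_format_codes(codes):
    """Encode an Int16 count followed by that many Int16 codes (big-endian)."""
    return struct.pack(f"!h{len(codes)}h", len(codes), *codes)


def unpack_format_codes(buf):
    """Decode the structure produced by pack_format_codes."""
    (count,) = struct.unpack_from("!h", buf, 0)
    return list(struct.unpack_from(f"!{count}h", buf, 2))


if __name__ == "__main__":
    wire = pack_format_codes([TEXT, BINARY, HYPOTHETICAL_MSGPACK])
    print(wire.hex(), unpack_format_codes(wire))
```

Whether new formats should get their own codes, or hide behind the existing binary code, is exactly the kind of question the project would have to answer.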
Re: [HACKERS] about google summer of code 2016
Álvaro Hernández Tortosa writes:
> On 22/02/16 05:10, Tom Lane wrote:
>> Another variable is that your answers might depend on what format you
>> assume the client is trying to convert from/to. (It's presumably not
>> text JSON, but then what is it?)

> As I mentioned before, there are many well-known JSON serialization
> formats, like:

> - http://ubjson.org/
> - http://cbor.io/
> - http://msgpack.org/
> - BSON (ok, let's skip that one hehehe)
> - http://wiki.fasterxml.com/SmileFormatSpec

Ah, the great thing about standards is there are so many to choose from :-(

So I guess part of the GSoC project would have to be figuring out which one of these would make the most sense for us to adopt.

regards, tom lane
Re: [HACKERS] about google summer of code 2016
On 22/02/16 05:10, Tom Lane wrote:
> Heikki Linnakangas writes:
>> On 19/02/16 10:10, Álvaro Hernández Tortosa wrote:
>>> Oleg and I discussed recently that a really good addition to a GSoC
>>> item would be to study whether it's convenient to have a binary
>>> serialization format for jsonb over the wire.
>> Seems a bit risky for a GSoC project. We don't know if a different
>> serialization format will be a win, or whether we want to do it in the
>> end, until the benchmarking is done. It's also not clear what we're
>> trying to achieve with the serialization format: smaller on-the-wire
>> size, faster serialization in the server, faster parsing in the client,
>> or what?
> Another variable is that your answers might depend on what format you
> assume the client is trying to convert from/to. (It's presumably not
> text JSON, but then what is it?)

As I mentioned before, there are many well-known JSON serialization formats, like:

- http://ubjson.org/
- http://cbor.io/
- http://msgpack.org/
- BSON (ok, let's skip that one hehehe)
- http://wiki.fasterxml.com/SmileFormatSpec

> Having said that, I'm not sure that risk is a blocking factor here.
> History says that a large fraction of our GSoC projects don't result
> in a commit to core PG. As long as we're clear that "success" in this
> project isn't measured by getting a feature committed, it doesn't seem
> riskier than any other one. Maybe it's even less risky, because there's
> less of the success condition that's not under the GSoC student's control.

Agreed :)

Álvaro

--
Álvaro Hernández Tortosa
--- 8Kdata
Re: [HACKERS] about google summer of code 2016
On 21/02/16 21:15, Heikki Linnakangas wrote:
> On 19/02/16 10:10, Álvaro Hernández Tortosa wrote:
>> Oleg and I discussed recently that a really good addition to a GSoC
>> item would be to study whether it's convenient to have a binary
>> serialization format for jsonb over the wire. Some argue this should be
>> benchmarked first. So the scope for this project would be to benchmark
>> and analyze the potential improvements and then agree on which format
>> jsonb could be serialized to (apart from the current on-disk format,
>> there are many json or nested k-v formats that could be used for sending
>> over the wire).
>
> Seems a bit risky for a GSoC project. We don't know if a different
> serialization format will be a win,

Compared to the current serialization (text), it is hard to believe there will be no wins.

> or whether we want to do it in the end, until the benchmarking is done.
> It's also not clear what we're trying to achieve with the serialization
> format: smaller on-the-wire size, faster serialization in the server,
> faster parsing in the client, or what?

Probably all of them (it would be ideal if it could be selectable). Some may favor small on-the-wire size (which can be significant with several serialization formats) and others faster decoding (de-serialization takes a significant share of execution time). Of course, all this should be tested and benchmarked beforehand, but we're not alone here.

This is a significant request from many users, at least from the Java users, where it has been discussed many times. Especially if the wire format adheres to one well-known (or even standard) format, so that the receiving side and the drivers could expose an API based on that format -- one of the other big pains today on this side.

I think it fits very well for a GSoC! :)

Álvaro

--
Álvaro Hernández Tortosa
--- 8Kdata
Re: [HACKERS] about google summer of code 2016
On 02/21/16 23:10, Tom Lane wrote:
> Another variable is that your answers might depend on what format you
> assume the client is trying to convert from/to. (It's presumably not
> text JSON, but then what is it?)

This connects tangentially to a question I've been meaning to ask for a while, since I was looking at the representation of XML.

As far as I can tell, XML is simply stored in its character serialized representation (very likely compressed, if large enough to TOAST), and the text in/out methods simply deal in that representation. The 'binary' send/recv methods seem to differ only in possibly using a different character encoding on the wire.

Now, also as I understand it, there's no requirement that a type even /have/ binary send/recv methods. Text in/out it always needs, but send/recv only if they are interesting enough to buy you something. I'm not sure the XML send/recv really do buy anything. It is not as if they present the XML in any more structured or tokenized form. If they buy anything at all, it may be only an extra transcoding that the other end will probably immediately do in reverse.

So, if that's the situation, is there some other, really simple, choice for what XML send/recv might usefully do, that would buy more than what they do now?

Well, PGLZ is in libpgcommon now, right? What if xml send wrote a flag to indicate compressed or not, and then if the value is compressed TOAST, streamed it right out as is, with no expansion on the server? I could see that being a worthwhile win, /without even having to devise some XML-specific encoding/. (XML has a big expansion ratio.)

And, since that idea is not inherently XML-specific ... does the JSONB representation have the same properties? How about even text or bytea?

The XML question has a related, JDBC-specific part. JDBC presents XML via interfaces that can deal in Source and Result objects, and these come in different flavors (DOMSource, an all-in-memory tree, SAXSource and StAXSource, both streaming tokenized forms, or StreamSource, a streaming, character-serialized form). Client code can ask for one of those forms explicitly, or use null to say it doesn't care. In the doesn't-care case, the driver is expected to choose the form closest to what it's got under the hood; the client can convert if necessary, and if it had any other preference, it would have said so. For PGJDBC, that choice would naturally be the character StreamSource, because that /is/ the form it's got under the hood, but for reasons mysterious to me, pgjdbc actually chooses DOMSource in the don't-care case, and then expends the full effort of turning the serialized stream it does have into a full in-memory DOM that the client hasn't asked for and might not even want.

I know this is more a PGJDBC question, but I mention it here just because it's so much like the what-should-send/recv-do question, repeated at another level.

-Chap
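To make the flag-plus-stored-bytes idea from this message concrete, here is a toy framing sketch. It assumes a single leading flag byte, and it uses zlib purely as a stand-in for PGLZ (Python has no PGLZ binding that I know of); the constant and function names are invented for the example and do not correspond to anything in the server or in any driver.

```python
import zlib

FLAG_RAW = 0         # payload is the plain character-serialized value
FLAG_COMPRESSED = 1  # payload is the stored, already-compressed bytes


def send_value(stored, stored_is_compressed):
    """Frame a value as one flag byte plus the stored bytes, untouched.

    The point of the idea: the sender never expands a compressed value.
    """
    flag = FLAG_COMPRESSED if stored_is_compressed else FLAG_RAW
    return bytes([flag]) + stored


def recv_value(wire):
    """Client side: expand only if the flag says the payload is compressed."""
    flag, payload = wire[0], wire[1:]
    if flag == FLAG_COMPRESSED:
        return zlib.decompress(payload)  # zlib stands in for PGLZ here
    if flag == FLAG_RAW:
        return payload
    raise ValueError(f"unknown flag {flag}")


if __name__ == "__main__":
    xml = b"<row><a>1</a><b>two</b></row>" * 200
    stored = zlib.compress(xml)  # pretend this is the compressed TOAST datum
    wire = send_value(stored, True)
    print(len(xml), "bytes expanded,", len(wire), "bytes on the wire")
```

The decompression cost moves to the client, so whether this is a net win would still need measuring on both ends.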
Re: [HACKERS] about google summer of code 2016
Heikki Linnakangas writes:
> On 19/02/16 10:10, Álvaro Hernández Tortosa wrote:
>> Oleg and I discussed recently that a really good addition to a GSoC
>> item would be to study whether it's convenient to have a binary
>> serialization format for jsonb over the wire.

> Seems a bit risky for a GSoC project. We don't know if a different
> serialization format will be a win, or whether we want to do it in the
> end, until the benchmarking is done. It's also not clear what we're
> trying to achieve with the serialization format: smaller on-the-wire
> size, faster serialization in the server, faster parsing in the client,
> or what?

Another variable is that your answers might depend on what format you assume the client is trying to convert from/to. (It's presumably not text JSON, but then what is it?)

Having said that, I'm not sure that risk is a blocking factor here. History says that a large fraction of our GSoC projects don't result in a commit to core PG. As long as we're clear that "success" in this project isn't measured by getting a feature committed, it doesn't seem riskier than any other one. Maybe it's even less risky, because there's less of the success condition that's not under the GSoC student's control.

regards, tom lane
Re: [HACKERS] about google summer of code 2016
On 19/02/16 10:10, Álvaro Hernández Tortosa wrote:
> Oleg and I discussed recently that a really good addition to a GSoC
> item would be to study whether it's convenient to have a binary
> serialization format for jsonb over the wire. Some argue this should be
> benchmarked first. So the scope for this project would be to benchmark
> and analyze the potential improvements and then agree on which format
> jsonb could be serialized to (apart from the current on-disk format,
> there are many json or nested k-v formats that could be used for sending
> over the wire).

Seems a bit risky for a GSoC project. We don't know if a different serialization format will be a win, or whether we want to do it in the end, until the benchmarking is done. It's also not clear what we're trying to achieve with the serialization format: smaller on-the-wire size, faster serialization in the server, faster parsing in the client, or what?

- Heikki
Re: [HACKERS] about google summer of code 2016
On 02/19/2016 10:10 AM, Álvaro Hernández Tortosa wrote:
> Hi.
>
> Oleg and I discussed recently that a really good addition to a GSoC
> item would be to study whether it's convenient to have a binary
> serialization format for jsonb over the wire. Some argue this should be
> benchmarked first. So the scope for this project would be to benchmark
> and analyze the potential improvements and then agree on which format
> jsonb could be serialized to (apart from the current on-disk format,
> there are many json or nested k-v formats that could be used for sending
> over the wire).
>
> I would like to mentor this project with Oleg.

+1

--
Josh Berkus
Red Hat OSAS
(any opinions are my own)
Re: [HACKERS] about google summer of code 2016
Hi.

Oleg and I discussed recently that a really good addition to a GSoC item would be to study whether it's convenient to have a binary serialization format for jsonb over the wire. Some argue this should be benchmarked first. So the scope for this project would be to benchmark and analyze the potential improvements and then agree on which format jsonb could be serialized to (apart from the current on-disk format, there are many json or nested k-v formats that could be used for sending over the wire).

I would like to mentor this project with Oleg.

Thanks,

Álvaro

--
Álvaro Hernández Tortosa
--- 8Kdata

On 17/02/16 08:40, Amit Langote wrote:
> Hi Shubham,
>
> On 2016/02/17 16:27, Shubham Barai wrote:
>> Hello everyone,
>>
>> I am currently pursuing my bachelor of engineering in computer science
>> at Maharashtra
>> Institute of Technology, Pune ,India. I am very excited about contributing
>> to postgres through google summer of code program.
>>
>> Is postgres applying for gsoc 2016 as mentoring organization ?
>
> I think it does. Track this page for updates:
> http://www.postgresql.org/developer/summerofcode/
>
> You can contact one of the people listed on that page for the latest.
>
> I didn't find for 2016 but here is the PostgreSQL wiki page for the last
> year's GSoC page: https://wiki.postgresql.org/wiki/GSoC_2015#Project_Ideas
>
> Thanks,
> Amit
Re: [HACKERS] about google summer of code 2016
Atri Sharma wrote:
> I agree, there might be scope for non core projects and PL/Java sounds like
> a good area.

We've hosted MADlib-based projects in the past, so why not.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] about google summer of code 2016
On 19 Feb 2016 8:30 am, "Chapman Flack" wrote:
>
> On 02/18/16 19:35, Amit Langote wrote:
>
> > Apparently, the deadline is: February 20, 2016 at 04:00 (+0900 UTC)
> >
> > https://summerofcode.withgoogle.com/
>
> For anybody finding that web site as anti-navigable as I did, here
> are more direct links to the actual rules, and terms of agreement
> for the various participants:
>
> https://summerofcode.withgoogle.com/rules/
> https://summerofcode.withgoogle.com/terms/org
> https://summerofcode.withgoogle.com/terms/mentor
> https://summerofcode.withgoogle.com/terms/student
>
> Here is a question: does it ever happen that PostgreSQL acts as
> the org for a project that is PostgreSQL-related but isn't
> directly PGDG-led?
>
> ... there are definitely interesting and promising areas for further
> development in PL/Java beyond what I would ever have time to tackle
> solo, and I could easily enjoy mentoring someone through one or
> another of them over a summer, which could also help reinvigorate
> the project and get another developer familiar with it at a
> non-superficial level. While I could easily see myself mentoring,
> I think it would feel like overkill to apply individually as a
> one-trick 'organization'.
>
> I see that there was a "based on PL/Java" GSoC'12 project, so maybe
> there is some room for non-core ideas under the PostgreSQL ægis?

FWIW it wasn't a PL/Java based project per se, it was a JDBC FDW.

I agree, there might be scope for non core projects and PL/Java sounds like a good area.

Regards,

Atri
Re: [HACKERS] about google summer of code 2016
On 02/18/16 19:35, Amit Langote wrote:

> Apparently, the deadline is: February 20, 2016 at 04:00 (+0900 UTC)
>
> https://summerofcode.withgoogle.com/

For anybody finding that web site as anti-navigable as I did, here are more direct links to the actual rules, and terms of agreement for the various participants:

https://summerofcode.withgoogle.com/rules/
https://summerofcode.withgoogle.com/terms/org
https://summerofcode.withgoogle.com/terms/mentor
https://summerofcode.withgoogle.com/terms/student

Here is a question: does it ever happen that PostgreSQL acts as the org for a project that is PostgreSQL-related but isn't directly PGDG-led?

... there are definitely interesting and promising areas for further development in PL/Java beyond what I would ever have time to tackle solo, and I could easily enjoy mentoring someone through one or another of them over a summer, which could also help reinvigorate the project and get another developer familiar with it at a non-superficial level. While I could easily see myself mentoring, I think it would feel like overkill to apply individually as a one-trick 'organization'.

I see that there was a "based on PL/Java" GSoC'12 project, so maybe there is some room for non-core ideas under the PostgreSQL ægis?

In any case, I am quite confident that I could *not* complete a separate org application by tomorrow 2 pm EST.

In reading the rules, it looks possible that the Ideas List does not have to accompany the org application, but would be needed shortly after acceptance? If acceptance announcements are 29 February, I could have some ideas drafted by then.

Is this a thinkable thought?

-Chap
Re: [HACKERS] about google summer of code 2016
On 2016/02/18 22:44, Alexander Korotkov wrote:
> On Wed, Feb 17, 2016 at 10:40 AM, Amit Langote wrote:
>> I didn't find for 2016 but here is the PostgreSQL wiki page for the last
>> year's GSoC page: https://wiki.postgresql.org/wiki/GSoC_2015#Project_Ideas
>
>
> I've created wiki page for GSoC 2016. It contains unimplemented ideas from
> 2015 page.
> Now, GSoC accepting proposals from organizations. Typically, we have call
> for mentors in hackers mailing list in this period.
> Thom, do we apply this year?

Apparently, the deadline is: February 20, 2016 at 04:00 (+0900 UTC)

https://summerofcode.withgoogle.com/

Thanks,
Amit
Re: [HACKERS] about google summer of code 2016
On Wed, Feb 17, 2016 at 10:40 AM, Amit Langote <langote_amit...@lab.ntt.co.jp> wrote:

> On 2016/02/17 16:27, Shubham Barai wrote:
> > Hello everyone,
> >
> > I am currently pursuing my bachelor of engineering in computer science
> > at Maharashtra
> > Institute of Technology, Pune ,India. I am very excited about
> contributing
> > to postgres through google summer of code program.
> >
> > Is postgres applying for gsoc 2016 as mentoring organization ?
>
> I think it does. Track this page for updates:
> http://www.postgresql.org/developer/summerofcode/
>
> You can contact one of the people listed on that page for the latest.
>
> I didn't find for 2016 but here is the PostgreSQL wiki page for the last
> year's GSoC page: https://wiki.postgresql.org/wiki/GSoC_2015#Project_Ideas

I've created a wiki page for GSoC 2016. It contains the unimplemented ideas from the 2015 page.
GSoC is now accepting proposals from organizations. Typically, we have a call for mentors on the hackers mailing list in this period.
Thom, do we apply this year?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] about google summer of code 2016
Hi Shubham,

On 2016/02/17 16:27, Shubham Barai wrote:
> Hello everyone,
>
> I am currently pursuing my bachelor of engineering in computer science
> at Maharashtra
> Institute of Technology, Pune ,India. I am very excited about contributing
> to postgres through google summer of code program.
>
> Is postgres applying for gsoc 2016 as mentoring organization ?

I think it does. Track this page for updates:
http://www.postgresql.org/developer/summerofcode/

You can contact one of the people listed on that page for the latest.

I didn't find a page for 2016, but here is the PostgreSQL wiki page for last year's GSoC: https://wiki.postgresql.org/wiki/GSoC_2015#Project_Ideas

Thanks,
Amit
[HACKERS] about google summer of code 2016
Hello everyone,

I am currently pursuing my bachelor of engineering in computer science at Maharashtra Institute of Technology, Pune, India. I am very excited about contributing to Postgres through the Google Summer of Code program.

Is Postgres applying for GSoC 2016 as a mentoring organization?

Thanks,
Shubham Barai