Re: [HACKERS] [PATCH] SortSupport for macaddr type
Hello, Thank you Peter for the assist here and great sleuthing in the disassembly, and thanks Teodor for committing! Neha pointed out (thanks as well) a couple typos upthread that I hadn't gotten around to fixing. I've attached a new three line patch to sort those out. Brandur On Wed, Mar 29, 2017 at 1:31 PM, Teodor Sigaevwrote: > Thank you, pushed > > Excellent! I've attached a new (and hopefully final) >> version of the patch. >> > > -- > Teodor Sigaev E-mail: teo...@sigaev.ru >WWW: > http://www.sigaev.ru/ > 0001-Fix-two-typos-introduced-by-macaddr-SortSupport.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SortSupport for macaddr type
Hi Julien, Thanks for the expedient reply, even after I'd dropped the ball for so long :) > Indeed, I should have checked more examples :/ There > isn't any clear pattern for this, so I guess any one > would be ok. Yeah, agreed. If it's alright with you, I ended up moving the naming back to `macaddr_cmp_internal` just because it results in a smaller final diff. > Thanks. I'm ok with this, but maybe a native english > speaker would have a better opinion on this. Cool! I just re-read my own comment a few days later and I think that it still mostly makes sense, but definitely open to other edits if anyone else has one. > One last thing, I noticed that you added: > > +static int macaddr_internal_cmp(macaddr *a1, macaddr *a2); > > but the existing function is declared as > > static int32 > macaddr_internal_cmp(macaddr *a1, macaddr *a2) > > I'd be in favor to declare both as int. Great catch! I have no idea how I missed that. I've done as you suggested and made them both "int", which seems consistent with SortSupport implementations elsewhere. > After this, I think this patch will be ready for committer. Excellent! I've attached a new (and hopefully final) version of the patch. Two final questions about the process if you'd be so kind: * Should I change the status on the Commitfest link [1] or do I leave that to you (or someone else like a committer)? * I've been generating a new OID value with the `unused_oids` script, but pretty much every time I rebase I collide with someone else's addition and need to find a new one. Is it better for me to pick an OID in an exotic range for my final patch, or that a committer just finds a new one (if necessary) as they're putting it into master? Thanks again! Brandur [1] https://commitfest.postgresql.org/10/743/ On Sat, Feb 25, 2017 at 9:56 AM, Julien Rouhaud <julien.rouh...@dalibo.com> wrote: > On Sun, Feb 05, 2017 at 01:56:41PM -0800, Brandur Leach wrote: > > Hello Brandur, thanks for the updated patch! > > > > > > * you used macaddr_cmp_internal() function name, for uuid > > > the same function is named uuid_internal_cmp(). Using > > > the same naming pattern is probably better. > > > > I was a little split on this one! It's true that UUID uses > > `_internal_cmp`, but `_cmp_internal` is also used in a > > number of places like `enum`, `timetz`, and `network`. I > > don't have a strong feeling about it either way, so I've > > changed it to `_internal_cmp` to match UUID as you > > suggested. > > > > Indeed, I should have checked more examples :/ There isn't any clear > pattern > for this, so I guess any one would be ok. > > > > * the function comment on macaddr_abbrev_convert() > > > doesn't mention specific little-endian handling > > > > I tried to bake this into the comment text. Here are the > > relevant lines of the amended version: > > > > * Packs the bytes of a 6-byte MAC address into a Datum and treats it > as > > an > > * unsigned integer for purposes of comparison. On a 64-bit machine, > > there > > * will be two zeroed bytes of padding. The integer is converted to > > native > > * endianness to facilitate easy comparison. > > > > > * "There will be two bytes of zero padding on the least > > > significant end" > > > > > > "least significant bits" would be better > > > > Also done. Here is the amended version: > > > > * On a 64-bit machine, zero out the 8-byte datum and copy the 6 > bytes of > > * the MAC address in. There will be two bytes of zero padding on the > end > > * of the least significant bits. > > > > Thanks. I'm ok with this, but maybe a native english speaker would have a > better opinion on this. > > > > * This patch will trigger quite a lot modifications after > > > a pgindent run. Could you try to run pgindent on mac.c > > > before sending an updated patch? > > > > Good call! I've run the new version through pgindent. > > > > Thanks also, no more issue here. > > > Let me know if you have any further feedback and/or > > suggestions. Thanks! > > One last thing, I noticed that you added: > > +static int macaddr_internal_cmp(macaddr *a1, macaddr *a2); > > but the existing function is declared as > > static int32 > macaddr_internal_cmp(macaddr *a1, macaddr *a2) > > I'd be in favor to declare both as int. > > After this, I think this patch will be ready for committer. > > -- > Julien Rouhaud > http://dalibo.com - http://dalibo.org > 0003-Implement-SortSupport-for-macaddr-data-type.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] SortSupport for macaddr type
And as a short follow up, I've moved the patch to the current commit fest: https://commitfest.postgresql.org/13/743/ On Sun, Feb 5, 2017 at 1:56 PM, Brandur Leach <bran...@mutelight.org> wrote: > Hi Julien, > > Thank you for taking the time to do this review, and my > apologies for the very delayed response. I lost track of > this work and have only jumped back into it today. > > Please find attached a new version of the patch with your > feedback integrated. I've also rebased the patch against > the current master and selected a new OID because my old > one is now in use. > > > * you used macaddr_cmp_internal() function name, for uuid > > the same function is named uuid_internal_cmp(). Using > > the same naming pattern is probably better. > > I was a little split on this one! It's true that UUID uses > `_internal_cmp`, but `_cmp_internal` is also used in a > number of places like `enum`, `timetz`, and `network`. I > don't have a strong feeling about it either way, so I've > changed it to `_internal_cmp` to match UUID as you > suggested. > > > * the function comment on macaddr_abbrev_convert() > > doesn't mention specific little-endian handling > > I tried to bake this into the comment text. Here are the > relevant lines of the amended version: > > * Packs the bytes of a 6-byte MAC address into a Datum and treats it > as an > * unsigned integer for purposes of comparison. On a 64-bit machine, > there > * will be two zeroed bytes of padding. The integer is converted to > native > * endianness to facilitate easy comparison. > > > * "There will be two bytes of zero padding on the least > > significant end" > > > > "least significant bits" would be better > > Also done. Here is the amended version: > > * On a 64-bit machine, zero out the 8-byte datum and copy the 6 bytes > of > * the MAC address in. There will be two bytes of zero padding on the > end > * of the least significant bits. > > > * This patch will trigger quite a lot modifications after > > a pgindent run. Could you try to run pgindent on mac.c > > before sending an updated patch? > > Good call! I've run the new version through pgindent. > > Let me know if you have any further feedback and/or > suggestions. Thanks! > > Brandur > > > On Wed, Sep 14, 2016 at 3:14 AM, Julien Rouhaud <julien.rouh...@dalibo.com > > wrote: > >> On 26/08/2016 19:44, Brandur wrote: >> > Hello, >> > >> >> Hello, >> >> > I've attached a patch to add SortSupport for Postgres' macaddr which >> has the >> > effect of improving the performance of sorting operations for the type. >> The >> > strategy that I employ is very similar to that for UUID, which is to >> create >> > abbreviated keys by packing as many bytes from the MAC address as >> possible into >> > Datums, and then performing fast unsigned integer comparisons while >> sorting. >> > >> > I ran some informal local benchmarks, and for cardinality greater than >> 100k >> > rows, I see a speed up on `CREATE INDEX` of roughly 25% to 65%. (For >> those >> > interested, I put a few more numbers into a small report here [2].) >> > >> >> That's a nice improvement! >> >> > Admittedly, this is not quite as useful as speeding up sorting on a >> more common >> > data type like TEXT or UUID, but the change still seems like a useful >> > performance improvement. I largely wrote it as an exercise to >> familiarize >> > myself with the Postgres codebase. >> > >> > I'll add an entry into the current commitfest as suggested by the >> Postgres Wiki >> > and follow up here with a link. >> > >> > Thanks, and if anyone has feedback or other thoughts, let me know! >> > >> >> I just reviewed your patch. It applies and compiles cleanly, and the >> abbrev feature works as intended. There's not much to say since this is >> heavily inspired on the uuid SortSupport. The only really specific part >> is in the abbrev_converter function, and I don't see any issue with it. >> >> I have a few trivial comments: >> >> * you used macaddr_cmp_internal() function name, for uuid the same >> function is named uuid_internal_cmp(). Using the same naming pattern is >> probably better. >> >> * the function comment on macaddr_abbrev_convert() doesn't mention >> specific little-endian handling >> >> * "There will be two bytes of zero padding on the least significant end" >> >> "least significant bits" would be better >> >> * This patch will trigger quite a lot modifications after a pgindent >> run. Could you try to run pgindent on mac.c before sending an updated >> patch? >> >> Best regards. >> >> -- >> Julien Rouhaud >> http://dalibo.com - http://dalibo.org >> > >
Re: [HACKERS] [PATCH] SortSupport for macaddr type
Hi Julien, Thank you for taking the time to do this review, and my apologies for the very delayed response. I lost track of this work and have only jumped back into it today. Please find attached a new version of the patch with your feedback integrated. I've also rebased the patch against the current master and selected a new OID because my old one is now in use. > * you used macaddr_cmp_internal() function name, for uuid > the same function is named uuid_internal_cmp(). Using > the same naming pattern is probably better. I was a little split on this one! It's true that UUID uses `_internal_cmp`, but `_cmp_internal` is also used in a number of places like `enum`, `timetz`, and `network`. I don't have a strong feeling about it either way, so I've changed it to `_internal_cmp` to match UUID as you suggested. > * the function comment on macaddr_abbrev_convert() > doesn't mention specific little-endian handling I tried to bake this into the comment text. Here are the relevant lines of the amended version: * Packs the bytes of a 6-byte MAC address into a Datum and treats it as an * unsigned integer for purposes of comparison. On a 64-bit machine, there * will be two zeroed bytes of padding. The integer is converted to native * endianness to facilitate easy comparison. > * "There will be two bytes of zero padding on the least > significant end" > > "least significant bits" would be better Also done. Here is the amended version: * On a 64-bit machine, zero out the 8-byte datum and copy the 6 bytes of * the MAC address in. There will be two bytes of zero padding on the end * of the least significant bits. > * This patch will trigger quite a lot modifications after > a pgindent run. Could you try to run pgindent on mac.c > before sending an updated patch? Good call! I've run the new version through pgindent. Let me know if you have any further feedback and/or suggestions. Thanks! Brandur On Wed, Sep 14, 2016 at 3:14 AM, Julien Rouhaudwrote: > On 26/08/2016 19:44, Brandur wrote: > > Hello, > > > > Hello, > > > I've attached a patch to add SortSupport for Postgres' macaddr which has > the > > effect of improving the performance of sorting operations for the type. > The > > strategy that I employ is very similar to that for UUID, which is to > create > > abbreviated keys by packing as many bytes from the MAC address as > possible into > > Datums, and then performing fast unsigned integer comparisons while > sorting. > > > > I ran some informal local benchmarks, and for cardinality greater than > 100k > > rows, I see a speed up on `CREATE INDEX` of roughly 25% to 65%. (For > those > > interested, I put a few more numbers into a small report here [2].) > > > > That's a nice improvement! > > > Admittedly, this is not quite as useful as speeding up sorting on a more > common > > data type like TEXT or UUID, but the change still seems like a useful > > performance improvement. I largely wrote it as an exercise to familiarize > > myself with the Postgres codebase. > > > > I'll add an entry into the current commitfest as suggested by the > Postgres Wiki > > and follow up here with a link. > > > > Thanks, and if anyone has feedback or other thoughts, let me know! > > > > I just reviewed your patch. It applies and compiles cleanly, and the > abbrev feature works as intended. There's not much to say since this is > heavily inspired on the uuid SortSupport. The only really specific part > is in the abbrev_converter function, and I don't see any issue with it. > > I have a few trivial comments: > > * you used macaddr_cmp_internal() function name, for uuid the same > function is named uuid_internal_cmp(). Using the same naming pattern is > probably better. > > * the function comment on macaddr_abbrev_convert() doesn't mention > specific little-endian handling > > * "There will be two bytes of zero padding on the least significant end" > > "least significant bits" would be better > > * This patch will trigger quite a lot modifications after a pgindent > run. Could you try to run pgindent on mac.c before sending an updated > patch? > > Best regards. > > -- > Julien Rouhaud > http://dalibo.com - http://dalibo.org > 0002-Implement-SortSupport-for-macaddr-data-type.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers