Re: [HACKERS] [PATCH] SortSupport for macaddr type

2017-03-31 Thread Brandur Leach
Hello,

Thank you Peter for the assist here and great sleuthing in
the disassembly, and thanks Teodor for committing!

Neha pointed out (thanks as well) a couple typos upthread
that I hadn't gotten around to fixing. I've attached a new
three line patch to sort those out.

Brandur


On Wed, Mar 29, 2017 at 1:31 PM, Teodor Sigaev  wrote:

> Thank you, pushed
>
> Excellent! I've attached a new (and hopefully final)
>> version of the patch.
>>
>
> --
> Teodor Sigaev   E-mail: teo...@sigaev.ru
>WWW:
> http://www.sigaev.ru/
>


0001-Fix-two-typos-introduced-by-macaddr-SortSupport.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SortSupport for macaddr type

2017-02-28 Thread Brandur Leach
Hi Julien,

Thanks for the expedient reply, even after I'd dropped the
ball for so long :)

> Indeed, I should have checked more examples :/ There
> isn't any clear pattern for this, so I guess any one
> would be ok.

Yeah, agreed. If it's alright with you, I ended up moving
the naming back to `macaddr_cmp_internal` just because it
results in a smaller final diff.

> Thanks. I'm ok with this, but maybe a native english
> speaker would have a better opinion on this.

Cool! I just re-read my own comment a few days later and I
think that it still mostly makes sense, but definitely open
to other edits if anyone else has one.

> One last thing, I noticed that you added:
>
> +static int macaddr_internal_cmp(macaddr *a1, macaddr *a2);
>
> but the existing function is declared as
>
> static int32
> macaddr_internal_cmp(macaddr *a1, macaddr *a2)
>
> I'd be in favor to declare both as int.

Great catch! I have no idea how I missed that. I've done as
you suggested and made them both "int", which seems
consistent with SortSupport implementations elsewhere.

> After this, I think this patch will be ready for committer.

Excellent! I've attached a new (and hopefully final)
version of the patch.

Two final questions about the process if you'd be so kind:

* Should I change the status on the Commitfest link [1] or
  do I leave that to you (or someone else like a committer)?

* I've been generating a new OID value with the
  `unused_oids` script, but pretty much every time I rebase
  I collide with someone else's addition and need to find a
  new one. Is it better for me to pick an OID in an exotic
  range for my final patch, or that a committer just finds
  a new one (if necessary) as they're putting it into
  master?

Thanks again!
Brandur

[1] https://commitfest.postgresql.org/10/743/


On Sat, Feb 25, 2017 at 9:56 AM, Julien Rouhaud <julien.rouh...@dalibo.com>
wrote:

> On Sun, Feb 05, 2017 at 01:56:41PM -0800, Brandur Leach wrote:
>
> Hello Brandur, thanks for the updated patch!
>
> >
> > > * you used macaddr_cmp_internal() function name, for uuid
> > >   the same function is named uuid_internal_cmp().  Using
> > >   the same naming pattern is probably better.
> >
> > I was a little split on this one! It's true that UUID uses
> > `_internal_cmp`, but `_cmp_internal` is also used in a
> > number of places like `enum`, `timetz`, and `network`. I
> > don't have a strong feeling about it either way, so I've
> > changed it to `_internal_cmp` to match UUID as you
> > suggested.
> >
>
> Indeed, I should have checked more examples :/ There isn't any clear
> pattern
> for this, so I guess any one would be ok.
>
> > > * the function comment on macaddr_abbrev_convert()
> > >   doesn't mention specific little-endian handling
> >
> > I tried to bake this into the comment text. Here are the
> > relevant lines of the amended version:
> >
> > * Packs the bytes of a 6-byte MAC address into a Datum and treats it
> as
> > an
> > * unsigned integer for purposes of comparison. On a 64-bit machine,
> > there
> > * will be two zeroed bytes of padding. The integer is converted to
> > native
> > * endianness to facilitate easy comparison.
> >
> > > * "There will be two bytes of zero padding on the least
> > >   significant end"
> > >
> > > "least significant bits" would be better
> >
> > Also done. Here is the amended version:
> >
> > * On a 64-bit machine, zero out the 8-byte datum and copy the 6
> bytes of
> > * the MAC address in. There will be two bytes of zero padding on the
> end
> > * of the least significant bits.
> >
>
> Thanks.  I'm ok with this, but maybe a native english speaker would have a
> better opinion on this.
>
> > > * This patch will trigger quite a lot modifications after
> > >   a pgindent run.  Could you try to run pgindent on mac.c
> > >   before sending an updated patch?
> >
> > Good call! I've run the new version through pgindent.
> >
>
> Thanks also, no more issue here.
>
> > Let me know if you have any further feedback and/or
> > suggestions. Thanks!
>
> One last thing, I noticed that you added:
>
> +static int macaddr_internal_cmp(macaddr *a1, macaddr *a2);
>
> but the existing function is declared as
>
> static int32
> macaddr_internal_cmp(macaddr *a1, macaddr *a2)
>
> I'd be in favor to declare both as int.
>
> After this, I think this patch will be ready for committer.
>
> --
> Julien Rouhaud
> http://dalibo.com - http://dalibo.org
>


0003-Implement-SortSupport-for-macaddr-data-type.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SortSupport for macaddr type

2017-02-07 Thread Brandur Leach
And as a short follow up, I've moved the patch to the
current commit fest:

https://commitfest.postgresql.org/13/743/


On Sun, Feb 5, 2017 at 1:56 PM, Brandur Leach <bran...@mutelight.org> wrote:

> Hi Julien,
>
> Thank you for taking the time to do this review, and my
> apologies for the very delayed response. I lost track of
> this work and have only jumped back into it today.
>
> Please find attached a new version of the patch with your
> feedback integrated. I've also rebased the patch against
> the current master and selected a new OID because my old
> one is now in use.
>
> > * you used macaddr_cmp_internal() function name, for uuid
> >   the same function is named uuid_internal_cmp().  Using
> >   the same naming pattern is probably better.
>
> I was a little split on this one! It's true that UUID uses
> `_internal_cmp`, but `_cmp_internal` is also used in a
> number of places like `enum`, `timetz`, and `network`. I
> don't have a strong feeling about it either way, so I've
> changed it to `_internal_cmp` to match UUID as you
> suggested.
>
> > * the function comment on macaddr_abbrev_convert()
> >   doesn't mention specific little-endian handling
>
> I tried to bake this into the comment text. Here are the
> relevant lines of the amended version:
>
> * Packs the bytes of a 6-byte MAC address into a Datum and treats it
> as an
> * unsigned integer for purposes of comparison. On a 64-bit machine,
> there
> * will be two zeroed bytes of padding. The integer is converted to
> native
> * endianness to facilitate easy comparison.
>
> > * "There will be two bytes of zero padding on the least
> >   significant end"
> >
> > "least significant bits" would be better
>
> Also done. Here is the amended version:
>
> * On a 64-bit machine, zero out the 8-byte datum and copy the 6 bytes
> of
> * the MAC address in. There will be two bytes of zero padding on the
> end
> * of the least significant bits.
>
> > * This patch will trigger quite a lot modifications after
> >   a pgindent run.  Could you try to run pgindent on mac.c
> >   before sending an updated patch?
>
> Good call! I've run the new version through pgindent.
>
> Let me know if you have any further feedback and/or
> suggestions. Thanks!
>
> Brandur
>
>
> On Wed, Sep 14, 2016 at 3:14 AM, Julien Rouhaud <julien.rouh...@dalibo.com
> > wrote:
>
>> On 26/08/2016 19:44, Brandur wrote:
>> > Hello,
>> >
>>
>> Hello,
>>
>> > I've attached a patch to add SortSupport for Postgres' macaddr which
>> has the
>> > effect of improving the performance of sorting operations for the type.
>> The
>> > strategy that I employ is very similar to that for UUID, which is to
>> create
>> > abbreviated keys by packing as many bytes from the MAC address as
>> possible into
>> > Datums, and then performing fast unsigned integer comparisons while
>> sorting.
>> >
>> > I ran some informal local benchmarks, and for cardinality greater than
>> 100k
>> > rows, I see a speed up on `CREATE INDEX` of roughly 25% to 65%. (For
>> those
>> > interested, I put a few more numbers into a small report here [2].)
>> >
>>
>> That's a nice improvement!
>>
>> > Admittedly, this is not quite as useful as speeding up sorting on a
>> more common
>> > data type like TEXT or UUID, but the change still seems like a useful
>> > performance improvement. I largely wrote it as an exercise to
>> familiarize
>> > myself with the Postgres codebase.
>> >
>> > I'll add an entry into the current commitfest as suggested by the
>> Postgres Wiki
>> > and follow up here with a link.
>> >
>> > Thanks, and if anyone has feedback or other thoughts, let me know!
>> >
>>
>> I just reviewed your patch.  It applies and compiles cleanly, and the
>> abbrev feature works as intended.  There's not much to say since this is
>> heavily inspired on the uuid SortSupport. The only really specific part
>> is in the abbrev_converter function, and I don't see any issue with it.
>>
>> I have a few trivial comments:
>>
>> * you used macaddr_cmp_internal() function name, for uuid the same
>> function is named uuid_internal_cmp().  Using the same naming pattern is
>> probably better.
>>
>> * the function comment on macaddr_abbrev_convert() doesn't mention
>> specific little-endian handling
>>
>> * "There will be two bytes of zero padding on the least significant end"
>>
>> "least significant bits" would be better
>>
>> * This patch will trigger quite a lot modifications after a pgindent
>> run.  Could you try to run pgindent on mac.c before sending an updated
>> patch?
>>
>> Best regards.
>>
>> --
>> Julien Rouhaud
>> http://dalibo.com - http://dalibo.org
>>
>
>


Re: [HACKERS] [PATCH] SortSupport for macaddr type

2017-02-05 Thread Brandur Leach
Hi Julien,

Thank you for taking the time to do this review, and my
apologies for the very delayed response. I lost track of
this work and have only jumped back into it today.

Please find attached a new version of the patch with your
feedback integrated. I've also rebased the patch against
the current master and selected a new OID because my old
one is now in use.

> * you used macaddr_cmp_internal() function name, for uuid
>   the same function is named uuid_internal_cmp().  Using
>   the same naming pattern is probably better.

I was a little split on this one! It's true that UUID uses
`_internal_cmp`, but `_cmp_internal` is also used in a
number of places like `enum`, `timetz`, and `network`. I
don't have a strong feeling about it either way, so I've
changed it to `_internal_cmp` to match UUID as you
suggested.

> * the function comment on macaddr_abbrev_convert()
>   doesn't mention specific little-endian handling

I tried to bake this into the comment text. Here are the
relevant lines of the amended version:

* Packs the bytes of a 6-byte MAC address into a Datum and treats it as
an
* unsigned integer for purposes of comparison. On a 64-bit machine,
there
* will be two zeroed bytes of padding. The integer is converted to
native
* endianness to facilitate easy comparison.

> * "There will be two bytes of zero padding on the least
>   significant end"
>
> "least significant bits" would be better

Also done. Here is the amended version:

* On a 64-bit machine, zero out the 8-byte datum and copy the 6 bytes of
* the MAC address in. There will be two bytes of zero padding on the end
* of the least significant bits.

> * This patch will trigger quite a lot modifications after
>   a pgindent run.  Could you try to run pgindent on mac.c
>   before sending an updated patch?

Good call! I've run the new version through pgindent.

Let me know if you have any further feedback and/or
suggestions. Thanks!

Brandur


On Wed, Sep 14, 2016 at 3:14 AM, Julien Rouhaud 
wrote:

> On 26/08/2016 19:44, Brandur wrote:
> > Hello,
> >
>
> Hello,
>
> > I've attached a patch to add SortSupport for Postgres' macaddr which has
> the
> > effect of improving the performance of sorting operations for the type.
> The
> > strategy that I employ is very similar to that for UUID, which is to
> create
> > abbreviated keys by packing as many bytes from the MAC address as
> possible into
> > Datums, and then performing fast unsigned integer comparisons while
> sorting.
> >
> > I ran some informal local benchmarks, and for cardinality greater than
> 100k
> > rows, I see a speed up on `CREATE INDEX` of roughly 25% to 65%. (For
> those
> > interested, I put a few more numbers into a small report here [2].)
> >
>
> That's a nice improvement!
>
> > Admittedly, this is not quite as useful as speeding up sorting on a more
> common
> > data type like TEXT or UUID, but the change still seems like a useful
> > performance improvement. I largely wrote it as an exercise to familiarize
> > myself with the Postgres codebase.
> >
> > I'll add an entry into the current commitfest as suggested by the
> Postgres Wiki
> > and follow up here with a link.
> >
> > Thanks, and if anyone has feedback or other thoughts, let me know!
> >
>
> I just reviewed your patch.  It applies and compiles cleanly, and the
> abbrev feature works as intended.  There's not much to say since this is
> heavily inspired on the uuid SortSupport. The only really specific part
> is in the abbrev_converter function, and I don't see any issue with it.
>
> I have a few trivial comments:
>
> * you used macaddr_cmp_internal() function name, for uuid the same
> function is named uuid_internal_cmp().  Using the same naming pattern is
> probably better.
>
> * the function comment on macaddr_abbrev_convert() doesn't mention
> specific little-endian handling
>
> * "There will be two bytes of zero padding on the least significant end"
>
> "least significant bits" would be better
>
> * This patch will trigger quite a lot modifications after a pgindent
> run.  Could you try to run pgindent on mac.c before sending an updated
> patch?
>
> Best regards.
>
> --
> Julien Rouhaud
> http://dalibo.com - http://dalibo.org
>


0002-Implement-SortSupport-for-macaddr-data-type.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers