Re: Are constructors matched against using "switch" or chained if-else

2022-02-22 Thread David Feuer
That answer (which I did indeed write once upon a time) is somewhat
incomplete. Case matches on Ints actually work in about three different
ways:

1. Linear search
2. Binary search
3. Jump table

There's some amount of mix and match available for gappy things. I don't
know all the details. As for tag matching, I know none of the mechanics. I
just know that I've always been told to make the first constructor hot.

On Wed, Feb 23, 2022, 12:08 AM Clinton Mead  wrote:

> David,
>
> A random google search has revealed this StackOverflow answer
> , presumably by yourself or
> your evil twin, which mentions a binary search being performed. However the
> particular case you mention is a case match on "Ints", which are far from a
> dense set, whereas presumably you could do something nicer on the tag bits
> of constructors, which are a dense set?
>
>
> On Wed, Feb 23, 2022 at 1:59 PM David Feuer  wrote:
>
>> You can ask, but someone else will have to answer. Sorry.
>>
>> On Tue, Feb 22, 2022 at 9:52 PM Clinton Mead 
>> wrote:
>> >
>> > Thanks David. Can I ask why? Is it because the first constructor is
>> treated specially? (perhaps because it has zeroed tag bits)? Or is it just
>> because there's always an if/else chain in order of the constructor
>> definition regardless of the order of the case statement so the higher up
>> the list the better?
>> >
>> > On Wed, Feb 23, 2022 at 1:34 PM David Feuer 
>> wrote:
>> >>
>> >> I can answer one of your questions for sure: the order of your case
>> branches doesn't matter at all. However, the order of the data constructors
>> in the type declaration does matter. Put your most likely one first.
>> >>
>> >> On Tue, Feb 22, 2022, 9:09 PM Clinton Mead 
>> wrote:
>> >>>
>> >>> Hi All
>> >>>
>> >>> I'm developing an unbounded integer type, which I won't go into the
>> details here but in some circumstances has better performance than the
>> standard "Integer".
>> >>>
>> >>> Anyway, whilst there are complex cases, the most common case is a
>> standard machine int multiplication.
>> >>>
>> >>> Hence I want the type to be optimised for that case.
>> >>>
>> >>> I'm going to have a few constructors, anyway, so I first considered
>> something like this:
>> >>>
>> >>> `data MyInt = BasicZero | BasicPos Word | BasicNeg Word | ComplexPosA
>> ... | ComplexNegA ... | ComplexPosB ... | ComplexNegB ...`
>> >>>
>> >>> I'd naturally make the "Word"s in "BasicPos" and "BasicNeg"
>> strict/unpack, hopefully eliminating the indirection, or perhaps just
>> making them primitive directly.
>> >>>
>> >>> This has 7 constructors, which quite nicely I believe fits into the
>> three spare bits in a 64 bit pointer which GHC optimises I believe.
>> >>>
>> >>> However, this approach I thought of because I assumed that GHC would
>> do a switch style statement on the constructors, so once I have more than
>> one I might as well have as many as I want (well, up to 7, until I lose the
>> optimisation).
>> >>>
>> >>> But if GHC compiles this to a "if ... else if..." chain, a better
>> representation is the following:
>> >>>
>> >>> `data MyInt = BasicInt Int | ComplexPosA ... | ComplexNegA ... |
>> ComplexPosB ... | ComplexNegB ...`
>> >>>
>> >>> That way I can match against BasicInt first, and knock that out of
>> the way as my "hot" case. However, using "Int" instead of "Word" does make
>> the logic a bit more complex, but it may be worth it if I'm reducing the
>> number of patterns I have to check against for all arguments.
>> >>>
>> >>> I was just wondering if anyone could share some insight on what GHC
>> does in these circumstances? For example, does the order I list my cases in
>> a case statement matter if they're non-overlapping? Will GHC match them in
>> the order I list, or will it just make them into a switch statement so it
>> doesn't matter if I reorder them?
>> >>>
>> >>> I guess I could benchmark all this (and probably will) but some
>> insights and general guidance would be good.
>> >>>
>> >>> Thanks,
>> >>> Clinton
>> >>> ___
>> >>> Glasgow-haskell-users mailing list
>> >>> Glasgow-haskell-users@haskell.org
>> >>>
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users
>>
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users


Re: Are constructors matched against using "switch" or chained if-else

2022-02-22 Thread Clinton Mead
David,

A random google search has revealed this StackOverflow answer
, presumably by yourself or
your evil twin, which mentions a binary search being performed. However the
particular case you mention is a case match on "Ints", which are far from a
dense set, whereas presumably you could do something nicer on the tag bits
of constructors, which are a dense set?


On Wed, Feb 23, 2022 at 1:59 PM David Feuer  wrote:

> You can ask, but someone else will have to answer. Sorry.
>
> On Tue, Feb 22, 2022 at 9:52 PM Clinton Mead 
> wrote:
> >
> > Thanks David. Can I ask why? Is it because the first constructor is
> treated specially? (perhaps because it has zeroed tag bits)? Or is it just
> because there's always an if/else chain in order of the constructor
> definition regardless of the order of the case statement so the higher up
> the list the better?
> >
> > On Wed, Feb 23, 2022 at 1:34 PM David Feuer 
> wrote:
> >>
> >> I can answer one of your questions for sure: the order of your case
> branches doesn't matter at all. However, the order of the data constructors
> in the type declaration does matter. Put your most likely one first.
> >>
> >> On Tue, Feb 22, 2022, 9:09 PM Clinton Mead 
> wrote:
> >>>
> >>> Hi All
> >>>
> >>> I'm developing an unbounded integer type, which I won't go into the
> details here but in some circumstances has better performance than the
> standard "Integer".
> >>>
> >>> Anyway, whilst there are complex cases, the most common case is a
> standard machine int multiplication.
> >>>
> >>> Hence I want the type to be optimised for that case.
> >>>
> >>> I'm going to have a few constructors, anyway, so I first considered
> something like this:
> >>>
> >>> `data MyInt = BasicZero | BasicPos Word | BasicNeg Word | ComplexPosA
> ... | ComplexNegA ... | ComplexPosB ... | ComplexNegB ...`
> >>>
> >>> I'd naturally make the "Word"s in "BasicPos" and "BasicNeg"
> strict/unpack, hopefully eliminating the indirection, or perhaps just
> making them primitive directly.
> >>>
> >>> This has 7 constructors, which quite nicely I believe fits into the
> three spare bits in a 64 bit pointer which GHC optimises I believe.
> >>>
> >>> However, this approach I thought of because I assumed that GHC would
> do a switch style statement on the constructors, so once I have more than
> one I might as well have as many as I want (well, up to 7, until I lose the
> optimisation).
> >>>
> >>> But if GHC compiles this to a "if ... else if..." chain, a better
> representation is the following:
> >>>
> >>> `data MyInt = BasicInt Int | ComplexPosA ... | ComplexNegA ... |
> ComplexPosB ... | ComplexNegB ...`
> >>>
> >>> That way I can match against BasicInt first, and knock that out of the
> way as my "hot" case. However, using "Int" instead of "Word" does make the
> logic a bit more complex, but it may be worth it if I'm reducing the number
> of patterns I have to check against for all arguments.
> >>>
> >>> I was just wondering if anyone could share some insight on what GHC
> does in these circumstances? For example, does the order I list my cases in
> a case statement matter if they're non-overlapping? Will GHC match them in
> the order I list, or will it just make them into a switch statement so it
> doesn't matter if I reorder them?
> >>>
> >>> I guess I could benchmark all this (and probably will) but some
> insights and general guidance would be good.
> >>>
> >>> Thanks,
> >>> Clinton
> >>> ___
> >>> Glasgow-haskell-users mailing list
> >>> Glasgow-haskell-users@haskell.org
> >>> http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users


Re: Are constructors matched against using "switch" or chained if-else

2022-02-22 Thread David Feuer
You can ask, but someone else will have to answer. Sorry.

On Tue, Feb 22, 2022 at 9:52 PM Clinton Mead  wrote:
>
> Thanks David. Can I ask why? Is it because the first constructor is treated 
> specially? (perhaps because it has zeroed tag bits)? Or is it just because 
> there's always an if/else chain in order of the constructor definition 
> regardless of the order of the case statement so the higher up the list the 
> better?
>
> On Wed, Feb 23, 2022 at 1:34 PM David Feuer  wrote:
>>
>> I can answer one of your questions for sure: the order of your case branches 
>> doesn't matter at all. However, the order of the data constructors in the 
>> type declaration does matter. Put your most likely one first.
>>
>> On Tue, Feb 22, 2022, 9:09 PM Clinton Mead  wrote:
>>>
>>> Hi All
>>>
>>> I'm developing an unbounded integer type, which I won't go into the details 
>>> here but in some circumstances has better performance than the standard 
>>> "Integer".
>>>
>>> Anyway, whilst there are complex cases, the most common case is a standard 
>>> machine int multiplication.
>>>
>>> Hence I want the type to be optimised for that case.
>>>
>>> I'm going to have a few constructors, anyway, so I first considered 
>>> something like this:
>>>
>>> `data MyInt = BasicZero | BasicPos Word | BasicNeg Word | ComplexPosA ... | 
>>> ComplexNegA ... | ComplexPosB ... | ComplexNegB ...`
>>>
>>> I'd naturally make the "Word"s in "BasicPos" and "BasicNeg" strict/unpack, 
>>> hopefully eliminating the indirection, or perhaps just making them 
>>> primitive directly.
>>>
>>> This has 7 constructors, which quite nicely I believe fits into the three 
>>> spare bits in a 64 bit pointer which GHC optimises I believe.
>>>
>>> However, this approach I thought of because I assumed that GHC would do a 
>>> switch style statement on the constructors, so once I have more than one I 
>>> might as well have as many as I want (well, up to 7, until I lose the 
>>> optimisation).
>>>
>>> But if GHC compiles this to a "if ... else if..." chain, a better 
>>> representation is the following:
>>>
>>> `data MyInt = BasicInt Int | ComplexPosA ... | ComplexNegA ... | 
>>> ComplexPosB ... | ComplexNegB ...`
>>>
>>> That way I can match against BasicInt first, and knock that out of the way 
>>> as my "hot" case. However, using "Int" instead of "Word" does make the 
>>> logic a bit more complex, but it may be worth it if I'm reducing the number 
>>> of patterns I have to check against for all arguments.
>>>
>>> I was just wondering if anyone could share some insight on what GHC does in 
>>> these circumstances? For example, does the order I list my cases in a case 
>>> statement matter if they're non-overlapping? Will GHC match them in the 
>>> order I list, or will it just make them into a switch statement so it 
>>> doesn't matter if I reorder them?
>>>
>>> I guess I could benchmark all this (and probably will) but some insights 
>>> and general guidance would be good.
>>>
>>> Thanks,
>>> Clinton
>>> ___
>>> Glasgow-haskell-users mailing list
>>> Glasgow-haskell-users@haskell.org
>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users


Re: Are constructors matched against using "switch" or chained if-else

2022-02-22 Thread Clinton Mead
Thanks David. Can I ask why? Is it because the first constructor is treated
specially? (perhaps because it has zeroed tag bits)? Or is it just because
there's always an if/else chain in order of the constructor definition
regardless of the order of the case statement so the higher up the list the
better?

On Wed, Feb 23, 2022 at 1:34 PM David Feuer  wrote:

> I can answer one of your questions for sure: the order of your case
> branches doesn't matter at all. However, the order of the data constructors
> in the type declaration does matter. Put your most likely one first.
>
> On Tue, Feb 22, 2022, 9:09 PM Clinton Mead  wrote:
>
>> Hi All
>>
>> I'm developing an unbounded integer type, which I won't go into the
>> details here but in some circumstances has better performance than the
>> standard "Integer".
>>
>> Anyway, whilst there are complex cases, the most common case is a
>> standard machine int multiplication.
>>
>> Hence I want the type to be optimised for that case.
>>
>> I'm going to have a few constructors, anyway, so I first considered
>> something like this:
>>
>> `data MyInt = BasicZero | BasicPos Word | BasicNeg Word | ComplexPosA ...
>> | ComplexNegA ... | ComplexPosB ... | ComplexNegB ...`
>>
>> I'd naturally make the "Word"s in "BasicPos" and "BasicNeg"
>> strict/unpack, hopefully eliminating the indirection, or perhaps just
>> making them primitive directly.
>>
>> This has 7 constructors, which quite nicely I believe fits into the three
>> spare bits in a 64 bit pointer which GHC optimises I believe.
>>
>> However, this approach I thought of because I assumed that GHC would do a
>> switch style statement on the constructors, so once I have more than one I
>> might as well have as many as I want (well, up to 7, until I lose the
>> optimisation).
>>
>> But if GHC compiles this to a "if ... else if..." chain, a better
>> representation is the following:
>>
>> `data MyInt = BasicInt Int | ComplexPosA ... | ComplexNegA ... |
>> ComplexPosB ... | ComplexNegB ...`
>>
>> That way I can match against BasicInt first, and knock that out of the
>> way as my "hot" case. However, using "Int" instead of "Word" does make the
>> logic a bit more complex, but it may be worth it if I'm reducing the number
>> of patterns I have to check against for all arguments.
>>
>> I was just wondering if anyone could share some insight on what GHC does
>> in these circumstances? For example, does the order I list my cases in a
>> case statement matter if they're non-overlapping? Will GHC match them in
>> the order I list, or will it just make them into a switch statement so it
>> doesn't matter if I reorder them?
>>
>> I guess I could benchmark all this (and probably will) but some insights
>> and general guidance would be good.
>>
>> Thanks,
>> Clinton
>> ___
>> Glasgow-haskell-users mailing list
>> Glasgow-haskell-users@haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users
>>
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users


Re: Are constructors matched against using "switch" or chained if-else

2022-02-22 Thread David Feuer
I can answer one of your questions for sure: the order of your case
branches doesn't matter at all. However, the order of the data constructors
in the type declaration does matter. Put your most likely one first.

On Tue, Feb 22, 2022, 9:09 PM Clinton Mead  wrote:

> Hi All
>
> I'm developing an unbounded integer type, which I won't go into the
> details here but in some circumstances has better performance than the
> standard "Integer".
>
> Anyway, whilst there are complex cases, the most common case is a standard
> machine int multiplication.
>
> Hence I want the type to be optimised for that case.
>
> I'm going to have a few constructors, anyway, so I first considered
> something like this:
>
> `data MyInt = BasicZero | BasicPos Word | BasicNeg Word | ComplexPosA ...
> | ComplexNegA ... | ComplexPosB ... | ComplexNegB ...`
>
> I'd naturally make the "Word"s in "BasicPos" and "BasicNeg" strict/unpack,
> hopefully eliminating the indirection, or perhaps just making them
> primitive directly.
>
> This has 7 constructors, which quite nicely I believe fits into the three
> spare bits in a 64 bit pointer which GHC optimises I believe.
>
> However, this approach I thought of because I assumed that GHC would do a
> switch style statement on the constructors, so once I have more than one I
> might as well have as many as I want (well, up to 7, until I lose the
> optimisation).
>
> But if GHC compiles this to a "if ... else if..." chain, a better
> representation is the following:
>
> `data MyInt = BasicInt Int | ComplexPosA ... | ComplexNegA ... |
> ComplexPosB ... | ComplexNegB ...`
>
> That way I can match against BasicInt first, and knock that out of the way
> as my "hot" case. However, using "Int" instead of "Word" does make the
> logic a bit more complex, but it may be worth it if I'm reducing the number
> of patterns I have to check against for all arguments.
>
> I was just wondering if anyone could share some insight on what GHC does
> in these circumstances? For example, does the order I list my cases in a
> case statement matter if they're non-overlapping? Will GHC match them in
> the order I list, or will it just make them into a switch statement so it
> doesn't matter if I reorder them?
>
> I guess I could benchmark all this (and probably will) but some insights
> and general guidance would be good.
>
> Thanks,
> Clinton
> ___
> Glasgow-haskell-users mailing list
> Glasgow-haskell-users@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users
>
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/glasgow-haskell-users