Re: [GHC] #1447: Improve code generation for dense switches
#1447: Improve code generation for dense switches -+-- Reporter: simonpj |Owner: Type: task | Status: new Priority: normal|Milestone: _|_ Component: Compiler | Version: 6.6.1 Keywords:| Testcase: Blockedby:| Difficulty: Moderate (less than a day) Os: Unknown/Multiple | Blocking: Architecture: Unknown/Multiple | Failure: None/Unknown -+-- Changes (by igloo): * failure: = None/Unknown Old description: (Adding this for Thomas Conway.) I'm writing some variable byte codec routines, which are used in inner^N loops, and which I want to make really efficient. There are several spots where the code uses lookup tables. The best example is the table, which given the first byte, returns the number of additional bytes. It is a precomputed version of the following function: {{{ codeLength :: Word8 - Int codeLength w | w .. 0x80 == 0 = 0 | otherwise = 1 + (codeLength $ w `shiftL` 1) }}} from which we construct a 256 entry table: {{{ codeLen 0 = 0 codeLen 1 = 0 codeLen 2 = 0 ... codeLen 127 = 0 codeLen 128 = 1 ... codeLen 255 = 8 }}} Now, compiling with ghc 6.6.1 and -O3, I see that it generates a long chain of conditional branches. Now, even taking laziness into account, this seems like terribly inefficient code. I wrote this thinking it would be more efficient than constructing a CAF array: {{{ codeLens = listArray (0,255) (map codeLength [0..255]) }}} but I'm guessing the CAF version would probably work out much more efficient in the long run. However, I think ghc (and any other compiler), should detect the following properties: 1. all the clauses are mutually exclusive, so the sequencing is irrelevant to the semantics 2. Given an explicit type declaration Word8 - , the 256 cases cover all the possible constructors of the type, so there are no missing clauses. Even if you leave out property 2 and include bounds checks, this seems like an important kind of function to optimize well. So what have I missed, or is it time to learn how to hack on ghc? New description: (Adding this for Thomas Conway.) I'm writing some variable byte codec routines, which are used in inner!^N loops, and which I want to make really efficient. There are several spots where the code uses lookup tables. The best example is the table, which given the first byte, returns the number of additional bytes. It is a precomputed version of the following function: {{{ codeLength :: Word8 - Int codeLength w | w .. 0x80 == 0 = 0 | otherwise = 1 + (codeLength $ w `shiftL` 1) }}} from which we construct a 256 entry table: {{{ codeLen 0 = 0 codeLen 1 = 0 codeLen 2 = 0 ... codeLen 127 = 0 codeLen 128 = 1 ... codeLen 255 = 8 }}} Now, compiling with ghc 6.6.1 and -O3, I see that it generates a long chain of conditional branches. Now, even taking laziness into account, this seems like terribly inefficient code. I wrote this thinking it would be more efficient than constructing a CAF array: {{{ codeLens = listArray (0,255) (map codeLength [0..255]) }}} but I'm guessing the CAF version would probably work out much more efficient in the long run. However, I think ghc (and any other compiler), should detect the following properties: 1. all the clauses are mutually exclusive, so the sequencing is irrelevant to the semantics 2. Given an explicit type declaration Word8 - , the 256 cases cover all the possible constructors of the type, so there are no missing clauses. Even if you leave out property 2 and include bounds checks, this seems like an important kind of function to optimize well. So what have I missed, or is it time to learn how to hack on ghc? -- -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/1447#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler ___ Glasgow-haskell-bugs mailing list Glasgow-haskell-bugs@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs
Re: [GHC] #1447: Improve code generation for dense switches
#1447: Improve code generation for dense switches -+-- Reporter: simonpj | Owner: Type: task| Status: closed Priority: normal | Milestone: _|_ Component: Compiler|Version: 6.6.1 Resolution: invalid | Keywords: Testcase: | Blockedby: Difficulty: Moderate (less than a day) | Os: Unknown/Multiple Blocking: | Architecture: Unknown/Multiple Failure: None/Unknown| -+-- Changes (by igloo): * status: new = closed * resolution: = invalid Comment: No response from submitter, so closing. -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/1447#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler ___ Glasgow-haskell-bugs mailing list Glasgow-haskell-bugs@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs
Re: [GHC] #1447: Improve code generation for dense switches
#1447: Improve code generation for dense switches -+-- Reporter: simonpj |Owner: Type: task | Status: new Priority: normal|Milestone: _|_ Component: Compiler | Version: 6.6.1 Severity: normal| Resolution: Keywords:| Difficulty: Moderate (1 day) Os: Unknown | Testcase: Architecture: Unknown | -+-- Comment (by simonmar): I have a feeling the HEAD should be better here, I fixed some bugs in the switch compilation post 6.6 while working on tagging. If you could try out the example with a recent GHC snapshot, that would be useful. Or post some actual code so that we can reproduce. -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/1447 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler___ Glasgow-haskell-bugs mailing list Glasgow-haskell-bugs@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs
[GHC] #1447: Improve code generation for dense switches
#1447: Improve code generation for dense switches ---+ Reporter: simonpj | Owner: Type: task | Status: new Priority: normal| Milestone: _|_ Component: Compiler |Version: 6.6.1 Severity: normal| Keywords: Difficulty: Moderate (1 day) | Os: Unknown Testcase:| Architecture: Unknown ---+ (Adding this for Thomas Conway.) I'm writing some variable byte codec routines, which are used in inner^N loops, and which I want to make really efficient. There are several spots where the code uses lookup tables. The best example is the table, which given the first byte, returns the number of additional bytes. It is a precomputed version of the following function: {{{ codeLength :: Word8 - Int codeLength w | w .. 0x80 == 0 = 0 | otherwise = 1 + (codeLength $ w `shiftL` 1) }}} from which we construct a 256 entry table: {{{ codeLen 0 = 0 codeLen 1 = 0 codeLen 2 = 0 ... codeLen 127 = 0 codeLen 128 = 1 ... codeLen 255 = 8 }}} Now, compiling with ghc 6.6.1 and -O3, I see that it generates a long chain of conditional branches. Now, even taking laziness into account, this seems like terribly inefficient code. I wrote this thinking it would be more efficient than constructing a CAF array: {{{ codeLens = listArray (0,255) (map codeLength [0..255]) }}} but I'm guessing the CAF version would probably work out much more efficient in the long run. However, I think ghc (and any other compiler), should detect the following properties: 1. all the clauses are mutually exclusive, so the sequencing is irrelevant to the semantics 2. Given an explicit type declaration Word8 - , the 256 cases cover all the possible constructors of the type, so there are no missing clauses. Even if you leave out property 2 and include bounds checks, this seems like an important kind of function to optimize well. So what have I missed, or is it time to learn how to hack on ghc? -- Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/1447 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler___ Glasgow-haskell-bugs mailing list Glasgow-haskell-bugs@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs