RE: [Haskell-cafe] Optimising UTF8-CString -> String marshaling, plus comments on withCStringLen/peekCStringLen

2007-07-23 Thread Bayley, Alistair
Hello cafe, (Following up on my own optimisation question, and Duncan's advice to look at http://darcs.haskell.org/ghc/compiler/utils/Encoding.hs

RE: [Haskell-cafe] Optimising UTF8-CString -> String marshaling, plus comments on withCStringLen/peekCStringLen

2007-07-23 Thread Bayley, Alistair
Stefan O'Rear wrote: fromUTF8Ptr unboxes fine for me with HEAD and 6.6.1. - the chr function tests that its Int argument is at most 1114111 (0x10FFFF), before constructing the Char. It'd be nice to avoid this test. You want unsafeChr
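To make the chr/unsafeChr distinction concrete, here is a minimal sketch. The wrapper names checkedChar and uncheckedChar are hypothetical and only for illustration; chr comes from Data.Char and unsafeChr from GHC.Base.

```haskell
import Data.Char (chr)
import GHC.Base (unsafeChr)

-- | Checked conversion: 'chr' raises an error when the argument lies
-- outside the valid code point range (anything above 0x10FFFF).
checkedChar :: Int -> Char
checkedChar = chr

-- | Unchecked conversion: no range test at all. Only safe when the
-- caller has already established that the argument is a valid code
-- point, for example a decoder that has validated its input.
-- (Hypothetical wrapper, not part of the thread's code.)
uncheckedChar :: Int -> Char
uncheckedChar = unsafeChr
```

For valid code points both functions return the same Char; the unchecked one simply skips the comparison, so passing it an out-of-range Int is the caller's responsibility.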

Re: [Haskell-cafe] Optimising UTF8-CString -> String marshaling, plus comments on withCStringLen/peekCStringLen

2007-07-22 Thread Alistair Bayley
Hello cafe, (Following up on my own optimisation question, and Duncan's advice to look at http://darcs.haskell.org/ghc/compiler/utils/Encoding.hs) If you want to look at some existing optimised UTF8 encoding/decoding code then take a look at the code used in GHC:

Re: [Haskell-cafe] Optimising UTF8-CString -> String marshaling, plus comments on withCStringLen/peekCStringLen

2007-07-22 Thread Stefan O'Rear
On Mon, Jun 04, 2007 at 09:43:32AM +0100, Alistair Bayley wrote: (The docs tell me that using GHC.Exts is the approved way of accessing GHC-specific extensions, but all of the useful stuff seems to be in GHC.Prim.) All of the useful stuff *is* exported from GHC.Exts, it even says so in the
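As a small illustration of reaching the primitives through GHC.Exts rather than GHC.Prim, the sketch below unpacks two Ints and adds them with the (+#) primop. The function name plusUnboxed is hypothetical; the MagicHash extension is needed for the # names.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), (+#))

-- | Add two Ints by matching on the I# constructor to expose the
-- underlying Int# values, then applying the primitive (+#).
-- Everything used here is imported from GHC.Exts alone.
plusUnboxed :: Int -> Int -> Int
plusUnboxed (I# a) (I# b) = I# (a +# b)
```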

Re: [Haskell-cafe] Optimising UTF8-CString -> String marshaling, plus comments on withCStringLen/peekCStringLen

2007-06-15 Thread Alistair Bayley
Simon, Sorry for the delay in responding. I'm using 6.6, so I'll upgrade to 6.6.1 and retest. Presumably you're only interested if this behaviour persists in 6.6.1. I'll check both cases and make test cases for them if necessary. I've upgraded to 6.6.1 and am pleased to report that there

Re: [Haskell-cafe] Optimising UTF8-CString -> String marshaling, plus comments on withCStringLen/peekCStringLen

2007-06-08 Thread Alistair Bayley
Simon, You're right, both versions should give the same code. Which version of GHC are you using? Both with the HEAD and with 6.6.1 I get the nice unboxed code with the `seq` version too. My test program is below. I'm using 6.6, so I'll upgrade to 6.6.1 and retest. Presumably you're only

RE: [Haskell-cafe] Optimising UTF8-CString -> String marshaling, plus comments on withCStringLen/peekCStringLen

2007-06-07 Thread Simon Peyton-Jones
{- Arity: 4 Strictness: LSSL -} Right. Unboxed args are always given the annotation L. So that function is strict in that pointer arg, but GHC is choosing not to unbox

Re: [Haskell-cafe] Optimising UTF8-CString -> String marshaling, plus comments on withCStringLen/peekCStringLen

2007-06-05 Thread Alistair Bayley
{- Arity: 4 Strictness: LSSL -} Right. Unboxed args are always given the annotation L. So that function is strict in that pointer arg, but GHC is choosing not to unbox it. I'm not sure why that's the case. I thought maybe it was because I hadn't said -funbox-strict-fields, but it didn't
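The kind of loop under discussion might look roughly like the sketch below. This is not the thread's actual test program: sumBytes and sumList are hypothetical names, and the function just sums bytes. The `seq` calls make the function strict in the pointer and the accumulator, which gives GHC the option, not a guarantee, of passing them unboxed.

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Array (withArrayLen)
import Foreign.Ptr (Ptr, plusPtr)
import Foreign.Storable (peek)

-- | Hypothetical loop: add up n bytes starting at p.
-- Forcing p and acc with `seq` before the recursive step tells the
-- strictness analyser that both are always evaluated.
sumBytes :: Ptr Word8 -> Int -> Int -> IO Int
sumBytes p n acc
  | n <= 0    = return acc
  | otherwise = p `seq` acc `seq` do
      b <- peek p
      sumBytes (p `plusPtr` 1) (n - 1) (acc + fromIntegral b)

-- | Convenience wrapper (also hypothetical): copy a list into a
-- temporary buffer and run the loop over it.
sumList :: [Word8] -> IO Int
sumList bs = withArrayLen bs (\n p -> sumBytes p n 0)
```

Whether a given argument is actually unboxed depends on the compiler version and optimisation flags, so the generated code still needs to be inspected.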

Re: [Haskell-cafe] Optimising UTF8-CString -> String marshaling, plus comments on withCStringLen/peekCStringLen

2007-06-04 Thread Duncan Coutts
On Mon, 2007-06-04 at 09:43 +0100, Alistair Bayley wrote: After some experiments with the simplifier, I think I have a portable version of a direct-from-buffer decoder which seems to perform nearly as well as one written directly against GHC primitive unboxed functions. I'm wondering if
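For readers who have not seen one, a portable direct-from-buffer decoder can be sketched using only the FFI libraries. This is an illustration under stated assumptions, not the code from the thread: the names decodeBuffer and decodeBytes are hypothetical, and the input is assumed to be well-formed UTF-8 (no validation of continuation bytes, overlong forms or truncated sequences).

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)
import Foreign.Marshal.Array (withArrayLen)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff)

-- | Decode len bytes of UTF-8 at buf straight into a String, with no
-- intermediate [Word8]. ASSUMES well-formed input.
decodeBuffer :: Ptr Word8 -> Int -> IO String
decodeBuffer buf len = go 0
  where
    -- Read the byte at offset i as an Int.
    byte :: Int -> IO Int
    byte i = fromIntegral <$> peekElemOff buf i

    -- Payload bits of a continuation byte (10xxxxxx).
    cont :: Int -> IO Int
    cont i = (.&. 0x3F) <$> byte i

    go i
      | i >= len  = return []
      | otherwise = do
          b0 <- byte i
          -- Work out the code point and the length of the sequence.
          (c, n) <- case () of
            _ | b0 < 0x80 -> return (b0, 1)
              | b0 < 0xE0 -> do
                  b1 <- cont (i + 1)
                  return (((b0 .&. 0x1F) `shiftL` 6) .|. b1, 2)
              | b0 < 0xF0 -> do
                  b1 <- cont (i + 1)
                  b2 <- cont (i + 2)
                  return (((b0 .&. 0x0F) `shiftL` 12)
                            .|. (b1 `shiftL` 6) .|. b2, 3)
              | otherwise -> do
                  b1 <- cont (i + 1)
                  b2 <- cont (i + 2)
                  b3 <- cont (i + 3)
                  return (((b0 .&. 0x07) `shiftL` 18)
                            .|. (b1 `shiftL` 12)
                            .|. (b2 `shiftL` 6) .|. b3, 4)
          rest <- go (i + n)
          return (chr c : rest)

-- | Hypothetical helper for trying the decoder out on a byte list.
decodeBytes :: [Word8] -> IO String
decodeBytes bs = withArrayLen bs (\n p -> decodeBuffer p n)
```

The recursion is not tail-recursive and the result is built strictly inside IO, which keeps the sketch short; a tuned version would make different choices.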

Re: [Haskell-cafe] Optimising UTF8-CString -> String marshaling, plus comments on withCStringLen/peekCStringLen

2007-06-04 Thread Alistair Bayley
On 04/06/07, Duncan Coutts wrote: On Mon, 2007-06-04 at 09:43 +0100, Alistair Bayley wrote: After some experiments with the simplifier, ... The portable unboxed version is within about 15% of the unboxed version in terms of time and allocation. Well done. Of course, that

Re: [Haskell-cafe] Optimising UTF8-CString -> String marshaling, plus comments on withCStringLen/peekCStringLen

2007-06-04 Thread Duncan Coutts
On Mon, 2007-06-04 at 13:12 +0100, Alistair Bayley wrote: BTW, what's the difference between the indexXxxxOffAddr# and readXxxxOffAddr# functions in GHC.Prim? Right. So it'd only be safe to use the index ones on immutable arrays because there's no way to enforce sequencing with
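The difference shows up in the types. A minimal sketch using the Char variants of the two primop families follows; the wrapper names indexChar, readChar and literalSecond are hypothetical. The index form is a pure function, while the read form threads a State# token and so can be wrapped as an IO action.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts (Addr#, Char (C#), Int (I#), indexCharOffAddr#, readCharOffAddr#)
import GHC.IO (IO (IO))

-- | Pure read: there is no State# token, so nothing orders this read
-- relative to any writes. Only appropriate for memory that never
-- changes, such as a string literal.
indexChar :: Addr# -> Int -> Char
indexChar addr (I# i) = C# (indexCharOffAddr# addr i)

-- | Sequenced read: the State# token is threaded through the primop,
-- so the read is ordered with respect to other IO actions.
readChar :: Addr# -> Int -> IO Char
readChar addr (I# i) = IO (\s ->
  case readCharOffAddr# addr i s of
    (# s', c #) -> (# s', C# c #))

-- | Indexing into an immutable literal is a safe use of the pure form.
literalSecond :: Char
literalSecond = indexChar "abc"# 1
```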

Re: [Haskell-cafe] Optimising UTF8-CString -> String marshaling, plus comments on withCStringLen/peekCStringLen

2007-05-23 Thread Duncan Coutts
On Wed, 2007-05-23 at 10:45 +0100, Alistair Bayley wrote: Hello cafe, D'ya fancy an optimisation exercise? In Takusen we currently marshal UTF8-encoded CStrings by first turning the CString into [Word8], and then running this through a [Word8] -> String UTF8 decoder. We thought it would be
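The two-step approach described here can be sketched as follows. This is not Takusen's code: cstringBytes, decodeUtf8, peekUtf8 and viaCString are hypothetical names, and the decoder assumes well-formed UTF-8 without validating it.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)
import Foreign.C.String (CString)
import Foreign.Marshal.Array (peekArray0, withArray0)
import Foreign.Ptr (castPtr)

-- | Step 1: copy the NUL-terminated buffer into a list of bytes.
cstringBytes :: CString -> IO [Word8]
cstringBytes cs = peekArray0 0 (castPtr cs)

-- | Step 2: a pure [Word8] -> String decoder.
-- ASSUMES well-formed input; continuation bytes are not checked.
decodeUtf8 :: [Word8] -> String
decodeUtf8 [] = []
decodeUtf8 (b0 : bs)
  | b0 < 0x80 = chr (fromIntegral b0) : decodeUtf8 bs
  | b0 < 0xE0 = multi 1 (b0 .&. 0x1F)
  | b0 < 0xF0 = multi 2 (b0 .&. 0x0F)
  | otherwise = multi 3 (b0 .&. 0x07)
  where
    -- Fold n continuation bytes into the payload of the lead byte.
    multi n lead =
      let (conts, rest) = splitAt n bs
          step acc b = (acc `shiftL` 6) .|. fromIntegral (b .&. 0x3F)
      in chr (foldl step (fromIntegral lead) conts) : decodeUtf8 rest

-- | Both steps together: the intermediate list is the cost that the
-- optimisation exercise aims to remove.
peekUtf8 :: CString -> IO String
peekUtf8 cs = decodeUtf8 <$> cstringBytes cs

-- | Hypothetical helper: build a temporary NUL-terminated buffer from
-- bytes and decode it again.
viaCString :: [Word8] -> IO String
viaCString bs = withArray0 0 bs (\p -> peekUtf8 (castPtr p))
```

Every byte is allocated as a list cell before decoding starts, which is why decoding directly from the buffer is attractive.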