Re: [Haskell-cafe] Re: Detecting system endianness
Maurício wrote:
> But why would you want that? I understand the only situation when talking
> about number of bytes makes sense is when you are using Foreign and Ptr.
> (...)

Because I'm using both Ptr and Foreign? ;) See my recent announcement for bytestring-trie. One of the optimizations I'm working on is to read off a full natural word at a time, (...)

> I see, you mean the size of a machine word, not of Data.Word.

AFAIK, Data.Word.Word is defined to be the same size as Prelude.Int (which it isn't on GHC 6.8.2 on Intel OS X: 32 bits vs 31 bits), and Int is defined to be at least 31 bits but can be more. My interpretation of this is that Int and Word will generally be implemented by the architecture's natural word size in order to optimize performance, much like C's int and unsigned int but with a better definition of allowed sizes. This seems to be supported by the existence of the definite-sized variants Word8, Word16, Word32...

So yeah, I mean the machine word, but I think Word is intended to proxy for that. Maybe I'm wrong, but provided that Word contains (or can be persuaded to contain) a round number of Word8, and that operations on Word are cheaper than the analogous sequence of operations on the Word8 representation, that's good enough for my needs.

-- 
Live well,
~wren

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
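[Editor's note: the sizes under discussion can be checked directly via the Storable instances. A minimal sketch (the helper names intBytes/wordBytes are mine, not from the thread):]

```haskell
import Data.Word (Word, Word8)
import Foreign.Storable (sizeOf)

-- Sizes in bytes, as reported by the Storable instances.
intBytes, wordBytes :: Int
intBytes  = sizeOf (undefined :: Int)
wordBytes = sizeOf (undefined :: Word)

main :: IO ()
main = do
  putStrLn $ "Int:  " ++ show intBytes  ++ " bytes"
  putStrLn $ "Word: " ++ show wordBytes ++ " bytes"
  -- How many Word8s fit in one Word (the "round number" question above):
  putStrLn $ "Word8s per Word: " ++ show (wordBytes `div` sizeOf (undefined :: Word8))
```

On common GHC targets both report 4 or 8 bytes, so Word does hold a round number of Word8.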
Re: [Haskell-cafe] Re: Detecting system endianness
On Tue, Dec 23, 2008 at 07:44:14PM -0500, wren ng thornton wrote:
> AFAIK, Data.Word.Word is defined to be the same size as Prelude.Int
> (which it isn't on GHC 6.8.2 on Intel OS X: 32 bits vs 31 bits), and Int
> is defined to be at least 31 bits but can be more. My interpretation of
> this is that Int and Word will generally be implemented by the
> architecture's natural word size in order to optimize performance, much
> like C's int and unsigned int but with a better definition of allowed
> sizes. This seems to be supported by the existence of the definite-sized
> variants Word8, Word16, Word32...

Of course, "natural word size" can mean 'natural pointer size' or 'natural int size', which are different on many architectures. So you want to be careful about which one you want.

> So yeah, I mean the machine word, but I think Word is intended to proxy
> for that. Maybe I'm wrong, but provided that Word contains (or can be
> persuaded to contain) a round number of Word8, and that operations on
> Word are cheaper than the analogous sequence of operations on the Word8
> representation, that's good enough for my needs.

If you want to find out the 'natural' sizes, then look at the 'CInt', 'Ptr', and 'FunPtr' types, which follow the C 'int', 'void *', and 'void (*fn)()' types. So they will conform to the architecture ABI for the underlying spec/operating system. If you just want a type guaranteed to be able to hold a pointer or an integer, use 'IntPtr' or 'WordPtr', which are provided for just that case.

        John

-- 
John Meacham - ⑆repetae.net⑆john⑈
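[Editor's note: the C-flavoured types John names all carry Storable instances, so their platform sizes can be inspected the same way. A minimal sketch (the helper names are mine):]

```haskell
import Foreign.C.Types (CInt)
import Foreign.Ptr (Ptr, FunPtr, IntPtr, WordPtr)
import Foreign.Storable (sizeOf)

-- Sizes in bytes of the types discussed above.
cIntBytes, ptrBytes, funPtrBytes, intPtrBytes, wordPtrBytes :: Int
cIntBytes    = sizeOf (undefined :: CInt)      -- C 'int'
ptrBytes     = sizeOf (undefined :: Ptr ())    -- C 'void *'
funPtrBytes  = sizeOf (undefined :: FunPtr ()) -- C 'void (*fn)()'
intPtrBytes  = sizeOf (undefined :: IntPtr)    -- signed integer wide enough for a pointer
wordPtrBytes = sizeOf (undefined :: WordPtr)   -- unsigned integer wide enough for a pointer

main :: IO ()
main = mapM_ putStrLn
  [ "CInt:    " ++ show cIntBytes    ++ " bytes"
  , "Ptr ():  " ++ show ptrBytes     ++ " bytes"
  , "FunPtr:  " ++ show funPtrBytes  ++ " bytes"
  , "IntPtr:  " ++ show intPtrBytes  ++ " bytes"
  , "WordPtr: " ++ show wordPtrBytes ++ " bytes"
  ]
```

On an LP64 platform this makes the pointer/int split visible: CInt stays at 4 bytes while the pointer-shaped types are 8.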
Re: [Haskell-cafe] Re: Detecting system endianness
Maurício wrote:
> But why would you want that? I understand the only situation when talking
> about number of bytes makes sense is when you are using Foreign and Ptr.
> Besides that, you can only guess the amount of memory you need to deal
> with your data (taking laziness, GC etc. into account).

Because I'm using both Ptr and Foreign? ;) See my recent announcement for bytestring-trie. One of the optimizations I'm working on is to read off a full natural word at a time, instead of just one byte. To do this properly I need to detect the word size so that I don't accidentally read garbage off the end of the ByteString when there's less than a natural word left.

Detecting endianness is similar because it determines how to interpret that word as if it were an array of bytes, which is needed to get the correct behavior when interpreting the word as a bit-vector for trieing. That is, if you only read the first two bytes on a big-endian machine, then you're skipping the 4/6/? bytes which are actually at the beginning of the bytestring.

I'm not sure how important physical endianness of bytes within a word is. For IntMap a common case is to get large contiguous chunks of keys, so logical big-endian trieing improves performance over logical little-endian. I'm not sure how common large contiguous chunks of bytestring keys are, though. Reading a word then changing the physical endianness of the bytes seems expensive.

-- 
Live well,
~wren
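[Editor's note: endianness can be detected at runtime with exactly the Foreign machinery the thread discusses: poke a multi-byte word, then peek its first byte. A minimal sketch, assuming a hypothetical helper isLittleEndian (not from the thread):]

```haskell
import Data.Word (Word8, Word32)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Storable (poke, peekByteOff)

-- Write a known 32-bit pattern into memory and inspect which byte lands
-- first: 0x04 first means little-endian, 0x01 first means big-endian.
isLittleEndian :: IO Bool
isLittleEndian = alloca $ \p -> do
  poke p (0x01020304 :: Word32)
  firstByte <- peekByteOff p 0 :: IO Word8
  return (firstByte == 0x04)

main :: IO ()
main = do
  le <- isLittleEndian
  putStrLn (if le then "little-endian" else "big-endian")
```

This is the interpret-the-word-as-an-array-of-bytes trick in miniature: the numeric value poked is the same everywhere, but the byte layout peekByteOff sees depends on the machine.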