Re[2]: [Haskell-cafe] A very edgy language (was: A very nontrivial parser)

2007-07-08 Thread Bulat Ziganshin
Hello Thomas,

Sunday, July 8, 2007, 2:36:43 AM, you wrote:
 This is certainly true. I've coded up, in less than six months,
 something that uses better algorithms and finer-grained concurrency
 than the software I used to work on, and the latter represented 5 or
 more man-years of coding. However, this is server software, which is
 long-running, so performance and memory usage are pretty important, and
 these are relatively hard to get right in Haskell.

i've improved the memory usage of my program 3x one month after i
started to use Haskell, and by another 4x 1.5 years later (the last
improvement included developing a ByteString-like library and
strictifying some computations). i think that for programming-in-the-large
an experienced haskeller may reach C-like levels of efficiency, unlike for
programming-in-the-small (i.e. implementing raw computations)

-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]



Re: [Haskell-cafe] A very nontrivial parser

2007-07-08 Thread Andrew Coppin

Claus Reinke wrote:
ah, that suggests yet another specification, a variation of the second 
version above, where the parser in control is not p1 itself, but p2, 
with p1 acting as an input transformation for p2, and p3 resuming 
where p1 left off. the difference being that p2's demand is supposed 
to drive p1's input processing. which is a bit of a problem.


Yep, that's the one.

parsers are usually data- and grammar-driven, not demand-driven, ie 
the input consumed by p1 does not usually depend on the demands on 
p1's output.


Yeah, I realise that. ;-)

I did wonder if Parsec's #include mechanism could do it - but apparently 
not. So I wrote my own...


looking a little bit more closely, however, p1 is used more as a 
piecewise input transformation for p2 than as a separate parser. so it 
makes more sense to integrate p1 into p2.


Possibly - except that I want to be able to stack any decoder on top of 
any decoder. For *output* this is a trivial matter of function 
composition. For input, this presents the tricky parsing problem we're 
now discussing...


that seems to be what you have in mind with your stacked approach, 
where the state is read exclusively through the fetch method in the 
Source interface, and a Source can either be a plain list or buffered 
item parser stacked on top of a Source (where fetch is served from the 
buffer, which is replenished by running the item parser over the inner 
Source; btw, are unused buffer contents discarded when p2 finishes? 
they can't be transformed back into p1/p3's input format..).


That's right.

(And yes, the idea is that the buffer should *always* be empty when the 
top parser exits. Assuming the data stream was built correctly in the 
first place, this should always hold...)


instead of using a type class, one could also parameterise p2 by its 
item parser, 'fetch'.


Mmm... that's an interesting idea. I'll have to have a think about that...
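(For the curious: a minimal sketch of what parameterising p2 by its item 
parser might look like. The Parser type and the names below are 
hypothetical, not the code under discussion.)

  -- A tiny parser over an abstract source type s.
  newtype Parser s a = Parser { runParser :: s -> Maybe (a, s) }

  -- 'fetch' for a plain list source: take one item off the front.
  fetchList :: Parser [a] a
  fetchList = Parser go
    where
      go []     = Nothing
      go (x:xs) = Just (x, xs)

  -- p2 receives its item parser explicitly instead of via a class;
  -- here it simply reads n items.
  count :: Int -> Parser s a -> Parser s [a]
  count 0 _     = Parser $ \s -> Just ([], s)
  count n fetch = Parser $ \s -> do
    (x, s')   <- runParser fetch s
    (xs, s'') <- runParser (count (n - 1) fetch) s'
    Just (x : xs, s'')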



Re: [Haskell-cafe] A very nontrivial parser

2007-07-08 Thread Andrew Coppin

Jonathan Cast wrote:

I wouldn't call rank-2 types extremely rare . . .
  


Well now, my parser is annoyingly clunky to use, but it *works*. 
However, I just found something where it seems to be *impossible* to 
write the necessary code without rank-2 types...



I tried to write this type:

 data Encoder2 = Encoder2 {stage1 :: [Word8] -> x, stage2 :: x -> 
[Word8] -> [Word8]}


However, that doesn't work. All type variables on the right must appear 
on the left:


 data Encoder2 x = Encoder2 {stage1 :: [Word8] -> x, stage2 :: x -> 
[Word8] -> [Word8]}


Now I have a problem. I want to put several of these puppies into a big 
list - and I do *not* want to force the type variable 'x' to be the same 
in all cases! (Although one can easily imagine situations where you 
might want this.) So as of now, my code uses rank-2 types - despite the 
fact that I don't actually know what a rank-2 type *is* yet! o_O This is 
rather troubling...
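
(A hedged sketch of one way out, using existential quantification; the 
runEncoder function is an assumption about how the two stages combine, 
not Andrew's actual code.)

  {-# LANGUAGE ExistentialQuantification #-}

  import Data.Word (Word8)

  -- The intermediate type 'x' is hidden inside the constructor, so
  -- encoders with different intermediate types can share one list.
  data Encoder = forall x. Encoder ([Word8] -> x) (x -> [Word8] -> [Word8])

  -- Assumed semantics: stage1 analyses the input, stage2 uses that
  -- analysis to transform it.
  runEncoder :: Encoder -> [Word8] -> [Word8]
  runEncoder (Encoder s1 s2) ws = s2 (s1 ws) ws

  -- Two toy encoders with different hidden types in one list:
  encoders :: [Encoder]
  encoders =
    [ Encoder (fromIntegral . length) (:)                -- x = Word8
    , Encoder (splitAt 1) (\(h, t) ws -> h ++ ws ++ t)   -- x = ([Word8], [Word8])
    ]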




Re: [Haskell-cafe] A very edgy language (was: A very nontrivial parser)

2007-07-08 Thread Donald Bruce Stewart
bulat.ziganshin:
 Hello Thomas,
 
 Sunday, July 8, 2007, 2:36:43 AM, you wrote:
  This is certainly true. I've coded up, in less than six months,
  something that uses better algorithms and finer-grained concurrency
  than the software I used to work on, and the latter represented 5 or
  more man-years of coding. However, this is server software, which is
  long-running, so performance and memory usage are pretty important, and
  these are relatively hard to get right in Haskell.
 
 i've improved the memory usage of my program 3x one month after i
 started to use Haskell, and by another 4x 1.5 years later (the last
 improvement included developing a ByteString-like library and
 strictifying some computations). i think that for programming-in-the-large
 an experienced haskeller may reach C-like levels of efficiency, unlike for
 programming-in-the-small (i.e. implementing raw computations)

Yes, this agrees with my experience too. Programs of a certain size
become unfeasible to improve in C or C++ -- while the Haskell program
may be continually refactored and improved.

-- Don


Re: [Haskell-cafe] A very edgy language

2007-07-08 Thread Andrew Coppin

Bulat Ziganshin wrote:

i've improved the memory usage of my program 3x one month after i
started to use Haskell, and by another 4x 1.5 years later (the last
improvement included developing a ByteString-like library and
strictifying some computations). i think that for programming-in-the-large
an experienced haskeller may reach C-like levels of efficiency, unlike for
programming-in-the-small (i.e. implementing raw computations)
  


Yeah, I spent yesterday building a whole toolbox of compression 
algorithms. However, it turns out that just one algorithm - BWT - is 
absurdly slow. You may recall I implemented that a while back, and 
discovered that making it use a lazy ByteString made it many orders of 
magnitude faster.


Trouble is... the whole of the rest of my toolbox uses normal lists. So 
I'm going to have to do some horribly ugly hack just to make BWT work 
properly. (Like, I've built this whole nice abstract framework for all 
the other algorithms, and I'm going to have to punch a massive hole 
through the middle of it to make a ByteString BWT fit.) It's a real 
shame. (OTOH, I waited over 5 minutes for my program to try to take the 
BWT of a 12 KB file. It used in excess of 2 GB of RAM. That's clearly 
absurd...)


Does anybody have any clue why ByteStrings are actually faster? (And why 
this information isn't easily findable anywhere - it must surely be a VFAQ.)




Re: [Haskell-cafe] A very edgy language

2007-07-08 Thread Donald Bruce Stewart
andrewcoppin:
 
 Does anybody have any clue why ByteStrings are actually faster? (And why 
 this information isn't easily findable anywhere - it must surely be a VFAQ.)
 

It's well documented in the API documentation for bytestrings.

Start here:
http://www.cse.unsw.edu.au/~dons/fps/Data-ByteString-Lazy.html

And then read:
http://www.cse.unsw.edu.au/~dons/papers/CSL06.html

Cheers,
  Don


Re: [Haskell-cafe] Too many packages on hackage? :-)

2007-07-08 Thread Neil Mitchell

Hi


Looks like there's too many packages on hackage.haskell.org now for a
single page listing:

http://hackage.haskell.org/packages/archive/pkg-list.html

Perhaps we can have a page with just the categories, with subpages
hanging off?


Please don't. With one large page I can search the entire page
quickly, and it's only 6 printed pages long - I think most people here
will be used to a 12-page limit :-)

In the future, once it's all nicely searchable, tagged, etc., we might
consider presenting the information in a different way, but one single
page is still fine.

Thanks

Neil


[Haskell-cafe] Toy compression algorithms [was: A very edgy language]

2007-07-08 Thread Andrew Coppin

Andrew Coppin wrote:

Dave Bayer wrote:
I was beginning to accept that I might die before clearing my 
pipeline of research projects I want to code up. Haskell has given me 
new hope.


Indeed. ;-)

Today I have implemented encoders and decoders for RLE, MTF, Fibonacci 
codes, and LZW. Next on my list is BWT, Huffman codes and arithmetic 
coding. (That last is *very* hard though...)


In case anybody cares...

  darcs get http://www.orphi.me.uk/darcs/ToyCompression
  ghc -O2 --make Encode
  ghc -O2 --make Decode

You can now do

 Encode LZW foo

This will look for a file named foo and generate a file called foo-LZW 
containing the LZW-compressed version of it. It also writes foo-LZW.log.txt, 
which contains some statistics.


Similarly,

 Decode LZW foo-LZW

will make a file foo-LZW-unLZW, which should *hopefully* be identical 
to the original one. (!)


It didn't occur to me, but I should probably go add some way to discover 
a list of available algorithms! LOL. Anyway, currently working:


  RLE (Run Length Encoding). Counts runs of symbols. Considers 8 bits 
to be one symbol. The maximum run length is 255.
  MTF (Move To Front). Takes an input file and produces an output file 
of exactly the same size, but containing mostly *low* numbers. Works 
well with RLE or Fib.
  BWT (Burrows-Wheeler Transform). Like MTF, does no actual 
compression, but makes the file more compressible.
  Fib (Fibonacci codes). Stores numbers using Fibonacci codes instead 
of unsigned binary. This makes low numbers smaller and high numbers 
bigger. (Use it with MTF...)
  LZW (Lempel-Ziv-Welch). The daddy. Looks for repeated input strings 
and compresses them.


Caution: Encoding or decoding BWT currently requires absurd amounts of 
time and space. (Like, over 2 GB of RAM and 8 minutes of wall time to 
process 12 KB of text.) I hope to fix that soon...


Currently it's unclear whether LZW or BWT+MTF+Fib gives the best 
compression. (Mainly because BWT is so hard to run!) I hope to implement 
a few other algorithms soon.


(Note: LZW is typically used with a variable number of bits per output 
symbol. I haven't done this yet. It simply uses 16 bits in all cases. 
Once I fix this, compression will go up. Also, once the symbol 
dictionary is full, the encoder just resets itself.)
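
(For a taste of how small these stages are in Haskell, here is a minimal 
sketch of an MTF encoder over plain lists - illustrative only, not the 
code in the repository above.)

  import Data.List (delete, elemIndex)
  import Data.Word (Word8)

  -- Move-To-Front: emit each byte's current index in a self-organising
  -- table, then move that byte to the front of the table.
  mtf :: [Word8] -> [Word8]
  mtf = go [0 .. 255]
    where
      go _ [] = []
      go table (x:xs) = case elemIndex x table of
        Just i  -> fromIntegral i : go (x : delete x table) xs
        Nothing -> error "impossible: the table holds every byte"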




Re: [Haskell-cafe] Toy compression algorithms [was: A very edgy language]

2007-07-08 Thread Donald Bruce Stewart
andrewcoppin:
 Andrew Coppin wrote:
 Dave Bayer wrote:
 I was beginning to accept that I might die before clearing my 
 pipeline of research projects I want to code up. Haskell has given me 
 new hope.
 
 Indeed. ;-)
 
 Today I have implemented encoders and decoders for RLE, MTF, Fibonacci 
 codes, and LZW. Next on my list is BWT, Huffman codes and arithmetic 
 coding. (That last is *very* hard though...)
 
 In case anybody cares...
 
   darcs get http://www.orphi.me.uk/darcs/ToyCompression
   ghc -O2 --make Encode
   ghc -O2 --make Decode
 
 You can now do
 
  Encode LZW foo
 
  This will look for a file named foo and generate a file called foo-LZW 
  containing the LZW-compressed version of it. It also writes foo-LZW.log.txt, 
  which contains some statistics.
 
 Similarly,
 
  Decode LZW foo-LZW
 
 will make a file foo-LZW-unLZW, which should *hopefully* be identical 
 to the original one. (!)
 
 It didn't occur to me, but I should probably go add some way to discover 
 a list of available algorithms! LOL. Anyway, currently working:
 
   RLE (Run Length Encoding). Counts runs of symbols. Considers 8 bits 
 to be one symbol. The maximum run length is 255.
   MTF (Move To Front). Takes an input file and produces an output file 
 of exactly the same size, but containing mostly *low* numbers. Works 
 well with RLE or Fib.
   BWT (Burrows-Wheeler Transform). Like MTF, does no actual 
 compression, but makes the file more compressible.
   Fib (Fibonacci codes). Stores numbers using Fibonacci codes instead 
 of unsigned binary. This makes low numbers smaller and high numbers 
 bigger. (Use it with MTF...)
   LZW (Lempel-Ziv-Welch). The daddy. Looks for repeated input strings 
 and compresses them.
 
 Caution: Encoding or decoding BWT currently requires absurd amounts of 
 time and space. (Like, over 2 GB of RAM and 8 minutes of wall time to 
 process 12 KB of text.) I hope to fix that soon...
 
 Currently it's unclear whether LZW or BWT+MTF+Fib gives the best 
 compression. (Mainly because BWT is so hard to run!) I hope to implement 
 a few other algorithms soon.
 
 (Note: LZW is typically used with a variable number of bits per output 
 symbol. I haven't done this yet. It simply uses 16 bits in all cases. 
 Once I fix this, compression will go up. Also, once the symbol 
 dictionary is full, the encoder just resets itself.)
 

Good work. Probably worth benchmarking against the other compression
libraries on hackage, just to get a sense for how well your
implementations are optimised (and maybe how to best interact with lazy
bytestrings?). See for example:

zlib   http://hackage.haskell.org/cgi-bin/hackage-scripts/package/zlib-0.3
bzlib  http://hackage.haskell.org/cgi-bin/hackage-scripts/package/bzlib-0.3

and also:
lzf
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/Codec-Compression-LZF-0.2

-- Don
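
(For reference, running a file through the zlib package is only a few 
lines; a minimal sketch, reusing the file name foo from the post above:)

  import qualified Codec.Compression.GZip as GZip
  import qualified Data.ByteString.Lazy as L

  -- Compress a file into gzip format, lazily and chunk by chunk.
  main :: IO ()
  main = do
    contents <- L.readFile "foo"
    L.writeFile "foo.gz" (GZip.compress contents)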


Re: [Haskell-cafe] Too many packages on hackage? :-)

2007-07-08 Thread Conrad Parker

On 08/07/07, Neil Mitchell [EMAIL PROTECTED] wrote:

Hi

 Looks like there's too many packages on hackage.haskell.org now for a
 single page listing:

 http://hackage.haskell.org/packages/archive/pkg-list.html

 Perhaps we can have a page with just the categories, with subpages
 hanging off?

Please don't. With one large page I can search the entire page
quickly, and it's only 6 printed pages long - I think most people here
will be used to a 12-page limit :-)

In the future, once it's all nicely searchable, tagged, etc., we might
consider presenting the information in a different way, but one single
page is still fine.


Also, the categorization doesn't quite correspond to, say, the library
hierarchy. For example, tagsoup is listed under Data, whereas it
exports modules under Text and could just as well be categorized under
Web. Similarly, xmonad is listed under System, but you might
reasonably expect to find it under User Interfaces (or Interfaces ...
those two seem to overlap ;-).

I agree with Don that the current page is getting a bit long, but I
also agree with Neil that it would be better to search/browse by tags
(or by exported modules) than to simply break it up using the current
categories. Perhaps I'm just overly agreeable.

So ... how would one go about hacking on hackage, setting up a local
installation for testing, submitting patches etc?

cheers,

Conrad.


Re: [Haskell-cafe] Too many packages on hackage? :-)

2007-07-08 Thread Marc Weber
 Please don't. With one large page I can search the entire page
 quickly, 
That's what I'm doing all the time as well.
There are more options: 
Using columns ;)
Showing one category only / all packages optionally.
So I vote for keeping the existing page.
But I don't mind also having something different additionally.

Marc


Re[2]: [Haskell-cafe] Toy compression algorithms [was: A very edgy language]

2007-07-08 Thread Bulat Ziganshin
Hello Donald,

Sunday, July 8, 2007, 12:50:36 PM, you wrote:



too much quoting :(

 Good work. Probably worth benchmarking against the other compression
 libraries

do you really want to totally discredit Haskell? :)  they should be
hundreds of times slower than any practical compression algorithm
(and btw, zlib/bzlib aren't good performers anyway; my own algorithm is
several times faster with the same compression ratio)

Haskell isn't a good tool for developing compression algorithms because
it's a very well studied area where it makes sense to use all
sorts of optimizations. so it falls into the low-level algorithms
category, where using Haskell means at least 3x slower development and
3x worse performance - or faster development with 100x worse
performance. Andrew's code should fall into the latter category


-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]



[Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Andrew Coppin

Donald Bruce Stewart wrote:

andrewcoppin:
  
Does anybody have any clue why ByteStrings are actually faster? (And why 
this information isn't easily findable anywhere - it must surely be a VFAQ.)





It's well documented in the API documentation for bytestrings.

Start here:
http://www.cse.unsw.edu.au/~dons/fps/Data-ByteString-Lazy.html
  


I've read the API and still left wondering...


And then read:
http://www.cse.unsw.edu.au/~dons/papers/CSL06.html
  


Now *that* is far more useful... (And interesting.)

So what you're saying is that whereas fusion is usually used to mean 
that amazing technology that will one day supply us with unlimited 
amounts of cheap clean energy [shame it doesn't actually work], in the 
context of Haskell it seems fusion means that technology that makes 
all your code 98% faster for free? ;-)


I guess the question that's really burning in my mind is if ByteString 
is so much faster than [x], why can't you just do the same optimisations 
to [x]? In other words, why should I need to alter my code to get all 
this fusion goodness?


Now, as I understand it, a ByteString is a kind of unboxed array (= big 
RAM savings + big CPU time savings for not building it + big GC savings 
for not processing millions of list nodes + better cache performance). 
Or at least, a *strict* ByteString is; I'm very very fuzzy on exactly 
how a *lazy* ByteString is any different to a normal list. From my 
reading today, I take it the only real difference is that one is a 
linked list, whereas the other is a (boxed?) array, so it's smaller. (?)


At any rate, currently all my toy compression algorithms run at 
respectable speeds using [Word8], *except* for the BWT, which is 
absurdly slow. I've done everything I can think of to it, and it's still 
too slow. It's no use, I'm going to have to use ByteStrings to get any 
kind of performance out of it. I'm just wondering whether using 
getContents :: [Char] and then packing that into a ByteString is going 
to completely negate all the speed advantages. (I'm really not keen to 
completely mangle the entire toolbox just to make this one algorithm 
hurry up.)



Also, while I'm here... I can see a sorting algorithm implemented in 
Data.List, but I don't see any for other structures. For example, there 
doesn't appear to be any sorting functions for any kind of array. There 
doesn't appear to be anything for ByteString either. I'd like to see 
such a thing in the libraries. Yes, you can sort things using (say) 
Data.Map. But what if you have some data that's already *in* an array or 
a ByteString and you just want to sort it? (I also notice that the 
mutable arrays don't seem to provide an in-place map function. Obviously 
that's only ever going to work for a function that doesn't change the 
value's type, but even so...)


Finally, I don't see anything in the libraries that would efficiently 
sort (large) strings. Data.List.sort and Data.Map.Map both use an Ord 
context, and we have instance (Ord x) => Ord [x], but one would think 
that a [potentially large] list of values could be sorted more 
efficiently using a radix sort than a quicksort...? An implementation of 
Data.Map especially for the (common?) case of the keys being a *list* of 
Ord items would seem useful... (And in my current program, I'm probably 
going to implement a radix sort on lists of ByteStrings.)
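
(As an aside, the 256-symbol alphabet makes sorting a single strict 
ByteString trivial; a hedged sketch of a counting sort, not a library 
function:)

  import qualified Data.ByteString as B
  import Data.Word (Word8)

  -- Counting sort for bytes: count each symbol, then re-emit in order.
  sortBytes :: B.ByteString -> B.ByteString
  sortBytes bs = B.pack (concatMap rep [0 .. 255])
    where
      rep :: Word8 -> [Word8]
      rep w = replicate (B.count w bs) w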




Re: [Haskell-cafe] Toy compression algorithms [was: A very edgy language]

2007-07-08 Thread Andrew Coppin

Bulat Ziganshin wrote:

Hello Donald,
  

Good work. Probably worth benchmarking against the other compression
libraries



do you really want to totally discredit Haskell? :)  they should be
hundreds of times slower than any practical compression algorithm
(and btw, zlib/bzlib aren't good performers anyway; my own algorithm is
several times faster with the same compression ratio)

Haskell isn't a good tool for developing compression algorithms because
it's a very well studied area where it makes sense to use all
sorts of optimizations. so it falls into the low-level algorithms
category, where using Haskell means at least 3x slower development and
3x worse performance - or faster development with 100x worse
performance. Andrew's code should fall into the latter category
  


Indeed. I'm more interested in which algorithms produce the best 
compression ratios than how to implement them fast. Currently the code 
uses very general, flexible, polymorphic types all over the place, 
everything is normal lists, I'm using monadic parsing rather than 
bit-twiddling, etc etc etc. It goes alright with 100 KB files on my 
monster PC at home, but toss a 4 MB file over there and it takes a 
minute or two to compress. Hardly a match for a practical compression 
solution that's been optimised to within inches of its life in C or even 
assembly. ;-)


I mentioned that my LZW algorithm isn't even as efficient as it could be 
- it uses 16 bits per symbol rather than being variable. Partly that's 
because it's easier to code. But partly that's so that I can write a 
16-bit Huffman compressor and run it over the top. (LZW + Huffman being 
a very common combination.) And that's really what this is - a toolbox 
for throwing algorithms together to see what they do.


OTOH, if the Gods behind GHC want to take a look and figure out whether 
there's any magic that can be added to the optimiser, I wouldn't 
complain... ;-)


(Realistically though. My program takes a [Word8] and turns it into a 
[Bool] before running a parser over it. The GHC optimiser doesn't really 
stand a hope in hell of optimising that into a program that reads a 
machine word into a CPU register and starts playing with bit flips on it...)


PS. Are those zlib libraries actually written in Haskell? Or are they a 
thin layer on top of a C library?


PPS. Does GHC make use of MMX, SSE, et al?



Re: [Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Neil Mitchell

Hi


I guess the question that's really burning in my mind is if ByteString
is so much faster than [x], why can't you just do the same optimisations
to [x]? In other words, why should I need to alter my code to get all
this fusion goodness?


You already get some benefit of fusion with lists:

* http://research.microsoft.com/~simonpj/Papers/rules.htm

People are working on more:

* http://www-users.cs.york.ac.uk/~ndm/supero/
* http://www.cse.unsw.edu.au/~dons/papers/CLS07.html
* many, many others

Thanks

Neil


[Haskell-cafe] Re: A very edgy language (was: A very nontrivial parser)

2007-07-08 Thread Logan Capaldo
Donald Bruce Stewart dons at cse.unsw.edu.au writes:

 Given #haskell is a far larger community than:
 As well as
 
 #java
 #javascript
 #ruby
Try #ruby-lang instead ;) At least assuming you were talking about the
programming language ruby, and the channel on freenode.
 #lua
 #d
 #perl6
 
 Maybe we need to reconsider where the (FP) mainstream is now? 
 
Maybe so.



Re: [Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Tillmann Rendel

Andrew Coppin wrote:

Now, as I understand it, a ByteString is a kind of unboxed array (= big 
RAM savings + big CPU time savings for not building it + big GC savings 
for not processing millions of list nodes + better cache performance). 
Or at least, a *strict* ByteString is; I'm very very fuzzy on exactly 
how a *lazy* ByteString is any different to a normal list. From my 
reading today, I take it the only real difference is that one is a 
linked list, whereas the other is a (boxed?) array, so it's smaller. (?)


As I understand it (wich may or may not be correct):

A normal Haskell string is basically  [Word8]
A strict ByteString is basically      UArray Int Word8
A lazy ByteString is basically        [UArray Int Word8]

[Word8] is processed one byte at once
UArray Int Word8 is processed all at once
[UArray Int Word8] is processed a chunk of bytes at once

So loading and processing [Word8] corresponds to

  while (!eof(file)) {
Byte current = readByteFromFile(file);
processByte(current);
  }

loading and processing a strict ByteString corresponds to

  int size = readWholeFileIntoBuffer(file, buffer);
  for (int i = 0; i < size; i++)
    processByte(buffer[i]);


and loading and processing a lazy ByteString corresponds to

  while (!eof(file)) {
    int size = readNextNBytesIntoBuffer(file, buffer, buffersize);
    for (int i = 0; i < size; i++)
      processByte(buffer[i]);
  }

in a C-like language. The first is nonsense because of I/O overhead, the 
second is fine for small files only, and the third combines the best of 
both worlds. Unlike the C examples, Haskell lists face not only I/O 
overhead, but the general overhead of boxed representation and laziness, 
too. But even in C, the third solution is the best, yet some programs 
don't use it. So ByteString-powered Haskell programs are able to 
outperform some C programs.


The most important contributions of the ByteStrings library seem to be 
these:


  (1) unify lazy and strict ByteStrings interface-wise and
  (2) allow idiomatic Haskell programs using lazy ByteStrings
      to be compiled to something like the third C program above.
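
(In Haskell terms, the third loop might look like this minimal sketch: 
the file is read one chunk at a time, and the strict fold consumes each 
chunk before the next is demanded.)

  import qualified Data.ByteString.Lazy as L

  -- Sum all bytes of a file in constant space.
  main :: IO ()
  main = do
    contents <- L.readFile "input.dat"
    print (L.foldl' (\acc w -> acc + fromIntegral w) (0 :: Int) contents)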

  Tillmann


Re: [Haskell-cafe] A very nontrivial parser

2007-07-08 Thread Brandon S. Allbery KF8NH


On Jul 8, 2007, at 3:21 , Andrew Coppin wrote:

this.) So as of now, my code uses rank-2 types - despite the fact  
that I don't actually know what a rank-2 type *is* yet! o_O This is  
rather troubling...


Bah --- I use monads all the time and still don't have much of a clue  
about category theory.  :)
(For that matter, I can drive a car without understanding what's  
going on under the hood.)


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon university  KF8NH




Re: [Haskell-cafe] A very nontrivial parser

2007-07-08 Thread Andrew Coppin

Brandon S. Allbery KF8NH wrote:


On Jul 8, 2007, at 3:21 , Andrew Coppin wrote:

this.) So as of now, my code uses rank-2 types - despite the fact 
that I don't actually know what a rank-2 type *is* yet! o_O This is 
rather troubling...


Bah --- I use monads all the time and still don't have much of a clue 
about category theory.  :)
(For that matter, I can drive a car without understanding what's going 
on under the hood.)




Aye, you drive a car without knowing how it works - but it was put 
together by some people who *do* know these things. Would you drive a 
car you built yourself? ;-)




Re: [Haskell-cafe] A very edgy language (was: A very nontrivial parser)

2007-07-08 Thread Brandon S. Allbery KF8NH


On Jul 7, 2007, at 7:23 , Thomas Conway wrote:


I've been working in a mostly Python shop this last year, and it
reinforces my belief that people who don't like strong static typing
are yahoos, not professionals interested in producing high quality
code. Maybe I just don't get the line between professionalism and
paranoia. ;-)


Security is one of my side specialties (i.e. not my main focus as a  
sysadmin, but a greater focus than e.g. storage or networking which  
aren't really my focus at all).  And it seems to me that most people  
treat it about the same way they treat the notion of strong typing.   
In fact, I could make pretty much the same point about  
professionalism vs. paranoia with respect to security; the difference  
being that, thanks to Internet-facing credit card processing systems,  
at least *some* people have had their faces rubbed in their mistakes.


(This correspondence is in fact one of the reasons I became 
interested in Haskell. 
http://blog.moertel.com/articles/2006/10/18/a-type-based-solution-to-the-strings-problem 
is exactly the kind of thing I was hoping to find.)


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon university  KF8NH




Re: [Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Andrew Coppin

Neil Mitchell wrote:

Hi


I guess the question that's really burning in my mind is if ByteString
is so much faster than [x], why can't you just do the same optimisations
to [x]? In other words, why should I need to alter my code to get all
this fusion goodness?


You already get some benefit of fusion with lists:

* http://research.microsoft.com/~simonpj/Papers/rules.htm

People are working on more:

* http://www-users.cs.york.ac.uk/~ndm/supero/
* http://www.cse.unsw.edu.au/~dons/papers/CLS07.html
* many, many others


I always have trouble tracking exactly which version(s) of GHC actually 
implement all this stuff... ;-)


Maybe I'll go find a Linux box sometime and try the head version, just 
for kicks...




Re: [Haskell-cafe] A very nontrivial parser

2007-07-08 Thread Brandon S. Allbery KF8NH


On Jul 8, 2007, at 8:12 , Andrew Coppin wrote:


Brandon S. Allbery KF8NH wrote:


On Jul 8, 2007, at 3:21 , Andrew Coppin wrote:

this.) So as of now, my code uses rank-2 types - despite the fact  
that I don't actually know what a rank-2 type *is* yet! o_O This  
is rather troubling...


Bah --- I use monads all the time and still don't have much of a  
clue about category theory.  :)
(For that matter, I can drive a car without understanding what's  
going on under the hood.)




Aye, you drive a car without knowing how it works - but it was put  
together by some people who *do* know these things. Would you drive  
a car you built yourself? ;-)


No :) --- but depending on what you're doing, you can use rank-2  
types without knowing what's under the hood.  In fact, I'd say the  
fact that you're using them is evidence of that.


(Aside --- looking at your problem description, I wonder if GADTs  
would be a better fit.)


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon university  KF8NH




Re: [Haskell-cafe] A very nontrivial parser

2007-07-08 Thread Andrew Coppin

Brandon S. Allbery KF8NH wrote:


On Jul 8, 2007, at 8:12 , Andrew Coppin wrote:

Aye, you drive a car without knowing how it works - but it was put 
together by some people who *do* know these things. Would you drive a 
car you built yourself? ;-)


No :) --- but depending on what you're doing, you can use rank-2 types 
without knowing what's under the hood.  In fact, I'd say the fact that 
you're using them is evidence of that.


(Aside --- looking at your problem description, I wonder if GADTs 
would be a better fit.)


Oh, I don't mind not knowing how rank-2 types are *implemented*. ;-) But 
it would be nice to know what they *are*... :-S


(Thus far, they just seem to be some incomprehensible syntax that makes 
the compiler stop complaining. In particular, I have no idea what the 
difference between rank-2, rank-N and existentially quantified is...)




Re: [Haskell-cafe] Trying to make a Typeable instance

2007-07-08 Thread Adrian Hey

Neil Mitchell wrote:

data ListGT map k a
  = Empt
  | BraF ![k] a !(map (ListGT map k a))
  | BraE ![k]   !(map (ListGT map k a))
   deriving( Typeable )


Not in Haskell, only in GHC.


Thanks for the suggestions from Hugh and Neil. I tried this anyway
and it doesn't work even with ghc I'm afraid..


Can't make a derived instance of `Typeable (ListGT map k a)'
 (`ListGT' has arguments of kind other than `*')
When deriving instances for `ListGT'


So it seems ghc doesn't like kinds (* -> *) either :-(

Actually, AFAICT the problem seems to be with Data.Typeable
itself rather than ghc. There is no proper TypeRep for
(ListGT map k a) because map is not a type.

Or maybe I'm missing something. Is it possible to make correct
instances of Typeable for types like this?

What would Data.Derive make of this?

Thanks for any thoughts or insight
--
Adrian Hey




[Haskell-cafe] Re: Too many packages on hackage? :-)

2007-07-08 Thread apfelmus
Conrad Parker wrote:
 On 08/07/07, Neil Mitchell [EMAIL PROTECTED] wrote:
 Hi

  Looks like there's too many packages on hackage.haskell.org now for a
  single page listing:
 
  http://hackage.haskell.org/packages/archive/pkg-list.html
 
  Perhaps we can have a page with just the categories, with subpages
  hanging off?

 Please don't. With one large page I can search the entire page
 quickly, and it's only 6 printed pages long - I think most people here
 will be used to a 12-page limit :-)

 In the future, once it's all nicely searchable, tagged, etc., we might
 consider presenting the information in a different way, but one single
 page is still fine.

 Also, the categorization doesn't quite correspond to, say, the library
 hierarchy. For example, tagsoup is listed under Data, whereas it
 exports modules under Text and could just as well be categorized under
 Web. Similarly, xmonad is listed under System, but you might
 reasonably expect to find it under User Interfaces (or Interfaces ...
 those two seem to overlap ;-).

For me, a single page makes it difficult to browse the list to see
what's interesting or to see what package would fit my current needs
most. Scrambled categorizations have the same effect. Also, I'd favor a
greater distinction between applications and libraries.

For browsing libraries, I like the wiki pages much more than hackage.
Can't those two be merged into one?

Regards,
apfelmus



Re: [Haskell-cafe] Trying to make a Typeable instance

2007-07-08 Thread Benja Fallenstein

Hi Adrian,

2007/7/8, Adrian Hey [EMAIL PROTECTED]:

So it seems ghc doesn't like kinds (* -> *) either :-(

Actually, AFAICT the problem seems to be with Data.Typeable
itself rather than ghc. There is no proper TypeRep for
(ListGT map k a) because map is not a type.


Have you tried using (Typeable1 map) as the constraint?

- Benja
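
(A sketch of what that might look like against the Data.Typeable API of 
this era; whether such a hand-written instance behaves well for every 
'map' is left open.)

  {-# LANGUAGE ScopedTypeVariables #-}

  import Data.Typeable

  data ListGT map k a
    = Empt
    | BraF ![k] a !(map (ListGT map k a))
    | BraE ![k]   !(map (ListGT map k a))

  -- Constrain the (* -> *) parameter with Typeable1 and assemble the
  -- TypeRep by hand.
  instance (Typeable1 map, Typeable k, Typeable a)
        => Typeable (ListGT map k a) where
    typeOf _ = mkTyConApp (mkTyCon "ListGT")
                 [ typeOf1 (undefined :: map ())
                 , typeOf  (undefined :: k)
                 , typeOf  (undefined :: a)
                 ]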


Re: [Haskell-cafe] A very nontrivial parser

2007-07-08 Thread Tillmann Rendel

Andrew Coppin wrote:
Oh, I don't mind not knowing how rank-2 types are *implemented*. ;-) But 
it would be nice to know what they *are*... :-S


(Thus far, they just seem to be some incomprehensible syntax that makes 
the compiler stop complaining. In particular, I have no idea what the 
difference between rank-2, rank-N and existentially quantified is...)


Most Haskell extensions are more like restriction removals from an 
application programmer's point of view. If you are fully ignorant of the 
matter and never realized there was a restriction, you have no reason to 
fear the removal of the restriction, since it enables you to stay 
ignorant of the matter. All you have to do is to pass some flag to the 
compiler for historical reasons.


(Ok, there is the question of portability to other compilers...)

The idea about higher-ranked types is to allow explicit forall keywords 
in types wherever you like. The restriction about rank-1-types is to 
disallow forall keywords almost everywhere.


So higher-ranked types aren't an edge-case extension, they are a sensible 
lifting of some edge-case restriction.


(Of course, for compiler writers and the like, lifting this restrictions 
means extending the type system and underlying theory. But why should 
you care?)


So instead of learning about higher-ranked types, I would learn about 
the forall keyword, getting higher-ranked types for free, by using it in 
some former illegal positions out of ignorance of the former restriction.



Most programming language extensions seem to arise like this: Wouldn't 
it be nice if I could write this and it would mean that. Maybe if I 
steal something from category theory, it could be possible...


  Tillmann


Re: [Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Malte Milatz
Tillmann Rendel:
 As I understand it (wich may or may not be correct):
 
 A normal Haskell string is basically  [Word8]

Hm, let's see whether I understand it better or worse.  Actually it is
[Char], and Char is a Unicode code point in the range 0..1114111 (at
least in GHC).  Compare:

Prelude Data.Word> fromEnum (maxBound :: Char)
1114111
Prelude Data.Word> fromEnum (maxBound :: Word8)
255

So it seems that the Char type abstracts the encoding away.  I'm
actually a little confused by this, because I haven't found any means to
make the I/O functions of the Prelude (getContents etc.) encoding-aware:
The string ä, when read from a UTF-8-encoded file via readFile, has a
length of 2.  Anyone with a URI to enlighten me?

Malte



Re: [Haskell-cafe] Re: Too many packages on hackage? :-)

2007-07-08 Thread Bulat Ziganshin
Hello apfelmus,

Sunday, July 8, 2007, 5:20:18 PM, you wrote:

  Looks like there's too many packages on hackage.haskell.org now for a

it's the nicest problem i can imagine :)

 For browsing libraries, I like the wiki pages much more than hackage.
 Can't those two be merged into one?

maybe we just need an alternative single-page listing with FULL
package descriptions instead of brief ones. this should be rather close
in amount of information to the old wiki. a Categories section on top of
the page should be enough for anyone who needs pages split by category

many years ago i already proposed to extend hackage to include
information about general entities, not only packages uploaded to it.
unfortunately, its author was not interested in this. i still think
that joining up HCAR, the libraries & tools wiki pages and hackage into
one information source will give us significant benefits, making
searching for a needed library easier, especially for casual haskellers.
as you can see, in many cases haskellers think that some libraries
don't exist just because it's hard to find them


-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]



Re[2]: [Haskell-cafe] A very nontrivial parser

2007-07-08 Thread Bulat Ziganshin
Hello Andrew,

Sunday, July 8, 2007, 4:31:32 PM, you wrote:
 Oh, I don't mind not knowing how rank-2 types are *implemented*. ;-) But
 it would be nice to know what they *are*... :-S

concrete types are rank-0:

sin :: Double -> Double

polymorphic types are rank-1:

length :: forall a . [a] -> Int

functions whose arguments are rank-1 types are rank-2:

f :: (forall a . [a] -> Int) -> Int

and so on. rank-2 and rank-N are considered separately because it's easier
to implement only rank-2 polymorphism, and some compilers stop there
(and rank-2 polymorphism is used in the ST monad, which is a pretty standard
feature, while higher-rank functions are rarely required)
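
(a concrete toy may help: the rank-2 signature lets f use its polymorphic 
argument at several element types)

  {-# LANGUAGE Rank2Types #-}

  -- g must itself be polymorphic, because it is applied both to a
  -- list of Ints and to a String.
  f :: (forall a . [a] -> Int) -> Int
  f g = g [1, 2, 3 :: Int] + g "abc"

  main :: IO ()
  main = print (f length)   -- prints 6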


-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]



Re[2]: [Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Bulat Ziganshin
Hello Malte,

Sunday, July 8, 2007, 6:38:19 PM, you wrote:
 The string ä, when read from a UTF-8-encoded file via readFile, has a
 length of 2.  Anyone with a URI to enlighten me?

if you need UTF-8 i/o, look at

http://hackage.haskell.org/packages/archive/pkg-list.html



-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]



Re: [Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Stefan O'Rear
On Sun, Jul 08, 2007 at 04:38:19PM +0200, Malte Milatz wrote:
 Tillmann Rendel:
  As I understand it (wich may or may not be correct):
  
  A normal Haskell string is basically  [Word8]
 
 Hm, let's see whether I understand it better or worse.  Actually it is
 [Char], and Char is a Unicode code point in the range 0..1114111 (at
 least in GHC).  Compare:
 
   Prelude Data.Word> fromEnum (maxBound :: Char)
   1114111
   Prelude Data.Word> fromEnum (maxBound :: Word8)
   255
 
 So it seems that the Char type abstracts the encoding away.  I'm
 actually a little confused by this, because I haven't found any means to
 make the I/O functions of the Prelude (getContents etc.) encoding-aware:
  The string ä, when read from a UTF-8-encoded file via readFile, has a
  length of 2.  Anyone with a URI to enlighten me?

Not sure of any URIs.

Char is just a code point.  It's a 32 bit integer (64 on 64-bit
platforms due to infelicities in the GHC backend) with a code point.  It
is not bytes.  A Char in the heap also has a tag-pointer, bringing the
total to 8 (16) bytes.  (However, GHC uses shared Char objects for
Latin-1 characters, so a fresh Char in that range uses 0 bytes).

[a] is polymorphic.  It is a linked list, it consumes 12 (24) bytes per
element.  It just stores pointers to its elements, and has no hope of
packing anything.

[Char] is a linked list of pointers to heap-allocated fullword integers,
20 (40) bytes per character (assuming non-latin1).

The GHC IO functions truncate down to 8 bits.  There is no way in GHC to
read or write full UTF-8, short of doing the encoding yourself (google
for UTF8.lhs).

Stefan


Re: [Haskell-cafe] Too many packages on hackage? :-)

2007-07-08 Thread brad clawsie
On Sun, Jul 08, 2007 at 01:32:08PM +1000, Donald Bruce Stewart wrote:
 Looks like there's too many packages on hackage.haskell.org now for a
 single page listing:
 
 Perhaps we can have a page with just the categories, with subpages
 hanging off?

perhaps support both views? 

a comprehensive listing works nicely for searching with /, cpan
for example supports this with a much larger repository


Re: [Haskell-cafe] Toy compression algorithms [was: A very edgy language]

2007-07-08 Thread Stefan O'Rear
On Sun, Jul 08, 2007 at 12:10:04PM +0100, Andrew Coppin wrote:
 (Realistically though. My program takes a [Word8] and turns it into a 
 [Bool] before running a parser over it. The GHC optimiser doesn't really 
 stand a hope in hell of optimising that into a program that reads a machine 
 word into a CPU register and starts playing with bit flips on it...)

Actually, if you're very lucky (fusion is just as hard in Haskell as it
is in real life), it *does*.  It seems to fit nicely into the
stream-fusion framework.

 PS. Are those zlib libraries actually written in Haskell? Or are they a 
 thin layer on top of a C library?

Yup, they wrap C's zlib.

 PPS. Does GHC make use of MMX, SSE, et al?

No (in spirit - the native code generator uses 1-element SSE operations
for floating point because it's easier to optimize than FPU code).

Stefan


Re: [Haskell-cafe] Toy compression algorithms [was: A very edgy language]

2007-07-08 Thread Andrew Coppin

Bulat Ziganshin wrote:

Hello Andrew,

Sunday, July 8, 2007, 3:10:04 PM, you wrote:
  

Indeed. I'm more interested in which algorithms produce the best
compression ratios than how to implement them fast.



the algorithms you have tried so far (except for bwt+mtf) are the standard
student's set :)  of those, lzw is the most advanced, but even this
algorithm is 20 years old
  


Yeah, well, naturally I'm implementing the simple ones first. ;-)

Actually, LZW works surprisingly well for such a trivial little 
algorithm... When you compare it to the complexity of an arithmetic 
coder driven by a high-order adaptive PPM model (theoretically the best 
general-purpose algorithm that exists), it's amazing that anything so 
simple as LZW should work at all!



Currently the code
uses very general, flexible, polymorphic types all over the place, 
everything is normal lists, I'm using monadic parsing rather than 
bit-twiddling, etc etc etc. It goes alright with 100 KB files on my



testing today's algorithms on such small datasets doesn't make much
sense because they need much more input to gather enough statistics
about the data being compressed. i prefer hundreds-of-megabytes tests and
don't know anyone with test files smaller than 1 mb
  


OTOH, I'd like it to finish this month...

(LZW is faster now I'm using Data.Map. But even so, 15 minutes to 
compress 640 KB of zeros? That's pretty slow!)



btw, in most cases a C algorithm with today's C compilers will work
faster than an asm one - because you need very good knowledge of
CPU internals to write an efficient program. i dream about the days when
the same will become true for Haskell
  


We all do... *sigh*

(Or rather, most of us dream. A tiny few people seem to be trying to 
actually do stuff about it...)



i never heard about lzw+huf compressors :)  there are two lz families
- lz78 (lzw) which was popular in the 80's and used variable-size words,
and lz77 (lzss) which together with huffman was the most popular in the
90's. compress (.Z) and pkzip 1 were lzw, while zip is lzh (i.e.
lz77+huffman)

there is not much point in using huffman with lzw because many words
will have only one or two occurrences
  


Hmm... that's interesting. I was *sure* it said that LHA used 
LZW+Huffman... oh well!


You say that there would be no meaning to using a Huffman code to 
encode the output from an LZW stage. But consider this: The standard 
approach is for the LZW encoder to spit out 9 bits/symbol initially, and 
then 10 bits/symbol once the table fills up a bit more, and then 11 
bits/symbol after that, and so on. (Usually until some hard-coded limit 
is reached. Typically the encoder just resets itself at that point or 
something.) So we're allocating more and more bits per symbol, but this 
is based only on how full the table is, *regardless* of whether any of 
these new symbols are even in fact *used*. ;-) Now, by using a Huffman 
code (scanning the actual produced output rather than working 
adaptively) we can avoid assigning bit combinations to unused symbols.


(The downside of course is that now we need a Huffman table in the 
output - and any rare symbols might end up with rather long codewords. 
But remember: Huffman *guarantees* 0% compression or higher on the 
payload itself. A Huffman-compressed payload can *never* be bigger, only 
smaller or same size. So long as you can encode the Huffman table 
efficiently, you should be fine...)



btw, i invite you to the http://encode.ru/forums where you will find
many compressor authors. you will be second one there who use haskell
and first one who write compression algorithms in it :)
  


.ru = Russia?


OTOH, if the Gods behind GHC want to take a look and figure out whether
there's any magic that can be added to the optimiser, I wouldn't 
complain... ;-)

(Realistically though. My program takes a [Word8] and turns it into a 
[Bool] before running a parser over it. The GHC optimiser doesn't really
stand a hope in hell of optimising that into a program that reads a 
machine word into a CPU register and starts playing with bit flips on it...)



as time goes, those terrible compilers become smarter and smarter :)
  


Oh hey, I think GHC is already pretty smart. But no optimiser can ever 
hope to cover *every* possible case. And transforming [Bool] -> [Bool] 
into UArray Word8 -> UArray Word8 just seems a little bit 
optimistic, methinks. ;-)





PPS. Does GHC make use of MMX, SSE, et al?



it can do it with via-C compilation, and there are plans for a native
code generator too.


The way I heard it is that it *does* native code generation, but it's 
not as good as the via-C version. (Can't imagine why... You would have 
thought directly implementing functional primitives would be much easier 
than trying to mangle them to fit C's multitude of limitations. Still, 
what do I know?)



but it seems that you don't know asm and believe in
advertising hype. i'm pretty sure that mmx can't help in the algorithms
you're implementing and 

Re: [Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Malte Milatz
Stefan O'Rear:
 Char is just a code point.  It's a 32 bit integer (64 on 64-bit
 platforms due to infelicities in the GHC backend) with a code point. 
 [...]
 The GHC IO functions truncate down to 8 bits.  There is no way in GHC to
 read or write full UTF-8, short of doing the encoding yourself (google
 for UTF8.lhs).

Thanks, this makes things clear to me.

 [Char] is a linked list of pointers to heap-allocated fullword 
 integers, 20 (40) bytes per character (assuming non-latin1).

Hey, I love ByteStrings!  ;-)

Malte



Re: [Haskell-cafe] Toy compression algorithms [was: A very edgy language]

2007-07-08 Thread Andrew Coppin

Stefan O'Rear wrote:

On Sun, Jul 08, 2007 at 12:10:04PM +0100, Andrew Coppin wrote:
  
(Realistically though. My program takes a [Word8] and turns it into a 
[Bool] before running a parser over it. The GHC optimiser doesn't really 
stand a hope in hell of optimising that into a program that reads a machine 
word into a CPU register and starts playing with bit flips on it...)



Actually, if you're very lucky (fusion is just as hard in Haskell as it
is in real life), it *does*.  It seems to fit nicely into the
stream-fusion framework.
  


Ooo... really? That's pretty impressive...(!)

Is there a way I can check? ;-) More usefully, can I do stuff to my code 
to make myself more lucky?


(Love the comment about RL BTW!)

PS. Are those zlib libraries actually written in Haskell? Or are they a 
thin layer on top of a C library?



Yup, they wrap C's zlib.
  


Thought so. Comparing native Haskell to a heavily optimised C library 
would surely be just like comparing native Haskell to a compiled C binary...



PPS. Does GHC make use of MMX, SSE, et al?



No (in spirit - the native code generator uses 1-element SSE operations
for floating point because it's easier to optimize than FPU code).
  


Does GHC actually do anything that could usefully use these primitives? 
(I'm guessing no...)




Re: [Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Andrew Coppin

Malte Milatz wrote:

Stefan O'Rear:
  
[Char] is a linked list of pointers to heap-allocated fullword 
integers, 20 (40) bytes per character (assuming non-latin1).



Hey, I love ByteStrings!  ;-)
  


If only there were a way to write functions that transparently work on 
both [x] and ByteString...


(Well, I mean, there *is* - if you want to write *lots* of code by hand 
yourself...)


Anyone have any comments on how ByteString is different from, say, 
UArray Word8?




Re: [Haskell-cafe] Toy compression algorithms [was: A very edgy language]

2007-07-08 Thread Stefan O'Rear
On Sun, Jul 08, 2007 at 04:07:59PM +0100, Andrew Coppin wrote:
 The way I heard it is that it *does* native code generation, but it's not 
 as good as the via-C version. (Can't imagine why... You would have thought 
 directly implementing functional primitives would be much easier than 
 trying to mangle them to fit C's multitude of limitations. Still, what do I 
 know?)

This is very true - but -fvia-C isn't really C.  First it generates GCC
C (with global register variables, among other things), then it feeds
the output through gcc - and then it runs a GHC-specific Perl script
called the Evil Mangler over the assembly output, which promptly regexes
your code to death.

If you want to see what a difference exists between true C and the NCG,
try compiling your program with -fvia-C -unreg.  -unreg tells GHC to
generate (something very close to) ANSI C; also, the mangler is not
used.  -unreg is mainly used for porting...

Stefan


Re: [Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Stefan O'Rear
On Sun, Jul 08, 2007 at 04:16:46PM +0100, Andrew Coppin wrote:
 Malte Milatz wrote:
 Stefan O'Rear:
   
 [Char] is a linked list of pointers to heap-allocated fullword integers, 
 20 (40) bytes per character (assuming non-latin1).
 

 Hey, I love ByteStrings!  ;-)
   

 If only there were a way to write functions that transparently work on both 
 [x] and ByteString...

 (Well, I mean, there *is* - if you want to write *lots* of code by hand 
 yourself...)

 Anyone have any comments on how ByteString is different from, say, UArray 
 Word8?

1. ByteString uses pinned memory, so you can safely pass ByteStrings to
   C code (as char*) without worrying about the strings moving.

2. ByteString uses foreignptrs, which mean that you can construct
   bytestrings from any block of memory, and you can associate arbitrary
   actions (free(), nothing, something fancier, ...) with the ByteString
   becoming unreferenced.

3. ByteString uses explicit offset and length fields, allowing a O(1)
   tail operation (very important for functional-style code)

4. Lazy bytestrings are completely different - look elsewhere in this
   thread.

Stefan


Re[2]: [Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Bulat Ziganshin
Hello Andrew,

Sunday, July 8, 2007, 7:16:46 PM, you wrote:
 [Char] is a linked list of pointers to heap-allocated fullword
 integers, 20 (40) bytes per character (assuming non-latin1).
 

 Hey, I love ByteStrings!  ;-)


actually only 12 (24 for a 64-bit cpu) as long as you use latin-1 chars.
the same should be true for [Word8], but it's better to ask Simon

OTOH, using GC makes memory usage 3x larger. in practical C
compression algorithms, memory is controlled manually, so against C's
one byte per char you get roughly 36x space overhead (12 bytes per char
times the 3x GC factor). now you may guess how i decreased memory usage
12-fold :)

 If only there were a way to write functions that transparently work on
 both [x] and ByteString...

use pack/unpack to convert between them - it's cheap compared to your
algorithms
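
(For example - a tiny sketch, with 'viaList' an invented name:)

    import qualified Data.ByteString as B
    import Data.Word (Word8)

    -- Run a list-based transform over a ByteString by converting at
    -- the edges; fine whenever the transform itself dominates the cost.
    viaList :: ([Word8] -> [Word8]) -> B.ByteString -> B.ByteString
    viaList f = B.pack . f . B.unpack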

 Anyone have any comments on how ByteString is different from, say,
 UArray Word8?

mainly, algorithms implemented. the only technical difference is that
UArray uses ByteArray# and ByteString uses PinnedByteArray#, which has
different behavior in GC

-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]



Re[2]: [Haskell-cafe] Toy compression algorithms [was: A very edgy language]

2007-07-08 Thread Bulat Ziganshin
Hello Stefan,

Sunday, July 8, 2007, 7:22:03 PM, you wrote:

 This is very true - but -fvia-C isn't really C.

 If you want to see what difference exists between true C and the NCG,
 try compiling your program with -fvia-C -unreg

even better, try to use jhc. it compiles via gcc and, unlike ghc, it
generates natural code. as a result, it's the only haskell compiler that
is able to generate C-quality code:

http://thread.gmane.org/gmane.comp.lang.haskell.glasgow.user/8827


-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]



Re: [Haskell-cafe] A very edgy language

2007-07-08 Thread Andrew Coppin

Gwern Branwen wrote:


Out of curiosity, why does ByteString wreck the cleanness of your BWT? It seems
to me that if you're doing

 bwt :: String -> Whatever
 bwt arg = ...(time and space intensive ByteString operations) $
   ByteString.pack arg

then your code is only modestly less clean.


It's more that currently I have

 bwt :: (Ord x) => [x] -> [x]

and I'm going to have to change that to

 bwt :: ByteString -> ByteString

Then I'll have to change all the list functions to ByteString functions. 
And then - the hardest part - I'm going to have to edit my entire 
program framework to make it do ByteString I/O instead of [Char], and 
pipe it all the way through to the BWT function. And then I'll have to 
go edit all the algorithms that *don't* use ByteStrings to fix them...


The alternative is to do

 bwt = ByteString.unpack . ... . ByteString.pack

I have no idea how efficient or not that would be...



Re: [Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Andrew Coppin

Bulat Ziganshin wrote:
> Hello Andrew,
>
>> Anyone have any comments on how ByteString is different from, say,
>> UArray Word8?
>
> mainly, algorithms implemented. the only technical difference is that
> UArray uses ByteArray# and ByteString uses PinnedByteArray#, which has
> different behavior in GC


I just wish I could have all that ByteString goodness without being 
limited to only having Word8 to play with. :-(


(E.g., the inverse BWT makes use of positive integers that are likely to 
be way bigger than 255. But there isn't a fast packed Word16 or Word32 
array I can use there...)




[Haskell-cafe] In-place modification

2007-07-08 Thread Andrew Coppin
I was wittering on about stream fusion and how great it is, and I got a 
message from Mr C++.


(Mr C++ develops commercial games, and is obsessed with performance. For 
him, the only way to achieve the best performance is to have total 
control over every minute detail of the implementation. He sees Haskell
as a stupid language that can never be fast. It seems he's not alone...)



He said two things. The first was "it really perplexes me that Haskell
insists that everything must be read-only". Naturally, I have a good
answer to that one. (It's like being perplexed that relational
databases refuse to keep table data in the same order. It's not to be
awkward; it is a fundamental and important property of the underlying
theoretical foundation - the relational algebra, et al.)


He also said what basically boils down to "being able to swap elements
around in O(1) time and O(0) space is the whole thing that makes linked
lists useful in the first place; take that away and it's rather
pointless". I don't really have an answer to that one. (Laziness and GC
is probably going to kill most of the space cost. There's still a time
cost - particularly the extra GC time...)


I've asked this before and nobody answered, so I take it that nobody 
knows the answer... Does GHC *ever* do an in-place update on anything? 
Does the STG even have a command for that? I always assumed that the 
compiler tried to find instances where a new structure is created and 
the old one is no longer referenced, and make that be in-place. But now 
I'm not so sure...




Re: [Haskell-cafe] A very nontrivial parser

2007-07-08 Thread Jonathan Cast
On Sunday 08 July 2007, Andrew Coppin wrote:
 Jonathan Cast wrote:
  I wouldn't call rank-2 types extremely rare . . .

 Well now, my parser is annoyingly clunky to use, but it *works*.
 However, I just found something where it seems to be *impossible* to
 write the necessary code without rank-2 types...


 I tried to write this type:

   data Encoder2 = Encoder2 {stage1 :: [Word8] -> x, stage2 :: x ->
 [Word8] -> [Word8]}

 However, that doesn't work. All type variables on the right must appear
 on the left:

   data Encoder2 x = Encoder2 {stage1 :: [Word8] -> x, stage2 :: x ->
 [Word8] -> [Word8]}

 Now I have a problem. I want to put several of these puppies into a big
 list - and I do *not* want to force the type variable 'x' to be the same
 in all cases! (Although one can easily imagine situations where you
 might want this.) So as of now, my code uses rank-2 types - despite the
 fact that I don't actually know what a rank-2 type *is* yet! o_O This is
 rather troubling...

I think surely you're using existential data types rather than rank-2 types.

Existential types: each application of Encoder2 is to arguments which require 
a specific value of x.

Rank-2 types (polymorphic fields, actually): each application of Encoder2 is 
to arguments which work with any value of x.
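
To make the contrast concrete, a minimal sketch (the 'runEncoder2'
helper and the toy encoders are invented for illustration, not taken
from Andrew's code):

    {-# LANGUAGE ExistentialQuantification #-}
    import Data.Word (Word8)

    -- Existential version: each value hides its own choice of x, so
    -- one list can mix encoders with different intermediate types.
    data Encoder2 = forall x. Encoder2 ([Word8] -> x)            -- pass 1
                                       (x -> [Word8] -> [Word8]) -- pass 2

    runEncoder2 :: Encoder2 -> [Word8] -> [Word8]
    runEncoder2 (Encoder2 s1 s2) ws = s2 (s1 ws) ws

    encoders :: [Encoder2]
    encoders =
      [ Encoder2 length  (\n ws -> take (n `div` 2) ws)  -- x = Int
      , Encoder2 reverse (\rs _  -> rs)                  -- x = [Word8]
      ]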

Jonathan Cast
http://sourceforge.net/projects/fid-core
http://sourceforge.net/projects/fid-emacs


Re: [Haskell-cafe] A very nontrivial parser

2007-07-08 Thread Andrew Coppin

Jonathan Cast wrote:
> I think surely you're using existential data types rather than rank-2
> types.

You expect *me* to know?

> Existential types: each application of Encoder2 is to arguments which
> require a specific value of x.
>
> Rank-2 types (polymorphic fields, actually): each application of Encoder2
> is to arguments which work with any value of x.


All I know is it didn't compile, so I added {-# LANGUAGE Rank2Types #-}, 
and now it works. *shrugs*




Re: [Haskell-cafe] A very nontrivial parser

2007-07-08 Thread Jonathan Cast
On Sunday 08 July 2007, Andrew Coppin wrote:
 Jonathan Cast wrote:
  I think surely you're using existential data types rather than rank-2
  types.

 You expect *me* to know?

Surely not :)  That's why I tried briefly explaining the ideas again.

  Existential types: each application of Encoder2 is to arguments which
  require a specific value of x.
 
  Rank-2 types (polymorphic fields, actually): each application of Encoder2
  is to arguments which work with any value of x.

 All I know is it didn't compile, so I added {-# LANGUAGE Rank2Types #-},
 and now it works. *shrugs*

If you're happy, then I guess I can accept the situation.  I think you'll
trip over the distinction again, but that's just me.

Jonathan Cast
http://sourceforge.net/projects/fid-core
http://sourceforge.net/projects/fid-emacs


Re: [Haskell-cafe] A very nontrivial parser

2007-07-08 Thread Andrew Coppin

Jonathan Cast wrote:
> On Sunday 08 July 2007, Andrew Coppin wrote:
>> Jonathan Cast wrote:
>>> I think surely you're using existential data types rather than rank-2
>>> types.
>>
>> You expect *me* to know?
>
> Surely not :)  That's why I tried briefly explaining the ideas again.

LOL! Thanks...

>> All I know is it didn't compile, so I added {-# LANGUAGE Rank2Types #-},
>> and now it works. *shrugs*
>
> If you're happy, then I guess I can accept the situation.  I think you'll
> trip over the distinction again, but that's just me.


I must admit, the code compiles, but I have not yet attempted to *use* 
it for anything...


(It's the code for supporting 2-pass encoders, and I haven't written any 
yet.)




Re[2]: [Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Bulat Ziganshin
Hello Andrew,

Sunday, July 8, 2007, 9:38:18 PM, you wrote:
 (E.g., the inverse BWT makes use of positive integers that are likely to
 be way bigger than 255. But there isn't a fast packed Word16 or Word32
 array I can use there...)

well, that's true as long as you swear to never use UArray :))


-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]



Re: [Haskell-cafe] In-place modification

2007-07-08 Thread Bulat Ziganshin
Hello Andrew,

Sunday, July 8, 2007, 9:40:15 PM, you wrote:

 I've asked this before and nobody answered, so I take it that nobody
 knows the answer... Does GHC *ever* do an in-place update on anything?

no. this will break GC's heart :)  and it really breaks it as soon as
you start to work with updateable arrays. look for details at
http://haskell.org/haskellwiki/Modern_array_libraries

 Does the STG even have a command for that?

hm. there are ghc primitives that work with mutable arrays. look for
primops.txt.pp in the ghc sources. btw, you don't need to use unix in
order to play with ghc HEAD - you can download a compiled windows binary

-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]



Re: [Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Andrew Coppin

Bulat Ziganshin wrote:
>> (E.g., the inverse BWT makes use of positive integers that are likely to
>> be way bigger than 255. But there isn't a fast packed Word16 or Word32
>> array I can use there...)
>
> well, that's true as long as you swear to never use UArray :))


OTOH, everybody uses ByteString rather than UArray Word8. (And, in fact, 
ByteString *exists* even though UArray Word8 was there first.) So one 
presumes it's because ByteString has some kind of magic that makes it 
faster than a UArray...




Re[2]: [Haskell-cafe] Toy compression algorithms [was: A very edgy language]

2007-07-08 Thread Bulat Ziganshin
Hello Andrew,

Sunday, July 8, 2007, 7:12:38 PM, you wrote:
 (Realistically though. My program takes a [Word8] and turns it into a
 [Bool] before running a parser over it. The GHC optimiser doesn't really 
 stand a hope in hell of optimising that into a program that reads a machine 
 word into a CPU register and starts playing with bit flips on it...)
 

 Actually, if you're very lucky (fusion is just as hard in Haskell as it
 is in real life), it *does*.  It seems to fit nicely into the
 stream-fusion framework.
   

 Ooo... really? That's pretty impressive...(!)

it's our collective tale for bringing in new haskellers. i bet that
Stefan has never seen the asm code generated by ghc :)


-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]



Re[2]: [Haskell-cafe] Fun with ByteStrings [was: A very edgy language]

2007-07-08 Thread Bulat Ziganshin
Hello Andrew,

Sunday, July 8, 2007, 10:41:53 PM, you wrote:

 OTOH, everybody uses ByteString rather than UArray Word8. (And, in fact,
 ByteString *exists* even though UArray Word8 was there first.) So one 
 presumes it's because ByteString has some kind of magic that makes it 
 faster than a UArray...

as i already said, it's due to the algorithms implemented. when one needs
UArray operations, they use it. when one needs string operations, they use
ByteString. the rest is marketing hype and i bet that the UArray
advertising campaign was ~15 years ago :)


-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]



Re: [Haskell-cafe] Toy compression algorithms [was: A very edgy language]

2007-07-08 Thread Andrew Coppin

Bulat Ziganshin wrote:
>> Ooo... really? That's pretty impressive...(!)
>
> it's our collective tale for bringing in new haskellers. i bet that
> Stefan has never seen the asm code generated by ghc :)


LOL!

Once - just once - I did take a look at the C output from GHC. Now, I 
realise that I can't code in C to save my life, but... it didn't even 
*look* like C. Even slightly. Wow.


(OTOH, reading Core isn't too helpful either. It just looks like the 
original Haskell...)




Re: [Haskell-cafe] In-place modification

2007-07-08 Thread Andrew Coppin

Bulat Ziganshin wrote:
>> I've asked this before and nobody answered, so I take it that nobody
>> knows the answer... Does GHC *ever* do an in-place update on anything?
>
> no. this will break GC's heart :)

Yeah, having only immutable objects to collect must simplify a whole
bunch of stuff...

> and it really breaks it as soon as you start to work with updateable
> arrays. look for details at
> http://haskell.org/haskellwiki/Modern_array_libraries


"GHC 6.6 (currently in beta testing) will add..."

Oh, that's cute. ;-)


>> Does the STG even have a command for that?
>
> hm. there are ghc primitives that work with mutable arrays. look for
> primops.txt.pp in the ghc sources.

The GHC sources. Scary place. ;-)

I did think about compiling GHC myself once. But then I found out that 
it's not actually written in Haskell - it's written in Haskell + C + asm 
+ Perl (?!) and it's utterly *huge*...



> btw, you don't need to use unix in order to play with ghc HEAD - you
> can download a compiled windows binary

Seriously? I'm pretty sure I tried to do that and couldn't...



Re: [Haskell-cafe] Language semantics

2007-07-08 Thread Daniil Elovkov

2007/7/1, Andrew Coppin [EMAIL PROTECTED]:


I'm going to have to take some time to bend my mind around that one too...

(Does anybody else on this list frequently get the feeling their IQ is
just too low for Haskell??)


I do. Sometimes when I find myself doing some type hackery.

And that's a sad feeling. Not because I realise that my IQ is lower
than it could be, but because the whole idea of Haskell is doing
things fast and easily, while on that sort of thing I can spend quite
a lot of time.

But that's only that fragment of the work. The rest goes well :)

When I do that, I may think "hey, seems like in Java I'd do that
pretty quickly without a lot of thinking", but then you realise that
you're striving for much more than you'd get in Java. Many more static
guarantees, so the comparison isn't correct at all.


Re: [Haskell-cafe] Toy compression algorithms [was: A very edgy language]

2007-07-08 Thread Stefan O'Rear
On Sun, Jul 08, 2007 at 10:40:10PM +0400, Bulat Ziganshin wrote:
 Hello Andrew,
 
 Sunday, July 8, 2007, 7:12:38 PM, you wrote:
  (Realistically though. My program takes a [Word8] and turns it into a
  [Bool] before running a parser over it. The GHC optimiser doesn't really 
  stand a hope in hell of optimising that into a program that reads a 
  machine 
  word into a CPU register and starts playing with bit flips on it...)
  
 
  Actually, if you're very lucky (fusion is just as hard in Haskell as it
  is in real life), it *does*.  It seems to fit nicely into the
  stream-fusion framework.

 
  Ooo... really? That's pretty impressive...(!)
 
 it's our collective tale for bringing in new haskellers. i bet that
 Stefan has never seen the asm code generated by ghc :)

If you check the list archives, you'll see that I've been a major
contributor to quite a few threads on the GHC code generator, and I
posted to the JHC list months ago.

Also, I said it would be read into a register, I never said it wouldn't
be spilled two instructions later ;)

Stefan (If I'm taking this totally wrong, make sure I know)


Re: [Haskell-cafe] In-place modification

2007-07-08 Thread Stefan O'Rear
On Sun, Jul 08, 2007 at 07:52:19PM +0100, Andrew Coppin wrote:
 Bulat Ziganshin wrote:
 Hello Andrew,
 Sunday, July 8, 2007, 9:40:15 PM, you wrote:
 I've asked this before and nobody answered, so I take it that nobody
 knows the answer... Does GHC *ever* do an in-place update on anything?

 no. this will break GC's heart :)

 Yeah, having only immutable objects to collect must simplify a whole bunch 
 of stuff...

Yes, GHC does in-place updates on almost everything.  But probably not
in the way you wanted.

The crucial feature of laziness, that distinguishes it from call-by-name
as found in old ALGOL, is that when you finish evaluating an expression
you overwrite the expression with its value.  This guarantees that
expressions are only evaluated once, and is crucial for the performance
of idioms such as circular programming and array-memoization.
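
(A tiny illustration of that evaluate-once behaviour - the circular
'fibs' below is the classic example, not code from this thread; each
cell is computed once and its thunk is then overwritten with the value,
so the 50th element costs about 50 additions rather than 2^50:)

    fibs :: [Integer]
    fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

    main :: IO ()
    main = print (fibs !! 50)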

This does mean that GHC needs a *very* simple write barrier in the GC -
it just tacks the current node onto a linked list.  No overflow checks,
no occurence checks, just a couple of fast instructions.

But this does mean that everything overwritable (with pointers)
(MutVar#, MutArray#, SynchVar#, TVar#, indirections) needs an extra link
pointer.

 Does the STG even have a command for that?
 

 hm. there are ghc primitives that work with mutable arrays. look for
 primops.txt.pp in ghc sources.

 The GHC sources. Scary place. ;-)

You could also look at the HTMLized primop documentation:

http://haskell.org/ghc/dist/current/docs/libraries/base/GHC-Prim.html

writeArray#, write<type>Array#, write<type>OffAddr#, writeMutVar#,
writeTVar#, takeMVar#, tryTakeMVar#, putMVar#, and tryPutMVar# all do
in-place writes.
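
(What that looks like from user code - a small sketch over the ST
interface to those primitives; the function name is mine:)

    import Control.Monad.ST (ST)
    import Data.Array.ST (STUArray, readArray, writeArray)

    -- The O(1), zero-extra-space swap Mr C++ was asking about:
    swapElems :: Int -> Int -> STUArray s Int Int -> ST s ()
    swapElems i j a = do
      x <- readArray a i
      y <- readArray a j
      writeArray a i y
      writeArray a j x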

 I did think about compiling GHC myself once. But then I found out that it's 
 not actually written in Haskell - it's written in Haskell + C + asm + Perl 
 (?!) and it's utterly *huge*...

Perl may die soon - GHC HQ is considering retiring the registerised
-fvia-C path, leaving only native code for normal use and ANSI C for
porting.  This is because mangled C is only about 3% faster according to
the GHC benchmark suite, and carries a high maintenance cost.

Stefan


Re: [Haskell-cafe] In-place modification

2007-07-08 Thread Neil Mitchell

Hi


btw, you don't need to use unix in
order to play with ghc HEAD - you can download a compiled windows binary

Seriously? I'm pretty sure I tried to do that and couldn't...


Seriously. Thanks to Igloo, you can even download a GHC nightly
complete with an installer! It doesn't get any easier than that :)

http://www.haskell.org/ghc/dist/current/dist/ghc-6.7.20070706-i386-unknown-mingw32.exe


 I did think about compiling GHC myself once. But then I found out that it's
 not actually written in Haskell - it's written in Haskell + C + asm + Perl
 (?!) and it's utterly *huge*...



Perl may die soon - GHC HQ is considering retiring the registerised
-fvia-C path, leaving only native code for normal use and ANSI C for
porting.  This is because mangled C is only about 3% faster according to
the GHC benchmark suite, and carries a high maintenance cost.


I still wouldn't want to compile GHC on Windows... For the only GHC
patch I've ever contributed I borrowed a friend's Linux box. The build
instructions alone for Windows scared me off.

Thanks

Neil


[Haskell-cafe] Gtk2Hs, Glade, Multiple windows/dialog in one interface file.

2007-07-08 Thread D . V .

Is it possible to have multiple windows in a glade file, and display
only one when the program starts, then show/hide the others as the
need arises ?

I can't find how...

I have a main window with a menu. One entry shows the about dialog
which is also defined in the glade file:
 aboutmenu <- xmlGetWidget xml castToMenuItem "apropos1"
 onActivateLeaf aboutmenu $ do
   widgetShowAll aboutdialog

My problems are multiple:
1) how do I handle the close button that is automatically generated?
I tried doing this...

 aboutdialog <- xmlGetWidget xml castToAboutDialog "aboutdialog1"
 onDestroy aboutdialog $ do
   widgetHide aboutdialog

But the Close button doesn't work, and if I close the about dialog via
the upper-right cross button, it does disappear; but if I select the
menu item again, a tiny empty dialog opens, as if the about dialog
has been stripped of all its inner components. And I get this line:

(interactive:10852): Gtk-CRITICAL **: gtk_container_foreach:
assertion `GTK_IS_CONTAINER (container)' failed

I tried looking at the demos, but it doesn't help. The closest that
could have helped uses an about dialog it creates, instead of reading
from a glade file.


Re[2]: [Haskell-cafe] Toy compression algorithms [was: A very edgy language]

2007-07-08 Thread Bulat Ziganshin
Hello Stefan,

Sunday, July 8, 2007, 11:03:00 PM, you wrote:

  (Realistically though. My program takes a [Word8] and turns it into a
  [Bool] before running a parser over it. The GHC optimiser doesn't really 
  stand a hope in hell of optimising that into a program that reads a 
  machine 
  word into a CPU register and starts playing with bit flips on it...)

well, can you give us an example of code which does what Andrew said
and translates into simple bit fiddling, without battling lazy lists
all around, which kills performance?

-- 
Best regards,
 Bulat  mailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Gtk2Hs, Glade, Multiple windows/dialog in one interface file.

2007-07-08 Thread Brandon S. Allbery KF8NH


On Jul 8, 2007, at 15:37 , D.V. wrote:


Is it possible to have multiple windows in a glade file, and display
only one when the program starts, then show/hide the others as the
need arises ?


You can but it's not well documented.

I found that it was necessary to load the window each time you need  
it.  (you must re-load the XML each time)



  (Just g) <- xmlNewWithRootAndDomain (pathLoad ++ "hwatchmwp.glade")
                                      (Just "vwin")
                                      Nothing
  w <- xmlGetWidget g castToDialog "vwin"


(I assume it'll work because it did at program startup.  Arguably I  
should actually do error checking instead...)


I *think* the right way to handle the close button is the onDelete  
(not onDestroy which happens far too late to back out) handler.  (I  
just noticed that I'm not actually handling it, where I'd thought I  
was.  Oops.)


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon university    KF8NH




Re: [Haskell-cafe] Gtk2Hs, Glade, Multiple windows/dialog in one interface file.

2007-07-08 Thread D . V .

I'm not sure I understand xmlNewWithRootAndDomain, I'll make some tests.


I found that it was necessary to load the window each time you need
it.  (you must re-load the XML each time)


I was hoping I could only hide the dialog and not destroy it.


Re: [Haskell-cafe] Gtk2Hs, Glade, Multiple windows/dialog in one interface file.

2007-07-08 Thread Brandon S. Allbery KF8NH


On Jul 8, 2007, at 16:00 , D.V. wrote:

I'm not sure I understand xmlNewWithRootAndDomain, I'll make some  
tests.



I found that it was necessary to load the window each time you need
it.  (you must re-load the XML each time)


I was hoping I could only hide the dialog and not destroy it.


As I mentioned afterward, try hiding it in the OnDelete hook.  (And  
return True (I think) so the standard handler doesn't go ahead and  
destroy the window afterward.)


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon university    KF8NH




Re: [Haskell-cafe] Gtk2Hs, Glade, Multiple windows/dialog in one interface file.

2007-07-08 Thread D . V .

Oh !

Okay I tried and YOU ROCK! Thanks a lot. Now it hides and comes back at will.

But I still can't react to that Close button that's automatically
added to that About Dialog.

Here's my latest (random) try:

 aboutdialog <- xmlGetWidget xml castToAboutDialog "aboutdialog1"
 onDelete aboutdialog $ \event -> do
   widgetHide aboutdialog
   return True
 onResponse aboutdialog $ \resp -> do
   case resp of
     ResponseClose -> widgetHide aboutdialog
     _             -> return ()


Re: [Haskell-cafe] In-place modification

2007-07-08 Thread Andrew Coppin

Stefan O'Rear wrote:
> Yes, GHC does in-place updates on almost everything.  But probably not
> in the way you wanted.


Ah yes, true...

(Try implementing that in Java. Tricky...)

>> I did think about compiling GHC myself once. But then I found out that it's
>> not actually written in Haskell - it's written in Haskell + C + asm + Perl
>> (?!) and it's utterly *huge*...
>
> Perl may die soon


Yay!

Oh wait, you meant Perl in GHC? :-}


GHC HQ is considering retiring the registerised
-fvia-C path, leaving only native code for normal use and ANSI C for
porting.  This is because mangled C is only about 3% faster according to
the GHC benchmark suite, and carries a high maintenance cost.
  


It would mean that on Windoze, you don't have to supply gcc and an 
emulator for running it...


(Are you keeping an external assembler and linker, or doing the whole 
shooting match in GHC?)




Re: [Haskell-cafe] In-place modification

2007-07-08 Thread Andrew Coppin

Neil Mitchell wrote:
> Hi
>
>>> btw, you don't need to use unix in
>>> order to play with ghc HEAD - you can download a compiled windows binary
>>
>> Seriously? I'm pretty sure I tried to do that and couldn't...
>
> Seriously. Thanks to Igloo, you can even download a GHC nightly
> complete with an installer! It doesn't get any easier than that :)
>
> http://www.haskell.org/ghc/dist/current/dist/ghc-6.7.20070706-i386-unknown-mingw32.exe



OK. So that's just the GHC binary itself, right?


Perl may die soon - GHC HQ is considering retiring the registerised
-fvia-C path, leaving only native code for normal use and ANSI C for
porting.  This is because mangled C is only about 3% faster according to
the GHC benchmark suite, and carries a high maintenance cost.


I still wouldn't want to compile GHC on Windows... For the only GHC
patch I've ever contributed I borrowed a friend's Linux box. The build
instructions alone for Windows scared me off.


Hmm... Mind you, not sure I fancy trying it on Linux either. Just 
because Linux is scary... :-P




Re: [Haskell-cafe] Gtk2Hs, Glade, Multiple windows/dialog in one interface file.

2007-07-08 Thread Brandon S. Allbery KF8NH


On Jul 8, 2007, at 16:24 , D.V. wrote:


But I still can't react to that Close button that's automatically
added to that About Dialog.

Here's my latest (random) try:

 aboutdialog <- xmlGetWidget xml castToAboutDialog "aboutdialog1"
 onDelete aboutdialog $ \event -> do
   widgetHide aboutdialog
   return True
 onResponse aboutdialog $ \resp -> do
   case resp of
     ResponseClose -> widgetHide aboutdialog
     _             -> return ()


That looks reasonable to me, unfortunately I'd guess the dialog  
already got dealt with in the stock Close button handler.  You might  
need to dig inside the AboutDialog and install your own onClicked  
handler on the Close button.


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon university    KF8NH




Re: [Haskell-cafe] Gtk2Hs, Glade, Multiple windows/dialog in one interface file.

2007-07-08 Thread D . V .

I finally got it to work with onResponse: I traced each possible
response to see which one was fired when clicking the close button.

 onResponse aboutdialog $ \resp -> do
   putStrLn "onResponse!!!"
   case resp of
     ResponseNone        -> putStrLn "ResponseNone"
     ResponseReject      -> putStrLn "ResponseReject"
     ResponseAccept      -> putStrLn "ResponseAccept"
     ResponseDeleteEvent -> putStrLn "ResponseDeleteEvent"
     ResponseOk          -> putStrLn "ResponseOk"
     ResponseCancel      -> putStrLn "ResponseCancel" >> widgetHide aboutdialog
     ResponseClose       -> putStrLn "ResponseClose"
     ResponseYes         -> putStrLn "ResponseYes"
     ResponseNo          -> putStrLn "ResponseNo"
     ResponseApply       -> putStrLn "ResponseApply"
     ResponseHelp        -> putStrLn "ResponseHelp"
     ResponseUser n      -> putStrLn ("ResponseUser " ++ show n)


Re: [Haskell-cafe] Gtk2Hs, Glade, Multiple windows/dialog in one interface file.

2007-07-08 Thread Brandon S. Allbery KF8NH


On Jul 8, 2007, at 16:36 , D.V. wrote:


I finally got it to work with onResponse : I traced each possible
response to see which one was fired when clicking the close button


Great, another place where the documentation's wrong.  :/   
(onActivateLeaf vs. onActivateItem (and after- versions) also found  
to be wrong.  Must submit a bug report at some point.)


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon university    KF8NH




Re: [Haskell-cafe] In-place modification

2007-07-08 Thread Stefan O'Rear
On Sun, Jul 08, 2007 at 09:35:10PM +0100, Andrew Coppin wrote:
 GHC HQ is considering retiring the registerised
 -fvia-C path, leaving only native code for normal use and ANSI C for
 porting.  This is because mangled C is only about 3% faster according to
 the GHC benchmark suite, and carries a high maintenance cost.
   

 It would mean that on Windoze, you don't have to supply gcc and an emulator 
 for running it...

That was also part of the justification.

 (Are you keeping an external assembler and linker, or doing the whole 
 shooting match in GHC?)

Currently, we use gas and ld, yes...

Stefan


Re: [Haskell-cafe] Re: practicality of typeful programming

2007-07-08 Thread Daniil Elovkov

Hello Titto

2007/6/28, Pasqualino 'Titto' Assini [EMAIL PROTECTED]:

Hi Daniil,

I had a look at the paper and associated code that Oleg refers to; there is no
special parsing taking place:

From Vector/read-examples.hs:

v3 = do
    let m1 = $(dAM [[1,2],[3,4]])
    s <- readFile "Vector/example-data.txt"
    listMatRow (read s) (\(m2 :: AVector Double a) ->
        print $ m2 * trans m1
      )

It does not make any difference if the list that is used to populate the
matrix is specified in the code or read from a file.

In both cases, if the list length is incorrect, an error is generated at
run-time (I think, I cannot run the actual code).


You're right.

Let me explain how I see what's going on there.

It looks like, indeed, in both cases (matrix read from a file and
matrix given in the program) the error will be reported at run-time.
But that is because in both those cases Frederik converts an untyped
value to a typed one, via listVec*, listMat*. The untyped values being
lists and lists of lists.

Compare that to what we see, for example, in Oleg's (and others')
paper "Strongly typed heterogeneous collections", where the value is
constructed as typed in the first place.

In case of reading from a file, an error obviously can't be reported
at compile-time.

So, the 'parsing' takes place in both of those cases. But, indeed,
there seems to be no specific parsing, in the sense of a graceful
(UnTyped -> Maybe Typed). I failed to find the place, but I suspect
the 'error' function is called somewhere in a class method which
builds the typed thing.

I think the typechecker sees the constraint of (*) and supposes the
correct type for that lambda-bound parameter. Having guessed that, it
calls the appropriate instance's method. The latter fails to build the
right thing and stops with an error. This is only a guess.


[Haskell-cafe] Re: practicality of typeful programming

2007-07-08 Thread Daniil Elovkov

Hello Oleg

2007/6/28, [EMAIL PROTECTED] [EMAIL PROTECTED]:


Daniil Elovkov wrote:
 The fact that structure is mixed with properties seems to put some
 limits on both doability and, even more, practilaty of encoding
 complex properties.

That's why phantom types, attached via a newtype wrapper, are so
appealing. If we remove the wrapper, we get the original value to be
used in other parts of the program. Adding the wrapper requires either
a dynamic test or a special way of constructing a value. Please see
the Non-empty list discussion and the example on Haskell Wiki.


Yes, but phantom types seem to only solve rather simple problems.

In HList, every (needed) list function had to be lifted to the type
level. No phantom type will let us work with an HList like an ordinary
list, with ordinary list functions, right?
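
(For reference, the phantom-type pattern Oleg mentions, as a minimal
sketch along the lines of the non-empty list example on the wiki; the
names here are illustrative:)

    {-# LANGUAGE EmptyDataDecls #-}
    data NonEmpty                    -- phantom tag, no values
    newtype Tagged tag a = Tagged a  -- wrapper, erased at runtime

    -- the one dynamic test needed to attach the property:
    nonEmpty :: [a] -> Maybe (Tagged NonEmpty [a])
    nonEmpty [] = Nothing
    nonEmpty xs = Just (Tagged xs)

    -- afterwards, taking the head is safe by construction:
    safeHead :: Tagged NonEmpty [a] -> a
    safeHead (Tagged (x:_)) = x
    safeHead _              = error "unreachable by construction"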


 in the paper Strongly typed heterogeneous collections you say that the
 given approach only works for a statically specified SQL query - I mean,
 one encoded in the Haskell program, not parsed from an untyped input
 string? (I've just flipped through it, but failed to find this
 place...) Either in the case when the data schema is statically known,
 or, even worse, when it's also dynamic.

 Interesting, can all the assertions be proved in that case? Like the
 correspondence between select field types and resultset record types.

Indeed, the assumption of the HList paper is that the query is
embedded in a program (entered by the programmer alongside the code)
and the database schema is known. This is quite a common use
case. There are cases however when the database schema is dynamic and
so is the query: that is, the query is received in a string or file,
after the (part of the) code has been compiled. Then we need to
typecheck that query. The best way of doing this is via Template
Haskell. We read the query from the string, make an AST, and then
splice it in. The latter operation implicitly invokes the Haskell
typechecker. If it is happy, the result can be used as if the query
were static to start with.
 ... snip


Well, it is expected to be _hard_. AnimalSchema is a type defined in
the program. I really believe this particular thing is doable.

But how acceptable is the price? It takes a lot of effort. It is
non-trivial enough to be a good subject for a scientific paper. But
how _practical_ is it?


[Haskell-cafe] Allocating enormous amounts of memory and wondering why

2007-07-08 Thread Jefferson Heard
I'm using the Data.AltBinary package to read in a list of 4.8 million
floats and 1.6 million ints.  Doing so caused the memory footprint to
blow up to more than 2gb, which on my laptop simply causes the program
to crash.  I can do it on my workstation, but I'd really rather not,
because I want my program to be fairly portable.  

The file that I wrote out in packing the data structure was only 28MB,
so I assume I'm just using the wrong data structure, or I'm using full
laziness somewhere I shouldn't be.

I've tried compiling with profiling enabled, but I wasn't able to,
because the Streams package doesn't seem to have an option for compiling
with profiling.  I'm also a newbie to Cabal, so I'm probably just
missing something.  

The fundamental question, though, is: is there something wrong with how
I wrote the following function?

binaryLoadDocumentCoordinates :: String -> IO (Ptr CFloat, [Int])
binaryLoadDocumentCoordinates path = do
  pointsH <- openBinaryFile (path ++ "/Clusters.bin") ReadMode
  coordinates <- get pointsH :: IO [Float]
  galaxies <- get pointsH :: IO [Int]
  coordinatesArr <- mallocArray (length coordinates)
  pokeArray coordinatesArr (map (fromRational . toRational) coordinates)
  return (coordinatesArr, galaxies)

I suppose in a pinch I could write a C function that serializes the
data, but I'd really rather not.  What I'm trying to do is load a bunch
of coordinates into a vertex array for OpenGL.  I did this for a small
30,000 item vertex array, but I need to be able to handle several
million vertices in the end.  

If I serialize an unboxed array instead of a list or if I do repeated
put_ and get calls, will that help with the memory problem?



Re: [Haskell-cafe] Allocating enormous amounts of memory and wondering why

2007-07-08 Thread Jefferson Heard
By the way, I've confirmed it doesn't even make it past the call to 

coordinates <- get pointsH :: IO [Float]

It just runs for about 15 seconds and then all the memory is consumed.
I'm using a laptop with 2gb of RAM and a 2.0gHz processor, so I assume
the read shouldn't take that long, since on the wiki, AltBinary says it
can run at around 20-50MB/sec.  I assume I'm doing something *way* wrong
here...

On Sun, 2007-07-08 at 17:26 -0400, Jefferson Heard wrote:
 I'm using the Data.AltBinary package to read in a list of 4.8 million
 floats and 1.6 million ints.  Doing so caused the memory footprint to
 blow up to more than 2gb, which on my laptop simply causes the program
 to crash.  I can do it on my workstation, but I'd really rather not,
 because I want my program to be fairly portable.  
 
 The file that I wrote out in packing the data structure was only 28MB,
 so I assume I'm just using the wrong data structure, or I'm using full
 laziness somewhere I shouldn't be.
 
 I've tried compiling with profiling enabled, but I wasn't able to,
 because the Streams package doesn't seem to have an option for compiling
 with profiling.  I'm also a newbie to Cabal, so I'm probably just
 missing something.  
 
 The fundamental question, though, is: is there something wrong with how I
 wrote the following function?
 
  binaryLoadDocumentCoordinates :: String -> IO (Ptr CFloat, [Int])
  binaryLoadDocumentCoordinates path = do
    pointsH <- openBinaryFile (path ++ "/Clusters.bin") ReadMode
    coordinates <- get pointsH :: IO [Float]
    galaxies <- get pointsH :: IO [Int]
    coordinatesArr <- mallocArray (length coordinates)
    pokeArray coordinatesArr (map (fromRational . toRational) coordinates)
    return (coordinatesArr, galaxies)
 
 I suppose in a pinch I could write a C function that serializes the
 data, but I'd really rather not.  What I'm trying to do is load a bunch
 of coordinates into a vertex array for OpenGL.  I did this for a small
 30,000 item vertex array, but I need to be able to handle several
 million vertices 
 in the end.  
 
 If I serialize an unboxed array instead of a list or if I do repeated
 put_ and get calls, will that help with the memory problem?
 



[Haskell-cafe] ANNOUNCE: utf8-string-0.1

2007-07-08 Thread Eric Mertens

Hello,

I'd like to announce that I have posted a UTF-8 encoding/decoding
library to hackage. This library also includes replacements for most
of the System.IO namespace under System.IO.UTF8. This library detects
overlong sequences, and replaces invalid code-points and invalid
encodings with the replacement character '\xfffd'.

The following file was used to ensure that the decoder was considered safe:
http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt

utf8-string can be found on hackage:
http://hackage.haskell.org/cgi-bin/hackage-scripts/package/utf8-string-0.1

source code is available via:
darcs get http://code.haskell.org/utf8-string/
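
Typical use, given the System.IO.UTF8 replacements described above (a
hedged sketch; consult the haddocks for the exact exports):

    import qualified System.IO.UTF8 as UTF8

    main :: IO ()
    main = do
      s <- UTF8.readFile "input.txt"  -- invalid bytes become '\xfffd'
      UTF8.putStrLn s                 -- written back out as UTF-8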

--
Eric Mertens


Re: [Haskell-cafe] Allocating enormous amounts of memory and wondering why

2007-07-08 Thread Stefan O'Rear
On Sun, Jul 08, 2007 at 05:26:18PM -0400, Jefferson Heard wrote:
 I'm using the Data.AltBinary package to read in a list of 4.8 million
 floats and 1.6 million ints.  Doing so caused the memory footprint to
 blow up to more than 2gb, which on my laptop simply causes the program
 to crash.  I can do it on my workstation, but I'd really rather not,
 because I want my program to be fairly portable.  
 
 The file that I wrote out in packing the data structure was only 28MB,
 so I assume I'm just using the wrong data structure, or I'm using full
 laziness somewhere I shouldn't be.
 
 I've tried compiling with profiling enabled, but I wasn't able to,
 because the Streams package doesn't seem to have an option for compiling
 with profiling.  I'm also a newbie to Cabal, so I'm probably just
 missing something.  
 
 The fundamental question, though, is: is there something wrong with how I
 wrote the following function?
 
  binaryLoadDocumentCoordinates :: String -> IO (Ptr CFloat, [Int])
  binaryLoadDocumentCoordinates path = do
    pointsH <- openBinaryFile (path ++ "/Clusters.bin") ReadMode
    coordinates <- get pointsH :: IO [Float]
    galaxies <- get pointsH :: IO [Int]
    coordinatesArr <- mallocArray (length coordinates)
    pokeArray coordinatesArr (map (fromRational . toRational) coordinates)
    return (coordinatesArr, galaxies)
 
 I suppose in a pinch I could write a C function that serializes the
 data, but I'd really rather not.  What I'm trying to do is load a bunch
 of coordinates into a vertex array for OpenGL.  I did this for a small
 30,000 item vertex array, but I need to be able to handle several
 million vertices in the end.  
 
 If I serialize an unboxed array instead of a list or if I do repeated
 put_ and get calls, will that help with the memory problem?

Why are you using AltBinary instead of the (much newer and faster)
Binary?  Binary *does* work with profiling and does not depend on
streams.

(To compile Binary with profiling support, add -p to the Cabal
configuration line.  This is documented in the --help message!)

Yes, using unboxed arrays will help.  Also try using the -c RTS option
(that is, run your program as ./myprogram +RTS -c -RTS) - this tells the
garbage collector to use a mark-compact system, which is slower than the
default copying collector but uses roughly half as much memory.
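
(Sketching the "unboxed array" suggestion with Data.Binary - this
assumes the UArray instance shipped with the binary package, and the
function names are mine:)

    import Data.Binary (encodeFile, decodeFile)
    import Data.Array.Unboxed (UArray)

    -- One flat unboxed blob on disk instead of millions of cons cells:
    saveFloats :: FilePath -> UArray Int Float -> IO ()
    saveFloats = encodeFile

    loadFloats :: FilePath -> IO (UArray Int Float)
    loadFloats = decodeFile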

Stefan


Re: [Haskell-cafe] In-place modification

2007-07-08 Thread Donald Bruce Stewart
andrewcoppin:
 I was wittering on about stream fusion and how great it is, and I got a 
 message from Mr C++.
 
 (Mr C++ develops commercial games, and is obsessed with performance. For 
 him, the only way to achieve the best performance is to have total 
 control over every minute detail of the implementation. He sees Haskell 
 is a stupid language that can never be fast. It seems he's not alone...)

Many libraries use inplace update and mutable data. Like, hmm, Data.ByteString.

-- Don


Re: [Haskell-cafe] Re: Too many packages on hackage? :-)

2007-07-08 Thread Donald Bruce Stewart
bulat.ziganshin:
 Hello apfelmus,
 
 Sunday, July 8, 2007, 5:20:18 PM, you wrote:
 
   Looks like there's too many packages on hackage.haskell.org now for a
 
 it's the nicest problem i can imagine :)
 
  For browsing libraries, I like the wiki pages much more than hackage.
  Can't those two be merged into one?
 
 maybe we just need an alternative single-page listing with FULL
 package descriptions instead of brief ones. this should be rather close
 in amount of information to the old wiki. a Categories section on top of
 the page should be enough for anyone who needs pages split by category
 
 many years ago i already proposed to extend hackage to include
 information about general entities, not only packages uploaded to it.
 unfortunately, its author was not interested in this. i still think
 that joining up HCAR, the libraries & tools wiki pages and hackage into
 one information source will give us significant benefits, making
 searching for a needed library easier, especially for casual haskellers.
 as you can see, in many cases, haskellers think that some libraries
 don't exist just because it's hard to find them

Another idea I've been pondering is allowing people to add links to
documentation for libraries, from their hackage page. We have a fair 
few libs documented in blog form, here,

http://haskell.org/haskellwiki/Blog_articles

So adding links to those from the hackage page for the library would help.

-- Don


[Haskell-cafe] Memoisation + unsafePerformIO

2007-07-08 Thread Tony Morris

Hello everyone,
I am trying to write a generalisation of memoisation for a couple of
backtracking algorithms that I wrote (which don't specify a memoisation
technique), by keeping a local destructive update using unsafePerformIO
and IORef - neither of which I have used before and I am struggling to
figure out. I am going from the document:
http://haskell.org/hawiki/GlobalMutableState

I am not sure if:
a) this is even possible
b) if a) holds, if this is a good thing

Here is some code to help demonstrate what I intend:

{-# OPTIONS_GHC -fglasgow-exts #-}
import Data.Ix

-- implementations must guarantee that
-- memo f k is equivalent to f k
-- but may otherwise use local destructive updates
class Memo k v where
  memo :: (k -> v) -> k -> v

instance (Ix k) => Memo k v where
  memo f k = error "todo: array"

instance (Ord k) => Memo k v where
  memo f k = error "todo: binary tree"
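
(One hedged way the Ord case might be filled in - an IORef'd Data.Map
behind unsafePerformIO; all names below are mine, not from the
GlobalMutableState page:)

    import Data.IORef (newIORef, readIORef, modifyIORef)
    import qualified Data.Map as Map
    import System.IO.Unsafe (unsafePerformIO)

    memoOrd :: Ord k => (k -> v) -> (k -> v)
    memoOrd f = unsafePerformIO $ do
      ref <- newIORef Map.empty            -- one table per memoised f
      return $ \k -> unsafePerformIO $ do
        m <- readIORef ref
        case Map.lookup k m of
          Just v  -> return v
          Nothing -> do
            let v = f k                    -- memo f k behaves as f k
            modifyIORef ref (Map.insert k v)
            return v
    {-# NOINLINE memoOrd #-}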

--
Tony Morris
http://tmorris.net/



[Haskell-cafe] Compress and serialise data with lazy bytestrings, zlib and Data.Binary (was: Allocating enormous amounts of memory)

2007-07-08 Thread Donald Bruce Stewart
Jefferson Heard wrote:

 I'm using the Data.AltBinary package to read in a list of 4.8 million
 floats and 1.6 million ints.  Doing so caused the memory footprint to
 blow up to more than 2gb, which on my laptop simply causes the program
 to crash.  I can do it on my workstation, but I'd really rather not,
 because I want my program to be fairly portable.  
 
 The file that I wrote out in packing the data structure was only 28MB,
 so I assume I'm just using the wrong data structure, or I'm using full
 laziness somewhere I shouldn't be.

Here's a quick example of how to efficiently read and write such a structure
to disk, compressing and decompressing on the fly.

$ time ./A
Wrote 4800000 floats, and 1600000 ints
Read  4800000 floats, and 1600000 ints
./A  0.93s user 0.06s system 89% cpu 1.106 total

It uses Data.Binary to provide quick serialisation, and the zlib library to
compress the resulting stream. It builds the tables in memory, writes and
compresses the result to disk, reads them back in, and checks we read the right
amount of CFloats and CInts. You'd then pass the CFloats over to your C library
that needs them.

Compressing with zlib is a flourish, but cheap and simple, so we may as well do
it. With zlib and Data.Binary, the core code just becomes:

encodeFile "/tmp/table.gz" table
table' <- decodeFile "/tmp/table.gz"

Which transparently streams the data through zlib, and onto the disk, and back.

Simple and efficient.

-- Don


Here's the code:

-- some imports

import Text.Printf
import Data.Binary
import Codec.Compression.Zlib
import Control.Monad
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Base as B
import Foreign
import Foreign.C.Types

-- A simple table type for the arrays, that will be easy to manipulate

data Table = Table { floats :: L.ByteString
                   , ints   :: L.ByteString }


-- Serialise this data in gzip-compressed form: a gzipping Binary instance.

instance Binary Table where
    put t = do put (compress (floats t))
               put (compress (ints   t))

    get = liftM2 Table (decompress `liftM` get)
                       (decompress `liftM` get)

-- Default sizes

floatSize = 4800000
intSize   = 1600000

-- Build a new empty table

newTable :: IO Table
newTable = do
    f <- mallocArray floatSize :: IO (Ptr CFloat)
    i <- mallocArray intSize   :: IO (Ptr CInt)

    -- fill them with data here --

    -- convert to bytestrings.
    bf <- B.packCStringFinalizer (castPtr f)
              (floatSize * sizeOf (undefined :: CFloat)) (return ())
    bi <- B.packCStringFinalizer (castPtr i)
              (intSize   * sizeOf (undefined :: CInt))   (return ())
    return $ Table (L.fromChunks [bf]) (L.fromChunks [bi])

-- Now just build the table, serialise it, read it back in, and print the 
result

main = do
    table <- newTable

    -- write the data to disk, compressed with gzip as we go.
    encodeFile "/tmp/table.gz" table
    draw "Wrote" table

    -- load it back in, decompressing on the fly
    table' <- decodeFile "/tmp/table.gz"
    draw "Read " table'

    -- now use 'floats' as a Ptr to pass back to C.

  where
    draw s v = printf "%s %d floats, and %d ints\n" s
        (fromIntegral (lengthFloats v) `div` sizeOf (undefined :: CFloat))
        (fromIntegral (lengthInts v  ) `div` sizeOf (undefined :: CInt))

    lengthFloats = L.length . floats
    lengthInts   = L.length . ints



Re: [Haskell-cafe] Compress and serialise data with lazy bytestrings, zlib and Data.Binary (was: Allocating enormous amounts of memory)

2007-07-08 Thread Donald Bruce Stewart
dons:
 Jefferson Heard wrote:
 
  I'm using the Data.AltBinary package to read in a list of 4.8 million
  floats and 1.6 million ints.  Doing so caused the memory footprint to
  blow up to more than 2gb, which on my laptop simply causes the program
  to crash.  I can do it on my workstation, but I'd really rather not,
  because I want my program to be fairly portable.  
  
  The file that I wrote out in packing the data structure was only 28MB,
  so I assume I'm just using the wrong data structure, or I'm using full
  laziness somewhere I shouldn't be.
 
 Here's a quick example of how to efficiently read and write such a structure
 to disk, compressing and decompressing on the fly.
 
 $ time ./A
 Wrote 4800000 floats, and 1600000 ints
 Read  4800000 floats, and 1600000 ints
 ./A  0.93s user 0.06s system 89% cpu 1.106 total
 
 It uses Data.Binary to provide quick serialisation, and the zlib library to
 compress the resulting stream. It builds the tables in memory, writes and
 compresses the result to disk, reads them back in, and checks we read the 
 right
 amount of CFloats and CInts. You'd then pass the CFloats over to your C 
 library
 that needs them.
 
 Compressing with zlib is a flourish, but cheap and simple, so we may as well 
 do
 it. With zlib and Data.Binary, the core code just becomes:
 
 encodeFile "/tmp/table.gz" table
 table' <- decodeFile "/tmp/table.gz"
 
 Which transparently streams the data through zlib, and onto the disk, and 
 back.
 
 Simple and efficient.

Oh, and profiling this code:

$ ghc -prof -auto-all -O2 --make A.hs

$ ./A +RTS -p
Wrote 4800000 floats, and 1600000 ints
Read 4800000 floats, and 1600000 ints

$ cat A.prof 
Mon Jul  9 12:44 2007 Time and Allocation Profiling Report  (Final)

total time  =0.90 secs   (18 ticks @ 50 ms)
total alloc =  26,087,140 bytes  (excludes profiling overheads)

COST CENTREMODULE   %time %alloc
main   Main 100.0  100.0

Looks fine. We'd expect at least 25,600,000 bytes, and a little overhead for 
the 
runtime system. I note that the compressed file on disk is 26k too (yay for
gzip on zeros ;)

-- Don



Re: [Haskell-cafe] Memoisation + unsafePerformIO

2007-07-08 Thread Tim Chevalier

You might look at this paper:
http://research.microsoft.com/Users/simonpj/Papers/weak.htm
Stretching the storage manager: weak pointers and stable names in Haskell -
Simon Peyton Jones, Simon Marlow, and Conal Elliott, IFL'99.

It describes a solution to exactly the problem you're trying to solve,
that's implemented in GHC. It may not be the right thing for you, but
you may be interested to see previous approaches to the problem in any
case.

Cheers,
Tim

--
Tim Chevalier * catamorphism.org * Often in error, never in doubt
Poor man wanna be rich, rich man wanna be king, the king ain't
satisfied until he rules everything -- Bruce Springsteen


[Haskell-cafe] More binary IO, compression, bytestrings and FFI fun

2007-07-08 Thread Donald Bruce Stewart
Processing larger amounts of data, compression, serialisation and calling C.

An elaboration of the previous example:

* Build a largish structure in Haskell
* Compress it in memory
* Serialise it to disk
* Deserialise it
* Decompress
* Pass it to C
* Display the result

Pretty common pattern for low level stuff. We use zlib + lazy
bytestrings for streaming decompression, and Data.Binary for the
serialisation.

We will use

* Foreign.* to generate the data
* Wrap it as a lazy bytestring
* Data.Binary to serialise it
* Code.Compression.Gzip to compress/uncompress
* Pass it to C and make a simple FFI call on the result
* Display the result

Running:

$ ghc -O2 A.hs --make

$ time ./A
Built table
Compressed   25600000 bytes
Compressed size  2231545 bytes (91.28%)
Decompressed 25600000 bytes
Calling into C ...
-8.742278e-8
-0.6865875
-0.7207948
-0.1401903
0.63918984
0.7437966
0.27236375
-0.5763547
-0.75708854
-0.39026973
./A  2.98s user 0.11s system 94% cpu 3.275 total

The code:

{-# OPTIONS -fglasgow-exts #-}

-- 
-- Some imports
--
import Foreign
import Foreign.C.Types
import Data.Int

import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Base as S
import qualified Data.ByteString  as S

import Data.Binary
import Codec.Compression.GZip

import System.IO
import Text.Printf
import Control.Monad


-- Foreign Ptrs
--
-- A simple wrapper type
--
data Table = Table { floats :: ForeignPtr CFloat
                   , ints   :: ForeignPtr Int }

-- Statically fixed sizes
floatSize = 4800000
intSize   = 1600000

totalBytes = sizeOf (undefined :: CFloat) * floatSize
           + sizeOf (undefined :: Int)    * intSize

--
-- Build a table populated with some defaults
-- Float table filled with 'pi' , ints numbered consecutively
--
newTable :: IO Table
newTable = do
    fp <- S.mallocByteString (floatSize * sizeOf (undefined :: CFloat))
    ip <- S.mallocByteString (intSize   * sizeOf (undefined :: Int   ))
    withForeignPtr fp $ \p ->
        forM_ [0..floatSize-1] $ \n ->
            pokeElemOff p n pi
    withForeignPtr ip $ \p ->
        forM_ [0..intSize-1]   $ \n ->
            pokeElemOff p n n
    return (Table fp ip)


-- Lazy ByteStrings
--
-- Convert ForeignPtr a to and from a lazy ByteString
--
toByteString :: Storable a => ForeignPtr a -> Int -> L.ByteString
toByteString (fp :: ForeignPtr a) n =
    L.fromChunks . (:[]) $ S.fromForeignPtr (castForeignPtr fp)
                                            (n * sizeOf (undefined :: a))

--
-- Flatten a lazy bytestring back to a ForeignPtr.
--
fromByteString :: Storable a => L.ByteString -> ForeignPtr a
fromByteString lbs = castForeignPtr fp
    -- S.concat copies the chunks back into one contiguous block
    where (fp,_,_) = S.toForeignPtr . S.concat $ L.toChunks lbs


-- GZip and Data.Binary
--
-- Serialise a Table, compressing it with gzip as we go:
--
instance Binary Table where
    put (Table f i) = do
        put . compress . toByteString f $ floatSize
        put . compress . toByteString i $ intSize

    get = do
        fs <- liftM decompress get
        is <- liftM decompress get

        -- check we read the correct amount:
        if L.length fs + L.length is == fromIntegral totalBytes
            then return $ Table (fromByteString fs) (fromByteString is)
            else error "Partial read"


-- FFI
--
-- Example call to process the data using C functions.
--
rounded :: Int -> ForeignPtr CFloat -> IO [CFloat]
rounded l fp = withForeignPtr fp $ \p -> go p
    where
        go p = forM [0..l-1] $ \n -> do
            v <- peekElemOff p n
            return $ c_tanhf (c_sinf (v + fromIntegral n))

-- Random C functions to use:
foreign import ccall unsafe "math.h sinf"  c_sinf  :: CFloat -> CFloat
foreign import ccall unsafe "math.h tanhf" c_tanhf :: CFloat -> CFloat



--
-- Now glue it all together
-- 
main = do
    table <- newTable
    putStrLn "Built table"

    -- write the data to disk, compressed with gzip as we go.
    encodeFile "/tmp/table.gz" table
    printf "Compressed  %d bytes\n" totalBytes

    -- how good was the compression?
    h <- openFile "/tmp/table.gz" ReadMode
    n <- hFileSize h
    hClose h

    -- (the archive truncates the post here; the rest of main is
    -- reconstructed to match the sample output above)
    printf "Compressed size  %d bytes (%.2f%%)\n" n
        (100 * (1 - fromIntegral n / fromIntegral totalBytes) :: Double)

    -- read it back in, decompressing on the fly
    table' <- decodeFile "/tmp/table.gz"
    printf "Decompressed %d bytes\n" totalBytes

    -- and hand the floats over to C
    putStrLn "Calling into C ..."
    mapM_ print =<< rounded 10 (floats table')

Re: [Haskell-cafe] Clearly, Haskell is ill-founded

2007-07-08 Thread Donald Bruce Stewart
drtomc:
 I don't know if you saw the following linked off /.
 
 http://www.itwire.com.au/content/view/13339/53/
 
 An amazon link for the book is here:
 
 http://www.amazon.com/Computer-Science-Reconsidered-Invocation-Expression/dp/0471798142
 
 The basic claim appears to be that discrete mathematics is a bad
 foundation for computer science. I suspect the subscribers to this
 list would beg to disagree.
 
 Enjoy,

:-)

And he's patented it...

http://www.patentstorm.us/patents/5355496-description.html

SUMMARY OF THE INVENTION

A method and system for process expression and resolution is described. A
first language structure comprising a possibility expression having at
least one definition which is inherently and generally concurrent is
provided. Further, a second language structure comprising an actuality
expression including a fully formed input data name to be resolved is
provided. Furthermore, a third language structure comprising an active
expression initially having at least one invocation, the invocation
comprising an association with a particular definition and the fully formed
input data name of the actuality expression, is provided. Subsequently, the
process of resolving invocations begins in the active expression with fully
formed input data names in relation to their associated definition to
produce at least one or both of the following: (1) an invocation with a
fully formed input data name and (2) a result data name.

Interesting...

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe