On 15/09/2014, at 2:35 PM, srean wrote:

> It is creating a full list whereas the Python code doesn't, so using a 
> generator style might recover even more ground. But I don't know the cost 
> model of Felix GC at all, so you certainly would know better.

I think you mean iterator, which is a generator closure. readln itself is a
generator, which is just a function with side-effects.

readln would be faster because it just reads lines using fgets.
So there's no need to do a split.

Er ....

for file in split(file_file,"\n") do
  println$ file;
  var f = fopen_input file;
  reading: while true do
    var line = readln f;
    if line == "" do break reading; done
    index.add line (index.get_dflt (line,0)+1);
    ++counter;
  done
  f.fclose;
done

Time elapsed: 1.077088s, 56397 trx,  or 52361 trx/sec

which is SLOWER.

(the outer split doesn't count).

Why is it slower? I have no idea. Probably because it's faster to read
a whole file at once than with multiple fgets() calls?

[I still get an unknown exception for the larger file .. very annoying,
however at least I now get an error code :]

Seriously, with immutable garbage collected strings, stuff like split
is trivial: if a string consists of 
        
        (count, start)

then there's no copying involved at all. And it's perfectly safe,
since the strings are immutable, and the underlying array is garbage
collected.
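
To make the (count, start) idea concrete, here is a rough sketch in Python
rather than Felix. StrSlice and split_slices are hypothetical names invented
for this illustration, not anything in the Felix library, and the sketch
assumes the slice also carries a reference to the shared buffer. The split
only records offsets; no characters are copied until a flat string is
actually requested.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StrSlice:
    """Hypothetical immutable string view: shared buffer plus (start, count)."""
    buf: str    # shared backing storage, never mutated
    start: int  # offset of the first character of this view
    count: int  # number of characters in this view

    def __str__(self) -> str:
        # Characters are copied only here, when a flat string is demanded.
        return self.buf[self.start:self.start + self.count]


def split_slices(s: StrSlice, sep: str) -> List[StrSlice]:
    """Split s on sep, returning views that all share s.buf."""
    if not sep:
        raise ValueError("empty separator")
    out = []
    pos = s.start
    end = s.start + s.count
    while True:
        hit = s.buf.find(sep, pos, end)
        if hit == -1:
            # Last piece: everything from pos to the end of the view.
            out.append(StrSlice(s.buf, pos, end - pos))
            return out
        out.append(StrSlice(s.buf, pos, hit - pos))
        pos = hit + len(sep)
```

Each piece is just three machine words pointing into the original buffer, and
the buffer stays alive for as long as any piece refers to it, which is the
job the garbage collector does in the scheme described above.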

Concatenation remains slow, unless we use a rope.
However ropes are slower to scan. And a rope of individual
characters would be a disaster (as it is in Haskell where the standard
string is a list of chars .. :)

So the problem with ropes is quite nasty. One needs to know when
to compact the rope into a single fragment. Of course you can only
do this when creating a rope, otherwise the functional behaviour is lost.
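
A minimal sketch of compaction at creation time, again in Python. Rope,
concat and the COMPACT_BELOW threshold are all hypothetical, chosen only for
illustration; picking a good threshold is exactly the nasty part. The point
is that concat decides once, when the new node is built, whether to flatten,
and existing ropes are never modified afterwards.

```python
COMPACT_BELOW = 64  # hypothetical threshold, in characters


class Rope:
    """Immutable rope: a leaf holding flat text, or a concat of two ropes."""
    __slots__ = ("text", "left", "right", "length")

    def __init__(self, text=None, left=None, right=None):
        self.text = text    # flat string for a leaf, None for a concat node
        self.left = left
        self.right = right
        self.length = len(text) if text is not None else left.length + right.length

    def flatten(self) -> str:
        # Iterative left-to-right walk, so deep ropes do not hit the
        # recursion limit.
        parts = []
        stack = [self]
        while stack:
            node = stack.pop()
            if node.text is not None:
                parts.append(node.text)
            else:
                stack.append(node.right)
                stack.append(node.left)
        return "".join(parts)


def concat(a: Rope, b: Rope) -> Rope:
    """Build a new rope; compact into one fragment only at creation."""
    if a.length + b.length < COMPACT_BELOW:
        # Small result: copy once now, so later scans see a single fragment.
        return Rope(text=a.flatten() + b.flatten())
    # Large result: share both operands without copying any characters.
    return Rope(left=a, right=b)
```

With this rule, concatenating individual characters yields short flat leaves
instead of a tree with one node per character, while large operands are still
joined without copying.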

--
john skaller
skal...@users.sourceforge.net
http://felix-lang.org




_______________________________________________
Felix-language mailing list
Felix-language@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/felix-language
