Re: slow raw io

2010-08-09 Thread David Powell
On Sat 07/08/10 14:02 , Stuart Halloway stuart.hallo...@gmail.com sent: No. We want to collect more information and do more comparisons before moving away from the recommended Java buffering. Stu This isn't an issue with the buffering, it is an issue with the massive overhead of doing

Re: slow raw io

2010-08-09 Thread David Powell
This isn't an issue with the buffering, it is an issue with the massive overhead of doing character at a time i/o - it is something that you really should never ever do. I'd say something somewhere doing character at a time i/o is probably the number one cause of crippling performance
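The contrast drawn above, one `.read` call per character versus reading into a reusable buffer, can be sketched as follows. This is a minimal illustration, not code from the thread: the function names `read-by-char` and `read-by-chunk` and the 4096-character buffer size are assumptions chosen for the example, and a `StringReader` stands in for a real file so the sketch is self-contained.

```clojure
(import '[java.io Reader StringReader])

(defn read-by-char
  "Drains rdr with one .read call per character (hypothetical helper)."
  [^Reader rdr]
  (let [sb (StringBuilder.)]
    (loop [c (.read rdr)]
      (if (neg? c)                       ; -1 signals end of stream
        (.toString sb)
        (do (.append sb (char c))
            (recur (.read rdr)))))))

(defn read-by-chunk
  "Drains rdr through a reusable char buffer (hypothetical helper)."
  [^Reader rdr]
  (let [sb  (StringBuilder.)
        buf (char-array 4096)]           ; buffer size is an assumption
    (loop [n (.read rdr ^chars buf)]
      (if (neg? n)                       ; -1 signals end of stream
        (.toString sb)
        (do (.append sb ^chars buf 0 n)  ; append only the n chars just read
            (recur (.read rdr ^chars buf)))))))
```

Both functions return the same string; wrapping each call in `time` on a large input is one way to compare them, though an in-memory `StringReader` will not reproduce the file and JVM effects reported in the thread.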

Re: slow raw io

2010-08-09 Thread cageface
On Aug 7, 5:43 am, Peter Schuller peter.schul...@infidyne.com wrote: Interesting. Why do you consider it recommended to read one character at a time in a case like this? Maybe there is such a recommendation that I don't know about, but in general I would consider it contrary to expected

Re: slow raw io

2010-08-09 Thread David Powell
Maybe this seems like a low-priority issue but I think slurp is likely to be very commonly used. For instance, the Riak tutorial just posted to Hacker News uses it: http://mmcgrana.github.com/2010/08/riak-clojure.html In the past I've steered clear of using slurp because it didn't hand

Re: slow raw io

2010-08-07 Thread cageface
Any chance of getting this in before 1.2? On Jun 25, 7:43 am, cageface milese...@gmail.com wrote: Thanks Stuart & Peter for following up on this. Now I can get back to plowing through this mountain of ldiff data with Clojure! -- You received this message because you are subscribed to the

Re: slow raw io

2010-08-07 Thread Stuart Halloway
No. We want to collect more information and do more comparisons before moving away from the recommended Java buffering. Stu Any chance of getting this in before 1.2? On Jun 25, 7:43 am, cageface milese...@gmail.com wrote: Thanks Stuart & Peter for following up on this. Now I can get back

Re: slow raw io

2010-08-07 Thread Peter Schuller
No. We want to collect more information and do more comparisons before moving away from the recommended Java buffering. Interesting. Why do you consider it recommended to read one character at a time in a case like this? Maybe there is such a recommendation that I don't know about, but in

Re: slow raw io

2010-06-25 Thread Peter Schuller
I put a self-contained test up here: http://gist.github.com/452095 To run it copy this to slurptest.clj and run these commands java clojure.main slurptest.clj makewords 100 (100 seems good for macs, 300 for linux) java -Xmx3G -Xms3G clojure.main slurptest.clj slurp| slurp2 Trying either

Re: slow raw io

2010-06-25 Thread Peter Schuller
And reading the thread history I realize the problem was already identified (sorry), however: Has anyone else had a chance to try this? I'm surprised to see manual buffering behaving so much better than the BufferedReader implementation but it does seem to make quite a difference. Not really

Re: slow raw io

2010-06-25 Thread Stuart Halloway
Hi Peter, You are on the contributors list, so I just need to know your account name on Assembla to activate your ability to add tickets, patches, etc. Let me know your account name (which needs to be some permutation of your real name, not a nick). Thanks, Stu I put a self-contained test

Re: slow raw io

2010-06-25 Thread Peter Schuller
You are on the contributors list, so I just need to know your account name on Assembla to activate your ability to add tickets, patches, etc. Let me know your account name (which needs to be some permutation of your real name, not a nick). When I read up before submitting the contributor

Re: slow raw io

2010-06-25 Thread Peter Schuller
I can register another account (no problem), but what implications are there on the fact that I wrote 'scode' on the contributor agreement I mailed Rich? I just registered peterschuller. -- / Peter Schuller

Re: slow raw io

2010-06-25 Thread cageface
Thanks Stuart & Peter for following up on this. Now I can get back to plowing through this mountain of ldiff data with Clojure!

Re: slow raw io

2010-06-24 Thread cageface
Has anyone else had a chance to try this? I'm surprised to see manual buffering behaving so much better than the BufferedReader implementation but it does seem to make quite a difference.

Re: slow raw io

2010-06-24 Thread cageface
I put a self-contained test up here: http://gist.github.com/452095 To run it copy this to slurptest.clj and run these commands java clojure.main slurptest.clj makewords 100 (100 seems good for macs, 300 for linux) java -Xmx3G -Xms3G clojure.main slurptest.clj slurp| slurp2 Trying either slurp

slow raw io

2010-06-23 Thread cageface
Not sure if this is a Clojure issue or something else, but I'm seeing surprisingly slow I/O on large text files. For example, on a unix machine try this: 1. create a large file: rm -f words; for x in $(seq 300); do cat /usr/share/dict/words >> words; done 2. create a clj file that just slurps it

Re: slow raw io

2010-06-23 Thread cageface
For the record, this program runs in 3.3 seconds so I guess that points to the implementation of slurp: (import '[java.io BufferedReader InputStreamReader]) (let [reader (BufferedReader. (InputStreamReader. System/in)) file-data (StringBuffer.) buffer (char-array 4096)] (loop

Re: slow raw io

2010-06-23 Thread Stuart Halloway
I am seeing more like 1.8 seconds for the raw version, vs. 2.8 seconds for slurp (master branch). Can you post a complete example (including the clj script you use, and what version of Clojure), so we can be apples-to-apples? Stu For the record, this program runs in 3.3 seconds so I guess

Re: slow raw io

2010-06-23 Thread cageface
Sure. Here's my clj script:
    #!/bin/sh
    if [ -z $1 ]; then
        exec java -server jline.ConsoleRunner clojure.main
    else
        SCRIPT=$(dirname $1)
        export CLASSPATH=$SCRIPT/*:$SCRIPT:$CLASSPATH
        exec java -Xmx3G -server clojure.main $1 $@
    fi
(Usually I don't have the -Xmx flag there. I added it

Re: slow raw io

2010-06-23 Thread cageface
Another example. I'm running this on a Ubuntu 10.04 laptop with this java: java version 1.6.0_18 OpenJDK Runtime Environment (IcedTea6 1.8) (6b18-1.8-0ubuntu1) OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode) and this command line: java -Xmx3G -server clojure.main cat2.clj (require

Re: slow raw io

2010-06-23 Thread Stuart Halloway
On my laptop (Mac) the biggest difference here has nothing to do with buffering in slurp. It is whether you use System/in (fast) or *in* (slow). The latter is a LineNumberingPushbackReader. Can you check and confirm? When I slurp System/in it is more than twice as fast as slurping *in*. I
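One way to look at the wrapper cost described above in isolation is to drain the same in-memory text through a plain reader and through a `LineNumberingPushbackReader`, the class named in the message. This is only a sketch under stated assumptions: the helper `drain`, the `sample` text, and its size are made up for the example, and a `StringReader` replaces `System/in`, so it measures the per-character overhead of the wrapper rather than real stdin throughput.

```clojure
(import '[java.io Reader StringReader]
        '[clojure.lang LineNumberingPushbackReader])

(defn drain
  "Reads rdr to the end one character at a time and returns the
  number of characters seen (hypothetical helper)."
  [^Reader rdr]
  (loop [n 0]
    (if (neg? (.read rdr))               ; -1 signals end of stream
      n
      (recur (inc n)))))

;; Sample input; content and size are assumptions for the example.
(def sample (apply str (repeat 200000 "word\n")))

;; Plain reader.
(time (drain (StringReader. sample)))

;; Same text behind the line-numbering wrapper.
(time (drain (LineNumberingPushbackReader. (StringReader. sample))))
```

Timings from a single `time` call on the JVM are noisy, so repeating each measurement a few times before comparing is advisable.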

Re: slow raw io

2010-06-23 Thread cageface
Interesting. Here are the times I get:
LINUX:
  slurp, *in*: 18.8 seconds
  slurp, System/in: 18.2 seconds
  slurp2, *in*: 6.7 seconds
  slurp2, System/in: 5.7 seconds
I have an intel iMac here too, running 10.6.4:
  slurp, *in*: 20.4 seconds
  slurp, System.in: 19.0 seconds
  slurp2, *in*: 7.2 seconds
  slurp2,