OK, then the good news is that regex matching is already a lot faster than in 0.2.1. The bad news is that you'll have to wait until 0.3 is released or use a prerelease build to get it.
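If it helps to re-run the comparison after upgrading, here is a minimal sketch of the Python side as a standalone script rather than an IPython `%timeit` line. The character class is the one from the original post; the helper names (`mark_punctuation`, `make_corpus`) and the synthetic corpus are made up for illustration, so the timings will not match the ones quoted below.

```python
import re
import timeit

# Same character class as in the original post, compiled once up front.
PUNCT = re.compile(r"[,\.;:'\"!¡?¿_\n/\t\(\)\{\}\[\]\- ]")


def mark_punctuation(text):
    """Replace every punctuation/whitespace character with the marker "<!%"."""
    return PUNCT.sub("<!%", text)


def make_corpus(n_words):
    """Build a synthetic corpus (an assumption; the real corpus was not posted)."""
    return " ".join(["hola,", "mundo!"] * (n_words // 2))


if __name__ == "__main__":
    text = make_corpus(1_000_000)
    # Best of three single runs, roughly what %timeit reports.
    seconds = min(timeit.repeat(lambda: mark_punctuation(text), number=1, repeat=3))
    print(f"best of 3: {seconds * 1000:.1f} ms")
```

Running the equivalent `@time replace(...)` call twice on the Julia side and reading the second result would keep compilation time out of the number, which makes the two measurements easier to compare.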
On Saturday, June 21, 2014 4:31:03 PM UTC-7, Matías Guzmán Naranjo wrote:
>
> The one in the Ubuntu repositories: "Version 0.2.1 (2014-02-11 06:30 UTC)"
>
> 2014-06-22 1:24 GMT+02:00 Daniel Jones <[email protected]>:
>
>> What version of Julia are you using here? I made some improvements to
>> regex speed a few weeks ago. I think it's still slightly slower than
>> Python, but it shouldn't be this slow.
>>
>> On Saturday, June 21, 2014 4:12:55 PM UTC-7, Matías Guzmán Naranjo wrote:
>>>
>>> Is it just my impression, or are Python's regular expressions much faster
>>> than Julia's?
>>>
>>> On my machine:
>>>
>>> Python 2:
>>>
>>> %timeit a = re.sub(r"[,\.;:'\"!¡?¿_\n/\t\(\)\{\}\[\]\- ]", "<!%", text)
>>> 1 loops, best of 3: 302 ms per loop
>>>
>>> Julia:
>>>
>>> @time a = replace(f, r"[,\.;:'\"!¡?¿_\n/\t\(\)\{\}\[\]\- ]", "<!%");
>>> elapsed time: 5.151800487 seconds (1169203300 bytes allocated)
>>>
>>> With a corpus of about one million words.
>>>
>>> This is pretty bad. Is there any way to improve it?
