Is it just my impression, or are Python's regular expressions much faster
than Julia's?

On my machine:

Python 2:

%timeit a = re.sub(r"[,\.;:'\"!¡?¿_\n/\t\(\)\{\}\[\]\- ]", "<!%", text)
1 loops, best of 3: 302 ms per loop

Julia:

@time a = replace(f, r"[,\.;:'\"!¡?¿_\n/\t\(\)\{\}\[\]\- ]", "<!%");
elapsed time: 5.151800487 seconds (1169203300 bytes allocated)

Both timings are for a corpus of about one million words.
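
In case anyone wants to reproduce the Python side, here is a minimal sketch of the measurement. It is written for Python 3 rather than the Python 2 session above, and it uses a synthetic corpus instead of my actual text, so the absolute numbers will differ. The helper names `make_corpus` and `substitute` are made up for illustration.

```python
import re
import timeit

# Same character class as in the timings above, compiled once up front.
PATTERN = re.compile(r"[,\.;:'\"!¡?¿_\n/\t\(\)\{\}\[\]\- ]")


def make_corpus(n_words=1000000):
    # Hypothetical stand-in for the real text: one short sentence
    # repeated until the corpus holds roughly n_words words.
    sentence = "Hello, world! This is a test: (one) [two] {three}\n"
    words_per_sentence = len(sentence.split())
    return sentence * (n_words // words_per_sentence)


def substitute(text):
    # Replace every matched character with the literal marker "<!%".
    return PATTERN.sub("<!%", text)


if __name__ == "__main__":
    text = make_corpus()
    # Best of three single runs, similar in spirit to %timeit.
    best = min(timeit.repeat(lambda: substitute(text), number=1, repeat=3))
    print("best of 3: %.0f ms" % (best * 1000))
```

Because the synthetic text is far more repetitive than a real corpus, this only gives a rough baseline for comparing against the Julia call.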

This is pretty bad. Is there any way to improve it?
