Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread n00m
My Py solution: = import psyco psyco.full() def foo(s): n = len(s) s = s + ' ' a = [[] for i in xrange(128)] ans = 0 for i in xrange(n - 1, -1, -1): lev = 0 for st in xrange(len(a[ord(s[i])]) - 1, -1, -1):

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread n00m
This worked out in 5.28s Imo it's not that *much* slower (of course, Psyco can't help here) === import itertools def subs(s): return len(set(itertools.chain( s[i:j] for i in xrange(len(s)) for j in xrange(i, len(s)+1 - 1 from time

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread Bearophile
n00m: This worked out in 5.28s Imo it's not that *much* slower (of course, Psyco can't help here) There's no need of a chain here, so you can rewrite this: import itertools def subs(s): return len(set(itertools.chain( s[i:j] for i in xrange(len(s)) for j in

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread Bearophile
n00m: My Py solution: ... Cute. You can replace: a[ord(s[i])] += [i] With: a[ord(s[i])].append(i) If you want to micro-optimize this with Psyco may be a little faster: (lev 1) Than: lev // 2 But to increase speed it's usually better to reduce memory allocations. So you can try to find a way

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread Bearophile
n00m: my home tests proved Python is a fellow fast beast of C++. Quite unexpected (I expected Py would be by ~10 times slower). PS Both my codes are identical in their algorithms. = 0.016         0.0150001049042   --- exec times Maybe in your C++ code there's

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread n00m
On Nov 29, 3:15 pm, Bearophile bearophileh...@lycos.com wrote: Maybe in your C++ code there's something that can be improved, this is a 1:1 translation to D (V.1) language (using dlibs) and it's about 2.2 times faster than the Psyco version:http://codepad.org/MQLj0ydB Using a smarter usage of

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread n00m
http://en.wikipedia.org/wiki/Suffix_tree Looks not very friendly appealing :-) -- http://mail.python.org/mailman/listinfo/python-list

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread inhahe
On Sun, Nov 29, 2009 at 9:10 AM, n00m n...@narod.ru wrote: Even if Py by 4x *slower* -- it's still a perfect Ok for it (C# will be much (much) slower than Python). How do you figure? As far as I know C# is many, many times faster than Python. (i was disappointed to find out that even

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread n00m
Tested both my codes against a random string of length = 1. === from random import choice s = '' for i in xrange(1): s += choice(('a','b','c','d','e','f')) === C++: ~0.28s Python: ~0.48s PS I suspect

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread Bearophile
n00m: I suspect that building of Suffix Tree would be a big exec.time-consuming overhead In C/D/C++ there are ways to allocate memory in smarter ways, using pools, arenas, stacks, freelists, etc. With that, using a compact encoding of tree nodes (there are many different tricks to reduce space

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread n00m
On Nov 29, 11:43 pm, Bearophile bearophileh...@lycos.com wrote: Anyway, you may try a pure-Python2.x implementation:http://suffixtree.googlecode.com/files/suffixtree-0.1.py Ouch, Bearie... 214 lines of code are there. Ok.. I tested it. Firstly I replaced all print with pass##print (aiming to