On Sep 15, 2:07 pm, John Machin <[EMAIL PROTECTED]> wrote:
> On Sep 15, 10:56 pm, Paddy <[EMAIL PROTECTED]> wrote:
> > On Sep 14, 9:49 pm, Paddy <[EMAIL PROTECTED]> wrote:
> > > Lets say i have a generator running that generates successive
> > > characters of a 'string'
> > > From what I know, if I want to do a regexp search for a pattern of
> > > characters then I would have to 'freeze' the generator and pass the
> > > characters so far to re.search.
> > > It is expensive to create successive characters, but caching could be
> > > used for past characters. is it possible to wrap the generator in a
> > > class, possibly inheriting from string, that would allow the regexp
> > > searching of the string but without terminating the generator? In
> > > other words duck typing for the usual string object needed by
> > > re.search?
> > >
> > > - Paddy.
> >
> > There seems to be no way of breaking into the re library accessing
> > characters from the string:
> >
> > >>> class S(str):
> > ...     def __getitem__(self, *a):
> > ...         print "getitem:",a
> > ...         return str.__getitem__(self, *a)
> > ...     def __get__(self, *a):
> > ...         print "get:",a
> > ...         return str.__get__(self, *a)
> > ...
> > >>> s = S('sdasd')
> > >>> m = re.search('as', s); m.span()
> > (2, 4)
> > >>> m = sre.search('as', s); m.span()
> > (2, 4)
> > >>> class A(array.array):
> > ...     def __getitem__(self, *a):
> > ...         print "getitem:",a
> > ...         return str.__getitem__(self, *a)
> > ...     def __get__(self, *a):
> > ...         print "get:",a
> > ...         return str.__get__(self, *a)
> > ...
> > >>> s = A('c','sdasd')
> > >>> m = re.search('as', s); m.span()
> > (2, 4)
> > >>> m = sre.search('as', s); m.span()
> > (2, 4)
> >
> > - Paddy.
>
> That would no doubt be because it either copies the input [we hope
> not] or more likely because it hands off the grunt work to a C module
> (_sre).

Yes, it seems to need a buffer/string, so it probably accesses a
contiguous area of memory from C.

> Why do you want to "break into" it, anyway?

A simulation generates a stream of data that could be gigabytes long,
from which I'd like to find interesting bits by doing a regexp search.
I could use megabyte-length sliding buffers, and probably will have to.

- Paddy.

--
http://mail.python.org/mailman/listinfo/python-list
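
P.S. For what it's worth, here is a minimal sketch of the sliding-buffer
idea mentioned above. Nothing in it comes from the re library itself: the
names search_stream, chunk_size and overlap are made up for illustration.
It assumes that no single match is longer than overlap characters, and
that the pattern does not depend on context to its left (^, \b or
lookbehind can misfire where the window was cut).

```python
import itertools
import re


def search_stream(chars, pattern, chunk_size=1 << 20, overlap=1 << 10):
    """Yield (start, end, text) for each match of pattern in a character stream.

    Hypothetical helper: assumes no match is longer than `overlap` characters.
    Positions are absolute offsets into the stream, not into the buffer.
    """
    regex = re.compile(pattern)
    source = iter(chars)
    buf = ""           # current window onto the stream
    base = 0           # absolute stream position of buf[0]
    exhausted = False
    while not exhausted:
        # Pull at most chunk_size new characters from the generator.
        chunk = "".join(itertools.islice(source, chunk_size))
        exhausted = len(chunk) < chunk_size
        buf += chunk
        # A match starting at or after `limit` might be cut off by the end
        # of the window, so it is deferred until more data has arrived.
        limit = len(buf) if exhausted else max(len(buf) - overlap, 0)
        keep_from = limit
        for m in regex.finditer(buf):
            if m.start() >= limit:
                break
            yield base + m.start(), base + m.end(), m.group()
            keep_from = max(limit, m.end())
        # Drop everything already searched; keep only the unsearched tail.
        buf = buf[keep_from:]
        base += keep_from
```

Something like "for start, end, text in search_stream(gen, 'as'): ..."
would then walk the matches while holding only about chunk_size + overlap
characters in memory at a time, and the generator is only advanced as far
as the search needs.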