Emanuele D'Arrigo wrote:
Hi everybody,

I just had a bit of a shiver for something I'm doing often in my code
but that might be based on a wrong assumption on my part. Take the
following code:

pattern = "aPattern"

compiledPatterns = [ ]
compiledPatterns.append(re.compile(pattern))

if(re.compile(pattern) in compiledPatterns):

Note that 'in' on a list generally takes time proportional to the length of the list. And as MRAB said, drop the parens.

    print("The compiled pattern is stored.")

As you can see I'm effectively assuming that every time re.compile()
is called with the same input pattern it will return the exact same
object rather than a second, identical, object. In interactive tests
via python shell this seems to be the case but... can I rely on it -
always- being the case? Or is it one of those implementation-specific
issues?

As MRAB indicated, this only works because the CPython re module itself keeps a cache of compiled patterns, so you do not have to make one. That cache is, however, limited to 100 or so entries, since programs that use patterns repeatedly generally use a limited number of them. Caches usually use a dict, so that cache[input] == output and lookup is O(1).

And what about any other function or class/method? Is there a way to
discriminate between methods and functions that when invoked twice
with the same arguments will return the same object and those that in
the same circumstances will return two identical objects?

In general, a function that calculates and returns an object will return a new object. The exceptions are exceptions.
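
A small sketch of the general case, using a made-up function make_pair that builds a list: two calls with the same arguments give equal but distinct objects.

```python
def make_pair(a, b):
    # Hypothetical helper: builds and returns a fresh list on every call.
    return [a, b]

first = make_pair(1, 2)
second = make_pair(1, 2)

print(first == second)  # True: the two lists have the same contents
print(first is second)  # False: they are two separate list objects
```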


If the answer is no, am I right to state that in the case portrayed
above the only way to be safe is to use the following code instead?

for item in compiledPatterns:
   if(item.pattern == pattern):

Yes. Unless you are comparing against None (or True or False in Py3) or specifically know otherwise, you probably want '==' rather than 'is'.
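
For instance, the loop above could be written as a test on the .pattern attribute, which is compared with '==' as an ordinary string. This is only a sketch; is_stored is a hypothetical helper name, not something from the original code.

```python
import re

pattern = "aPattern"
compiled_patterns = [re.compile(pattern)]

def is_stored(pattern, compiled_patterns):
    # Compare the source strings by value; no assumption is made about
    # whether re.compile() returns the same object twice.
    return any(item.pattern == pattern for item in compiled_patterns)

print(is_stored("aPattern", compiled_patterns))  # True
print(is_stored("other", compiled_patterns))     # False
```

This still scans the whole list, so for many patterns a dict keyed by the pattern string would be the better structure.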

Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list
