05.12.19 23:47, Kyle Stanley wrote:

Serhiy Storchaka wrote:
 > We still do not know a use case for findfirst. If the OP would show his
 > code and several examples in others code this could be an argument for
 > usefulness of this feature.

I'm not sure about the OP's exact use case, but using GitHub's code search for .py files that match with "first re.findall" shows a decent amount of code that uses the format ``re.findall()[0]``. It would be nice if GitHub's search properly supported symbols and regular expressions, but this presents a decent number of examples. See https://github.com/search?l=Python&q=first+re.findall&type=Code.

I also spent some time looking for a few specific examples, since there were a number of false positives in the above results. Note that I didn't look much into the actual purpose of the code or judge it based on quality, I was just looking for anything that seemed remotely practical and contained something along the lines of ``re.findall()[0]``. Several of the links below contain multiple lines where findfirst would likely be a better alternative, but I only included one permalink per code file.

Thank you, Kyle, for your investigation!

https://github.com/MohamedAl-Hussein/my_projects/blob/15feca5254fe1b2936d39369365867496ce5b2aa/fifa_workspace/fifa_market_analysis/fifa_market_analysis/items.py#L325

It is easy to rewrite it using re.search().

-     input_processor=MapCompose(lambda x: re.findall(r'pointDRI = ([0-9]+)', x)[0], eval),
+     input_processor=MapCompose(lambda x: re.search(r'pointDRI = ([0-9]+)', x).group(1), eval),

I also wonder whether it is worth replacing eval with int, which is more efficient and safer.
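
A minimal sketch of this rewrite, assuming a hypothetical input string (the actual scraped text is not shown in this thread). It compares findall()[0] with search().group(1) and converts the result with int:

```python
import re

# Hypothetical sample input; the real page text is not part of this thread.
text = "var pointDRI = 87; var pointSHO = 90;"

# findall() collects every match into a list, only for the first to be used.
old_result = re.findall(r'pointDRI = ([0-9]+)', text)[0]

# search() stops at the first match and returns a Match object.
new_result = re.search(r'pointDRI = ([0-9]+)', text).group(1)

# int() parses the digits without evaluating arbitrary code.
value = int(new_result)
```

Both forms fail when nothing matches: findall()[0] raises IndexError, while search() returns None, so calling group() on it raises AttributeError.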


https://github.com/MohamedAl-Hussein/FIFA/blob/2b1390fe46f94648e5b0bcfd28bc67a3bc43f09d/fifa_data/fifa_data/items.py#L370

It is the same code, formatted differently.

https://github.com/democracyworks/dog-catcher/blob/9f6200084d4505091399d36ab0d5e3379b04588c/new_jersey.py#L82

-       clerk_name = name_re.findall(clerk)[0]
+       clerk_name = name_re.search(clerk).group(1)


https://github.com/democracyworks/dog-catcher/blob/9f6200084d4505091399d36ab0d5e3379b04588c/connecticut.py#L182

-     official_name = name_re.findall(town)[0].title()
+     official_name = name_re.search(town).group().title()
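
The two rewrites above use group(1) in one case and group() in the other. Presumably this reflects whether the pattern contains a capturing group, since findall() returns the group contents when there is exactly one group and the whole match when there are none. A small sketch with hypothetical patterns and input (not taken from the linked code):

```python
import re

# Hypothetical input and patterns, for illustration only.
text = "Clerk: Jane Doe"

with_group = re.compile(r'Clerk: (\w+ \w+)')        # one capturing group
no_group = re.compile(r'[A-Z][a-z]+ [A-Z][a-z]+')   # no capturing groups

# With one group, findall() yields the group, so group(1) is the equivalent.
a_old = with_group.findall(text)[0]
a_new = with_group.search(text).group(1)

# With no groups, findall() yields the whole match, so group() is the equivalent.
b_old = no_group.findall(text)[0]
b_new = no_group.search(text).group()
```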


https://github.com/jessyL6/CQUPTHUB-spiders_task1/blob/db73c47c0703ed01eb2a6034c37edd9e18abb2e0/ZhongBiao2/spiders/zhongbiao2.py#L176

-             first_1_results = re.findall(first_1,all_list9)[0]
+             first_1_results = re.search(first_1,all_list9).group(1)



https://github.com/kerinin/giscrape/blob/d398206ed4a7e48e1ef6afbf37b4f98784cf2442/giscrape/spiders/people_search.py#L26

It is a complex example that performs multiple searches with different regular expressions. All of it can be replaced with a single, more efficient regular expression.

-   if re.search('^(\w+) (\w+)$', parcel.owner):
-     last, first = re.findall( '(\w+) (\w+)',parcel.owner )[0]
-   elif re.search('^(\w+) (\w+) (\w+)$', parcel.owner):
-     last, first, middle = re.findall( '(\w+) (\w+) (\w+)',parcel.owner )[0]
-   elif re.search('^(\w+) (\w+) & (\w+)$', parcel.owner):
-     last, first = re.findall( '(\w+) (\w+)',parcel.owner )[0]
-   elif re.search('^(\w+) (\w+) (\w+) &amp: (\w+)$', parcel.owner):
-     last, first, middle = re.findall( '(\w+) (\w+) (\w+)',parcel.owner )[0]
-   elif re.search('^(\w+) (\w+) & (\w+) (\w+)$', parcel.owner):
-     last, first = re.findall( '(\w+) (\w+)',parcel.owner )[0]
-   elif re.search('^(\w+) (\w+) (\w+) &amp: (\w+) (\w+)$', parcel.owner):
-     last, first, middle = re.findall( '(\w+) (\w+) (\w+)',parcel.owner )[0]
-   elif re.search('^(\w+) (\w+) & (\w+) (\w+) (\w+)$', parcel.owner):
-     last, first = re.findall( '(\w+) (\w+)',parcel.owner )[0]
-   elif re.search('^(\w+) (\w+) (\w+) &amp: (\w+) (\w+) (\w+)$', parcel.owner):
-     last, first, middle = re.findall( '(\w+) (\w+) (\w+)', parcel.owner )[0]

+   m = re.fullmatch('(\w+) (\w+)(?: (\w+))?(?: &(?: \w+){1,3})?', parcel.owner)
+   if m:
+     last, first, middle = m.groups()
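
A rough sketch of how the single pattern behaves, using made-up owner strings. The helper name split_owner and the sample values are hypothetical, and the pattern is written as a raw string, which is equivalent here:

```python
import re

# Same pattern as in the replacement above, compiled once.
OWNER_RE = re.compile(r'(\w+) (\w+)(?: (\w+))?(?: &(?: \w+){1,3})?')

def split_owner(owner):
    """Return (last, first, middle), or None if the string does not match.

    middle is None when the owner string has no middle name.
    """
    m = OWNER_RE.fullmatch(owner)
    return m.groups() if m else None

# Hypothetical owner strings covering several branches of the original chain.
two_names = split_owner("SMITH JOHN")
three_names = split_owner("SMITH JOHN A")
with_partner = split_owner("SMITH JOHN & MARY")
with_partner_full = split_owner("SMITH JOHN A & MARY JANE")
no_match = split_owner("SMITH")
```

One difference from the original chain: middle is always assigned, and is None when there is no middle name.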


https://github.com/songweifun/parsebook/blob/529a86739208e9dc07abbb31363462e2921f00a0/dao/parseMarc.py#L211

This is the only example that checks whether findall() returns an empty list. It calls findall() twice! Fortunately, it can be easily optimized using the fact that Match objects support subscripting (since Python 3.6). I used group() above because it is more explicit and works in older Python versions.

-     self.item.first_tutor_name = REGPX_A.findall(value)[0] if REGPX_A.findall(value) else ''
+     self.item.first_tutor_name = (REGPX_A.search(value) or [''])[0]
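
A small sketch of the "search() or ['']" idiom. The pattern below is a hypothetical stand-in for REGPX_A, whose definition is not shown in this thread, and first_or_empty is an illustrative helper name:

```python
import re

# Hypothetical stand-in for REGPX_A: a capitalized word.
REGPX_A = re.compile(r'[A-Z]\w+')

def first_or_empty(pattern, value):
    """Return the first match as a string, or '' if there is no match.

    search() returns None when nothing matches, so "or ['']" substitutes
    a one-element list; [0] then works on either the Match or the list.
    Subscripting a Match object requires Python 3.6 or newer.
    """
    return (pattern.search(value) or [''])[0]

found = first_or_empty(REGPX_A, "tutor: Alice and Bob")
missing = first_or_empty(REGPX_A, "no names here")
```

Note that m[0] is the whole match, so this is a drop-in replacement for findall()[0] only when the pattern has no capturing groups; with one group, m[1] would be the equivalent.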


It seems that in most cases the authors simply do not know about re.search(). Adding re.findfirst() will not fix this.
