Forwarding to the list. Please use reply-all or reply-list when responding to list mails.
Alan G.

-------- Forwarded Message --------
Subject: RE: [Tutor] regular expressions query
Date: Fri, 24 May 2019 20:10:48 +1000
From: mhysnm1...@gmail.com
To: 'Alan Gauld' <alan.ga...@yahoo.co.uk>

Alan,

I have gone back to the drawing board, as I found too many problems with the approach I was using. The original data has multiple spaces between words. I want to find unique phrases in the strings, such as "Hello World", regardless of the number of spaces that might be in the string. I have used the following lines of code, which separate the complete strings that occur more than once from those that occur only once:

transaction = [item for item, count in collections.Counter(narration).items() if count > 1]
none_dup_narration = [item for item, count in collections.Counter(narration).items() if count < 2]

So I end up with two lists: one containing the complete strings with more than one occurrence, and another containing those with only one. There are common words in the none_dup_narration list of strings, and I am trying to find a method of extracting this information. I am still reading about collections, as this could help. But I wanted to understand whether you can inject variables into the pattern of a regular expression, which was the intent of the original question. Each time the regular expression is executed, a different word would be in the pattern.

In Python 3.7, I want to understand Unions and Maps. I have read information on this in different places and still don't understand why, how and when you would use them.

Something else I have been wondering. Goal here is to grow my knowledge in programming.
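One possible way to deal with the varying whitespace before counting is sketched below. It is only an illustration under assumptions: the narration list is made-up sample data (the real data is not shown in this thread), normalise is a hypothetical helper name, and none_dup_narration is spelled with underscores because hyphens are not valid in Python names. The last step shows one reading of "common words": words that appear in more than one of the single-occurrence strings.

```python
import collections

# Made-up sample data; the real narration list is not shown in the thread.
narration = [
    "Hello   World",
    "Hello World",
    "Hello  everyone",
    "Goodbye World",
    "Goodbye   everyone",
]


def normalise(text):
    """Collapse runs of whitespace to single spaces and trim both ends."""
    return " ".join(text.split())


# Count each string once its spacing has been normalised.
counts = collections.Counter(normalise(item) for item in narration)

# Strings seen more than once, and strings seen exactly once.
transaction = [item for item, count in counts.items() if count > 1]
none_dup_narration = [item for item, count in counts.items() if count < 2]

# For each word, count how many single-occurrence strings contain it
# (case-insensitive; set() stops a word counting twice within one string).
word_counts = collections.Counter(
    word for item in none_dup_narration for word in set(item.lower().split())
)
common_words = sorted(word for word, count in word_counts.items() if count > 1)

print(transaction)         # ['Hello World']
print(none_dup_narration)  # ['Hello everyone', 'Goodbye World', 'Goodbye everyone']
print(common_words)        # ['everyone', 'goodbye']
```

Building the Counter once and reusing it avoids counting the list twice. Whether case should be folded as well as whitespace depends on the real data.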
# end if
# end for
print (count)
# end for
input ()
# end for

-----Original Message-----
From: Tutor <tutor-bounces+mhysnm1964=gmail....@python.org> On Behalf Of Alan Gauld via Tutor
Sent: Friday, 24 May 2019 7:41 PM
To: tutor@python.org
Subject: Re: [Tutor] regular expressions query

On 24/05/2019 01:15, mhysnm1...@gmail.com wrote:

> Below I am just providing the example of what I want to achieve, not
> the original strings that I will be using the regular expression against.

While I'm sure you understand what you want I'm not sure I do.
Can you be more precise?

> The original strings could have:
> "Hello world"
> "hello World everyone"
> "hello everyone"
> "hello world and friends"
> I have a string which is "hello world" which I want to identify by
> using regular expression how many times:
>
> * "hello" occurs on its own.

Define "on its own". Is the answer for the strings above 4?
Or is it 1 (ie once without an accompanying world)?

> * "Hello world" occurs in the list of strings regardless of the number
> of white spaces.

I assume you mean the answer above should be 3?

Now for each scenario how do we treat

"helloworldeveryone"?
"hello to the world"
"world, hello"
"hello, world"

> Splitting the string into an array ['hello', 'world'] and then
> re-joining it together and using a loop to move through the strings
> does not provide the information I want. So I was wondering if this is
> possible via regular expressions matching?

It is probably possible by splitting the strings and searching,
or even just using multiple standard string searches. But regex
is possible too. A lot depends on the complexity of your real
problem statement, rather than the hello world example you've
given. I suspect the real case will be trickier and therefore
more likely to need a regex.

> Modifying the original string is one option. But I was wondering if
> this could be done?

I'm not sure what you have in mind. For searching purposes you
shouldn't need to modify the original string.
(Of course Python strings are immutable so technically you can never modify a string, but in practice you can achieve the same effect.)

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos

_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
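For the regex route discussed in the reply above, a minimal sketch of putting a variable into the pattern might look like the following. It is only one interpretation of the question: the sample strings are adapted from the thread with extra spaces added, phrase_pattern and count_matching are hypothetical helper names, and the "hello on its own" count assumes that means "hello" not immediately followed by "world". Each word is passed through re.escape so that any regex metacharacters in it are treated literally, and the words are joined with \s+ so that any amount of whitespace between them matches.

```python
import re

# Sample strings adapted from the thread, with extra spaces added.
strings = [
    "Hello   world",
    "hello World everyone",
    "hello everyone",
    "hello  world and friends",
]


def phrase_pattern(phrase):
    """Compile a case-insensitive regex for the words of `phrase`,
    matched as whole words separated by any amount of whitespace."""
    words = [re.escape(word) for word in phrase.split()]
    return re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)


def count_matching(phrase, lines):
    """Count the lines in which `phrase` occurs at least once."""
    pattern = phrase_pattern(phrase)
    return sum(1 for line in lines if pattern.search(line))


# The phrase is an ordinary variable, so it can change on every call.
hello_world = count_matching("hello world", strings)
hello_any = count_matching("hello", strings)

# Assumed reading of "on its own": "hello" not followed by "world".
alone = re.compile(r"\bhello\b(?!\s+world\b)", re.IGNORECASE)
hello_alone = sum(1 for line in strings if alone.search(line))

print(hello_world, hello_any, hello_alone)  # 3 4 1
```

With these patterns "helloworldeveryone", "hello to the world", "world, hello" and "hello, world" do not count as the phrase "hello world", because the words must be separated by whitespace only. If the real data needs punctuation to be tolerated, the \s+ separator would have to be widened. The \b anchors also assume each word starts and ends with a letter, digit or underscore.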