If this were for a school assignment, I'd probably look at edit distance and
fuzzy string matching next:
https://en.wikipedia.org/wiki/Edit_distance
https://en.wikipedia.org/wiki/String-to-string_correction_problem
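
For reference, a minimal sketch of edit distance, assuming the plain
Levenshtein definition (insert, delete, and substitute each cost 1). The
function name edit_distance is made up for illustration and is not taken from
any of the linked packages:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (hypothetical helper name)."""
    # previous[j] is the distance between the prefix of `a` handled so far
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning the first i chars of `a` into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + (char_a != char_b), # substitute, or free match
            ))
        previous = current
    return previous[-1]


print(edit_distance("AAA", "APPLE"))   # 4
print(edit_distance("AAA", "BANANA"))  # 3
```

This keeps only two rows of the table, so memory is proportional to len(b)
rather than len(a) * len(b).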
- https://pypi.org/search/?q=Levenshtein
- https://pypi.org/project/textdistance/

As a bioinformatics program, this is a bit like CRISPR:
https://en.wikipedia.org/wiki/CRISPR

Biopython's Seq has a count_overlap method, under a BSD 3-Clause license:
https://github.com/biopython/biopython/blob/master/LICENSE.rst
Can it be made faster with e.g. itertools.count and a generator expression?

- Bio.Seq.Seq.count_overlap()
  http://biopython.org/DIST/docs/api/Bio.Seq.Seq-class.html#count_overlap

Are there any changes or features necessary in core Python in order to
finish this application? If not, the python-tutor mailing list or
r/learnpython are set up to handle this sort of thing.

It may or may not be appropriate for core Python to support all of these
string algorithms:
http://rosalind.info/problems/topics/string-algorithms/

On Thursday, April 26, 2018, Julia Kim <julia.hiyeon....@gmail.com> wrote:

> There are two ‘AA’ in ‘AAA’, one starting from 0 and the other starting
> from 1.
>
> If ‘AA’ starting from 0 is deleted and inserted with ‘BANAN’, ‘AAA’
> becomes ‘BANANA’.
>
> If ‘AA’ starting from 1 is deleted and inserted with ‘PPLE’, ‘AAA’ becomes
> ‘APPLE’.
>
> Depending on which one is chosen, ‘AAA’ can be edited to ‘BANANA’ or
> ‘APPLE’, two different results.
>
>
> I wrote a program which edits a part of a text. If the part to be edited
> occurs more than once, it presents the positions and asks the user to
> choose which one to be edited.
>
> I tried with different algorithms. Best one so far would be using just
> find() and collecting the results in a list.
>
>
> On Apr 25, 2018, at 11:57 PM, Wes Turner <wes.tur...@gmail.com> wrote:
>
>
> On Wednesday, April 25, 2018, Steven D'Aprano <st...@pearwood.info> wrote:
>
>> On Wed, Apr 25, 2018 at 11:22:24AM -0700, Julia Kim wrote:
>> > Hi,
>> >
>> > There’s an error with the string method count().
>> >
>> > x = 'AAA'
>> > y = 'AA'
>> > print(x.count(y))
>> >
>> > The output is 1, instead of 2.
>>
>> Are you proposing that there ought to be a version of count that looks
>> for *overlapping* substrings?
>>
>> When will this be useful?
>
> "Finding a motif in DNA"
> http://rosalind.info/problems/subs/
>
> This is possible with re.search, re.finditer, re.findall,
> regex.findall(..., overlapped=True), sliding window
> https://stackoverflow.com/questions/2970520/string-count-with-overlapping-occurrences
>
> n-grams can be by indices or by value.
> count = len(indices)
> https://en.wikipedia.org/wiki/N-gram#Examples
>
> https://en.wikipedia.org/wiki/String_(computer_science)#String_processing_algorithms
>
> https://en.wikipedia.org/wiki/Sequential_pattern_mining
>
>>
>> --
>> Steve
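
The approaches mentioned in the quoted thread (find() collecting indices, a
regex, a sliding window) can be sketched roughly as below. This is only an
illustration using the standard library; the three function names are made up
here, and none of this is Biopython's actual count_overlap implementation:

```python
import re


def find_overlapping(text: str, pattern: str) -> list[int]:
    """Start index of every occurrence of pattern, overlaps included."""
    indices = []
    start = text.find(pattern)
    while start != -1:
        indices.append(start)
        # Step forward by one character rather than len(pattern),
        # which is what lets overlapping matches be found.
        start = text.find(pattern, start + 1)
    return indices


def count_overlapping_re(text: str, pattern: str) -> int:
    """Same count using a zero-width lookahead, which consumes no characters."""
    lookahead = re.compile(f"(?={re.escape(pattern)})")
    return sum(1 for _ in lookahead.finditer(text))


def count_overlapping_gen(text: str, pattern: str) -> int:
    """Sliding window: test every start position with a generator expression."""
    last_start = len(text) - len(pattern)
    return sum(text.startswith(pattern, i) for i in range(last_start + 1))


x = "AAA"
y = "AA"
print(x.count(y))                   # 1 (non-overlapping, the built-in behaviour)
print(find_overlapping(x, y))       # [0, 1]
print(count_overlapping_re(x, y))   # 2
print(count_overlapping_gen(x, y))  # 2
```

The find() version returns the positions themselves, so count = len(indices)
and the positions can also be offered to the user to choose from.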
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/