Re: mapLast, mapFirst, and just general iterator questions
On Tue, 21 Jun 2022 at 06:16, Leo wrote:
>
> On Wed, 15 Jun 2022 04:47:31 +1000, Chris Angelico wrote:
>
> > Don't bother with a main() function unless you actually need to be
> > able to use it as a function. Most of the time, it's simplest to
> > just have the code you want, right there in the file. :) Python
> > isn't C or Java, and code doesn't have to get wrapped up in
> > functions in order to exist.
>
> Actually a main() function in Python is pretty useful, because Python
> code on the top level executes a lot slower. I believe this is due to
> global variable lookups instead of local.
>
> Here is benchmark output from a small test.
>
> ```
> Benchmark 1: python3 test1.py
>   Time (mean ± σ):     662.0 ms ±  44.7 ms
>   Range (min … max):   569.4 ms … 754.1 ms
>
> Benchmark 2: python3 test2.py
>   Time (mean ± σ):     432.1 ms ±  14.4 ms
>   Range (min … max):   411.4 ms … 455.1 ms
>
> Summary
>   'python3 test2.py' ran
>     1.53 ± 0.12 times faster than 'python3 test1.py'
> ```
>
> Contents of test1.py:
>
> ```
> l1 = list(range(5_000_000))
> l2 = []
>
> while l1:
>     l2.append(l1.pop())
>
> print(len(l1), len(l2))
> ```
>
> Contents of test2.py:
>
> ```
> def main():
>     l1 = list(range(5_000_000))
>     l2 = []
>
>     while l1:
>         l2.append(l1.pop())
>
>     print(len(l1), len(l2))
> main()
> ```
>

To be quite honest, I have never once in my life had a time when the
execution time of a script is dominated by global variable lookups in
what would be the main function, AND it takes long enough to care
about it. Yes, technically it might be faster, but I've probably spent
more time reading your post than I'll ever save by putting stuff into
a function :)

Also, often at least some of those *need* to be global in order to be
useful, so you'd lose any advantage you gain.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
Re: mapLast, mapFirst, and just general iterator questions
On Wed, 15 Jun 2022 04:47:31 +1000, Chris Angelico wrote:

> Don't bother with a main() function unless you actually need to be
> able to use it as a function. Most of the time, it's simplest to
> just have the code you want, right there in the file. :) Python
> isn't C or Java, and code doesn't have to get wrapped up in
> functions in order to exist.

Actually a main() function in Python is pretty useful, because Python
code on the top level executes a lot slower. I believe this is due to
global variable lookups instead of local.

Here is benchmark output from a small test.

```
Benchmark 1: python3 test1.py
  Time (mean ± σ):     662.0 ms ±  44.7 ms
  Range (min … max):   569.4 ms … 754.1 ms

Benchmark 2: python3 test2.py
  Time (mean ± σ):     432.1 ms ±  14.4 ms
  Range (min … max):   411.4 ms … 455.1 ms

Summary
  'python3 test2.py' ran
    1.53 ± 0.12 times faster than 'python3 test1.py'
```

Contents of test1.py:

```
l1 = list(range(5_000_000))
l2 = []

while l1:
    l2.append(l1.pop())

print(len(l1), len(l2))
```

Contents of test2.py:

```
def main():
    l1 = list(range(5_000_000))
    l2 = []

    while l1:
        l2.append(l1.pop())

    print(len(l1), len(l2))
main()
```

--
Leo
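One way to check the "global lookups instead of local" explanation is to compare the bytecode that CPython generates for the same loop at module level and inside a function. The following is only a sketch: it assumes CPython, exact opcode names vary between versions, and the helper name `opnames` is made up for this illustration.

```python
import dis

# The loop from test1.py, compiled as module-level code.
SOURCE = "while l1:\n    l2.append(l1.pop())\n"
module_code = compile(SOURCE, "<module-level>", "exec")


def main():
    # The same loop as in test2.py, with a smaller list.
    l1 = list(range(10))
    l2 = []
    while l1:
        l2.append(l1.pop())
    return l1, l2


def opnames(code):
    """Return the set of opcode names used by a code object (hypothetical helper)."""
    return {ins.opname for ins in dis.get_instructions(code)}


module_ops = opnames(module_code)
function_ops = opnames(main.__code__)

# Module-level code resolves l1 and l2 by name in the namespace dictionaries;
# inside the function they are locals stored in fast array slots.
print(sorted(op for op in module_ops if op.startswith("LOAD_")))
print(sorted(op for op in function_ops if op.startswith("LOAD_")))
```

On CPython the first list should include `LOAD_NAME`, while the second should show some form of `LOAD_FAST` for `l1` and `l2`, which is consistent with the explanation given above. It does not say how much that difference matters for any particular script.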
Re: mapLast, mapFirst, and just general iterator questions
On 15Jun2022 05:49, Chris Angelico wrote:
>On Wed, 15 Jun 2022 at 05:45, Roel Schroeven wrote:
>> Not (necessarily) a main function, but these days the general
>> recommendation seems to be to use the "if __name__ == '__main__':"
>> construct, so that the file can be used as a module as well as as a
>> script. Even for short simple things that can be helpful when doing
>> things like running tests or extracting docstrings.
>
>If it does need to be used as a module as well as a script, sure. But
>(a) not everything does, and (b) even then, you don't need a main()
>function; what you need is the name-is-main check. The main function
>is only necessary when you need to be able to invoke your main entry
>point externally, AND this main entry point doesn't have a better
>name. That's fairly rare in my experience.

While I will lazily not-use-a-function in dev, using a function has the
benefit of avoiding accidental global variable use, because assignments
within the function will always make local variables. That is a big
plus for me all on its own. I've used this practice as far back as
Pascal, which also let you write outside-a-function code, and consider
it a great avoider of a common potential bug situation.

Cheers,
Cameron Simpson
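The accidental-global problem described above can be illustrated with a small sketch. Everything here is invented for the illustration: the helper `total_length`, its deliberate typo, and the use of `exec()` to run two tiny "scripts" in fresh namespaces so that both variants can be compared side by side.

```python
# A helper with a typo: it reads 'word' instead of its own loop variable 'w'.
HELPER = '''
def total_length(words):
    total = 0
    for w in words:
        total += len(word)  # typo: should be len(w)
    return total
'''

# Variant 1: top-level code. The loop leaves a global named 'word' behind,
# so the typo silently picks it up and gives a wrong answer.
TOP_LEVEL = HELPER + '''
for word in ["a", "bb"]:
    pass
result = total_length(["xxx", "yyyy"])
'''

# Variant 2: the same code inside main(). 'word' is now local to main(),
# so the typo in the helper fails loudly with a NameError.
IN_MAIN = HELPER + '''
def main():
    for word in ["a", "bb"]:
        pass
    return total_length(["xxx", "yyyy"])

try:
    result = main()
except NameError as exc:
    result = exc
'''


def run(source):
    """Execute a script source in a fresh namespace and return its 'result'."""
    namespace = {}
    exec(source, namespace)
    return namespace["result"]


top_level_result = run(TOP_LEVEL)
main_result = run(IN_MAIN)
print(top_level_result)             # wrong total, no error raised
print(type(main_result).__name__)   # the bug surfaces as an exception
```

In the top-level variant the helper returns 4 instead of the intended 7; in the `main()` variant the same typo raises `NameError`, which is the kind of early failure the practice is meant to provide.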
Re: mapLast, mapFirst, and just general iterator questions
On 15/06/22 7:49 am, Chris Angelico wrote:
> If it does need to be used as a module as well as a script, sure. But
> (a) not everything does, and (b) even then, you don't need a main()

I think this is very much a matter of taste. Personally I find it tidier
to put the top level code in a function, because it ties it together
visually and lets me have locals that are properly local.

If the file is only ever used as a script, I just put an unconditional
call to the main function at the bottom.

--
Greg
Re: mapLast, mapFirst, and just general iterator questions
On Wed, 15 Jun 2022 at 05:45, Roel Schroeven wrote:
>
> Chris Angelico wrote on 14/06/2022 at 20:47:
> > > def main():
> > >     for each in (iterEmpty, iter1, iter2, iterMany):
> > >         baseIterator = each()
> > >         chopFirst = mapFirst(baseIterator, lambda x: x[1:-1])
> > >         andCapLast = mapLast(chopFirst, lambda x: x.upper())
> > >         print(repr(" ".join(andCapLast)))
> >
> > Don't bother with a main() function unless you actually need to be
> > able to use it as a function. Most of the time, it's simplest to just
> > have the code you want, right there in the file. :) Python isn't C or
> > Java, and code doesn't have to get wrapped up in functions in order to
> > exist.
> Not (necessarily) a main function, but these days the general
> recommendation seems to be to use the "if __name__ == '__main__':"
> construct, so that the file can be used as a module as well as as a
> script. Even for short simple things that can be helpful when doing
> things like running tests or extracting docstrings.

If it does need to be used as a module as well as a script, sure. But
(a) not everything does, and (b) even then, you don't need a main()
function; what you need is the name-is-main check. The main function
is only necessary when you need to be able to invoke your main entry
point externally, AND this main entry point doesn't have a better
name. That's fairly rare in my experience.

My recommendation is to write the code you need, and only add
boilerplate when you actually need it. Don't just start every script
with an if-name-is-main block at the bottom just for the sake of doing
it.

ChrisA
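For readers unfamiliar with the name-is-main check, here is a minimal sketch of a file that works both as a module and as a script without defining a main() function. The function `shout` and its behaviour are made up for the example.

```python
def shout(text):
    """Return text in upper case with an exclamation mark (hypothetical example function)."""
    return text.upper() + "!"


# This block runs only when the file is executed directly (python3 file.py).
# When the file is imported as a module, __name__ is the module's name
# instead, so nothing is printed and only shout() becomes available.
if __name__ == "__main__":
    print(shout("howdy"))
```

Whether the guarded block should in turn call a main() function is the matter of taste being discussed in this thread.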
Re: mapLast, mapFirst, and just general iterator questions
Chris Angelico wrote on 14/06/2022 at 20:47:
> > def main():
> >     for each in (iterEmpty, iter1, iter2, iterMany):
> >         baseIterator = each()
> >         chopFirst = mapFirst(baseIterator, lambda x: x[1:-1])
> >         andCapLast = mapLast(chopFirst, lambda x: x.upper())
> >         print(repr(" ".join(andCapLast)))
>
> Don't bother with a main() function unless you actually need to be
> able to use it as a function. Most of the time, it's simplest to just
> have the code you want, right there in the file. :) Python isn't C or
> Java, and code doesn't have to get wrapped up in functions in order to
> exist.

Not (necessarily) a main function, but these days the general
recommendation seems to be to use the "if __name__ == '__main__':"
construct, so that the file can be used as a module as well as as a
script. Even for short simple things that can be helpful when doing
things like running tests or extracting docstrings.

--
"This planet has - or rather had - a problem, which was this: most of
the people living on it were unhappy for pretty much of the time. Many
solutions were suggested for this problem, but most of these were
largely concerned with the movement of small green pieces of paper,
which was odd because on the whole it wasn't the small green pieces of
paper that were unhappy."
        -- Douglas Adams
Re: mapLast, mapFirst, and just general iterator questions
On Wed, 15 Jun 2022 at 04:07, Travis Griggs wrote:
> def mapFirst(stream, transform):
>     try:
>         first = next(stream)
>     except StopIteration:
>         return
>     yield transform(first)
>     yield from stream

Small suggestion: Begin with this:

    stream = iter(stream)

That way, you don't need to worry about whether you're given an
iterator or some other iterable (for instance, you can't call next()
on a list, but it would make good sense to be able to use your
function on a list).

(BTW, Python's convention would be to call this "map_first" rather
than "mapFirst". But that's up to you.)

> def mapLast(stream, transform):
>     try:
>         previous = next(stream)
>     except StopIteration:
>         return
>     for item in stream:
>         yield previous
>         previous = item
>     yield transform(previous)

Hmm. This might be a place to use multiple assignment, but what you
have is probably fine too.

> def main():
>     for each in (iterEmpty, iter1, iter2, iterMany):
>         baseIterator = each()
>         chopFirst = mapFirst(baseIterator, lambda x: x[1:-1])
>         andCapLast = mapLast(chopFirst, lambda x: x.upper())
>         print(repr(" ".join(andCapLast)))

Don't bother with a main() function unless you actually need to be
able to use it as a function. Most of the time, it's simplest to just
have the code you want, right there in the file. :) Python isn't C or
Java, and code doesn't have to get wrapped up in functions in order to
exist.

> Is this idiomatic? Especially my implementations of mapFirst and mapLast
> there in the middle? Or is there some way to pull this off that is more
> elegant?
>

Broadly so. Even with the comments I've made above, I wouldn't say
there's anything particularly *wrong* with your code. There are, of
course, many ways to do things, and what's "best" depends on what your
code is doing, whether it makes sense in context.
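Putting the two small suggestions together (calling iter() first and using snake_case names), the generators might look like the sketch below. This is one possible reading of the advice, not a definitive version; the parameter name `iterable` is a choice made for this illustration.

```python
def map_first(iterable, transform):
    """Yield the items of iterable, with transform applied to the first one."""
    stream = iter(iterable)  # accept lists, tuples, generators, ...
    try:
        first = next(stream)
    except StopIteration:
        return  # empty input: yield nothing
    yield transform(first)
    yield from stream


def map_last(iterable, transform):
    """Yield the items of iterable, with transform applied to the last one."""
    stream = iter(iterable)
    try:
        previous = next(stream)
    except StopIteration:
        return  # empty input: yield nothing
    for item in stream:
        # Stay one item behind so the final item can be recognised.
        yield previous
        previous = item
    yield transform(previous)


# A plain list now works directly, without wrapping it in iter() first.
words = ["howdy", "hope", "byebye"]
result = list(map_last(map_first(words, lambda x: x[1:-1]), str.upper))
print(result)
```

With a single-element input both transforms land on the same item, matching the behaviour asked for in the original post.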
> I've been doing more with iterators and stacking them (probably because I've
> been playing with Elixir elsewhere), I am generally curious what the
> performance tradeoffs of heavy use of iterators and yield functions in python
> is. I know the argument for avoiding big list copies when moving between
> stages. Is it one of those things where there's also some overhead with them,
> where for small stuff, you'd just be better list-ifying the first iterator
> and then working with lists (where, for example, I could do the first/last
> clamp operation with just indexing operations).
>

That's mostly right, but more importantly: Don't worry about
performance. Worry instead about whether the code is expressing your
intent. If that means using a list instead of an iterator, go for it!
If that means using an iterator instead of a list, go for it! Python
won't judge you. :)

But if you really want to know which one is faster, figure out a
reasonable benchmark, and then start playing around with the timeit
module. Just remember, it's very very easy to spend hours trying to
make the benchmark numbers look better, only to discover that it has
negligible impact on your code's actual performance - or, in some
cases, it's *worse* than before (because the benchmark wasn't truly
representative).

So if you want to spend some enjoyable time exploring different
options, go for it! And we'd be happy to help out. Just don't force
yourself to write bad code "because it's faster".

ChrisA
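As a starting point for that kind of exploration, a timeit comparison could be sketched as follows. The two functions, the input size, and the repeat count are all assumptions made for this illustration, and the numbers it prints depend on the machine and the Python version, so they say little about any real workload.

```python
import timeit


def map_last_gen(iterable, transform):
    """Generator version: stays one item behind to find the last item."""
    stream = iter(iterable)
    try:
        previous = next(stream)
    except StopIteration:
        return
    for item in stream:
        yield previous
        previous = item
    yield transform(previous)


def map_last_list(iterable, transform):
    """List version: copy everything, then fix up the last item by index."""
    items = list(iterable)
    if items:
        items[-1] = transform(items[-1])
    return items


# Hypothetical benchmark input: a thousand short strings.
data = [str(n) for n in range(1000)]

gen_time = timeit.timeit(lambda: list(map_last_gen(data, str.upper)), number=200)
list_time = timeit.timeit(lambda: map_last_list(data, str.upper), number=200)

print(f"generator: {gen_time:.4f}s  list: {list_time:.4f}s")
```

Both versions produce the same items, so the choice between them can be made on readability first, with a benchmark like this as a sanity check rather than a verdict.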
mapLast, mapFirst, and just general iterator questions
I want to be able to apply different transformations to the first and
last elements of an arbitrarily sized finite iterator in python3. It's a
custom iterator so does not have __reversed__. If the first and last
elements are the same (e.g. size 1), it should apply both transforms to
the same element. I'm doing this because I have an iterator of time
span tuples, and I want to clamp the first and last elements, but know
any/all of the middle values are inherently in range.

A silly example might be a process that, given an iterator of strings,
chops the outer characters off the first value, and uppercases the
final value. For example:

    def iterEmpty():
        return iter([])

    def iter1():
        yield "howdy"

    def iter2():
        yield "howdy"
        yield "byebye"

    def iterMany():
        yield "howdy"
        yield "hope"
        yield "your"
        yield "day"
        yield "is"
        yield "swell"
        yield "byebye"

    def mapFirst(stream, transform):
        try:
            first = next(stream)
        except StopIteration:
            return
        yield transform(first)
        yield from stream

    def mapLast(stream, transform):
        try:
            previous = next(stream)
        except StopIteration:
            return
        for item in stream:
            yield previous
            previous = item
        yield transform(previous)

    def main():
        for each in (iterEmpty, iter1, iter2, iterMany):
            baseIterator = each()
            chopFirst = mapFirst(baseIterator, lambda x: x[1:-1])
            andCapLast = mapLast(chopFirst, lambda x: x.upper())
            print(repr(" ".join(andCapLast)))

This outputs:

    ''
    'OWD'
    'owd BYEBYE'
    'owd hope your day is swell BYEBYE'

Is this idiomatic? Especially my implementations of mapFirst and mapLast
there in the middle? Or is there some way to pull this off that is more
elegant?

I've been doing more with iterators and stacking them (probably because
I've been playing with Elixir elsewhere), and I am generally curious
what the performance tradeoffs of heavy use of iterators and yield
functions in python are. I know the argument for avoiding big list
copies when moving between stages.
Is it one of those things where there's also some overhead with them,
where for small stuff, you'd just be better off list-ifying the first
iterator and then working with lists (where, for example, I could do the
first/last clamp operation with just indexing operations)?
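For comparison, the list-based alternative mentioned in the question could be sketched like this, using the time-span clamping use case from the start of the post. The function name `clamp_ends`, the `(start, end)` tuple layout, and the window bounds are all assumptions made for the illustration.

```python
def clamp_ends(spans, lo, hi):
    """Materialise spans as a list and clamp the first and last by indexing.

    Each span is assumed to be a (start, end) tuple; lo and hi are the
    bounds of the window to clamp to.
    """
    items = list(spans)
    if items:
        start, end = items[0]
        items[0] = (max(start, lo), end)
        # Re-read the last item: with one element it is the span just clamped,
        # so both adjustments apply to the same tuple.
        start, end = items[-1]
        items[-1] = (start, min(end, hi))
    return items


# Works for any finite iterable, including a one-shot iterator.
spans = iter([(0, 5), (5, 10), (10, 15)])
clamped = clamp_ends(spans, 2, 12)
print(clamped)
```

The trade-off is the one raised in the question: the whole sequence is copied into memory, in exchange for simple index-based access to both ends.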