Re: [python-uk] A stack with better performance than using a list

Daniel Pope Sun, 11 Jun 2017 07:23:40 -0700

I was able to get about a 20% speed up over Steve's solution, on some
benchmark data I created, by:


* converting LOAD_GLOBAL to LOAD_FAST for __builtins__
* eliminating the conditional in each loop in favour of a conditional on
pop only
* eliminating string comparison for the operation in favour of testing
line[1] against byte values

None of these had a significant effect in either direction:

* replacing range objects with counts
* replacing the string.split()/int() calls with a hand-rolled
space-separated integer parser in Python.

Oh, in my benchmarking, the print() calls were huge and highly variable...
I stubbed them out first.

On Thu, 8 Jun 2017, 21:39 Stestagg, <stest...@gmail.com> wrote:

> Apologies, In my previous email, I meant 'insert a marker', rather than
> 'push a marker'
>
> On Thu, Jun 8, 2017 at 7:17 PM Stestagg <stest...@gmail.com> wrote:
>
>> I tracked down the challenge on the site, and have a working solution (I
>> won't share for obvious reasons). Basically the timeouts were being caused
>> by 'add_to_first_n' being called in horrible ways in the test cases.
>>
>> Because add_to_first_n alters the bottom of the stack, you can just push
>> a marker onto the stack rather than iterating and mutating each entry,
>> doing this made those test cases pass
>>
>> Personally, I think it's not a well-described problem, because it's
>> expecting you to tune the algo to specific shapes of data without allowing
>> any visibility on the data, or a description of what to code for.  An algo
>> junkie may jump straight to the optimized version, but a pragmatic
>> developer would, in my opinion, hesitate to do that without any actual
>> evidence that the problem required it.
>>
>> Steve
>>
>>
>>
>>
>>
>> On Thu, Jun 8, 2017 at 5:27 PM Jonathan Hartley <tart...@tartley.com>
>> wrote:
>>
>>> Yep, that's a great elimination of the suspicious small overheads.
>>>
>>> line_profiler is beautiful, I'll definitely be adding it to my toolbox,
>>> thanks for that!
>>>
>>> I tried a variant of accumulating the output and printing it all as a
>>> single string, but of course this didn't help, printing is already buffered.
>>>
>>>     Jonathan
>>>
>>> On 6/8/2017 03:54, Stestagg wrote:
>>>
>>> I honestly can't see a way to improve this in python.  My best solution
>>> is:
>>>
>>> def main(lines):
>>>     stack = []
>>>     sa = stack.append
>>>     sp = stack.pop
>>>     si = stack.__getitem__
>>>     for line in lines:
>>>         meth = line[:3]
>>>         if meth == b'pus':
>>>             sa(int(line[5:]))
>>>         elif meth == b'pop':
>>>             sp()
>>>         else:
>>>             parts = line[15:].split()
>>>             end = len(stack)-1
>>>             amount = int(parts[1])
>>>             for x in range(int(parts[0])):
>>>                 index = end - x
>>>                 stack[index] += amount
>>>         print(stack[-1] if stack else None)
>>>
>>> which comes out about 25% faster than your solution.
>>>
>>> One tool that's interesting to use here is: line_profiler:
>>> https://github.com/rkern/line_profiler
>>>
>>> putting a @profile decorator on the above main() call, and running with
>>> kernprof produces the following output:
>>>
>>> Line #      Hits         Time  Per Hit   % Time  Line Contents
>>>
>>> ==============================================================
>>>
>>>     12                                           @profile
>>>
>>>     13                                           def main(lines):
>>>
>>>     14         1            4      4.0      0.0      stack = []
>>>
>>>     15   2000001       949599      0.5     11.5      for line in lines:
>>>
>>>     16   2000000      1126944      0.6     13.7          meth = line[:3]
>>>
>>>
>>>     17   2000000       974635      0.5     11.8          if meth ==
>>> b'pus':
>>>
>>>     18   1000000      1002733      1.0     12.2
>>> stack.append(int(line[5:]))
>>>
>>>     19   1000000       478756      0.5      5.8          elif meth ==
>>> b'pop':
>>>
>>>     20    999999       597114      0.6      7.2              stack.pop()
>>>
>>>     21                                                   else:
>>>
>>>     22         1            6      6.0      0.0              parts =
>>> line[15:].split()
>>>
>>>     23         1            2      2.0      0.0              end =
>>> len(stack)-1
>>>
>>>     24         1            1      1.0      0.0              amount =
>>> int(parts[1])
>>>
>>>     25    500001       241227      0.5      2.9              for x in
>>> range(int(parts[0])):
>>>
>>>     26    500000       273477      0.5      3.3                  index
>>> = end - x
>>>
>>>     27    500000       309033      0.6      3.7                  
>>> stack[index]
>>> += amount
>>>
>>>     28   2000000      2295803      1.1     27.8
>>> print(stack[-1])
>>>
>>> which shows that there's no obvious bottleneck (line by line) here (for
>>> my sample data).
>>>
>>> Note the print() overhead dominates the runtime, and that's with me
>>> piping the output to /dev/null directly.
>>>
>>> I had a go at using arrays, deques, and numpy arrays in various ways
>>> without luck, but we're getting fairly close to the native python statement
>>> execution overhead here (hence folding it all into one function).
>>>
>>> My only thoughts would be to see if there were some magic that could be
>>> done by offloading the work onto a non-python library somehow.
>>>
>>> Another thing that might help some situations (hence my previous
>>> questions) would be to implement the add_to_first_n as a lazy operator
>>> (i.e. have a stack of the add_to_first_n values and dynamically add to the
>>> results of pop() but that would proabably be much slow in the average case.
>>>
>>> Steve
>>>
>>> On Wed, Jun 7, 2017 at 7:34 PM Jonathan Hartley <tart...@tartley.com>
>>> wrote:
>>>
>>>> Hey.
>>>>
>>>> Thanks for engaging, but I can't help with the most important of those
>>>> questions - the large data sets on which my solution failed due to timeout
>>>> are hidden from candidates. Not unreasonable to assume that they do
>>>> exercise deep stacks, and large args to add_to_first_n, etc.
>>>>
>>>> Yes, the input looks exactly like your example. All args are integers.
>>>> The question asked for output corresponding to the top of the stack after
>>>> every operation. I omitted this print from inside the 'for' loop in 'main',
>>>> thinking it irrelevant.
>>>>
>>>> I converted to integers inside 'dispatch'. 'args' must have actually
>>>> been created with:
>>>>
>>>> args = [int(i) for i in tokens[1:]]
>>>>
>>>> Where len(tokens) is never going to be bigger than 3.
>>>>
>>>> Return values (from 'pop') were unused.
>>>>
>>>>
>>>> On 6/7/2017 13:25, Stestagg wrote:
>>>>
>>>> Do you have any more context?
>>>> For example, is the add_to_first_n likely to be called with very large
>>>> numbers, or very often? Does the stack get very deep, or stay shallow?
>>>>
>>>> I'm assuming that lines look like this:
>>>>
>>>> push 1
>>>> push 2
>>>> add_to_first_n 2 10
>>>> pop
>>>> pop
>>>>
>>>> with all arguments as integers, and the final value being returned from
>>>> main()?
>>>> How did you convert from string inputs to numeric values?
>>>> How did you manage return values?
>>>>
>>>> :D
>>>>
>>>> On Wed, Jun 7, 2017 at 6:51 PM Jonathan Hartley <tart...@tartley.com>
>>>> wrote:
>>>>
>>>>> I recently submitted a solution to a coding challenge, in an
>>>>> employment context. One of the questions was to model a simple stack. I
>>>>> wrote a solution which appended and popped from the end of a list. This
>>>>> worked, but failed with timeouts on their last few automated tests with
>>>>> large (hidden) data sets.
>>>>>
>>>>> From memory, I think I had something pretty standard:
>>>>>
>>>>> class Stack:
>>>>>
>>>>>     def __init__(self):
>>>>>         self.storage = []
>>>>>
>>>>>     def push(arg):
>>>>>         self.storage.append(arg)
>>>>>
>>>>>     def pop():
>>>>>         return self.storage.pop() if self.storage else None
>>>>>
>>>>>     def add_to_first_n(n, amount):
>>>>>         for n in range(n):
>>>>>             self.storage[n] += amount
>>>>>
>>>>>     def dispatch(self, line)
>>>>>         tokens = line.split()
>>>>>         method = getattr(self, tokens[0])
>>>>>         args = tokens[1:]
>>>>>         method(*args)
>>>>>
>>>>> def main(lines):
>>>>>     stack = Stack()
>>>>>     for line in lines:
>>>>>         stack.dispatch(line)
>>>>>
>>>>>
>>>>> (will that formatting survive? Apologies if not)
>>>>>
>>>>> Subsequent experiments have confirmed that appending to and popping
>>>>> from the end of lists are O(1), amortized.
>>>>> So why is my solution too slow?
>>>>>
>>>>> This question was against the clock, 4th question of 4 in an hour. So
>>>>> I wasn't expecting to produce Cython or C optimised code in that timeframe
>>>>> (Besides, my submitted .py file runs on their servers, so the environment
>>>>> is limited.)
>>>>>
>>>>> So what am I missing, from a performance perspective? Are there other
>>>>> data structures in stdlib which are also O(1) but with a better constant?
>>>>>
>>>>> Ah. In writing this out, I have begun to suspect that my slicing of
>>>>> 'tokens' to produce 'args' in the dispatch is needlessly wasting time. Not
>>>>> much, but some.
>>>>>
>>>>> Thoughts welcome,
>>>>>
>>>>>     Jonathan
>>>>>
>>>>> --
>>>>> Jonathan Hartley    tart...@tartley.com    http://tartley.com
>>>>> Made out of meat.   +1 507-513-1101 <%28507%29%20513-1101>        
>>>>> twitter/skype: tartley
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> python-uk mailing list
>>>>> python-uk@python.org
>>>>> https://mail.python.org/mailman/listinfo/python-uk
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> python-uk mailing 
>>>> listpython-uk@python.orghttps://mail.python.org/mailman/listinfo/python-uk
>>>>
>>>>
>>>> --
>>>> Jonathan Hartley    tart...@tartley.com    http://tartley.com
>>>> Made out of meat.   +1 507-513-1101 <%28507%29%20513-1101>        
>>>> twitter/skype: tartley
>>>>
>>>>
>>>> _______________________________________________
>>>> python-uk mailing list
>>>> python-uk@python.org
>>>> https://mail.python.org/mailman/listinfo/python-uk
>>>>
>>>
>>>
>>> _______________________________________________
>>> python-uk mailing 
>>> listpython-uk@python.orghttps://mail.python.org/mailman/listinfo/python-uk
>>>
>>>
>>> --
>>> Jonathan Hartley    tart...@tartley.com    http://tartley.com
>>> Made out of meat.   +1 507-513-1101 <(507)%20513-1101>        
>>> twitter/skype: tartley
>>>
>>>
>>> _______________________________________________
>>> python-uk mailing list
>>> python-uk@python.org
>>> https://mail.python.org/mailman/listinfo/python-uk
>>>
>> _______________________________________________
> python-uk mailing list
> python-uk@python.org
> https://mail.python.org/mailman/listinfo/python-uk
>

_______________________________________________
python-uk mailing list
python-uk@python.org
https://mail.python.org/mailman/listinfo/python-uk

Re: [python-uk] A stack with better performance than using a list

Reply via email to