Apologies, In my previous email, I meant 'insert a marker', rather than 'push a marker'
On Thu, Jun 8, 2017 at 7:17 PM Stestagg <stest...@gmail.com> wrote: > I tracked down the challenge on the site, and have a working solution (I > won't share for obvious reasons). Basically the timeouts were being caused > by 'add_to_first_n' being called in horrible ways in the test cases. > > Because add_to_first_n alters the bottom of the stack, you can just push a > marker onto the stack rather than iterating and mutating each entry, doing > this made those test cases pass > > Personally, I think it's not a well-described problem, because it's > expecting you to tune the algo to specific shapes of data without allowing > any visibility on the data, or a description of what to code for. An algo > junkie may jump straight to the optimized version, but a pragmatic > developer would, in my opinion, hesitate to do that without any actual > evidence that the problem required it. > > Steve > > > > > > On Thu, Jun 8, 2017 at 5:27 PM Jonathan Hartley <tart...@tartley.com> > wrote: > >> Yep, that's a great elimination of the suspicious small overheads. >> >> line_profiler is beautiful, I'll definitely be adding it to my toolbox, >> thanks for that! >> >> I tried a variant of accumulating the output and printing it all as a >> single string, but of course this didn't help, printing is already buffered. >> >> Jonathan >> >> On 6/8/2017 03:54, Stestagg wrote: >> >> I honestly can't see a way to improve this in python. My best solution >> is: >> >> def main(lines): >> stack = [] >> sa = stack.append >> sp = stack.pop >> si = stack.__getitem__ >> for line in lines: >> meth = line[:3] >> if meth == b'pus': >> sa(int(line[5:])) >> elif meth == b'pop': >> sp() >> else: >> parts = line[15:].split() >> end = len(stack)-1 >> amount = int(parts[1]) >> for x in range(int(parts[0])): >> index = end - x >> stack[index] += amount >> print(stack[-1] if stack else None) >> >> which comes out about 25% faster than your solution. >> >> One tool that's interesting to use here is: line_profiler: >> https://github.com/rkern/line_profiler >> >> putting a @profile decorator on the above main() call, and running with >> kernprof produces the following output: >> >> Line # Hits Time Per Hit % Time Line Contents >> >> ============================================================== >> >> 12 @profile >> >> 13 def main(lines): >> >> 14 1 4 4.0 0.0 stack = [] >> >> 15 2000001 949599 0.5 11.5 for line in lines: >> >> 16 2000000 1126944 0.6 13.7 meth = line[:3] >> >> 17 2000000 974635 0.5 11.8 if meth == >> b'pus': >> >> 18 1000000 1002733 1.0 12.2 >> stack.append(int(line[5:])) >> >> 19 1000000 478756 0.5 5.8 elif meth == >> b'pop': >> >> 20 999999 597114 0.6 7.2 stack.pop() >> >> 21 else: >> >> 22 1 6 6.0 0.0 parts = >> line[15:].split() >> >> 23 1 2 2.0 0.0 end = >> len(stack)-1 >> >> 24 1 1 1.0 0.0 amount = >> int(parts[1]) >> >> 25 500001 241227 0.5 2.9 for x in >> range(int(parts[0])): >> >> 26 500000 273477 0.5 3.3 index = >> end - x >> >> 27 500000 309033 0.6 3.7 stack[index] >> += amount >> >> 28 2000000 2295803 1.1 27.8 print(stack[-1]) >> >> which shows that there's no obvious bottleneck (line by line) here (for >> my sample data). >> >> Note the print() overhead dominates the runtime, and that's with me >> piping the output to /dev/null directly. >> >> I had a go at using arrays, deques, and numpy arrays in various ways >> without luck, but we're getting fairly close to the native python statement >> execution overhead here (hence folding it all into one function). >> >> My only thoughts would be to see if there were some magic that could be >> done by offloading the work onto a non-python library somehow. >> >> Another thing that might help some situations (hence my previous >> questions) would be to implement the add_to_first_n as a lazy operator >> (i.e. have a stack of the add_to_first_n values and dynamically add to the >> results of pop() but that would proabably be much slow in the average case. >> >> Steve >> >> On Wed, Jun 7, 2017 at 7:34 PM Jonathan Hartley <tart...@tartley.com> >> wrote: >> >>> Hey. >>> >>> Thanks for engaging, but I can't help with the most important of those >>> questions - the large data sets on which my solution failed due to timeout >>> are hidden from candidates. Not unreasonable to assume that they do >>> exercise deep stacks, and large args to add_to_first_n, etc. >>> >>> Yes, the input looks exactly like your example. All args are integers. >>> The question asked for output corresponding to the top of the stack after >>> every operation. I omitted this print from inside the 'for' loop in 'main', >>> thinking it irrelevant. >>> >>> I converted to integers inside 'dispatch'. 'args' must have actually >>> been created with: >>> >>> args = [int(i) for i in tokens[1:]] >>> >>> Where len(tokens) is never going to be bigger than 3. >>> >>> Return values (from 'pop') were unused. >>> >>> >>> On 6/7/2017 13:25, Stestagg wrote: >>> >>> Do you have any more context? >>> For example, is the add_to_first_n likely to be called with very large >>> numbers, or very often? Does the stack get very deep, or stay shallow? >>> >>> I'm assuming that lines look like this: >>> >>> push 1 >>> push 2 >>> add_to_first_n 2 10 >>> pop >>> pop >>> >>> with all arguments as integers, and the final value being returned from >>> main()? >>> How did you convert from string inputs to numeric values? >>> How did you manage return values? >>> >>> :D >>> >>> On Wed, Jun 7, 2017 at 6:51 PM Jonathan Hartley <tart...@tartley.com> >>> wrote: >>> >>>> I recently submitted a solution to a coding challenge, in an employment >>>> context. One of the questions was to model a simple stack. I wrote a >>>> solution which appended and popped from the end of a list. This worked, but >>>> failed with timeouts on their last few automated tests with large (hidden) >>>> data sets. >>>> >>>> From memory, I think I had something pretty standard: >>>> >>>> class Stack: >>>> >>>> def __init__(self): >>>> self.storage = [] >>>> >>>> def push(arg): >>>> self.storage.append(arg) >>>> >>>> def pop(): >>>> return self.storage.pop() if self.storage else None >>>> >>>> def add_to_first_n(n, amount): >>>> for n in range(n): >>>> self.storage[n] += amount >>>> >>>> def dispatch(self, line) >>>> tokens = line.split() >>>> method = getattr(self, tokens[0]) >>>> args = tokens[1:] >>>> method(*args) >>>> >>>> def main(lines): >>>> stack = Stack() >>>> for line in lines: >>>> stack.dispatch(line) >>>> >>>> >>>> (will that formatting survive? Apologies if not) >>>> >>>> Subsequent experiments have confirmed that appending to and popping >>>> from the end of lists are O(1), amortized. >>>> So why is my solution too slow? >>>> >>>> This question was against the clock, 4th question of 4 in an hour. So I >>>> wasn't expecting to produce Cython or C optimised code in that timeframe >>>> (Besides, my submitted .py file runs on their servers, so the environment >>>> is limited.) >>>> >>>> So what am I missing, from a performance perspective? Are there other >>>> data structures in stdlib which are also O(1) but with a better constant? >>>> >>>> Ah. In writing this out, I have begun to suspect that my slicing of >>>> 'tokens' to produce 'args' in the dispatch is needlessly wasting time. Not >>>> much, but some. >>>> >>>> Thoughts welcome, >>>> >>>> Jonathan >>>> >>>> -- >>>> Jonathan Hartley tart...@tartley.com http://tartley.com >>>> Made out of meat. +1 507-513-1101 <%28507%29%20513-1101> >>>> twitter/skype: tartley >>>> >>>> >>>> _______________________________________________ >>>> python-uk mailing list >>>> python-uk@python.org >>>> https://mail.python.org/mailman/listinfo/python-uk >>>> >>> >>> >>> _______________________________________________ >>> python-uk mailing >>> listpython-uk@python.orghttps://mail.python.org/mailman/listinfo/python-uk >>> >>> >>> -- >>> Jonathan Hartley tart...@tartley.com http://tartley.com >>> Made out of meat. +1 507-513-1101 <%28507%29%20513-1101> >>> twitter/skype: tartley >>> >>> >>> _______________________________________________ >>> python-uk mailing list >>> python-uk@python.org >>> https://mail.python.org/mailman/listinfo/python-uk >>> >> >> >> _______________________________________________ >> python-uk mailing >> listpython-uk@python.orghttps://mail.python.org/mailman/listinfo/python-uk >> >> >> -- >> Jonathan Hartley tart...@tartley.com http://tartley.com >> Made out of meat. +1 507-513-1101 <(507)%20513-1101> twitter/skype: >> tartley >> >> >> _______________________________________________ >> python-uk mailing list >> python-uk@python.org >> https://mail.python.org/mailman/listinfo/python-uk >> >
_______________________________________________ python-uk mailing list python-uk@python.org https://mail.python.org/mailman/listinfo/python-uk