No, the problem here is definitely that you're generating a lot of garbage that survives the young generation. You can always try playing with the PYPY_GC_NURSERY environment variable (defaults to 4M, I think).
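One way to act on this advice is to allocate less per request in the first place, so fewer objects survive the nursery. A hedged sketch of the idea, not the thread's actual code: `Node` here is a simplified stand-in for the sample program's list nodes, and `Traverser` reuses one long-lived buffer instead of appending fresh tuples on every traversal (Python 3 syntax, unlike the Python 2 sample later in the thread):

```python
class Node(object):
    """Minimal stand-in for the linked-list nodes in the sample program."""
    def __init__(self, v, next_node=None):
        self.v = v
        self.next = next_node
        self.dummy_links = [self]

class Traverser(object):
    """Reuses one preallocated buffer across calls, so a traversal
    allocates almost nothing that could survive a minor collection."""
    def __init__(self, capacity):
        self.buf = [None] * capacity  # long-lived; entries are overwritten

    def follow(self, node, steps):
        cur = node
        for i in range(steps):
            self.buf[i] = cur.v  # store the value in place, no fresh tuple
            cur = cur.next if cur.next is not None else cur.dummy_links[0]
        return self.buf[:steps]  # one copy at the end, only if needed
```

Whether this is feasible depends on the real application, but the general pattern (preallocate and overwrite, rather than build and discard) is what reduces the nursery-surviving garbage the GC log points at.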
On Mon, Mar 17, 2014 at 9:04 PM, Martin Koch <m...@issuu.com> wrote:

Well, it would appear that we have the problem because we're generating a lot of garbage in the young generation, just like we're doing in the example we've been studying here. I'm unsure how we can avoid that in our real implementation. Can we force gc of the young generation? Either by gc.collect() or implicitly somehow (does the gc e.g. kick in across function calls?)

Thanks,
/Martin

On Mon, Mar 17, 2014 at 5:39 PM, Maciej Fijalkowski <fij...@gmail.com> wrote:

Not sure how much more we can help without looking into the code.

On Mon, Mar 17, 2014 at 6:05 PM, Martin Koch <m...@issuu.com> wrote:

Thanks :)

/Martin

On 17/03/2014, at 16.41, Maciej Fijalkowski <fij...@gmail.com> wrote:

OK. So, as you can probably see, the max is not that big, which means the GC is really incremental. What happens is that you get tons of garbage that survives minor collection every now and then. I don't know exactly why, but you should look at which objects can potentially survive for too long.

On Mon, Mar 17, 2014 at 5:37 PM, Martin Koch <m...@issuu.com> wrote:

Ah - it just occurred to me that the first runs may be slow anyway: since we take the average of the last 100 runs as the benchmark, the first 100 runs are not classified as slow. Indeed, the first three runs with many collections are in the first 100 runs.

On Mon, Mar 17, 2014 at 4:35 PM, Martin Koch <m...@issuu.com> wrote:

Here are the total and max times, in millions of units; 30,000 of these (i.e. 30*10^9 raw units) is approximately 13 seconds. I have extracted the runs where there are many gc-collect-steps. These are in execution order, so the first runs with many gc-collect-steps aren't slow.
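Per-query totals and maxima like those below can be extracted from the PYPYLOG output mechanically. A rough sketch, assuming the `[hex-timestamp] {section` / `[hex-timestamp] section}` line format shown further down the thread; the function name and structure are illustrative, not the script actually used:

```python
import re

# One PYPYLOG line: "[2b99f1981b527e] {gc-minor" opens a section,
# "[2b99f19890d750] gc-minor}" closes it. Timestamps are hex.
LINE = re.compile(r'\[([0-9a-f]+)\] (\{)?([\w-]+)(\})?')

def aggregate(lines):
    """Return (totals, maxima): per-section summed and maximum durations,
    in raw timestamp units. Assumes sections are well nested and that
    open/close markers come one per line."""
    totals, maxima, open_at = {}, {}, {}
    for line in lines:
        m = LINE.search(line)
        if not m:
            continue
        ts_hex, is_open, name, is_close = m.groups()
        ts = int(ts_hex, 16)
        if is_open:
            open_at[name] = ts
        elif is_close and name in open_at:
            d = ts - open_at.pop(name)
            totals[name] = totals.get(name, 0) + d
            maxima[name] = max(maxima.get(name, 0), d)
    return totals, maxima
```

Run over the log slice between each of the program's own start-->/<--stop markers, this yields per-query Totals/Max figures of the kind tabulated below.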
Totals: gc-minor:418 gc-minor-walkroots:0 gc-collect-step:28797  Max: gc-minor:10 gc-collect-step:247
Totals: gc-minor:562 gc-minor-walkroots:0 gc-collect-step:30282  Max: gc-minor:10 gc-collect-step:245
Totals: gc-minor:434 gc-minor-walkroots:0 gc-collect-step:31040  Max: gc-minor:11 gc-collect-step:244
Totals: gc-minor:417 slow:1 gc-minor-walkroots:0 gc-collect-step:31270  Max: gc-minor:17 gc-collect-step:244
Totals: gc-minor:435 slow:1 gc-minor-walkroots:0 gc-collect-step:30365  Max: gc-minor:11 gc-collect-step:248
Totals: gc-minor:389 slow:1 gc-minor-walkroots:0 gc-collect-step:31235  Max: gc-minor:8 gc-collect-step:299
Totals: gc-minor:434 slow:1 gc-minor-walkroots:0 gc-collect-step:31124  Max: gc-minor:11 gc-collect-step:246
Totals: gc-minor:386 slow:1 gc-minor-walkroots:0 gc-collect-step:30541  Max: gc-minor:8 gc-collect-step:244
Totals: gc-minor:410 slow:1 gc-minor-walkroots:0 gc-collect-step:31427  Max: gc-minor:36 gc-collect-step:248
Totals: gc-minor:390 slow:1 gc-minor-walkroots:0 gc-collect-step:30743  Max: gc-minor:8 gc-collect-step:244
Totals: gc-minor:380 slow:1 gc-minor-walkroots:0 gc-collect-step:30207  Max: gc-minor:8 gc-collect-step:245
Totals: gc-minor:387 slow:1 gc-minor-walkroots:0 gc-collect-step:30837  Max: gc-minor:8 gc-collect-step:244
Totals: gc-minor:412 slow:1 gc-minor-walkroots:0 gc-collect-step:30898  Max: gc-minor:38 gc-collect-step:244
Totals: gc-minor:415 slow:1 gc-minor-walkroots:0 gc-collect-step:30407  Max: gc-minor:23 gc-collect-step:245
Totals: gc-minor:380 slow:1 gc-minor-walkroots:0 gc-collect-step:30591  Max: gc-minor:8 gc-collect-step:246
Totals: gc-minor:387 slow:1 gc-minor-walkroots:0 gc-collect-step:31193  Max: gc-minor:9 gc-collect-step:244
Totals: gc-minor:379 slow:1 gc-minor-walkroots:0 gc-collect-step:30026  Max: gc-minor:8 gc-collect-step:246
Totals: gc-minor:388 slow:1 gc-minor-walkroots:0 gc-collect-step:31179  Max: gc-minor:8 gc-collect-step:248
Totals: gc-minor:378 slow:1 gc-minor-walkroots:0 gc-collect-step:30674  Max: gc-minor:8 gc-collect-step:250
Totals: gc-minor:385 slow:1 gc-minor-walkroots:0 gc-collect-step:30413  Max: gc-minor:8 gc-collect-step:245
Totals: gc-minor:915 slow:1 gc-minor-walkroots:0 gc-collect-step:30830  Max: gc-minor:543 gc-collect-step:244
Totals: gc-minor:405 slow:1 gc-minor-walkroots:0 gc-collect-step:31153  Max: gc-minor:20 gc-collect-step:246
Totals: gc-minor:408 slow:1 gc-minor-walkroots:0 gc-collect-step:29815  Max: gc-minor:25 gc-collect-step:245

Thanks,
/Martin

On Mon, Mar 17, 2014 at 3:24 PM, Martin Koch <m...@issuu.com> wrote:

Ah. I had misunderstood. I'll get back to you on that :) Thanks.

/Martin

On 17/03/2014, at 15.21, Maciej Fijalkowski <fij...@gmail.com> wrote:

Eh, this is not what I need. I need the max of the TIME it took for a gc-minor and the TOTAL time it took for a gc-minor (per query), and ideally the same for gc-walkroots and gc-collect-step.

On Mon, Mar 17, 2014 at 4:19 PM, Martin Koch <m...@issuu.com> wrote:

Here are the collated results of running each query. For each run, I count how many of each of the pypy debug lines I get. I.e.
there were 668 runs that printed 58 loglines containing "{gc-minor", each eventually followed by "gc-minor}". I have also counted whether the query was slow; interestingly, not all the queries with many gc-minors were slow (but all slow queries had a gc-minor).

Please let me know if this is unclear :)

668  gc-minor:58 gc-minor-walkroots:58
 10  gc-minor:58 gc-minor-walkroots:58 gc-collect-step:5
140  gc-minor:59 gc-minor-walkroots:59
  1  gc-minor:8441 gc-minor-walkroots:8441 gc-collect-step:8403
  1  gc-minor:9300 gc-minor-walkroots:9300 gc-collect-step:9249
  9  gc-minor:9643 slow:1 gc-minor-walkroots:9643 gc-collect-step:9589
  1  gc-minor:9644 slow:1 gc-minor-walkroots:9644 gc-collect-step:9590
 10  gc-minor:9647 slow:1 gc-minor-walkroots:9647 gc-collect-step:9609
  1  gc-minor:9663 gc-minor-walkroots:9663 gc-collect-step:9614
  1  jit-backend-dump:5 gc-minor:58 gc-minor-walkroots:58
  1  jit-log-compiling-loop:1 gc-collect-step:8991 jit-backend-dump:78 jit-backend:3 jit-log-noopt-loop:6 jit-log-virtualstate:3 gc-minor:9030 jit-tracing:3 gc-minor-walkroots:9030 jit-optimize:6 jit-log-short-preamble:2 jit-backend-addr:3 jit-log-opt-loop:1 jit-mem-looptoken-alloc:3 jit-abort:3 jit-log-rewritten-bridge:2 jit-log-rewritten-loop:1 jit-log-opt-bridge:2 jit-log-compiling-bridge:2 jit-resume:84
  1  jit-log-compiling-loop:1 jit-backend-dump:13 jit-backend:1 jit-log-noopt-loop:2 gc-minor:60 jit-tracing:1 gc-minor-walkroots:60 jit-optimize:2 jit-log-short-preamble:1 jit-backend-addr:1 jit-log-opt-loop:1 jit-mem-looptoken-alloc:1 jit-log-rewritten-loop:1 jit-resume:14
  1  jit-log-compiling-loop:1 jit-backend-dump:73 jit-backend:3 jit-log-noopt-loop:6 jit-log-virtualstate:3 gc-minor:60 jit-tracing:3 gc-minor-walkroots:60 jit-optimize:6 jit-log-short-preamble:2 jit-backend-addr:3 jit-log-opt-loop:1 jit-mem-looptoken-alloc:3 jit-abort:3 jit-log-rewritten-bridge:2 jit-log-rewritten-loop:1 jit-log-opt-bridge:2 jit-log-compiling-bridge:2 jit-resume:84
  2  jit-log-compiling-loop:1 jit-backend-dump:78 jit-backend:3 jit-log-noopt-loop:6 jit-log-virtualstate:3 gc-minor:61 jit-tracing:3 gc-minor-walkroots:61 jit-optimize:6 jit-log-short-preamble:2 jit-backend-addr:3 jit-log-opt-loop:1 jit-mem-looptoken-alloc:3 jit-abort:3 jit-log-rewritten-bridge:2 jit-log-rewritten-loop:1 jit-log-opt-bridge:2 jit-log-compiling-bridge:2 jit-resume:84
  1  jit-log-short-preamble:2 jit-log-compiling-loop:2 jit-backend-dump:92 jit-log-noopt-loop:7 jit-log-virtualstate:3 gc-minor:61 jit-tracing:4 gc-minor-walkroots:61 jit-optimize:7 jit-backend:4 jit-backend-addr:4 jit-log-opt-loop:2 jit-mem-looptoken-alloc:4 jit-abort:3 jit-log-rewritten-bridge:2 jit-log-rewritten-loop:2 jit-log-opt-bridge:2 jit-log-compiling-bridge:2 jit-resume:104

Thanks,
/Martin

On Mon, Mar 17, 2014 at 2:23 PM, Maciej Fijalkowski <fij...@gmail.com> wrote:

On Mon, Mar 17, 2014 at 3:20 PM, Maciej Fijalkowski <fij...@gmail.com> wrote:

Are you *sure* it's the walkroots that take that long and not something else (like gc-minor)?
More of those mean that you allocate a lot more surviving objects. Can you do two things:

a) take the max of gc-minor (and gc-minor-stackwalk), per request
b) take the sum of those

and plot them.

^^^ or just paste the results, actually.

On Mon, Mar 17, 2014 at 3:18 PM, Martin Koch <m...@issuu.com> wrote:

Well, then it works out to around 2.5 GHz, which seems reasonable. But it doesn't alter the conclusion from the previous email: the slow queries all have a duration around 34*10^9 units, 'normal' queries around 1*10^9 units, or 0.4 seconds at this conversion. Also, the log shows that a slow query performs many more gc-minor operations than a 'normal' one: 9600 gc-collect-step/gc-minor/gc-minor-walkroots operations vs. 58.

So the question becomes: why do we get this large spike in gc-minor-walkroots, and, in particular, is there any way to avoid it :) ?

Thanks,
/Martin

On Mon, Mar 17, 2014 at 1:53 PM, Maciej Fijalkowski <fij...@gmail.com> wrote:

I think it's the cycles of your CPU.

On Mon, Mar 17, 2014 at 2:48 PM, Martin Koch <m...@issuu.com> wrote:

What is the unit? Perhaps I'm being thick here, but I can't correlate it with seconds (which the program does print out).
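The unit question resolves, per the thread, to raw timestamps ticking at roughly CPU frequency. A small sanity-check helper, assuming the ~2.5 GHz figure worked out above (the function name and constant are illustrative, not documented PyPy behavior):

```python
def span_seconds(start_hex, end_hex, ticks_per_second=2.5e9):
    """Convert a pair of PYPYLOG hex timestamps to seconds, assuming the
    counter ticks once per CPU cycle at ~2.5 GHz (an assumption from the
    thread, not a documented constant)."""
    return (int(end_hex, 16) - int(start_hex, 16)) / ticks_per_second
```

At this rate a 'slow' span of about 34*10^9 units comes out around 13.6 seconds, consistent with the observed ~13-second queries.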
Slow runs are around 13 seconds, but span around 34*10^9 (decimal), roughly 0x800000000, timestamp units (e.g. from 0x2b994c9d31889c to 0x2b9944ab8c4f49).

On Mon, Mar 17, 2014 at 12:09 PM, Maciej Fijalkowski <fij...@gmail.com> wrote:

The number of lines is nonsense. This is a timestamp in hex.

On Mon, Mar 17, 2014 at 12:46 PM, Martin Koch <m...@issuu.com> wrote:

Based on Maciej's suggestion, I tried the following:

PYPYLOG=- pypy mem.py 10000000 > out

This generates a logfile which looks something like this:

start-->
[2b99f1981b527e] {gc-minor
[2b99f1981ba680] {gc-minor-walkroots
[2b99f1981c2e02] gc-minor-walkroots}
[2b99f19890d750] gc-minor}
[snip]
...
<--stop

It turns out that the culprit is a lot of MINOR collections. I base this on the following observations:

I can't understand the format of the timestamp on each logline (the "[2b99f1981b527e]").
From what I can see in the code, this should be output from time.clock(), but that doesn't return a number like that when I run pypy interactively. Instead, I count the number of debug lines between start--> and the corresponding <--stop. Most runs have a few hundred lines of output between start/stop. All slow runs have very close to 57800 lines of output between start/stop. One such sample does 9609 gc-collect-step operations, 9647 gc-minor operations, and 9647 gc-minor-walkroots operations.

Thanks,
/Martin

On Mon, Mar 17, 2014 at 8:21 AM, Maciej Fijalkowski <fij...@gmail.com> wrote:

There is an environment variable PYPYLOG=gc:- (where - is stdout) which will do that for you, btw.

Maybe you can find out what that is using profiling or valgrind?

On Sun, Mar 16, 2014 at 11:34 PM, Martin Koch <m...@issuu.com> wrote:

I have tried getting the pypy source and building my own version of pypy. I have modified rpython/memory/gc/incminimark.py:major_collection_step() to print out when it starts and when it stops.
Apparently, the slow queries do NOT occur during major_collection_step; at least, I have not observed major step output during a query execution. So, apparently, something else is blocking. This could be another aspect of the GC, but it could also be anything else.

Just to be sure, I have tried running the same application in python with garbage collection disabled. I don't see the problem there, so it is somehow related to either the GC or the runtime.

Cheers,
/Martin

On Fri, Mar 14, 2014 at 4:19 PM, Martin Koch <m...@issuu.com> wrote:

We have hacked up a small sample that seems to exhibit the same issue.

We basically generate a linked list of objects. To increase connectedness, elements in the list hold references (dummy_links) to 10 randomly chosen previous elements in the list.
We then time a function that traverses 50000 elements from the list from a random start point. If the traversal reaches the end of the list, we instead traverse one of the dummy links. Thus, exactly 50K elements are traversed every time. To generate some garbage, we build a list holding the traversed elements and a dummy list of characters.

Timings for the last 100 runs are stored in a circular buffer. If the elapsed time for the last run is more than twice the average time, we print out a line with the elapsed time, the threshold, and the 90% runtime (we would like to see that the mean runtime does not increase with the number of elements in the list, but that the max time does increase (linearly with the number of objects, I guess); traversing 50K elements should be independent of the memory size).
We have tried monitoring memory consumption by external inspection, but cannot consistently verify that memory is deallocated at the same time that we see slow requests. Perhaps the pypy runtime doesn't always return freed pages back to the OS?

Using top, we observe that 10M elements allocates around 17GB after building, 20M elements 26GB, and 30M elements 28GB (growing to 35GB shortly after building).

Here is output from a few runs with different numbers of elements:

pypy mem.py 10000000
start build
end build 84.142424
that took a long time elapsed: 13.230586 slow_threshold: 1.495401 90th_quantile_runtime: 0.421558
that took a long time elapsed: 13.016531 slow_threshold: 1.488160 90th_quantile_runtime: 0.423441
that took a long time elapsed: 13.032537 slow_threshold: 1.474563 90th_quantile_runtime: 0.419817

pypy mem.py 20000000
start build
end build 180.823105
that took a long time elapsed: 27.346064 slow_threshold: 2.295146 90th_quantile_runtime: 0.434726
that took a long time elapsed: 26.028852 slow_threshold: 2.283927 90th_quantile_runtime: 0.374190
that took a long time elapsed: 25.432279 slow_threshold: 2.279631 90th_quantile_runtime: 0.371502

pypy mem.py 30000000
start build
end build 276.217811
that took a long time elapsed: 40.993855 slow_threshold: 3.188464 90th_quantile_runtime: 0.459891
that took a long time elapsed: 41.693553 slow_threshold: 3.183003 90th_quantile_runtime: 0.393654
that took a long time elapsed: 39.679769 slow_threshold: 3.190782 90th_quantile_runtime: 0.393677
that took a long time elapsed: 43.573411 slow_threshold: 3.239637 90th_quantile_runtime: 0.393654

Code below

--------------------------------------------------------------
import time
from random import randint, choice
import sys


allElems = {}

class Node:
    def __init__(self, v_):
        self.v = v_
        self.next = None
        self.dummy_data = [randint(0, 100)
                           for _ in xrange(randint(50, 100))]
        allElems[self.v] = self
        if self.v > 0:
            self.dummy_links = [allElems[randint(0, self.v - 1)]
                                for _ in xrange(10)]
        else:
            self.dummy_links = [self]

    def set_next(self, l):
        self.next = l


def follow(node):
    acc = []
    count = 0
    cur = node
    assert node.v is not None
    assert cur is not None
    while count < 50000:
        # return a value; generate some garbage
        acc.append((cur.v, [choice("abcdefghijklmnopqrstuvwxyz")
                            for x in xrange(100)]))

        # if we have reached the end, choose a random link
        cur = choice(cur.dummy_links) if cur.next is None else cur.next
        count += 1

    return acc


def build(num_elems):
    start = time.time()
    print "start build"
    root = Node(0)
    cur = root
    for x in xrange(1, num_elems):
        e = Node(x)
        cur.next = e
        cur = e
    print "end build %f" % (time.time() - start)
    return root


num_timings = 100
if __name__ == "__main__":
    num_elems = int(sys.argv[1])
    build(num_elems)
    total = 0
    timings = [0.0] * num_timings  # run times for the last num_timings runs
    i = 0
    beginning = time.time()
    while time.time() - beginning < 600:
        start = time.time()
        elem = allElems[randint(0, num_elems - 1)]
        assert elem is not None

        lst = follow(elem)

        total += choice(lst)[0]  # use the return value for something

        end = time.time()

        elapsed = end - start
        timings[i % num_timings] = elapsed
        if i > num_timings:
            slow_time = 2 * sum(timings) / num_timings  # slow defined as > 2*avg run time
            if elapsed > slow_time:
                print "that took a long time elapsed: %f slow_threshold: %f 90th_quantile_runtime: %f" % \
                    (elapsed, slow_time, sorted(timings)[int(num_timings * .9)])
        i += 1
    print total
--------------------------------------------------------------

On Thu, Mar 13, 2014 at 7:45 PM, Maciej Fijalkowski <fij...@gmail.com> wrote:

On Thu, Mar 13, 2014 at 1:45 PM, Martin Koch <m...@issuu.com> wrote:

Hi Armin, Maciej

Thanks for responding.
I'm in the process of trying to determine what (if any) of the code I'm in a position to share, and I'll get back to you.

Allowing hinting to the GC would be good. Even better would be a means to allow me to (transparently) allocate objects in unmanaged memory, but I would expect that to be a tall order :)

Thanks,
/Martin

Hi Martin.

Note that in case you want us to do the work of isolating the problem, we do offer paid support to do that (then we can sign NDAs and stuff). Otherwise we would be more than happy to fix bugs once you isolate a part you can share freely :)

_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev