I think it's the cycles of your CPU
On Mon, Mar 17, 2014 at 2:48 PM, Martin Koch <[email protected]> wrote: > What is the unit? Perhaps I'm being thick here, but I can't correlate it > with seconds (which the program does print out). Slow runs are around 13 > seconds, but are around 34*10^9(dec), 0x800000000 timestamp units (e.g. from > 0x2b994c9d31889c to 0x2b9944ab8c4f49). > > > > On Mon, Mar 17, 2014 at 12:09 PM, Maciej Fijalkowski <[email protected]> > wrote: >> >> The number of lines is nonsense. This is a timestamp in hex. >> >> On Mon, Mar 17, 2014 at 12:46 PM, Martin Koch <[email protected]> wrote: >> > Based On Maciej's suggestion, I tried the following >> > >> > PYPYLOG=- pypy mem.py 10000000 > out >> > >> > This generates a logfile which looks something like this >> > >> > start--> >> > [2b99f1981b527e] {gc-minor >> > [2b99f1981ba680] {gc-minor-walkroots >> > [2b99f1981c2e02] gc-minor-walkroots} >> > [2b99f19890d750] gc-minor} >> > [snip] >> > ... >> > <--stop >> > >> > >> > It turns out that the culprit is a lot of MINOR collections. >> > >> > I base this on the following observations: >> > >> > I can't understand the format of the timestamp on each logline (the >> > "[2b99f1981b527e]"). From what I can see in the code, this should be >> > output >> > from time.clock(), but that doesn't return a number like that when I run >> > pypy interactively >> > Instead, I count the number of debug lines between start--> and the >> > corresponding <--stop. >> > Most runs have a few hundred lines of output between start/stop >> > All slow runs have very close to 57800 lines out output between >> > start/stop >> > One such sample does 9609 gc-collect-step operations, 9647 gc-minor >> > operations, and 9647 gc-minor-walkroots operations. >> > >> > >> > Thanks, >> > /Martin >> > >> > >> > On Mon, Mar 17, 2014 at 8:21 AM, Maciej Fijalkowski <[email protected]> >> > wrote: >> >> >> >> there is an environment variable PYPYLOG=gc:- (where - is stdout) >> >> which will do that for you btw. >> >> >> >> maybe you can find out what's that using profiling or valgrind? >> >> >> >> On Sun, Mar 16, 2014 at 11:34 PM, Martin Koch <[email protected]> wrote: >> >> > I have tried getting the pypy source and building my own version of >> >> > pypy. I >> >> > have modified >> >> > rpython/memory/gc/incminimark.py:major_collection_step() >> >> > to >> >> > print out when it starts and when it stops. Apparently, the slow >> >> > queries >> >> > do >> >> > NOT occur during major_collection_step; at least, I have not observed >> >> > major >> >> > step output during a query execution. So, apparently, something else >> >> > is >> >> > blocking. This could be another aspect of the GC, but it could also >> >> > be >> >> > anything else. >> >> > >> >> > Just to be sure, I have tried running the same application in python >> >> > with >> >> > garbage collection disabled. I don't see the problem there, so it is >> >> > somehow >> >> > related to either GC or the runtime somehow. >> >> > >> >> > Cheers, >> >> > /Martin >> >> > >> >> > >> >> > On Fri, Mar 14, 2014 at 4:19 PM, Martin Koch <[email protected]> wrote: >> >> >> >> >> >> We have hacked up a small sample that seems to exhibit the same >> >> >> issue. >> >> >> >> >> >> We basically generate a linked list of objects. To increase >> >> >> connectedness, >> >> >> elements in the list hold references (dummy_links) to 10 randomly >> >> >> chosen >> >> >> previous elements in the list. >> >> >> >> >> >> We then time a function that traverses 50000 elements from the list >> >> >> from a >> >> >> random start point. If the traversal reaches the end of the list, we >> >> >> instead >> >> >> traverse one of the dummy links. Thus, exactly 50K elements are >> >> >> traversed >> >> >> every time. To generate some garbage, we build a list holding the >> >> >> traversed >> >> >> elements and a dummy list of characters. >> >> >> >> >> >> Timings for the last 100 runs are stored in a circular buffer. If >> >> >> the >> >> >> elapsed time for the last run is more than twice the average time, >> >> >> we >> >> >> print >> >> >> out a line with the elapsed time, the threshold, and the 90% runtime >> >> >> (we >> >> >> would like to see that the mean runtime does not increase with the >> >> >> number of >> >> >> elements in the list, but that the max time does increase (linearly >> >> >> with the >> >> >> number of object, i guess); traversing 50K elements should be >> >> >> independent of >> >> >> the memory size). >> >> >> >> >> >> We have tried monitoring memory consumption by external inspection, >> >> >> but >> >> >> cannot consistently verify that memory is deallocated at the same >> >> >> time >> >> >> that >> >> >> we see slow requests. Perhaps the pypy runtime doesn't always return >> >> >> freed >> >> >> pages back to the OS? >> >> >> >> >> >> Using top, we observe that 10M elements allocates around 17GB after >> >> >> building, 20M elements 26GB, 30M elements 28GB (and grows to 35GB >> >> >> shortly >> >> >> after building). >> >> >> >> >> >> Here is output from a few runs with different number of elements: >> >> >> >> >> >> >> >> >> pypy mem.py 10000000 >> >> >> start build >> >> >> end build 84.142424 >> >> >> that took a long time elapsed: 13.230586 slow_threshold: 1.495401 >> >> >> 90th_quantile_runtime: 0.421558 >> >> >> that took a long time elapsed: 13.016531 slow_threshold: 1.488160 >> >> >> 90th_quantile_runtime: 0.423441 >> >> >> that took a long time elapsed: 13.032537 slow_threshold: 1.474563 >> >> >> 90th_quantile_runtime: 0.419817 >> >> >> >> >> >> pypy mem.py 20000000 >> >> >> start build >> >> >> end build 180.823105 >> >> >> that took a long time elapsed: 27.346064 slow_threshold: 2.295146 >> >> >> 90th_quantile_runtime: 0.434726 >> >> >> that took a long time elapsed: 26.028852 slow_threshold: 2.283927 >> >> >> 90th_quantile_runtime: 0.374190 >> >> >> that took a long time elapsed: 25.432279 slow_threshold: 2.279631 >> >> >> 90th_quantile_runtime: 0.371502 >> >> >> >> >> >> pypy mem.py 30000000 >> >> >> start build >> >> >> end build 276.217811 >> >> >> that took a long time elapsed: 40.993855 slow_threshold: 3.188464 >> >> >> 90th_quantile_runtime: 0.459891 >> >> >> that took a long time elapsed: 41.693553 slow_threshold: 3.183003 >> >> >> 90th_quantile_runtime: 0.393654 >> >> >> that took a long time elapsed: 39.679769 slow_threshold: 3.190782 >> >> >> 90th_quantile_runtime: 0.393677 >> >> >> that took a long time elapsed: 43.573411 slow_threshold: 3.239637 >> >> >> 90th_quantile_runtime: 0.393654 >> >> >> >> >> >> Code below >> >> >> -------------------------------------------------------------- >> >> >> import time >> >> >> from random import randint, choice >> >> >> import sys >> >> >> >> >> >> >> >> >> allElems = {} >> >> >> >> >> >> class Node: >> >> >> def __init__(self, v_): >> >> >> self.v = v_ >> >> >> self.next = None >> >> >> self.dummy_data = [randint(0,100) >> >> >> for _ in xrange(randint(50,100))] >> >> >> allElems[self.v] = self >> >> >> if self.v > 0: >> >> >> self.dummy_links = [allElems[randint(0, self.v-1)] for _ >> >> >> in >> >> >> xrange(10)] >> >> >> else: >> >> >> self.dummy_links = [self] >> >> >> >> >> >> def set_next(self, l): >> >> >> self.next = l >> >> >> >> >> >> >> >> >> def follow(node): >> >> >> acc = [] >> >> >> count = 0 >> >> >> cur = node >> >> >> assert node.v is not None >> >> >> assert cur is not None >> >> >> while count < 50000: >> >> >> # return a value; generate some garbage >> >> >> acc.append((cur.v, [choice("abcdefghijklmnopqrstuvwxyz") for >> >> >> x >> >> >> in >> >> >> xrange(100)])) >> >> >> >> >> >> # if we have reached the end, chose a random link >> >> >> cur = choice(cur.dummy_links) if cur.next is None else >> >> >> cur.next >> >> >> count += 1 >> >> >> >> >> >> return acc >> >> >> >> >> >> >> >> >> def build(num_elems): >> >> >> start = time.time() >> >> >> print "start build" >> >> >> root = Node(0) >> >> >> cur = root >> >> >> for x in xrange(1, num_elems): >> >> >> e = Node(x) >> >> >> cur.next = e >> >> >> cur = e >> >> >> print "end build %f" % (time.time() - start) >> >> >> return root >> >> >> >> >> >> >> >> >> num_timings = 100 >> >> >> if __name__ == "__main__": >> >> >> num_elems = int(sys.argv[1]) >> >> >> build(num_elems) >> >> >> total = 0 >> >> >> timings = [0.0] * num_timings # run times for the last >> >> >> num_timings >> >> >> runs >> >> >> i = 0 >> >> >> beginning = time.time() >> >> >> while time.time() - beginning < 600: >> >> >> start = time.time() >> >> >> elem = allElems[randint(0, num_elems - 1)] >> >> >> assert(elem is not None) >> >> >> >> >> >> lst = follow(elem) >> >> >> >> >> >> total += choice(lst)[0] # use the return value for something >> >> >> >> >> >> end = time.time() >> >> >> >> >> >> elapsed = end-start >> >> >> timings[i % num_timings] = elapsed >> >> >> if (i > num_timings): >> >> >> slow_time = 2 * sum(timings)/num_timings # slow defined >> >> >> as >> >> >> > >> >> >> 2*avg run time >> >> >> if (elapsed > slow_time): >> >> >> print "that took a long time elapsed: %f >> >> >> slow_threshold: >> >> >> %f 90th_quantile_runtime: %f" % \ >> >> >> (elapsed, slow_time, >> >> >> sorted(timings)[int(num_timings*.9)]) >> >> >> i += 1 >> >> >> print total >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> On Thu, Mar 13, 2014 at 7:45 PM, Maciej Fijalkowski >> >> >> <[email protected]> >> >> >> wrote: >> >> >>> >> >> >>> On Thu, Mar 13, 2014 at 1:45 PM, Martin Koch <[email protected]> wrote: >> >> >>> > Hi Armin, Maciej >> >> >>> > >> >> >>> > Thanks for responding. >> >> >>> > >> >> >>> > I'm in the process of trying to determine what (if any) of the >> >> >>> > code >> >> >>> > I'm >> >> >>> > in a >> >> >>> > position to share, and I'll get back to you. >> >> >>> > >> >> >>> > Allowing hinting to the GC would be good. Even better would be a >> >> >>> > means >> >> >>> > to >> >> >>> > allow me to (transparently) allocate objects in unmanaged memory, >> >> >>> > but I >> >> >>> > would expect that to be a tall order :) >> >> >>> > >> >> >>> > Thanks, >> >> >>> > /Martin >> >> >>> >> >> >>> Hi Martin. >> >> >>> >> >> >>> Note that in case you want us to do the work of isolating the >> >> >>> problem, >> >> >>> we do offer paid support to do that (then we can sign NDAs and >> >> >>> stuff). >> >> >>> Otherwise we would be more than happy to fix bugs once you isolate >> >> >>> a >> >> >>> part you can share freely :) >> >> >> >> >> >> >> >> > >> > >> > > > _______________________________________________ pypy-dev mailing list [email protected] https://mail.python.org/mailman/listinfo/pypy-dev
