Hi, this is the twelfth weekly report on my Summer of Code project 'Provide some metrics in Debile'[1].
(Previous report: http://lists.alioth.debian.org/pipermail/soc-coordination/2014-August/002253.html)

uniquify
--------

My implementation of uniquify is finally functional. I bypassed the ORM entirely so I could issue "insert into... where not exists..." statements (this took most of last week).

With my usual test firehose file, the new uniquify is twice as fast. With larger files, the time increases linearly, which is much better than the original implementation. For example, a file twice as large as my initial test file takes 2x the time instead of 4x. I haven't tested the original uniquify on much larger files, since it would have taken too much time on my laptop.

debile-incoming is now able to import the huge firehose files (10-40MB) that would originally make it go OOM after a dozen hours of work.

There is still room for improvement, as suggested by the call graph[2] generated by the Python profiler and gprof2dot[3]. Indeed, only 20% of the time is spent executing SQL statements, while 60% is spent creating those queries. However, since it is usable, I'm now focusing on launching the full rebuild, since the end of the GSoC is approaching very fast.

full rebuild
------------

When I did what was supposed to be the last test build before launching the full rebuild, we realized that a new release of dpkg broke our use of sbuild[4]. I've started looking into sbuild to patch it, but I'm still not sure exactly how to fix that.

Thanks for reading,

-------
Clément

[1] https://wiki.debian.org/SummerOfCode2014/StudentApplications/ClementSchreiner
[2] http://www.mux.me/debile/profile.png
[3] https://code.google.com/p/jrfonseca/wiki/Gprof2Dot
[4] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757795
