[gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis)
Hi,

On 11/06/2014 02:43 PM, Ciaran McCreesh wrote:
> If you're going to go the toolkit route, you should be using a CP solver, not a SAT solver. But even then you'd be better off making some changes and not using plain old MAC, so you're back to writing the algorithms yourself.
>
> What you need is for someone who understands CP and SAT to write a resolver using algorithms inspired by how CP and SAT solvers work, but not just blindly copying them. Doing this well is at least a full year Masters level project...

Yeah, you are right.

What I am interested in is an overview of what algorithm we are using now. Do we have any documentation about it? I would really like to look at a concise document rather than the sources.

Also, maybe we need to discuss how we can improve it, as at the moment it seems to me to be one of the biggest problems with Gentoo. And afaik paludis does not solve it (or am I wrong?)

--
Jauhien
Re: [gentoo-dev] Regarding my final year thesis
On 07/11/14 06:06, Harsh Bhatt wrote:
> This idea seems a bit interesting, about how the bug tracker works. Here I just need to confirm how much of a mathematical aspect can be included.

It's a good idea to work on. Also, make might enjoy improvements.

lu
Re: [gentoo-dev] RFC: future.eclass
On Thu, Nov 6, 2014 at 5:09 PM, Andreas K. Huettel dilfri...@gentoo.org wrote:
> On Thursday, 6 November 2014, 22:56:21, Rich Freeman wrote:
>> I think we are well-served by taking Ciaran's advice here. Utility eclasses should just passively export functions. Anything that does overrides should really be designed for special situations and not widespread use where it would potentially conflict with other eclasses that do the same.
>>
>> So, a KDE all-in-one eclass might not be bad. A perl all-in-one eclass would be more troublesome,
>
> Bad example. :) We have ca. 1800 packages in the portage tree inheriting perl-module.eclass and most of them do not declare any phases themselves but just inherit eclass phases. Which works fine and reduces most ebuilds to a bare minimum.

I don't see perl MODULES as being a bad use of this, but an all-in-one eclass that was intended for packages that were written (partially or totally) in perl would not be a good thing IMO.

The problem comes when you get into situations where the perl gurus wanted a fancy eclass, and the python maintainers wanted a fancy eclass, and the games maintainers wanted a fancy eclass, and your package is a game that includes some files written in python and perl.

When you have a bunch of packages that tend to come from the same upstream with the same development/release/packaging practices then sure, an all-in-one can make the ebuilds a lot cleaner.

I think we're on the same page in any case.

--
Rich
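The conflict Rich describes can be illustrated with a small model. This is only a sketch, written in Python rather than in ebuild shell, and it assumes the usual rule that when several inherited eclasses export the same phase function, the one inherited last takes effect. The eclass names (`perl-fancy`, `python-fancy`, `games-fancy`) and the helper `effective_phases` are hypothetical and exist only for this example.

```python
def effective_phases(inherit_order, exported):
    """Map each phase name to the eclass whose version ends up in effect.

    Assumption: the last inherited eclass that exports a phase wins.
    """
    winner = {}
    for eclass in inherit_order:
        for phase in exported.get(eclass, ()):
            winner[phase] = eclass  # later eclasses overwrite earlier ones
    return winner


# Hypothetical eclasses, named only for illustration.
exported = {
    "perl-fancy": ["src_compile", "src_install"],
    "python-fancy": ["src_compile", "src_install"],
    "games-fancy": ["src_install"],
}

phases = effective_phases(["perl-fancy", "python-fancy", "games-fancy"], exported)
print(phases)
```

Under that assumption, a game written partly in perl and python silently loses the perl eclass's phases, and changing the inherit order changes which eclass drives the build. Eclasses that only provide helper functions do not have this problem, because nothing is overridden.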
Re: [gentoo-dev] RFC: future.eclass
On Thu, Nov 6, 2014 at 5:03 PM, Zac Medico zmed...@gentoo.org wrote: On 11/06/2014 01:53 PM, Rich Freeman wrote: On Thu, Nov 6, 2014 at 3:11 PM, Michał Górny mgo...@gentoo.org wrote: # This eclass contains backports of functions that were accepted # by the Council for the EAPI following the EAPI used by ebuild, # and can be implemented in pure shell script. I'm not sure that I like this sort of a moving-target definition. When EAPI6 is out, do you intend to have the eclass die at some point for any packages using EAPI5? We should be able to simply migrate consumers to the new EAPI, then deprecate future.eclass. Deprecate it? But what about providing EAPI7 support for EAPI6 packages? The description doesn't say that the eclass is intended to provide EAPI6 support for EAPI5 packages - it says that it is intended to provide EAPIn+1 support for EAPIn packages. Of course, this approach tends to make the assumption that EAPIs are orderable, which isn't actually something anybody has committed to as far as I'm aware. The next EAPI /could/ be named webapp-1 and only be used for web applications. Granted, there have been no plans to date to deviate from the linear EAPI history we've maintained so far. I'm still concerned that in general we tend to have packages hang around at older EAPIs for a long time as it is. That isn't really a problem if those EAPIs are stable and supported for a while. This seems likely to complicate things. There is no guarantee that moving to the actual new EAPI won't break something, and packages that don't move become blockers for the eclass being able to move on to the next EAPI. -- Rich
Re: [gentoo-dev] [RFC] =udev-217 or =eudev-2.1 upgrade news item
On 29/10/14 13:42, Alex Xu wrote: On 29/10/14 07:28 AM, Samuli Suominen wrote: request for review before committing, suggestions welcome since it's rather short what i got to say thanks, Samuli typical news items are in the format packages no longer/now do thing. [thing is description of thing.] if you need thing, do steps. if you do not need thing, do steps. [blah blah metadata, I hereby assign all copyright for the following text to the Gentoo Foundation] sys-fs/udev-217 and sys-fs/eudev-2.1 no longer provide a userspace firmware loader. If you require firmware loading support, you must use kernel 3.7 or greater with CONFIG_FW_LOADER_USER_HELPER=n. No action is required if none of your kernel modules need firmware. See [1] for more information on the upgrade. [1]: https://wiki.gentoo.org/wiki/Udev/upgrade#udev_216_to_217 The news item has been committed today. :-) Sorry for the delay. I'm running out of excuses with my health issues. :-( Thanks and sorry, Samuli
Re: [gentoo-dev] [RFC] =udev-217 or =eudev-2.1 upgrade news item
On 07/11/14 07:13 AM, Samuli Suominen wrote:
> On 29/10/14 13:42, Alex Xu wrote:
>> On 29/10/14 07:28 AM, Samuli Suominen wrote:
>>> request for review before committing, suggestions welcome since it's rather short what i got to say
>>>
>>> thanks, Samuli
>>
>> typical news items are in the format packages no longer/now do thing. [thing is description of thing.] if you need thing, do steps. if you do not need thing, do steps.
>>
>> [blah blah metadata, I hereby assign all copyright for the following text to the Gentoo Foundation]
>>
>> sys-fs/udev-217 and sys-fs/eudev-2.1 no longer provide a userspace firmware loader. If you require firmware loading support, you must use kernel 3.7 or greater with CONFIG_FW_LOADER_USER_HELPER=n. No action is required if none of your kernel modules need firmware. See [1] for more information on the upgrade.
>>
>> [1]: https://wiki.gentoo.org/wiki/Udev/upgrade#udev_216_to_217
>
> The news item has been committed today. :-) Sorry for the delay. I'm running out of excuses with my health issues. :-(
>
> Thanks and sorry, Samuli

oh, I just figured something. what about systemd? looks like IUSE=firmware-loader was removed in 217.
Re: [gentoo-dev] [RFC] =udev-217 or =eudev-2.1 upgrade news item
On 07/11/14 14:21, Alex Xu wrote:
> On 07/11/14 07:13 AM, Samuli Suominen wrote:
>> On 29/10/14 13:42, Alex Xu wrote:
>>> On 29/10/14 07:28 AM, Samuli Suominen wrote:
>>>> request for review before committing, suggestions welcome since it's rather short what i got to say
>>>>
>>>> thanks, Samuli
>>>
>>> typical news items are in the format packages no longer/now do thing. [thing is description of thing.] if you need thing, do steps. if you do not need thing, do steps.
>>>
>>> [blah blah metadata, I hereby assign all copyright for the following text to the Gentoo Foundation]
>>>
>>> sys-fs/udev-217 and sys-fs/eudev-2.1 no longer provide a userspace firmware loader. If you require firmware loading support, you must use kernel 3.7 or greater with CONFIG_FW_LOADER_USER_HELPER=n. No action is required if none of your kernel modules need firmware. See [1] for more information on the upgrade.
>>>
>>> [1]: https://wiki.gentoo.org/wiki/Udev/upgrade#udev_216_to_217
>>
>> The news item has been committed today. :-) Sorry for the delay. I'm running out of excuses with my health issues. :-(
>>
>> Thanks and sorry, Samuli
>
> oh, I just figured something. what about systemd? looks like IUSE=firmware-loader was removed in 217.

Linux 3.7 has been the minimum requirement for systemd for quite a while now, and I consider systemd in Gentoo still to be more of a work-in-progress.

And someone agreed with me on #gentoo-systemd, Freenode, that it's not necessary to include it, so I didn't.

However, if the maintainers want to add it, that's fine by me; it's easy to add one line to the news item... I'll CC them on this mail just in case.

As in, noted

- Samuli
Re: [gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis)
On 11/07/2014 01:42 AM, Jauhien Piatlicki wrote:
> Hi,
>
> On 11/06/2014 02:43 PM, Ciaran McCreesh wrote:
>> If you're going to go the toolkit route, you should be using a CP solver, not a SAT solver. But even then you'd be better off making some changes and not using plain old MAC, so you're back to writing the algorithms yourself.
>>
>> What you need is for someone who understands CP and SAT to write a resolver using algorithms inspired by how CP and SAT solvers work, but not just blindly copying them. Doing this well is at least a full year Masters level project...
>
> Yeah, you are right.
>
> What I am interested in is an overview of what algorithm we are using now. Do we have any documentation about it? I would really like to look at a concise document rather than the sources.

If you install sys-apps/portage with USE=doc, it includes this documentation, which gives an overview of portage's dependency resolver algorithms:

http://dev.gentoo.org/~zmedico/portage/doc/pt02.html

> Also, maybe we need to discuss how we can improve it, as at the moment it seems to me to be one of the biggest problems with Gentoo. And afaik paludis does not solve it (or am I wrong?)

--
Thanks,
Zac
Re: [gentoo-dev] RFC: future.eclass
On 11/07/2014 03:13 AM, Rich Freeman wrote: On Thu, Nov 6, 2014 at 5:03 PM, Zac Medico zmed...@gentoo.org wrote: On 11/06/2014 01:53 PM, Rich Freeman wrote: On Thu, Nov 6, 2014 at 3:11 PM, Michał Górny mgo...@gentoo.org wrote: # This eclass contains backports of functions that were accepted # by the Council for the EAPI following the EAPI used by ebuild, # and can be implemented in pure shell script. I'm not sure that I like this sort of a moving-target definition. When EAPI6 is out, do you intend to have the eclass die at some point for any packages using EAPI5? We should be able to simply migrate consumers to the new EAPI, then deprecate future.eclass. Deprecate it? But what about providing EAPI7 support for EAPI6 packages? The description doesn't say that the eclass is intended to provide EAPI6 support for EAPI5 packages - it says that it is intended to provide EAPIn+1 support for EAPIn packages. Okay, then we could number the future eclasses by EAPI. Like future-eapi-6, future-eapi-7, and so on. Of course, this approach tends to make the assumption that EAPIs are orderable, which isn't actually something anybody has committed to as far as I'm aware. The next EAPI /could/ be named webapp-1 and only be used for web applications. Granted, there have been no plans to date to deviate from the linear EAPI history we've maintained so far. We could also add a future-eapi-webapp-1 eclass. I'm still concerned that in general we tend to have packages hang around at older EAPIs for a long time as it is. That isn't really a problem if those EAPIs are stable and supported for a while. This seems likely to complicate things. Sure, it could. However, it should be pretty manageable if we use a separate future eclass for each EAPI. There is no guarantee that moving to the actual new EAPI won't break something, and packages that don't move become blockers for the eclass being able to move on to the next EAPI. -- Thanks, Zac
Re: [gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis)
On Fri, 07 Nov 2014 10:42:39 +0100 Jauhien Piatlicki jauh...@gentoo.org wrote:
> Also, maybe we need to discuss how we can improve it, as at the moment it seems to me to be one of the biggest problems with Gentoo. And afaik paludis does not solve it (or am I wrong?)

Paludis solves it. However, Paludis will only ever produce a correct resolution, which can be a problem since ebuild dependencies are often garbage...

--
Ciaran McCreesh
Re: [gentoo-dev] RFC: future.eclass
On Fri, Nov 7, 2014 at 1:01 PM, Zac Medico zmed...@gentoo.org wrote: I'm still concerned that in general we tend to have packages hang around at older EAPIs for a long time as it is. That isn't really a problem if those EAPIs are stable and supported for a while. This seems likely to complicate things. Sure, it could. However, it should be pretty manageable if we use a separate future eclass for each EAPI. I am still a bit uneasy, but I definitely agree that if we do this I'd much rather see a series of versioned eclasses than an eclass whose functionality changes in place over time. Ulm's point still exists that technically EAPI6 isn't actually approved yet, in part because the agreement was that nothing gets approved for good without a reference implementation in portage. So, there is some risk that it could change, which might mean that ebuilds that use future.eclass would need more work when moving them to an EAPI that no longer contains the function they call. That said, the whole point of the council vote was to avoid having the PM teams spending time on features that were going to get voted out at the last minute. Assuming that all goes as planned the actual PMS vote should be a formality, but you know how plans go... :) -- Rich
Re: [gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis)
On 11/07/2014 07:07 PM, Ciaran McCreesh wrote:
> On Fri, 07 Nov 2014 10:42:39 +0100 Jauhien Piatlicki jauh...@gentoo.org wrote:
>> Also, maybe we need to discuss how we can improve it, as at the moment it seems to me to be one of the biggest problems with Gentoo. And afaik paludis does not solve it (or am I wrong?)
>
> Paludis solves it. However, Paludis will only ever produce a correct resolution, which can be a problem since ebuild dependencies are often garbage...

Then the same question for you: where can one read about the algorithm Paludis uses?

And, again, I have heard (did not try myself) that Paludis is as slow as Portage.

--
Jauhien
Re: [gentoo-dev] Portage dependency solving algorithm (WAS: Regarding my final year thesis)
On Fri, 07 Nov 2014 19:11:04 +0100 Jauhien Piatlicki jauh...@gentoo.org wrote:
> Then the same question for you: where can one read about the algorithm Paludis uses?

It's basically a two stage process: simple constraint solving using value ordering heuristics to enforce "don't do unnecessary work", then ordering (which is not quite a graph process, and which is not as simple as a topological sort, because the tree is full of circular dependencies).

But the interesting question isn't "what's the algorithm?", it's "what's the model?". That's where the complexity lies: figuring out how to turn *DEPEND specifications into constraints is an utter pain, and it isn't clean or easily understandable. The primary reason is || dependencies: developers like to write not-really-correct and utterly unobvious dependency strings rather than asking for new syntax so they can just say what they mean...

> And, again, I have heard (did not try myself) that Paludis is as slow as Portage.

Well, you're not comparing like with like. Paludis with everything turned off does more than Portage with everything turned on. If all you're looking for is the wrong answer as fast as possible, there are easier ways of getting it...

--
Ciaran McCreesh
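To see why a plain topological sort is not enough for the ordering stage, here is a small textbook sketch. It is not the Paludis algorithm (the message says the real process is not quite a graph process); it only shows one standard way to cope with cycles: group mutually dependent packages with Tarjan's strongly connected components algorithm, then emit the groups with dependencies first. The package names and the function `install_order` are made up for the example.

```python
def install_order(deps):
    """Order packages so that dependencies come before their dependents.

    deps maps a package to the list of packages it depends on.
    Returns a list of groups; a group with more than one member is a
    dependency cycle that a plain topological sort could not order.
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    groups = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = low[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for dep in deps.get(node, ()):
            if dep not in index:
                visit(dep)
                low[node] = min(low[node], low[dep])
            elif dep in on_stack:
                low[node] = min(low[node], index[dep])
        if low[node] == index[node]:
            # node is the root of a component: pop its members off the stack
            group = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                group.append(member)
                if member == node:
                    break
            groups.append(sorted(group))

    for node in sorted(deps):
        if node not in index:
            visit(node)
    return groups


# Hypothetical packages: libfoo and libbar depend on each other.
example = {
    "app": ["libfoo"],
    "libfoo": ["libbar"],
    "libbar": ["libfoo"],
    "tool": [],
}
order = install_order(example)
print(order)
```

The cycle comes out as a single group that still has to be broken by some other rule (for example, by deciding which dependency is only needed at run time), which is where the real complexity starts.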
Re: [gentoo-dev] RFC: future.eclass
On Fri, 7 Nov 2014, Rich Freeman wrote:
> I am still a bit uneasy, but I definitely agree that if we do this I'd much rather see a series of versioned eclasses than an eclass whose functionality changes in place over time.
>
> Ulm's point still exists that technically EAPI6 isn't actually approved yet, in part because the agreement was that nothing gets approved for good without a reference implementation in portage. So, there is some risk that it could change, which might mean that ebuilds that use future.eclass would need more work when moving them to an EAPI that no longer contains the function they call.
>
> That said, the whole point of the council vote was to avoid having the PM teams spending time on features that were going to get voted out at the last minute. Assuming that all goes as planned the actual PMS vote should be a formality, but you know how plans go... :)

I had thought that the lesson from premature implementation of the einstalldocs function in an eclass had been learned. There we have the problem that the eclass function is incompatible with what will be implemented in the package manager.

Now we will have a third implementation of einstalldocs, along with a third implementation of the patch applying function. (The whole point of eapply is that it will be implemented in the PM; in eclasses we already have epatch which is more sophisticated.)

Also I still don't see what problem future.eclass would solve. It doesn't save the EAPI bump, so the maintainer will have to update the ebuild twice, users will have to rebuild the package twice, and arch teams will have to stabilise twice.

Besides, an eclass like this would also undermine the council's and QA team's efforts to keep the number of EAPIs in tree limited.

Ulrich
Re: [gentoo-dev] Portage dependency solving algorithm
On 07 Nov 2014, 19:30, Ciaran McCreesh ciaran.mccre...@googlemail.com wrote:
> On Fri, 07 Nov 2014 19:11:04 +0100 Jauhien Piatlicki jauh...@gentoo.org wrote:
>> Then the same question for you: where can one read about the algorithm Paludis uses?
>
> It's basically a two stage process: simple constraint solving using value ordering heuristics to enforce "don't do unnecessary work", then ordering (which is not quite a graph process, and which is not as simple as a topological sort, because the tree is full of circular dependencies).
>
> But the interesting question isn't "what's the algorithm?", it's "what's the model?". That's where the complexity lies: figuring out how to turn *DEPEND specifications into constraints is an utter pain, and it isn't clean or easily understandable. The primary reason is || dependencies: developers like to write not-really-correct and utterly unobvious dependency strings rather than asking for new syntax so they can just say what they mean...

Currently, it takes portage around 1 minute just to decide that nothing has to be done on my machine. What is, in your opinion, the main reason for this? And how can we bring this down to a reasonable speed?

- Is our dependency model that much more complex than the problems that resolvers of other package managers for other distributions solve?
- Is it the algorithm that is implemented for the dependency model?
- Is it its implementation?

>> And, again, I have heard (did not try myself) that Paludis is as slow as Portage.
>
> Well, you're not comparing like with like. Paludis with everything turned off does more than Portage with everything turned on. If all you're looking for is the wrong answer as fast as possible, there are easier ways of getting it...

The last time I compared the resolver speed of portage and paludis, both needed almost the same time. Do you have a speed comparison with a similar feature set for both? (Or, alternatively, the speedup one gains by tuning paludis to be as fast as possible.)

Best,
Matthias
[gentoo-dev] RFC: new QA_NEEDED variable for files installed by pre-built binary packages with broken soname dependencies
Hi, In bug 528086 [1] we have a pre-built games package with a soname dependency on libSDL_mixer-1.2.so.0. The maintainer reports that the game works fine without this library, so he doesn't want to add a dependency on sdl-mixer. In order to satisfy this unneeded soname dependency, preserve-libs will preserve libSDL_mixer-1.2.so.0 when sdl-mixer is uninstalled. So, it would be nice if we had a way to tell preserve-libs not to satisfy unneeded soname dependencies. I would prefer not to ignore soname dependencies for all pre-built files, since some pre-built files may have library dependencies which are considered valid. Therefore, I suggest that we add a QA_NEEDED variable so that specific files with unneeded soname dependencies can be distinguished from files with soname dependencies that are actually needed. [1] https://bugs.gentoo.org/show_bug.cgi?id=528086 -- Thanks, Zac
Re: [gentoo-dev] Portage dependency solving algorithm
On 11/07/2014 07:54 PM, Matthias Maier wrote: Well, you're not comparing like with like. Paludis with everything turned off does more than Portage with everything turned on. If all you're looking for is the wrong answer as fast as possible, there are easier ways of getting it... The last time I compared the resolver speed of portage and paludis both needed almost the same time. Do you have a speed comparison with a similar feature set of both? (Or, alternatively, the speedup one gains by tuning paludis to be as fast as possible). I think you didn't get the idea: it doesn't make much sense to compare the speed if the correctness differs. Also, I don't understand these discussions. The time dependency resolving takes is marginal compared to the whole update process, no matter what PM you use.
Re: [gentoo-dev] RFC: new QA_NEEDED variable for files installed by pre-built binary packages with broken soname dependencies
On 7 November 2014 13:04, Zac Medico zmed...@gentoo.org wrote: In bug 528086 [1] we have a pre-built games package with a soname dependency on libSDL_mixer-1.2.so.0. The maintainer reports that the game works fine without this library, so he doesn't want to add a dependency on sdl-mixer. Ehm no this is absolutely ludicrously a bad idea. If the library is in NEEDED of binary, there is no need that said binary will run. If whatever package works fine without that binary being able to run, maybe the binary should not be installed. Just shoving this under the rug is preposterous. Diego Elio Pettenò — Flameeyes flamee...@flameeyes.eu — http://blog.flameeyes.eu/
Re: [gentoo-dev] Portage dependency solving algorithm
On Fri, 07 Nov 2014 19:54:08 +0100 Matthias Maier tam...@gentoo.org wrote:
> Currently, for portage just to decide that nothing has to be done on my machine takes around 1 minute.

Are you running with or without metadata cache? If you're running without, it's going to be slow independently of the resolution algorithm... If you're not:

> - Is our dependency model that more complex than the problem resolvers of other package managers for other distributions solve?

Yes, massively so.

> - Is it the algorithm that is implemented for the dependency model?

Also a contributing factor, for certain cases. You may see Portage doing a lot of backtracking sometimes. There's a much better typical-case algorithm for this.

> - Is it its implementation?

Also a factor.

The main issue, though, is that getting a good resolution out of crappy data is extremely difficult. There's the Babbage quote:

| On two occasions I have been asked, — Pray, Mr. Babbage, if you put
| into the machine wrong figures, will the right answers come out? In
| one case a member of the Upper, and in the other a member of the
| Lower, House put this question. I am not able rightly to apprehend
| the kind of confusion of ideas that could provoke such a question.

Yet this is *exactly* what a dependency resolver has to do for Gentoo, and it's why dependency resolvers are so complicated.

(For comparison, Paludis on Exherbo will run an order of magnitude faster for the same set of installed packages, simply because on Exherbo the input is correct.)

>> Well, you're not comparing like with like. Paludis with everything turned off does more than Portage with everything turned on. If all you're looking for is the wrong answer as fast as possible, there are easier ways of getting it...
>
> The last time I compared the resolver speed of portage and paludis both needed almost the same time.

To do different things, though. Portage doesn't have a "produce a correct resolution" switch. Paludis doesn't (really) have a "produce an illegal resolution" switch.

(Again, assuming you have metadata cache. If you don't, whole other story.)

--
Ciaran McCreesh
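The backtracking mentioned in this message, and the "don't do unnecessary work" value ordering from earlier in the thread, can be shown on a toy model. The sketch below is not how Portage or Paludis represent dependencies; it assumes a drastically simplified world in which every dependency is an any-of group (in the spirit of `|| ( a b )`) and blockers are plain pairs of packages. The function `solve` and all package names are hypothetical.

```python
def solve(groups, conflicts, installed=frozenset(), chosen=()):
    """Pick one package from each any-of group without violating a conflict.

    groups:    list of lists of package names, one list per any-of group
    conflicts: set of frozenset({x, y}) pairs that cannot be installed together
    installed: packages already present; they are tried first
    Returns a tuple with one choice per group, or None if nothing fits.
    """
    if not groups:
        return chosen
    first, rest = groups[0], groups[1:]
    # Value ordering: already installed candidates first (stable sort keeps
    # the written order otherwise), so no unnecessary work is scheduled.
    candidates = sorted(first, key=lambda pkg: pkg not in installed)
    for pkg in candidates:
        if any(frozenset((pkg, other)) in conflicts for other in chosen):
            continue
        result = solve(rest, conflicts, installed, chosen + (pkg,))
        if result is not None:
            return result
    return None  # dead end: the caller backtracks to its next candidate


groups = [["a", "b"], ["c", "d"]]
conflicts = {frozenset(("a", "c")), frozenset(("a", "d"))}

plain = solve(groups, conflicts)
with_installed = solve(groups, conflicts, installed=frozenset({"d"}))
print(plain, with_installed)
```

Here the first choice `a` turns out to be a dead end only after the second group has been examined, so the solver has to back up and try `b`. With thousands of packages and imprecise dependency data, such dead ends are where the time goes.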
Re: [gentoo-dev] RFC: new QA_NEEDED variable for files installed by pre-built binary packages with broken soname dependencies
On 11/07/2014 11:18 AM, Diego Elio Pettenò wrote: On 7 November 2014 13:04, Zac Medico zmed...@gentoo.org wrote: In bug 528086 [1] we have a pre-built games package with a soname dependency on libSDL_mixer-1.2.so.0. The maintainer reports that the game works fine without this library, so he doesn't want to add a dependency on sdl-mixer. Ehm no this is absolutely ludicrously a bad idea. If the library is in NEEDED of binary, there is no need that said binary will run. If whatever package works fine without that binary being able to run, maybe the binary should not be installed. Just shoving this under the rug is preposterous. Yeah, I figured that we'd get a reaction like this. I just thought I'd start by proposing some sort of compromise, and then let others fight it out. :) -- Thanks, Zac
Re: [gentoo-dev] Portage dependency solving algorithm
On 11/07/2014 08:08 PM, hasufell wrote:
> On 11/07/2014 07:54 PM, Matthias Maier wrote:
>>> Well, you're not comparing like with like. Paludis with everything turned off does more than Portage with everything turned on. If all you're looking for is the wrong answer as fast as possible, there are easier ways of getting it...
>>
>> The last time I compared the resolver speed of portage and paludis both needed almost the same time. Do you have a speed comparison with a similar feature set of both? (Or, alternatively, the speedup one gains by tuning paludis to be as fast as possible).
>
> I think you didn't get the idea: it doesn't make much sense to compare the speed if the correctness differs.
>
> Also, I don't understand these discussions. The time dependency resolving takes is marginal compared to the whole update process, no matter what PM you use.

When it compiles in the background after all dependencies have been solved, it needs no user intervention. But when I need to solve some blocks or do some tests during maintenance work, the dependency solving time is what I care about, as I need to wait for it and then investigate the results.
Re: [gentoo-dev] Portage dependency solving algorithm
On 11/07/2014 08:21 PM, Ciaran McCreesh wrote:
> The main issue, though, is that getting a good resolution out of crappy data is extremely difficult. There's the Babbage quote:
>
> | On two occasions I have been asked, — Pray, Mr. Babbage, if you put
> | into the machine wrong figures, will the right answers come out? In
> | one case a member of the Upper, and in the other a member of the
> | Lower, House put this question. I am not able rightly to apprehend
> | the kind of confusion of ideas that could provoke such a question.
>
> Yet this is *exactly* what a dependency resolver has to do for Gentoo, and it's why dependency resolvers are so complicated.
>
> (For comparison, Paludis on Exherbo will run an order of magnitude faster for the same set of installed packages, simply because on Exherbo the input is correct.)

What's wrong with the input? PMS itself, or how maintainers write ebuilds? Could you explain?
Re: [gentoo-dev] Portage dependency solving algorithm
On 07 Nov 2014, 20:21, Ciaran McCreesh ciaran.mccre...@googlemail.com wrote:
> On Fri, 07 Nov 2014 19:54:08 +0100 Matthias Maier tam...@gentoo.org wrote:
>> Currently, for portage just to decide that nothing has to be done on my machine takes around 1 minute.
>
> Are you running with or without metadata cache? If you're running without, it's going to be slow independently of the resolution algorithm...

Yes, I run with metadata cache. Without, ... well I never waited for it to finish.

> [...]

Thank you very much for the detailed explanation. This helped a lot :-]

> (For comparison, Paludis on Exherbo will run an order of magnitude faster for the same set of installed packages, simply because on Exherbo the input is correct.)

This might be a problem that we can tackle, though...

Best,
Matthias
Re: [gentoo-dev] Portage dependency solving algorithm
On 11/07/2014 08:56 PM, Jauhien Piatlicki wrote:
>> I think you didn't get the idea: it doesn't make much sense to compare the speed if the correctness differs.
>>
>> Also, I don't understand these discussions. The time dependency resolving takes is marginal compared to the whole update process, no matter what PM you use.
>
> When it compiles in the background after all dependencies have been solved, it needs no user intervention. But when I need to solve some blocks or do some tests during maintenance work, the dependency solving time is what I care about, as I need to wait for it and then investigate the results.

I see, however... I prefer to have a correct answer instead of an incorrect one, even if the correct one takes longer. That goes _especially_ for testing and maintenance work.

Every time people compare portage to paludis I read stuff like "but paludis is slower". That is incomplete information, to put it diplomatically.

Do you really care so much about speed that you don't mind wrong results?
Re: [gentoo-dev] Portage dependency solving algorithm
On 07.11.14 21:44, hasufell wrote:
> On 11/07/2014 08:56 PM, Jauhien Piatlicki wrote:
> Every time people compare portage to paludis I read stuff like "but paludis is slower". That is incomplete information, to put it diplomatically.
>
> Do you really care so much about speed that you don't mind wrong results?

My original question was about Portage being too slow. And Paludis came up just as an alternative.

And I would like to see a detailed discussion about what's wrong, from the point of view of correctness, with:

1. PMS
2. ebuilds in tree
3. Portage dependency solving

Was this discussed somewhere? Could you point me there?

--
Jauhien
Re: [gentoo-dev] Portage dependency solving algorithm
On 11/07/2014 09:55 PM, Jauhien Piatlicki wrote:
> On 07.11.14 21:44, hasufell wrote:
>> On 11/07/2014 08:56 PM, Jauhien Piatlicki wrote:
>> Every time people compare portage to paludis I read stuff like "but paludis is slower". That is incomplete information, to put it diplomatically.
>>
>> Do you really care so much about speed that you don't mind wrong results?
>
> My original question was about Portage being too slow. And Paludis came up just as an alternative.
>
> And I would like to see a detailed discussion about what's wrong, from the point of view of correctness, with:
>
> 1. PMS
> 2. ebuilds in tree
> 3. Portage dependency solving
>
> Was this discussed somewhere? Could you point me there?

The first thing that comes to my mind is dynamic dependencies. They are wrong, and that has been discussed recently on this ML. If you have ever switched from portage to paludis on a full-grown system, then you know how much bad data and how many missing updates/blockers/dependencies are hidden. However, it seems that this issue is being addressed by the portage team, afair.

The next thing that comes to my mind is nondeterministic results. I've had LOTS of them with portage. You run an emerge, abort. You run it again... and woosh, different result. I've not hit that case yet with paludis, unless I ran it with different configuration options.
Re: [gentoo-dev] Portage dependency solving algorithm
On 11/07/2014 01:04 PM, hasufell wrote: Next thing that comes to my mind is: indeterministic results. I'v had LOTS of them with portage. You run an emerge, abort. You run it again... and woosh, different result. This is a result of the solution space being quite large, combined with hash randomization (and possibly some other forms of randomization). You will probably notice this sort of randomization more for failed dependency calculations than for successful dependency calculations. Successful dependency calculations will almost always result in reproducible results. -- Thanks, Zac
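The effect of hash randomization that Zac mentions is easy to reproduce in Python, the language Portage is written in. The following is only an illustration of the general mechanism, not code from Portage: iterating over a set of strings can yield a different order in each interpreter run (unless `PYTHONHASHSEED` is fixed), so any choice that takes "the first candidate" from a set can differ between runs, while a choice made from a sorted list cannot. The function names and package atoms are made up for the example.

```python
def first_seen(candidates):
    """Take whatever the set yields first; may differ between interpreter runs."""
    return next(iter(candidates))


def first_sorted(candidates):
    """Take the smallest name; identical on every run."""
    return sorted(candidates)[0]


# Hypothetical providers that could each satisfy the same dependency.
providers = {"dev-libs/c", "dev-libs/a", "dev-libs/d", "dev-libs/b"}

unstable = first_seen(providers)
stable = first_sorted(providers)
print(unstable, stable)
```

When many solutions are valid, as in a large solution space, which one is found first then depends on iteration order. Imposing an explicit order on candidates is one common way to make results reproducible, at the cost of a sort.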
Re: [gentoo-dev] RFC: new QA_NEEDED variable for files installed by pre-built binary packages with broken soname dependencies
On 7 November 2014 13:50, Zac Medico zmed...@gentoo.org wrote:
> Yeah, I figured that we'd get a reaction like this. I just thought I'd start by proposing some sort of compromise, and then let others fight it out. :)

Since we got to a positive conclusion on the bug, let's not consider this proposal worth our time any more, shall we?

Diego Elio Pettenò — Flameeyes flamee...@flameeyes.eu — http://blog.flameeyes.eu/
Re: [gentoo-dev] RFC: new QA_NEEDED variable for files installed by pre-built binary packages with broken soname dependencies
On 11/07/2014 03:25 PM, Diego Elio Pettenò wrote:
> On 7 November 2014 13:50, Zac Medico zmed...@gentoo.org wrote:
>> Yeah, I figured that we'd get a reaction like this. I just thought I'd start by proposing some sort of compromise, and then let others fight it out. :)
>
> Since we got to a positive conclusion on the bug, let's not consider this proposal worth our time any more, shall we?

Okay, sure. I'll save it for the day when someone finds a valid reason to install binaries with broken soname deps (not likely).
--
Thanks, Zac
Re: [gentoo-dev] RFC: new QA_NEEDED variable for files installed by pre-built binary packages with broken soname dependencies
On 8 November 2014 13:59, Zac Medico zmed...@gentoo.org wrote:
> Okay, sure. I'll save it for the day when someone finds a valid reason to install binaries with broken soname deps (not likely).

Another candidate for a possible valid reason: https://bugs.gentoo.org/show_bug.cgi?id=460468

There's probably a better way of solving that too.
--
Kent *KENTNL* - https://metacpan.org/author/KENTNL
Re: [gentoo-dev] RFC: new QA_NEEDED variable for files installed by pre-built binary packages with broken soname dependencies
On 8 Nov 2014 01:35, Kent Fredric kentfred...@gmail.com wrote:
> On 8 November 2014 13:59, Zac Medico zmed...@gentoo.org wrote:
>> Okay, sure. I'll save it for the day when someone finds a valid reason to install binaries with broken soname deps (not likely).
>
> Another candidate for a possible valid reason: https://bugs.gentoo.org/show_bug.cgi?id=460468
>
> There's probably a better way of solving that too.

Don't make the javafx install automagic.
[gentoo-dev] Last rites: razorqt-base/*
# Ben de Groot yng...@gentoo.org (7 Nov 2014)
# Unmaintained, no longer supported, and starting to throw compilation
# errors (bug #513906, bug #528372). Masked for removal in 30 days.
# Update to lxqt-base/* packages.
razorqt-base/libqtxdg
razorqt-base/razorqt-appswitcher
razorqt-base/razorqt-autosuspend
razorqt-base/razorqt-config
razorqt-base/razorqt-data
razorqt-base/razorqt-desktop
razorqt-base/razorqt-kbshortcuts
razorqt-base/razorqt-libs
razorqt-base/razorqt-lightdm-greeter
razorqt-base/razorqt-meta
razorqt-base/razorqt-notifications
razorqt-base/razorqt-openssh-askpass
razorqt-base/razorqt-panel
razorqt-base/razorqt-policykit
razorqt-base/razorqt-power
razorqt-base/razorqt-runner
razorqt-base/razorqt-session

--
Cheers, Ben | yngwin Gentoo developer
[gentoo-portage-dev] [PATCH] Log changes between vdb_metadata.pickle updates
This adds support to generate a vdb_metadata_delta.json file which tracks package merges / unmerges that occur between updates to vdb_metadata.pickle. IndexedVardb can use the delta together with vdb_metadata.pickle to reconstruct a complete view of /var/db/pkg, so that it can avoid expensive listdir calls in /var/db/pkg/*. Note that vdb_metadata.pickle is only updated periodically, in order to avoid excessive re-writes of a large file.

In order to test the performance gains from this patch, you need to generate /var/cache/edb/vdb_metadata_delta.json first, which will happen automatically if you run 'emerge -p anything' with root privileges.
---
 pym/portage/dbapi/IndexedVardb.py | 35 -
 pym/portage/dbapi/vartree.py | 161 +++---
 2 files changed, 185 insertions(+), 11 deletions(-)

diff --git a/pym/portage/dbapi/IndexedVardb.py b/pym/portage/dbapi/IndexedVardb.py
index 424defc..e225ca1 100644
--- a/pym/portage/dbapi/IndexedVardb.py
+++ b/pym/portage/dbapi/IndexedVardb.py
@@ -3,6 +3,7 @@
 import portage
 from portage.dep import Atom
+from portage.exception import InvalidData
 from portage.versions import _pkg_str
 class IndexedVardb(object):
@@ -42,7 +43,39 @@ class IndexedVardb(object):
         if self._cp_map is not None:
             return iter(sorted(self._cp_map))
-        return self._iter_cp_all()
+        cache_delta = self._vardb._cache_delta_load_race()
+        if cache_delta is None:
+            return self._iter_cp_all()
+
+        packages = self._vardb._aux_cache["packages"]
+        for delta in cache_delta["deltas"]:
+            cpv = delta["package"] + "-" + delta["version"]
+            event = delta["event"]
+            if event == "add":
+                # Use aux_get to populate the cache
+                # for this cpv.
+                if cpv not in packages:
+                    try:
+                        self._vardb.aux_get(cpv, ["DESCRIPTION"])
+                    except KeyError:
+                        pass
+            elif event == "remove":
+                packages.pop(cpv, None)
+
+        self._cp_map = cp_map = {}
+        for cpv in packages:
+            try:
+                cpv = _pkg_str(cpv)
+            except InvalidData:
+                continue
+
+            cp_list = cp_map.get(cpv.cp)
+            if cp_list is None:
+                cp_list = []
+                cp_map[cpv.cp] = cp_list
+            cp_list.append(cpv)
+
+        return iter(sorted(self._cp_map))
     def _iter_cp_all(self):
         self._cp_map = cp_map = {}
diff --git a/pym/portage/dbapi/vartree.py b/pym/portage/dbapi/vartree.py
index 6ab4b92..fd4b099 100644
--- a/pym/portage/dbapi/vartree.py
+++ b/pym/portage/dbapi/vartree.py
@@ -76,6 +76,7 @@ import gc
 import grp
 import io
 from itertools import chain
+import json
 import logging
 import os as _os
 import platform
@@ -109,6 +110,7 @@ class vardbapi(dbapi):
         "|".join(_excluded_dirs) + r')$')
     _aux_cache_version = 1
+    _aux_cache_delta_version = 1
     _owners_cache_version = 1
     # Number of uncached packages to trigger cache update, since
@@ -177,6 +179,8 @@ class vardbapi(dbapi):
         self._aux_cache_obj = None
         self._aux_cache_filename = os.path.join(self._eroot,
             CACHE_PATH, "vdb_metadata.pickle")
+        self._cache_delta_filename = os.path.join(self._eroot,
+            CACHE_PATH, "vdb_metadata_delta.json")
         self._counter_path = os.path.join(self._eroot,
             CACHE_PATH, "counter")
@@ -511,6 +515,120 @@ class vardbapi(dbapi):
         self.cpcache.pop(pkg_dblink.mysplit[0], None)
         dircache.pop(pkg_dblink.dbcatdir, None)
+    def _cache_delta(self, event, cpv, slot, counter):
+
+        self.lock()
+        try:
+            deltas_obj = self._cache_delta_load()
+
+            if deltas_obj is None:
+                # We can't record meaningful deltas without
+                # a pre-existing state.
+                return
+
+            delta_node = {
+                "event": event,
+                "package": cpv.cp,
+                "version": cpv.version,
+                "slot": slot,
+                "counter": "%s" % counter
+            }
+
+            deltas_obj["deltas"].append(delta_node)
+
+            # Eliminate
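The idea in the patch description (a periodic snapshot plus a log of add/remove events that is replayed on top of it) can be illustrated outside Portage with a small sketch. The field names `deltas`, `event`, `package` and `version` are taken from the patch above; the `apply_deltas` function, the sample data, and the use of a plain set in place of the `_aux_cache["packages"]` dict are simplifications assumed here for illustration.

```python
import json


def apply_deltas(snapshot_packages, cache_delta):
    """Replay add/remove events on top of a snapshot of installed cpvs.

    Hypothetical helper: the real patch mutates the aux cache dict and
    calls aux_get() for added packages; this sketch only tracks names.
    """
    packages = set(snapshot_packages)
    for delta in cache_delta["deltas"]:
        cpv = delta["package"] + "-" + delta["version"]
        if delta["event"] == "add":
            packages.add(cpv)
        elif delta["event"] == "remove":
            # discard() tolerates a removal of something not in the snapshot
            packages.discard(cpv)
    return packages


# Snapshot as of the last (infrequent) full cache write.
snapshot = {"app-misc/foo-1.0", "dev-libs/bar-2.3"}

# Delta log as it might look on disk after an upgrade of foo.
delta_json = """
{"deltas": [
    {"event": "add", "package": "app-misc/foo", "version": "1.1"},
    {"event": "remove", "package": "app-misc/foo", "version": "1.0"}
]}
"""

current = apply_deltas(snapshot, json.loads(delta_json))
print(sorted(current))
```

The point of the design is that the small delta file can be rewritten cheaply on every merge or unmerge, while the large snapshot is rewritten only occasionally; a reader needs both to get the current view without listing /var/db/pkg.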
Re: [gentoo-portage-dev] [PATCH] Log changes between vdb_metadata.pickle updates
On Fri, 7 Nov 2014 00:45:55 -0800 Zac Medico zmed...@gentoo.org wrote:
> This adds support to generate a vdb_metadata_delta.json file which tracks package merges / unmerges that occur between updates to vdb_metadata.pickle. IndexedVardb can use the delta together with vdb_metadata.pickle to reconstruct a complete view of /var/db/pkg, so that it can avoid expensive listdir calls in /var/db/pkg/*. Note that vdb_metadata.pickle is only updated periodically, in order to avoid excessive re-writes of a large file.
>
> In order to test the performance gains from this patch, you need to generate /var/cache/edb/vdb_metadata_delta.json first, which will happen automatically if you run 'emerge -p anything' with root privileges.
> ---
>  pym/portage/dbapi/IndexedVardb.py | 35 -
>  pym/portage/dbapi/vartree.py | 161 +++---
>  2 files changed, 185 insertions(+), 11 deletions(-)
>
> diff --git a/pym/portage/dbapi/IndexedVardb.py b/pym/portage/dbapi/IndexedVardb.py
> index 424defc..e225ca1 100644
> --- a/pym/portage/dbapi/IndexedVardb.py
> +++ b/pym/portage/dbapi/IndexedVardb.py
> @@ -3,6 +3,7 @@
>  import portage
>  from portage.dep import Atom
> +from portage.exception import InvalidData
>  from portage.versions import _pkg_str
>  class IndexedVardb(object):
> @@ -42,7 +43,39 @@ class IndexedVardb(object):
>          if self._cp_map is not None:
>              return iter(sorted(self._cp_map))
> -        return self._iter_cp_all()
> +        cache_delta = self._vardb._cache_delta_load_race()
> +        if cache_delta is None:
> +            return self._iter_cp_all()
> +
> +        packages = self._vardb._aux_cache["packages"]
> +        for delta in cache_delta["deltas"]:
> +            cpv = delta["package"] + "-" + delta["version"]
> +            event = delta["event"]
> +            if event == "add":
> +                # Use aux_get to populate the cache
> +                # for this cpv.
> +                if cpv not in packages:
> +                    try:
> +                        self._vardb.aux_get(cpv, ["DESCRIPTION"])
> +                    except KeyError:
> +                        pass
> +            elif event == "remove":
> +                packages.pop(cpv, None)
> +
> +        self._cp_map = cp_map = {}
> +        for cpv in packages:
> +            try:
> +                cpv = _pkg_str(cpv)
> +            except InvalidData:
> +                continue
> +
> +            cp_list = cp_map.get(cpv.cp)
> +            if cp_list is None:
> +                cp_list = []
> +                cp_map[cpv.cp] = cp_list
> +            cp_list.append(cpv)
> +
> +        return iter(sorted(self._cp_map))
>      def _iter_cp_all(self):
>          self._cp_map = cp_map = {}

looks good

> diff --git a/pym/portage/dbapi/vartree.py b/pym/portage/dbapi/vartree.py
> index 6ab4b92..fd4b099 100644
> --- a/pym/portage/dbapi/vartree.py
> +++ b/pym/portage/dbapi/vartree.py
> @@ -76,6 +76,7 @@ import gc
>  import grp
>  import io
>  from itertools import chain
> +import json
>  import logging
>  import os as _os
>  import platform
> @@ -109,6 +110,7 @@ class vardbapi(dbapi):
>          "|".join(_excluded_dirs) + r')$')
>      _aux_cache_version = 1
> +    _aux_cache_delta_version = 1
>      _owners_cache_version = 1
>      # Number of uncached packages to trigger cache update, since
> @@ -177,6 +179,8 @@ class vardbapi(dbapi):
>          self._aux_cache_obj = None
>          self._aux_cache_filename = os.path.join(self._eroot,
>              CACHE_PATH, "vdb_metadata.pickle")
> +        self._cache_delta_filename = os.path.join(self._eroot,
> +            CACHE_PATH, "vdb_metadata_delta.json")
>          self._counter_path = os.path.join(self._eroot,
>              CACHE_PATH, "counter")
> @@ -511,6 +515,120 @@ class vardbapi(dbapi):
>          self.cpcache.pop(pkg_dblink.mysplit[0], None)
>          dircache.pop(pkg_dblink.dbcatdir, None)

I would like to see the following code as an independent class in its own file if possible, then just instantiated here in the main vardbapi. Looking over the code, I didn't see much use of other class functions. This class is already too large in many ways.

Also, is there a possibility this code could be re-used as a generic delta cache anywhere else?

Another possibility is moving this code and the aux_cache code to another class that the vardbapi class also subclasses. This would move all the cache code to a small class that is easily viewed, edited, and maintained. This file is already 5k+ LOC and is primarily the vardbapi class.

> +    def _cache_delta(self, event, cpv, slot, counter):
> +
> +        self.lock()
> +        try:
> +
Re: [gentoo-portage-dev] [PATCH] Log changes between vdb_metadata.pickle updates
On 11/07/2014 08:51 AM, Brian Dolbec wrote:
> On Fri, 7 Nov 2014 00:45:55 -0800 Zac Medico zmed...@gentoo.org wrote:
>> This adds support to generate a vdb_metadata_delta.json file which tracks package merges / unmerges that occur between updates to vdb_metadata.pickle. IndexedVardb can use the delta together with vdb_metadata.pickle to reconstruct a complete view of /var/db/pkg, so that it can avoid expensive listdir calls in /var/db/pkg/*. Note that vdb_metadata.pickle is only updated periodically, in order to avoid excessive re-writes of a large file.
>>
>> In order to test the performance gains from this patch, you need to generate /var/cache/edb/vdb_metadata_delta.json first, which will happen automatically if you run 'emerge -p anything' with root privileges.
>> ---
>
> The following code I would like to see either as an independent class and file if possible, then just instantiated here in the main vardbapi. Looking over the code, I didn't see much use of other class functions. This class is already too large in many ways.

Yeah, I definitely want to split it out.

> Also is there a possibility this code could be re-used as a generic delta cache anywhere else?

Maybe. For example, the PreservedLibsRegistry and WorldSelectedSet classes both have similarities in the way that they encapsulate an on-disk data store and manage concurrency. Maybe I'll create a helper class that can be utilized by these classes to manage concurrency with on-disk data stores.

> Another possibility is moving this code and the aux_cache code to another class that the vardbapi class also subclasses. This would move all the cache code to a small class easily viewed, edited, maintained.

In this case, I think a helper class will work just fine, so there will be no need for inheritance.
--
Thanks, Zac
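To make the "helper class instead of inheritance" suggestion concrete, here is a rough sketch of what such a helper could look like. Nothing here is Portage API: `OnDiskStore` and `DeltaLog` are hypothetical names, the locking uses plain `fcntl.flock` (POSIX only) rather than Portage's own locking code, and the JSON layout just mirrors the `deltas` list from the patch.

```python
import fcntl
import json
import os
import tempfile


class OnDiskStore:
    """Hypothetical helper: a JSON file guarded by an advisory lock."""

    def __init__(self, path):
        self._path = path
        self._lock_path = path + ".lock"

    def load(self, default=None):
        """Return the stored data, or `default` if missing or corrupt."""
        try:
            with open(self._path, encoding="utf-8") as f:
                return json.load(f)
        except (FileNotFoundError, ValueError):
            return default

    def update(self, mutate, default=None):
        """Load, mutate and atomically rewrite the file under a lock."""
        with open(self._lock_path, "w") as lock_file:
            # Exclusive lock; released when lock_file is closed.
            fcntl.flock(lock_file, fcntl.LOCK_EX)
            data = mutate(self.load(default))
            directory = os.path.dirname(self._path) or "."
            fd, tmp_path = tempfile.mkstemp(dir=directory)
            with os.fdopen(fd, "w", encoding="utf-8") as f:
                json.dump(data, f)
            # Rename over the old file so readers never see a partial write.
            os.replace(tmp_path, self._path)
            return data


class DeltaLog:
    """Owns an OnDiskStore (composition) rather than subclassing it."""

    def __init__(self, path):
        self._store = OnDiskStore(path)

    def record(self, event, package, version):
        def mutate(data):
            data["deltas"].append(
                {"event": event, "package": package, "version": version})
            return data
        return self._store.update(mutate, default={"deltas": []})


with tempfile.TemporaryDirectory() as tmpdir:
    log = DeltaLog(os.path.join(tmpdir, "delta.json"))
    log.record("add", "app-misc/foo", "1.1")
    result = log.record("remove", "app-misc/foo", "1.0")
    events = [d["event"] for d in result["deltas"]]
    print(events)
```

With this shape, a class like vardbapi would only hold a `DeltaLog` instance, and other users of on-disk state could reuse `OnDiskStore` without sharing a base class. Whether the real helper ends up looking anything like this is up to the eventual patch.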