Re: [gentoo-dev] Six month major project on Gentoo
On 12/19/11 7:14 PM, Sébastien Fabbro wrote:
> One project that could be very useful for Gentoo is an automated stabilization/testing for ebuilds. Obviously it will require some work from the ebuild maintainers, but the ability to distribute the stabilization recipes across a volunteering Gentoo community via something like BOINC could be worth looking at.

I can help with guidance and mentoring here. I'm maintaining a set of scripts http://git.overlays.gentoo.org/gitweb/?p=proj/arch-tools.git;a=summary that I use for mass-stabilization and related tasks. If someone wants to extend it so the testing can be distributed, that's great.
Re: [gentoo-dev] Six month major project on Gentoo
On Wed, Dec 21, 2011 at 11:43 PM, Donnie Berkholz dberkh...@gentoo.org wrote:
> I looked into this 6 or 7 years ago. It wasn't feasible unless you were on an extremely high-speed, low-latency network, beyond what was typically accessible at the time outside of universities and LANs. Could be worth exploring again now that 25-100 Mbps connections are becoming more common.

I tried messing around with this with Amazon EC2. The problem was that due to latency I only really saw the benefit for VERY high levels of parallelization (think -j25+). However, make isn't actually distcc-aware, so it just runs 25 jobs of anything in parallel. So, anytime a makefile launched a ton of Java or Python jobs, the host ground to a halt, as those jobs weren't distributed and it was way more than the host could handle (especially Java, which swapped like there was no tomorrow).

If somebody were to do a distcc-ng for a large cluster, one of the problems to solve would be having it not run jobs in parallel if it couldn't actually distribute them.

Rich
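The problem described above (a high -j value applied to jobs that cannot be sent to remote hosts) can be sketched as two separate concurrency limits. This is only an illustration, not how make or distcc work today: the `JobGate` class, the `distributable` flag, and the slot counts are all hypothetical, and a real tool would also need a way to decide which jobs can be distributed.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class JobGate:
    """Hypothetical gate with separate limits for remote and local-only jobs."""

    def __init__(self, remote_slots, local_slots):
        self._remote = threading.BoundedSemaphore(remote_slots)
        self._local = threading.BoundedSemaphore(local_slots)

    def run(self, job, distributable):
        # Jobs that cannot be distributed compete for the small local pool
        # instead of the large pool sized for the remote build farm.
        gate = self._remote if distributable else self._local
        with gate:
            return job()


def peak_concurrency(gate, n_jobs, distributable, workers=25):
    """Run n_jobs dummy jobs through the gate and report the peak overlap."""
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def job():
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(0.01)  # stand-in for a compile or a java/python step
        with lock:
            state["running"] -= 1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(gate.run, job, distributable) for _ in range(n_jobs)]
        for future in futures:
            future.result()
    return state["peak"]


if __name__ == "__main__":
    gate = JobGate(remote_slots=25, local_slots=2)
    print("local-only peak:", peak_concurrency(gate, 50, distributable=False))
    print("distributable peak:", peak_concurrency(gate, 50, distributable=True))
```

With the equivalent of -j25 worker threads, the local-only jobs never overlap by more than the local slot count, which is the behaviour asked for above.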
Re: [gentoo-dev] Six month major project on Gentoo
2011/12/22 Rich Freeman ri...@gentoo.org:
> On Wed, Dec 21, 2011 at 11:43 PM, Donnie Berkholz dberkh...@gentoo.org wrote:
>> I looked into this 6 or 7 years ago. It wasn't feasible unless you were on an extremely high-speed, low-latency network, beyond what was typically accessible at the time outside of universities and LANs. Could be worth exploring again now that 25-100 Mbps connections are becoming more common.
>
> I tried messing around with this with Amazon EC2. The problem was that due to latency I only really saw the benefit for VERY high levels of parallelization (think -j25+). However, make isn't actually distcc-aware, so it just runs 25 jobs of anything in parallel. So, anytime a makefile launched a ton of Java or Python jobs, the host ground to a halt, as those jobs weren't distributed and it was way more than the host could handle (especially Java, which swapped like there was no tomorrow).
>
> If somebody were to do a distcc-ng for a large cluster, one of the problems to solve would be having it not run jobs in parallel if it couldn't actually distribute them.
>
> Rich

Just wanted to point out that, if there is enough memory, recent kernels handle parallelism much better, even an excess of it: once the maximum load is reached, adding more threads costs only a minimal amount of wall-clock time.
Re: [gentoo-dev] Six month major project on Gentoo
On Thu, 22 Dec 2011 12:11:32 +0100 Francesco Riosa viv...@gmail.com wrote:
>> I tried messing around with this with Amazon EC2. The problem was that due to latency I only really saw the benefit for VERY high levels of parallelization (think -j25+). However, make isn't actually distcc-aware, so it just runs 25 jobs of anything in parallel. So, anytime a makefile launched a ton of Java or Python jobs, the host ground to a halt, as those jobs weren't distributed and it was way more than the host could handle (especially Java, which swapped like there was no tomorrow).
>
> Just wanted to point out that, if there is enough memory, recent kernels handle parallelism much better, even an excess of it: once the maximum load is reached, adding more threads costs only a minimal amount of wall-clock time.

Does that include handling complete lack of memory and heavy swapping?

--
Best regards,
Michał Górny
Re: [gentoo-dev] Six month major project on Gentoo
On Thursday 15 December 2011 23:37:10 Gaurav Saxena wrote:
> Hello all,
> Thanks a lot for your replies. Christian, I am interested in OpenRC, it sounds interesting to me. I would like to know more details regarding what type of projects are there that could be done.
>
> On Wed, Dec 14, 2011 at 11:35 PM, Christian Ruppert id...@gentoo.org wrote:
>> On Wednesday 14 December 2011 16:36:42 Gaurav Saxena wrote:
>>> Hello all, I am interested in doing my final year computer science project on gentoo. I would be having a duration of six months to work on the project. Could you please suggest me some good project ideas that would be helpful to me as well as gentoo. I am interested in parallel computing, data structures, operating system. I am well versed in C/C++. I think there might be projects which need to be done, I would like to work on them.
>>
>> What about OpenRC? :) We could need some help.
>> http://www.gentoo.org/proj/en/base/openrc/
>> #openrc or #gentoo-base (IRC) via FreeNode
>> Or ope...@gentoo.org.
>> Let me know if you're interested or need more details :)
>>
>> --
>> Regards, Christian Ruppert
>> Gentoo Linux developer, Bugzilla administrator and Infrastructure member
>> Fingerprint: EEB1 C341 7C84 B274 6C59 F243 5EAB 0C62 B427 ABC8

Hi Gaurav,

you may know OpenRC already from your Gentoo machines. It's the service management front end to sysvinit that handles the startup/shutdown of services in various runlevels, including dependencies.

What I can say is: the rc_parallel feature (starting services in parallel) needs some love. There are some issues regarding service dependencies, locking etc.
https://bugs.gentoo.org/391945
https://bugs.gentoo.org/360013
and some more.

We also have some issues with links of init scripts:

There are a lot of other major and minor bugs. See http://preview.tinyurl.com/openrc-bugs

You can also ask us via IRC if you want (just stay around for a while; keep the different timezones in mind, so it may take some time until someone replies :P)

--
Regards, Christian Ruppert
Gentoo Linux developer, Bugzilla administrator and Infrastructure member
Fingerprint: EEB1 C341 7C84 B274 6C59 F243 5EAB 0C62 B427 ABC8
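To make the rc_parallel dependency problem concrete, here is a minimal sketch of grouping services into batches that could be started in parallel without violating dependencies. It is not OpenRC's implementation (OpenRC is written in C and has its own dependency types); the `start_batches` function and the dependency map in the example are illustrative assumptions only, and locking is not covered.

```python
def start_batches(needs):
    """Group services into batches; each batch only needs earlier batches.

    needs maps a service name to the set of services it depends on.
    """
    remaining = {svc: set(deps) for svc, deps in needs.items()}
    # Services that appear only as dependencies still have to be started.
    for deps in needs.values():
        for dep in deps:
            remaining.setdefault(dep, set())

    batches = []
    started = set()
    while remaining:
        # Everything whose dependencies are already up can start together.
        ready = sorted(svc for svc, deps in remaining.items() if deps <= started)
        if not ready:
            raise ValueError("dependency cycle among: " + ", ".join(sorted(remaining)))
        batches.append(ready)
        started.update(ready)
        for svc in ready:
            del remaining[svc]
    return batches


if __name__ == "__main__":
    # Illustrative dependency map, not taken from real init scripts.
    example = {
        "localmount": set(),
        "net": {"localmount"},
        "syslog": {"localmount"},
        "sshd": {"net"},
    }
    print(start_batches(example))
    # → [['localmount'], ['net', 'syslog'], ['sshd']]
```

Services inside one batch have no ordering constraints between them, so they are the candidates for parallel startup; a cycle is reported instead of being silently broken.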
Re: [gentoo-dev] Six month major project on Gentoo
Gaurav Saxena grvsaxena...@gmail.com wrote: I am interested in doing my final year computer scence project on gentoo. I would be having a duration of six months to work on the project. Could you please suggest me some good project ideas that would be helpful to me as well as gentoo. I am interested in parallel computing, data structures , operating system. I am well versed in C/C++. I think there might be projects which need to be done, I would like to work on them. One project that could be very useful for Gentoo is an automated stabilization/testing for ebuilds. Obviously it will require some work from the ebuild maintainers, but the ability to distribute the stabilization recipes across a volunteering Gentoo community via something like BOINC could be worth looking at. -- Sébastien
Re: [gentoo-dev] Six month major project on Gentoo
On 14.12.2011 13:06, Gaurav Saxena wrote:
> Hello all, I am interested in doing my final year computer science project on gentoo. I would be having a duration of six months to work on the project. Could you please suggest me some good project ideas that would be helpful to me as well as gentoo. I am interested in parallel computing, data structures, operating system. I am well versed in C/C++. I think there might be projects which need to be done, I would like to work on them.

There are parallel computing aspects in libbash for metadata generation, data structures in AST building for bash, and it's quite low level. Feel free to contact me off list if you are interested. It would be nice to get that project back on track again after the last GSoC.

http://www.gentoo.org/proj/en/libbash/index.xml

Regards,
Petteri
Re: [gentoo-dev] Six month major project on Gentoo
On 12/18/11 6:02 PM, Petteri Räty wrote:
> There are parallel computing aspects in libbash for metadata generation, data structures in AST building for bash and it's quite low level.

By the way, I've always wondered why libbash is separate from the upstream bash. Have you considered contributing to the upstream bash to convert the shell itself to a more library-oriented design (somewhat similar to LLVM), so that you have a guarantee that the lib and the shell stay in sync?

Feel free to change the subject when responding.
Re: [gentoo-dev] Six month major project on Gentoo
On Sun, 18 Dec 2011 18:13:42 +0100 Paweł Hajdan, Jr. phajdan...@gentoo.org wrote:
> On 12/18/11 6:02 PM, Petteri Räty wrote:
>> There are parallel computing aspects in libbash for metadata generation, data structures in AST building for bash and it's quite low level.
>
> By the way, I've always wondered why libbash is separate from the upstream bash. Have you considered contributing to the upstream bash to convert the shell itself to a more library-oriented design (somewhat similar to LLVM), so that you have a guarantee that the lib and the shell stay in sync?

I don't think upstream bash would be interested in converting bash into C++. I wouldn't be interested in running such a bash, for instance.

--
Best regards,
Michał Górny
Re: [gentoo-dev] Six month major project on Gentoo
On Thursday 15 December 2011 00:39:44 Nirbheek Chauhan wrote:
> On Thu, Dec 15, 2011 at 5:28 AM, Mike Frysinger vap...@gentoo.org wrote:
>> On Wednesday 14 December 2011 18:43:33 Alec Warner wrote:
>>> On Wed, Dec 14, 2011 at 1:25 PM, Leho Kraav l...@kraav.com wrote:
>>>> i'd be really happy if someone took care of https://bugs.gentoo.org/150031 : include more info about binpkg in file name
>>>
>>> That is great, but it's not a 6 month project...
>>
>> is it though ? i'm inclined to mark INVALID. hijacking filenames for metadata is a tuuurrible idea.
>
> I agree. It's along the same lines as only using file extensions for determining the filetype (and we all know how that turned out...). It *does* have the advantage of being really fast, though.

it just doesn't scale though (encoding all metadata into the filename quickly hits filesystem limits on name length), and i think the speed increase only holds up to a point. once you get into larger repos, using the already existing Packages file would be faster. and since that compresses, it should scale a lot nicer.

> Nevertheless, the basic bug is about changing the distfile repository format in such a way that a single repo can contain several distfiles built with differing build conditions. Putting metadata in the filename is only one way of ensuring that.

sounds like the summary needs updating then by someone who has waded through all the followup comments ;)
-mike
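The name-length objection can be checked with a little arithmetic. The sketch below is an assumption-laden illustration: the `metadata_filename` encoding is made up for this example (nobody in the thread proposed this exact format), and 255 bytes is the usual per-component limit on common Linux filesystems such as ext4, not a universal constant.

```python
NAME_MAX = 255  # typical per-filename limit in bytes (e.g. ext4); an assumption


def metadata_filename(pf, metadata):
    """Hypothetical encoding: PF followed by sorted key=value pairs."""
    parts = [pf] + [f"{key}={value}" for key, value in sorted(metadata.items())]
    return ",".join(parts) + ".tbz2"


def fits(name):
    """True if the encoded name stays within the filesystem limit."""
    return len(name.encode("utf-8")) <= NAME_MAX


if __name__ == "__main__":
    small = metadata_filename("gvim-1.23-r1", {"USE": "python"})
    # Sixty made-up USE flags are already enough to overflow the limit.
    many_flags = "+".join(f"flag{i:02d}" for i in range(60))
    large = metadata_filename("gvim-1.23-r1", {"USE": many_flags})
    print(len(small), fits(small))
    print(len(large), fits(large))
```

Even before CFLAGS, CHOST, or dependency versions are added, a realistic USE flag list alone can push the name past the limit, which is why an external index scales better.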
Re: [gentoo-dev] Six month major project on Gentoo
On Thu, Dec 15, 2011 at 5:57 PM, Mike Frysinger vap...@gentoo.org wrote: On Thursday 15 December 2011 00:39:44 Nirbheek Chauhan wrote: Nevertheless, the basic bug is about changing the distfile repository format in such a way that a single repo can contain several distfiles built with differing build conditions. Putting metadata in the filename is only one way of ensuring that. sounds like the summary needs updating then by someone who has waded through all the followup comments ;) I didn't read every word, but I think I got the gist. I've changed the subject accordingly. :) -- ~Nirbheek Chauhan Gentoo GNOME+Mozilla Team
Re: [gentoo-dev] Six month major project on Gentoo
On Thu, Dec 15, 2011 at 12:39 AM, Nirbheek Chauhan nirbh...@gentoo.org wrote: Nevertheless, the basic bug is about changing the distfile repository format in such a way that a single repo can contain several distfiles built with differing build conditions. Putting metadata in the filename is only one way of ensuring that. Well, having the filename vary when the metadata changes is the only way of ensuring that. Putting the metadata in the filename is just one of many ways to make the filename vary. Another solution (which I can already sense the objections to), would be to content-hash the files and use that as the filename. Then use indexes to point to the files. You could use symlink indexes to point to the files so that superficially it looks the same as it does now for the last version emerged. Then people looking for a particular set of metadata could use more detailed indexes to find the right file. Portage could look for an exact match when trying to merge a binpkg since searching indexes is a trivial problem. The indexes could be anything from text files to binary files to databases to a couple of directory trees full of symlinks (like /dev/disk/by-*). The symlinks could get tricky with all the metadata - it might make more sense to just keep it simple and use something more like a database for the full details and symlinks for the basics. There are countless variations on this as well - like sticking a copy of the environment for each package in a separate text file with the same base name so that it is easy to grep/search/etc. You can also make it more user-friendly by keeping the PF in the filename followed by the hash - like gvim-1.23-r1-723ba298d92f. In such a case you probably don't even need to index the PFs. Rich
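A minimal sketch of the content-hash idea above, assuming SHA-256 and a 12-character digest prefix (both are arbitrary choices for illustration; the thread does not fix a hash or a length). `hashed_name` and `BinpkgIndex` are hypothetical names, and the in-memory list stands in for whatever index format (text file, database, symlink tree) would really be used.

```python
import hashlib


def hashed_name(pf, payload, digest_len=12):
    """Build a name like gvim-1.23-r1-<hash>.tbz2 from the package contents."""
    digest = hashlib.sha256(payload).hexdigest()[:digest_len]
    return f"{pf}-{digest}.tbz2"


class BinpkgIndex:
    """Toy index mapping build metadata to hashed filenames."""

    def __init__(self):
        self._entries = []  # list of (metadata dict, filename)

    def add(self, pf, payload, metadata):
        name = hashed_name(pf, payload)
        self._entries.append((dict(metadata, PF=pf), name))
        return name

    def find(self, **wanted):
        """Return filenames whose metadata matches every requested key exactly."""
        return [
            name
            for meta, name in self._entries
            if all(meta.get(key) == value for key, value in wanted.items())
        ]


if __name__ == "__main__":
    index = BinpkgIndex()
    index.add("gvim-1.23-r1", b"contents of build A", {"USE": "python"})
    index.add("gvim-1.23-r1", b"contents of build B", {"USE": "-python"})
    print(index.find(PF="gvim-1.23-r1"))
    print(index.find(PF="gvim-1.23-r1", USE="python"))
```

Two builds of the same PF with different contents get different filenames, so they can sit in one repository, and the exact-match lookup is the "trivial" index search mentioned above.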
Re: [gentoo-dev] Six month major project on Gentoo
On Thursday 15 December 2011 07:43:26 Rich Freeman wrote:
> On Thu, Dec 15, 2011 at 12:39 AM, Nirbheek Chauhan wrote:
>> Nevertheless, the basic bug is about changing the distfile repository format in such a way that a single repo can contain several distfiles built with differing build conditions. Putting metadata in the filename is only one way of ensuring that.
>
> Well, having the filename vary when the metadata changes is the only way of ensuring that. Putting the metadata in the filename is just one of many ways to make the filename vary.

there is more raw metadata available than fits into a filename. so that is already a non-starter. if people want to post multiple binpkgs with different metadata

> Another solution (which I can already sense the objections to), would be to content-hash the files and use that as the filename. Then use indexes to point to the files. You could use symlink indexes to point to the files so that superficially it looks the same as it does now for the last version emerged. Then people looking for a particular set of metadata could use more detailed indexes to find the right file. Portage could look for an exact match when trying to merge a binpkg since searching indexes is a trivial problem.

we already hash all the files (i.e. the CONTENTS file), so using the hash of that file alone wouldn't be a bad idea. although i think that file gets generated on the fly when merging the binpkg (seems like a waste to not cache that in the binary package ...).

> There are countless variations on this as well - like sticking a copy of the environment for each package in a separate text file with the same base name so that it is easy to grep/search/etc.

the env is already trivial to extract:

    qtbz2 -x -O bison-2.5.tbz2 | qxpak -O -x - environment.bz2 | bzgrep foo

> You can also make it more user-friendly by keeping the PF in the filename followed by the hash - like gvim-1.23-r1-723ba298d92f. In such a case you probably don't even need to index the PFs.

portage should be using the generated Packages index to look up actual tbz2 filenames, so having ${PF}-hash.tbz2 shouldn't be too painful.
-mike
Re: [gentoo-dev] Six month major project on Gentoo
On Thursday 15 December 2011 11:30:46 Mike Frysinger wrote:
> if people want to post multiple binpkgs with different metadata

err, half formed thought here ... ignore
-mike
Re: [gentoo-dev] Six month major project on Gentoo
Hello all , Thanks a lot for your replies. Christian, I am interested in Open RC, it sounds interesting to me, I would like to know more details regarding what type of projects are there that could be done. On Wed, Dec 14, 2011 at 11:35 PM, Christian Ruppert id...@gentoo.org wrote: On Wednesday 14 December 2011 16:36:42 Gaurav Saxena wrote: Hello all, I am interested in doing my final year computer scence project on gentoo. I would be having a duration of six months to work on the project. Could you please suggest me some good project ideas that would be helpful to me as well as gentoo. I am interested in parallel computing, data structures , operating system. I am well versed in C/C++. I think there might be projects which need to be done, I would like to work on them. What about OpenRC? :) We could need some help. http://www.gentoo.org/proj/en/base/openrc/ #openrc or #gentoo-base (IRC) via FreeNode Or ope...@gentoo.org. Let me know if you're interested or need more details :) -- Regards, Christian Ruppert Gentoo Linux developer, Bugzilla administrator and Infrastructure member Fingerprint: EEB1 C341 7C84 B274 6C59 F243 5EAB 0C62 B427 ABC8 -- Thanks and Regards , Gaurav
Re: [gentoo-dev] Six month major project on Gentoo
Hello Alec, I am interested in a distributed compiler like this ,this involves quite a good project it seems. I would like to work on it, if possible could you please give me some pointers regarding more details about the project so that I could decide what to work upon. On Wed, Dec 14, 2011 at 11:59 PM, Alec Warner anta...@gentoo.org wrote: On Wed, Dec 14, 2011 at 3:06 AM, Gaurav Saxena grvsaxena...@gmail.com wrote: Hello all, I am interested in doing my final year computer scence project on gentoo. I would be having a duration of six months to work on the project. Could you please suggest me some good project ideas that would be helpful to me as well as gentoo. I am interested in parallel computing, data structures , operating system. I am well versed in C/C++. I think there might be projects which need to be done, I would like to work on them. The only idea I can think of for parallel computing / distributed systems would be at the build level. distcc-ng, a farm of user-controlled machines that compile your code in a p2p fashion. a distributed hash table of input, output tuples (basically .o caching so users can fetch the .o from the DHT) Both of these have *massive* trust issues. When random guys on the internet are compiling your code you have to be very careful about how you verify and execute that code. When you fetch .o files from a DHT you have the same problem. Almost every other problem I can think of at the Gentoo OS level can fit on a meager sized machine (i.e. it is not a distributed systems nor a parallel computing problem.) Many of the annoying parts of Gentoo are merely tools problems; the existing tools are poor / under-maintained or standard tools do not exist (so users / developers roll their own.) 
You may have success in the tools arena if you talk to mgorny or portage-utils@; I know mgorny has written a few C tools and might have sufficient 'gentoo' C libraries you could utilize; the portage-utils alias holds the portage-utils authors (portage-utils being another set of tools written in C for gentoo.) I actually liked cbergstrom's idea of toolchain-type stuff; but I'm not really sure how easy it is to on-board with those communities (lord knows in my senior year of CS I would have been useless working on a compiler.) -- Thanks and Regards , Gaurav -- Thanks and Regards , Gaurav
Re: [gentoo-dev] Six month major project on Gentoo
On Wednesday 14 December 2011 16:36:42 Gaurav Saxena wrote:
> Hello all, I am interested in doing my final year computer science project on gentoo. I would be having a duration of six months to work on the project. Could you please suggest me some good project ideas that would be helpful to me as well as gentoo. I am interested in parallel computing, data structures, operating system. I am well versed in C/C++. I think there might be projects which need to be done, I would like to work on them.

What about OpenRC? :) We could use some help.
http://www.gentoo.org/proj/en/base/openrc/
#openrc or #gentoo-base (IRC) via FreeNode
Or ope...@gentoo.org.
Let me know if you're interested or need more details :)

--
Regards, Christian Ruppert
Gentoo Linux developer, Bugzilla administrator and Infrastructure member
Fingerprint: EEB1 C341 7C84 B274 6C59 F243 5EAB 0C62 B427 ABC8
Re: [gentoo-dev] Six month major project on Gentoo
On 12/15/11 01:05 AM, Christian Ruppert wrote: On Wednesday 14 December 2011 16:36:42 Gaurav Saxena wrote: Hello all, I am interested in doing my final year computer scence project on gentoo. I would be having a duration of six months to work on the project. Could you please suggest me some good project ideas that would be helpful to me as well as gentoo. I am interested in parallel computing, data structures , operating system. I am well versed in C/C++. I think there might be projects which need to be done, I would like to work on them. Not directly gentoo, but certainly would impact all gentoo users - toolchain/compilers/path64 There's a number of ways the compiler could be improved for better SIMD vectorization and parallel computing /* I'm biased and work on this project - ping me on irc if you're interested */ ./C #pathscale
Re: [gentoo-dev] Six month major project on Gentoo
On Wed, Dec 14, 2011 at 3:06 AM, Gaurav Saxena grvsaxena...@gmail.com wrote:
> Hello all, I am interested in doing my final year computer science project on gentoo. I would be having a duration of six months to work on the project. Could you please suggest me some good project ideas that would be helpful to me as well as gentoo. I am interested in parallel computing, data structures, operating system. I am well versed in C/C++. I think there might be projects which need to be done, I would like to work on them.

The only idea I can think of for parallel computing / distributed systems would be at the build level:

- distcc-ng, a farm of user-controlled machines that compile your code in a p2p fashion.
- a distributed hash table of input, output tuples (basically .o caching so users can fetch the .o from the DHT)

Both of these have *massive* trust issues. When random guys on the internet are compiling your code you have to be very careful about how you verify and execute that code. When you fetch .o files from a DHT you have the same problem.

Almost every other problem I can think of at the Gentoo OS level can fit on a meager sized machine (i.e. it is neither a distributed systems nor a parallel computing problem.) Many of the annoying parts of Gentoo are merely tools problems; the existing tools are poor / under-maintained or standard tools do not exist (so users / developers roll their own.)

You may have success in the tools arena if you talk to mgorny or portage-utils@; I know mgorny has written a few C tools and might have sufficient 'gentoo' C libraries you could utilize; the portage-utils alias holds the portage-utils authors (portage-utils being another set of tools written in C for gentoo.)

I actually liked cbergstrom's idea of toolchain-type stuff; but I'm not really sure how easy it is to on-board with those communities (lord knows in my senior year of CS I would have been useless working on a compiler.)

> --
> Thanks and Regards,
> Gaurav
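The ".o caching" idea and its trust problem can be sketched as follows. This is one possible mitigation, not something proposed in the thread: an object is accepted only if a quorum of distinct peers returned byte-identical output for the same input key. It assumes reproducible builds, does nothing against colluding peers or one party running many peers, and the names `input_key` and `accept_object` are hypothetical.

```python
import hashlib


def input_key(compiler_id, flags, preprocessed_source):
    """Hash everything that determines the output into one lookup key."""
    h = hashlib.sha256()
    for part in (compiler_id.encode(), "\0".join(flags).encode(), preprocessed_source):
        # Length-prefix each field so adjacent fields cannot run together.
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()


def accept_object(candidates, quorum):
    """Return object bytes that at least `quorum` distinct peers agree on.

    candidates is a list of (peer_id, object_bytes); returns None if no
    version of the object reaches the quorum.
    """
    by_digest = {}
    for peer, blob in candidates:
        digest = hashlib.sha256(blob).hexdigest()
        by_digest.setdefault(digest, (blob, set()))[1].add(peer)
    for blob, peers in by_digest.values():
        if len(peers) >= quorum:
            return blob
    return None


if __name__ == "__main__":
    key = input_key("gcc-4.5.3", ["-O2", "-march=native"], b"int main(void){return 0;}")
    answers = [("peer-a", b"OBJ"), ("peer-b", b"OBJ"), ("peer-c", b"EVIL")]
    print(key[:16], accept_object(answers, quorum=2))
```

The key plays the role of the "input" half of the tuple and the agreed bytes are the "output" half; anything that fails the quorum would have to be compiled locally instead.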
Re: [gentoo-dev] Six month major project on Gentoo
i'd be really happy if someone took care of https://bugs.gentoo.org/150031 : include more info about binpkg in file name
Re: [gentoo-dev] Six month major project on Gentoo
On Wed, Dec 14, 2011 at 1:25 PM, Leho Kraav l...@kraav.com wrote:
> i'd be really happy if someone took care of https://bugs.gentoo.org/150031 : include more info about binpkg in file name

That is great, but it's not a 6 month project...
-A
Re: [gentoo-dev] Six month major project on Gentoo
On Wednesday 14 December 2011 18:43:33 Alec Warner wrote:
> On Wed, Dec 14, 2011 at 1:25 PM, Leho Kraav l...@kraav.com wrote:
>> i'd be really happy if someone took care of https://bugs.gentoo.org/150031 : include more info about binpkg in file name
>
> That is great, but it's not a 6 month project...

is it though ? i'm inclined to mark INVALID. hijacking filenames for metadata is a tuuurrible idea.
-mike
Re: [gentoo-dev] Six month major project on Gentoo
On Thu, Dec 15, 2011 at 5:28 AM, Mike Frysinger vap...@gentoo.org wrote:
> On Wednesday 14 December 2011 18:43:33 Alec Warner wrote:
>> On Wed, Dec 14, 2011 at 1:25 PM, Leho Kraav l...@kraav.com wrote:
>>> i'd be really happy if someone took care of https://bugs.gentoo.org/150031 : include more info about binpkg in file name
>>
>> That is great, but it's not a 6 month project...
>
> is it though ? i'm inclined to mark INVALID. hijacking filenames for metadata is a tuuurrible idea.

I agree. It's along the same lines as only using file extensions for determining the filetype (and we all know how that turned out...). It *does* have the advantage of being really fast, though.

Nevertheless, the basic bug is about changing the distfile repository format in such a way that a single repo can contain several distfiles built with differing build conditions. Putting metadata in the filename is only one way of ensuring that.

--
~Nirbheek Chauhan
Gentoo GNOME+Mozilla Team