The following message is intended to describe how I prioritize GSoC applications; it is intended to be useful as a basis for students to improve their chances of getting accepted. It's my assumption that serious GSoC applicants will already be subscribed to this list, or at least be reading it through the archives or gmane's news portal.

The GNU Project will not be informed of the number of student slots it will receive until some time after the application period has closed. Once the GNU Project's GSoC administrator has been informed of this, the slots will be doled out to individual GNU projects as appropriate. I have been informed that, based on experience from last year, the administrator believes it is likely that Wget will receive just one student slot. I am personally hopeful that we'll be allotted at least two, but I may be forced to choose just one. If that's the case, it will be a truly heart-wrenching decision, as there are already several quite-excellent proposals.

This being the case, please bear in mind that if you are not accepted this year, it doesn't mean that I wasn't interested in your application. You may have missed getting a slot by a very thin margin: competition is close. Please don't get discouraged, and please do consider submitting your proposal to Wget again next year, if you are still an eligible student.

The primary motivation for the policies by which I intend to rate applications is selfish: what is most likely to benefit Wget, both now and in the long term? To this end, the better an asset a given student appears phe may become to the GNU Wget development team, and the more important the proposed work is to Wget, the more attractive the application is to me.

So, without further ado, here are the specific factors I've been using to evaluate existing applications. These are _not_ listed in order of priority: they all contribute together towards the final decision.

Proficiency as a Programmer
---------------------------

People with significant experience and expertise in coding with the C programming language, who can demonstrate that they've learned good coding habits and can avoid common pitfalls in C, are likelier to get more work done at a higher level of quality, in the period of time allotted to them for coding.
It also means less hand-holding on my part, which is attractive because I already have insufficient time to do the work that needs to be done for Wget.

An alternative approach would of course be to give consideration to willing and eager but less-experienced students, so that they may gain knowledge and experience and become better coders overall. However, the reason for Wget's involvement in GSoC is a selfish one: to better Wget. Bettering others is of course desirable as a secondary goal; but it cannot be our primary goal. Therefore, we are looking for students who are already well-studied, wherever possible.

It is of course impossible for me to establish how proficient someone is with C unless they supply me with example source code. This can be examples of patches, contributed to Wget or other projects; however, unless these patches contribute significant additional code, rather than being mainly tweaks to existing functionality, they'll tell me relatively little. The ideal would be to link to complete programs or libraries that the student has written. It's OK if it's code you now find embarrassing: I'm interested in what kind of a coder you are _now_, not what kind of a coder you once were. If you'd do things differently today, explain _how_. But please give me code. :)

Aptitude for the Proposed Task
------------------------------

In addition to understanding how to write in C, you need to understand the problem domain of whatever it is you're proposing to do, and how it might be solved. Are you working on HTTP authentication? Your proposal should demonstrate understanding of RFC 2617, and probably the underlying security principles. Implementing internationalization enhancements?
I need to see, in the detailed description, or at least the public comment threads, a description of the solution sufficient to instill confidence that you understand the basics of technologies such as UTF-8 encoding, handling transcoding where appropriate, etc. Is your proposal likely to require organization of medium-to-large quantities of data? I'd like to see that you have a solid understanding of algorithms and ADTs, and that you have an idea as to which ones might be appropriate for storing and looking up the data you'll be using.

* Please note, the above examples are _examples_; they are not meant to imply anything about whether the students who have actually submitted proposals for those features have failed to demonstrate aptitude. The specific cases of HTTP auth and i18n each have exactly one proposal so far, and I already have confidence that both students involved possess the requisite understanding and skillset.

Understanding of Wget's Source Base
-----------------------------------

Prior involvement and familiarity with Wget is desirable. My evaluation of a student application will definitely be augmented if I know that a student has already familiarized perself with Wget's internals. Obviously, having submitted or discussed patches on the list demonstrates such familiarity; if that's not applicable, then briefly touching on how your enhancements will interact with or change existing components within Wget will be helpful. I do _not_ want low-level details: just a very high-level description that is sufficient to demonstrate that you understand what you'll need to do.

Community Involvement
---------------------

Likewise, participation and communication with the existing Wget community (such as it is) is important.
If you start posting questions related to your project, to the mailing list and on IRC, it demonstrates to me that you are actively engaged, and especially that my communication as a GSoC mentor with you will be easy, since we're already in communication on a regular basis.

You will also earn brownie points if I see that you are answering other people's questions on the list or on IRC. It's a clear indication that you will have value to Wget above and beyond your ability to write code for us. :) It's also a good opportunity to demonstrate an existing understanding of Wget.

Importance of Enhancement to Wget
---------------------------------

Again, in the interests of selfishness, I need to give priority to enhancements that Wget _really_ needs _right now_, over those that Wget will need at some point, or ones that are sexy but not critical. If push comes to shove, improving Wget's security (HTTP Auth or SSL/TLS) is more important than getting sexy features in like regex support in acc/rej lists. The MetaDataBase (session info db) probably falls somewhere between: it's mostly a "sexy" feature; but it offers some fairly huge potential benefits (particularly, the potential to unambiguously determine what the local filename is for a given URI, and the ability to continue from an aborted Wget session). OTOH, it's not critical to Wget in the way that working authentication and internationalization (given that non-ASCII TLDs are being introduced this year) are.

Obviously, importance by itself can't be the only factor. If a student's proposal for a critical feature is of significantly lower quality than a proposal for a "nice-to-have" feature, I'll probably go for the "nice-to-have", and implement the critical one myself.
Also, if I feel that the student with a critical-feature proposal is less likely to continue their involvement with Wget in the long term than a student proposing a less-critical feature, it's not unlikely I'll go with the student who I expect to be a long-term asset to the project: the ideal goal is not that we get two months of free work from a couple of students, but that each year at GSoC we will have added new and valuable members to the Wget community. (It is of course my wish that some students will opt to contribute in helpful ways to Wget even if their proposal is not accepted, which would certainly reflect well on them when next year's GSoC comes around.)

Sufficient Amount of Work
-------------------------

Obviously, I'm less likely to approve a proposal whose workload I do not believe will fill at least most of a two-month development period. If your proposed enhancements sound to me like something I could do in two weeks, I'm not apt to go for it. This is mainly a theoretical observation: if I feel that your proposal does not represent an appropriate workload, I will post a comment indicating this, and all you have to do to make things right is to add enough additional material based on my feedback.

On the other hand, I have received proposals that offer the moon and the stars, and maybe a good-sized bit of the sun as well. A little over-zealousness is fine: I'll post a comment suggesting that you ease your workload a bit. There are some very excellent proposals that fit that category, and the students have worked with me to balance the amount of work involved. However, a student whose proposal includes fifteen or twenty enhancements - some of which would take an entire GSoC or more just taken individually - demonstrates to me a basic lack of understanding of the problems and their solutions (lack of "Aptitude for the Proposed Task(s)").

There do exist coders capable of near-superhuman feats of productivity: certainly the hacker culture tends to encourage such people; and I do not wish to underestimate the creative power of caffeinated-and-enthusiastic students on summer break, engaged in coding for Wget full-time (or, if particularly enthusiastic, quite possibly more than full-time). But, if you want me to believe that you have shockingly prodigious levels of productivity, you need to prove it to me. If you have promised me what looks like two years of work in two-to-three months, I need you either to give me documented proof that you have performed equivalently in the past, or to spend a couple of days working on the code _now_, so I can see the resulting two weeks' worth of productivity for myself. Barring that, I am very likely to treat enormous workloads as a sign of ineptitude rather than of extreme productivity. Call me a skeptic.

Back to ensuring a _sufficient_ quantity of work, though: I understand that summer here is not summer everywhere; it may be winter where you are, with school responsibilities still going full-bore. Or you may live in a country where school exams are still in progress until nearly the GSoC midterm evaluations. These are not the end of the world; in particular, if you have already familiarized yourself with Wget's source code and begun communication with me about how to go about it, you can probably take advantage of the month Google has graciously provided as the "Community Bonding" period to get some actual code accomplished.

Be that as it may, please note that the GSoC program expects that students are working _full-time_ on their proposed projects, and that submitting a proposal is an implicit affirmation that you have the free time to get the work done. Yes, you need to eat. Yes, you need to do well in school so you can succeed in life.
Yes, it sucks if you live in a country where it's not summer and it's not a break, and it's not fair that you're made to compete with North Americans who have all the time they need to dedicate to GSoC. But none of this means that I can afford to give you special consideration: if you can't do the job, don't apply for it. As I've said, our interests are intrinsically selfish, and it is not in our best interests to consider someone who has four free hours a day to dedicate to a full-time job, and probably isn't taking into account the unfairly large amounts of homework their professors will assign to them. Nor is it in Google's best interests if they're spending money on stipends for people who will not accomplish their tasks.

If you submit a great proposal for something that might be 7 weeks' work rather than 9, I'm prepared to be lenient. If you inform me that you will not have a lot of free time to dedicate to Wget during final exams, but will be able to get some extra work done ahead of the official GSoC start, this too may be alright. I might even be prepared to go on two weeks of quality work for your midterm evaluation instead of four, on the understanding (and confidence) that you will be able to dedicate yourself to Wget and produce six weeks' work out of the next four.

What I am _not_ prepared to do is agree to let you shift your responsibilities by a few weeks so that you don't have much to show for the midterm evaluations and I'm forced to "take it on faith" that you'll perform as promised for two months starting there. You and I can negotiate on what exactly is to be done by midterm, but there must be something substantial enough for me to actually evaluate your work and progress. Students who fall well short of the work they agreed to have done by midterm _will_ be dropped, and will not receive the rest of their stipend from Google.

Informative Proposal
--------------------

Related to the previous point, it is very important that your proposal specify, very clearly, what will have been accomplished by the midterm and final evaluations. The evaluation of your performance, and my recommendation that you receive your stipend, will be based on the work you and I have agreed upon for you to do. If this information is missing from your proposal, or too vague, you'll be asked to clarify it.

As previously mentioned, your proposal needs to demonstrate a basic understanding of the problem you're trying to solve. It's an informal requirements document, but should also be a very high-level design document. I need to understand not only what you are going to do, but how you are going to do it. As mentioned before, I will generally expect to know what types of structures and algorithmic tools you'll need to use to accomplish your job, and how your work will incorporate relevant standards and RFCs. Clarifying what you do _not_ intend to accomplish might also be a good idea.

When I say I want to know what types of structures you'll use, I'm talking about Abstract Data Types (see your Sedgewick or Knuth textbooks, or check Wikipedia). It does _not_ mean that I want details of what your C structs will be named, or what your functions will be named, or what the names of your files will be, or snippets of C code. That's all good stuff for us to use in discussion; it's generally not appropriate for a proposal. Please try to focus on how Wget will use it, as opposed to how it will look in code.

I am not concerned with minor typographical errors and grammatical errors. Your English should be good enough that it's not a hindrance to our communication; it does not need to be excellent.

Ideas for Next Year
-------------------

While I think much of the above should have been relatively obvious to one degree or another, some of it may reflect things you didn't necessarily think of when you first submitted your proposal, and I apologize that it was not available to you sooner (I was not entirely aware of all of these things from the start, myself). If you have ideas about ways to improve the information I give you next year, so you have the tools at your disposal to write the most attractive proposal possible, please feel free to suggest them.

It occurs to me that, since I've placed some importance on the amount of workload, it might be useful in the future if I attach estimated completion times for the features I list on our ideas page. I'll try to do this in the future.

It might also be a good idea for me to prepare a questionnaire of programming- and HTTP-related questions, to discover a student's knowledge and understanding of essential protocols and technologies.

Explicitly noting the degree of importance for each proposal idea would help students to choose features that give them a better chance of being accepted. I'm not sure I want to do that, though, because I'm worried that this might result in all the proposals being for the top couple of features; if I'm able to get more than a couple of slots in future years, it would make it more difficult for me to fill third and fourth slots if there are only proposals for two features. In other words, it may give an advantage to students in that they can help ensure they're on an even footing in terms of what feature they've chosen to implement, but it may prove to be a disadvantage to Wget, in terms of the variety of applications to choose from. I'm inclined not to advertise the priority explicitly, and let the students themselves figure out the likely importance (though I'm not averse to spelling it out when asked). It's not all that difficult to reason out, at any rate.

-- 
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer,
and GNU Wget Project Maintainer.
http://micah.cowan.name/