-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The following message is intended to describe how I prioritize GSoC
applications; it is intended to be useful as a basis for students to
improve their chances of getting accepted. It's my assumption that
serious GSoC applicants will already be subscribed to this list, or at
least be reading it through the archives or gmane's news portal.

The GNU Project will not be informed of the number of student slots it
will receive until some time after the application period has closed.
Once the GNU Project's GSoC administrator has been informed of this, the
slots will be doled out to individual GNU projects as appropriate.

I have been informed that, based on experience from last year, the
administrator believes it is likely that Wget will receive just one
student slot. I am personally hopeful that we'll be allotted at least
two, but I may be forced to choose just one. If that's the case, it will
be a truly heart-wrenching decision, as there are already several
quite-excellent proposals.

This being the case, please bear in mind that if you are not accepted
this year, it doesn't mean that I wasn't interested in your application.
You may have missed getting a slot by a very thin margin: competition is
close. Please don't get discouraged, and please do consider submitting
your proposal to Wget again next year, if you are still an eligible student.

The primary motivation for the policies by which I intend to rate
applications is selfish: what is most likely to benefit Wget, both now
and in the long term? To this end, the better an asset it appears a
given student may become to the GNU Wget development team, and the more
important the proposed work is to Wget, the more attractive the
application is to me.

So, without further ado, here are the specific factors I've been using
to evaluate existing applications. These are _not_ listed in order of
priority: they all contribute together towards the final decision.

Proficiency as a Programmer
- ---------------------------

People with significant experience and expertise in coding with the C
programming language, who can demonstrate that they've learned good
coding habits and can avoid common pitfalls in C, are likelier to get
more work done at a higher level of quality, in the period of time
allotted to them for coding. It also means less hand-holding on my part,
which is attractive because I already have insufficient time to do the
work that needs to be done for Wget.

An alternative approach would of course be that I should give
consideration to willing and eager, but less-experienced students, so
that they may have the benefit of enhanced knowledge and experience, and
become overall better coders. However, the reason for Wget's involvement
in GSoC is a selfish one: to better Wget. Bettering others is of course
desirable as a secondary goal; but it cannot be our primary goal.
Therefore, we are looking for students who are already well-studied,
wherever possible.

It is of course impossible for me to establish how proficient someone is
with C unless they supply me with example source code. This can be
examples of patches, contributed to Wget or other projects; however,
unless these patches contribute significant additional code, rather than
being mainly tweaks to existing functionality, they'll tell me
relatively little. The ideal would be to link to complete programs or
libraries that the student has written. It's OK if it's code you now
find embarrassing: I'm interested in what kind of a coder you are _now_,
not what kind of a coder you once were. If you'd do things differently
today, explain _how_. But please give me code. :)

Aptitude for the Proposed Task
- ------------------------------

In addition to understanding how to write in C, you need to understand
the problem domain of whatever it is you're proposing to do, and how it
might be solved. Are you working on HTTP authentication? Your proposal
should demonstrate understanding of RFC 2617, and probably the
underlying security principles. Implementing internationalization
enhancements? I need to see, in the detailed description, or at least
the public comment threads, a description of the solution sufficient
to instill confidence that you understand the basics of technologies
such as UTF-8 encoding, handling transcoding where appropriate, etc. Is
your proposal likely to require organization of medium-to-large
quantities of data? I'd like to see that you have a solid understanding
of algorithms and ADTs, and that you have an idea as to which ones might
be appropriate for storing and looking up the data you'll be using.

* Please note, the above examples are _examples_; they are not meant to
imply anything about whether the students who have actually submitted
proposals for those features have failed to demonstrate aptitude. The
specific cases of HTTP auth and i18n each have exactly one proposal so
far, and I already have confidence in both students involved that they
possess the requisite understanding and skillset.

Understanding of Wget's Source Base
- -----------------------------------

Prior involvement and familiarity with Wget is desirable. My evaluation
of a student application will definitely be augmented if I know that a
student has already familiarized perself with Wget's internals.
Obviously, having submitted or discussed patches on the list
demonstrates such familiarity; if that's not applicable, then briefly
touching on how your enhancements will interact with or change existing
components within Wget will be helpful. I do _not_ want low-level
details: just a very high-level description that is sufficient to
demonstrate that you understand what you'll need to do.

Community Involvement
- ---------------------

Likewise, participation and communication with the existing Wget
community (such as it is) is important. If you start posting questions
related to your project, to the mailing list and on IRC, it demonstrates
to me that you are actively engaged, and especially that my
communication as a GSoC mentor with you will be easy, since we're
already in communication on a regular basis.

You will also earn brownie points if I see that you are answering other
people's questions on the list or on IRC. It's a clear indication that
you will have value to Wget above and beyond your ability to write code
for us. :) It's also a good opportunity to demonstrate an existing
understanding of Wget.

Importance of Enhancement to Wget
- ---------------------------------

Again, in the interests of selfishness, I need to give priority to
enhancements that Wget _really_ needs _right now_, over those that Wget
will need at some point, or ones that are sexy but not critical. If push
comes to shove, improving Wget's security (HTTP Auth or SSL/TLS) is more
important than getting sexy features in like regex support in acc/rej
lists. The MetaDataBase (session info db) probably falls somewhere
between: it's mostly a "sexy" feature, but it offers some fairly huge
potential benefits (particularly, the potential to unambiguously
determine what the local filename is for a given URI, and the ability to
continue from an aborted Wget session). OTOH, it's not critical to Wget
in the way that working authentication and internationalization (given
that non-ASCII TLDs are being introduced this year) are.

Obviously, importance by itself can't be the only factor. If a student's
proposal for a critical feature is of significantly lower quality than a
proposal for a "nice-to-have" feature, I'll probably go for the
"nice-to-have", and implement the critical one myself. Also, if I feel
that the student with a critical-feature proposal is less likely to
continue their involvement with Wget in the long term than a student
proposing a less-critical feature, it's not unlikely I'll go with the
student who I expect to be a long-term asset to the project: the ideal
goal is not that we get two months of free work from a couple of
students, but that each year at GSoC we will have added new and valuable
members to the Wget community. (It is of course my wish that some
students will opt to contribute in helpful ways to Wget even if their
proposal is not accepted, which would certainly reflect well on them
when next year's GSoC comes around.)

Sufficient Amount of Work
- -------------------------

Obviously, I'm less likely to approve a proposal whose workload I do not
believe will fill at least most of a two-month development period. If
your proposed enhancements sound to me like something I could do in two
weeks, I'm not apt to go for it. This is mainly a theoretical
observation: if I feel that your proposal does not represent an
appropriate workload, I will post a comment indicating this, and all you
have to do to make things right is to add enough additional material
based on my feedback.

On the other hand, I have received proposals that offer the moon and the
stars, and maybe a good-sized bit of the sun as well. A little
over-zealousness is fine: I'll post a comment suggesting that you ease
your workload a bit. There are some very excellent proposals that fit
that category, and the students have worked with me to balance the
amount of work involved.

However, a student whose proposal includes fifteen or twenty
enhancements - some of which would take an entire GSoC or more when
taken individually - demonstrates to me a basic lack of understanding of the
problems and their solutions (lack of "Aptitude for the Proposed
Task(s)"). There do exist coders capable of near-superhuman feats of
productivity: certainly the hacker culture tends to encourage such
people; and I do not wish to underestimate the creative power of
caffeinated-and-enthusiastic students on summer break, engaged in coding
for Wget full-time (or, if particularly enthusiastic, quite possibly
more than full-time).

But, if you want me to believe that you have shockingly prodigious
levels of productivity, you need to prove it to me. If you have promised
me what looks like two years of work in two-to-three months, I need you
either to give me documented proof that you have performed equivalently
in the past, or to spend a couple of days working on the code
_now_, so I can see the resulting two weeks' worth of productivity for
myself. Barring that, I am very likely to treat enormous workloads as a
sign of ineptitude rather than of extreme productivity. Call me a skeptic.

Back to ensuring a _sufficient_ quantity of work, though: I understand
that summer here is not summer everywhere: it may be winter where you
are, with school responsibilities still going full-bore. Or you may live
in a country where school exams are still in progress nearly up to
the GSoC midterm evaluations. These are not the end of the world; in
particular, if you have already familiarized yourself with Wget's source
code and begun communication with me about how to go about it, you can
probably take advantage of the month Google has graciously provided as
the "Community Bonding" period to get some actual code accomplished.

Be that as it may, please note that the GSoC program expects that
students are working _full-time_ on their proposed projects, and that
submitting a proposal is an implicit affirmation that you have the free
time to get the work done. Yes, you need to eat. Yes, you need to do
well in school so you can succeed in life. Yes, it sucks if you live in
a country where it's not summer and it's not a break, and it's not fair
that you're made to compete with North Americans who have all the time
they need to dedicate to GSoC. But none of this means that I can afford
to give you special consideration: if you can't do the job, don't apply
for it. As I've said, our interests are intrinsically selfish, and it is
not in our best interests to consider someone who has four free hours a
day to dedicate to a full-time job, and probably isn't taking into
account the unfairly large amounts of homework their professors will
assign to them. Nor is it in Google's best interests if they're spending
money on stipends for people who will not accomplish their tasks.

If you submit a great proposal for something that might be 7 weeks' work
rather than 9, I'm prepared to be lenient. If you inform me that you
will not have a lot of free time to dedicate to Wget during final exams,
but will be able to get some extra work done ahead of the official GSoC
start, this too may be all right. I might even be prepared to go on two
weeks of quality work for your midterm evaluation instead of four, on
the understanding (and confidence) that you will be able to dedicate
yourself to Wget and produce six weeks' work out of the next four. What
I am _not_ prepared to do is agree to let you shift your
responsibilities by a few weeks so that you don't have much to show for
the midterm evaluations and I'm forced to "take it on faith" that you'll
perform as promised for two months starting there. You and I can
negotiate on what exactly is to be done by midterm, but there must be
something substantial enough for me to actually evaluate your work and
progress. Students who fall well short of the work they agreed to have
done by midterm _will_ be dropped, and will not receive the rest of
their stipend from Google.

Informative Proposal
- --------------------

Related to the previous point, it is very important that your proposal
specify, very clearly, what will have been accomplished by the midterm
and final evaluations. The evaluation of your performance, and my
recommendation that you receive your stipend, will be based on the work
you and I have agreed upon for you to do. If this information is missing
from your proposal, or too vague, you'll be asked to clarify it.

As previously mentioned, your proposal needs to demonstrate a basic
understanding of the problem you're trying to solve. It's an informal
requirements document, but should also be a very high-level design
document. I need to understand not only what you are going to do, but
how you are going to do it. As mentioned before, I will generally expect
to know what types of structures and algorithmic tools you'll need to
use to accomplish your job, and how your work will incorporate relevant
standards and RFCs. Clarifying what you do _not_ intend to accomplish
might also be a good idea.

When I say I want to know what types of structures you'll use, I'm
talking about Abstract Data Types (see your Sedgewick or Knuth
textbooks, or check Wikipedia). It does _not_ mean that I want details
of what your C structs will be named, or what your functions will be
named, or what the names of your files will be, or snippets of C code.
That's all good stuff for us to use in discussion; it's generally not
appropriate for a proposal. Please try to focus on how Wget will use it,
as opposed to how it will look in code.

I am not concerned with minor typographical and grammatical
errors. Your English should be good enough that it's not a hindrance to
our communication; it does not need to be excellent.

Ideas for Next Year
- -------------------

While I think much of the above should have been relatively obvious to
one degree or another, some of it may reflect things you didn't
necessarily think of when you first submitted your proposal, and I
apologize that it was not available to you sooner (I was not entirely
aware of all of these things from the start, myself). If you have ideas
about ways to improve the information I give you next year, so you have
the tools at your disposal to write the most attractive proposal
possible, please feel free to suggest them.

It occurs to me that, since I've placed some importance on the amount of
workload, it might be useful in the future if I attach estimated
completion times for the features I list on our ideas page. I'll try to
do this in the future.

It might also be a good idea for me to prepare a questionnaire of
programming- and HTTP-related questions, to discover a student's
knowledge and understanding of essential protocols and technologies.

Explicitly noting the degree of importance for each proposal idea would
help students to choose features that give them a better chance of being
accepted. I'm not sure I want to do that, though, because I'm worried
that this might result in all the proposals being for the top couple of
features; if I'm able to get more than a couple slots in future years,
it would make it more difficult for me to fill third and fourth slots if
there are only proposals for two features. In other words, it may give
an advantage to students in that they can help ensure they're on an
even-footing in terms of what feature they've chosen to implement, but
it may prove to be a disadvantage to Wget, in terms of the variety of
applications to choose from.

I'm inclined not to advertise the priority explicitly, and let the
students themselves figure out the likely importance (though I'm not
averse to spelling it out when asked). It's not all that difficult to
reason out, at any rate.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer,
and GNU Wget Project Maintainer.
http://micah.cowan.name/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFH8/uk7M8hyUobTrERAizZAJ95PJcwSBoLvpp1ov7xo5VGIeZAtgCggCKk
Ar2VHnUAFpsjSUPFUaODMfw=
=pmPz
-----END PGP SIGNATURE-----
