[nlpatumd] Advice for future MS thesis students

2018-02-12 Thread Ted Pedersen tpede...@d.umn.edu [nlpatumd]
These are just a few notes for new CS graduate students who are
working with me on their MS thesis research, or would like to do so.

Your most important first step will be to take CS 5761, Introduction
to Natural Language Processing, in the Fall semester of your first
year. This class is typically offered each Fall, and it is important
that you take it in your first semester. This will help you get
oriented to NLP, and will in the end save you a lot of time as you get
up to speed preparing to do your thesis research. This class will
count towards your CS electives in your MS degree.

Any CS graduate student working with me as their advisor will be
expected to complete a Plan A Thesis. While there is a Plan B project
option, I have only allowed that in a few (two!) very special cases
where a student joins me only during their second and final year in
the CS program.

We will work together to identify a suitable research topic, but in
general it will be in the area of Natural Language Processing and/or
Computational Linguistics. You can get a very good idea of the kind of
research we do by looking at previous MS students' work :

http://www.d.umn.edu/~tpederse/masters.html

You should also look at the various software tools we have created
over the years, and even give them a try. Some of them have web
interfaces, others of them must be installed on a Linux system and run
from the command line. You can find those by browsing around my web
pages ( http://www.d.umn.edu/~tpederse ). If we have done our job well,
you should be able to install and use these packages simply by
consulting the documentation. Let us know how well we have met that
goal, and suggestions for improvement are always welcome.

Most of the software packages have mailing lists associated with them,
and you are encouraged to join any and all that you find interesting.
In addition, there is an NLP @ UMD mailing list that I ask all students
who work with me to join, and anyone else who is interested in what we
do here is surely welcome. This is a place where we make announcements
and point out issues in the news that might be of interest. Please
sign up here :

https://groups.google.com/forum/#!forum/duluthnlp

In addition to your required classes, it is possible to take free
electives while you are a graduate student - this is included in your
TA or RA funding in fact! These are typically 5000 level classes (or
below) that do not count towards your degree (but you can still take
them for credit and grades). This could include areas of personal
interest (such as Theater or Physical Education), or other academic
areas (such as Mathematics, Economics, Linguistics, Psychology, or
Engineering Management). As long as you are doing well in your
required classes and your thesis research, I will very likely approve.

It will take you four semesters of hard work to complete your thesis.
We will start working in the first weeks of your first year, and we'll
continue through to the end of your second year. Your course load at
UMD is relatively light (only 6 courses over 4 semesters) - that is to
allow you to spend significant time on your thesis work each semester.

You should plan on finishing your thesis by May of your second year.
This will require you to work steadily on your thesis through your
four semesters at UMD. If you do this, you will certainly be able to
finish your thesis by May of your second year. This will enable you to
move on to whatever comes next most conveniently and without
distractions or interruption. Students who delay finishing their
thesis until the summer of their second year or later often have
unexpected difficulties in making the transition to whatever is coming
next.

We will create a plan during your first semester that will give you a
realistic schedule that allows you to finish by May of your second
year. It is really your responsibility to make sure you both
understand and follow that plan. Also note that in some summers I may
not be available to supervise or review your thesis work, so if you do
not finish by May I may not be available to you until the following
September.

Your thesis work will inevitably include programming (typically in
Perl or Python). Any code that is used to produce results that are
either published or that appear in your thesis *must* be released as
open-source. The motivations behind this policy are described in a
short piece that appeared in Computational Linguistics in 2008 :

http://www.d.umn.edu/~tpederse/Pubs/pedersen-last-word-2008.pdf

This philosophy is central to much of what we do, so please make sure
you go over the above very carefully.

We will write your thesis as we go, and will initially focus on
outlining the background of your selected thesis topic - what is the
problem, how are you intending to solve that problem, how have others
tried to solve it, and what resources do you need in order to carry
out your work. As we go on we will add details about our methods and

[nlpatumd] Advice for Future MS Thesis Students

2015-06-14 Thread Ted Pedersen duluth...@gmail.com [nlpatumd]
These are just a few notes for new CS graduate students who are
working with me on their MS thesis research, or would like to do so.


Your most important first step will be to take CS 5761, Introduction
to Natural Language Processing, in the Fall semester of your first
year. This class is typically offered each Fall, and it is important
that you take it in your first semester. This will help you get
oriented to NLP, and will in the end save you a lot of time as you get
up to speed preparing to do your thesis research. This class will
count towards your CS electives in your major, and does NOT duplicate
CS 8761, Natural Language Processing, which is a graduate level class
you will take (in the Fall of your second year).


Any CS graduate student working with me as their advisor will be
required to complete a Plan A Thesis. While there is a Plan B project
option, I have only allowed that in a few (two!) very special cases
where a student joins me only during their second and final year in
the CS program.


We will work together to identify a suitable research topic, but in
general it will be in the area of Natural Language Processing and/or
Computational Linguistics. You can get a very good idea of the kind of
research we do by looking at previous MS students' work :


http://www.d.umn.edu/~tpederse/masters.html


You should also look at the various software tools we have created
over the years, and even give them a try. Some of them have web
interfaces, others of them must be installed on a Linux system and run
from the command line. You can find those by browsing around my web
pages ( http://www.d.umn.edu/~tpederse ). If we have done our job well,
you should be able to install and use these packages simply by
consulting the documentation. Let us know how well we have met that
goal, and suggestions for improvement are always welcome.


Most of the software packages have mailing lists associated with them,
and you are encouraged to join any and all that you find interesting.
In addition, there is an NLP @ UMD mailing list that I ask all students
who work with me to join, and anyone else who is interested in what we
do here is surely welcome. This is a place where we make announcements
and point out issues in the news that might be of interest. Please
sign up here :


http://groups.yahoo.com/group/nlpatumd/


In addition to completing an original piece of research (i.e., your
thesis), you will be required to take four 8000 level CS classes,
plus two additional classes that may be either out of department or
within the CS department. One of your additional classes should be CS
5761, Introduction to Natural Language Processing, and the other may
be of your own choosing.


It is possible to take free electives while you are a graduate student
- this is included in your TA or RA funding in fact! These are
typically 5000 level classes (or below) that do not count towards your
degree (but you can still take them for credit and grades). This could
include areas of personal interest (such as Theater or Physical
Education), or other academic areas (such as Mathematics, Economics,
Linguistics, Psychology, or Engineering Management). As long as you
are doing well in your required classes and your thesis research, I
will very likely approve.


It will take you four semesters of hard work to complete your thesis.
We will start working in the first weeks of your first year, and we'll
continue through to the end of your second year. Your course load at
UMD is relatively light (only 6 courses over 4 semesters) - that is to
allow you to spend significant time on your thesis work each semester.


You should plan on finishing your thesis by May of your second year.
This will require you to work steadily on your thesis through your
four semesters at UMD. If you do this, you will certainly be able to
finish your thesis by May of your second year. This will enable you to
move on to whatever comes next most conveniently and without
distractions or interruption. Students who delay finishing their
thesis until the summer of their second year or later often have
unexpected difficulties in making the transition to whatever is coming
next. We will create a plan during your first year (in your thesis
proposal) that will give you a realistic schedule that allows you to
finish by May of your second year. It is really your responsibility to
make sure you both understand and follow that plan. Also note that in
some summers I may not be available to supervise or review your thesis
work, so if you do not finish by May I may not be available to you
until the following September.


Your thesis work will inevitably include programming (typically in
Perl or Python). Any code that is used to produce results that are
either published or that appear in your thesis *must* be released as
open-source. The motivations behind this policy are described in a
short piece that appeared in Computational Linguistics in 2008 :