INTRODUCTION TO PYTHON FOR BIOLOGISTS
This course will run from 21st - 25th May 2018 in Glasgow and is being
delivered by Dr Martin Jones, an expert in Python and author of two
books:
Python for Biologists
Advanced Python for Biologists
Course overview: Python is a dynamic, readable language that is a
popular platform for all types of bioinformatics work, from simple
one-off scripts to large, complex software projects. This workshop is
aimed at complete beginners and assumes no prior programming
experience. It gives an overview of the language with an emphasis on
practical problem-solving, using examples and exercises drawn from
various aspects of bioinformatics work. After completing the workshop,
students should be in a position to (1) apply the skills they have
learned to tackle problems in their own research and (2) continue their
Python education in a self-directed way.
This workshop is aimed at all researchers and technical workers with a
background in biology who want to learn programming. The syllabus has
been planned with complete beginners in mind; people with previous
programming experience are welcome to attend as a refresher but may
find the pace a bit slow.
The workshop is delivered over ten half-day sessions (see the detailed
curriculum below). Each session consists of roughly a one-hour lecture
followed by two hours of practical exercises, with breaks at the
organizer’s discretion. There will also be plenty of time for students
to discuss their own problems and data.
Students should have enough biological background to appreciate the
examples and exercise problems (i.e. they should know about DNA and
protein sequences, what translation is, and what introns and exons
are). No previous programming experience or computer skills (beyond the
ability to use a text editor) are necessary, but you'll need to have a
laptop with Python installed.
Module 1: Introduction.
We will start with a general introduction to Python and explain why it
is useful and how learning to program can benefit your research. Some
time will be taken to explain the format of the course. We will outline
the edit-run-fix cycle of software development and talk about how to
avoid common text editing errors. In this session, we also check that
the computing infrastructure for the rest of the course is in place.
Core concepts introduced: source code; text editors; whitespace;
syntax and syntax error; and Python versions.
Module 2: Output and text manipulation.
This session will show students how to write very simple programs that
produce output to the terminal and in doing so become comfortable with
editing and running Python code. This session also introduces many of
the technical terms that we’ll rely on in future sessions. We will run
through some examples of tools for working with text and show how they
work in the context of biological sequence manipulation. We also cover
different types of errors and error messages and learn how to go about
fixing them methodically.
Core concepts introduced: terminals; standard output; variables and
naming; strings and characters; special characters; output formatting;
statements; functions; methods.
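As a rough illustration of the kind of program written in this session
(a sketch only, assuming Python 3; the sequence and variable names are
made up and not taken from the course materials), string methods can be
used to measure and transform a DNA sequence:

```python
# Hypothetical example sequence, invented for this sketch.
dna = "ATGCGATACGCTTGA"

# len() is a function; count() and replace() are string methods.
length = len(dna)
gc_count = dna.count("G") + dna.count("C")
gc_percent = 100 * gc_count / length

# Transcribe DNA to RNA by replacing every T with U.
rna = dna.replace("T", "U")

# Output formatting: build readable lines for the terminal.
print("Length: " + str(length))
print("GC content: {:.1f}%".format(gc_percent))
print("RNA: " + rna)
```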
Module 3: File IO and user interfaces.
We will discuss the importance of files in bioinformatics
pipelines and workflows during this session, and we then explore the
Python interfaces for reading from and writing to files. This involves
introducing the idea of types and objects and a bit of discussion about
how Python interacts with the operating system. The practical session
is spent combining the techniques from session 2 with the file IO tools
to create basic file-processing scripts.
Core concepts introduced: objects and classes; paths and folders;
relationships between variables and values; text and binary files;
newlines.
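A basic file-processing script of the sort described above might look
like the following sketch. The file names and contents are
hypothetical, and a temporary folder is used here only so that the
example is self-contained:

```python
import os
import tempfile

# Hypothetical file names inside a temporary folder (an assumption of
# this sketch, so that it runs without any existing data files).
folder = tempfile.mkdtemp()
input_path = os.path.join(folder, "sequence.txt")
output_path = os.path.join(folder, "sequence_upper.txt")

# Create a small input file to work with.
with open(input_path, "w") as handle:
    handle.write("atgcgatacg\n")

# Read the contents and strip the trailing newline.
with open(input_path) as handle:
    sequence = handle.read().rstrip("\n")

# Write a processed version to a new file, adding the newline back.
with open(output_path, "w") as handle:
    handle.write(sequence.upper() + "\n")

# Read the output back to check what was written.
with open(output_path) as handle:
    result = handle.read()
print(result)
```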
Module 4: Flow control 1: loops.
A discussion of the limitations of the techniques learned in session 3
quickly reveals that flow control is required to write more
sophisticated file-processing programs. At this point we will progress
on to the concept of loops. We look at the way in which Python loops
work, and how they can be used in a variety of contexts. We explore the
use of loops and lists together to tackle some more difficult
problems.
Core concepts introduced: lists and arrays; blocks and
indentation; variable scoping; iteration and the iteration interface.
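To show how loops and lists fit together, here is a minimal sketch (the
list of sequences is invented for illustration) that processes each
item in a list in turn:

```python
# Hypothetical list of sequences, invented for this sketch.
sequences = ["ATGCCC", "ATGAAATTT", "ATG"]

lengths = []
for seq in sequences:
    # The indented block runs once for each item in the list.
    lengths.append(len(seq))
    print(seq + " has length " + str(len(seq)))

print(lengths)
```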
Module 5: Flow control 2: conditionals.
We will use the idea of decision-making in session 5 as a way to
introduce conditional tests and outline the different building-blocks
of conditions before showing how conditions can be combined in an
expressive way. We look at the different ways that we can use
conditions to control program flow, and how we can structure conditions
to keep programs readable.
Core concepts introduced: truth and falsehood; Boolean logic; identity
and equality; evaluation of
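The following sketch, assuming Python 3, shows one way conditions can
be combined to control program flow. The records and the 0.5 threshold
are hypothetical and chosen only for illustration:

```python
# Hypothetical (name, sequence) records, invented for this sketch.
records = [("seq1", "ATGCGC"), ("seq2", "ATATAT"), ("seq3", "GGGCCCAT")]

gc_rich = []
for name, seq in records:
    gc = (seq.count("G") + seq.count("C")) / len(seq)
    # Two conditions combined with "and": both must be true.
    if gc > 0.5 and len(seq) >= 6:
        gc_rich.append(name)
    elif gc == 0:
        print(name + " contains no G or C")
    else:
        print(name + " skipped")

print(gc_rich)
```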
Module 6: Organizing and structuring code.
In session 6 we will discuss functions that we would like to see in
Python before considering how we can add to our computational toolbox
by creating our own. We examine the nuts and bolts of writing functions
before looking at best-practice ways of making them usable. We also
look at a couple of advanced features of Python – named arguments and
Core concepts introduced: argument passing; encapsulation;
data flow through a program.
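As an example of creating our own tool, the sketch below defines a
small function and calls it with both positional and named arguments.
The function name, its parameters and the use of a default value are
assumptions made for this illustration, and Python 3 is assumed:

```python
def get_at_content(dna, places=2):
    """Return the fraction of A and T bases, rounded to `places` decimals."""
    dna = dna.upper()
    at_count = dna.count("A") + dna.count("T")
    return round(at_count / len(dna), places)

# Positional argument; `places` falls back to its default value.
at_default = get_at_content("ATGC")

# Named arguments make the call easier to read.
at_named = get_at_content(dna="aattg", places=1)

print(at_default)
print(at_named)
```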
Module 7: Regular expressions.
A range of common problems in bioinformatics can be described in terms
of pattern matching; we will discuss these and give an overview of
Python’s regex tools. We look at the building blocks of regular
expressions themselves, and learn how they are a general solution to
the problem of describing patterns in strings, before practising
writing some specific examples of regular expressions.
Core concepts introduced: domain-specific languages; sessions and namespaces.
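A minimal sketch of pattern matching with Python's re module is given
below. The sequence is made up; GAATTC is the EcoRI recognition site,
while the second pattern is simply an illustration of a character
class:

```python
import re

# Hypothetical DNA sequence, invented for this sketch.
dna = "ATCGCGAATTCACGGACCTT"

# A fixed pattern: the EcoRI recognition site GAATTC.
match = re.search(r"GAATTC", dna)
if match:
    print("GAATTC found at position " + str(match.start()))

# A character class: GG, then either A or T, then CC.
sites = []
for m in re.finditer(r"GG[AT]CC", dna):
    sites.append(m.start())

print(sites)
```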
Module 8: Dictionaries.
We discuss a few examples of key-value data and see how the problem of
storing them is a common one across bioinformatics and programming in
general. We learn about the syntax for dictionary creation and
manipulation before talking about the situations in which dictionaries
are a better fit than the data structures we have learned about thus
far.
Core concepts introduced: paired data types; hashing; key
uniqueness; argument unpacking and tuples.
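Counting is a typical use of key-value data. The sketch below (with an
invented sequence) stores a count for each base in a dictionary and
then unpacks each key-value pair when printing:

```python
# Hypothetical DNA sequence, invented for this sketch.
dna = "AATGATGAACGAC"

counts = {}
for base in dna:
    # get() returns 0 the first time a base is seen.
    counts[base] = counts.get(base, 0) + 1

# Each item is a (key, value) tuple, unpacked into two variables.
for base, count in sorted(counts.items()):
    print(base + ": " + str(count))
```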
Module 9: Interaction with the file system.
In the final session we discuss the role of Python in the context of a
bioinformatics workflow, and how it is often used as a language to
“glue” various other components together. We then look at the Python
tools for carrying out file and directory manipulation, and for running
external programs – two tasks that are often necessary in order to
integrate our own programs with existing ones.
Core concepts introduced: processes and sub-processes; the shell and shell utilities;
program return values.
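The two tasks mentioned above can be sketched as follows. This assumes
Python 3.7 or later; the folder and file names are hypothetical, and
the Python interpreter itself stands in for a real external
bioinformatics tool so that the example runs anywhere:

```python
import os
import subprocess
import sys
import tempfile

# Directory manipulation: create a folder and list the files in it.
folder = tempfile.mkdtemp()
results_dir = os.path.join(folder, "results")
os.mkdir(results_dir)
with open(os.path.join(results_dir, "a.fasta"), "w") as handle:
    handle.write(">seq1\nATGC\n")

fasta_files = []
for name in os.listdir(results_dir):
    if name.endswith(".fasta"):
        fasta_files.append(name)
print(fasta_files)

# Running an external program and checking its return value.
# sys.executable is a stand-in for a real command-line tool.
completed = subprocess.run(
    [sys.executable, "-c", "print('done')"],
    capture_output=True,
    text=True,
)
print(completed.returncode)
print(completed.stdout.strip())
```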
Please email any inquiries to oliverhoo...@informatics.com or visit our
Please feel free to distribute this material anywhere you feel is
suitable.
Other upcoming courses:
1. February 19th – 23rd 2018
MOVEMENT ECOLOGY (MOVE01)
Margam Discovery Centre, Wales, Dr Luca Borger, Dr Ronny Wilson, Dr
2. February 19th – 23rd 2018
GEOMETRIC MORPHOMETRICS USING R (GMMR01)
Margam Discovery Centre, Wales, Prof. Dean Adams, Prof. Michael
Collyer, Dr. Antigoni Kaliontzopoulou
3. March 5th – 9th 2018
SPATIAL PRIORITIZATION USING MARXAN (MRXN01)
Margam Discovery Centre, Wales, Jennifer McGowan
4. March 12th – 16th 2018
ECOLOGICAL NICHE MODELLING USING R (ENMR02)
Glasgow, Scotland, Dr. Neftali Sillero
5. March 19th – 23rd 2018
BEHAVIOURAL DATA ANALYSIS USING MAXIMUM LIKELIHOOD IN R (BDML01)
Glasgow, Scotland, Dr William Hoppitt
6. April 9th – 13th 2018
NETWORK ANALYSIS FOR ECOLOGISTS USING R (NTWA02)
Glasgow, Scotland, Dr. Marco Scotti
7. April 16th – 20th 2018
INTRODUCTION TO STATISTICAL MODELLING FOR PSYCHOLOGISTS USING R
Glasgow, Scotland, Dr. Dale Barr, Dr Luc Bussierre
8. April 23rd – 27th 2018
MULTIVARIATE ANALYSIS OF ECOLOGICAL COMMUNITIES USING THE VEGAN PACKAGE
Glasgow, Scotland, Dr. Peter Solymos, Dr. Guillaume
9. April 30th – May 4th 2018
QUANTITATIVE GEOGRAPHIC ECOLOGY: MODELING GENOMES, NICHES, AND
Glasgow, Scotland, Dr. Dan Warren, Dr. Matt Fitzpatrick
10. May 7th – 11th 2018
ADVANCES IN MULTIVARIATE ANALYSIS OF SPATIAL ECOLOGICAL DATA USING R
(MVSP02)
CANADA (QUEBEC), Prof. Pierre Legendre, Dr. Guillaume Blanchet
11. May 14th – 18th 2018
INTRODUCTION TO MIXED (HIERARCHICAL) MODELS FOR BIOLOGISTS (IMBR01)
CANADA (QUEBEC), Prof Subhash Lele
12. May 21st – 25th 2018
INTRODUCTION TO PYTHON FOR BIOLOGISTS (IPYB05)
SCENE, Scotland, Dr. Martin Jones
13. May 21st – 25th 2018
INTRODUCTION TO REMOTE SENSING AND GIS FOR ECOLOGICAL APPLICATIONS
Glasgow, Scotland, Prof. Duccio Rocchini, Dr. Luca Delucchi
14. May 28th – 31st 2018
STABLE ISOTOPE MIXING MODELS USING SIAR, SIBER AND MIXSIAR (SIMM04)
CANADA (QUEBEC), Dr. Andrew Parnell, Dr. Andrew Jackson
15. May 28th – June 1st 2018
ADVANCED PYTHON FOR BIOLOGISTS (APYB02)
SCENE, Scotland, Dr. Martin Jones
16. June 12th – 15th 2018
SPECIES DISTRIBUTION MODELLING (DBMR01)
Myuna Bay sport and recreation, Australia, TBC
17. June 18th – 22nd 2018
STRUCTURAL EQUATION MODELLING FOR ECOLOGISTS AND EVOLUTIONARY
BIOLOGISTS USING R (SEMR02)
Myuna Bay sport and recreation, Australia, TBC
18. July 2nd – 5th 2018
SOCIAL NETWORK ANALYSIS FOR BEHAVIOURAL SCIENTISTS USING R (SNAR01)
Glasgow, Scotland, Prof James Curley
19. July 8th – 12th 2018
MODEL BASED MULTIVARIATE ANALYSIS OF ABUNDANCE DATA USING R (MBMV02)
Glasgow, Scotland, Prof David Warton
20. July 16th – 20th 2018
PRECISION MEDICINE BIOINFORMATICS: FROM RAW GENOME AND TRANSCRIPTOME
DATA TO CLINICAL INTERPRETATION (PMBI01)
Glasgow, Scotland, Dr Malachi Griffith, Dr. Obi Griffith
21. July 23rd – 27th 2018
EUKARYOTIC METABARCODING (EUKB01)
Glasgow, Scotland, Dr. Owen Wangensteen
Oliver Hooker PhD.
Oliver Hooker <oliverhoo...@prinformatics.com>