Eukaryotic metabarcoding (EUKB01)

Delivered by Dr. Owen Wangensteen

20th January 2018 - 24th January 2018

http://www.prinformatics.com/course/eukaryotic-metabarcoding-eukb01/

This 5-day course will run from 20th – 24th January 2018 at the SCENE field 
station, Loch Lomond National Park, Scotland.

Course Overview:
Metabarcoding techniques are a set of novel genetic tools for assessing 
biodiversity of natural communities. Their potential applications include 
(but are not limited to) accurate assessment of water quality and soil 
diversity, trophic analyses of digestive contents, early detection of 
non-indigenous species, studies of global ecological patterns and 
biomonitoring of anthropogenic impacts. This course will give an overview of 
metabarcoding procedures with an emphasis on practical problem-solving and 
hands-on work 
using analysis pipelines on real datasets. After completing the course, 
students should be in a position to (1) understand the potential and 
capabilities of metabarcoding, (2) run complete analyses of metabarcoding 
pipelines and obtain diversity inventories and ecologically interpretable 
data from raw next-generation sequence data and (3) design their own 
metabarcoding projects, using bespoke primer sets and custom reference 
databases. All course materials (including copies of presentations, 
practical exercises, data files, and example scripts prepared by the 
instructing team) will be provided electronically to participants.

Intended Audience:
This workshop is mainly aimed at researchers and technical workers with a 
background in ecology, biodiversity or community biology who want to use 
molecular tools for biodiversity research, and at researchers in other areas 
of bioinformatics who want to learn ecological applications for biodiversity 
assessment. In general, it is suitable for every researcher who wants to 
join the growing community of metabarcoders worldwide.

Monday 20th
Session 1. Introduction to metabarcoding procedures. The metabarcoding 
pipeline.
In this session students will be introduced to the key concepts of 
metabarcoding and the different next-generation sequencing platforms 
currently available for implementing this technology. The kinds of results 
that metabarcoding projects can yield will be illustrated with real-life 
examples. We will outline the different steps of a typical 
metabarcoding pipeline and introduce some key concepts. In this session, we 
will check that the computing infrastructure for the rest of the course is 
in place and all the needed software is installed. Core concepts 
introduced: next-generation sequencer, multiplexing, NGS library, 
metabarcoding pipeline, metabarcoding marker, clustering algorithms, 
molecular operational taxonomic unit (MOTU), taxonomic assignment.

Session 2. Metabarcoding markers. Primer design. PCR and library 
preparation protocols.
In this session students will learn about the various kinds of molecular 
markers that can be used for metabarcoding different kinds of samples and 
the quality of the information which can be retrieved from them. They will 
learn about the most commonly used primer sets for each target taxonomic 
group and how to use the software available for designing their own custom 
metabarcoding primers. They will also learn about sample tags, library tags, 
adapter sequences, PCR protocols and library preparation procedures. Core 
concepts introduced: metabarcoding marker, universality, specificity, 
taxonomic range, taxonomic resolution, primer bias, amplification errors, 
sequencing errors, in silico PCR, sample tags, library tags, adapter 
sequences, PCR, library preparation kits, PCR-free methods, avoiding 
contaminations, good laboratory practice.
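
As a rough illustration of the in silico PCR concept listed above, the following minimal Python sketch searches a template sequence for a primer pair and returns the fragment between them. It is a toy example, not the ecoPCR program used in the course: it allows IUPAC degenerate bases in the primers but no mismatches, and all function names and sequences are hypothetical.

```python
# Toy in silico PCR: find a primer pair in a template and return the amplicon.
# Assumption: primers may contain IUPAC degenerate codes; no mismatches allowed.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_matches(primer, site):
    """True if every template base is allowed by the primer code at that position."""
    return len(primer) == len(site) and all(
        base in IUPAC[code] for code, base in zip(primer, site)
    )


def find_primer(primer, template, start=0):
    """Return the first position >= start where the primer matches, or -1."""
    for i in range(start, len(template) - len(primer) + 1):
        if primer_matches(primer, template[i:i + len(primer)]):
            return i
    return -1


def in_silico_pcr(template, forward, reverse):
    """Return the fragment between the two primers (primers excluded), or None."""
    fwd_pos = find_primer(forward, template)
    if fwd_pos < 0:
        return None
    # The reverse primer anneals to the opposite strand, so on the template
    # strand we look for its reverse complement downstream of the forward primer.
    rev_site = reverse_complement(reverse)
    rev_pos = find_primer(rev_site, template, fwd_pos + len(forward))
    if rev_pos < 0:
        return None
    return template[fwd_pos + len(forward):rev_pos]
```

Calling in_silico_pcr on each sequence of a reference database, and recording which taxa yield an amplicon, gives a first idea of the universality and taxonomic range of a candidate primer pair.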

Tuesday 21st
Session 3. The OBITools pipeline. First steps and quality control.
In this session, we will start to work with the OBITools software suite, 
using a real sequence dataset as an example for testing our metabarcoding 
pipeline. We will outline the steps needed to start analysing raw data from 
next-generation sequencers. The students will learn about the different 
data formats used by OBITools for working with sequences and they will 
perform protocols for quality control, paired-end alignment, sequence 
filtering, removal of chimeric sequences, sample demultiplexing, format 
conversion and dereplication of unique sequences. Core concepts introduced: 
fastq, fasta and extended fasta formats, Phred quality score, paired-end 
alignment, demultiplexing, sequence filtering, chimeras, dereplication, 
unique sequences, reads.
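
Two of the concepts above, Phred quality scores and dereplication, can be sketched in a few lines of Python. This is only an illustration and not the OBITools implementation; it assumes the common Phred+33 encoding of fastq quality strings, and the function names are hypothetical.

```python
from collections import Counter


def phred_scores(quality_line, offset=33):
    """Decode a fastq quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def error_probability(q):
    """Probability that a base call with Phred score q is wrong: 10^(-q/10)."""
    return 10 ** (-q / 10)


def mean_quality(quality_line):
    """Average Phred score of a read, a simple filtering criterion."""
    scores = phred_scores(quality_line)
    return sum(scores) / len(scores)


def dereplicate(reads):
    """Collapse identical reads into (sequence, count) pairs, most abundant first."""
    return Counter(reads).most_common()
```

For example, a Phred score of 20 corresponds to an error probability of 1 in 100, and dereplicating the reads ACGT, ACGT and TTGA yields two unique sequences with counts 2 and 1.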

Session 4. Clustering algorithms. Constant and variable identity thresholds.
In this session, we will introduce different algorithms available for 
clustering sequences into molecular operational taxonomic units (MOTUs). We 
will learn the differences between constant and variable percent identity 
thresholds for delineating MOTUs. We will run some of these algorithms 
with our example dataset and will analyse the results from different 
methods. Core concepts introduced: MOTU, reference clustering, de novo 
clustering, unsupervised-learning clustering, Bayesian clustering, step-by-
step aggregation methods, identity threshold, variable identity threshold, 
singleton sequences, sequence mapping, abundance recalculation.
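
To make the idea of a constant identity threshold concrete, here is a minimal Python sketch of greedy de novo clustering. It is not one of the algorithms taught in the course: identity is assumed to be 1 minus the edit distance divided by the length of the longer sequence, the 97% default is only a commonly quoted value, and all names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # match or substitution
        prev = curr
    return prev[-1]


def identity(a, b):
    """Fraction of identity, assumed here as 1 - distance / longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def greedy_cluster(unique_seqs, threshold=0.97):
    """Cluster dereplicated sequences ({sequence: count}) into MOTUs.

    Sequences are visited from most to least abundant. Each one joins the
    first MOTU whose centroid is at least `threshold` identical to it;
    otherwise it becomes the centroid of a new MOTU. MOTU abundances are
    recalculated as the sum of the counts of their members.
    """
    motus = []
    for seq, count in sorted(unique_seqs.items(), key=lambda kv: (-kv[1], kv[0])):
        for motu in motus:
            if identity(seq, motu["centroid"]) >= threshold:
                motu["members"].append(seq)
                motu["abundance"] += count
                break
        else:
            motus.append({"centroid": seq, "members": [seq], "abundance": count})
    return motus
```

A variable-threshold method would replace the single threshold argument with a criterion that depends on the data, which is one of the differences discussed in this session.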

Wednesday 22nd – Classes from 09:00 to 17:00

Session 5. Taxonomic assignment. The ecotag algorithm. Reference databases.
In this session the students will learn about different algorithms for 
taxonomic assignment of MOTUs. The ecotag algorithm will be used for adding 
taxonomic information to the MOTUs in our example dataset and the results 
will be compared to those from other assignment software. Core concepts 
introduced: reference database, identity assignment, BLAST, phylogenetic 
assignment, best match, higher taxa assignment.

Session 6. Generating, improving and curating reference databases.
The quality of the reference database used for taxonomic assignment is 
crucial for the accuracy and applicability of the resulting datasets from 
any metabarcoding project. In this session the students will learn how to 
build local reference databases from the information available in public 
sequence repositories and how to add custom sequences to existing reference 
databases. They will also learn how sequence reference databases interact 
with taxonomy databases for retrieving the phylogenetic information needed 
by the assignment algorithms. Core concepts introduced: ecoPCR, sequence 
reference database, taxonomic database, taxonomic identifier (taxid), 
GenBank, European Nucleotide Archive (ENA), Barcode Of Life Datasystems 
(BOLD), SILVA database.

Thursday 23rd
Session 7. Refining and analysing the final dataset. Collapsing, 
renormalising and blank correction. α- and β-diversity patterns.
In this session, students will learn about procedures for refining the 
final datasets obtained from the previous pipeline. They will learn about 
blank correction, renormalization procedures for deleting false positive 
results, and taxonomy collapsing of related MOTUs for obtaining enhanced 
final datasets. We will also discuss how to interpret these final datasets 
to obtain ecologically relevant information. Resampling and rarefying 
procedures are introduced. Qualitative and quantitative indices for 
assessing dissimilarity between samples are explained. We will introduce 
the UniFrac distance between samples, an index that takes into account not 
only the abundances of the different MOTUs but also their taxonomic 
affinity. Core concepts introduced: renormalization, taxonomy collapsing, 
blank correction, α-diversity, β-diversity, rarefaction, MOTU richness, 
UniFrac distances, multidimensional scaling (MDS).
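
Some of the indices mentioned above can be sketched in plain Python. The example below is a minimal illustration, not the course material: it assumes each sample is a list of read counts per MOTU (same MOTU order in every sample), uses Jaccard and Bray-Curtis as examples of qualitative and quantitative dissimilarities, and does not cover UniFrac, which also needs the relationships between MOTUs. All function names are hypothetical.

```python
import math
import random


def richness(counts):
    """MOTU richness: number of MOTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon diversity index (natural logarithm), an alpha-diversity measure."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def jaccard_dissimilarity(a, b):
    """Qualitative (presence/absence) dissimilarity between two samples."""
    present_a = {i for i, c in enumerate(a) if c > 0}
    present_b = {i for i, c in enumerate(b) if c > 0}
    union = present_a | present_b
    return 1.0 - len(present_a & present_b) / len(union) if union else 0.0


def bray_curtis(a, b):
    """Quantitative (abundance-based) dissimilarity between two samples."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1.0 - 2.0 * shared / (sum(a) + sum(b))


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement so samples share one depth."""
    pool = [i for i, c in enumerate(counts) for _ in range(c)]
    rarefied = [0] * len(counts)
    for i in random.Random(seed).sample(pool, depth):
        rarefied[i] += 1
    return rarefied
```

A matrix of pairwise dissimilarities computed this way is the usual input for an ordination such as multidimensional scaling.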

Session 8. Presenting the final results. Online resources and future 
developments.
In this session we will continue with the presentation of final results. 
Students will learn how to plot taxonomic summaries from their datasets, 
including krona plots, a type of graphical representation that shows the 
relative abundances of reads at different taxonomic levels. The rest of the 
session will be dedicated to introducing current research and possible future 
developments of metabarcoding / metagenomics techniques and to provide a 
list of useful resources for further learning, continuous training and 
future research opportunities. Core concepts introduced: taxonomic summary, 
krona plots, target capture, metagenomics, mitogenomics, long range PCR, 
nanopore sequencing, mBRAVE.net, metabarcoding.org.

Friday 24th
Session 9. Customization.
This session will be dedicated to customizing individual metabarcoding 
projects according to the specific needs of the students. We will 
discuss the best strategies to use for obtaining good quality results from 
our metabarcoding projects, by optimizing time, money and computing 
resources. The idea is to make this session as interactive and useful as 
possible. We will present our current and future projects in the format of 
an open discussion and we will try to propose the best solutions for every 
potential problem in a collaborative way.

Session 10.
Optional free afternoon to cover previous modules, discuss data or continue 
with the customization session.

Please email any inquiries to [email protected] or visit our 
website www.prinformatics.com

Please feel free to distribute this material anywhere you feel is suitable.

1.      CODING, DATA MANAGEMENT AND SHINY APPLICATIONS USING RSTUDIO FOR 
EVOLUTIONARY BIOLOGISTS AND ECOLOGISTS #CDSR
15th - 19th May, Scotland Dr. Aline Quadros
http://www.prinformatics.com/course/coding-data-management-and-shiny-applications-using-rstudio-for-evolutionary-biologists-and-ecologists-cdsr01/

2.      BIOINFORMATICS FOR GENETICISTS AND BIOLOGISTS #BIGB
3rd – 7th July 2017, Scotland, Dr. Nic Blouin, Dr. Ian Misner
http://www.prinformatics.com/course/bioinformatics-for-geneticists-and-biologists-bigb02/

3.      INTRODUCTION TO BIOINFORMATICS USING LINUX #IBUL
16th – 20th October, Scotland, Dr. Martin Jones
http://www.prstatistics.com/course/introduction-to-bioinformatics-using-linux-ibul02/

4.      INTRODUCTION TO PYTHON FOR BIOLOGISTS #IPYB
27th Nov – 1st Dec, Wales, Dr. Martin Jones
http://www.prinformatics.com/course/introduction-to-python-for-biologists-ipyb04/

5.      DATA VISUALISATION AND MANIPULATION USING PYTHON #DVMP
11th – 15th December 2017, Wales, Dr. Martin Jones
http://www.prinformatics.com/course/data-visualisation-and-manipulation-using-python-dvmp01/

6.      EUKARYOTIC METABARCODING
20th – 24th January 2018, Scotland, Dr. Owen Wangensteen
http://www.prinformatics.com/course/eukaryotic-metabarcoding-eukb01/
