[Bioc-devel] Bioconductor 3.4 is released

Hervé Pagès Tue, 18 Oct 2016 14:23:07 -0700

Thanks to all the developers for your contribution to the project!


-------------------------------------------------------------------
-------------------------------------------------------------------

October 18, 2016

Bioconductors:

We are pleased to announce Bioconductor 3.4, consisting of 1294
software packages, 309 experiment data packages, and 933
up-to-date annotation packages.

There are 100 new software packages, and many updates and improvements
to existing packages; Bioconductor 3.4 is compatible with R 3.3,
and is supported on Linux, 32- and 64-bit Windows, and Mac OS X.  This
release will include an updated Bioconductor Amazon Machine Image[1]
and Docker containers[2].

Visit http://bioconductor.org[3] for details and downloads.

[1]: http://bioconductor.org/help/bioconductor-cloud-ami/
[2]: http://bioconductor.org/help/docker/
[3]: http://bioconductor.org

Contents
--------

* Getting Started with Bioconductor 3.4
* New Software Packages
* NEWS from new and existing packages
* Deprecated and Defunct Packages

Getting Started with Bioconductor 3.4
======================================

To update to or install Bioconductor 3.4:

1. Install R 3.3 (>= 3.3.1 recommended).  Bioconductor 3.4 has been
   designed expressly for this version of R.

2. Follow the instructions at http://bioconductor.org/install/

New Software Packages
=====================

There are 100 new software packages in this release of Bioconductor.

alpine - Fragment sequence bias modeling and correction for RNA-seq transcript abundance estimation.

AMOUNTAIN - A pure data-driven gene network, the weighted gene co-expression network (WGCN), can be constructed from expression profiles alone. Different layers in such networks may represent different time points, multiple conditions or various species. AMOUNTAIN aims to search for active modules in multi-layer WGCNs using a continuous optimization approach.

anamiR - This package is intended to identify potential miRNA-target gene interactions from miRNA and mRNA expression data. It contains functions for statistical tests, databases of miRNA-target gene interactions and functional analysis.

Anaquin - The project is intended to support the use of sequins (synthetic sequencing spike-in controls) owned and made available by the Garvan Institute of Medical Research. The goal is to provide a standard open source library for quantitative analysis, modelling and visualization of spike-in controls.

annotatr - Given a set of genomic sites/regions (e.g. ChIP-seq peaks, CpGs, differentially methylated CpGs or regions, SNPs, etc.) it is often of interest to investigate the intersecting genomic annotations. Such annotations include those relating to gene models (promoters, 5'UTRs, exons, introns, and 3'UTRs), CpGs (CpG islands, CpG shores, CpG shelves), or regulatory sequences such as enhancers. The annotatr package provides an easy way to summarize and visualize the intersection of genomic sites/regions with genomic annotations.

ASAFE - Given admixed individuals' bi-allelic SNP genotypes and ancestry pairs (where each ancestry can take one of three values) for multiple SNPs, perform an EM algorithm to deal with the fact that SNP genotypes are unphased with respect to ancestry pairs, in order to estimate ancestry-specific allele frequencies for all SNPs.

ASpli - Integrative pipeline for the analysis of alternative splicing using RNAseq.

BaalChIP - The package offers functions to process multiple ChIP-seq BAM files and detect allele-specific events. Computes allele counts at individual variants (SNPs/SNVs), implements extensive QC steps to remove problematic variants, and utilizes a Bayesian framework to identify statistically significant allele-specific events. BaalChIP is able to account for copy number differences between the two alleles, a known phenotypical feature of cancer samples.

BayesKnockdown - A simple, fast Bayesian method for computing posterior probabilities for relationships between a single predictor variable and multiple potential outcome variables, incorporating prior probabilities of relationships. In the context of knockdown experiments, the predictor variable is the knocked-down gene, while the other genes are potential targets. Can also be used for differential expression/2-class data.


bigmelon - Methods for working with Illumina arrays using gdsfmt.

bioCancer - bioCancer is a Shiny App to visualize and analyse interactively Multi-Assays of Cancer Genomic Data.

BiocWorkflowTools - Provides functions to ease the transition between Rmarkdown and LaTeX documents when authoring a Bioconductor Workflow.

CancerInSilico - The CancerInSilico package provides an R interface for running mathematical models of tumor progression. This package has the underlying models implemented in C++ and the output and analysis features implemented in R.

CancerSubtypes - CancerSubtypes integrates the current common computational biology methods for cancer subtypes identification and provides a standardized framework for cancer subtype analysis based on the genomic datasets.

ccmap - Finds drugs and drug combinations that are predicted to reverse or mimic gene expression signatures. These drugs might reverse diseases or mimic healthy lifestyles.

CCPROMISE - Perform canonical correlation between two forms of high dimensional genetic data, and associate the first component of each form of data with a specific biologically interesting pattern of associations with multiple endpoints. A probe level analysis is also implemented.

CellMapper - Infers cell type-specific expression based on co-expression similarity with known cell type marker genes. Can make accurate predictions using publicly available expression data, even when a cell type has not been isolated before.

chromstaR - This package implements functions for combinatorial and differential analysis of ChIP-seq data. It includes uni- and multivariate peak-calling, export to genome browser viewable files, and functions for enrichment analyses.

clusterExperiment - This package provides functions for running and comparing many different clusterings of single-cell sequencing data.

covEB - Using Bayesian methods to estimate correlation matrices assuming that they can be written and estimated as block diagonal matrices. These block diagonal matrices are determined using shrinkage parameters that set values below this parameter to zero.

covRNA - This package provides the analysis methods fourthcorner and RLQ analysis for large-scale transcriptomic data.

crisprseekplus - Bioinformatics platform containing interface to work with offTargetAnalysis and compare2Sequences in the CRISPRseek package, and GUIDEseqAnalysis.

crossmeta - Implements cross-platform and cross-species meta-analyses of Affymetrix, Illumina, and Agilent microarray data. This package automates common tasks such as downloading, normalizing, and annotating raw GEO data. A user interface makes it easy to select control and treatment samples for each contrast and study. This input is used for subsequent surrogate variable analysis (models unaccounted sources of variation) and differential expression analysis. Final meta-analysis of differential expression values can include genes measured in only a subset of studies.

ctsGE - Methodology for supervised clustering of potentially many predictor variables, such as genes, in time series datasets. Provides functions that help the user assign genes to a predefined set of model profiles.

CVE - Shiny app for interactive variant prioritisation in precision cancer medicine. The input file for CVE is the output file of the recently released Oncotator Variant Annotation tool summarising variant-centric information from 14 different publicly available resources relevant for cancer research. Interactive prioritisation in CVE is based on known germline and cancer variants, DNA repair genes and functional prediction scores. An optional feature of CVE is the exploration of the tumour-specific pathway context that is facilitated using co-expression modules generated from publicly available transcriptome data. Finally, druggability of prioritised variants is assessed using the Drug Gene Interaction Database (DGIdb).

CytoML - This package is designed to use GatingML2.0 as the standard format to exchange gated data with other software platforms.


DeepBlueR - Accessing the DeepBlue Epigenetics Data Server through R.

DEsubs - DEsubs is a network-based systems biology package that extracts disease-perturbed subpathways within a pathway network as recorded by RNA-seq experiments. It contains an extensive and customizable framework covering a broad range of operation modes at all stages of the subpathway analysis, enabling a case-specific approach. The operation modes refer to the pathway network construction and processing, the subpathway extraction, visualization and enrichment analysis with regard to various biological and pharmacological features. Its capabilities render it a tool-guide for both the modeler and experimentalist for the identification of more robust systems-level biomarkers for complex diseases.

Director - Director is an R package designed to streamline the visualization of molecular effects in regulatory cascades. It utilizes the R package htmltools and a modified Sankey plugin of the JavaScript library D3 to provide a fast and easy, browser-enabled solution to discovering potentially interesting downstream effects of regulatory and/or co-expressed molecules. The diagrams are robust, interactive, and packaged as highly-portable HTML files that eliminate the need for third-party software to view. This enables a straightforward approach for scientists to interpret the data produced, and bioinformatics developers an alternative means to present relevant data.

dSimer - dSimer is an R package which provides computation of nine methods for measuring disease-disease similarity, including a standard cosine similarity measure and eight function-based methods. The disease similarity matrix obtained from these nine methods can be visualized through heatmap and network. Biological data widely used in disease-disease associations study are also provided by dSimer.
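
As a rough illustration of the standard cosine similarity measure mentioned in the dSimer entry, here is a minimal Python sketch. It is not dSimer's R interface: the helper name `cosine_similarity` and the toy disease-gene profiles are assumptions made up for this example.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption for this sketch: an all-zero profile is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical binary profiles: 1 means the disease is associated with that gene.
disease_a = [1, 1, 0, 1]
disease_b = [1, 0, 0, 1]
similarity = cosine_similarity(disease_a, disease_b)
```

Computing this value for every pair of diseases would give a symmetric similarity matrix of the kind the entry describes visualizing as a heatmap or network.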

eegc - This package has been developed to evaluate cellular engineering processes for direct differentiation of stem cells or conversion (transdifferentiation) of somatic cells to primary cells based on high throughput gene expression data screened either by DNA microarray or RNA sequencing. The package takes gene expression profiles as inputs from three types of samples: (i) somatic or stem cells to be (trans)differentiated (input of the engineering process), (ii) induced cells to be evaluated (output of the engineering process) and (iii) target primary cells (reference for the output). The package performs differential gene expression analysis for each pair-wise sample comparison to identify and evaluate the transcriptional differences among the 3 types of samples (input, output, reference). The ideal goal is to have induced and primary reference cell showing overlapping profiles, both very different from the original cells.

esetVis - Utility functions for visualization of expressionSet (or SummarizedExperiment) Bioconductor object, including spectral map, tsne and linear discriminant analysis. Static plot via the ggplot2 package or interactive via the ggvis or rbokeh packages are available.

ExperimentHub - This package provides a client for the Bioconductor ExperimentHub web resource. ExperimentHub provides a central location where curated data from experiments, publications or training courses can be accessed. Each resource has associated metadata, tags and date of modification. The client creates and manages a local cache of files retrieved enabling quick and reproducible access.

ExperimentHubData - Functions to add metadata to ExperimentHub db and resource files to AWS S3 buckets.

fCCAC - An application of functional canonical correlation analysis to assess covariance of nucleic acid sequencing datasets such as chromatin immunoprecipitation followed by deep sequencing (ChIP-seq).

fgsea - The package implements an algorithm for fast gene set enrichment analysis. Using the fast algorithm makes it possible to run more permutations and get more fine-grained p-values, which allows the use of accurate standard approaches to multiple hypothesis correction.

FitHiC - Fit-Hi-C is a tool for assigning statistical confidence estimates to intra-chromosomal contact maps produced by genome-wide genome architecture assays such as Hi-C.

flowPloidy - Determine sample ploidy via flow cytometry histogram analysis. Reads Flow Cytometry Standard (FCS) files via the flowCore Bioconductor package, and provides functions for determining the DNA ploidy of samples based on internal standards.

FunChIP - Preprocessing and smoothing of ChIP-Seq peaks and efficient implementation of the k-mean alignment algorithm to classify them.

GAprediction - GAprediction predicts gestational age using Illumina HumanMethylation450 CpG data.

gCrisprTools - Set of tools for evaluating pooled high-throughput screening experiments, typically employing CRISPR/Cas9 or shRNA expression cassettes. Contains methods for interrogating library and cassette behavior within an experiment, identifying differentially abundant cassettes, aggregating signals to identify candidate targets for empirical validation, hypothesis testing, and comprehensive reporting.


GEM - Tools for genome-wide EWAS, methQTL and GxE analysis.

geneAttribution - Identification of the most likely gene or genes through which variation at a given genomic locus in the human genome acts. The most basic functionality assumes that the closer a gene is to the input locus, the more likely the gene is to be causative. Additionally, any empirical data that links genomic regions to genes (e.g. eQTL or genome conformation data) can be used if it is supplied in the UCSC .BED file format.

GeneGeneInteR - The aim of this package is to propose several methods for testing gene-gene interaction in case-control association studies. Such a test can be done by aggregating SNP-SNP interaction tests performed at the SNP level (SSI) or by using gene-gene multidimensional methods (GGI). The package also proposes tools for a graphic display of the results.

geneplast - Geneplast is designed for evolutionary and plasticity analysis based on orthologous groups distribution in a given species tree. It uses Shannon information theory and orthologs abundance to estimate the Evolutionary Plasticity Index. Additionally, it implements the Bridge algorithm to determine the evolutionary root of a given gene based on its orthologs distribution.
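
To make the "Shannon information theory" ingredient of the geneplast entry concrete, the following Python sketch computes the Shannon entropy of an ortholog-count distribution across species. It only illustrates the entropy term; it is not geneplast's Evolutionary Plasticity Index, whose exact formula is not given here. The function name `shannon_entropy` and the toy counts are assumptions for illustration.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy (in bits) of a distribution given as non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for c in counts:
        if c > 0:  # zero counts contribute nothing to the sum
            p = c / total
            entropy -= p * math.log2(p)
    return entropy

# Hypothetical ortholog counts for one orthologous group in four species.
evenly_spread = shannon_entropy([1, 1, 1, 1])   # orthologs spread evenly
concentrated = shannon_entropy([4, 0, 0, 0])    # all orthologs in one species
```

An evenly spread group reaches the maximum entropy for four species (2 bits), while a group concentrated in a single species has zero entropy.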

geneXtendeR - geneXtendeR is designed to optimally annotate a histone modification ChIP-seq peak input file with functionally important genomic features (e.g., genes associated with peaks) based on optimization calculations. geneXtendeR optimally extends the boundaries of every gene in a genome by some genomic distance (in DNA base pairs) for the purpose of flexibly incorporating cis-regulatory elements (CREs), such as enhancers and promoters, as well as downstream elements that are important to the function of the gene relative to an epigenetic histone modification ChIP-seq dataset. geneXtendeR computes optimal gene extensions tailored to the broadness of the specific epigenetic mark (e.g., H3K9me1, H3K27me3), as determined by a user-supplied ChIP-seq peak input file. As such, geneXtendeR maximizes the signal-to-noise ratio of locating genes closest to and directly under peaks. By performing a computational expansion of this nature, ChIP-seq reads that would initially not map strictly to a specific gene can now be optimally mapped to the regulatory regions of the gene, thereby implicating the gene as a potential candidate, and thereby making the ChIP-seq experiment more successful. Such an approach becomes particularly important when working with epigenetic histone modifications that have inherently broad peaks.

GOpro - Find the most characteristic gene ontology terms for groups of human genes. This package was created as a part of the thesis which was developed under the auspices of MI^2 Group (http://mi2.mini.pw.edu.pl/, https://github.com/geneticsMiNIng).

GRmetrics - Functions for calculating and visualizing growth-rate inhibition (GR) metrics.
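
For readers unfamiliar with GR metrics, this short Python sketch evaluates the commonly published definition of the normalized growth-rate inhibition value, GR(c) = 2^(log2(x(c)/x0) / log2(x_ctrl/x0)) - 1, from three cell counts. It is not the GRmetrics R interface; the function name `gr_value` and the example counts are assumptions for illustration.

```python
import math

def gr_value(treated_count, control_count, initial_count):
    """Normalized growth-rate inhibition value for one drug concentration.

    treated_count: cell count at the end of the assay with drug, x(c)
    control_count: cell count at the end of the assay without drug, x_ctrl
    initial_count: cell count at the time of treatment, x0
    """
    if min(treated_count, control_count, initial_count) <= 0:
        raise ValueError("cell counts must be positive")
    if control_count == initial_count:
        raise ValueError("GR is undefined when the control did not grow")
    growth_ratio = (math.log2(treated_count / initial_count)
                    / math.log2(control_count / initial_count))
    return 2 ** growth_ratio - 1

# Hypothetical counts: the untreated control quadruples from 100 to 400 cells.
no_effect = gr_value(400, 400, 100)   # treated cells grow like the control
cytostatic = gr_value(100, 400, 100)  # treated cells do not grow at all
cytotoxic = gr_value(50, 400, 100)    # treated cells die (negative GR)
```

Under this definition a value of 1 means no effect, 0 means complete growth arrest, and negative values indicate cell loss.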

HelloRanges - Translates bedtools command-line invocations to R code calling functions from the Bioconductor *Ranges infrastructure. This is intended to educate novice Bioconductor users and to compare the syntax and semantics of the two frameworks.

ImpulseDE - ImpulseDE is suited to capture single impulse-like patterns in high throughput time series datasets. By fitting a representative impulse model to each gene, it reports differentially expressed genes whether across time points in a single experiment or between two time courses from two experiments. To optimize the running time, the code makes use of clustering steps and multi-threading.

IPO - The outcome of XCMS data processing strongly depends on the parameter settings. IPO (`Isotopologue Parameter Optimization`) is a parameter optimization tool that is applicable for different kinds of samples and liquid chromatography coupled to high resolution mass spectrometry devices, fast and free of labeling steps. IPO uses natural, stable 13C isotopes to calculate a peak picking score. Retention time correction is optimized by minimizing the relative retention time differences within features and grouping parameters are optimized by maximizing the number of features showing exactly one peak from each injection of a pooled sample. The different parameter settings are achieved by design of experiment. The resulting scores are evaluated using response surface models.

KEGGlincs - See what is going on 'under the hood' of KEGG pathways by explicitly re-creating the pathway maps from information obtained from KGML files.

LINC - This package provides methods to compute co-expression networks of lincRNAs and protein-coding genes. Biological terms associated with the sets of protein-coding genes predict the biological contexts of lincRNAs according to the 'Guilty by Association' approach.

LOBSTAHS - LOBSTAHS is a multifunction package for screening, annotation, and putative identification of mass spectral features in large, HPLC-MS lipid datasets. In silico data for a wide range of lipids, oxidized lipids, and oxylipins can be generated from user-supplied structural criteria with a database generation function. LOBSTAHS then applies these databases to assign putative compound identities to features in any high-mass accuracy dataset that has been processed using xcms and CAMERA. Users can then apply a series of orthogonal screening criteria based on adduct ion formation patterns, chromatographic retention time, and other properties, to evaluate and assign confidence scores to this list of preliminary assignments. During the screening routine, LOBSTAHS rejects assignments that do not meet the specified criteria, identifies potential isomers and isobars, and assigns a variety of annotation codes to assist the user in evaluating the accuracy of each assignment.

M3Drop - This package fits a Michaelis-Menten model to the pattern of dropouts in single-cell RNASeq data. This model is used as a null to identify significantly variable (i.e. differentially expressed) genes for use in downstream analysis, such as clustering cells.
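
The idea of a Michaelis-Menten null for dropouts can be sketched in a few lines of Python. The sketch assumes the usual Michaelis-Menten form, expected dropout rate = 1 - S / (K + S), where S is a gene's mean expression and K is a fitted constant; it is not M3Drop's R interface, and the function name `mm_dropout_rate`, the value of K and the example gene are hypothetical.

```python
def mm_dropout_rate(mean_expression, k):
    """Expected fraction of cells with zero counts under a Michaelis-Menten curve.

    mean_expression: average expression of the gene across cells (S >= 0)
    k: Michaelis constant, the expression level at which half the cells drop out
    """
    if mean_expression < 0 or k <= 0:
        raise ValueError("mean_expression must be >= 0 and k must be > 0")
    return 1.0 - mean_expression / (k + mean_expression)

# Hypothetical constant, as if fitted to a whole dataset.
K = 10.0

# A gene with mean expression 10 is expected to drop out in half of the cells.
expected = mm_dropout_rate(10.0, K)

# A gene observed to drop out far more often than expected at its expression
# level lies above the curve and is a candidate for being variable across cells.
observed = 0.85
excess_dropout = observed - expected
```

In a real analysis the constant would be estimated from the data and the excess would be assessed with a statistical test rather than read off directly.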

MADSEQ - The MADSEQ package provides a group of hierarchical Bayesian models for the detection of mosaic aneuploidy, the inference of the type of aneuploidy and also for the quantification of the fraction of aneuploid cells in the sample.

maftools - Analyze and visualize Mutation Annotation Format (MAF) files from large scale sequencing studies. This package provides various functions to perform most commonly used analyses in cancer genomics and to create feature rich customizable visualizations with minimal effort.


MAST - Methods and models for handling zero-inflated single cell assay data.

matter - Memory-efficient reading, writing, and manipulation of structured binary data on disk as vectors, matrices, and arrays. This package is designed to be used as a back-end for Cardinal for working with high-resolution mass spectrometry imaging data.

meshes - MeSH (Medical Subject Headings) is the NLM controlled vocabulary used to manually index articles for MEDLINE/PubMed. MeSH terms were associated with Entrez Gene IDs by three methods, gendoo, gene2pubmed and RBBH. This association is fundamental for enrichment and semantic analyses. meshes supports enrichment analysis (over-representation and gene set enrichment analysis) of gene list or whole expression profile. The semantic comparisons of MeSH terms provide quantitative ways to compute similarities between genes and gene groups. meshes implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively and supports more than 70 species.

MetaboSignal - MetaboSignal is an R package that allows merging, analyzing and customizing metabolic and signaling KEGG pathways. It is a network-based approach designed to explore the topological relationship between genes (signaling- or enzymatic-genes) and metabolites, representing a powerful tool to investigate the genetic landscape and regulatory networks of metabolic phenotypes.

MetCirc - MetCirc comprises a workflow to interactively explore metabolomics data: create MSP, bin m/z values, calculate similarity between precursors and visualise similarities.

methylKit - methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to deal with sequencing data from RRBS and its variants, but also target-capture methods and whole genome bisulfite sequencing. It also has functions to analyze base-pair resolution 5hmC data from experimental protocols such as oxBS-Seq and TAB-Seq. Perl is needed to read SAM files only.


MGFR - The package is designed to detect marker genes from RNA-seq data.

MODA - MODA can be used to estimate and construct condition-specific gene co-expression networks, and identify differentially expressed subnetworks as conserved or condition specific modules which are potentially associated with relevant biological processes.

MoonlightR - Motivation: The understanding of cancer mechanism requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). Results: We present an R/Bioconductor package called MoonlightR which returns a list of candidate driver genes for specific cancer types on the basis of TCGA expression data. The method first infers gene regulatory networks and then carries out a functional enrichment analysis (FEA) (implementing an upstream regulator analysis, URA) to score the importance of well-known biological processes with respect to the studied cancer type. Eventually, by means of random forests, MoonlightR predicts two specific roles for the candidate driver genes: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As a consequence, this methodology does not only identify genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, MoonlightR can be used to discover OCGs and TSGs in the same cancer type. This may help in answering the question whether some genes change role between early stages (I, II) and late stages (III, IV) in breast cancer. In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments.

msPurity - Assess the contribution of the targeted precursor in fragmentation acquired or anticipated isolation windows using a metric called "precursor purity". Also provides simple processing steps (averaging, filtering, blank subtraction, etc) for DI-MS data. Works for both LC-MS(/MS) and DI-MS(/MS) data.

MultiAssayExperiment - Develop an integrative environment where multiple assays are managed and preprocessed for genomic data analysis.

MutationalPatterns - An extensive toolset for the characterization and visualization of a wide range of mutational patterns in base substitution data.

netprioR - A model for semi-supervised prioritisation of genes integrating network data, phenotypes and additional prior knowledge about TP and TN gene labels from the literature or experts.

normr - Robust normalization and difference calling procedures for ChIP-seq and similar data. Read counts are modeled jointly as a binomial mixture model with a user-specified number of components. A fitted background estimate accounts for the effect of enrichment in certain regions and, therefore, represents an appropriate null hypothesis. This robust background is used to identify significantly enriched or depleted regions.

PathoStat - The purpose of this package is to perform Statistical Microbiome Analysis on metagenomics results from sequencing data samples. In particular, it supports analyses on the PathoScope generated report files. PathoStat provides various functionalities including Relative Abundance charts, Diversity estimates and plots, tests of Differential Abundance, Time Series visualization, and Core OTU analysis.

PharmacoGx - Contains a set of functions to perform large-scale analysis of pharmacogenomic data.

philr - PhILR is short for Phylogenetic Isometric Log-Ratio Transform. This package provides functions for the analysis of compositional data (e.g., data representing proportions of different variables/parts). Specifically this package allows analysis of compositional data where the parts can be related through a phylogenetic tree (as is common in microbiota survey data) and makes available the Isometric Log Ratio transform built from the phylogenetic tree and utilizing a weighted reference measure.

Pi - Priority index or Pi is developed as a genomic-led target prioritisation system, with the focus on leveraging human genetic data to prioritise potential drug targets at the gene, pathway and network level. The long term goal is to use such information to enhance early-stage target validation. Based on evidence of disease association from genome-wide association studies (GWAS), this prioritisation system is able to generate evidence to support identification of the specific modulated genes (seed genes) that are responsible for the genetic association signal by utilising knowledge of linkage disequilibrium (co-inherited genetic variants), distance of associated variants from the gene, and evidence of independent genetic association with gene expression in disease-relevant tissues, cell types and states. Seed genes are scored in an integrative way, quantifying the genetic influence. Scored seed genes are subsequently used as baits to rank seed genes plus additional (non-seed) genes; this is achieved by iteratively exploring the global connectivity of a gene interaction network. Genes with the highest priority are further used to identify/prioritise pathways that are significantly enriched with highly prioritised genes. Prioritised genes are also used to identify a gene network interconnecting highly prioritised genes and a minimal number of less prioritised genes (which act as linkers bringing together highly prioritised genes).

Pigengene - Pigengene package provides an efficient way to infer biological signatures from gene expression profiles. The signatures are independent from the underlying platform, e.g., the input can be microarray or RNA Seq data. It can even infer the signatures using data from one platform, and evaluate them on the other. Pigengene identifies the modules (clusters) of highly coexpressed genes using coexpression network analysis, summarizes the biological information of each module in an eigengene, learns a Bayesian network that models the probabilistic dependencies between modules, and builds a decision tree based on the expression of eigengenes.

proFIA - Flow Injection Analysis coupled to High-Resolution Mass Spectrometry is a promising approach for high-throughput metabolomics. FIA-HRMS data, however, cannot be pre-processed with current software tools which rely on liquid chromatography separation, or handle low resolution data only. Here we present the proFIA package, which implements a new methodology to pre-process FIA-HRMS raw data (netCDF, mzData, mzXML, and mzML) including noise modelling and injection peak reconstruction, and generate the peak table. The workflow includes noise modelling, band detection and filtering then signal matching and missing value imputation. The peak table can then be exported as a .tsv file for further analysis. Visualisations to assess the quality of the data and of the signal made are easily produced.

psichomics - Automatically retrieve data from RNA-Seq sources such as The Cancer Genome Atlas or load your own files and process the data. This tool allows you to analyse and visualise alternative splicing.

qsea - qsea (quantitative sequencing enrichment analysis) was developed as the successor of the MEDIPS package for analyzing data derived from methylated DNA immunoprecipitation (MeDIP) experiments followed by sequencing (MeDIP-seq). However, qsea provides several functionalities for the analysis of other kinds of quantitative sequencing data (e.g. ChIP-seq, MBD-seq, CMS-seq and others) including calculation of differential enrichment between groups of samples.

RCAS - RCAS is an automated system that provides dynamic genome annotations for custom input files that contain transcriptomic regions. Such transcriptomic regions could be, for instance, peak regions detected by CLIP-Seq analysis that detect protein-RNA interactions, RNA modifications (alias the epitranscriptome), CAGE-tag locations, or any other collection of target regions at the level of the transcriptome. RCAS is designed as a reporting tool for the functional analysis of RNA-binding sites detected by high-throughput experiments. It takes as input a BED format file containing the genomic coordinates of the RNA binding sites and a GTF file that contains the genomic annotation features usually provided by publicly available databases such as Ensembl and UCSC. RCAS performs overlap operations between the genomic coordinates of the RNA binding sites and the genomic annotation features and produces in-depth annotation summaries such as the distribution of binding sites with respect to gene features (exons, introns, 5'/3' UTR regions, exon-intron boundaries, promoter regions, and whole transcripts). Moreover, by detecting the collection of targeted transcripts, RCAS can produce functional annotation tables for enriched gene sets (annotated by the Molecular Signatures Database) and GO terms. To address one of the most important questions that arise during protein-RNA interaction analysis, RCAS has a module for detecting sequence motifs enriched in the targeted regions of the transcriptome. A full interactive report in HTML format can be generated that contains interactive figures and tables that are ready for publication purposes.

rDGIdb - The rDGIdb package provides a wrapper for the Drug Gene Interaction Database (DGIdb). For simplicity, the wrapper query function and output resembles the user interface and results format provided on the DGIdb website (http://dgidb.genome.wustl.edu/).

readat - This package contains functionality to import, transform and annotate data from ADAT files generated by the SomaLogic SOMAscan platform.

recount - Explore and download data from the recount project available at https://jhubiostatistics.shinyapps.io/recount/. Using the recount package you can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level, the raw counts, the phenotype metadata used, the urls to the sample coverage bigWig files or the mean coverage bigWig file for a particular study. The RangedSummarizedExperiment objects can be used by different packages for performing differential expression analysis. Using http://bioconductor.org/packages/derfinder you can perform annotation-agnostic differential expression analyses with the data from the recount project as described at http://biorxiv.org/content/early/2016/08/08/068478.

regsplice - Statistical methods for detection of differential exon usage in RNA-seq and exon microarray data sets, using L1 regularization (lasso) to improve power.

sights - SIGHTS is a suite of normalization methods, statistical tests, and diagnostic graphical tools for high throughput screening (HTS) assays. HTS assays use microtitre plates to screen large libraries of compounds for their biological, chemical, or biochemical activity.

signeR - The signeR package provides an empirical Bayesian approach to mutational signature discovery. It is designed to analyze single nucleotide variation (SNV) counts in cancer genomes, but can also be applied to other features. Functionalities to characterize signatures or genome samples according to exposure patterns are also provided.

SIMLR - Single-cell RNA-seq technologies enable high throughput gene expression measurement of individual cells, and allow the discovery of heterogeneity within cell populations. Measurement of cell-to-cell gene expression similarity is critical to identification, visualization and analysis of cell populations. However, single-cell data introduce challenges to conventional measures of gene expression similarity because of the high level of noise, outliers and dropouts. We develop a novel similarity-learning framework, SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), which learns an appropriate distance metric from the data for dimension reduction, clustering and visualization. SIMLR is capable of separating known subpopulations more accurately in single-cell data sets than do existing dimension reduction methods. Additionally, SIMLR demonstrates high sensitivity and accuracy on high-throughput peripheral blood mononuclear cells (PBMC) data sets generated by the GemCode single-cell technology from 10x Genomics.

SNPediaR - SNPediaR provides some tools for downloading and parsing data from the SNPedia web site <http://www.snpedia.com>. The implemented functions allow users to import the wiki text available in SNPedia pages and to extract the most relevant information out of them. If some information in the downloaded pages is not automatically processed by the library functions, users can easily implement their own parsers to access it in an efficient way.

SPLINTER - SPLINTER provides tools to analyze alternative splicing sites, interpret outcomes based on sequence information, select and design primers for site validation and give a visual representation of the event to guide downstream experiments.

SRGnet - We developed SRMnet to analyze synergistic regulatory mechanisms in transcriptome profiles that act to enhance the overall cell response to combination of mutations, drugs or environmental exposure. This package can be used to identify regulatory modules downstream of synergistic response genes, prioritize synergistic regulatory genes that may be potential intervention targets, and contextualize gene perturbation experiments.

StarBioTrek - StarBioTrek presents some methodologies to measure pathway activity and cross-talk among pathways, also integrating the information of network data.

statTarget - An easy-to-use tool that provides a graphical user interface for quality control based shift signal correction, integration of metabolomic data from multi-batch experiments, and comprehensive statistical analysis in non-targeted or targeted metabolomics.

SVAPLSseq - The package contains functions that are intended for the identification of differentially expressed genes between two groups of samples from RNAseq data after adjusting for various hidden biological and technical factors of variability.

switchde - Inference and detection of switch-like differential expression across single-cell RNA-seq trajectories.

synergyfinder - Efficient implementations for all the popular synergy scoring models for drug combinations, including HSA, Loewe, Bliss and ZIP, and visualization of the synergy scores as either a two-dimensional or a three-dimensional interaction surface over the dose matrix.
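
As a rough illustration of two of the scoring models named above, the Python sketch below computes the excess of an observed combination effect over the HSA and Bliss reference expectations. It is not the synergyfinder API (synergyfinder is an R package); the function names are hypothetical, and effects are assumed to be fractional inhibitions in the range 0 to 1. Loewe and ZIP depend on fitted dose-response curves and are left out.

```python
def bliss_expected(effect_a, effect_b):
    """Bliss independence: expected combined effect if the two drugs act independently."""
    return effect_a + effect_b - effect_a * effect_b


def hsa_expected(effect_a, effect_b):
    """Highest single agent: expected combined effect equals the better single drug."""
    return max(effect_a, effect_b)


def synergy_score(observed, effect_a, effect_b, model="bliss"):
    """Return observed minus expected effect; positive values suggest synergy."""
    if model == "bliss":
        expected = bliss_expected(effect_a, effect_b)
    elif model == "hsa":
        expected = hsa_expected(effect_a, effect_b)
    else:
        raise ValueError(f"unsupported model: {model!r}")
    return observed - expected
```

Applying `synergy_score` to every cell of a dose matrix would give the kind of score surface that the package description refers to.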

TVTB - The package provides S4 classes and methods to filter, summarise and visualise genetic variation data stored in VCF files. In particular, the package extends the FilterRules class (S4Vectors package) to define new classes of filter rules applicable to the various slots of VCF objects. Functionalities are integrated and demonstrated in a Shiny web-application, the Shiny Variant Explorer (tSVE).

uSORT - This package is designed to uncover the intrinsic cell progression path from single-cell RNA-seq data. It incorporates data pre-processing, preliminary PCA gene selection, preliminary cell ordering, feature selection, refined cell ordering, and post-analysis interpretation and visualization.

yamss - Tools to analyze and visualize high-throughput metabolomics data acquired using chromatography-mass spectrometry. These tools preprocess data in a way that enables reliable and powerful differential analysis.

YAPSA - This package provides functions and routines useful in the analysis of somatic signatures (cf. L. Alexandrov et al., Nature 2013). In particular, functions to perform a signature analysis with known signatures (LCD = linear combination decomposition) and a signature analysis on stratified mutational catalogue (SMC = stratify mutational catalogue) are provided.

yarn - Expedite large RNA-Seq analyses using a combination of previously developed tools. YARN is meant to make it easier for the user to perform basic mis-annotation quality control, filtering, and condition-aware normalization. YARN leverages many Bioconductor tools and statistical techniques to account for the large heterogeneity and sparsity found in very large RNA-seq experiments.


NEWS from new and existing packages
===================================

There is too much NEWS to include here; see the full release announcement at


  https://bioconductor.org/news/bioc_3_4_release/

Deprecated and Defunct Packages
===============================

1 software package (betr) was marked as deprecated, to be removed in the next release.


17 previously deprecated software packages were removed from this release.

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
