Samtools (and HTSlib and BCFtools) version 1.9 is now available from
GitHub and SourceForge:
https://sourceforge.net/projects/samtools/
https://github.com/samtools/htslib/releases/tag/1.9
https://github.com/samtools/samtools/releases/tag/1.9
https://github.com/samtools/bcftools/releases/tag/1.9
The main changes are listed below:
------------------------------------------------------------------------------
htslib - changes v1.9
------------------------------------------------------------------------------
* If `./configure` fails, `make` will stop working until either configure is
re-run successfully, or `make distclean` is used. This makes configuration
failures more obvious. (#711, thanks to John Marshall)
* The default SAM version has been changed to 1.6. This is in line with the
latest version specification and indicates that HTSlib supports the CG tag
used to store long CIGAR data in BAM format.
* New bgzip integrity check option '--test'. (#682, thanks to @sd4B75bJ,
@jrayner)
* Faidx can now index fastq files as well as fasta. The fastq index adds an
extra column to the `.fai` index which gives the offset to the quality
values. New interfaces have been added to `htslib/faidx.h` to read the
fastq index and retrieve the quality values. It is possible to open a
fastq index as if fasta (only sequences will be returned), but not the
other way round. (#701)
* New API interfaces to add or update integer, float and array aux tags.
(#694)
* Add `level=<number>` option to `hts_set_opt()` to allow the compression
level to be set. Setting `level=0` enables uncompressed output. (#715)
* Improved bgzip error reporting.
* Better error reporting when CRAM reference files can't be opened. (#706)
* Fixes to make tests work properly on Windows/MinGW - mainly to handle line
ending differences. (#716)
* Efficiency improvements:
- Small speed-up for CRAM indexing.
- Reduce the number of unnecessary wake-ups in the thread pool. (#703)
- Avoid some memory copies when writing data, notably for uncompressed BGZF
output. (#703)
* Bug fixes:
- Fix multi-region iterator bugs on CRAM files. (#684)
- Fixed multi-region iterator bug that caused some reads to be skipped
incorrectly when reading BAM files. (#687)
- Fixed synced_bcf_reader() bug when reading contigs multiple times. (#691,
reported by @freeseek)
- Fixed bug where bcf_hdr_set_samples() did not update the sample
dictionary when removing samples. (#692, reported by @freeseek)
- Fixed bug where the VCF record ref length was calculated incorrectly if
an INFO END tag was present. (71b00a)
- Fixed warnings found when compiling with gcc 8.1.0. (#700)
- sam_hdr_read() and sam_hdr_write() will now return an error code if
passed a NULL file pointer, instead of crashing.
- Fixed possible negative array look-up in sam_parse1() that somehow
escaped previous fuzz testing. (#731, reported by @fCorleone)
- Fixed bug where cram range queries could incorrectly report an error when
using multiple threads. (#734, reported by Brent Pedersen)
- Fixed very rare rANS normalisation bug that could cause an assertion
failure when writing CRAM files. (#739, reported by @carsonhh)
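
The extended `.fai` layout described in the faidx item above can be sketched in
a few lines of Python. This is an illustration only and not part of HTSlib.
The column order (NAME, LENGTH, OFFSET, LINEBASES, LINEWIDTH, QUALOFFSET)
follows the samtools faidx documentation; `FaiEntry` and `parse_fai_line` are
hypothetical names chosen for this sketch.

```python
from typing import NamedTuple, Optional


class FaiEntry(NamedTuple):
    """One row of a .fai index (hypothetical helper type for this sketch)."""
    name: str
    length: int                 # number of bases in the sequence
    offset: int                 # byte offset of the first base
    line_bases: int             # bases per line
    line_width: int             # bytes per line, including the newline
    qual_offset: Optional[int]  # byte offset of the first quality value (fastq only)


def parse_fai_line(line: str) -> FaiEntry:
    """Parse a tab-separated .fai row with 5 (fasta) or 6 (fastq) columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) not in (5, 6):
        raise ValueError(f"expected 5 or 6 columns, got {len(fields)}")
    numbers = [int(value) for value in fields[1:]]
    # A fasta index has no sixth column, so no quality offset is available.
    qual_offset = numbers[4] if len(numbers) == 5 else None
    return FaiEntry(fields[0], numbers[0], numbers[1], numbers[2],
                    numbers[3], qual_offset)
```

A reader that only needs sequences can ignore `qual_offset`, which is why a
fastq index can be treated as a fasta one. The reverse does not hold because a
five-column row carries no quality offset.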
------------------------------------------------------------------------------
samtools - changes v1.9
------------------------------------------------------------------------------
* Samtools mpileup VCF and BCF output is now deprecated. It is still
functional, but will warn. Please use bcftools mpileup instead. (#884)
* Samtools mpileup now handles the '-d' max_depth option differently. There
is no longer an enforced minimum, and '-d 0' is interpreted as limitless
(no maximum; note that this may be slow). The default per-file depth is
now 8000, which matches the value mpileup used to use when processing a
single sample. To get the previous default behaviour use the higher of
8000 divided by the number of samples across all input files, or 250.
(#859)
* Samtools stats new features:
- The '--remove-overlaps' option discounts overlapping portions of
templates when computing coverage and mapped base counting. (#855)
- When a target file is in use, the number of bases inside the target is
printed, along with the percentage of target bases with coverage above the
threshold specified by the '--cov-threshold' option. (#855)
- Split base composition and length statistics by first and last reads.
(#814, #816)
* Samtools faidx new features:
- Now takes long options. (#509, thanks to Pierre Lindenbaum)
- Now warns about zero-length and truncated sequences due to the requested
range being beyond the end of the sequence. (#834)
- Gets a new option (--continue) that allows it to carry on when a
requested sequence was not in the index. (#834)
- It is now possible to supply the list of regions to output in a text
file using the new '--region-file' option. (#840)
- New '-i' option to make faidx return the reverse complement of the
regions requested. (#878)
- faidx now works on FASTQ (returning FASTA), and a new fqidx command has
been added to index and return FASTQ. (#852)
* Samtools collate now has a fast option '-f' that only operates on primary
pairs, dropping secondary and supplementary. It tries to write pairs to
the final output file as soon as both reads have been found. (#818)
* Samtools bedcov gets a new '-j' option to make it ignore deletions (D) and
reference skips (N) when computing coverage. (#843)
* Small speed up to samtools coordinate sort, by converting it to use radix
sort. (#835, thanks to Zhuravleva Aleksandra)
* Samtools idxstats now works on SAM and CRAM files; however, this isn't fast
due to some information lacking from indices. (#832)
* Compression levels may now be specified with the level=N
output-fmt-option. E.g. with -O bam,level=3.
* Various documentation improvements.
* Bug-fixes:
- Improved error reporting in several places. (#827, #834, #877, cd7197)
- Various test improvements.
- Fixed failures in the multi-region iterator (view -M) when regions
provided via BED files include overlaps (#819, reported by Dave Larson).
- Samtools stats now counts '=' and 'X' CIGAR operators when counting
mapped bases. (#855)
- Samtools stats has fixes for insert size filtering (-m, -i). (#845; #697
reported by Soumitra Pal)
- Samtools stats -F no longer negates an earlier -d option. (#830)
- Fix samtools stats crash when using a target region. (#875, reported by
John Marshall)
- Samtools sort now keeps to a single thread when the -@ option is absent.
Previously it would spawn a writer thread, which could cause the CPU
usage to go slightly over 100%. (#833, reported by Matthias Bernt)
- Fixed samtools phase '-A' option which was incorrectly defined to take a
parameter. (#850; #846 reported by Dianne Velasco)
- Fixed compilation problems when using C_INCLUDE_PATH. (#870; #817
reported by Robert Boissy)
- Fixed --version when built from a Git repository. (#844, thanks to John
Marshall)
- Use noenhanced mode for title in plot-bamstats. Prevents unwanted
interpretation of characters like underscore in gnuplot version 5.
(#829, thanks to M. Zapukhlyak)
- blast2sam.pl now reports perfect match hits (no indels or mismatches).
(#873, thanks to Nils Homer)
- Fixed bug in fasta and fastq subcommands where stdout would not be
flushed correctly if the -0 option was used.
- Fixed invalid memory access in mpileup and depth on alignment records
where the sequence is absent.
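
The rule given in the mpileup item above for recovering the previous default
depth can be written out as a small calculation. This is a sketch, not code
from samtools: `previous_default_max_depth` is a hypothetical helper name, and
the use of integer division is an assumption, since the announcement only says
"8000 divided by the number of samples".

```python
def previous_default_max_depth(n_samples: int) -> int:
    """Return the '-d' value that mimics the pre-1.9 mpileup default.

    n_samples is the number of samples across all input files.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    # Higher of (8000 / number of samples) and 250.
    # Integer division is an assumption made for this sketch.
    return max(8000 // n_samples, 250)
```

For example, four samples give 2000, which would be passed as '-d 2000', while
any run with more than 32 samples falls back to the floor of 250.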
------------------------------------------------------------------------------
bcftools - changes v1.9
------------------------------------------------------------------------------
* `annotate`
- REF and ALT columns can now be transferred from the annotation file.
- fixed bug when setting vector_end values.
* `consensus`
- new -M option to control output at missing genotypes
- variants immediately following insertions should not be skipped. Note
however, that the current fix requires normalized VCF and may still
falsely skip variants adjacent to multiallelic indels.
- bug fixed in -H selection handling
* `convert`
- the --tsv2vcf option now makes the missing genotypes diploid, "./."
instead of "."
- the behavior of -i/-e with --gvcf2vcf changed. Previously only
sites with FILTER set to "PASS" or "." were expanded and the -i/-e
options dropped sites completely. The new behavior is to let the
-i/-e options control which records will be expanded. In order to
drop records completely, one can stream through "bcftools view"
first.
* `csq`
- since the real consequences of start/splice events are not known, the
amino acid positions at subsequent variants should stay unchanged
- add `--force` option to skip malformatted transcripts in GFFs with
out-of-phase CDS exons.
* `+dosage`: output all alleles and all their dosages at multiallelic sites
* `+fixref`: fix serious bug in -m top conversion
* `-i/-e` filtering expressions:
- add two-tailed binomial test
- add functions N_PASS() and F_PASS()
- add support for lists of samples in filtering expressions; with many
samples it was impractical to list them all on the command line.
Samples can now be given in a file, e.g., GT[@samples.txt]="het"
- allow multiple perl functions in the expressions and some bug fixes
- fix a parsing problem, '@' was not removed from '@filename' expressions
* `mpileup`: fixed bug where, if samples were renamed using the `-G`
(`--read-groups`) option, some samples could be omitted from the output
file.
* `norm`: update INFO/END when normalizing indels
* `+split`: new -S option to subset samples and to use custom file names
instead of the defaults
* `+smpl-stats`: new plugin
* `+trio-stats`: new plugin
* Fixed build problems with non-functional configure script produced on some
platforms
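
For readers unfamiliar with the two-tailed binomial test mentioned under the
`-i/-e` filtering expressions above, the idea can be sketched independently of
bcftools. The function below is a generic exact test that sums the
probabilities of all outcomes no more likely than the observed one. It is an
assumption that this matches the variant of the test bcftools implements;
`binom_two_tailed` is a hypothetical name and says nothing about the bcftools
expression syntax.

```python
from math import comb


def binom_two_tailed(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-tailed binomial p-value for k successes in n trials."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    # Probability of every possible outcome 0..n under success rate p.
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small relative tolerance so that ties with the observed outcome count.
    threshold = probs[k] * (1 + 1e-9)
    # Sum the outcomes that are at most as likely as the observed one.
    return min(1.0, sum(q for q in probs if q <= threshold))
```

With p = 0.5 this corresponds to asking whether, say, the reference and
alternate read counts at a heterozygous site are unexpectedly unbalanced in
either direction.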
Rob Davies r...@sanger.ac.uk
The Sanger Institute http://www.sanger.ac.uk/
Hinxton, Cambs., Tel. +44 (1223) 834244
CB10 1SA, U.K. Fax. +44 (1223) 494919