Hello,
I checked ape::del.colgapsonly, ips::deleteGaps and ips::deleteEmptyCells.
They delete columns containing missing values, but I also need to delete
columns containing the base "N" (all columns where the amount of Ns exceeds
a certain threshold).
Actually, ips::deleteEmptyCells has option nset=c("-", "n
Hello V.
Because you speak of columns, I assume you are handling an alignment,
right? If so, all sequences have the same length and
you can do as.matrix
Like this?
library(magrittr)
# maximum proportion of n's allowed
thresh <- 0.005  # 0.5%
seq <- as.matrix(seq)
temp <- seq %>% sapply(.
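
The idea of treating the alignment as a matrix and comparing the per-column proportion of n's with a threshold can be sketched in base R. This is only an illustration under stated assumptions, not the code from the message above: the toy character matrix, the names aln, n_frac and aln_clean, and the 0.5 cutoff are all made up here. With ape installed, as.character() on a DNAbin matrix should give a character matrix of this shape.

```r
# Toy alignment as a character matrix (rows = sequences, columns = sites).
# The data are made up for illustration only.
aln <- rbind(
  s1 = c("a", "n", "c", "g"),
  s2 = c("a", "n", "c", "n"),
  s3 = c("t", "n", "c", "g"),
  s4 = c("a", "c", "c", "g")
)

# Hypothetical cutoff: maximum tolerated proportion of n's per column
thresh <- 0.5

# Proportion of "n" in each column (logical matrix averaged column-wise)
n_frac <- colMeans(aln == "n")

# Keep only the columns at or below the cutoff; drop = FALSE keeps a matrix
aln_clean <- aln[, n_frac <= thresh, drop = FALSE]
```

In this toy case only the second column (three n's out of four sequences) exceeds the cutoff and is removed.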
Hm, I tried a dirty hack: I exported the DNAbin object using ape::write.dna
and replaced all occurrences of "n" in any sequence by "-" and imported the
file back to R with ape::read.dna. Then I tried the mentioned functions. They
did nothing. When I exported the file to disk, the FASTA file did
Hi Vojtěch,
Here's something you could do. First, make a copy of del.colgapsonly:
toto <- del.colgapsonly
Then, edit this copy (e.g., with fix(toto)), find this line:
foo <- function(x) sum(x == 4)
and replace 4 by 240. Save and close. Now you can use toto() in the same
way as del.colg
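
The same change can also be written out as a standalone function instead of editing a copy interactively. The sketch below assumes ape is installed; del_cols_n is a hypothetical name, not a function in ape, and its body only mirrors the logic described above, with the code for "-" (4) replaced by the code for "n" (240).

```r
library(ape)

# Hypothetical helper (not part of ape): drop alignment columns in which the
# proportion of "n" is at least `threshold`.
del_cols_n <- function(x, threshold = 1) {
  if (!inherits(x, "DNAbin")) x <- as.DNAbin(x)
  if (!is.matrix(x)) stop("DNA sequences not in a matrix")
  # In ape's bit-level coding, "-" is stored as 4 and "n" as 240
  n_count <- apply(x, 2, function(col) sum(col == 240))
  drop <- which(n_count / nrow(x) >= threshold)
  if (length(drop)) x <- x[, -drop]
  x
}

# Toy alignment, made up for illustration only
aln <- as.DNAbin(rbind(
  s1 = c("a", "n", "c"),
  s2 = c("a", "n", "n"),
  s3 = c("t", "n", "c"),
  s4 = c("a", "c", "c")
))
out <- del_cols_n(aln, threshold = 0.5)
```

Here the second column is removed because three of its four bases are n; the third column, with one n in four, stays.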
Thank you,
Andreas, yes, I am trying to manipulate an alignment. This is a nice trick,
although it returns an empty alignment regardless of the threshold value used
(I do have some data in the alignment :-)...
Have a nice weekend,
V.
On Friday, 27 October 2017, 17:02:45 CEST, you wrote:
> Hello V.
> Because you s
Hi Vojtech,
Below is a function modified from the deleteEmptyCells function of the *ips*
package. It works on the percentage of "?" in the alignment, but that can
easily be modified following Emmanuel's suggestions above.
Best,
Matt
DeleteFunkyColumns =
function (DNAbin, cutoff=0.3, quiet = FALSE)
{
isfunk