Hi Jarrett, read.FASTA() always returns a list. So you may do something (quite general) like:

fls <- dir(pattern = "\\.fas$|\\.fasta$", ignore.case = TRUE) # add more file extensions if needed
X <- do.call(c, lapply(fls, read.FASTA)) # pool the sequences from all files into a single list
seqlen <- lengths(X) # length of each sequence
if (length(unique(seqlen)) == 1) X <- as.matrix(X)

If the sequences are not all of the same length, you can use the vector 'seqlen' for further processing, for instance to drop the shortest ones (if this makes sense):

X[seqlen > 100]

I have also found the function fasta.index (in Biostrings on Bioconductor) to be very useful for this kind of task: it scans a bunch of FASTA files (possibly in different directories) and returns a data frame with one row per sequence (length, label, path, ...).

HTH

Best,

Emmanuel

----- On 12 Mar 20, at 22:18, Jarrett Phillips phillipsjarre...@gmail.com wrote:

> Hi All,
>
> I have a folder with multiple FASTA files which need to be read into R.
>
> To avoid file overwriting, I use ape::rbind.DNAbin() as follows:
>
> file.names <- list.files(path = envr$filepath, pattern = ".fas")
> tmp <- matrix()
> for (i in 1:length(file.names)) {
>   seqs <- read.dna(file = file.names[i], format = "fasta")
>   seqs <- rbind.DNAbin(tmp, seqs)
> }
>
> When run however, I get an error saying that the files do not have the same
> number of columns (i.e., alignments are all not of the same length).
>
> How can I avoid this error. I feel that it's a basic fix, but one that is
> not immediately obvious to me.
>
> Thanks!
>
> _______________________________________________
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
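
PS: To illustrate the fasta.index remark above, here is a minimal sketch of how the index could be used to inspect sequence lengths before reading anything. It assumes Biostrings is installed from Bioconductor; the temporary directory, the two toy files, and the 8-base threshold are made up for the example and are not from the original question.

```r
library(Biostrings)  # Bioconductor package, assumed to be installed

# Hypothetical toy data: two small FASTA files in a temporary directory
d <- tempfile("fasta_demo")
dir.create(d)
writeLines(c(">seq1", "ACGTACGT", ">seq2", "ACGT"), file.path(d, "a.fas"))
writeLines(c(">seq3", "ACGTACGTACGT"), file.path(d, "b.fasta"))

# Same file pattern as above, but with full paths so the files are found
fls <- dir(d, pattern = "\\.fas$|\\.fasta$", ignore.case = TRUE,
           full.names = TRUE)

# One row per sequence across all files
idx <- fasta.index(fls, seqtype = "DNA")
print(idx[, c("desc", "seqlength", "filepath")])

# Keep only the records with at least 8 bases (arbitrary threshold)
keep <- idx[idx$seqlength >= 8, ]
print(keep$desc)
```

The data frame can then be filtered or tabulated like any other (e.g. table(idx$seqlength)) to decide which sequences are worth reading and whether they can go into a single matrix.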