16S is basically useless for identification to genus. Since I started
sequencing 16S in 1992, I have come to realize that without sequencing
the full 1540 bases, it is generally misleading, and even then, it is
not accurate enough to nail the genus in more than half the cases. However,
Glad that helped. And yes, you can merge many text-based file types
with the tool 'Text Manipulation - Concatenate datasets'.
Sometimes you will need to convert to tabular format first, and then
back to the desired format (fasta, gtf, etc.) afterwards.
I am analyzing miRNA sequencing data now. My data are 51 bp, single-end, ~5 M
reads. I want to remove the adapter sequences from the reads before
mapping to the genomes/known miRNA database.
My 3' adapter sequence is: 5'-AGATCGGAAGAGCACACGTCT-3'. I found that many
reads only contain part of
Just enter the whole adapter sequence. The tool will match what is found
in the input sequence and clip it. The help graphic on the Clip form itself
illustrates this: only one adapter is entered (and only one can be), but a
variable length is clipped from the input to produce the output.
To hopefully be clearer: the part that matches is clipped, whether it is the
whole adapter or only part of it, and there is even some tolerance for
low-frequency mismatches.
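
To make the idea in the reply above concrete, here is a minimal Python sketch of 3' adapter clipping that accepts a full or partial adapter match and tolerates a few mismatches. It is only an illustration and is not the Clip tool's actual algorithm. The function names and the parameters min_overlap and max_mismatch_rate are hypothetical, chosen for this example; only the adapter sequence comes from the question.

```python
# Illustrative sketch only: NOT the Clip tool's real implementation.
# min_overlap and max_mismatch_rate are hypothetical parameters.

ADAPTER = "AGATCGGAAGAGCACACGTCT"  # 3' adapter quoted in the question


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def clip_adapter(read, adapter=ADAPTER, min_overlap=3, max_mismatch_rate=0.1):
    """Remove a full or partial 3' adapter from a read.

    The read is scanned left to right. At each position, the rest of the
    read (up to the adapter length) is compared with the start of the
    adapter. The first position that matches within the mismatch budget
    is treated as the adapter start, and everything from there is clipped.
    """
    for start in range(len(read) - min_overlap + 1):
        overlap = min(len(adapter), len(read) - start)
        segment = read[start:start + overlap]
        allowed = int(overlap * max_mismatch_rate)  # mismatch budget
        if hamming(segment, adapter[:overlap]) <= allowed:
            return read[:start]
    return read  # no adapter found, read is returned unchanged


# Hypothetical 22-nt insert used only for demonstration.
insert = "TGAGGTAGTAGGTTGTATAGTT"
full = clip_adapter(insert + ADAPTER)         # whole adapter present
partial = clip_adapter(insert + ADAPTER[:8])  # only the first 8 adapter bases
print(full == insert, partial == insert)
```

Both calls return the same insert, which is the behaviour described above: one adapter is supplied, but a variable length is removed depending on how much of it each read contains. Running a handful of reads through a sketch like this with different min_overlap and max_mismatch_rate values is one way to get a feel for length and mismatch constraints before running the real tool.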
I would suggest taking a few sequences out and running the tool on them to try
it out. You could test for both length and mismatch constraints this