Hello again,

I am trying to randomly select sequences from an uploaded FASTA file, but only about half of the selected sequences actually contain sequence data (see below); the others contain only the sequence name. This happens even though every sequence in the input file has sequence data: I first filtered it to keep only sequences longer than 100 bp.

Any suggestions?

This is what the output looks like:

>scaffold1034  2.1
>scaffold1085  1.7
>scaffold1499  1.2
CCTTTGGATGTCACACATGTGCCATCCCGTAGCATTCTTAAAAGAAGGCTAAGGAGCAATTTGCGATTCCCAGTCTAAGTCAATTTACTGTTCGATTTTANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAGACCTGATAATACTGTGTACACAATGAGAGCGACTTAATGCTTCATCATATGAAGAACTGTAGGCCATTTTTTCTAATCAAGTTTGTGGCGGATTCATCATAGCTGCTATTGGTGACAATTCTTT
CTAAGGTTGCTAGAAATAGTGATGTGGAACACAAGTGCTGCAGGTCATTGCATGTCTCAATCAGTCGTTTGCTTCTCAAAACACGGGCTGTAGGAAGCGCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTCGAGTTATTGCCATCCTAATTTATCATTTCGTGCGCGATATATATCGACTTTTTTTCGTCCTGTCTGTGCCTCTCCTGCGAAGGCGCCATTCTAATCCCTGCGCGTGACGGCAGATTGACATGACCTCAAG
CAACCTGAACACCCCTATCCCAAATATACTTGAGTCCCCCCTGCACATCCTCCGCATTACTCATGATAACCTGCCACATTGTTCATGGTAGCCCTTTAAANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGCAGACATTACCTAGGATGATGTTTCTAATCGTAGCAAAAATTCTTGTAAATGACGTTCCAGTTGTTTA
CAACCTAAAATTACACACATTAAAACTGCTGGCTAGAATTTACATTGAAACATTAAGATATATTACAAAATATGGACAAATAAATTCGTGACAAATATATNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCTGTTTGCGATACAAGGAAGGGCTTACGCAAAAATTTCCCAAGAAACCATGTGCGATGAGAAGCGAAACAGTAACTACAGGATTTCTTACCCATTAATTGCTCATTTCTCTAATCTGCATTTCCGTTGATCAATTT
>scaffold2897  2.0
>scaffold2930  3.0
TCCTAAACGTACATATTTCAAACAAAGATGTTTGAAGCCTTAGCAATATTTAGATGACTTTGCTTTCAAGGGTTCTTTTCGTCACCTTTTGTCATCTGCANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTACCAACGTTAAAAGAGTTGATTAGTAAGAGTGATATGCCCCTCGATCAAAAGCATCCCCCGGATATTTCAGCGACAGGGCAGCGGAACTT
CCAGAAGTGTAGAGTTGATTTAGTTTTGTTGCAAGGGCTCCCAGATTAATTAGTAAGATATCAAAGTAAATAATAACATAGTTTTTACTCAAAGCAGTTGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNATGATTCTCGAAGCTGTTATTTCATTTTTTCCTATGAATTTTTTTAAGTTTTGAAGTATACTTTCGATTTTTGTAAAGCGGGCTATAATCTCGAGGGAAATATTCTAAAGCTGGGTGAAAAATTATTCCATCTCTTGGAAAATATGATTGAAAGGTTCCGTTCGGCAAAGGGTTATCCCTCTTCGGAGTAGCTCTGTTATGAGATGGTTCAACGTTATGTCATTTTTCATTTTCACTGCAGGAGG
GAAGACGTTTCACTGATATATGCATTGCCCTCTGTCACATCGAATCACTGTATGATAATGCCCACGAGAAAAGATAACCCCATCCCAAACTTTTTATAGATGTAACAAATGGTTGAAGAAATTTGGCTATTCATGCTATTATGTTCATCTCTTTAANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNATGTTCATCTCTTTAAAGCTTTCTTTCAGGGAAAAATTCCCATCCTCCAAAATATTCTCACACCGTACCGGGAGCAACAGGAAGGCACATCGACGTTTTATTTGGCACAATGAAAAAATATCACCAGCAATTTCTATTATAACTTGGAGCCTTGTTCTCTGGATTTATTGAGGC
GTCGGAATTTTTCCTGGATCCACCCCTGGATACTTCATGCAGATCAACTTACCCATGCTTTAGCTGTGAGGAATTTATGAAGATTTTCATTGAAAGACATNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTCAAGAAGTCAGGATCAAATCTTTCAAAATGCTTGCTGAATGTAAACTTCTCAGGCAAGGGCTGAATAG
GCCACAACACCTTACTGCGACTCCATGCTGTACAACGAATTATTGCTGCATTTAATCATTGTGACCATGATTTAATTCACATCGCACCTCCAAATTTGGTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAGGTAGCAGATTCAAAGCAATTTGCCAGCTTGCACATGTGATGAAAGGATGTGAAGCGATGGTTTCAGTTTCAACTACTAATTGCTGTAAACAATGCTAATAATTTTTCACAATTACTGCTGCACAGCCACAGANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTGACTCGTAAGAGAGAGTGCAGTCTCATTTTCAGCGACAAGTCCAAAGAGGCACTTGAGTTTCCCAGGACAAAAAAACACAAAAAACAGAAGATATTCAAAAATGCGAAATGAATTATTTCTTAATTGCTATCCACCAAAGTACCAACATATCAACATTAATGTGCTAGGTCAATCGTTTTTTACATTCCTCAAGGTTATG
ATTGATAAAACTGTAAGGACTCTTCTTATACACAAAGGGCGTTTAATCTCTTTCAAGAGTTACACGACATTAGTTTTCAAGCAAATATAAAAGATTTTCANNNNNNNNNNNNNNNNCGATTTCTCACGGGTGAGAGATAGCCTCTGCACATTTATTAGCATGGGTTTGAAAATCTTTAGCATTTGTTTTAAAATCTATGCCTTATAACTATTGAGAGATGTAAAACGCCCATCGTGTATTGTTTTCTGTGAGGGAAGTCCTCGGGATTTGATCAACGTCTTAGGCCCTTTTTCAGTCATTTTGCAATCAT
>scaffold4235  1.6

While the original file looks OK:
>scaffold2  2.0
CAGATGATTCAAACAAAGAGACTGAAGAAATGACTTCATCCGCTGACATAGAGAAAAAGGGCAATGAGAGTGTACATGCAGACTTAGTTATGCAAATAAANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTCCGAGAGCTTTGCAATGGGTAGTTGCCGTTCATTAACTGATATACTTGCCAAATTTAGTGAATTCCGT
>scaffold7  2.5
CTCAAACTGGTTTGAAATATTTAAAATTTCTCCCCGATCTGAGTTGAACTCGGTGTCTGGTCTAAGCCGTTAGAGTGTTTACATGGAAAATGCAACTCAANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGAAGGTGTGAGATGCAGTGTCTCGGAAGTCTGATTTAGCCTAGTGTTTTACTTGGCGCTCAATGCATGA
>scaffold5  3.1
TTCTCTTCTCAACCCTCATTACGCAATCAGTAACCTTCTTCTTGGTCAACCCTGGACCATCGATGCAGCCAGTGAACGATTCAAACAAGGTAGTGTATTCNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCCTTATTTACAGCTTTCAAGATGTCCTCTGGCTTAACCTTGAATTCATCTGAGAGAAATCCAAATCCTGTCCCGAAG
>scaffold9  2.0
CTTTTACTTCAGGAGAAAATAACCTTTCAAACATCGTGCATTCTTTCTTACTCATAAGGTATAGATAGCTCTTTGTAATAATTCATACGTTCTCTTTCACNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTAGTTTGTCTTATCTTAACAGCTCTTGTTACATAGATATATTTGGGAAGGAGTCGGTCAGTCAAGTTTG
>scaffold10  2.4
GTTGTCTCCTGAAATCATAAATTAGTATCATCATTATCATCATTATCATCATTATTATTTTCAAGGAAATATTTGGTCTAAACATCATTAAGATTTCAACNNNN
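
For anyone who wants to check this outside Galaxy, here is a minimal Python sketch that reads a FASTA file record by record (header plus its sequence lines), lists headers that have no sequence, and samples whole records so a header always stays with its sequence. The function names (read_fasta, empty_headers, sample_records) and the toy data are made up for illustration; this is not the Galaxy tool's implementation. The output above might be consistent with individual lines, rather than whole records, being sampled, but that is only a guess.

```python
import random


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines.

    The sequence is an empty string when a header has no sequence lines.
    Sequence lines that appear before the first header are ignored.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def empty_headers(records):
    """Return the headers of records that carry no sequence data."""
    return [header for header, seq in records if not seq]


def sample_records(records, k, seed=None):
    """Randomly pick up to k whole records (header and sequence together)."""
    records = list(records)
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


if __name__ == "__main__":
    # Toy input (hypothetical data, not taken from the real file).
    toy = [">a 1.0\n", "ACGT\n", ">b 2.0\n", ">c 3.0\n", "GG\n", "TT\n"]
    records = list(read_fasta(toy))
    print("headers without sequence:", empty_headers(records))
    for header, seq in sample_records(records, 2, seed=0):
        print(">" + header)
        print(seq)
```

To run it on a real file, pass an open file handle instead of the toy list, for example read_fasta(open("input.fasta")), where "input.fasta" is a placeholder file name.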

Thanks
Daniel
-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Daniel Sher, PhD
Department of Marine Biology
Leon H. Charney School of Marine Sciences
University of Haifa, Mt. Carmel 31905, Haifa, Israel
 
Office +972-4-8240731
Lab    +972-4-8288961
email: ds...@sci.haifa.ac.il