Hello Rita,

The UCSC Table Browser limits the amount of output that can be extracted in any single query. Without seeing your history, my initial suspicion is that both queries timed out, the first sooner than the second.

Comparing the number of items in the original UCSC table with the number in the final Galaxy dataset is a good place to start. You could also check whether the data ends abruptly (sometimes with a message) by running "Text Manipulation -> Select last lines from a dataset" for the last 10 lines or so, converting the dataset to tabular format first if necessary.
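If you would rather run the same check outside Galaxy on downloaded copies of the two datasets, here is a minimal sketch. The file names in the usage note are hypothetical, and the BED counter assumes that any non-data lines are blank or start with "#", "track", or "browser".

```python
from collections import deque


def count_fasta_records(path):
    """Count FASTA records by counting header lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def count_bed_items(path):
    """Count BED data lines, skipping blank, comment, track, and browser lines."""
    skip = ("#", "track", "browser")  # assumption: only these mark non-data lines
    with open(path) as handle:
        return sum(
            1 for line in handle if line.strip() and not line.startswith(skip)
        )


def last_lines(path, n=10):
    """Return the last n lines so a cut-off record or trailing message is visible."""
    with open(path) as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]
```

For example, compare count_fasta_records("rmsk_sequences.fasta") and count_bed_items("rmsk.bed") against the item count reported by the Table Browser summary, then print last_lines(...) for each file and look for a record that stops partway or for an error message.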

As long as the entire track is under 50G, you could instead load the flat text file of the data, which is available on the UCSC Downloads server. Download it following their instructions and then upload it into Galaxy using FTP: http://wiki.g2.bx.psu.edu/FTPUpload?action=edit

UCSC: http://genome.ucsc.edu/index.html
See "Downloads" on left tool menu
Navigate to genome, build, annotation database, and target table
Help is at: http://genome.ucsc.edu/contacts.html
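
Once the flat file is downloaded, the repClass filter can be applied locally before uploading. This is only a rough sketch: it assumes a tab-delimited file (optionally gzipped) with the usual rmsk column order, where the 0-based columns 5, 6, 7, 9, 10, and 11 hold genoName, genoStart, genoEnd, strand, repName, and repClass. Please check this against the table's schema on the UCSC site before relying on it. The function and file names are made up for illustration.

```python
import gzip

# Repeat classes from the original query.
WANTED = {"LINE", "SINE", "LTR", "DNA"}

# Assumed 0-based column positions in the rmsk flat file (verify against the schema).
CHROM, START, END, STRAND, REP_NAME, REP_CLASS = 5, 6, 7, 9, 10, 11


def rmsk_to_bed(in_path, out_path, wanted=WANTED):
    """Write rows whose repClass is in `wanted` as 6-column BED; return rows kept."""
    opener = gzip.open if in_path.endswith(".gz") else open
    kept = 0
    with opener(in_path, "rt") as src, open(out_path, "w") as dst:
        for line in src:
            fields = line.rstrip("\n").split("\t")
            # Skip short or malformed rows and rows from other repeat classes.
            if len(fields) <= REP_CLASS or fields[REP_CLASS] not in wanted:
                continue
            bed = [fields[CHROM], fields[START], fields[END],
                   fields[REP_NAME], "0", fields[STRAND]]
            dst.write("\t".join(bed) + "\n")
            kept += 1
    return kept
```

The returned count can be compared directly with the item count shown in the Table Browser summary for the same filter.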

Best wishes for your project,

Jen
Galaxy team

On 2/22/12 8:08 AM, Rita Rebollo wrote:
Hello,

I got an unexpected scientific result from a simple "Get Data" from the UCSC Table
Browser with Galaxy. I uploaded the mouse mm9 RepeatMasker track, filtered
on repClass = LINE SINE LTR DNA.
If I ask for the output as sequences, I get 1 454 739 sequences, but if I ask for the
same data in BED format I get roughly 3 600 000, which is closest to
the "summary/statistics" of the dataset (item count = 3 493 484). Why is there a
difference between the FASTA file and the BED file?

Thank you

Rita

___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

   http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

   http://lists.bx.psu.edu/

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/wiki/Support