Re: [galaxy-user] Megablast database identity

2014-04-28 Thread Jennifer Jackson
Hello, The Megablast htgs, nt, and wgs databases are in the process of being updated to the latest NCBI releases and are expected to be available by tomorrow morning (possibly sooner). Should you wish to continue your analysis using the prior versions, these are available through our rsync

[galaxy-user] Megablast database identity

2014-04-27 Thread Scott W. Tighe
Jennifer, I am megablasting a simple 500,000-line dataset that is certainly in Galaxy FASTA format. For a week I have been seeing numerous errors, so I have reprocessed the data multiple times. The error message is "could not find specified database directory". Is there an alternative approach? I

Re: [galaxy-user] MegaBLAST output

2012-04-25 Thread Sandrine Hughes
Dear all, Some time ago, I reported on this list the same megablast problem that Sarah mentioned. I finally used another way to analyse my data, but my conclusion was similar to Sarah's: most of the time there is a shift of "-1" between the GI number in the output and the following

Re: [galaxy-user] MegaBLAST output

2012-04-24 Thread Jennifer Jackson
Hi Sarah, We appreciate all of the information you have provided and have been working here since yesterday to investigate the issue in more detail. This includes incorporating the additional data both you and Peter have been posting. We don't have anything conclusive to report yet, but it

Re: [galaxy-user] MegaBLAST output

2012-04-24 Thread Peter Cock
On Tue, Apr 24, 2012 at 10:24 PM, Jennifer Jackson j...@bx.psu.edu wrote: ..., using the BLAST+ BLASTN megablast wrapper that Peter authored, in a local or cloud instance, would be the best immediate remedy (this version has the standard 12 column output). Sequence length data could always be

Re: [galaxy-user] MegaBLAST output

2012-04-24 Thread Jennifer Jackson
Thanks Peter, Excellent point. From there, the Cut tool could be used to reorganize the output to exactly match that of the 13-column regular megablast output. So, no external data needed, no tool modifications needed. This can't be done on the main public Galaxy instance as BLAST+ is not

[galaxy-user] MegaBLAST output

2012-04-23 Thread Sarah Hicks
I am having trouble finding information on the MegaBLAST output columns. What is each column for? I can't seem to figure this out by comparing info in the columns to NCBI directly because the GI#'s don't match with the correct entry on NCBI. I've seen that others have posted about that problem, so

Re: [galaxy-user] MegaBLAST output

2012-04-23 Thread Jennifer Jackson
Hi Sarah, Peter defined the columns (thanks), but I can provide some information about the GenBank identifiers. The megablast databases on the public server are roughly a year old, and there have been updates at NCBI since that time. As I understand it, this manifests as occasional mismatches

Re: [galaxy-user] MegaBLAST output

2012-04-23 Thread Sarah Hicks
Thanks so much for the prompt reply. I don't mind using last year's GenBank, as long as I am getting accurate hits. I just have a couple more questions to confirm I am safe using the Galaxy pipeline for this... So if I continue to work within the one-year-old database, can I trust the output as

Re: [galaxy-user] MegaBLAST output

2012-04-23 Thread Sarah Hicks
Peter, you requested an example; here are the first five hits for my first query sequence (OTU#0):
0 324034994 527 93.23 266 13 5 1 265 22 283 7e-102 379.0
0 56181650513 93.26 267 10 8 1 265 25
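
For readers trying to interpret rows like the first one quoted above, here is a minimal Python sketch that splits a 13-column megablast row into named fields. The column meanings (query id, hit GI, hit length, percent identity, alignment length, mismatches, gaps, query start/end, hit start/end, E-value, bit score) are an assumption based on the Galaxy megablast tool's usual description and are not confirmed in this thread; `MegablastHit` and `parse_hit` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class MegablastHit(NamedTuple):
    # Field names are assumed column meanings, not taken from the thread.
    query_id: str
    subject_gi: str
    subject_length: int
    percent_identity: float
    alignment_length: int
    mismatches: int
    gaps: int
    query_start: int
    query_end: int
    subject_start: int
    subject_end: int
    e_value: float
    bit_score: float


def parse_hit(line: str) -> MegablastHit:
    """Split one whitespace-separated row into 13 typed fields."""
    fields = line.split()
    if len(fields) != 13:
        # Truncated or fused rows are rejected rather than guessed at.
        raise ValueError(f"expected 13 columns, found {len(fields)}")
    return MegablastHit(
        fields[0],
        fields[1],
        int(fields[2]),
        float(fields[3]),
        *(int(value) for value in fields[4:11]),
        float(fields[11]),
        float(fields[12]),
    )


hit = parse_hit("0 324034994 527 93.23 266 13 5 1 265 22 283 7e-102 379.0")
print(hit.subject_gi, hit.subject_length, hit.percent_identity)
```

A row with the wrong number of columns, such as the truncated second row in the quoted message, raises `ValueError` instead of being silently misread.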

Re: [galaxy-user] Megablast question

2012-04-11 Thread Jennifer Jackson
Hi Vasu, The three primary megablast databases available on the public main Galaxy instance are comprised of individual fragments/sequences of different types from many species (not assembled genomes): http://user.list.galaxyproject.org/Question-about-megablast-td4543260.html If you want to

[galaxy-user] Megablast question

2012-04-10 Thread shamsher jagat
Hi, I am using megablast and was wondering how I can get the chromosome number and coordinates of its hits. Thanks Shamesher

Re: [galaxy-user] Megablast

2012-02-21 Thread Jennifer Jackson
Hello Scott, For #1, option -p: Here is a link to some megablast parameter documentation online: http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/megablast.html#3 (the primary paper for the Galaxy tool is noted at the bottom of the tool form, but this is convenient) Quote: Table 3.30 Parameter

Re: [galaxy-user] Megablast question

2012-02-16 Thread Noa Sher
Hi Scott, I have never used megablast, so what I am writing is true of any FASTA file (if there is anything quirky in megablast that I don't know about, apologies!): Take your FASTA file and convert it to tabular (under "fasta manipulation" - this will
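
Outside Galaxy, the FASTA-to-tabular step described above could be sketched roughly as follows in Python. This is not the Galaxy tool's implementation; `fasta_to_tabular` is a hypothetical helper that assumes a plain FASTA layout (a `>` header line followed by one or more sequence lines) and yields one record per row.

```python
from typing import List, Tuple


def fasta_to_tabular(fasta_text: str) -> List[Tuple[str, str]]:
    """Return (header, sequence) pairs, one pair per FASTA record."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for raw_line in fasta_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            # Wrapped sequence lines are joined into one string.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = ">seq1\nACGT\nACGT\n>seq2\nTTGA\n"
for name, sequence in fasta_to_tabular(example):
    print(f"{name}\t{sequence}")
```

With each record on a single row, line-based operations no longer risk cutting a sequence in the middle.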

Re: [galaxy-user] Megablast question

2012-02-16 Thread Dannon Baker
Noa has the right idea, but if you're asking for how to split a dataset into two non-overlapping halves you'll want to use Select First and Select Last, instead of random lines. Get an accurate line count from your file using the Line/Word/Character count tool and then split it right in the
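
The count-then-split idea above can be illustrated with a short Python sketch. It only mirrors the logic of taking the first N lines and the remaining lines after an exact line count; `split_in_half` is a hypothetical name, and the sketch assumes the dataset has one record per line (for example, FASTA already converted to tabular).

```python
from typing import List, Sequence, Tuple


def split_in_half(lines: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Split lines into two non-overlapping parts at the midpoint."""
    total = len(lines)          # the accurate line count
    midpoint = (total + 1) // 2  # the first half gets the extra line if odd
    first_half = list(lines[:midpoint])   # like selecting the first N lines
    second_half = list(lines[midpoint:])  # like selecting the last (total - N)
    return first_half, second_half


rows = ["r1", "r2", "r3", "r4", "r5"]
first, second = split_in_half(rows)
print(len(first), len(second))
```

Because the two slices share the same midpoint, every line lands in exactly one half, which is what random line selection cannot guarantee.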