[galaxy-user] Text Manipulation: Filter out duplicates (uniq) from an plain text file ?

2011-05-06 Thread Roman Valls
Hey Galaxy users, That's a fairly good question from one of my colleagues. I've looked through the menus (mainly Text Manipulation and Filter and Sort (Select)) and googled (the mailing list archives too), but couldn't find an answer: How should I remove duplicates on plain text files without

Re: [galaxy-user] Text Manipulation: Filter out duplicates (uniq) from an plain text file ?

2011-05-06 Thread Peter Cock
On Fri, May 6, 2011 at 3:16 PM, Roman Valls brainst...@nopcode.org wrote: Well, since Galaxy has similarly basic tools that can also be run on the command line, such as sort or cut, I just wondered how come a uniq is not there on the tool panel in some form/name. Thanks for the feedback Rory

Re: [galaxy-user] Text Manipulation: Filter out duplicates (uniq) from an plain text file ?

2011-05-06 Thread Guru Ananda
Hi Peter and Roman, The Count tool under the Statistics section provides uniq-like functionality. If you run this tool with all columns selected in the "Count occurrences of values in column(s)" field, your output will contain one line per distinct record, with the 1st column containing the number of occurrences
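
For readers who want to see what "uniq-like" counting means outside Galaxy, here is a minimal sketch in plain Python. It is not the Count tool's implementation; the function names (`count_records`, `unique_records`) and the first-seen output order are assumptions made for illustration.

```python
from collections import Counter


def count_records(lines):
    """Return (count, record) pairs, one per distinct record.

    Records are whole lines with the trailing newline removed.
    Output order is first-seen order (an assumption, not Galaxy's behavior).
    """
    counts = Counter(line.rstrip("\n") for line in lines)
    return [(n, record) for record, n in counts.items()]


def unique_records(lines):
    """Return each distinct record once, dropping the counts."""
    return [record for _, record in count_records(lines)]


if __name__ == "__main__":
    sample = ["chr1\t100\n", "chr2\t200\n", "chr1\t100\n"]
    # Print the count in the first column, followed by the record itself.
    for n, record in count_records(sample):
        print(f"{n}\t{record}")
```

Dropping the first column of such an output leaves the de-duplicated records, which is the effect the original question asked for.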

[galaxy-user] RNA seq analysis

2011-05-06 Thread puvan001
Hi, I have a couple of questions regarding RNA-seq analysis. My questions are: 1. I need to use a viral genome (very small, ~2 kb) as a reference genome and it is not available in Galaxy. I guess I can use this data from my history. I have a FASTA file but I am not sure whether I have to do some

Re: [galaxy-user] RNA seq analysis

2011-05-06 Thread David Matthews
Hi, I have done exactly the same kind of thing for adenovirus so I can help with it. In answer to question 1, you do not need to index it; that will be done for you when TopHat is called. Secondly, you should leave the 40 multihits setting as it is and filter out the multihits after the analysis - this will allow
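
One way such a post-analysis filter could look is sketched below in plain Python. It is not the workflow mentioned in the thread. It assumes the alignments are available as SAM text and that the aligner wrote the optional `NH:i:` tag (number of reported alignments for the read); the function names are hypothetical.

```python
def is_unique_hit(sam_line):
    """Return True if the alignment record carries NH:i:1.

    Optional tags start at the 12th tab-separated field of a SAM record.
    Records without an NH tag are treated as not unique (a design
    choice for this sketch, since uniqueness cannot be confirmed).
    """
    fields = sam_line.rstrip("\n").split("\t")
    for tag in fields[11:]:
        if tag.startswith("NH:i:"):
            return int(tag[len("NH:i:"):]) == 1
    return False


def filter_unique(sam_lines):
    """Yield header lines unchanged and only uniquely aligned records."""
    for line in sam_lines:
        if line.startswith("@") or is_unique_hit(line):
            yield line
```

Keeping the aligner's multihit setting and filtering afterwards means the multihit records remain available if they are needed later.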

Re: [galaxy-user] RNA seq analysis

2011-05-06 Thread puvan001
Hi David, Thanks! When I tried to run TopHat, it didn't recognise my FASTA file and said "History does not include a dataset of the required format / build". Do you have any thoughts about this? The multihits now make more sense. Thanks for sharing your workflow. With regards

Re: [galaxy-user] RNA seq analysis

2011-05-06 Thread Austin Paul
Hi, You need to run FASTQ Groomer on your RNA-seq data. Your reference is fine as a FASTA. Austin On Fri, May 6, 2011 at 10:26 AM, puvan...@umn.edu wrote: Hi David, Thanks! When I tried to run TopHat, it didn't recognise my FASTA file and said History does not include a dataset of the

Re: [galaxy-user] RNA seq analysis

2011-05-06 Thread puvan001
Hi Austin, I did all of this (grooming and trimming) on the RNA-seq data and I don't have a problem with a built-in genome. I'll try again! Thanks Sumathy On May 6 2011, Austin Paul wrote: Hi, You need to run FASTQ Groomer on your RNA-seq data. Your reference is fine as a FASTA. Austin On

[galaxy-user] Composite Datatypes Q.

2011-05-06 Thread Todd Yilk
I have a program I'm trying to galaxify that emits a variable number of result files. I would like the output of my Galaxy tool to show up in Galaxy as an HTML file with links to the result files. So when you click on the eye, the HTML file should show up in the middle pane ... sorry if I'm not
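
The HTML side of that idea can be sketched in plain Python as below. This covers only generating an index page that links to whatever files a directory contains; how Galaxy's composite datatypes register that directory is not shown here and is not assumed. The function name `build_index_html` and the relative-link layout are hypothetical.

```python
import html
import os
from urllib.parse import quote


def build_index_html(result_dir, title="Results"):
    """Return an HTML page listing every regular file in result_dir.

    Links are relative file names, so the page is assumed to be served
    from the same location as the result files.
    """
    names = sorted(
        name
        for name in os.listdir(result_dir)
        if os.path.isfile(os.path.join(result_dir, name))
    )
    items = "\n".join(
        f'<li><a href="{quote(name)}">{html.escape(name)}</a></li>'
        for name in names
    )
    return (
        f"<html><head><title>{html.escape(title)}</title></head><body>\n"
        f"<ul>\n{items}\n</ul>\n"
        "</body></html>\n"
    )
```

Because the file list is read at run time, the same code handles a variable number of result files.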

[galaxy-user] Megblast GIs

2011-05-06 Thread Douglas Rhoads
We have a local install of Galaxy and are using it for training grad and undergrad students (using the Windshield Splatter data). We have a relatively new install and the Megablast seems to be doing something funny with the output in that column 2 which is supposed to be the GI of the

Re: [galaxy-user] RNA seq analysis

2011-05-06 Thread Austin Paul
There are many ways. I typically use IGV. It needs a sam file, so I first convert the bam to sam in galaxy, then download the sam file. In IGV, I upload the reference and the sam file, then use IGVtools to index the sam file, then I can visualize the data. Austin On Fri, May 6, 2011 at 5:30

Re: [galaxy-user] RNA seq analysis

2011-05-06 Thread Austin Paul
Oops. Good to know. Thanks. Austin On Fri, May 6, 2011 at 6:02 PM, Sean Davis sdav...@mail.nih.gov wrote: IGV reads BAM files just fine; no need to convert to SAM. Sean On Fri, May 6, 2011 at 8:45 PM, Austin Paul austi...@usc.edu wrote: There are many ways. I typically use IGV. It

Re: [galaxy-user] RNA seq analysis

2011-05-06 Thread Jim Robinson
Hi Vasu, I'm going to add the function to index BAM files soon, using Picard. In the beginning there was no Java BAM reader, only SAM, and I added the index then. Indexed BAMs came along later, but that's probably more than you want to know... I think most people will still

Re: [galaxy-user] RNA seq analysis

2011-05-06 Thread puvan001
Hi, I may be doing this the wrong way. I clicked Trackster and added the custom build genome. Since it is a very small genome (~2 kb), I treated it as a single contig. Then I clicked "add tracks" and added my data file, but I got the message "no data for this contig". Whenever I used built in

Re: [galaxy-user] RNA seq analysis

2011-05-06 Thread vasu punj
Thanks Jim, Vasu --- On Fri, 5/6/11, Jim Robinson jrobi...@broadinstitute.org wrote: From: Jim Robinson jrobi...@broadinstitute.org Subject: Re: [galaxy-user] RNA seq analysis To: vasu punj pu...@yahoo.com Cc: Austin Paul austi...@usc.edu, Sean Davis sdav...@mail.nih.gov,