Re: [galaxy-user] Filter fastq by percentage of ambiguous (N) bases
Hello Anto,

There is no specific tool that I know of to do this based on read content, but you could use the very low quality score (2) assigned to ambiguous bases and the tool 'Filter by quality' to filter by percentage. Be aware that other bases may also have been assigned this low score, but these would very likely not be of practical use anyway. You could clip these ends first, then do the filter, discarding any reads that have very short usable sequence left. If the data is Illumina, this is likely a sign of a sequence that failed vendor quality checks, and these are no longer removed by default as of Casava 1.8+.

Creating a regular expression with the Select tool is another option, but this is probably more effort than it is worth to construct. But, your choice. A Google search will bring up syntax advice.

Ideally the first will do the job,

Jen
Galaxy team

On 7/29/13 3:17 AM, Anto Praveen Rajkumar Rajamani wrote:

Hello,

I would like to filter my fastq files (50 bp single end Illumina RNA seq reads) by a maximum threshold (10%) of ambiguous (N) bases. I can see that the CLIP tool removes all reads with one or more N bases. Is there a way to remove only the reads with five or more N bases using Galaxy?

Thank you.

Best wishes,
Anto

___
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using reply all in your mail client.
For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/ -- Jennifer Hillman-Jackson Galaxy Support and Training http://galaxyproject.org
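For readers who can work outside Galaxy, the filter Anto describes can be sketched directly on read content. This is only an illustration, not a Galaxy tool: the function names and the `max_n_percent` parameter are made up for this example, and it assumes plain four-line FASTQ records with no wrapped sequence lines.

```python
from typing import Iterable, Iterator, Tuple

# header, sequence, separator, quality
FastqRecord = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield four-line FASTQ records (assumes sequences are not line-wrapped)."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        rest = [next(it, None) for _ in range(3)]
        if None in rest:
            raise ValueError("truncated FASTQ record: " + header)
        seq, sep, qual = (s.rstrip("\r\n") for s in rest)
        yield header, seq, sep, qual


def filter_by_n(records: Iterable[FastqRecord],
                max_n_percent: int = 10) -> Iterator[FastqRecord]:
    """Yield records whose N content is below max_n_percent of the read length."""
    for header, seq, sep, qual in records:
        n_count = seq.upper().count("N")
        # Integer comparison avoids floating-point edge cases at the cutoff.
        if len(seq) > 0 and n_count * 100 < len(seq) * max_n_percent:
            yield header, seq, sep, qual


# Hypothetical usage (file names are placeholders):
# with open("reads.fastq") as src, open("filtered.fastq", "w") as dst:
#     for record in filter_by_n(read_fastq(src), max_n_percent=10):
#         dst.write("\n".join(record) + "\n")
```

With 50 bp reads and a 10% cutoff, a read with four N bases is kept and a read with five is dropped, which matches the "five or more" rule in the question.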
Re: [galaxy-user] Problem with downloading
Hello Olivier,

From a shell/unix/terminal window on your side, use curl to download very large datasets. The link can be obtained by right-clicking the floppy disk icon inside a history item and choosing Copy Link Address (for most datasets) or Download Dataset/Download bam_index (for BAM datasets there are two downloads).

Once you have the link:

$ curl -O 'link'

Hopefully this helps!

Jen
Galaxy team

On 7/22/13 10:28 PM, GANDRILLON OLIVIER wrote:

Hi,

I am trying to download the results of my analysis (.sam files) from Galaxy main. The download gets interrupted about halfway, with no error message, as if the full file had been downloaded (but it has not).

Thanks for your help.

Best,
Olivier Gandrillon
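If curl is not available, the same idea (stream the dataset to disk, then confirm nothing was cut off) can be sketched in Python with only the standard library. The function names here are invented for this illustration, and the size check assumes the server sends a Content-Length header, which may not always be the case.

```python
import urllib.request


def copy_stream(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst in chunks and return the number of bytes written."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def download(url, out_path):
    """Stream url to out_path; fail loudly if fewer bytes arrive than announced."""
    with urllib.request.urlopen(url) as resp, open(out_path, "wb") as out:
        expected = resp.headers.get("Content-Length")  # may be absent
        written = copy_stream(resp, out)
    if expected is not None and written != int(expected):
        raise IOError(f"incomplete download: {written} of {expected} bytes")
    return written


# Hypothetical usage, with the link copied from the history item:
# download("link", "results.sam")
```

Reading in chunks keeps memory use flat for very large files, and the final comparison addresses the symptom in the question, where a partial file looked like a finished download.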
Re: [galaxy-user] Problem with ftp connexion to Galaxy
Hello Fabrice,

If you have not already done so, please reset your password and try again. This is our notice about the issue with accounts created during a short time window last May:
http://wiki.galaxyproject.org/Main/Notices#May_29th_2013_FTP_login_resolution

Best,

Jen
Galaxy team

On 7/29/13 7:57 AM, Fabrice BESNARD wrote:

Hello,

I registered with Galaxy some time ago. From the very beginning, my FTP connection has not been working. I am using FileZilla on a Windows operating system. More precisely, the connection can be established, but the process fails when it asks for the password.

Status: Resolving address of main.g2.bx.psu.edu
Status: Connecting to 128.118.250.4:21...
Status: Connection established, waiting for welcome message...
Response: 220 ProFTPD 1.3.4b Server (Galaxy Main Server FTP) [:::128.118.250.4]
Command: USER fbesn...@biologie.ens.fr
Response: 331 Password required for fbesn...@biologie.ens.fr
Command: PASS ***
Response: 530 Login incorrect.
Error: Critical error
Error: Could not connect to server

I sent an email two months ago to report this problem. I was told that this was an issue affecting newly created accounts. Obviously, this issue has not been fixed yet, has it? If it has, could you tell me how to make my FTP connection work now?

Thank you for your help!
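The client log in this thread shows three FTP reply codes. As a rough aid for reading such logs, the sketch below maps those codes to their RFC 959 meanings and shows how a login attempt could be checked with Python's standard `ftplib`. The helper names are made up for this example, and the table only covers the codes that appear in the log.

```python
import ftplib

# Meanings from RFC 959, limited to the codes seen in the log.
REPLY_MEANINGS = {
    220: "service ready for new user",
    331: "user name okay, need password",
    530: "not logged in",
}


def reply_code(line):
    """Extract the three-digit reply code from the start of a server reply."""
    return int(line.strip()[:3])


def explain(line):
    """Return a short meaning for a reply line, or 'unknown' if not listed."""
    return REPLY_MEANINGS.get(reply_code(line), "unknown")


def can_log_in(host, user, password):
    """Try an FTP login; a permanent error such as 530 returns False."""
    try:
        with ftplib.FTP(host, timeout=30) as ftp:
            ftp.login(user, password)
        return True
    except ftplib.error_perm:
        return False
```

A 530 immediately after PASS means the server reached the account but rejected the credentials, which is consistent with the advice to reset the password rather than change client settings.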
Re: [galaxy-user] rattus norvegicus reference genome
Hello Matt,

If you are using the public Main Galaxy instance or a Cloud Galaxy AMI, the genome should be available. But on a local instance, reference genomes need to be set up. These are the instructions:
http://wiki.galaxyproject.org/Admin/Data%20Integration
http://wiki.galaxyproject.org/Admin/NGS%20Local%20Setup

If you are on a local instance, you will probably want to start asking questions about local set-up on the galaxy-...@bx.psu.edu mailing list and also consider subscribing to or following it.
http://wiki.galaxyproject.org/Support#Mailing_Lists

Take care,

Jen
Galaxy team

On 7/29/13 10:51 AM, Matthew Girgenti wrote:

Hi,

I'm currently using the Galaxy pipeline to analyze my Illumina ChIP-seq data. When I try to use NGS: Mapping, Map with Bowtie for Illumina to map my reads, there is no rat reference genome. It says I should contact Galaxy. Should I simply upload the genome to proceed with the analysis?

Thanks,
Matt
Re: [galaxy-user] FTP problem
Hi Karen,

Help for using FTP is on this wiki page. The screencasts both show the two-step process. The first step is to FTP the data to the server; the second is to move the data from the Get Data - Upload Data tool form into your history.
http://wiki.galaxyproject.org/FTPUpload

Thanks!

Jen
Galaxy team

On 7/29/13 4:05 PM, Karen Schwarzberg wrote:

Hello,

I am using the Galaxy server and I am trying to upload files. I do not see the FTP option even when logged in. Please advise.

Best regards,
Karen
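The first of the two steps (sending the file to the server) can also be scripted. The sketch below uses Python's standard `ftplib`. The `upload` helper is a name invented for this example, and the host, account, and file names in the usage comment are placeholders. The second step, moving the file into a history, still happens in the Upload Data tool form.

```python
import os


def upload(ftp, local_path):
    """Send local_path over an already logged-in ftplib.FTP connection.

    Returns the file name used on the server.
    """
    name = os.path.basename(local_path)
    with open(local_path, "rb") as fh:
        # storbinary streams the file in binary mode using the STOR command.
        ftp.storbinary("STOR " + name, fh)
    return name


# Hypothetical usage (host and credentials are placeholders):
# import ftplib
# with ftplib.FTP("ftp.example.org") as ftp:
#     ftp.login("you@example.org", "your-galaxy-password")
#     upload(ftp, "reads.fastq")
```

Passing the connection in as an argument keeps the login separate from the transfer, so several files can be sent over one session.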
Re: [galaxy-user] Inquiry on FastQC report
Hello,

Your post is very difficult to read with the formatting. The best place to find out more about the FastQC program is through the tool documentation, linked from the tool form but also here:
http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

More below.

On 7/31/13 11:08 PM, Ng Kiaw Kiaw wrote:

> Dear Galaxy Officer,
>
> Good day. I am a new user of the Galaxy main server. The tools provided are very user-friendly. Thank you for establishing them. I am new to RNA-seq analysis and am in the process of learning bioinformatics. I would like to inquire about the FastQC report generated on my data. For your information:
>
> Samples: Plant (dicotyledon)
> Type of data: RNA-seq (Illumina HiSeq 2000 with CASAVA v 1.8.2), paired ends
> Adapter sequence: RPI 15 ( 5' CAAGCAGAAGACGGCATACGAGA*TTGACATG*TGACTGGAGTTCCTTGGCACCCGAGAATTCCA)
> Main purpose of my analysis: Identification of novel transcripts and gene expression studies
>
> I ran FastQC on my raw RNA-seq data, both forward and reverse. I attach the FastQC report to this email. My questions are:
>
> 1) The basic statistics show that my data encoding is Sanger/Illumina 1.9. When I groom my data for downstream analysis in Galaxy, is it correct to choose Sanger for the input FASTQ quality score type?

Yes, if you choose to groom, Sanger is the correct input. Or you can just assign the datatype .fastqsanger by clicking on the pencil icon. More help is in this screencast, FASTQ Prep - Illumina:
https://main.g2.bx.psu.edu/u/galaxyproject/p/screencasts-usegalaxyorg

> 2) Based on the per base sequence quality, the quality scores are above 20.0 for both forward and reverse data. Do I still need to trim my data?

No, most likely not; this is a reasonable quality score to use as a baseline.

> 3) The results for Per base sequence content, Per base GC content, and Sequence duplication level are fail. What do these three results indicate? What are the solutions for these problems?

These are quality metrics and indicate that the data is skewed away from the distribution that would normally be expected. You could investigate the library preparation methods if this is your own data.

> 4) What do the overrepresented sequences indicate? Do I need to trim off the overrepresented sequences?

Same as above. And yes, trim them if they are a large portion of your data, are repetitive, or cause problems later on: they effectively shorten the length of sequence that can be aligned, even though the read itself is longer, and this could cause you to pick the wrong length parameters in TopHat.

> 5) Based on the K-mer content, how could I analyse and justify whether this is good data or not?

Same as above.

> 6) In the reverse data FastQC report, the per sequence GC content does not look good. What does this indicate?

Same as above.

> 7) How could I identify the adapter sequence in my RNA-seq data and how could I remove it?

The methods used to prepare the data are the first place to look. If the overrepresented sequence is localized to where the adapter is most likely to be, you could also simply trim the reads based on that range.

> 8) After grooming the data, running FastQC, and removing adapters, are there any other pre-processing steps that need to be done before running Bowtie and TopHat?

Because quality is not an issue, no trimming is necessary. You could, however, filter out short sequences that will never be able to meet the alignment criteria. See the TopHat documentation about how to best tune parameters to match data based on the length of reads.

All of this said, most of the time very little needs to be done. Poor reads will simply fall out and not align in the first steps of the pipeline. Trimming and setting TopHat parameters will have the greatest impact.

Take care,

Jen
Galaxy team

> Many thanks in advance for your kind assistance and support.
>
> Best regards,
> Ng Kiaw Kiaw
> PhD student
> RIKEN Yokohama Campus
> Japan
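To make the answer to question 1 in the FastQC thread concrete: in Sanger (and Illumina 1.8+) FASTQ files, each quality character encodes a Phred score as its ASCII value minus 33. The sketch below decodes a quality string and also shows the kind of length filter mentioned in the answer to question 8. The function names and the `min_length` parameter are hypothetical and chosen for this illustration only.

```python
SANGER_OFFSET = 33  # ASCII offset for Sanger / Illumina 1.8+ quality strings


def phred_scores(quality, offset=SANGER_OFFSET):
    """Decode a FASTQ quality string into a list of Phred scores."""
    return [ord(ch) - offset for ch in quality]


def mean_quality(quality, offset=SANGER_OFFSET):
    """Average Phred score of a read, or 0.0 for an empty quality string."""
    scores = phred_scores(quality, offset)
    return sum(scores) / len(scores) if scores else 0.0


def long_enough(seq, min_length):
    """True if the read is at least min_length bases long."""
    return len(seq) >= min_length
```

For example, the character `I` decodes to 40, `5` to 20, and `#` to 2. A suitable `min_length` depends on the aligner settings, so it is left as a parameter here.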
Re: [galaxy-user] after 'trim sequences', 'map w/Bowtie' job has been waiting for hours
Hello,

The public Main Galaxy has been very slow recently, but you should be seeing performance improvement by now. Allow jobs to run for the best results.
http://wiki.galaxyproject.org/Support#Dataset_status_and_how_jobs_execute

Our apologies for the inconvenience,

Jen
Galaxy team

On 8/2/13 6:34 AM, Theresa Stueve wrote:

Greetings,

I am using Galaxy Main. On 8/1/2013, after 'trimming sequences' on my 'FASTQ groomed' data, I attempted to align my (single-end) 'trimmed sequences' output data to 'hg19 Canonical male' with 'Map with Bowtie for Illumina'.

It is now 8/2/2013, and the 'Map with Bowtie for Illumina' job has been 'waiting to run' all night. I have 're-newed' the history several times, but have not deleted or re-run the job for fear of losing my place in the queue.

Can you tell me if the delay is because I've done something incorrectly?

https://main.g2.bx.psu.edu/u/theresa-stueve/h/1stunsupctcfupload

--
(Theresa) Ryan Stueve
Department of Preventive Medicine
USC
Re: [galaxy-user] very slow server
Hello,

The public Main Galaxy has been very slow recently, but you should be seeing performance improvement by now. Allow jobs to run for the best results.
http://wiki.galaxyproject.org/Support#Dataset_status_and_how_jobs_execute

Our apologies for the inconvenience,

Jen
Galaxy team

On 8/2/13 9:27 AM, Politz, Samuel M. wrote:

Hello,

For the last week or so the main Galaxy server has not completed any of my jobs. These were jobs that were successfully queued, but I deleted them after 48 hours passed without completion. I started a job last night (BEDtools BedGraph of genome coverage on a BAM file). This morning I got an error message, something to the effect that my account was having difficulty receiving updates from the Galaxy server. I am not sure this latest message had anything to do with the previous delays, but the current job is still queued up.

Over the past month or two, the Galaxy server has been slow to complete jobs at times, but never by more than a few hours, or at most, overnight. I wonder if the current problem is a symptom of increased use of the server, or if something is wrong with my particular usage. Either way, it is currently preventing me from getting any analysis done on the server.

Best regards,
Sam Politz
Re: [galaxy-user] slow Galaxy server
Hello,

The public Main Galaxy has been very slow recently, but you should be seeing performance improvement by now. Allow jobs to run for the best results.
http://wiki.galaxyproject.org/Support#Dataset_status_and_how_jobs_execute

Our apologies for the inconvenience,

Jen
Galaxy team

On 8/2/13 9:49 AM, Politz, Samuel M. wrote:

This is a followup to my previous post on this topic. The exact message (in red) that I see after a long delay in running a tool is:

An error occurred while getting updates from the server. Please contact a Galaxy administrator if the problem persists.

I have seen this message twice while waiting for two unrelated tools to run. There is no green 'bug' icon, so I could not report this to an administrator.

Another main server user reported this error message to the user mailing list as well. The response to their post was to wait a few minutes. Well, in this case I have been waiting overnight.

Best regards,
Sam Politz
Re: [galaxy-user] after 'trim sequences', 'map w/Bowtie' job has been waiting for days to run
Hello,

This is the same issue: the public Main Galaxy has been very slow recently, but is catching up.

Again, very sorry for the confusion,

Jen
Galaxy team

On 8/3/13 7:04 AM, Theresa Stueve wrote:

Greetings,

I am using Galaxy Main. On 8/1/2013, after 'trimming sequences' on my 'FASTQ groomed' data, I attempted to align my (single-end) 'trimmed sequences' output data to 'hg19 Canonical male' with 'Map with Bowtie for Illumina'.

It is now 8/3/2013, and the 'Map with Bowtie for Illumina' job has been 'waiting to run' for more than 48 hours. I have 're-newed' the history several times, but have not deleted or re-run the original job for fear of losing my place in the queue.

To check if 'canonical male' was the problem, I queued a second alignment to be 'mapped w/bowtie for Illumina' against hg19 (no sex); this job has also not run after being queued more than 24 hours ago.

Can you tell me if the delay is because I've done something incorrectly, or if there's some action I can take to speed things up? Your help is deeply and sincerely appreciated.

https://main.g2.bx.psu.edu/u/theresa-stueve/h/1stunsupctcfupload-1

--
(Theresa) Ryan Stueve
Department of Preventive Medicine
Ite Laird-Offringa Lab
NTT 6420
[galaxy-user] Uploading speed question
I am uploading a file to the Main Galaxy and the speed as shown by FileZilla is 180 KB/sec. I know from other work that my internet connection runs at 2 MB/sec and usually does not go lower than 1 MB/sec. Is that normal, or is there a problem, and if so where is it: my system or Main? What should the speed be?

Also, does anybody have experience with uploading to a Galaxy cloud instance? What's the speed there?

;-) Does Main accept mailed-in hard drives? ;-)

Gerald
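To put the two rates from this question side by side, a small calculation helps. The sketch below is only arithmetic: the 10 GB file size is a made-up example (the post does not give one), and KB and MB are assumed to mean 1,000 and 1,000,000 bytes.

```python
def transfer_hours(size_bytes, rate_bytes_per_sec):
    """Hours needed to move size_bytes at a steady rate."""
    if rate_bytes_per_sec <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / rate_bytes_per_sec / 3600


# Hypothetical 10 GB upload at the observed and the expected rates.
size = 10e9
observed = transfer_hours(size, 180e3)  # 180 KB/sec, as shown by FileZilla
expected = transfer_hours(size, 1e6)    # 1 MB/sec, the usual lower bound
print(f"observed: {observed:.1f} h, expected: {expected:.1f} h")
```

Under these assumptions the observed rate turns an upload of under three hours into one of more than fifteen, which is why the difference matters for large datasets.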