[galaxy-dev] stdout and stderr while using pbs
Hi all,

I am running jobs through PBS and I get output on both stderr and stdout. How can I handle this? Can I check standard error before anything is displayed in the browser?

Regards,
Shashi Shekhar

___
Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
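A minimal sketch of one way a tool wrapper could inspect standard error before showing anything to the user. It is not specific to PBS or to Galaxy's job runners; `run_and_check` is a hypothetical helper name, and treating a non-zero exit code as the failure signal is an assumption about what counts as an error.

```python
import subprocess
import sys


def run_and_check(cmd):
    """Run cmd (a list of arguments) and return (ok, stdout, stderr).

    Nothing is printed here, so the caller decides what to display.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    # Assumption: a non-zero exit code means failure. Text on stderr alone
    # may just be warnings, so it is returned rather than judged here.
    ok = proc.returncode == 0
    return ok, out.decode("utf-8", "replace"), err.decode("utf-8", "replace")


if __name__ == "__main__":
    # Demo command: writes a warning to stderr and a result to stdout.
    demo = [sys.executable, "-c",
            "import sys; sys.stderr.write('warning\\n'); print('result')"]
    ok, out, err = run_and_check(demo)
    if ok:
        sys.stdout.write(out)
    else:
        sys.stderr.write(err)
```

With this shape the wrapper can discard harmless warnings, or turn them into a clean error message, before any output is reported.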
[galaxy-dev] GATK integration
Hey,

I know the GATK tools are considered to be in alpha status, but maybe someone can help me.

First, is there documentation on which dependencies need to go where (just to confirm I've done it correctly)?

Second, when I run the Unified Genotyper, I get the following error:

[Wed May 25 15:57:04 CEST 2011] net.sf.picard.sam.CreateSequenceDictionary REFERENCE=/var/folders/gr/grLYUw45FFS4KKMXzK5R6TI/-Tmp-/tmp7m3sg0/gatk_input.fasta OUTPUT=/var/folders/gr/grLYUw45FFS4KKMXzK5R6TI/-Tmp-/tmp7m3sg0/dict6511818973877179087.tmp TRUNCATE_NAMES_AT_WHITESPACE=true NUM_SEQUENCES=2147483647 TMP_DIR=/var/folders/gr/grLYUw45FFS4KKMXzK5R6TI/-Tmp-/ngs2 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=50 CREATE_INDEX=false CREATE_MD5_FILE=false
[Wed May 25 15:57:04 CEST 2011] net.sf.picard.sam.CreateSequenceDictionary done.
Runtime.totalMemory()=129957888
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at net.sf.picard.reference.IndexedFastaSequenceFile.getSubsequenceAt(IndexedFastaSequenceFile.java:178)
	at net.sf.picard.reference.IndexedFastaSequenceFile.getSequence(IndexedFastaSequenceFile.java:157)
	at net.sf.picard.reference.IndexedFastaSequenceFile.nextSequence(IndexedFastaSequenceFile.java:234)
	at net.sf.picard.sam.CreateSequenceDictionary.makeSequenceDictionary(CreateSequenceDictionary.java:133)
	at net.sf.picard.sam.CreateSequenceDictionary.doWork(CreateSequenceDictionary.java:113)
	at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:165)
	at org.broadinstitute.sting.gatk.datasources.simpleDataSources.ReferenceDataSource.<init>(ReferenceDataSource.java:131)
	at org.broadinstitute.sting.gatk.AbstractGenomeAnalysisEngine.openReferenceSequenceFile(AbstractGenomeAnalysisEngine.java:577)
	at org.broadinstitute.sting.gatk.AbstractGenomeAnalysisEngine.initializeDataSources(AbstractGenomeAnalysisEngine.java:318)
	at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:90)
	at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:97)
	at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:244)
	at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:87)

I assume that changing the Java heap size will solve the problem. But how can I do that in Galaxy?

Thanks for your help!
Jan
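For reference, the JVM's maximum heap size is set with the standard `-Xmx` option on the `java` command line. The sketch below only shows how a wrapper script might build such a command. `java_command` is a hypothetical helper, the jar path is a placeholder, and the thread does not say where the Galaxy GATK wrapper assembles its command, so this is an assumption about the general approach and not the wrapper's actual code.

```python
def java_command(jar_path, args, max_heap="2g"):
    """Build a java command line with an explicit maximum heap size."""
    # -Xmx is the standard JVM option for the maximum heap size, e.g. -Xmx4g.
    return ["java", "-Xmx" + max_heap, "-jar", jar_path] + list(args)


# Placeholder jar path and arguments, for illustration only.
cmd = java_command("GenomeAnalysisTK.jar", ["-T", "UnifiedGenotyper"], max_heap="4g")
print(" ".join(cmd))
# prints: java -Xmx4g -jar GenomeAnalysisTK.jar -T UnifiedGenotyper
```

The right heap size depends on the reference genome and on the memory available on the machine running the job.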
[galaxy-dev] Workflows and unknown number of output datasets
Hi everyone,

First let me thank all of Team Galaxy; the conference was really great.

Kanwei asked me to send the tool I told him about. It is one of those tools for which you can't know the exact number of output datasets before the tool runs. Here are the files. It's really simple actually, but it would be nice if I could integrate it into workflows without it having to be the final step. Would that even be possible?

And please ignore the idiotic comments in the files, I was really tired that day.

Cheers,
L-A

# -*- coding: UTF-8 -*-
import os, sys

mpxdata = sys.argv[1]
barcodes = sys.argv[2]
output1 = sys.argv[3]
output1id = sys.argv[4]
newfilepath = sys.argv[5]

# Building the command line
cmd = "java -cp /g/steinmetz/projects/solexa/java/solexaJ/bin/:/g/steinmetz/projects/solexa/software/picard/trunk/dist/picard-1.18.jar:/g/steinmetz/projects/solexa/software/picard/trunk/dist/sam-1.18.jar"
cmd += " deBarcoding F1="
cmd += mpxdata
cmd += " BL="
cmd += barcodes
cmd += " DR=\""
cmd += newfilepath
cmd += "\""

# Executing deBarcoding
status = os.system(cmd)

# In the unlikely event of a fire, please use the nearest emergency exit
if status != 0:
    print("Demultiplexing failed.")
    # os.system returns a wait status (e.g. 256), which sys.exit would
    # truncate to 0, so exit with an explicit non-zero code instead
    sys.exit(1)

oldnames = []

# Reconstructing the output file names as deBarcoding writes them
bc = open(barcodes, "r")
for l in bc.readlines():
    l = l.split()
    # Skip blank lines (split() returns an empty list for them)
    if l and l[0] != "":
        oldnames.append(l[0])
bc.close()
for i in range(len(oldnames)):
    oldnames[i] = oldnames[i] + ".txt"

newnames = []

# Creating the required paths for multiple outputs
if os.path.isdir(newfilepath):
    for f in oldnames:
        if os.path.isfile(newfilepath + "/" + f):
            name = os.path.splitext(f)[0]
            s = "primary_"
            s += output1id
            s += "_"
            s += name.replace("_", "-")
            s += "_visible_fastq"
            newnames.append(newfilepath + "/" + s)

# Adding the appropriate prefixes to the old filenames
for i in range(len(oldnames)):
    oldnames[i] = newfilepath + "/" + oldnames[i]

# Setting the first file as the mandatory output file defined in the xml
newnames[0] = output1

# Moving everything where it will be seen properly by Galaxy
for i in range(len(oldnames)):
    os.rename(oldnames[i], newnames[i])

# Ta-da!

<tool id="debarcoding" name="Demultiplexer">
  <description>Demultiplexes multiplexed data (who would have guessed?)</description>
  <command interpreter="python">debarcoding.py $mpxdata $barcodes $output1 $output1.id $__new_file_path__</command>
  <inputs>
    <param type="data" format="gz" name="mpxdata" label="Compressed Sequence"/>
    <param type="data" format="bc" name="barcodes" label="Barcode Set"/>
  </inputs>
  <outputs>
    <data format="fastq" name="output1" metadata_source="mpxdata" />
  </outputs>
  <help>
**Program:** debarcoding.py (v1.0.0)

**Author:** This is a wrapper for Wave's deBarcoding java tool

**Summary:** This tool demultiplexes data according to a list of barcodes containing a column with the sample name and a second column with the barcode sequence.

**Usage:** Here is an example of the java tool's usage::

  Two failed PE runs (multiplex 2 and multiplex8) yielded SE data that can be used for assembly together with the corrected PE ones.
  To this end, the deBarcoding script from wave was used:
  javasol deBarcoding
  where javasol is an alias to:
  java -cp /g/steinmetz/projects/solexa/java/solexaJ/bin/:/g/steinmetz/projects/solexa/software/picard/trunk/dist/picard-1.18.jar:/g/steinmetz/projects/solexa/software/picard/trunk/dist/sam-1.18.jar
  The command lines were (thanks to Wave for the clarifications):
  For the mplex num. 3, some pre-processing was necessary
  R src/getBarcodeSequencing.R
  javasol deBarcoding F1=s_1_LESAFFRE_sequence.txt.gz BL=barcodeList.txt DR=seq_lane1
  javasol deBarcoding F1=s_2_LESAFFRE_sequence.txt.gz BL=barcodeList.txt DR=seq_lane2
  javasol deBarcoding F1=s_3_LESAFFRE_sequence.txt.gz BL=barcodeList.txt DR=seq_lane3
  cat seq_lane1/111.txt seq_lane2/111.txt seq_lane3/111.txt > sequences/111.txt
  cat seq_lane1/112.txt seq_lane2/112.txt seq_lane3/112.txt > sequences/112.txt
  cat seq_lane1/1251.txt seq_lane2/1251.txt seq_lane3/1251.txt > sequences/1251.txt
  cat seq_lane1/1303.txt seq_lane2/1303.txt seq_lane3/1303.txt > sequences/1303.txt
  cat seq_lane1/93ep.txt seq_lane2/93ep.txt seq_lane3/93ep.txt > sequences/93ep.txt
  rename .txt _s_sequence.txt *.txt
  gzip *.txt
  </help>
</tool>
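The naming scheme the script uses for the extra outputs (`primary_`, the ID of the first output, the sample name with underscores replaced, then `_visible_fastq`) can be isolated in a small helper. This is only a sketch of the pattern as it appears in the script above; `galaxy_output_name` and its `ext` parameter are hypothetical names introduced for illustration.

```python
import os


def galaxy_output_name(new_file_path, output_id, sample_file, ext="fastq"):
    """Return the path for one extra output, following the script's naming."""
    name = os.path.splitext(os.path.basename(sample_file))[0]
    # Underscores separate the fields of the file name, so the script
    # replaces any underscore inside the sample name with a hyphen.
    label = name.replace("_", "-")
    filename = "primary_%s_%s_visible_%s" % (output_id, label, ext)
    return os.path.join(new_file_path, filename)


# Example with made-up values for the output ID and sample file.
print(galaxy_output_name("new_files", "42", "sample_A.txt"))
```

Keeping the name construction in one place makes it easier to build the list of target paths only for the files that deBarcoding actually wrote.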
Re: [galaxy-dev] Filter data and cut column bugs
On Mon, May 23, 2011 at 3:34 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
On Mon, May 23, 2011 at 3:09 PM, Anton Nekrutenko an...@bx.psu.edu wrote:
Dear Peter: Yes, that would help. One possibility would be to have all new bugs CC'd to the dev mailing list? Not sure if everyone here would like that or not... But, your patches are definitely not unnoticed. We'll apply them (likely at the conference) ...

Kanwei has just applied the fix for issues 535 and 537, thanks!

https://bitbucket.org/galaxy/galaxy-central/issue/535/
Filter data on any column tool complains about hash comment lines

https://bitbucket.org/galaxy/galaxy-central/issue/537/
Filter data on any column tool casts unused columns

That leaves these two pending,

https://bitbucket.org/galaxy/galaxy-central/issue/534/
Cut column tool messes up # header lines

https://bitbucket.org/galaxy/galaxy-central/issue/536/
Filter data on any column tools shows nonsense % vs total on stdout

While I'm looking over minor bugs where I submitted a patch, the following should also be fairly quick to review:

FASTQ to Tabular tool doesn't accept plain FASTQ
https://bitbucket.org/galaxy/galaxy-central/issue/436/

Conditional does not work with scripts in different path
https://bitbucket.org/galaxy/galaxy-central/issue/159/

Regards,
Peter