This is an automated email from the git hooks/post-receive script. afif pushed a commit to branch master in repository sprai.
commit d0299168b86587798ba94094005a51fc5cef1c09 Author: Afif Elghraoui <[email protected]> Date: Wed Jun 1 21:28:29 2016 -0700 Imported Upstream version 0.9.9.16+dfsg --- doc/_build/html/_sources/Download.txt | 13 +- doc/_build/html/_sources/README.txt | 102 +++- doc/_build/html/_sources/index.txt | 2 + ec.spec | 22 +- ezez4makefile_v4.pl | 864 ++++++++++++++++++++++++++++++++++ ezez4qsub_vx1.pl | 16 +- ezez_vx1.pl | 14 +- makefile | 4 +- nss2v_v3.c | 481 ++++++++++--------- wscript | 6 +- 10 files changed, 1247 insertions(+), 277 deletions(-) diff --git a/doc/_build/html/_sources/Download.txt b/doc/_build/html/_sources/Download.txt index 3edb882..b658f02 100644 --- a/doc/_build/html/_sources/Download.txt +++ b/doc/_build/html/_sources/Download.txt @@ -1,6 +1,11 @@ -======== -Download -======== +=========================== +Download The Latest Version +=========================== +http://zombie.cb.k.u-tokyo.ac.jp/sprai/dist/sprai-0.9.9.15.tar.gz + +============== +Older releases +============== http://zombie.cb.k.u-tokyo.ac.jp/sprai/dist/sprai-0.9.9.14.tar.gz http://zombie.cb.k.u-tokyo.ac.jp/sprai/dist/sprai-0.9.9.13.tar.gz @@ -40,3 +45,5 @@ http://zombie.cb.k.u-tokyo.ac.jp/sprai/dist/sprai-0.9.5.1.6.tar.gz .. http://zombie.cb.k.u-tokyo.ac.jp/sprai/dist/sprai-0.9.5.1.tar.gz .. https://bitbucket.org/yun2/sprai_doc/downloads/sprai-0.9.1.3.tar.gz + +http://zombie.cb.k.u-tokyo.ac.jp/sprai/dist/sprai-0.2.2.3.tar.bz2 diff --git a/doc/_build/html/_sources/README.txt b/doc/_build/html/_sources/README.txt index 05e9e18..65b3500 100644 --- a/doc/_build/html/_sources/README.txt +++ b/doc/_build/html/_sources/README.txt @@ -162,8 +162,8 @@ After the first assembly, you can calculate the depth distribution of reads to e Modify other parameters in ec.spec as well, following instructions in the file. However, the result is not so sensitive to this parameter in our experience. 
-Single Node Mode ------------------ +Single Node Mode (1) +--------------------- Sprai has several execution modes. The first mode we describe is single node mode, with which we can use only a single machine. If you have more than one server, please see the next section. @@ -219,15 +219,52 @@ After error correction, if you want to assemble corrected reads using Celera Ass -ca_path /path/to/your/wgs/Linux-amd64/bin \ -sprai_path the path to get_top_20x_fa.pl installed +Single Node Mode (make mode) +----------------------------- + +This mode can detect unfinished jobs and restart at the appropriate stage using GNU make. + +:: + + ezez4makefile_v4.pl ec.spec pbasm.spec + +After this command, 'Makefile' will be created. +Then, + +:: + + make -j <number_of_processes> -f Makefile > log 2>&1 & + +If you want only error correction, do as follows + +:: + + ezez4makefile_v4.pl ec.spec + make -j <number_of_processes> -f Makefile ec_only > log 2>&1 & + +The file result/c_l${libi}.1.fin.v$valid_voters.idfq.gz (e.g., c_l0.1.fin.v30.idfq.gz) is the corrected read file. + +After error correction, if you want to assemble the corrected reads, do as follows + +:: + + ezez4makefile_v4.pl ec.spec pbasm.spec + make -j <number_of_processes> -f Makefile > log 2>&1 & + +Error correction will be skipped and only assembling will start. + Multi-node mode 1 (qsub mode) ------------------------------ There are two types of execution modes with Sprai. The first one is qsub mode; a single master process throws child jobs by qsub. -This mode runs faster and more reliablely than the second mode. However, there is a drawbacks. -The biggest problem might be that there is no way of restarting the process once a child process fails. -Anyway, this mode is the most extensively tested, so you should use this mode if your target genome is small enough to be processed with a small number of nodes and thus with little chance of failure. + +.. This mode runs faster and more reliablely than the second mode. 
However, there is a drawbacks. +.. The biggest problem might be that there is no way of restarting the process once a child process fails. +.. Anyway, this mode is the most extensively tested, so you should use this mode if your target genome is small enough to be processed with a small number of nodes and thus with little chance of failure. + Currently, Sprai supports Sun Grid Engine (SGE) or equivalents (e.g., N1GE, UGE). To correct sequencing errors of PacBio Continuous Long Reads and also would like to assemble them, specify *blast_path* and *sprai_path* in ec.spec, and do as follows + :: ezez4qsub_vx1.pl ec.spec pbasm.spec > log 2>&1 & @@ -260,32 +297,55 @@ or .. ezez4makefile.pl ec.spec asm.spec > ezez4makefile.log 2>&1 && make ec_only & -Multi-node mode 2 (TGEW mode) +The file data_yyyymmdd_hhmmss/c01.fin.idfq.gz is the corrected read file. + + +If some child processes fail, do as follows + +:: + + ezez4qsub_vx1.pl ec.spec pbasm.spec -now yyyymmdd_hhmmss + +The yyyymmdd_hhmmss is the suffix of your data directory ezez4qsub_vx1.pl made. +This command will detect unfinished jobs and restart at the appropriate stage. + +Multi-node mode 2 (make mode) ------------------------------ -The second mode works with `TGEW <https://github.com/mkasa/TGEW>`_, which is a wrapper script of qsub. -tge_make in the TGEW package interprets Makefile and submits jobs by qsub. +This mode can detect unfinished jobs and restart at the appropriate stage using GNU make. +Currently, Sprai supports Sun Grid Engine (SGE) or equivalents (e.g., N1GE, UGE). + :: - ezez4makefile_v3.pl ec.spec pbasm.spec > log 2>&1 - tge_make -bg > tge_make.log 2>&1 & + ezez4makefile_v4.pl ec.spec pbasm.spec -submit + +The command above will create Makefile, job files and log files and submit job files using qsub. + + +.. The second mode works with `TGEW <https://github.com/mkasa/TGEW>`_, which is a wrapper script of qsub. +.. tge_make in the TGEW package interprets Makefile and submits jobs by qsub. + +.. 
:: + +.. ezez4makefile_v3.pl ec.spec pbasm.spec > log 2>&1 +.. tge_make -bg > tge_make.log 2>&1 & -ezez4makefile_v3.pl creates Makefile, and tge_make processes it. -In the case of failure, you can just reexecute tge_make to restart. As make utility, tge_make compares the timestamps of files to see if any updates are needed. -You can type the following command to see what would be reexecuted:: +.. ezez4makefile_v3.pl creates Makefile, and tge_make processes it. +.. In the case of failure, you can just reexecute tge_make to restart. As make utility, tge_make compares the timestamps of files to see if any updates are needed. +.. You can type the following command to see what would be reexecuted:: - tge_make -n +.. tge_make -n -Since this mode submits a significant number of jobs at once to SGE, you may have to limit the number of partitions for not to exceed the job number limit. -You might add a make target to limit the number of jobs being submitted simultaneously to SGE. -For example, if you want only error-correction, you can specify ec_only target:: +.. Since this mode submits a significant number of jobs at once to SGE, you may have to limit the number of partitions for not to exceed the job number limit. +.. You might add a make target to limit the number of jobs being submitted simultaneously to SGE. +.. For example, if you want only error-correction, you can specify ec_only target:: - tge_make -bg ec_only +.. tge_make -bg ec_only -tge_make actually calls GNU make to analyse dependencies between files, so you can give any valid target for GNU make. +.. tge_make actually calls GNU make to analyse dependencies between files, so you can give any valid target for GNU make. -Before Sprai ver 0.9.6.1.4, Multi-node mode 2 considers only 'pre_partition'. -Since Sprai ver 0.9.6.1.4, the number of jobs submitted to SGE became 'partition' * 'pre_partition' in Multi-node mode 2. +.. Before Sprai ver 0.9.6.1.4, Multi-node mode 2 considers only 'pre_partition'. +.. 
Since Sprai ver 0.9.6.1.4, the number of jobs submitted to SGE became 'partition' * 'pre_partition' in Multi-node mode 2. Postprocessing =============== diff --git a/doc/_build/html/_sources/index.txt b/doc/_build/html/_sources/index.txt index b729ca9..317a77a 100644 --- a/doc/_build/html/_sources/index.txt +++ b/doc/_build/html/_sources/index.txt @@ -20,6 +20,8 @@ Contents Changelogs ============= +2016.5.31 v0.9.9.15: ezez4makefile_v4.pl was added. You can use a Makefile when executing Sprai. (Thanks to Tomoaki Nishiyama for code modifications) + 2016.4.15: v0.9.9.14: -ec_only mode can be run without a spec file of Celera Assembler. (Thanks to Afif Elghraoui for a report) 2016.4.13: v0.9.9.13: myrealigner.c: dynamic memory allocation for element & col pools. (Thanks to Tomoaki Nishiyama for code modifications) diff --git a/ec.spec b/ec.spec index 104bb7e..df49b5d 100644 --- a/ec.spec +++ b/ec.spec @@ -24,18 +24,20 @@ partition 12 # blast_path: where blastn and makeblastdb exist in blast_path /home/imai/bin/ # sprai_path: where binaries of sprai (bfmt72s, nss2v_v3 and so on) exist in -sprai_path /home/imai/sprai/bin/ +sprai_path /home/imai/test/sprai/bin/ #### many node mode (advanced) #### -#sge: options for all the SGE jobs +#sge: options for all the SGE jobs (used by ezez4qsub_vx1.pl) #sge -soft -l ljob,lmem,sjob -#queue_req: additional options for all the SGE jobs +#queue_req: additional options for all the SGE jobs (used by ezez4qsub_vx1.pl and ezez4makefile_v4.pl) #queue_req -l s_vmem=4G -l mem_req=4 -#longestXx_queue_req: if valid, displaces queue_req +#longestXx_queue_req: if valid, displaces queue_req (used by ezez4qsub_vx1.pl) #longestXx_queue_req -l s_vmem=64G -l mem_req=64 -#BLAST_RREQ: additional options for SGE jobs of all vs. all alignment +#BLAST_RREQ: additional options for SGE jobs of all vs. 
all alignment (used by ezez4qsub.pl and ezez4makefile_v4.pl) #BLAST_RREQ -pe def_slot 4 +#ec_rreq: options for error correction (used by ezez4makefile_v4.pl) +#ec_rreq -l s_vmem=4G -l mem_req=4 #### common (advanced) #### @@ -50,3 +52,13 @@ max_target_seqs 100 #trim: both ends of each alignment by blastn will be trimmed 'trim' bases to detect chimeric reads trim 42 +# direct_vote & copy_blastdb are used by ezez4makefile_v4.pl +direct_vote 0 +# skip writing the blast results once to disk before selecting +# voters (default 1), or write to the disk to allow multiple use +# of the blast results for different number of voters (0) +copy_blastdb 1 +# copy the blastdb to $TMP, presumably set by gridengine to local node, +# and use it during execution. This could reduce NFS access per job, but may +# increase the total transfer if each node is running multiple jobs. + diff --git a/ezez4makefile_v4.pl b/ezez4makefile_v4.pl new file mode 100755 index 0000000..bc83b9c --- /dev/null +++ b/ezez4makefile_v4.pl @@ -0,0 +1,864 @@ +#!/usr/bin/perl +use strict; +use warnings; +use Getopt::Long; +use File::Path qw(make_path); + +# Environmental variables (deprecated): +# MAKEDB_RREQ options for qsub in creating databases (blast) +# BLAST_RREQ options for qsub in searching databases (blast) +# BLAST_OPT options for blast in searching databases +# ASSEMBLY_RREQ options for qsub in assembly (celera) + +my $valid_depth=4; +my $valid_read_length=500; + +my $now = `date +%Y%m%d_%H%M%S`; +chomp $now; +my $submit=0; +my $jobs=0; +my $opt_help; + +GetOptions( + "now=s"=>\$now, # string + "jobs" => \$jobs, + "submit" => \$submit, + "h" => \$opt_help +); + +my %params; + +my @msgs = ( + "USAGE: <this> <ec.spec> <asm.spec> or <this> <ec.spec>", + "[-jobs: create job files for a grid engine]", + "[-submit: actually submit the job to a grid engine]", + "[-now string: a suffix of some directories]", + "[-h: show this message]", +); + +if(@ARGV == 0 || @ARGV >= 3 || $opt_help){ + my $msg = join 
"\n\t", @msgs; + die "$msg\n"; + #die "USAGE: <this> <ec.spec> <asm.spec>\n\t[-jobs: create job files for a grid engine]\n\t[-submit: actually submit the job to a grid engine]\n\t[-now string: a suffix of some directories]\n"; +} + +my $spec = "/dev/null"; +if(defined($ARGV[1])){ + $spec = $ARGV[1]; +} + +{ + my $ec_spec = $ARGV[0]; + open my $fh,"<",$ec_spec or die "cannot open $ec_spec :$!\n"; + while($_ = <$fh>){ + next if($_ =~ /^\s+$/); + next if($_ =~ /^\s*#/); + chomp; + my @line = split /\s+/,$_; + for(my $i=0; $i<@line; ++$i){ + if($line[$i] =~ /^\s*#/){ + @line = @line[0..$i-1]; + last; + } + } + $params{$line[0]}=join(" ",@line[1..@line-1]); + } + close $fh; +} + +my @input_libs; +my $nlibs; +my $pre_partition=1; +my $partition=12; +my $word_size=""; +my $evalue=1e-50; +my $num_threads=1; +my $max_target_seqs=100; +my $valid_voters=11; +my $max_read_length=128000; +my @voters=(); +my $direct_vote = 1; +my $copy_blastdb = 0; +my $trim=42; +my $estimated_genome_size=0; +my $estimated_depth = 0; +my @gsizes=(); +my $ca_path="/opt/wgs-8.1/Linux-amd64/bin/"; +my $blast_path="/usr/local/bin/"; +my $sprai_path="/usr/local/bin/"; +my $param_makedb_rreq = ""; # not so much resource +my $param_blast_rreq = ""; # desired to have a high CPU slot for latency optimization +my $param_ec_rreq = ""; # not so much resource +my $param_blast_opt = ""; +my $param_gettop_rreq = ""; # requires relatively high memory +my $param_assembly_rreq = ""; +my $filter_same_lib = "| cat"; +my $min_len_for_query=1; +my $max_len_for_query=1000000000000000; + +my $command=""; + +if(defined($params{input_for_database})){ + # new multi-lib mode + # use complementary set of data for each library to avoid correction of + # chimeric data as authentic data + my $libs; + $libs = $params{input_for_database}; + @input_libs = split(/\s+/, $libs); + $nlibs = @input_libs; + printf STDERR "number of libraries: $nlibs\n"; + printf STDERR "libs: @input_libs\n"; +}else{ + die "specify input_fastq in 
ec.spec\n"; +} + +if(defined($params{estimated_genome_size})){ + $estimated_genome_size = $params{estimated_genome_size}; + @gsizes = split(/\s+/, $estimated_genome_size); +} +else{ + die "specify estimated_genome_size in ec.spec\n"; +} +if(defined($params{estimated_depth})){ + $estimated_depth=$params{estimated_depth}; +} + +foreach my $eg (@gsizes){ + if($eg <= 0){die "estimated_genome_size must be > 0\n";} +} + +if(defined($params{pre_partition})){ + $pre_partition = $params{pre_partition}; +} +if(defined($params{partition})){ + $partition = $params{partition}; +} +if(defined($params{word_size})){ + $word_size = $params{word_size}; +} +if(defined($params{evalue})){ + $evalue = $params{evalue}; +} +if(defined($params{num_threads})){ + $num_threads = $params{num_threads}; +} +if(defined($params{max_target_seqs})){ + $max_target_seqs = $params{max_target_seqs}; +} +if(defined($params{valid_voters})){ + $valid_voters = $params{valid_voters}; + @voters = split(/\s+/, $valid_voters); + undef($valid_voters); +} +else{ + if($estimated_depth == 0) { + my $n_base = 0; + foreach my $input_fastq(@input_libs){ + $n_base += -s $input_fastq; + } + $n_base /= 2; + $estimated_depth=$n_base/$estimated_genome_size; + } + $valid_voters = int(0.8*$estimated_depth); + $valid_voters = ($valid_voters < 11) ? 11 : $valid_voters; + $valid_voters = ($valid_voters > 30) ? 
30 : $valid_voters; + @voters = ($valid_voters); + undef($valid_voters); +} +if(defined($params{direct_vote})){ + $direct_vote = $params{direct_vote}; +} +if(defined($params{copy_blastdb})){ + $copy_blastdb = $params{copy_blastdb}; +} +if(defined($params{filter_same_lib})){ + if($params{filter_same_lib} =~ /t/i){ + $filter_same_lib = "| perl -e 'while(<>){\@a=split;next if(\$\$a[0] ne \$\$a[3] && \$\$a[0]=~s/_.*//&& \$\$a[3]=~s/_.*//&& \$\$a[0] eq \$\$a[3]);print;}'" + } +} +if(defined($params{max_read_length})){ + $max_read_length = $params{max_read_length}; +} +if(defined($params{trim})){ + $trim = $params{trim}; +} +if(defined($params{ca_path})){ + $ca_path = $params{ca_path}; +} +if(defined($params{blast_path})){ + $blast_path = $params{blast_path}; +} +if(defined($params{sprai_path})){ + $sprai_path = $params{sprai_path}; +} +if(exists $ENV{'MAKEDB_RREQ'}) { + $param_makedb_rreq = $ENV{'MAKEDB_RREQ'}; + print STDERR "WARNING: environmental variable MAKEDB_RREQ is deprecated. Please set queue_req in ec.spec file.\n"; +} +if(defined($params{queue_req})) { + $param_assembly_rreq = $params{queue_req}; + $param_blast_rreq = $params{queue_req}; + $param_makedb_rreq = $params{queue_req}; + $param_ec_rreq = $params{queue_req}; +} +if(defined($params{BLAST_RREQ})) { + $param_blast_rreq = $params{BLAST_RREQ}; +} +if(exists $ENV{'BLAST_OPT'}) { + $param_blast_opt = $ENV{'BLAST_OPT'}; + print STDERR "WARNING: environmental variable BLAST_OPT is deprecated. Please set BLAST_OPT in ec.spec file.\n"; +} +if(defined($params{blast_opt})) { + $param_blast_opt = $params{blast_opt}; +} +if($param_blast_opt =~ /num_threads/) { + print STDERR "ERROR: you cannot specify -num_threads in blast_opt. Please use num_threads in ec.spec file.\n"; + exit 2; +} +if($param_blast_opt =~ /word_size/) { + print STDERR "ERROR: you cannot specify -word_size in blast_opt. 
Please use word_size in ec.spec file.\n"; + exit 2; +} +if(defined($params{makedb_rreq})) { + $param_makedb_rreq = $params{makedb_rreq}; +} +if(defined($params{ec_rreq})) { + $param_ec_rreq = $params{ec_rreq}; +} +if(defined($params{gettop_rreq})) { + $param_gettop_rreq = $params{gettop_rreq}; +}else{ + $param_gettop_rreq = $param_makedb_rreq; +} +if(exists $ENV{'ASSEMBLY_RREQ'}) { + $param_assembly_rreq = $ENV{'ASSEMBLY_RREQ'}; + print STDERR "WARNING: environmental variable ASSEMBLY_RREQ is deprecated. Please set assembly_rreq in ec.spec file.\n"; +} +if(defined($params{assembly_rreq})) { + $param_assembly_rreq = $params{assembly_rreq}; +} +if(defined($params{min_len_for_query})){ + $min_len_for_query = $params{min_len_for_query}; +} +if(defined($params{max_len_for_query})){ + $max_len_for_query = $params{max_len_for_query}; +} + +printf STDERR ("-- params --\n"); +printf STDERR ("input_fastq %s\n", join(" ", @input_libs)); +printf STDERR ("estimated_genome_size %s\n",join(" ", @gsizes)); +printf STDERR ("pre_partition %s\n",$pre_partition); +printf STDERR ("partition %s (total %d)\n",$partition, $pre_partition * $partition); +$partition *= $pre_partition; +printf STDERR ("word_size %d\n",$word_size); +printf STDERR ("evalue %g\n",$evalue); +printf STDERR ("num_threads %d\n",$num_threads); +if($min_len_for_query){ + printf STDERR ("min_len_for_query %d\n",$min_len_for_query); +} +if($max_len_for_query){ + printf STDERR ("max_len_for_query %d\n",$max_len_for_query); +} +if($max_target_seqs){ + printf STDERR ("max_target_seqs %d\n",$max_target_seqs); +} +printf STDERR ("valid_voters %s\n",join(" ", @voters)); +printf STDERR ("trim %d\n",$trim); +printf STDERR ("ca_path %s\n",$ca_path); +printf STDERR ("blast_path %s\n",$blast_path); +printf STDERR ("sprai_path %s\n",$sprai_path); +printf STDERR ("BLAST_RREQ %s\n", $param_blast_rreq); +printf STDERR ("blast_opt %s\n", $param_blast_opt); +printf STDERR ("makedb_rreq %s\n", $param_makedb_rreq); +printf STDERR 
("gettop_rreq %s\n", $param_gettop_rreq); +printf STDERR ("assembly_rreq %s\n", $param_assembly_rreq); +printf STDERR ("ec_rreq %s\n", $param_ec_rreq); +printf STDERR ("direct_vote %d\n", $direct_vote); +printf STDERR ("copy_blastdb %d\n", $copy_blastdb); +printf STDERR ("-- params --\n"); + +my $log="./ezez4makefile.log"; + +if(length($blast_path) > 0 && $blast_path !~ /\/$/){ + $blast_path .= "/"; +} +if(length($sprai_path) > 0 && $sprai_path !~ /\/$/){ + $sprai_path .= "/"; +} + + +my $date=""; +my $message=""; + +my $bindir=$sprai_path; +my $path2blast=$blast_path; + +if(!-d $bindir){ + die "$bindir does not exist\n" +} +if(!-d $path2blast){ + die "$path2blast does not exist\n" +} + +my $pwd = `pwd`; +chomp $pwd; + +my $outdir = "result"; +if(!-d $outdir){ + mkdir "$outdir" or die "cannot mkdir $outdir: $!\n"; +} + +my @allscripts=(); +my $script; +my @alloutputfiles=(); +my @allerrfiles=(); +my $errfile; + +my $makefile = "$pwd/Makefile"; +open my $mh, ">", $makefile or die "cannot open $makefile: $!\n"; +my $prefix="c01.fin"; +my $rreq = ""; +if(defined $param_assembly_rreq) { + $rreq = "SGE_RREQ=\"$param_assembly_rreq\" "; +} +my @fintargets = (); +foreach my $egsize (@gsizes){ + foreach my $vv (@voters){ + my $ps = "$prefix.v$vv.20x$egsize"; + my $target = "$outdir/$ps/9-terminator/$ps.scf.fasta"; + push(@fintargets, $target); + } +} +printf $mh "all: %s\n\n", join(" ", @fintargets); +foreach my $egsize (@gsizes){ + foreach my $vv (@voters){ + my $ps = "$prefix.v$vv.20x$egsize"; + my $target = "$outdir/$ps/9-terminator/$ps.scf.fasta"; + printf $mh "$outdir/$ps/9-terminator/$ps.scf.fasta: $outdir/$ps.frg\n"; + printf $mh "\t${rreq}$ca_path/runCA -dir $outdir/`basename \$< .frg` -p `basename \$< .frg` -s $spec \$< \n"; + } +} + + +my $libi; +my @current_dep; +my $tmp_dfq; +my $tmp_blastdb = sprintf("tmp_blastdb"); +if(!-d "$outdir/$tmp_blastdb"){ + mkdir "$outdir/$tmp_blastdb" or die "cannot mkdir $outdir/$tmp_blastdb: $!\n"; +} # this is not per library 
directory; thus outside the loop; +my $aggr_db ="$outdir/$tmp_blastdb/all_libs"; +my $nalf ="$aggr_db.nal"; +for($libi = 0; $libi < $nlibs; $libi++){ +# $libi is a safe string that represents an integer; safe to embed directly in a string + my $iprefix = "$outdir/c_l$libi"; + my $idfqgz = "$iprefix.idfq.gz"; + my $qidfqgz = "$iprefix.q.idfq.gz"; + # longer scope required + { + # construct a idfq file from a fastq file + + my $PG= "$bindir/fq2idfq.pl"; + my $PG0="$bindir/fa2fq.pl"; + my $PG2="$bindir/fqfilt.pl"; + my $PG3="$bindir/dumbbell_filter.pl"; + my $command = "STMP=`mktemp -d --tmpdir sprai-XXXXXX` ; set -o pipefail; $PG0 \$\< | $PG3 - | $PG - --prefix $iprefix | gzip -c -1 > \$\$STMP/c_l${libi}.idfq.gz && cp -p \$\$STMP/c_l${libi}.idfq.gz \$@ && openssl dgst \$\$STMP/c_l${libi}.idfq.gz \$@; status=\$\$?; rm -fr \$\$STMP;exit \$\$status"; + + $rreq = ""; + if(defined $param_makedb_rreq) { + $rreq = "SGE_RREQ=\"$param_makedb_rreq\" "; + } + printf $mh "$idfqgz: %s\n", $input_libs[$libi]; + printf $mh "\t${rreq}$command\n\n"; + + if($min_len_for_query > 0){ + $command = "STMP=`mktemp -d --tmpdir sprai-XXXXXX`; set -o pipefail; gzip -d -c \$\< | $PG2 - $min_len_for_query -max_len $max_len_for_query | gzip -c -1 > \$\$STMP/c_l${libi}.q.idfq.gz && cp -p \$\$STMP/c_l${libi}.q.idfq.gz \$@ && openssl dgst \$\$STMP/c_l${libi}.q.idfq.gz \$@; status=\$\$?; rm -fr \$\$STMP; exit \$\$status"; + printf $mh "$qidfqgz: %s\n", $idfqgz; + printf $mh "\t${rreq}$command\n\n"; + }else{ + $qidfqgz=$idfqgz; + } + } + + my $tmp_idfq; + my $dfq2fa = "$bindir/dfq2fq_v2.pl -f -"; + my $addlibn = "sed -e 's/^>/>l${libi}_s/'"; + #construct blast database from idfq file + { + $rreq = ""; + if(defined $param_makedb_rreq) { + $rreq = "SGE_RREQ=\"$param_makedb_rreq\" "; + } + my $PG = "$path2blast/makeblastdb"; + $command = "gzip -d -c $idfqgz | $dfq2fa | $addlibn | $PG -in - -dbtype nucl -out $outdir/$tmp_blastdb/c_l$libi -title c_l$libi "; + + my $makeblastdone = 
"$outdir/c${libi}_imakeblastdb.done"; + + printf $mh "$makeblastdone: $idfqgz\n"; + printf $mh "\t${rreq}$command\n"; + printf $mh "\ttouch \$\@\n\n"; + } + + if($libi==0){ # %.nal depends on makeblastdb.done of other libs +# construct aggregate complementary blast database +# don't need real data. just write to .nal file +# write a file with the two lines stating TITLE and DBLIST +#TITLE libs_excludinglib1 +#DBLIST lib2 lib3 + my @clibs=(); + my @current_dep=(); + my $k; + for($k=0; $k < $nlibs; $k++){ + push(@clibs, "c_l$k"); + push(@current_dep, "$outdir/c${k}_imakeblastdb.done"); + } + printf $mh "$nalf: %s\n", join(" ", @current_dep); + printf $mh "\techo TITLE all_libs > \$\@\n"; + printf $mh "\techo DBLIST %s >> \$\@\n\n", join(" ", @clibs); + } + + my $tmp_fa; + { + $rreq = ""; + if(defined $param_makedb_rreq) { + $rreq = "SGE_RREQ=\"$param_makedb_rreq\" "; + } + $tmp_fa = "tmp_fa_l$libi"; + if(!-d "$outdir/$tmp_fa"){ + mkdir "$outdir/$tmp_fa" or die "cannot mkdir $outdir/$tmp_fa: $!\n"; + } + my $PG2 = "$bindir/partition_fa.pl"; + $command = "${rreq}gzip -d -c $qidfqgz | $dfq2fa | $addlibn | $PG2 - $partition -p $outdir/$tmp_fa/c_l$libi"; + + my $ipartdone = "$outdir/c_l${libi}_partition_fa.done"; + + printf $mh "$ipartdone: $qidfqgz\n"; + printf $mh "\t$command\n"; + printf $mh "\ttouch \$\@\n\n"; + } + + { # main error correction job + $rreq = ""; + if(defined $param_blast_rreq && $param_blast_rreq ne '') { + $rreq = "SGE_RREQ=\"$param_blast_rreq\" "; + } + my $tmp_evalue; + $tmp_evalue = $evalue; + $tmp_dfq = "tmp_dfq_l$libi"; + if(!-d "$outdir/$tmp_dfq"){ + mkdir "$outdir/$tmp_dfq" or die "cannot mkdir $outdir/$tmp_dfq: $!\n"; + } + my $BLAST_CMD = "$path2blast/blastn -dbsize 1 -num_threads $num_threads"; + if($word_size){ + $BLAST_CMD .= " -word_size $word_size"; + } + if($max_target_seqs){ + $BLAST_CMD .= " -max_target_seqs $max_target_seqs"; + } + if(defined $param_blast_opt) { + $BLAST_CMD .= " $param_blast_opt"; + } + my $outfmt = "7 qseqid sstart 
send sacc qstart qend bitscore evalue pident qseq sseq"; + $BLAST_CMD .= " -evalue $tmp_evalue -outfmt '$outfmt'"; + my $PG2=sprintf("$bindir/bfmt72s -c $trim -u -i",$libi); + if($copy_blastdb){ # copy the blastdb per each job to reduce the number of NFS access; this might increase transer if many jobs are run on the same node + $BLAST_CMD .= " -db \$\$STMP/$aggr_db"; + }else{ + $BLAST_CMD .= " -db $aggr_db"; + } + if($direct_vote){ + foreach my $valid_voters(@voters){ + my $PG3="$bindir/nss2v_v3 -r $max_read_length -v $valid_voters"; + my $PG4="$bindir/myrealigner -f -B $valid_voters -b 3 -d 0.5 -l $max_read_length"; + if(!-d "$outdir/$tmp_dfq.v$valid_voters"){ + mkdir "$outdir/$tmp_dfq.v$valid_voters" or die "cannot mkdir $outdir/$tmp_dfq.v$valid_voters: $!\n"; + } + for(my $i=0; $i < $partition; ++$i){ + my $inputfile = sprintf("$outdir/$tmp_fa/c_l${libi}_%04d.fa", $i); + printf $mh "$outdir/$tmp_dfq.v$valid_voters/c_l${libi}_%04d.v$valid_voters.dfq.gz: $outdir/c_l${libi}_partition_fa.done $nalf\n", $i; + printf $mh "\t${rreq} STMP=`mktemp -d --tmpdir sprai-XXXXXX`; set -o pipefail; mkdir -p \$\$STMP/$outdir $outdir/$tmp_dfq.v$valid_voters &&"; + if($copy_blastdb){ printf $mh " cp -rp $outdir/$tmp_blastdb \$\$STMP/$outdir &&";} + printf $mh " cp -p $inputfile \$\$STMP/input && openssl dgst $inputfile \$\$STMP/input &&"; + printf $mh " $BLAST_CMD -query \$\$STMP/input $filter_same_lib | $PG2 - | $PG3 - | $PG4 - | gzip -c -1 > \$\$STMP/output.tmp && cp -p \$\$STMP/output.tmp \$\@ ; status=\$\$?; rm -fr \$\$STMP; exit \$\$status\n\n"; + } + } + }else{ # not direct_vote + for(my $i=0; $i < $partition; ++$i){ + my $outputfile = sprintf("$outdir/$tmp_dfq/c_l${libi}_%04d.blast.sam.gz",$i); + my $inputfile = sprintf("$outdir/$tmp_fa/c_l${libi}_%04d.fa", $i); + printf $mh "$outputfile : $outdir/c_l${libi}_partition_fa.done $nalf\n"; + printf $mh "\t${rreq}"; + printf $mh " STMP=`mktemp -d --tmpdir sprai-XXXXXX`; set -o pipefail; mkdir -p \$\$STMP/$outdir &&"; + 
if($copy_blastdb){ printf $mh " cp -rp $outdir/$tmp_blastdb \$\$STMP/$outdir &&";} + printf $mh " cp -p $inputfile \$\$STMP/input && openssl dgst $inputfile \$\$STMP/input &&"; + printf $mh " $BLAST_CMD -query \$\$STMP/input $filter_same_lib | $PG2 - | gzip -c -1 > \$\$STMP/output.tmp && cp -p \$\$STMP/output.tmp \$\@ ; status=\$\$?; rm -fr \$\$STMP; exit \$\$status\n\n", $i; + } + + $rreq = ""; # smaller resource request + if(defined $param_makedb_rreq) { + $rreq = "SGE_RREQ=\"$param_makedb_rreq\" "; + } + foreach my $valid_voters(@voters){ + my $PG3="$bindir/nss2v_v3 -r $max_read_length -v $valid_voters"; + my $PG4="$bindir/myrealigner -f -B $valid_voters -b 3 -d 0.5"; + if(!-d "$outdir/$tmp_dfq.v$valid_voters"){ + mkdir "$outdir/$tmp_dfq.v$valid_voters" or die "cannot mkdir $outdir/$tmp_dfq.v$valid_voters: $!\n"; + } + # instead of using pattern match write every pattern for tge_make + for(my $i=0; $i < $partition; ++$i){ + printf $mh "$outdir/$tmp_dfq.v$valid_voters/c_l${libi}_%04d.v$valid_voters.dfq.gz: $outdir/$tmp_dfq/c_l${libi}_%04d.blast.sam.gz\n", $i, $i; + printf $mh "\t${rreq}gzip -dc \$< | $PG3 - | $PG4 - | gzip -c -1 > \$\@.tmp && mv \$\@.tmp \$\@\n\n"; + } + } + } + } + + { + $rreq = ""; + if(defined $param_makedb_rreq) { + $rreq = "SGE_RREQ=\"$param_makedb_rreq\" "; + } + foreach my $valid_voters(@voters){ + $tmp_idfq = "tmp_idfq_l$libi"; + if(!-d "$outdir/$tmp_idfq.v$valid_voters"){ + mkdir "$outdir/$tmp_idfq.v$valid_voters" or die "cannot mkdir $outdir/$tmp_idfq: $!\n"; + } + for(my $i=0; $i < $partition; ++$i){ + printf $mh "$outdir/$tmp_idfq.v$valid_voters/c_l${libi}.refined_%04d.v$valid_voters.idfq.gz: $outdir/$tmp_dfq.v$valid_voters/c_l${libi}_%04d.v$valid_voters.dfq.gz\n", $i, $i; + printf $mh "\t${rreq}gzip -d -c \$< | $bindir/dfq2fq_v2.pl - | gzip -c -1 > \$\@.tmp && mv \$\@.tmp \$\@\n\n"; + } + } + } + + foreach my $valid_voters(@voters){ + my $outputfile = "$outdir/c_l${libi}.1.v$valid_voters.idfq.gz"; + my @part_refined_dfqgz = (); # 
initialize each time + for(my $i=0; $i < $partition; ++$i){ + my $pidfqfiles = sprintf("$outdir/$tmp_idfq.v$valid_voters/c_l${libi}.refined_%04d.v$valid_voters.idfq.gz", $i); + push(@part_refined_dfqgz, $pidfqfiles); + } + my $files=join(" ", @part_refined_dfqgz); + + printf $mh "$outputfile : $files\n"; + printf $mh "\tcat \$^ > \$\@.tmp && mv \$\@.tmp \$\@\n\n"; + } + + { + $rreq = ""; + if(defined $param_makedb_rreq) { + $rreq = "SGE_RREQ=\"$param_makedb_rreq\" "; + } + $tmp_dfq = "tmp_dfq_l$libi"; + foreach my $valid_voters(@voters){ + my $tmp_finish = "tmp_finish_l$libi.v$valid_voters"; + for(my $i=0; $i < $partition; ++$i){ + if(!-d "$outdir/$tmp_finish"){ + mkdir "$outdir/$tmp_finish" or die "cannot mkdir $outdir/$tmp_finish: $!\n"; + } + printf $mh "$outdir/$tmp_finish/c_l${libi}.refined_%04d.fin.v$valid_voters.idfq.gz: $outdir/$tmp_dfq.v$valid_voters/c_l${libi}_%04d.v$valid_voters.dfq.gz\n", $i, $i; + printf $mh "\t${rreq}gzip -d -c \$< | $bindir/dfq2fq_v2.pl - -finish -valid_depth $valid_depth -valid_read_length $valid_read_length | gzip -c -1 > \$\@.tmp && mv \$\@.tmp \$\@\n\n"; + } + } + } + + foreach my $valid_voters(@voters){ + my $tmp_finish = "tmp_finish_l$libi.v$valid_voters"; + my @part_finidfqgz = (); + for(my $i = 0; $i < $partition; ++$i){ + my $outputfile = sprintf("$outdir/$tmp_finish/c_l${libi}.refined_%04d.fin.v$valid_voters.idfq.gz",$i); + push(@part_finidfqgz,$outputfile); + } + my $files=join(" ", @part_finidfqgz); + my $outputfile = "$outdir/c_l${libi}.1.fin.v$valid_voters.idfq.gz"; + + printf $mh "\n"; + printf $mh "$outputfile: $files\n"; + printf $mh "\tcat \$^ > \$\@.tmp && mv \$\@.tmp \$\@\n\n"; + } +} + +my @cdep=(); +foreach my $valid_voters(@voters){ + for($libi = 0; $libi < $nlibs; $libi ++){ + push(@cdep, "$outdir/c_l${libi}.1.fin.v$valid_voters.idfq.gz"); + } +} +my $ecdone = "ec.done"; +printf $mh "$ecdone: %s\n", join(" ", @cdep); +printf $mh "\ttouch \$\@;\n\n"; + +my $ec_only = "ec_only: $ecdone"; +printf $mh 
"$ec_only\n\n"; + +my $bashcommand=""; +my $suffix = "top20x"; + + +foreach my $estimated_genome_size (@gsizes){ + foreach my $valid_voters (@voters){ + my $combined = "$outdir/$prefix.v$valid_voters.20x$estimated_genome_size"; + { + $rreq = ""; + if(defined $param_gettop_rreq) { + $rreq = "SGE_RREQ=\"$param_gettop_rreq\" "; + } + my $PG1 = "$bindir/get_top_20x_fa.pl"; + my @vdep = (); + for($libi = 0; $libi < $nlibs; $libi ++){ + push(@vdep, "$outdir/c_l${libi}.1.fin.v$valid_voters.idfq.gz"); + } + printf $mh "%s.fq: %s\n", $combined, join(" ", @vdep); + printf $mh "\t${rreq}STMP=`mktemp -d --tmpdir sprai-XXXXXX`;gzip -d -c \$^ > \$\$STMP/ec.idfq && $PG1 \$\$STMP/ec.idfq -l -g $estimated_genome_size -q > \$\@.tmp && mv \$\@.tmp \$\@; status=\$\$?; rm -fr \$\$STMP; exit \$\$status \n\n"; + + printf $mh "%s.frg: %s.fq\n", $combined, $combined; + printf $mh "\t$ca_path/fastqToCA -libraryname foo -technology pacbio-corrected -reads \$< > \$\@\n\n"; + } + } +} +my @frgs = (); +foreach my $egsize (@gsizes){ + foreach my $vv (@voters){ + my $frg = "$outdir/$prefix.v$vv.20x$egsize.frg"; + push(@frgs, $frg); + } +} +printf $mh ".PHONY: prepfrg\n"; +printf $mh "prepfrg: %s\n\n", join(" ", @frgs); + +close $mh; + + +if($submit || $jobs){ + my $headstr ="#!/bin/sh\n#\$ -S /bin/sh\n#\$ -cwd\n"; +# construct jobs for grid engines +# each job just calls a make command to construct certain targets +# $partition should be run as array job and accessed as $SGE_TASK_ID +# STR0=`expr $SGE_TASK_ID - 1` +# PART=`printf '%04d' $STR0` + my $partgen = q(STR0=`expr $SGE_TASK_ID - 1` +PART=`printf '%04d' $STR0` +); + my $jobdir = "jobs"; + if(!-d $jobdir){ + make_path($jobdir) or die "cannot make_path $jobdir: $!\n"; + } + make_path('log'); + my $rreq = $param_makedb_rreq; + my @idfqjobids = (); # store the jobids for each library + + make_path('log/idfqgz'); + for($libi = 0; $libi < $nlibs; $libi++){ + my $iprefix = "$outdir/c_l$libi"; + my $idfqgz = "$iprefix.idfq.gz"; + my $jf = 
"$jobdir/idfq$libi.job"; + open my $jh, ">", $jf or die "cannot open $jf: $!\n"; + print $jh $headstr; + printf $jh "#\$ %s\n", $rreq; + print $jh "make -f $makefile $idfqgz\n"; + close $jh; + if($submit){ + my $retv=`qsub -o log/idfqgz -e log/idfqgz $jf`; + if($retv =~ /Your job (\d+)/){ + push(@idfqjobids, $1); + print "$1: qsub -o log/idfqgz -e log/idfqgz $jf\n"; + }else{ + die "job submission error: qsub $jf said $retv"; + } + } + } + + my @jobids = (); # store the jobids for each library + make_path('log/makeblastdb'); + for($libi = 0; $libi < $nlibs; $libi++){ + my $jf = "$jobdir/blastdb_partition$libi.job"; + open my $jh, ">", $jf or die "cannot open $jf: $!\n"; + print $jh $headstr; + printf $jh "#\$ %s\n", $rreq; + print $jh "make -f $makefile $outdir/c${libi}_imakeblastdb.done\n"; + close $jh; + if($submit){ + my $retv=`qsub -o log/makeblastdb -e log/makeblastdb -hold_jid $idfqjobids[$libi] $jf`; + if($retv =~ /Your job (\d+)/){ + push(@jobids, $1); + print "$1: qsub -o log/makeblastdb -e log/makeblastdb -hold_jid $idfqjobids[$libi] $jf\n"; + }else{ + die "job submission error: qsub $jf said $retv"; + } + } + } + # prepare the target database consisting of all the libraries by writing a single file + # pseudo dependency to synchronize + my $jf = "$jobdir/blastdb_all.job"; + { + open my $jh, ">", $jf or die "cannot open $jf: $!\n"; + print $jh $headstr; + printf $jh "#\$ %s\n", $rreq; + print $jh "make -f $makefile $nalf\n"; + close $jh; + } + my $nalfjid; + if($submit){ + my $depends = join(",",@jobids); + my $retv=`qsub -o log/makeblastdb -e log/makeblastdb -hold_jid $depends $jf`; + if($retv =~ /Your job (\d+)/){ + $nalfjid = $1; + print "$1: qsub -o log/makeblastdb -e log/makeblastdb -hold_jid $depends $jf\n"; + }else{ + die "job submission error: qsub $jf said $retv"; + } + } + + make_path('log/partition'); + my $preq = sprintf " -t 1-%d ", $partition; + for($libi = 0; $libi < $nlibs; $libi++){ + my $ipjid; + my $jf = 
"$jobdir/input_partition_l$libi.job"; + { + open my $jh, ">", $jf or die "cannot open $jf: $!\n"; + print $jh $headstr; + printf $jh "#\$ %s\n", $rreq; + print $jh "make -f $makefile $outdir/c_l${libi}_partition_fa.done\n"; + print $jh "sleep 5s\n"; + close $jh; + } + if($submit){ + my $retv=`qsub -o log/partition -e log/partition -hold_jid $idfqjobids[$libi] $jf`; + if($retv =~ /Your job (\d+)/){ + $ipjid = $1; + print "$1: qsub -o log/partition -e log/partition -hold_jid $idfqjobids[$libi] $jf\n"; + }else{ + die "job submission error: qsub $jf said $retv"; + } + } + + my $tmp_dfq = "tmp_dfq_l$libi"; + make_path('log/blast'); + #blast individual partitions and convert to SAM-like format + $jf = "$jobdir/blast_exec$libi.job"; + { + open my $jh, ">", $jf or die "cannot open $jf: $!\n"; + print $jh $headstr; + printf $jh "#\$ %s\n", $param_blast_rreq; + printf $jh "#\$ %s\n", $preq; + print $jh $partgen; + if($direct_vote){ + # This code does not separate the jobs for different valid_voters, which is apparently inefficient. + # This is justified because for $direct_vote mode, only a single valid_voters value is expected. + # Not reusing the blast output for the multiple valid_voters is by itself inefficient, and + # the parallelization benefit for multiple valid_voters is not implemented.
+ foreach my $valid_voters(@voters){ + print $jh "make -f $makefile $outdir/$tmp_dfq.v$valid_voters/c_l${libi}_\$PART.v$valid_voters.dfq.gz\n"; + } + }else{ + print $jh "make -f $makefile $outdir/$tmp_dfq/c_l${libi}_\$PART.blast.sam.gz\n"; + } + close $jh; + } + if($submit){ + my $retv=`qsub -o log/blast -e log/blast -hold_jid $nalfjid,$ipjid $jf`; + if($retv =~ /Your job-array (\d+)/){ + print "$1: qsub -o log/blast -e log/blast -hold_jid $nalfjid,$ipjid $jf\n"; + $jobids[$libi] = $1; + }else{ + die "job submission error: qsub $jf said $retv"; + } + } + } + $rreq = $param_ec_rreq; + my %jid_vv; + make_path('log/part_ec'); + foreach my $valid_voters(@voters){ + my @t_jid_lib = (); + for($libi = 0; $libi < $nlibs; $libi++){ + #error correct individual partitions + my $jf="$jobdir/part_ec$libi.v$valid_voters.job"; + my $tmp_dfq = "tmp_dfq_l$libi"; + my $tmp_finish = "tmp_finish_l$libi.v$valid_voters"; + { + open my $jh, ">", $jf or die "cannot open $jf: $!\n"; + print $jh $headstr; + printf $jh "#\$ %s\n", $rreq; + printf $jh "#\$ %s\n", $preq; + print $jh $partgen; + print $jh "make -f $makefile $outdir/$tmp_finish/c_l${libi}.refined_\$PART.fin.v$valid_voters.idfq.gz\n"; + close $jh; + } + if($submit){ + my $depend = $jobids[$libi]; + my $retv=`qsub -o log/part_ec -e log/part_ec -hold_jid_ad $depend $jf`; + if($retv =~ /Your job-array (\d+)/){ + $t_jid_lib[$libi] = $1; + print "$1: qsub -o log/part_ec -e log/part_ec -hold_jid_ad $depend $jf\n"; + }else{ + die "job submission error: qsub $jf said $retv"; + } + } + #finalize ec proc (merge error corrected partitions) + # consecutive job with array dependency + $jf="$jobdir/merge_gc$libi.v$valid_voters.job"; + { + open my $jh, ">", $jf or die "cannot open $jf: $!\n"; + print $jh $headstr; + printf $jh "#\$ %s\n", $rreq; + print $jh "make -f $makefile $outdir/c_l${libi}.1.fin.v$valid_voters.idfq.gz\n"; + close $jh; + } + if($submit){ + my $depend = $t_jid_lib[$libi]; + my $retv=`qsub -o log -e log -hold_jid $depend 
$jf`; + if($retv =~ /Your job (\d+)/){ + $jid_vv{$valid_voters} = $1; + print "$1: qsub -o log -e log -hold_jid $depend $jf\n"; + }else{ + die "job submission error: qsub $jf said $retv"; + } + } + } + } + $rreq = $param_gettop_rreq; + foreach my $estimated_genome_size (@gsizes){ + foreach my $valid_voters (@voters){ + #prepCA (select top 20 x based on estimated genome size) + my $jf="$jobdir/prepCA.v$valid_voters.e$estimated_genome_size.job"; + my $ps = "$prefix.v$valid_voters.20x$estimated_genome_size"; + { + open my $jh, ">", $jf or die "cannot open $jf: $!\n"; + print $jh $headstr; + printf $jh "#\$ %s\n", $rreq; + print $jh "make -f $makefile $outdir/$ps.frg\n"; + close $jh; + } + my $t_jid; + if($submit){ + my $depend = $jid_vv{$valid_voters}; + my $retv=`qsub -o log -e log -hold_jid $depend $jf`; + if($retv =~ /Your job (\d+)/){ + $t_jid = $1; + print "$1: qsub -o log -e log -hold_jid $depend $jf\n"; + }else{ + die "job submission error: qsub $jf said $retv"; + } + } + #execCA + $jf="$jobdir/execCA.v$valid_voters.e$estimated_genome_size.job"; + { + open my $jh, ">", $jf or die "cannot open $jf: $!\n"; + print $jh $headstr; + printf $jh "#\$ %s\n", $rreq; + print $jh "make -f $makefile $outdir/$ps/9-terminator/$ps.scf.fasta\n"; + close $jh; + } + if($submit){ + my $depend = $t_jid; + my $retv=`qsub -o log -e log -hold_jid $depend $jf`; + if($retv =~ /Your job (\d+)/){ + $t_jid = $1; + print "$1: qsub -o log -e log -hold_jid $depend $jf\n"; + }else{ + die "job submission error: qsub $jf said $retv"; + } + } + } + } +} + diff --git a/ezez4qsub_vx1.pl b/ezez4qsub_vx1.pl index 5f5e5c9..6ed1751 100755 --- a/ezez4qsub_vx1.pl +++ b/ezez4qsub_vx1.pl @@ -363,6 +363,8 @@ my $orig_idfq = sprintf("$datadir/$preprefix%02d.idfq.gz",0); my $db_idfq_gz = sprintf("$datadir/$preprefix%02d.db.idfq.gz",0); my $qu_idfq_gz = sprintf("$datadir/$preprefix%02d.qu.idfq.gz",0); +my $pipefail = "set -o pipefail;"; + if($from == 0) { # fq -> idfq (& id2n) @@ -385,7 +387,7 @@ if($from == 
0) $f_do = &do_or_not(\@parents,\$child); my $PG0 = "$bindir/fa2fq.pl"; - $command = sprintf("cat $parent | $PG0 - | $dumbbell_filter - | $PG - -flag --prefix $datadir/$preprefix%02d | gzip -c -1 > $child.tmp && mv $child.tmp $child",0); + $command = sprintf("$pipefail cat $parent | $PG0 - | $dumbbell_filter - | $PG - -flag --prefix $datadir/$preprefix%02d | gzip -c -1 > $child.tmp && mv $child.tmp $child",0); if($f_do){ $do_qsub=1; } @@ -401,7 +403,7 @@ if($from == 0) my $com2; if($min_len_for_query > 1){ - $com2 = sprintf("gzip -d -c $parent | $PG2 - $min_len_for_query -max_len $max_len_for_query | gzip -c -1 > $child.tmp && mv $child.tmp $child",0); + $com2 = sprintf("$pipefail gzip -d -c $parent | $PG2 - $min_len_for_query -max_len $max_len_for_query | gzip -c -1 > $child.tmp && mv $child.tmp $child",0); } else{ $com2 = sprintf("ln -s $parent $child"); @@ -495,7 +497,7 @@ for(my $index=$from; $index<$to; ++$index){ $PG = "$path2blast/makeblastdb"; $PG2= "$bindir/dfq2fq_v2.pl"; - $command = sprintf("gzip -d -c $parent | $PG2 -f - | $PG -in - -dbtype nucl -out $datadir/$preprefix%02d -title $preprefix%02d 1>$child.tmp && mv $child.tmp $child",$index,$index); + $command = sprintf("$pipefail gzip -d -c $parent | $PG2 -f - | $PG -in - -dbtype nucl -out $datadir/$preprefix%02d -title $preprefix%02d 1>$child.tmp && mv $child.tmp $child",$index,$index); if($f_do){ $do_qsub=1; } @@ -711,7 +713,7 @@ for(my $index=$from; $index<$to; ++$index){ my $f_do=1; my $input = sprintf("$maindir/$preprefix%02d_%04d.fq",$index,$list[$i]); #my $input = sprintf("$maindir/$preprefix%02d_%04d.fa",$index,$list[$i]); - $command = sprintf("cat $input | $bindir/fq2fa.pl - | $PG -db $datadir/$preprefix%02d -query - -evalue $evalue -outfmt '$outfmt' | $PG2 - | $PG3 - | $PG4 - | gzip -c -1 > $outputfile.tmp && mv $outputfile.tmp $outputfile", $index); + $command = sprintf("$pipefail cat $input | $bindir/fq2fa.pl - | $PG -db $datadir/$preprefix%02d -query - -evalue $evalue -outfmt '$outfmt' | 
$PG2 - | $PG3 - | $PG4 - | gzip -c -1 > $outputfile.tmp && mv $outputfile.tmp $outputfile", $index); #$command = sprintf("cat $input | $bindir/fq2fa.pl - | $PG -db $datadir/$preprefix%02d -query - -evalue $evalue -outfmt '$outfmt' | $PG2 - -f $input | $PG3 - | $PG4 - | gzip -c -1 > $outputfile.tmp && mv $outputfile.tmp $outputfile", $index); push @othercom, "gzip -d -t $outputfile"; push @othercom, "while [ \$? -ne 0 ]"; @@ -849,10 +851,10 @@ for(my $index=$from; $index<$to; ++$index){ } my $tmp = join " ", @ta; if($index+1 == $to){ - $command = sprintf("gzip -d -c $tmp | $bindir/dfq2fq_v2.pl - --finish --valid_depth $valid_depth -valid_read_length $valid_read_length | gzip -c -1 > $outputfile.tmp && mv $outputfile.tmp $outputfile"); + $command = sprintf("$pipefail gzip -d -c $tmp | $bindir/dfq2fq_v2.pl - --finish --valid_depth $valid_depth -valid_read_length $valid_read_length | gzip -c -1 > $outputfile.tmp && mv $outputfile.tmp $outputfile"); } else{ - $command = sprintf("gzip -d -c $tmp | $bindir/dfq2fq_v2.pl - | gzip -c -1 > $outputfile.tmp && mv $outputfile.tmp $outputfile"); + $command = sprintf("$pipefail gzip -d -c $tmp | $bindir/dfq2fq_v2.pl - | gzip -c -1 > $outputfile.tmp && mv $outputfile.tmp $outputfile"); } } @@ -1059,7 +1061,7 @@ for(my $index=$from; $index<$to; ++$index){ my $uuid = $now; #my $uuid = `uuidgen`; chomp $uuid; - $command = sprintf("gzip -d -c $second_catted_file > $datadir/$uuid.tmp && $PG1 $datadir/$uuid.tmp -l -g $estimated_genome_size -q -c 20 > $child.tmp && mv $child.tmp $child && rm $datadir/$uuid.tmp"); + $command = sprintf("$pipefail gzip -d -c $second_catted_file > $datadir/$uuid.tmp && $PG1 $datadir/$uuid.tmp -l -g $estimated_genome_size -q -c 20 > $child.tmp && mv $child.tmp $child && rm $datadir/$uuid.tmp"); #$command = sprintf("gzip -d -c $second_catted_file | $PG1 - -g $estimated_genome_size -q -c 20 > $child.tmp && mv $child.tmp $child"); if($f_do){ $do_qsub=1; diff --git a/ezez_vx1.pl b/ezez_vx1.pl index 
35d7c01..f470750 100755 --- a/ezez_vx1.pl +++ b/ezez_vx1.pl @@ -250,6 +250,8 @@ my $confident_length_coefficient = 0.75; my $date=""; my $message=""; +my $pipefail = "set -o pipefail;"; + # mkdirs my $PWD = `pwd`; chomp $PWD; @@ -302,7 +304,7 @@ if($from == 0) $PG0 = "$sprai_path/$PG0"; } if($f_do){ - $com1 = sprintf("time cat $parent | $PG0 - | $dumbbell_filter - | $PG - --prefix $tmp_dir/db%02d | gzip -c -1 > $child.tmp && mv $child.tmp $child",$from); + $com1 = sprintf("$pipefail time cat $parent | $PG0 - | $dumbbell_filter - | $PG - --prefix $tmp_dir/db%02d | gzip -c -1 > $child.tmp && mv $child.tmp $child",$from); `$com1`; } else{ @@ -321,7 +323,7 @@ if($from == 0) my $parent=$db_idfq_gz; if($f_do){ if($min_len_for_query > 1){ - $com2 = sprintf("time gzip -d -c $parent | $PG2 - $min_len_for_query -max_len $max_len_for_query | gzip -c -1 > $child.tmp && mv $child.tmp $child",$from); + $com2 = sprintf("$pipefail time gzip -d -c $parent | $PG2 - $min_len_for_query -max_len $max_len_for_query | gzip -c -1 > $child.tmp && mv $child.tmp $child",$from); } else{ $com2 = sprintf("ln -s $parent $child"); @@ -356,7 +358,7 @@ for(my $index=$from; $index<$to; ++$index){ $f_do = &do_or_not(\@parents,\$child); my $parent = $parents[0]; if($f_do){ - $com1 = sprintf("time gzip -d -c $parent | $PG1 -f - | makeblastdb -in - -dbtype nucl -out $tmp_dir/$preprefix%02d -title $preprefix%02d 1>$child.tmp && mv $child.tmp $child",$index,$index); + $com1 = sprintf("$pipefail time gzip -d -c $parent | $PG1 -f - | makeblastdb -in - -dbtype nucl -out $tmp_dir/$preprefix%02d -title $preprefix%02d 1>$child.tmp && mv $child.tmp $child",$index,$index); } else{ $com1 = "sleep 0"; @@ -376,7 +378,7 @@ for(my $index=$from; $index<$to; ++$index){ $f_do = &do_or_not(\@parents,\$child); my $parent = $parents[0]; if($f_do){ - $com2 = sprintf("time gzip -d -c $parent | $PG1 -f - | $PG2 - $partition -p $tmp_dir/$preprefix%02d 1>$child.tmp && mv $child.tmp $child", $index); + $com2 = sprintf("$pipefail 
time gzip -d -c $parent | $PG1 -f - | $PG2 - $partition -p $tmp_dir/$preprefix%02d 1>$child.tmp && mv $child.tmp $child", $index); } else{ $com2 = "sleep 0"; @@ -445,7 +447,7 @@ for(my $index=$from; $index<$to; ++$index){ my $f_do=1; $f_do = &do_or_not(\@parents,\$child); if($f_do){ - $command .= sprintf("time cat $parent | $BLASTN -db $tmp_dir/$preprefix%02d -query - -evalue $evalue -outfmt '$outfmt' | $PG2 - | $PG3 - | $PG4 - | gzip -c -1 > $child.tmp && mv $child.tmp $child & ", $index); + $command .= sprintf("$pipefail time cat $parent | $BLASTN -db $tmp_dir/$preprefix%02d -query - -evalue $evalue -outfmt '$outfmt' | $PG2 - | $PG3 - | $PG4 - | gzip -c -1 > $child.tmp && mv $child.tmp $child & ", $index); } else{ $command .= "sleep 0 & "; @@ -480,7 +482,7 @@ for(my $index=$from; $index<$to; ++$index){ my $f_do=1; $f_do = &do_or_not(\@parents,\$child); if($f_do){ - $command .= sprintf("time gzip -d -c $parent | $PG1 - -finish -valid_depth $valid_depth -valid_read_length $valid_read_length | gzip -c -1 > $child.tmp && mv $child.tmp $child & "); + $command .= sprintf("$pipefail time gzip -d -c $parent | $PG1 - -finish -valid_depth $valid_depth -valid_read_length $valid_read_length | gzip -c -1 > $child.tmp && mv $child.tmp $child & "); } else{ $command .= "sleep 0 & "; diff --git a/makefile b/makefile index 84ad6f2..8f78f13 100644 --- a/makefile +++ b/makefile @@ -1,5 +1,5 @@ APPNAME = 'sprai' -VERSION = '0.9.9.14' +VERSION = '0.9.9.16' PREFIX=$(PWD) COMPILED= \ @@ -11,6 +11,7 @@ m52bfmt7 \ SCRIPTS= \ ca_ikki_v5.pl \ +ezez4makefile_v4.pl \ ezez4qsub_vx1.pl \ ezez_vx1.pl \ dumbbell_filter.pl \ @@ -32,7 +33,6 @@ check_circularity.pl \ #re2cons.pl \ #ezez4qsub_v9.pl \ #ezez_v8.pl \ -#ezez4makefile_v4.pl \ all: $(COMPILED) diff --git a/nss2v_v3.c b/nss2v_v3.c index 51b5d24..2711dac 100644 --- a/nss2v_v3.c +++ b/nss2v_v3.c @@ -4,8 +4,9 @@ #include <unistd.h> #include <getopt.h> #include <ctype.h> /* toupper */ +#include <assert.h> -int LSEQ = 46; +int LSEQ = 2048; 
#define LAUX 32 /* ~hit num */ @@ -547,6 +548,232 @@ calculate_region_length(sam_t* t_sam) } return length_of_region; } +int +find_row_with_space(regions_t**lists, int* i_list, int listsize, int nrows, int opt_separate, int stt, int end, int interval) +{ + int row; + for(row=0; row < nrows; ++row){ + int f_free = 0; + /* find a free region */ + if(!opt_separate){ + int l; + int not_collide = 1; + for(l=0; l<i_list[row] && l<listsize; ++l){/* check the rgn does not overlap any reads */ + if(end+interval < lists[row][l].stt || stt > lists[row][l].end+interval){ + continue; + }else{ + not_collide=0; + } + } + if(not_collide==1 && i_list[row] <= listsize){ + f_free = 1; + } + } + if(opt_separate){ + if(i_list[row] == 0){ + f_free = 1; + } + } + if(f_free){ + return row; + } + } + return row; +} +int +paste_match(b_col_t * basic_column, el_pool_t * el_pool, int *pos, int *si, const char *sbuf, int num, int row, int stt) +{ + /* basic_column is the major operand that will be changed */ + /* el_pool supplies element cells */ + + /* num is referred only once */ + /* pos is incremented for num times*/ + /* k is referred once */ + /* row is not changed */ + /* si is incremented */ + /* stt is not changed */ + /* sbuf is read only */ + + int t = *pos+num; + int l; + int t_si = *si; + for(l=*pos; l<t; ++l,++*pos){ + int score = 0; + el_t * tmp; + int m; + if(basic_column->array[l].to_el == NULL){ + /* initialize */ + basic_column->array[l].to_el = get_el(el_pool); + basic_column->array[l].next = NULL; + basic_column->array[l].score = 0; + } + tmp = basic_column->array[l].to_el; + for(m=0; m<row; ++m){ + if(tmp->down == NULL){ + tmp->down = get_el(el_pool); + } + if(tmp->el.base != ' '){ + ++score; + } + tmp = tmp->down; + } + if(sbuf[t_si] != ' '){ + assert(tmp->el.base == ' '); + tmp->el.base = sbuf[t_si]; + ++score; + }else{ + if(tmp->el.base != ' '){ + ++score; + } + } + ++t_si; + while(tmp->down){ + tmp = tmp->down; + if(tmp->el.base != ' '){ + ++score; + } + } + 
basic_column->array[l].score = score; + + if(*pos != stt && basic_column->array[l].to_el->el.base != ' ' && basic_column->array[l-1].to_el->el.base != ' ') + { + column_t * cur_col; + column_t * l_col; + column_t * r_col; + assert(l >= 1); + cur_col = basic_column->array[l-1].next;/* ins column */ + l_col = &(basic_column->array[l-1]); + r_col = &(basic_column->array[l]); + for(; cur_col; cur_col=cur_col->next){ + { + el_t * l_el = l_col->to_el; + el_t * r_el = r_col->to_el; + el_t * c_el = cur_col->to_el; + int depth = 0; + int score = 0; + while(l_el != NULL && r_el != NULL) + { + if(c_el->el.base == ' '){ + if(l_el->el.base != ' ' && r_el->el.base != ' '){ + c_el->el.base = '-'; + ++score; + } + }else{ + ++score; + } + ++depth; + + l_el = l_el->down; + r_el = r_el->down; + if(depth > row && (l_el == NULL || r_el == NULL)){ + break; + } + if(c_el->down == NULL){ + c_el->down = get_el(el_pool); + } + c_el = c_el->down; + } + while(depth <= row){ + if(depth == row){ + c_el->el.base = '-'; + ++score; + break; + }else if(c_el->el.base == ' '){ + }else{ + ++score; + } + if(c_el->down == NULL){ + c_el->down = get_el(el_pool); + } + c_el = c_el->down; + ++depth; + } + cur_col->score = score; + } + } + } + } + *si = t_si; + return *pos; +} +int +paste_insert(b_col_t * basic_column, el_pool_t * el_pool, col_pool_t *col_pool, int pos, int *si, const char *sbuf, int num, int row, int stt) +{ + int p; + column_t *l_col, *r_col, *cur_col; + assert(pos >=1); + l_col = &(basic_column->array[pos-1]); + r_col = &(basic_column->array[pos]); + cur_col = l_col; + + for(p=0; p< num; ++p){ + if(cur_col->next == NULL){ + cur_col->next = get_col(col_pool, el_pool); + } + cur_col = cur_col->next; + { + el_t * l_el = l_col->to_el; + el_t * r_el = r_col->to_el; + el_t * c_el = cur_col->to_el; + int depth = 0; + int score = 0; + while(l_el != NULL && r_el != NULL) + { + if(depth == row){ + if(sbuf[*si] != ' '){ + assert(c_el->el.base == ' '); + c_el->el.base = sbuf[*si]; + ++score; + }else{ 
+ if(c_el->el.base != ' '){ + ++score; + } + } + ++*si; + }else if(c_el->el.base == ' '){ + if(l_el->el.base != ' ' && r_el->el.base != ' '){ + c_el->el.base = '-'; + ++score; + } + }else{ + /* keep */ + ++score; + } + ++depth; + + l_el = l_el->down; + r_el = r_el->down; + if(depth > row && (l_el == NULL || r_el == NULL)){ + break; + } + if(c_el->down == NULL){ + c_el->down = get_el(el_pool); + } + c_el = c_el->down; + } + while(depth <= row){ + if(depth == row){ + if(sbuf[*si] != ' '){ + c_el->el.base = sbuf[*si]; + ++score; + } + ++*si; + break; + }else if(c_el->el.base == ' '){ + }else{ + ++score; + } + if(c_el->down == NULL){ + c_el->down = get_el(el_pool); + } + c_el = c_el->down; + ++depth; + } + cur_col->score = score; + } + } + return *si; +} int main(int argc, char ** argv) { int hitnum=0; @@ -612,7 +839,6 @@ int main(int argc, char ** argv) fprintf(stderr, "USAGE: <this> <in.nss>\n"); fprintf(stderr, "\tinput MUST be sorted by 1st chromosome name (must), 2nd evalue and 3rd bitScore\n"); fprintf(stderr, "\t-v <integer>: valid voter.\n"); - fprintf(stderr, "\t-r <integer>: max read length. 
default value is 32768\n"); return 1; } if(LSEQ <= 65536){ @@ -676,7 +902,6 @@ int main(int argc, char ** argv) int stt2; int end2; int f_discard; - int not_collide; int length_of_region; length_of_region = calculate_region_length(t_sam); if(!length_of_region){ continue; } @@ -716,237 +941,33 @@ int main(int argc, char ** argv) continue; } - gen_inserted_seq(sbuf, t_sam); - - int row; - for(row=0; row<NROWS; ++row){ - int f_free = 0; - not_collide = 1; - /* find a free region */ - if(!opt_separate){ - int l; - for(l=0; l<i_list[row] && l<LISTSIZE; ++l){/* check the rgn does not overlap any reads */ - if(end2+INTERVAL < lists[row][l].stt || stt2 > lists[row][l].end+INTERVAL){ - continue; + { + int row = find_row_with_space(lists, i_list, LISTSIZE, NROWS, opt_separate, stt2, end2, INTERVAL); + if(row >= NROWS) continue; + /* column[row] is free */ + /* set */ + { + int k; + int pos=stt; + int si = 0; + gen_inserted_seq(sbuf, t_sam); + for(k=0; k<t_sam->cigar_length; ++k){ + char sym = t_sam->cigar[k].sym; + if(sym == 'M' || sym == 'D'){ + paste_match(&basic_column, &el_pool, &pos, &si, sbuf, t_sam->cigar[k].num, row, stt); + }else if(sym == 'I'){ + paste_insert(&basic_column, &el_pool, &col_pool, pos, &si, sbuf, t_sam->cigar[k].num, row, stt); }else{ - not_collide=0; + fprintf(stderr,"strange sym %c\n",sym); + exit(1); } } - if(not_collide==1 && i_list[row] <= LISTSIZE){ - f_free = 1; - } } - if(opt_separate){ - if(i_list[row] == 0){ - f_free = 1; - } - } - if(f_free){ - /* column[row] is free */ - /* set */ - { - int k,m; - int pos=stt; - int si = 0; - for(k=0; k<t_sam->cigar_length; ++k){ - char sym = t_sam->cigar[k].sym; - if(sym == 'M' || sym == 'D'){ - int t = pos+t_sam->cigar[k].num; - int l; - for(l=pos; l<t; ++l,++pos){ - int score = 0; - el_t * tmp; - if(basic_column.array[l].to_el == NULL){ - /* initialize */ - basic_column.array[l].to_el = get_el(&el_pool); - basic_column.array[l].next = NULL; - basic_column.array[l].score = 0; - } - tmp = 
basic_column.array[l].to_el; - for(m=0; m<row; ++m){ - if(tmp->down == NULL){ - tmp->down = get_el(&el_pool); - } - if(tmp->el.base != ' '){ - ++score; - } - tmp = tmp->down; - } - if(sbuf[si] != ' '){ - if(tmp->el.base != ' '){ - fprintf(stderr,"bug %s %c %c\n",t_sam->rname,tmp->el.base,sbuf[si]); - } - tmp->el.base = sbuf[si]; - ++score; - }else{ - if(tmp->el.base != ' '){ - ++score; - } - } - ++si; - while(tmp->down){ - tmp = tmp->down; - if(tmp->el.base != ' '){ - ++score; - } - } - basic_column.array[l].score = score; - - if(pos != stt && basic_column.array[l].to_el->el.base != ' ' && basic_column.array[l-1].to_el->el.base != ' ') - { - column_t * cur_col; - column_t * l_col; - column_t * r_col; - if(l<1){ - fprintf(stderr,"pg bug l\n"); - } - cur_col = basic_column.array[l-1].next;/* ins column */ - l_col = &(basic_column.array[l-1]); - r_col = &(basic_column.array[l]); - for(; cur_col; cur_col=cur_col->next){ - { - el_t * l_el = l_col->to_el; - el_t * r_el = r_col->to_el; - el_t * c_el = cur_col->to_el; - int depth = 0; - int score = 0; - while(l_el != NULL && r_el != NULL) - { - if(c_el->el.base == ' '){ - if(l_el->el.base != ' ' && r_el->el.base != ' '){ - c_el->el.base = '-'; - ++score; - } - }else{ - ++score; - } - ++depth; - - l_el = l_el->down; - r_el = r_el->down; - if(depth > row && (l_el == NULL || r_el == NULL)){ - break; - } - if(c_el->down == NULL){ - c_el->down = get_el(&el_pool); - } - c_el = c_el->down; - } - while(depth <= row){ - if(depth == row){ - c_el->el.base = '-'; - ++score; - break; - }else if(c_el->el.base == ' '){ - }else{ - ++score; - } - if(c_el->down == NULL){ - c_el->down = get_el(&el_pool); - } - c_el = c_el->down; - ++depth; - } - cur_col->score = score; - } - } - } - } - }else if(sym == 'I'){ - int p; - int n_ins; - column_t *l_col, *r_col, *cur_col; - if(pos<1){ - fprintf(stderr,"strange pos %d\n",pos); - exit(1); - } - l_col = &(basic_column.array[pos-1]); - r_col = &(basic_column.array[pos]); - cur_col = l_col; - - n_ins = 
t_sam->cigar[k].num; - for(p=0; p<n_ins; ++p){ - if(cur_col->next == NULL){ - cur_col->next = get_col(&col_pool, &el_pool); - } - cur_col = cur_col->next; - - { - el_t * l_el = l_col->to_el; - el_t * r_el = r_col->to_el; - el_t * c_el = cur_col->to_el; - int depth = 0; - int score = 0; - while(l_el != NULL && r_el != NULL) - { - if(depth == row){ - if(sbuf[si] != ' '){ - if(c_el->el.base != ' '){ - fprintf(stderr,"pg bug 433\n"); - } - c_el->el.base = sbuf[si]; - ++score; - }else{ - if(c_el->el.base != ' '){ - ++score; - } - } - ++si; - }else if(c_el->el.base == ' '){ - if(l_el->el.base != ' ' && r_el->el.base != ' '){ - c_el->el.base = '-'; - ++score; - } - }else{ - /* keep */ - ++score; - } - ++depth; - - l_el = l_el->down; - r_el = r_el->down; - if(depth > row && (l_el == NULL || r_el == NULL)){ - break; - } - if(c_el->down == NULL){ - c_el->down = get_el(&el_pool); - } - c_el = c_el->down; - } - while(depth <= row){ - if(depth == row){ - if(sbuf[si] != ' '){ - c_el->el.base = sbuf[si]; - ++score; - } - ++si; - break; - }else if(c_el->el.base == ' '){ - }else{ - ++score; - } - if(c_el->down == NULL){ - c_el->down = get_el(&el_pool); - } - c_el = c_el->down; - ++depth; - } - cur_col->score = score; - } - } - }else{ - fprintf(stderr,"strange sym %c\n",sym); - exit(1); - } - } - } - lists[row][i_list[row]].stt = stt2; - lists[row][i_list[row]].end = end2; - ++i_list[row]; - if(max_end < end2){ - max_end = end2; - } - break; + lists[row][i_list[row]].stt = stt2; + lists[row][i_list[row]].end = end2; + ++i_list[row]; + if(max_end < end2){ + max_end = end2; } } } diff --git a/wscript b/wscript index 69d8d4e..645b700 100644 --- a/wscript +++ b/wscript @@ -1,5 +1,5 @@ APPNAME = 'sprai' -VERSION = '0.9.9.14' +VERSION = '0.9.9.16' srcdir = '.' 
blddir = 'build' @@ -42,7 +42,7 @@ def build(bld): 'get_top_20x_fa.pl', # 'mira_ikki.pl', 'partition_fa.pl', -# 'ezez4makefile_v4.pl', + 'ezez4makefile_v4.pl', 'get_target_fasta_records.pl', 'dfq2fq_v2.pl', 'extract_fq.pl', @@ -120,7 +120,7 @@ def dist(ctx): 'get_top_20x_fa.pl', # 'mira_ikki.pl', 'partition_fa.pl', -# 'ezez4makefile_v4.pl', + 'ezez4makefile_v4.pl', 'get_target_fasta_records.pl', 'doc/_build/html/**', 'configure', -- Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-med/sprai.git _______________________________________________ debian-med-commit mailing list [email protected] http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/debian-med-commit
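Editor's note on the recurring change in ezez4qsub_vx1.pl and ezez_vx1.pl: every generated pipeline is now prefixed with `set -o pipefail;`. A minimal bash sketch of why that matters for the commit's `gzip -d -c ... | ... | gzip -c > $child.tmp && mv $child.tmp $child` pattern (here `false` stands in for a failing decompression or filter stage; the file names are illustrative, not from the patch):

```shell
#!/bin/bash
# Default shell behavior: a pipeline's exit status is that of its LAST stage,
# so an early failure in "gzip -d -c ... | dfq2fq_v2.pl ... | gzip -c" is
# masked, and a following "&& mv $child.tmp $child" would publish a
# truncated output file.
( false | gzip -c > /dev/null ); echo "without pipefail: $?"

# With pipefail (as prepended by this commit), any failing stage fails the
# whole pipeline, so the temporary file is never renamed into place.
( set -o pipefail; false | gzip -c > /dev/null ); echo "with pipefail: $?"
```

Note that `pipefail` is a bash/ksh feature rather than plain POSIX sh, so the generated commands must run under a shell that supports it (the job scripts here request `#$ -S /bin/sh`, so this depends on what /bin/sh is on the cluster).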

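Editor's note on the array-job plumbing in ezez4makefile_v4.pl: the `$partgen` fragment converts SGE's 1-based `$SGE_TASK_ID` into the zero-padded partition suffix used by the `%04d` file names in the generated makefile rules. A standalone sketch (`SGE_TASK_ID` is hard-coded here for illustration; under the grid engine it is set per task by `qsub -t 1-$partition`):

```shell
#!/bin/sh
SGE_TASK_ID=4                  # normally supplied per task by SGE; fixed here for the sketch
STR0=`expr $SGE_TASK_ID - 1`   # shift to the 0-based partition index
PART=`printf '%04d' $STR0`     # zero-pad to match the %04d naming of c_l${libi}_NNNN files
echo "$PART"                   # prints 0003
```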