Hi Reuti,
I am still struggling, but here is what I did.
In the directory containing my 4 fastq files, I ran this command and
stored the output in workset:
ls *.fastq > workset
The script jobscript.sh contains:
#!/bin/bash
f=$(sed -n ${SGE_TASK_ID}p)
# My commands without the for loop. Do you mean like this?
LB="${f%%.*}";
SM="${f%%.*}";
PU="${f%%.lane_*}";
PU="${PU#*.}";
NUM="${f%%_P*}";
NUM="${NUM##*_}";
ID="$LB-$PU-$NUM";
PART1="${f%%_P*}";
PART2="${f##*_P1_}";
# Replace the submission command from below by a plain call to your application.
# Add the necessary commands for postprocessing as you outlined below.
# By this, do you mean this?
# This is my plain command, which does the alignment:
/apps1/bwa/bwa-0.7.5a/bwa mem -t 12 \
  -R '@RG\tID:$ID\tPL:illumina\tPU:$PU\tLB:$LB\tSM:$SM' -v 1 -a -M \
  /cork/vgupta12/S.cerevisiae/indexes/bwa/sacCer3.fa \
  "${PART1}_P1_${PART2}" "${PART1}_P2_${PART2}" > ${LB}.sam
# How do I add the other downstream commands?
# Is it something like this?
samtools view -bS ${LB}sam | samtools sort - ${LB}
samtools index ${LB}.bam
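As a sanity check, the parameter expansions in the script can be run on a single file name outside the queue. The following is a minimal sketch, assuming the first file name from the original post as input. It also builds the read-group string in double quotes, since inside single quotes bash does not expand $ID, $PU, $LB and $SM. The variable name RG is only for illustration and is not part of the original script.

```shell
#!/bin/bash
# Sample input: one file name taken from the original post.
f="A-122-3.XX.lane_1_P1_I24.sacCer3.sequence.fastq"

LB="${f%%.*}"        # strip everything from the first dot onwards
SM="$LB"
PU="${f%%.lane_*}"   # strip from ".lane_" onwards
PU="${PU#*.}"        # strip up to and including the first dot
NUM="${f%%_P*}"      # strip from "_P" onwards
NUM="${NUM##*_}"     # keep what follows the last underscore
ID="$LB-$PU-$NUM"
PART1="${f%%_P*}"
PART2="${f##*_P1_}"

# RG is a hypothetical helper variable. Double quotes let the shell expand
# the variables; the backslash-t sequences are passed on unchanged.
RG="@RG\tID:$ID\tPL:illumina\tPU:$PU\tLB:$LB\tSM:$SM"

printf '%s\n' "LB=$LB PU=$PU NUM=$NUM ID=$ID"
printf '%s\n' "mate1=${PART1}_P1_${PART2}"
printf '%s\n' "mate2=${PART1}_P2_${PART2}"
printf '%s\n' "RG=$RG"
```

If the printed names match the files on disk, the expansions themselves are fine and any problem is more likely in how $f gets its value.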
I then saved the script and submitted it from the command line like this:
qsub -t 1-$(sed -n '$=' workset) -cwd -V -j y -b y -N check \
  -pe threaded 12 ./jobscript.sh
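The two sed calls involved here can be tried by hand before submitting. This is a minimal sketch, assuming a temporary copy of the work list and a manually set SGE_TASK_ID (inside a real array task Grid Engine sets it). Note that sed is given the list file as an argument here; without a file name it reads standard input instead.

```shell
#!/bin/bash
# Throwaway work list in a temporary directory (names from the original post).
tmpdir=$(mktemp -d)
workset="$tmpdir/workset"
printf '%s\n' \
  "A-122-3.XX.lane_1_P1_I24.sacCer3.sequence.fastq" \
  "A-122-3.XX.lane_1_P2_I24.sacCer3.sequence.fastq" \
  "A-2-3.XX.lane_1_P1_I47.sacCer3.sequence.fastq" \
  "A-2-3.XX.lane_1_P2_I47.sacCer3.sequence.fastq" > "$workset"

# '$=' prints the number of the last line, i.e. the line count.
ntasks=$(sed -n '$=' "$workset")

# Set by Grid Engine inside an array task; set by hand here for the test.
SGE_TASK_ID=3
f=$(sed -n "${SGE_TASK_ID}p" "$workset")

# With no file argument sed reads standard input; on empty input f stays empty.
f_empty=$(sed -n "${SGE_TASK_ID}p" < /dev/null)

echo "tasks: $ntasks"
echo "task $SGE_TASK_ID gets: $f"
echo "without a file argument: '$f_empty'"
rm -rf "$tmpdir"
```

Since a list made with ls *.fastq holds both mates of each pair, listing only the *_P1_* files may be closer to what the original for loop did.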
Is this what you asked me to do? Sorry if I am missing something.
It's not working this way; it throws errors
like this:
[E::main_mem] fail to open file `_P1_'.
Usage: samtools sort [-on] [-m <maxMem>] <in.bam> <out.prefix>
[main_samview] fail to open "sam" for reading.
open: No such file or directory
[bam_index_build2] fail to open the BAM file.
(the same five lines appear four times in the output)
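The file names in these messages match what the script would build if $f were empty. A minimal sketch, assuming that is what happened inside the job (for example because the sed call received no input):

```shell
#!/bin/bash
# Assumption: f ended up empty inside the job.
f=""
LB="${f%%.*}"
PART1="${f%%_P*}"
PART2="${f##*_P1_}"

mate1="${PART1}_P1_${PART2}"   # name handed to bwa mem
sam_in="${LB}sam"              # as written in the script, without the dot

echo "bwa input: '$mate1'"
echo "samtools view input: '$sam_in'"
```

If that is the cause, giving sed the workset file as an argument and writing ${LB}.sam with the dot would be the first things to try.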
Sorry for troubling you on this
Help!!
Regards
Varun
On Wed, Apr 9, 2014 at 7:03 AM, Reuti <[email protected]> wrote:
> Hi,
>
> On 09.04.2014 at 07:44, VG wrote:
>
> > Hello Everyone,
> >
> > Till now I have been writing simple scripts to do my work. Now I want to automate my task.
> > I have no idea about submitting array jobs.
> > So here is an explanation of what I want to do, and then you can help me design the script.
> >
> > In a directory I have these 4 fastq files:
> >
> > A-122-3.XX.lane_1_P1_I24.sacCer3.sequence.fastq
> > A-122-3.XX.lane_1_P2_I24.sacCer3.sequence.fastq
> > A-2-3.XX.lane_1_P1_I47.sacCer3.sequence.fastq
> > A-2-3.XX.lane_1_P2_I47.sacCer3.sequence.fastq
> >
> > I made this script to align my fastq files with the yeast genome and it worked perfectly fine
>
> As far as I can see there are two questions in play here: first, how to
> handle this task as an array job; second, how to apply the postprocessing.
> This raises the question whether an array job is suitable at all. Often an
> array job is submitted when you have the same application and input, but
> need a varying index to specify a point in time or a frame of a movie you
> want to render.
>
> Nevertheless you can first assemble a list of files and then use the
> generated index to pick the specific line of this list that should be
> computed in a particular run.
>
>
> Step 1:
>
> $ ls *.fastq > workset
>
>
> Step 2:
>
> A jobscript.sh which you will submit:
>
> #!/bin/bash
> f=$(sed -n "${SGE_TASK_ID}p" workset)
> ...
> your commands from below without the loop
> ...
>
>
> Step 3:
>
> Replace the submission command from below by a plain call to your
> application.
> Add the necessary commands for postprocessing as you outlined below.
>
>
> Step 4:
>
> $ qsub -t 1-$(sed -n '$=' workset) -cwd -V -j y -pe smp 12 jobscript.sh
>
>
> Step N:
>
> As you can see above, I replaced "-l num_proc=12" with a request for a PE
> (which you will have to set up). "num_proc" is a feature of a machine,
> like the architecture or OS. The advantage, besides being more
> SGE-like, is that it's quite easy to change the number of cores. Inside the
> jobscript the variable $NSLOTS is set, and using "-t $NSLOTS" as an
> argument to your application would instantly use the granted number, i.e.
> running with only 6 cores would need a change in the submission command,
> but no longer in the script.
>
> HTH -- Reuti
>
>
>
> > #!/bin/bash
> > for f in *_P1*
> > do
> > LB="${f%%.*}";
> > SM="${f%%.*}";
> > PU="${f%%.lane_*}";
> > PU="${PU#*.}";
> > NUM="${f%%_P*}";
> > NUM="${NUM##*_}";
> >
> > ID="$LB-$PU-$NUM";
> >
> > PART1="${f%%_P*}";
> > PART2="${f##*_P1_}";
> >
> > qsub -l mf=30G -l num_proc=12 -cwd -V -j y -b y -N $LB "/apps1/bwa/bwa-0.7.5a/bwa mem -t 12 -R '@RG\tID:$ID\tPL:illumina\tPU:$PU\tLB:$LB\tSM:$SM' -v 1 -a -M /cork/vgupta12/S.cerevisiae/indexes/bwa/sacCer3.fa "${PART1}_P1_${PART2}" "${PART1}_P2_${PART2}" > ${LB}.sam"
> >
> > done
> >
> > Now the result I have is 2 .sam files, namely A-122-3.sam and A-2-3.sam.
> >
> > This is what I want to do in the same script:
> >
> > Convert all the sam files into bam files. Here is the command for one sam file:
> >
> > samtools view -bS test.sam | samtools sort - test
> >
> > Then make index files for all bam files. The command to index one bam file is:
> >
> > samtools index test.bam ##output is test.bam.bai
> >
> > I have further downstream analysis, but for now I would like to get my script above plus the commands I just mentioned (highlighted part) into one script.
> >
> > Any help would be appreciated.
> >
> > Thanks for all the help so far
> >
> > Regards
> > _______________________________________________
> > users mailing list
> > [email protected]
> > https://gridengine.org/mailman/listinfo/users