Mark - Just guessing: One may need to specify "merge -f" in case an earlier invocation of the command has created an empty file with the path generated by "jp(bamFolder, sample + ".bam")". Also may need to specify "merge -n" if, in fact, the output from bwa is already properly name-sorted. Again, just guessing.
- tom blackwell - On Fri, 12 May 2017, Margres, Mark J wrote: > Hello, > > I am having an issue with merging different bam assemblies using samtools. > In my pipeline, I merge reads using flash2, and then perform reference-based > assemblies using bwa, one assembly uses just the merged reads and a second > assembly uses just the unmerged reads. I then attempt to merge the > assemblies using samtools and get the following output: > > > [W::bam_merge_core2] No @HD tag found. > > I am assuming this is due to the unmerged read assembly having a different > header than the merged reads, but I am not sure why (and also not sure if > this is a bwa issue or a samtools issue). Is there a way for me to merge and > then sort the assemblies with maintaining the correct header tags? I have > the original assemblies to work with, but have assembled abut a dozen genomes > and would prefer to not re-assemble if possible. Code is below. Thanks. > > # Run BWA to map PE samples to reference genome > # -t number of threads -R read group header > logFile = jp(resultsDir, sample + '_mapping.log') > cmd = ' '.join(["/home/mark.margres/bwa-0.7.15/bwa mem -t 50 -R > '@RG\tID:bwa\tSM:" + sample + "\tPL:ILLUMINA'", > bwaIndex, jp(resultsDir, sample + "_cleaned_PE1.fastq.gz"), > jp(resultsDir, sample + "_cleaned_PE2.fastq.gz"), "| > samtools view -bS -@ 50 -o", jp(bamFolder, sample + "PE.bam"), > "2>", logFile]) > log(cmd, logCommands) > os.system(cmd) > # Run BWA to map SE samples to reference genome > cmd = ' '.join(["/home/mark.margres/bwa-0.7.15/bwa mem -t 50 -R > '@RG\tID:bwa\tSM:" + sample + "\tPL:ILLUMINA'", > bwaIndex, jp(resultsDir, sample + "_cleaned_SE.fastq.gz"), > "| samtools view -bS -@ 50 -o", jp(bamFolder, sample + "SE.bam"), > "2>>", logFile]) > log(cmd, logCommands) > os.system(cmd) > # merge and sort > cmd = ' '.join(['samtools merge -c', jp(bamFolder, sample + ".bam"), > jp(bamFolder, sample + "PE.bam"), jp(bamFolder, sample + "SE.bam")]) > log(cmd, logCommands) > os.system(cmd) > # sort bam file; -@ number of threads > cmd = ' '.join(['samtools sort -o', jp(bamFolder, sample) + "sorted.bam", > ' -@ 50', jp(bamFolder, sample + ".bam")]) > log(cmd, logCommands) > os.system(cmd) > # Mark and remove PCR duplicates > cmd = ' '.join([picard + "MarkDuplicates INPUT=" + jp(bamFolder, sample + > "sorted.bam"), ' OUTPUT=' + jp(bamFolder, sample + "_markdup.bam"), > ' METRICS_FILE=' + jp(bamFolder, sample + ".metrics"), ' > REMOVE_DUPLICATES=true ', > ' ASSUME_SORTED=true VALIDATION_STRINGENCY=LENIENT', '>>', > logFile, '2>&1']) > log(cmd, logCommands) > os.system(cmd) > # Index bam file: > cmd = ' '.join(['samtools index', jp(bamFolder, sample) + "_markdup.bam"]) > > ---------------------- > Mark J. Margres, Ph.D. > Postdoc, Storfer Lab > School of Biological Sciences > Washington State University > Pullman, WA 99164-4236 > ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _______________________________________________ Samtools-help mailing list Samtools-help@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/samtools-help