Dear Raf,

Thanks for letting me know. I have seen the same problem happen here before as well.

Best wishes,

Wei

On Jun 4, 2013, at 5:49 PM, rcaloger wrote:

> Dear Wei,
> I think the problem was due to a file system issue. I checked the RAID and
> one of the disks had failed; this may have caused I/O problems, especially
> because many other I/O processes were running during the Rsubread run.
> I ran Rsubread again and did not get any error.
> Thanks for the help.
> Raf
>
> On 6/2/13 1:22 AM, Wei Shi wrote:
>> Dear Raf,
>>
>> As Martin pointed out, that line seems to be the concatenation of two
>> records, but the second record is incomplete (it doesn't have the read
>> identifier). It seems more likely to be a file system problem than an
>> Rsubread problem. Could you please also provide the line before the
>> problematic line? You may also rerun the alignment on a different disk
>> to see whether the problem occurs again.
>>
>> Hope this helps.
>>
>> Best wishes,
>>
>> Wei
>>
>> On Jun 2, 2013, at 2:35 AM, Martin Morgan wrote:
>>
>>> On 06/01/2013 08:04 AM, rcaloger wrote:
>>>> Hi,
>>>> I am using the devel version of Bioconductor as part of the
>>>> development of my package chimera.
>>>> While testing a new function in chimera that uses the Rsubread
>>>> package, I encountered a problem converting a SAM file generated by
>>>> Rsubread into a BAM file.
>>>> I used the function asBam from Rsamtools and got the following error:
>>>>
>>>> In doTryCatch(return(expr), name, parentenv, handler) :
>>>> Parse error at line 14667325: sequence and quality are inconsistent
>>>>
>>>> asBam runs if I use only the SAM file up to line 14667324, but I get
>>>> the above error if the SAM file ends at line 14667325.
>>>>
>>>> The line that creates the problem is the following:
>>>>
>>>> HWI-ST169:273:D0YW6ACXX:2:1201:4070:162856 141 * 0 0 * * 0 0
>>>> AAAAAAGGGTTGAATTATTTTCACTTGCCCACGTAGTTTATGAATGTGGGAAATAGCTTCAAAGACAGATTAAATGATTTGCCCAAGGCCACAGAAAAGAG
>>>> @@@FFFFFHABHHJGGBFIGIFHGIJHGJGJIFBGHDBG9BDAFIIDHIIGCHCHI<GACC@ADHHHE;7?@DEFED>@;ACCC>ABB;AAD<BC>
>>>> 77 * 0 0 * * 0 0
>>>> CATGGATGAGGAGAATGAGGATTTTGCGCCGGCTGCTCAGAAGATACCGTGAATCTAAGAAGATCGATCGCCACATGTATCACAGCCTGTACCTGAAGGGG
>>>> @@@DD?BADHF<D<ACG>FFE;BBF@B?@C@F:(?1.=)))883)8=7@(65??EEBDEC37;;>???=BB@<BBCCACBDDCC:?BCBC:@#########
>>>
>>> This looks like two separate records have been concatenated; it's
>>> really hard to know whether this is Rsubread or some aspect of the
>>> file system or the way the file has been handled after creation by
>>> Rsubread. Picard is one commonly used tool for validation.
>>>
>>> Martin
>>>
>>>> Does anybody have an idea of what is wrong with this line?
>>>> Is there any way to validate the SAM file before running asBam, to
>>>> detect and filter out lines that might cause problems in the
>>>> conversion to BAM?
>>>> Cheers
>>>> Raf
>>>>
>>>> ########
>>>> sessionInfo()
>>>> R version 3.0.0 (2013-04-03)
>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>>
>>>> locale:
>>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>>> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
>>>> [7] LC_PAPER=C LC_NAME=C
>>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] parallel stats graphics grDevices utils datasets methods
>>>> [8] base
>>>>
>>>> other attached packages:
>>>> [1] Rsamtools_1.13.16 Biostrings_2.29.3 GenomicRanges_1.13.15
>>>> [4] XVector_0.1.0 IRanges_1.19.8 BiocGenerics_0.7.2
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] bitops_1.0-5 stats4_3.0.0 zlibbioc_1.7.0
>>>
>>> --
>>> Computational Biology / Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N.
>>> PO Box 19024 Seattle, WA 98109
>>>
>>> Location: Arnold Building M1 B861
>>> Phone: (206) 667-2793
>> ______________________________________________________________________
>> The information in this email is confidential and intended solely for
>> the addressee.
>> You must not disclose, forward, print or use it without the permission
>> of the sender.
>> ______________________________________________________________________
>
> --
> ----------------------------------------
> Prof. Raffaele A. Calogero
> Bioinformatics and Genomics Unit
> MBC Centro di Biotecnologie Molecolari
> Via Nizza 52, Torino 10126
> tel. ++39 0116706457
> Fax ++39 0112366457
> Mobile ++39 3333827080
> email: raffaele.calog...@unito.it
> raffaele[dot]calogero[at]gmail[dot]com
> www: http://www.bioinformatica.unito.it

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
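
On Raf's question in the thread about checking a SAM file before calling asBam: Martin points to Picard for validation. Purely as an illustration, the following is a minimal Python sketch of a lighter pre-check. It is not part of Rsubread, Rsamtools, or Picard; the function name find_bad_sam_lines and the example records are hypothetical. It assumes tab-separated SAM text and flags alignment lines with fewer than 11 mandatory fields, lines whose SEQ and QUAL lengths differ (the condition asBam reported), and lines whose trailing fields do not look like TAG:TYPE:VALUE optional fields, which is how two fused records would typically show up.

```python
import re

# Optional SAM fields have the form TAG:TYPE:VALUE (per the SAM specification).
OPTIONAL_FIELD = re.compile(r"^[A-Za-z][A-Za-z0-9]:[AifZHB]:.*$")


def find_bad_sam_lines(lines):
    """Return (line_number, reason) for each alignment line that looks malformed.

    Hypothetical helper for illustration; it checks only a few structural
    properties and is not a full SAM validator.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if line.startswith("@"):
            continue  # header line, not an alignment record
        fields = line.split("\t")
        if len(fields) < 11:
            problems.append((number, "fewer than 11 mandatory fields"))
            continue
        seq, qual = fields[9], fields[10]
        # '*' means the sequence or quality string was not stored.
        if seq != "*" and qual != "*" and len(seq) != len(qual):
            problems.append((number, "SEQ and QUAL lengths differ"))
            continue
        # Anything after column 11 must be an optional TAG:TYPE:VALUE field.
        if any(not OPTIONAL_FIELD.match(f) for f in fields[11:]):
            problems.append((number, "extra fields are not TAG:TYPE:VALUE"))
    return problems


# Made-up records, much shorter than real reads.
good = "\t".join(["r1", "77", "*", "0", "0", "*", "*", "0", "0", "ACGT", "FFFF"])
short_qual = "\t".join(["r2", "141", "*", "0", "0", "*", "*", "0", "0", "ACGT", "FF"])
# A complete record followed by a second record that lost its read identifier.
fused = good + "\t" + "\t".join(["77", "*", "0", "0", "*", "*", "0", "0", "ACGT", "FFFF"])

for number, reason in find_bad_sam_lines(["@HD\tVN:1.0", good, short_qual, fused]):
    print(number, reason)
```

With a real file, the lines could be streamed from an open file handle instead of a list, and the reported line numbers compared with the one in the asBam error message. A check like this only finds damaged lines; as the thread concludes, the underlying cause here was most likely the failing disk, so rerunning the alignment on healthy storage remains the actual fix.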