Dear Raf,

Thanks for letting me know. I have seen the same problem happen here before as well.

Best wishes,
Wei

On Jun 4, 2013, at 5:49 PM, rcaloger wrote:

> Dear Wei,
> I think the problem was due to a file system issue. I checked the RAID and 
> one of the disks had failed. This could have caused I/O problems, especially 
> since many other I/O processes were running while Rsubread was running.
> I ran Rsubread again and did not get any error.
> Thanks for the help.
> Raf
> 
> On 6/2/13 1:22 AM, Wei Shi wrote:
>> Dear Raf,
>> 
>> As Martin pointed out, that line seems to be the concatenation of two 
>> records, but the second record is incomplete (it doesn't have the read 
>> identifier). This seems more likely to be a file system problem than an 
>> Rsubread problem. Could you please also provide the line before the 
>> problematic line? You could also rerun the alignment on a different disk to 
>> see whether the problem occurs again.
>> 
>> Hope this helps.
>> 
>> 
>> Best wishes,
>> 
>> Wei
>> 
>> 
>> On Jun 2, 2013, at 2:35 AM, Martin Morgan wrote:
>> 
>>> On 06/01/2013 08:04 AM, rcaloger wrote:
>>>> Hi,
>>>> I am using the devel version of Bioconductor as part of the development of 
>>>> my package chimera.
>>>> While testing a new function in chimera that uses the Rsubread package, I 
>>>> encountered a problem converting a SAM file generated by Rsubread into a 
>>>> BAM file.
>>>> I used the function asBam from Rsamtools and got the following error:
>>>> 
>>>> In doTryCatch(return(expr), name, parentenv, handler) :
>>>>   Parse error at line 14667325: sequence and quality are inconsistent
>>>> 
>>>> asBam runs successfully if I use only the SAM file up to line 14667324.
>>>> I get the above error if I use a SAM file that ends at line 14667325.
>>>> 
>>>> The line that causes the problem is the following:
>>>> 
>>>> HWI-ST169:273:D0YW6ACXX:2:1201:4070:162856    141    *    0    0 * *    0  
>>>>   0
>>>> AAAAAAGGGTTGAATTATTTTCACTTGCCCACGTAGTTTATGAATGTGGGAAATAGCTTCAAAGACAGATTAAATGATTTGCCCAAGGCCACAGAAAAGAG
>>>> @@@FFFFFHABHHJGGBFIGIFHGIJHGJGJIFBGHDBG9BDAFIIDHIIGCHCHI<GACC@ADHHHE;7?@DEFED>@;ACCC>ABB;AAD<BC>
>>>> 77    *    0    0    *    *    0    0
>>>> CATGGATGAGGAGAATGAGGATTTTGCGCCGGCTGCTCAGAAGATACCGTGAATCTAAGAAGATCGATCGCCACATGTATCACAGCCTGTACCTGAAGGGG
>>>> @@@DD?BADHF<D<ACG>FFE;BBF@B?@C@F:(?1.=)))883)8=7@(65??EEBDEC37;;>???=BB@<BBCCACBDDCC:?BCBC:@#########
>>> This looks like two separate records have been concatenated; it's really 
>>> hard to know whether this is Rsubread or some aspect of the file system or 
>>> the way the file has been handled after creation by Rsubread. Picard is one 
>>> commonly used tool for validation. Martin
>>> 
>>>> 
>>>> Does anybody have an idea of what is wrong with this line?
>>>> Is there any way to validate the SAM file before running asBam, to detect 
>>>> and filter out lines that might cause problems in the conversion to BAM?
>>>> Cheers
>>>> Raf
>>>> 
>>>> ########
>>>> sessionInfo()
>>>> R version 3.0.0 (2013-04-03)
>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>> 
>>>> locale:
>>>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>>>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>>>  [7] LC_PAPER=C                 LC_NAME=C
>>>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>> 
>>>> attached base packages:
>>>> [1] parallel  stats     graphics  grDevices utils     datasets methods
>>>> [8] base
>>>> 
>>>> other attached packages:
>>>> [1] Rsamtools_1.13.16     Biostrings_2.29.3 GenomicRanges_1.13.15
>>>> [4] XVector_0.1.0         IRanges_1.19.8        BiocGenerics_0.7.2
>>>> 
>>>> loaded via a namespace (and not attached):
>>>> [1] bitops_1.0-5   stats4_3.0.0   zlibbioc_1.7.0
>>>> 
>>> 
>>> -- 
>>> Computational Biology / Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N.
>>> PO Box 19024 Seattle, WA 98109
>>> 
>>> Location: Arnold Building M1 B861
>>> Phone: (206) 667-2793
>> ______________________________________________________________________
>> The information in this email is confidential and intended solely for the 
>> addressee.
>> You must not disclose, forward, print or use it without the permission of 
>> the sender.
>> ______________________________________________________________________
>> 
> 
> 
> -- 
> 
> ----------------------------------------
> Prof. Raffaele A. Calogero
> Bioinformatics and Genomics Unit
> MBC Centro di Biotecnologie Molecolari
> Via Nizza 52, Torino 10126
> tel.   ++39 0116706457
> Fax    ++39 0112366457
> Mobile ++39 3333827080
> email: raffaele.calog...@unito.it
>       raffaele[dot]calogero[at]gmail[dot]com
> www:   http://www.bioinformatica.unito.it
> 
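
On the question raised in the thread about checking a SAM file before running asBam: the following is a minimal sketch, not part of Rsubread or Rsamtools, of a line-by-line check for the condition behind the reported error (sequence and quality of different lengths). It is written in Python as a standalone script. The function names and the sample records are hypothetical, and the check covers only the field count and the SEQ/QUAL lengths, so it is no substitute for a full validator such as Picard.

```python
def check_sam_line(line):
    """Return a description of the problem in one SAM line, or None if none is found."""
    if line.startswith("@"):
        # Header lines have no SEQ/QUAL columns; read names cannot start with "@".
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        # An alignment record has 11 mandatory tab-separated fields.
        return "fewer than 11 mandatory fields"
    seq, qual = fields[9], fields[10]
    # "*" means the sequence or quality string is not stored, so skip the comparison.
    if seq != "*" and qual != "*" and len(seq) != len(qual):
        return f"SEQ length {len(seq)} != QUAL length {len(qual)}"
    return None


def find_bad_lines(lines):
    """Yield (line_number, problem) for each suspicious line; numbering starts at 1."""
    for number, line in enumerate(lines, start=1):
        problem = check_sam_line(line)
        if problem is not None:
            yield number, problem


if __name__ == "__main__":
    # Hypothetical records for illustration only (not taken from the thread).
    sample = [
        "@HD\tVN:1.0\n",
        "read1\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
        "read2\t141\t*\t0\t0\t*\t*\t0\t0\tACGT\tII\n",
    ]
    for number, problem in find_bad_lines(sample):
        print(f"line {number}: {problem}")
```

To scan a real file, pass an open file handle to find_bad_lines instead of the sample list. The reported line numbers can then be used to inspect the neighbouring lines, as Wei suggested, or to write out only the lines that pass the check.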

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
