Chris,

The results I sent are in the order in which I entered the commands in
the PyDev console (Eclipse on Windows XP). So the os.listdir call came
before construction of the NLMSA.

I tried your code and got a lot of sequences longer than 1 000 000. But
I do not understand what exactly the limit is.
Is it the physical size of the file on disk (in this case the hg18, mm8
and rn4 pureseq files take up 3 031 842 KB, 5 204 014 KB and
2 767 7063 KB respectively)? Or is it the size of individual
chromosomes counted in nucleotides?
Anyway, I ran your code and obtained a lot of chromosomes > 1 000 000.
Here is the beginning of the list:
hg18.chr1 247249719
hg18.chr10 135374737
hg18.chr11 134452384
hg18.chr12 132349534
hg18.chr13 114142980
hg18.chr14 106368585
hg18.chr15 100338915
hg18.chr16 88827254
hg18.chr17 78774742
hg18.chr17_random 2617613

I ran it again with a different threshold (>= 2 000 000 000) and was
surprised to find this:
mm8.chrM 2666011489

So I rebuilt mm8 without chrM. And then at the end I got this:
mm8.chr9_random 2666012422 (which was hg18.chr9_random 1146434
before).
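
For reference, both suspicious lengths can be compared with the largest
value a signed 32-bit C int can hold, which is presumably the limit
behind the "long int too large to convert to int" error. This is only a
minimal sketch: find_oversized is a hypothetical helper (not part of
pygr), and the lengths are copied by hand from the output above rather
than read from genomeUnion.seqInfoDict.

```python
# Largest value a signed 32-bit C int can hold. Assumption: this is the
# limit behind "long int too large to convert to int".
INT32_MAX = 2**31 - 1


def find_oversized(lengths, limit=INT32_MAX):
    """Return sorted (name, length) pairs whose length exceeds limit.

    Hypothetical helper for illustration; not part of pygr.
    """
    return sorted((name, n) for name, n in lengths.items() if n > limit)


# Lengths copied from the output reported in this message.
reported = {
    'hg18.chr1': 247249719,
    'hg18.chr9_random': 1146434,
    'mm8.chrM': 2666011489,
    'mm8.chr9_random': 2666012422,
}

for name, n in find_oversized(reported):
    print('%s %d exceeds %d' % (name, n, INT32_MAX))
```

With the real data, the same check could be run on a dictionary built
from the name and info.length pairs in genomeUnion.seqInfoDict.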

Michel


On May 11, 6:56 pm, Christopher Lee <l...@chem.ucla.edu> wrote:
> On May 11, 2009, at 9:28 AM, michel bellis wrote:
>
> > msa = cnestedlist.NLMSA('hs18mm8rn4','w',genomeUnion,os.listdir
> > ('maf'))
>
> > Traceback (most recent call last):
> >  File "<console>", line 1, in <module>
> >  File "cnestedlist.pyx", line 1508, in
> > pygr.cnestedlist.NLMSA.__init__
> >  File "cnestedlist.pyx", line 1735, in
> > pygr.cnestedlist.NLMSA.readMAFfiles
> > OverflowError: long int too large to convert to int
>
> Hi Michel,
> this is odd -- it is claiming that an individual sequence in your  
> union of hg18, mm8 or rn4 is longer than 2GB.  Maybe something was  
> wrong with the reading of the sequence files (e.g. an incompatible  
> carriage return mode?), so that it read an entire genome file as one  
> sequence?  Could you try the following:
>
> for name,info in genomeUnion.seqInfoDict.iteritems():
>      if info.length > 1000000:
>          print name,info.length
>
> Also, is your os.listdir() output from *before* or *after* trying to  
> construct the NLMSA from maf files?
>
> -- Chris
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"pygr-dev" group.
To post to this group, send email to pygr-dev@googlegroups.com
To unsubscribe from this group, send email to 
pygr-dev+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/pygr-dev?hl=en
-~----------~----~----~----~------~----~------~--~---
