Chris,

The results I sent are listed in the order in which I entered them in the PyDev console (Eclipse on Windows XP), so the os.listdir call came before the construction of the NLMSA.

I tried your code and got many entries longer than 1 000 000, but I do not understand exactly what the limit refers to. Is it the physical size of the file on disk (in this case the hg18, mm8 and rn4 pureseq files occupy 3 031 842 KB, 5 204 014 KB and 2 767 7063 KB respectively)? Or is it the size of individual chromosomes counted in nucleotides?

Anyway, I ran your code and obtained a lot of chromosomes > 1 000 000. This is the beginning of the list:

hg18.chr1 247249719
hg18.chr10 135374737
hg18.chr11 134452384
hg18.chr12 132349534
hg18.chr13 114142980
hg18.chr14 106368585
hg18.chr15 100338915
hg18.chr16 88827254
hg18.chr17 78774742
hg18.chr17_random 2617613

I then ran it with a different threshold (>= 2 000 000 000) and was surprised to find this:

mm8.chrM 2666011489

So I reconstructed mm8 without chrM, and then at the end I got this:

mm8.chr9_random 2666012422

(which was hg18.chr9_random 1146434 before).

Michel

On May 11, 6:56 pm, Christopher Lee <l...@chem.ucla.edu> wrote:
> On May 11, 2009, at 9:28 AM, michel bellis wrote:
>
> > msa = cnestedlist.NLMSA('hs18mm8rn4','w',genomeUnion,os.listdir('maf'))
> >
> > Traceback (most recent call last):
> >   File "<console>", line 1, in <module>
> >   File "cnestedlist.pyx", line 1508, in pygr.cnestedlist.NLMSA.__init__
> >   File "cnestedlist.pyx", line 1735, in pygr.cnestedlist.NLMSA.readMAFfiles
> > OverflowError: long int too large to convert to int
>
> Hi Michel,
> this is odd -- it is claiming that an individual sequence in your
> union of hg18, mm8 or rn4 is longer than 2GB. Maybe something was
> wrong with the reading of the sequence files (e.g. an incompatible
> carriage return mode?), so that it read an entire genome file as one
> sequence? Could you try the following:
>
> for name,info in genomeUnion.seqInfoDict.iteritems():
>     if info.length > 1000000:
>         print name,info.length
>
> Also, is your os.listdir() output from *before* or *after* trying to
> construct the NLMSA from maf files?
>
> -- Chris
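
The check quoted above could be narrowed to the limit that the OverflowError seems to point at. The following is only a minimal sketch, not pygr code. It assumes the relevant limit is the largest value of a signed 32-bit C int (2**31 - 1), which is a guess based on the "longer than 2GB" remark, and it works on a plain dict of name -> length. The helper name find_oversized and the constant INT32_MAX are hypothetical; the three lengths are copied from the output reported in this thread.

```python
# Assumption: the limit in question is that of a signed 32-bit C int.
INT32_MAX = 2**31 - 1


def find_oversized(lengths, limit=INT32_MAX):
    """Return (name, length) pairs whose length exceeds limit, sorted by name."""
    return sorted((name, n) for name, n in lengths.items() if n > limit)


# Sequence lengths (in nucleotides) copied from the list reported above.
reported = {
    "hg18.chr1": 247249719,
    "hg18.chr17_random": 2617613,
    "mm8.chrM": 2666011489,
}

for name, n in find_oversized(reported):
    print(name, n, "exceeds", INT32_MAX)
```

With the real data, the dict would presumably be built from the name and info.length pairs of genomeUnion.seqInfoDict, as in the quoted loop. Under the stated assumption, none of the hg18 lengths listed above would be flagged, while the value reported for mm8.chrM would be.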