I am not sure, but maybe something goes wrong during the construction
of the pygr seqlen file: pureseq has, I think, the correct size, but
some lengths in seqlen are wrong (chrM is a small file, 17 KB, but its
length in seqlen is huge).
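
One way to get reference numbers to compare against seqlen is to count the bases per record directly from the FASTA file. This is only a minimal sketch: fasta_lengths is a hypothetical helper, not part of pygr, and it assumes a plain FASTA layout (a '>' header line followed by sequence lines).

```python
def fasta_lengths(path):
    """Return a list of (record_name, sequence_length) for a FASTA file.

    Hypothetical helper for cross-checking; it is not part of pygr.
    """
    lengths = []
    name, count = None, 0
    # Binary mode so that Windows newline translation cannot change the counts.
    handle = open(path, 'rb')
    try:
        for line in handle:
            line = line.strip()  # drops '\n', '\r\n' and surrounding blanks
            if line.startswith(b'>'):
                if name is not None:
                    lengths.append((name, count))
                parts = line[1:].split()
                name = parts[0].decode('ascii') if parts else ''
                count = 0
            elif name is not None:
                count += len(line)
    finally:
        handle.close()
    if name is not None:
        lengths.append((name, count))
    return lengths
```

Calling fasta_lengths on the concatenated genome file gives one (name, length) pair per record, which can then be compared by hand with the lengths that pygr reports.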

First I concatenate the FASTA files with this program:

import os

def concatenate(inputFileDir, outputFile):
    # Append every file found in inputFileDir to outputFile, line by line.
    print('processing ... %s' % inputFileDir)
    outFile = open(outputFile, 'w')
    for fileName in os.listdir(inputFileDir):
        inFile = open(os.path.join(inputFileDir, fileName), 'r')
        for line in inFile:
            outFile.write(line)
        inFile.close()
    outFile.close()
    print('%s is constructed' % outputFile)

if __name__ == '__main__':
    INPUTFILE_DIR = 'D:/data/ucsc/mm8'
    OUTPUTFILE = 'D:/data/ucsc/genome/mm8'
    concatenate(INPUTFILE_DIR, OUTPUTFILE)
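
A plain concatenation like this glues the next '>' header onto the last sequence line whenever an input file does not end with a newline. Whether that happens with the mm8 files is only an assumption, not something confirmed here; the sketch below (files_missing_final_newline is a hypothetical helper) lists the files that would be affected.

```python
import os

def files_missing_final_newline(input_dir):
    """Return the names of non-empty files whose last byte is not a newline."""
    missing = []
    for name in sorted(os.listdir(input_dir)):
        path = os.path.join(input_dir, name)
        # Skip sub-directories and empty files (nothing to seek into).
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            continue
        handle = open(path, 'rb')
        try:
            handle.seek(-1, os.SEEK_END)  # jump to the last byte
            if handle.read(1) != b'\n':
                missing.append(name)
        finally:
            handle.close()
    return missing
```

An empty list for the input directory would rule this cause out.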

Then I construct the pygr files with this program:

import os
from pygr import seqdb

def make_blast_db(inputFileDir):
    # Open every file in inputFileDir as a BlastDB so that pygr indexes it.
    os.chdir(inputFileDir)
    fileList = os.listdir(inputFileDir)
    for fileName in fileList:
        seqdb.BlastDB(fileName)

if __name__ == '__main__':
    # CREATE PYGR RESOURCE
    INPUTFILE_DIR = 'D:/data/ucsc/genome'
    make_blast_db(INPUTFILE_DIR)

On May 12, 3:32 pm, Istvan Albert <istvan.alb...@gmail.com> wrote:
> On May 12, 6:08 am, michel bellis <fill.i...@9online.fr> wrote:
>
> > I tried your code and got a lot of files longer than 1 000 000. But I
> > do not understand what the limit is exactly.
>
> The number refers to the length of the sequence, not the size of the
> file. The limit of a 32-bit signed integer is 2,147,483,647.
>
> The point that Chris was making is that each human chromosome is at
> most 245 million bp long so how could you end up with sequences that
> are over 2 billion long?
>
> Istvan
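
The limit quoted above can be checked mechanically once per-sequence lengths are available. A minimal sketch, assuming the lengths are given as (name, length) pairs; over_int32 is a hypothetical helper, not a pygr function.

```python
INT32_MAX = 2 ** 31 - 1  # 2,147,483,647, the largest signed 32-bit integer

def over_int32(lengths):
    """Return the (name, length) pairs that do not fit in a signed 32-bit integer."""
    return [(name, n) for name, n in lengths if n > INT32_MAX]
```

A chromosome of at most 245 million bp is far below this limit, so any pair returned here points to a sequence that was merged or measured incorrectly.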