You can also try the new indexing programs dbxflat and dbxfasta, which can handle files larger than 2 GB.

Regards,
David.

[EMAIL PROTECTED] wrote on 21/04/2006 17:43:27:

> Hi,
>
> Yes I also index refseq. I think the problem here is that dbiflat
> can only handle files which are less than 2GB. So try splitting the
> files first.
>
> Best,
> Isabelle
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:emboss-
> [EMAIL PROTECTED] On Behalf Of Olivier Friard
> Sent: Friday, April 21, 2006 17:00
> To: [email protected]
> Subject: [EMBOSS] index RefSeq for EMBOSS
>
>
> Hi,
>
> I tried to index the RefSeq database:
>
> 1) I downloaded all
> ftp://ftp.ncbi.nih.gov/refseq/release/complete/complete*.genomic.gbff.gz
> files (GB format)
>
> 2) gunzipped
>
> 3) Added the rs_dna entry to my .embossrc file
>
>
> DB rs_dna [
>    type: "N"
>    method: "emblcd"
>    format: "GB"
>    dir: "/home/users/friard/data/refseq_genomic/"
>    file: "*.gbff"
>    release: ""
>    comment: "RefSeq Genomic (upd)"
>    indexdir: "/home/users/friard/data/refseq_genomic/"
> ]
>
>
> 4) used dbiflat with the following arguments (from the directory where
> the files are stored)
>
> dbiflat
> Index a flat file database
> Database name: rs_dna
>       EMBL : EMBL
>      SWISS : Swiss-Prot, SpTrEMBL, TrEMBLnew
>         GB : Genbank, DDBJ
>     REFSEQ : Refseq
> Entry format [SWISS]: REFSEQ
> Database directory [.]:
> Wildcard database filename [*.dat]: *.gbff
> Release number [0.0]:
> Index date [00/00/00]:
>
> The indexes were created, but when I try to access a sequence (i.e.
> seqret rs_rna:NC_000004) the result is not the correct sequence but
> another one with the NC_000004 ID!
>
>
>
> I also downloaded the files in FASTA format and tried to index them with
> the dbifasta command (format: ncbi) without positive results:
>
> seqret rs_dna:nc_000004
> Reads and writes (returns) sequences
> Error: Unable to read sequence 'rs_dna:nc_000004'
> Died: seqret terminated: Bad value for '-sequence' and no prompt
>
>
> Does anyone index the RefSeq successfully?
> Thank you in advance
>
> --
>
> Olivier Friard
> Laboratorio di Biologia Computazionale
> Facoltà di Scienze MFN
> Università di Torino
> via Accademia Albertina 13, 10124 TORINO (Italy)
>
> tel. +39 011 6704689
>
> _______________________________________________
> EMBOSS mailing list
> [email protected]
> http://lists.open-bio.org/mailman/listinfo/emboss
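
For anyone who wants to stay with dbiflat, the "split the files first" suggestion in this thread could be sketched roughly as follows. This is only an illustration, not an EMBOSS tool: the function name split_flatfile, the ".partNNN" naming scheme and the 1.9 GB default are assumptions made up for this example. It cuts only after a GenBank record terminator line ("//"), so no entry is broken across two files. A single record that is itself larger than the limit still ends up in one oversized chunk.

```python
import os


def split_flatfile(path, out_dir, max_bytes=1_900_000_000):
    """Split a GenBank-format flat file into chunks of at most max_bytes.

    Hypothetical helper: cuts only after a record terminator line ("//").
    Returns the list of chunk file names that were written.
    """
    os.makedirs(out_dir, exist_ok=True)
    stem, ext = os.path.splitext(os.path.basename(path))
    chunks = []      # names of the chunk files created so far
    out = None       # currently open chunk file
    written = 0      # bytes written to the current chunk
    record = []      # lines of the record being read
    record_size = 0  # size in bytes of that record

    def flush_record():
        """Write the buffered record, opening a new chunk if it would not fit."""
        nonlocal out, written, record, record_size
        if out is None or (written > 0 and written + record_size > max_bytes):
            if out is not None:
                out.close()
            # Keep the original extension so a "*.gbff" wildcard still matches.
            name = os.path.join(out_dir, f"{stem}.part{len(chunks) + 1:03d}{ext}")
            chunks.append(name)
            out = open(name, "wb")
            written = 0
        out.writelines(record)
        written += record_size
        record, record_size = [], 0

    with open(path, "rb") as src:
        for line in src:
            record.append(line)
            record_size += len(line)
            if line.rstrip() == b"//":  # end of one entry
                flush_record()

    if record:  # trailing lines without a final "//"
        flush_record()
    if out is not None:
        out.close()
    return chunks


# Example call (paths are placeholders):
# split_flatfile("complete1.genomic.gbff", "split/")
```

Writing the chunks to a separate directory keeps the original oversized file from matching the same wildcard. The "dir:" and "indexdir:" entries in .embossrc would then point at that directory before dbiflat is run again.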
