> From: Richard Holland [mailto:[EMAIL PROTECTED]
>
> This is more likely to be a problem with the count button (usually when
> people report problems with it it reports less than what you get, not
> more...), or a change in the internal structure of the dbSNP database
> since your previous attempt. Do you remember what release of Ensembl your
> previous attempt was made with? Then we can compare the two in the
> archives to see if we can spot any differences in the data.

I believe I downloaded it around Nov. 13 from Ensembl, not Biomart. But
I'm not sure. When I downloaded it, dbSNP was listed as just dbSNP (maybe
version 125?), not dbSNP + Affy etc. Still, I think it's pretty unlikely
that 4M validated SNPs were lost between the two versions.

I've restarted my "notify by mail" query, and will let you know when I get
the results.

-Amir

> cheers,
> Richard
>
> On Wed, February 7, 2007 10:00 pm, Amir Karger wrote:
> >> From: Arek Kasprzyk [mailto:[EMAIL PROTECTED]
> >>
> >> On 7 Feb 2007, at 21:09, Amir Karger wrote:
> >>
> >> > I'm downloading all valid dbSNPs from Biomart.org. I selected ensembl
> >> > variation 42, selected the "valid only" restriction, and left the
> >> > default attributes.
> >> >
> >> > When I ask for a count, there are 5.6M, which is about what I got a few
> >> > months ago when I downloaded a slightly different version from ensembl.
> >> > However, when I do the actual download, I get only 1.9M lines. I
> >> > downloaded as tsv as well as xml.
> >> >
> >> > -Amir Karger
> >>
> >> Amir,
> >> can you try 'notify by email' option when downloading a file
> >> and check if the problem still persists?
> >
> > I used the notify by email mechanism for both. I've been finding that
> > more reliable than the regular download. I'll mention that I downloaded
> > a gz, which successfully unzipped, so I don't think it's a problem in
> > transmission.
> >
> > -Amir
> >
> >> a.
> >>
> >> --------------------------------------------------------------
> >> Arek Kasprzyk
> >> EMBL-European Bioinformatics Institute.
> >> Wellcome Trust Genome Campus, Hinxton,
> >> Cambridge CB10 1SD, UK.
> >> Tel: +44-(0)1223-494606
> >> Fax: +44-(0)1223-494468
> >> --------------------------------------------------------------
>
> --
> Richard Holland
> BioMart (http://www.biomart.org/)
> EMBL-EBI
> Hinxton, Cambridgeshire CB10 1SD, UK
