Dear friends at UCSC,

Earlier this month your helpful replies guided us to the decision to use rsync
to mirror the UCSC hg18 and hg19 MySQL databases locally, for use in my
undergraduate capstone class this quarter.  The rsync of hg19 took a couple of
days, but went without a hitch.  We are successfully using our mirror of that
database now.  But the rsync of hg18 has failed repeatedly.  Here is the
message from the MySQL expert on our support staff who was doing both hg18 and
hg19 for us:

> For whatever reason, the rsync of hg18 keeps failing-- only 300GB 
> having transferred. Seems to occur at or near the same point each 
> time.
> 
> rsync: connection unexpectedly closed (7559869260 bytes received so far) [receiver]
> rsync error: error in rsync protocol data stream (code 12) at io.c(601) [receiver=3.0.8]
> rsync: connection unexpectedly closed (787372 bytes received so far) [generator]
> rsync error: error in rsync protocol data stream (code 12) at io.c(601) [generator=3.0.8]
> 
> I've tried half a dozen times at least. What next? 

Do you have ideas of what might be going wrong?  Suggestions of what to try?
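
In the meantime, one thing we could try is re-running rsync in a loop until it
exits cleanly, since a re-run normally skips files that already arrived intact.
Below is a minimal Python sketch of that idea, not a tested procedure.  The
source URL, the local directory, the timeout, and the function names are all
assumptions for illustration; none of them come from this thread.

```python
import subprocess
import time

# Assumed values, NOT confirmed in this thread: adjust to the real setup.
SOURCE = "rsync://hgdownload.cse.ucsc.edu/mysql/hg18/"  # assumed rsync path
DEST = "/var/lib/mysql/hg18/"                           # hypothetical local dir


def build_rsync_command(source, dest, timeout_s=600):
    """Build an rsync command that keeps partial files and times out on stalls."""
    return [
        "rsync",
        "-av",                     # archive mode, verbose
        "--partial",               # keep partially transferred files
        f"--timeout={timeout_s}",  # give up if no I/O for this many seconds
        source,
        dest,
    ]


def mirror_with_retries(source, dest, max_attempts=20, wait_s=60,
                        run=subprocess.call, sleep=time.sleep):
    """Re-run rsync until it exits with 0; return the number of attempts used.

    `run` and `sleep` are parameters only so the loop can be tried out
    without touching the network.
    """
    cmd = build_rsync_command(source, dest)
    for attempt in range(1, max_attempts + 1):
        code = run(cmd)
        if code == 0:
            return attempt
        print(f"attempt {attempt}: rsync exited with code {code}")
        if attempt < max_attempts:
            sleep(wait_s)  # pause before reconnecting
    raise RuntimeError(f"rsync still failing after {max_attempts} attempts")


# Example call (commented out so nothing runs by accident):
# mirror_with_retries(SOURCE, DEST)
```

This would not explain why the connection drops at the same point each time,
but it might at least let the transfer get past that point.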

Thanks.
Martin.

> -----Original Message-----
> From: Maximilian Haussler [mailto:[email protected]] 
> Sent: Tuesday, April 05, 2011 2:41 PM
> To: Hiram Clawson
> Cc: Martin Tompa; [email protected]
> Subject: Re: [Genome] possibly excessive MySQL queries
> 
> Hi Martin,
> 
> we've had a similar question on Biostar recently and the 
> person finally found it easier to mirror the UCSC mysql 
> database than to bother with remote access. If you already 
> have a mysql server running somewhere, mirroring the ucsc 
> database for e.g. hg18 requires only one single rsync command.
> 
> Given that you don't want to risk that the mysql access to 
> ucsc directly gets blocked during the course or just 2 hours 
> before they have to hand in their exercises (which is likely, 
> because they will all start 2 hours before the deadline :-), 
> the best solution could be a local mirror of the database 
> (not the genome browser website, only the mysql database itself).
> 
> The biostar thread contains the required command:
> http://biostar.stackexchange.com/questions/4552/getting-ucsc-data-via-mysql/4554#4554
> 
> hope this helps
> cheers
> Max
> --
> Maximilian Haussler
> Office:+44 161 27 55980 Mob: +44 7574 246 789 
> http://www.manchester.ac.uk/research/maximilian.haussler/
> 
> 
> 
> 
> On Tue, Apr 5, 2011 at 10:51 PM, Hiram Clawson 
> <[email protected]> wrote:
> >
> > You could also use the sql definition text files from hgdownload.
> > http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/*.sql
> >
> > also available via FTP and rsync.
> >
> > You could rsync all of these .sql files to a local directory and allow
> > everyone to use local files.
> >
> > If you want to run MySQL exercises, you should use small samples from
> > small tables.  Running exercises over an entire database is an immense
> > amount of work.  There are several hundred Gb of data in hg19.
> >
> > --Hiram
> >
> > ----- Original Message -----
> > From: "robert kuhn" <[email protected]>
> > To: "Martin Tompa" <[email protected]>
> > Cc: "[email protected]" <[email protected]>
> > Sent: Tuesday, April 5, 2011 1:43:02 PM
> > Subject: Re: [Genome] possibly excessive MySQL queries
> >
> > Hi, Martin,
> >
> > thanks for asking.  That might add up to an awful lot of queries if
> > you are using a human assembly.  There are 1000s of tables in there.
> > You might consider parsing the trackDb table first, because the 
> > entries _______________________________________________
> > Genome maillist  -  [email protected] 
> > https://lists.soe.ucsc.edu/mailman/listinfo/genome
> >
> 