Dear friends at UCSC,

We've been happily using our mirror of the UCSC mysql database for my computational biology capstone class this quarter. I now have a related but different question.

We now need to query function values from wig and bigWig files that reside on the UCSC web site but are external to the mysql database we've mirrored. Rather than try to download all these files (for which I'm not sure our class can obtain sufficient additional storage), we were wondering whether we could network-mount the directories at UCSC containing the appropriate files. Is this against your policy, perhaps because of limited bandwidth that would impact other users?

Thanks for your advice.
Martin.

> ----- Original Message -----
> From: "Martin Tompa" <[email protected]>
> To: "Maximilian Haussler" <[email protected]>, "Hiram Clawson" <[email protected]>
> Cc: [email protected], "Martin Tompa" <[email protected]>
> Sent: Friday, April 22, 2011 10:13:49 PM
> Subject: RE: [Genome] possibly excessive MySQL queries
>
> Dear friends at UCSC,
>
> Earlier this month your helpful replies guided us to the decision
> to use rsync to mirror the UCSC hg18 and hg19 mysql databases
> locally, for use in my undergraduate capstone class this quarter.
> The rsync of hg19 took a couple of days, but went without a hitch.
> We are successfully using our mirror of that database now. But the
> rsync of hg18 has failed repeatedly. Here is the message from the
> mysql expert on our support staff who was doing both hg18 and hg19
> for us:
>
> > For whatever reason, the rsync of hg18 keeps failing-- only 300GB
> > having transferred. Seems to occur at or near the same point each
> > time.
> >
> > rsync: connection unexpectedly closed (7559869260 bytes received so far) [receiver]
> > rsync error: error in rsync protocol data stream (code 12) at io.c(601) [receiver=3.0.8]
> > rsync: connection unexpectedly closed (787372 bytes received so far) [generator]
> > rsync error: error in rsync protocol data stream (code 12) at io.c(601) [generator=3.0.8]
> >
> > I've tried half a dozen times at least. What next?
>
> Do you have ideas of what might be going wrong? Suggestions of
> what to try?
>
> Thanks.
> Martin.
>
> > -----Original Message-----
> > From: Maximilian Haussler [mailto:[email protected]]
> > Sent: Tuesday, April 05, 2011 2:41 PM
> > To: Hiram Clawson
> > Cc: Martin Tompa; [email protected]
> > Subject: Re: [Genome] possibly excessive MySQL queries
> >
> > Hi Martin,
> >
> > we've had a similar question on Biostar recently and the person
> > finally found it easier to mirror the UCSC mysql database than to
> > bother with remote access. If you already have a mysql server
> > running somewhere, mirroring the ucsc database for e.g. hg18
> > requires only one single rsync command.
> >
> > Given that you don't want to risk that the mysql access to ucsc
> > directly gets blocked during the course or just 2 hours before they
> > have to hand in their exercises (which is likely, because they will
> > all start 2 hours before the deadline :-), the best solution could
> > be a local mirror of the database (not the genome browser website,
> > only the mysql database itself).
> >
> > The biostar thread contains the required command:
> > http://biostar.stackexchange.com/questions/4552/getting-ucsc-data-via-mysql/4554#4554
> >
> > hope this helps
> > cheers
> > Max
> > --
> > Maximilian Haussler
> > Office:+44 161 27 55980 Mob: +44 7574 246 789
> > http://www.manchester.ac.uk/research/maximilian.haussler/
> >
> > On Tue, Apr 5, 2011 at 10:51 PM, Hiram Clawson <[email protected]> wrote:
> > >
> > > You could also use the sql definition text files from hgdownload.
> > > http://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/*.sql
> > >
> > > also available via FTP and rsync.
> > >
> > > You could rsync all of these .sql files to a local directory and
> > > allow everyone to use local files.
> > >
> > > If you want to run MySQL exercises, you should use small samples
> > > from small tables. Running exercises over an entire database is an
> > > immense amount of work. There are several hundred Gb of data in
> > > hg19.
> > >
> > > --Hiram
> > >
> > > ----- Original Message -----
> > > From: "robert kuhn" <[email protected]>
> > > To: "Martin Tompa" <[email protected]>
> > > Cc: "[email protected]" <[email protected]>
> > > Sent: Tuesday, April 5, 2011 1:43:02 PM
> > > Subject: Re: [Genome] possibly excessive MySQL queries
> > >
> > > Hi, Martin,
> > >
> > > thanks for asking. That might add up to an awful lot of queries if
> > > you are using a human assembly. there are 1000s of tables in there.
> > > You might consider parsing the trackDb table first, because the
> > > entries
> > > _______________________________________________
> > > Genome maillist - [email protected]
> > > https://lists.soe.ucsc.edu/mailman/listinfo/genome
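
The single-rsync-command mirroring discussed in this thread, combined with the dropped-connection failure (rsync exit code 12) reported for hg18, suggests simply re-running the transfer until it completes. Below is a minimal Python sketch of that idea, not a tested recipe. The rsync module path (`mysql/<db>/`), the flag choices, and the function names `build_rsync_command` and `mirror_with_retries` are assumptions for illustration; the exact command should be taken from the Biostar answer linked above. A re-run of rsync skips files that already transferred, and `--partial` keeps an interrupted file so it need not restart from zero.

```python
import subprocess
import time
from typing import Callable, List, Sequence


def build_rsync_command(db: str, dest_dir: str,
                        host: str = "hgdownload.cse.ucsc.edu") -> List[str]:
    """Build an rsync command that mirrors one database directory.

    The module path "mysql/<db>/" is an assumption; verify it against
    the Biostar answer or the UCSC download instructions before use.
    """
    return [
        "rsync",
        "-av",            # archive mode, verbose output
        "--partial",      # keep partially transferred files for the next attempt
        "--timeout=600",  # abort if no I/O happens for 10 minutes
        f"rsync://{host}/mysql/{db}/",
        dest_dir,
    ]


def mirror_with_retries(cmd: Sequence[str],
                        max_attempts: int = 10,
                        wait_seconds: float = 60.0,
                        run: Callable[[List[str]], int] = subprocess.call,
                        sleep: Callable[[float], None] = time.sleep) -> int:
    """Run cmd until it exits with 0 or max_attempts is reached.

    Returns the exit code of the last attempt. The run and sleep
    parameters exist only so the loop can be exercised without a network.
    """
    code = -1
    for attempt in range(1, max_attempts + 1):
        code = run(list(cmd))
        if code == 0:
            return 0
        print(f"attempt {attempt} failed with exit code {code}")
        if attempt < max_attempts:
            sleep(wait_seconds)
    return code
```

A call such as `mirror_with_retries(build_rsync_command("hg18", "/var/lib/mysql/hg18/"))` would start the transfer; the destination directory here is hypothetical and depends on the local MySQL data directory. Retrying does not explain why the connection closes at the same point each time, so it is a workaround rather than a diagnosis.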
