Thanks Namshin, the MSA download seems to be proceeding without any problems.
I'll keep you posted on any issues I may encounter. Also, it looks like the
resources are no longer built as BlastDBs.

Thanks,
Paul

On Sun, Sep 6, 2009 at 4:52 AM, Namshin Kim <deepr...@gmail.com> wrote:
> Oops... I figured out what caused the problem.
>
> We had added a UniProt mnemonic keyword for every genome assembly. But for
> ponAbe2, it looks like I used PONPA before, but now I used PONAB. Thus, it
> was pointing to different locations...
>
> >>> worldbase.dir('ponAbe2', matchType='r')
> ['Bio.MSA.UCSC.ponAbe2_multiz8way', 'Bio.MSA.UCSC.ponAbe2_pairwiseCalJac1',
> 'Bio.MSA.UCSC.ponAbe2_pairwiseGalGal3', 'Bio.MSA.UCSC.ponAbe2_pairwiseHg18',
> 'Bio.MSA.UCSC.ponAbe2_pairwiseHg19', 'Bio.MSA.UCSC.ponAbe2_pairwiseMm9',
> 'Bio.MSA.UCSC.ponAbe2_pairwiseMonDom4',
> 'Bio.MSA.UCSC.ponAbe2_pairwiseOrnAna1',
> 'Bio.MSA.UCSC.ponAbe2_pairwisePanTro2',
> 'Bio.MSA.UCSC.ponAbe2_pairwiseRheMac2', 'Bio.Seq.Genome.PONAB.ponAbe2',
> 'Bio.Seq.Genome.PONPA.ponAbe2']
>
> Now I have fixed it, and PONAB.ponAbe2 and PONPA.ponAbe2 will reference the
> same URL.
>
> Eventually, I will have to remove PONPA.ponAbe2 and replace all referenced
> NLMSAs.
>
> Sometimes there is no mnemonic keyword until UniProt decides to assign one,
> so I used a temporary keyword...
>
> Yours,
> Namshin Kim
>
> On Sat, Sep 5, 2009 at 3:04 PM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>
>> Hi Namshin,
>> Sorry, no dice. Still getting the same error:
>> INFO downloader.download_unpickler: Beginning download of
>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/ponAbe2.tar.gz...
>> INFO downloader.download_monitor: downloaded 8192 bytes (3593.0%)...
>> INFO downloader.download_unpickler: Download done.
>> INFO downloader.uncompress_file: untarring
>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/ponAbe2.tar.gz...
>>
>> The chromFa.tar.gz file still does not exist.
>>
>> I made sure to start the download in a newly created folder. Also, the
>> WORLDBASEPATH env variable was unset.
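
[Editorial sketch] The log above reports only 8192 bytes received before untarring begins, and the earlier attempt quoted further down ended in "ReadError: not a gzip file". That would be consistent with the server returning something other than an archive (an error page, for instance; that is a guess, the log does not confirm it). A minimal way to check a downloaded file before unpacking it is to look at its first two bytes. This uses only the Python standard library, not pygr; the helper names `looks_like_gzip` and `describe_download` are hypothetical and are not part of pygr's downloader.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def describe_download(path):
    """Summarize a downloaded file: size in bytes and whether it looks gzipped."""
    return {"size": os.path.getsize(path), "gzip": looks_like_gzip(path)}


if __name__ == "__main__":
    # Build one real gzip file and one fake "archive" to compare.
    workdir = tempfile.mkdtemp()

    good = os.path.join(workdir, "good.tar.gz")
    with gzip.open(good, "wb") as out:
        out.write(b"ACGT" * 100)

    bad = os.path.join(workdir, "bad.tar.gz")
    with open(bad, "wb") as out:
        out.write(b"<html>404 Not Found</html>")  # stand-in for an error page

    print(describe_download(good))
    print(describe_download(bad))
```

Running a check like this on the saved ponAbe2.tar.gz would show whether the file is a truncated archive or not an archive at all, which narrows down whether the stored URL or the transfer is at fault.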
>>
>> On Fri, Sep 4, 2009 at 7:05 PM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>
>>> Great, I'm re-running my script now. Will keep you posted.
>>> Paul
>>>
>>> On Fri, Sep 4, 2009 at 6:54 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>
>>>> Hi Paul,
>>>> Would you test it again?
>>>>
>>>> I made a mistake when I first updated the server: I saved downloadable
>>>> resources into WORLDBASEPATH (downloadable resources should be saved into
>>>> a different path, separate from WORLDBASEPATH). I thought I had deleted
>>>> all of them, but I forgot to delete the .fasta and .txt ones, the actual
>>>> SourceURLs in WORLDBASEPATH.
>>>>
>>>> Now it is clean. Would you test it again?
>>>>
>>>> Thanks!
>>>> Namshin Kim
>>>>
>>>> On Sat, Sep 5, 2009 at 10:08 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>>
>>>>> Hi Namshin,
>>>>> I just wanted to point out that the MSA pulls the incorrect URL for
>>>>> ponAbe2. For example, if I now attempt to download that genome separately
>>>>> from the UCLA XMLRPC, the downloader obtains the correct file:
>>>>>
>>>>> import os
>>>>> os.environ['WORLDBASEPATH'] = '.,http://biodb2.bioinformatics.ucla.edu:5000'
>>>>> from pygr import worldbase
>>>>> g = worldbase('Bio.Seq.Genome.PONAB.ponAbe2',download=True)
>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to
>>>>> /tmp/ponAbe2.gz..
>>>>>
>>>>> However, as you can see from the error log of the MSA resource
>>>>> download, the downloader is pulling the wrong ponAbe2 file from the biodb
>>>>> server:
>>>>>
>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
>>>>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/ponAbe2.tar.gz...
>>>>>
>>>>> Thanks for looking into this!!
>>>>>
>>>>> Paul
>>>>>
>>>>> On Fri, Sep 4, 2009 at 2:43 PM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>>>
>>>>>> Hi Namshin,
>>>>>>
>>>>>> Ah, gotcha.
>>>>>> So the downloader picks up the binaries from biodb. So when using the
>>>>>> XMLRPC, it unpickles the URL for the tarballs. But I'm still getting an
>>>>>> error when the downloader attempts to pick up the ponAbe2 tar file.
>>>>>> Could you fix the path on your server, since the URL stored by the
>>>>>> worldbase XMLRPC server is wrong?
>>>>>>
>>>>>> Please see the attached log.
>>>>>>
>>>>>> Thanks,
>>>>>> Paul
>>>>>>
>>>>>> On Thu, Sep 3, 2009 at 6:29 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Paul,
>>>>>>> Actual files for all downloadable resources are saved in biodb, not
>>>>>>> biodb2. biodb2 has binary files for genomes and NLMSAs, and is not
>>>>>>> accessible via http. The files are too big to save on one server...
>>>>>>>
>>>>>>> biodb - only for downloading via http
>>>>>>> biodb2 - only for accessing via xmlrpc
>>>>>>>
>>>>>>> And here are the biodb URLs for public resources. You can download
>>>>>>> files from the URLs below.
>>>>>>>
>>>>>>> http://biodb.bioinformatics.ucla.edu/PYGRDATA # for text converted binaries (NLMSA)
>>>>>>> http://biodb.bioinformatics.ucla.edu/MEGATEST # for megatest files
>>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES # for genome assemblies, compressed files
>>>>>>>
>>>>>>> Yours,
>>>>>>> Namshin Kim
>>>>>>>
>>>>>>> On Fri, Sep 4, 2009 at 10:20 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Namshin,
>>>>>>>> Downloading the genome works for me too. But downloading the
>>>>>>>> multiple alignment doesn't pull all the necessary resources.
>>>>>>>>
>>>>>>>> However, I think you might have missed the latter part of my email,
>>>>>>>> in which I described a possible error with the hard-coded XMLRPC
>>>>>>>> server URL, which points to 'biodb' instead of 'biodb2' as you've
>>>>>>>> described in your instructions on this thread.
>>>>>>>>
>>>>>>>> I'm now starting to download the 44way alignment separately. Even
>>>>>>>> though I've explicitly set the WORLDBASEPATH to use the 'biodb2'
>>>>>>>> server, the URL still points to 'biodb'. See the log below. So I'm
>>>>>>>> guessing that the XMLRPC server needs to be updated to reflect the
>>>>>>>> actual locations of the gzipped pickled files.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Paul
>>>>>>>>
>>>>>>>> ===log===
>>>>>>>> (2.6.2)06:00 PM 20129
>>>>>>>> pri...@mine-17/extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way
>>>>>>>> $ time python downloadMSA.py
>>>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>>>> http://biodb.bioinformatics.ucla.edu/PYGRDATA/hg18_multiz44way.txt.gz to
>>>>>>>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/hg18_multiz44way.txt.gz...
>>>>>>>>
>>>>>>>> INFO downloader.download_monitor: downloaded 7628292096 bytes (10.0%)...
>>>>>>>>
>>>>>>>> On Thu, Sep 3, 2009 at 6:10 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Paul,
>>>>>>>>> I am testing it and there is no problem.
>>>>>>>>>
>>>>>>>>> >>> from pygr import worldbase
>>>>>>>>> >>> ponAbe2 = worldbase.Bio.Seq.Genome.PONAB.ponAbe2(download=True)
>>>>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to
>>>>>>>>> /home/deepreds/test/ponAbe2.gz...
>>>>>>>>>
>>>>>>>>> And you may need to unset PYGRDATAPATH as well.
>>>>>>>>>
>>>>>>>>> Yours,
>>>>>>>>> Namshin Kim
>>>>>>>>>
>>>>>>>>> On Fri, Sep 4, 2009 at 10:04 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Namshin,
>>>>>>>>>> It's not set; my script sets that environment variable to
>>>>>>>>>> '.,http://biodb2.bioinformatics.ucla.edu:5000'. I've made sure to
>>>>>>>>>> remove .pygrdata files, etc.
>>>>>>>>>> Thanks
>>>>>>>>>> Paul
>>>>>>>>>>
>>>>>>>>>> On Thu, Sep 3, 2009 at 6:02 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Paul,
>>>>>>>>>>> Would you please give your WORLDBASEPATH?
>>>>>>>>>>>
>>>>>>>>>>> $ echo $WORLDBASEPATH
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Namshin Kim
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Sep 4, 2009 at 9:54 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Namshin,
>>>>>>>>>>>> I'm still encountering the same error as before regarding the
>>>>>>>>>>>> download of genome resources. Downloading the 44way alignment also
>>>>>>>>>>>> does not download the necessary dependencies. Could there be
>>>>>>>>>>>> something wrong with the URL paths saved in the resources? You
>>>>>>>>>>>> specified 'biodb2', but during the download, the URLs are directed
>>>>>>>>>>>> to 'biodb'.
>>>>>>>>>>>>
>>>>>>>>>>>> With WORLDBASEPATH not set, I get the following error.
>>>>>>>>>>>>
>>>>>>>>>>>> Please see:
>>>>>>>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>>>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
>>>>>>>>>>>> /extra/baldig1/genomics/pygrdata/genomes/ponAbe2.tar.gz...
>>>>>>>>>>>> INFO downloader.download_monitor: downloaded 8192 bytes (3593.0%)...
>>>>>>>>>>>> INFO downloader.download_unpickler: Download done.
>>>>>>>>>>>> INFO downloader.uncompress_file: untarring
>>>>>>>>>>>> /extra/baldig1/genomics/pygrdata/genomes/ponAbe2.tar.gz...
>>>>>>>>>>>> ...
>>>>>>>>>>>> ReadError: not a gzip file
>>>>>>>>>>>>
>>>>>>>>>>>> However, when I set my WORLDBASEPATH to use the biodb2 server,
>>>>>>>>>>>> the download is successful. So there's something wrong with the
>>>>>>>>>>>> hardcoded URL that's distributed with pygr (0.8 beta). But perhaps
>>>>>>>>>>>> the xmlrpc resources on the server should be fixed as well.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Paul
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Sep 3, 2009 at 4:22 PM, cjlee112 <cjlee...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, please pass on to Namshin any debugging information you
>>>>>>>>>>>>> can supply.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sep 3, 2009, at 3:26 PM, Paul Rigor (gmail) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> > Hi Chris,
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Does this mean that the 44way alignment distributed by the UCLA
>>>>>>>>>>>>> > server is foobar, since I ran exactly those same commands and no
>>>>>>>>>>>>> > genome dependencies were downloaded?
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Thanks,
>>>>>>>>>>>>> > Paul

--
Paul Rigor
Graduate Student
Institute for Genomics and Bioinformatics
Donald Bren School of Information and Computer Sciences
University of California, Irvine
http://www.paulrigor.net/
http://www.ics.uci.edu/~prigor

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "pygr-dev" group.
To post to this group, send email to pygr-dev@googlegroups.com
To unsubscribe from this group, send email to pygr-dev+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/pygr-dev?hl=en
-~----------~----~----~----~------~----~------~--~---
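
[Editorial sketch] The root cause identified in this thread was one assembly (ponAbe2) registered under two UniProt mnemonics, PONAB and PONPA, whose entries pointed to different locations. A listing like the `worldbase.dir()` output quoted above could be scanned for that situation with plain string handling. This is only a sketch: it assumes genome IDs follow the 'Bio.Seq.Genome.<MNEMONIC>.<assembly>' pattern seen in that listing, it does not call pygr, and the function names are hypothetical.

```python
from collections import defaultdict

GENOME_PREFIX = "Bio.Seq.Genome."


def mnemonics_by_assembly(resource_ids):
    """Map each assembly name to the sorted list of mnemonics used for it.

    Assumes genome IDs look like 'Bio.Seq.Genome.<MNEMONIC>.<assembly>',
    as in the listing quoted in this thread.
    """
    found = defaultdict(set)
    for rid in resource_ids:
        if not rid.startswith(GENOME_PREFIX):
            continue  # skip alignments and other non-genome resources
        parts = rid[len(GENOME_PREFIX):].split(".")
        if len(parts) != 2:
            continue  # not in the assumed MNEMONIC.assembly form
        mnemonic, assembly = parts
        found[assembly].add(mnemonic)
    return {assembly: sorted(names) for assembly, names in found.items()}


def duplicated_assemblies(resource_ids):
    """Return only the assemblies registered under more than one mnemonic."""
    grouped = mnemonics_by_assembly(resource_ids)
    return {a: m for a, m in grouped.items() if len(m) > 1}


if __name__ == "__main__":
    # A subset of the IDs from the listing quoted in the thread.
    listing = [
        "Bio.MSA.UCSC.ponAbe2_multiz8way",
        "Bio.MSA.UCSC.ponAbe2_pairwiseHg18",
        "Bio.Seq.Genome.PONAB.ponAbe2",
        "Bio.Seq.Genome.PONPA.ponAbe2",
    ]
    print(duplicated_assemblies(listing))
```

Any assembly reported here would be a candidate for the cleanup Namshin mentions, namely removing the old key and updating the NLMSAs that reference it.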