Oops... I figured out what caused the problem. We had added a UniProt mnemonic keyword for every genome assembly. But for ponAbe2, it looks like I used PONPA before and PONAB now, so the two names were pointing to different locations...
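A duplicate like this can be spotted mechanically from a list of resource IDs. Below is a minimal sketch in plain Python; it makes no pygr calls. The helper name find_duplicate_assemblies is hypothetical, the ID list is a hand-copied subset of the worldbase.dir listing in this message, and the sketch assumes genome IDs have the form Bio.Seq.Genome.MNEMONIC.assembly.

```python
from collections import defaultdict

GENOME_PREFIX = "Bio.Seq.Genome."


def find_duplicate_assemblies(resource_ids):
    """Return {assembly: [mnemonics]} for assemblies registered under
    more than one mnemonic. Hypothetical helper, not part of pygr."""
    mnemonics = defaultdict(set)
    for rid in resource_ids:
        if not rid.startswith(GENOME_PREFIX):
            continue  # skip NLMSA and any other non-genome resources
        parts = rid[len(GENOME_PREFIX):].split(".")
        if len(parts) != 2:
            continue  # not of the assumed form MNEMONIC.assembly
        mnemonic, assembly = parts
        mnemonics[assembly].add(mnemonic)
    return {a: sorted(m) for a, m in mnemonics.items() if len(m) > 1}


# Subset of the IDs from the listing, copied by hand for illustration.
ids = [
    "Bio.MSA.UCSC.ponAbe2_multiz8way",
    "Bio.MSA.UCSC.ponAbe2_pairwiseHg18",
    "Bio.Seq.Genome.PONAB.ponAbe2",
    "Bio.Seq.Genome.PONPA.ponAbe2",
]
print(find_duplicate_assemblies(ids))  # {'ponAbe2': ['PONAB', 'PONPA']}
```

Any assembly that shows up with two mnemonics is a candidate for the kind of mismatch described above, where an NLMSA and a direct genome request resolve to different download URLs.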
>>> worldbase.dir('ponAbe2', matchType='r')
['Bio.MSA.UCSC.ponAbe2_multiz8way',
 'Bio.MSA.UCSC.ponAbe2_pairwiseCalJac1',
 'Bio.MSA.UCSC.ponAbe2_pairwiseGalGal3',
 'Bio.MSA.UCSC.ponAbe2_pairwiseHg18',
 'Bio.MSA.UCSC.ponAbe2_pairwiseHg19',
 'Bio.MSA.UCSC.ponAbe2_pairwiseMm9',
 'Bio.MSA.UCSC.ponAbe2_pairwiseMonDom4',
 'Bio.MSA.UCSC.ponAbe2_pairwiseOrnAna1',
 'Bio.MSA.UCSC.ponAbe2_pairwisePanTro2',
 'Bio.MSA.UCSC.ponAbe2_pairwiseRheMac2',
 'Bio.Seq.Genome.PONAB.ponAbe2',
 'Bio.Seq.Genome.PONPA.ponAbe2']

Now I have fixed it, and PONAB.ponAbe2 and PONPA.ponAbe2 will reference the same URL. Eventually, I will have to remove PONPA.ponAbe2 and replace all NLMSAs that reference it. Sometimes there is no mnemonic keyword until UniProt decides to assign one, so I used a temporary keyword...

Yours,
Namshin Kim

On Sat, Sep 5, 2009 at 3:04 PM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:

> Hi Namshin,
> Sorry, no dice. Still getting the same error:
> INFO downloader.download_unpickler: Beginning download of
> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/ponAbe2.tar.gz...
> INFO downloader.download_monitor: downloaded 8192 bytes (3593.0%)...
> INFO downloader.download_unpickler: Download done.
> INFO downloader.uncompress_file: untarring
> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/ponAbe2.tar.gz...
>
> The chromFa.tar.gz file still does not exist.
>
> I made sure to start the download in a newly created folder. Also, the
> WORLDBASEPATH env variable was unset.
>
> On Fri, Sep 4, 2009 at 7:05 PM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>
>> Great, I'm re-running my script now. Will keep you posted
>> Paul
>>
>> On Fri, Sep 4, 2009 at 6:54 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>
>>> Hi Paul,
>>> Would you test it again?
>>>
>>> I had made a mistake when I first updated the server, saving downloadable
>>> resources into WORLDBASEPATH (downloadable resources should be saved into
>>> a different path, separate from WORLDBASEPATH). I thought I had deleted all of
>>> them, but I forgot to delete the .fasta and .txt ones, the actual SourceURLs in
>>> WORLDBASEPATH.
>>>
>>> Now, it is clean. Would you test it again?
>>>
>>> Thanks!
>>> Namshin Kim
>>>
>>> On Sat, Sep 5, 2009 at 10:08 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>
>>>> Hi Namshin,
>>>> I just wanted to point out that the MSA pulls the incorrect URL for
>>>> ponAbe2. For example, if I now attempt to download that genome separately
>>>> from the UCLA XMLRPC, the downloader obtains the correct file:
>>>>
>>>> import os
>>>> os.environ['WORLDBASEPATH'] = '.,http://biodb2.bioinformatics.ucla.edu:5000'
>>>> from pygr import worldbase
>>>> g = worldbase('Bio.Seq.Genome.PONAB.ponAbe2',download=True)
>>>> INFO downloader.download_unpickler: Beginning download of
>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to
>>>> /tmp/ponAbe2.gz..
>>>>
>>>> However, as you can see from the error log of the MSA resource download,
>>>> the downloader is pulling the wrong ponAbe2 file from the biodb server:
>>>>
>>>> INFO downloader.download_unpickler: Beginning download of
>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
>>>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/ponAbe2.tar.gz...
>>>>
>>>> Thanks for looking into this!!
>>>>
>>>> Paul
>>>>
>>>> On Fri, Sep 4, 2009 at 2:43 PM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>>
>>>>> Hi Namshin,
>>>>>
>>>>> Ah, gotcha.
>>>>> So the downloader picks up the binaries from biodb. So when using the
>>>>> XMLRPC, it unpickles the URL for the tarballs. But I'm still getting an
>>>>> error when the downloader attempts to pick up the ponAbe2 tar file.
>>>>> Could you fix the path on your server, since the URL stored by the
>>>>> worldbase XMLRPC server is wrong?
>>>>>
>>>>> Please see the attached log.
>>>>>
>>>>> Thanks,
>>>>> Paul
>>>>>
>>>>> On Thu, Sep 3, 2009 at 6:29 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>
>>>>>> Hi Paul,
>>>>>> Actual files for all downloadable resources are saved in biodb, not
>>>>>> biodb2. biodb2 has binary files for genomes and NLMSA, and is not accessible
>>>>>> via http. The files are too big to save on one server...
>>>>>>
>>>>>> biodb - only for downloading via http
>>>>>> biodb2 - only for accessing via xmlrpc
>>>>>>
>>>>>> And here are the biodb URLs for public resources. You can download files
>>>>>> from the URLs below.
>>>>>>
>>>>>> http://biodb.bioinformatics.ucla.edu/PYGRDATA # for text converted binaries (NLMSA)
>>>>>> http://biodb.bioinformatics.ucla.edu/MEGATEST # for megatest files
>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES # for genome assemblies, compressed files
>>>>>>
>>>>>> Yours,
>>>>>> Namshin Kim
>>>>>>
>>>>>> On Fri, Sep 4, 2009 at 10:20 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Namshin,
>>>>>>> Downloading the genome works for me too. But downloading the
>>>>>>> multiple alignment doesn't pull all the necessary resources.
>>>>>>>
>>>>>>> However, I think you might have missed the latter part of my email, in
>>>>>>> which I described a possible error with the hard-coded XMLRPC server URL,
>>>>>>> which points to 'biodb' instead of 'biodb2' as you've described in your
>>>>>>> instructions on this thread.
>>>>>>>
>>>>>>> I'm now starting to download the 44way alignment separately. Now,
>>>>>>> even though I've explicitly set the WORLDBASEPATH to use the 'biodb2'
>>>>>>> server, the URL still points to 'biodb'. See the log below. So I'm
>>>>>>> guessing that the XMLRPC server needs to be updated to reflect the
>>>>>>> actual locations of the gzipped pickled files.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Paul
>>>>>>>
>>>>>>> ===log===
>>>>>>> (2.6.2)06:00 PM 20129
>>>>>>> pri...@mine-17/extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way
>>>>>>> $ time python downloadMSA.py
>>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>>> http://biodb.bioinformatics.ucla.edu/PYGRDATA/hg18_multiz44way.txt.gz to
>>>>>>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/hg18_multiz44way.txt.gz...
>>>>>>>
>>>>>>> INFO downloader.download_monitor: downloaded 7628292096 bytes (10.0%)...
>>>>>>>
>>>>>>> On Thu, Sep 3, 2009 at 6:10 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Paul,
>>>>>>>> I am testing it and there is no problem.
>>>>>>>>
>>>>>>>> >>> from pygr import worldbase
>>>>>>>> >>> ponAbe2 = worldbase.Bio.Seq.Genome.PONAB.ponAbe2(download=True)
>>>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to
>>>>>>>> /home/deepreds/test/ponAbe2.gz...
>>>>>>>>
>>>>>>>> And, you may need to unset PYGRDATAPATH as well.
>>>>>>>>
>>>>>>>> Yours,
>>>>>>>> Namshin Kim
>>>>>>>>
>>>>>>>> On Fri, Sep 4, 2009 at 10:04 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Namshin,
>>>>>>>>> It's not set; my script sets that environment variable to
>>>>>>>>> '.,http://biodb2.bioinformatics.ucla.edu:5000'. I've made sure to
>>>>>>>>> remove .pygrdata files, etc.
>>>>>>>>> Thanks
>>>>>>>>> Paul
>>>>>>>>>
>>>>>>>>> On Thu, Sep 3, 2009 at 6:02 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Paul,
>>>>>>>>>> Would you please give your WORLDBASEPATH?
>>>>>>>>>>
>>>>>>>>>> $ echo $WORLDBASEPATH
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Namshin Kim
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 4, 2009 at 9:54 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Namshin,
>>>>>>>>>>> I'm still encountering the same error as before regarding the
>>>>>>>>>>> download of genome resources. Downloading the 44way alignment also
>>>>>>>>>>> does not download the necessary dependencies. Could there be something
>>>>>>>>>>> wrong with the URL paths saved in the resources? You specified 'biodb2',
>>>>>>>>>>> but during the download the URLs are directed to 'biodb'.
>>>>>>>>>>>
>>>>>>>>>>> With WORLDBASEPATH not set, I get the following error.
>>>>>>>>>>>
>>>>>>>>>>> Please see:
>>>>>>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
>>>>>>>>>>> /extra/baldig1/genomics/pygrdata/genomes/ponAbe2.tar.gz...
>>>>>>>>>>> INFO downloader.download_monitor: downloaded 8192 bytes (3593.0%)...
>>>>>>>>>>> INFO downloader.download_unpickler: Download done.
>>>>>>>>>>> INFO downloader.uncompress_file: untarring
>>>>>>>>>>> /extra/baldig1/genomics/pygrdata/genomes/ponAbe2.tar.gz...
>>>>>>>>>>> ...
>>>>>>>>>>> ReadError: not a gzip file
>>>>>>>>>>>
>>>>>>>>>>> However, when I set my WORLDBASEPATH to use the biodb2 server,
>>>>>>>>>>> the download is successful. So there's something wrong with the
>>>>>>>>>>> hardcoded URL that's distributed with pygr (0.8 beta). But perhaps the
>>>>>>>>>>> xmlrpc resources on the server should be fixed as well.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Paul
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Sep 3, 2009 at 4:22 PM, cjlee112 <cjlee...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Yes, please pass on to Namshin any debugging information you can
>>>>>>>>>>>> supply.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On Sep 3, 2009, at 3:26 PM, Paul Rigor (gmail) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> > Hi Chris,
>>>>>>>>>>>> >
>>>>>>>>>>>> > Does this mean that the 44way alignment distributed by the UCLA
>>>>>>>>>>>> > server is foobar since I ran exactly those same commands and no
>>>>>>>>>>>> > genome dependencies were downloaded?
>>>>>>>>>>>> >
>>>>>>>>>>>> > Thanks,
>>>>>>>>>>>> > Paul
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Paul Rigor
>>>>>>>>> Graduate Student
>>>>>>>>> Institute for Genomics and Bioinformatics
>>>>>>>>> Donald Bren School of Information and Computer Sciences
>>>>>>>>> University of California, Irvine
>>>>>>>>> http://www.paulrigor.net/
>>>>>>>>> http://www.ics.uci.edu/~prigor