Yes, there is a ponAbe2 genome.

>>> import os
>>> os.environ['WORLDBASEPATH'] = '.,http://biodb2.bioinformatics.ucla.edu:5000'
>>> from pygr import worldbase
>>> ponAbe2 = worldbase.Bio.Seq.Genome.PONAB.ponAbe2(download=True)
INFO downloader.download_unpickler: Beginning download of http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to /data/server/downloadable/pygr/tests/biodb2_update/t/ponAbe2.gz...
INFO downloader.download_monitor: downloaded 100663296 bytes (10.0%)...

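Since the earlier download in this thread turned out to be an HTML error page rather than a gzip archive, a quick check before unpacking may save some confusion. The following is only a minimal sketch using the Python standard library, not part of pygr: the helper name looks_like_gzip and the temporary file names are hypothetical, and the check relies solely on the two-byte gzip magic number (0x1f 0x8b).

```python
import gzip
import os
import tempfile


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, 'rb') as handle:
        return handle.read(2) == b'\x1f\x8b'


# Self-contained demonstration with two temporary files (hypothetical names):
# one real gzip file and one HTML page saved under a .gz name.
tmpdir = tempfile.mkdtemp()
good_path = os.path.join(tmpdir, 'good.gz')
bad_path = os.path.join(tmpdir, 'bad.gz')

with gzip.open(good_path, 'wb') as handle:
    handle.write(b'>chr1\nACGT\n')
with open(bad_path, 'wb') as handle:
    handle.write(b'<html><body>404 Not Found</body></html>')

print(looks_like_gzip(good_path))
print(looks_like_gzip(bad_path))
```

A file that fails this check is probably an error page or a truncated download, so it is worth re-checking the URL before handing the file to tarfile or gzip.
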
On Fri, Sep 4, 2009 at 4:15 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:

> Thanks Namshin!!!
> But don't I also have to build the individual genome resources as well? In
> any case, it would be great if the non-existent ponAbe2 genome would be made
> available through XMLRPC as well.
>
> Thanks,
> Paul
>
> On Thu, Sep 3, 2009 at 5:07 AM, Namshin Kim <deepr...@gmail.com> wrote:
>
>> Hi Paul,
>>
>> I am now building the hg18_multiz44way NLMSA without any problems. Please
>> send me the error message if you still have those problems. You may need to
>> start over after you delete .pygr_data in your writable WORLDBASEPATH.
>> If your WORLDBASEBUILDDIR is not the final repository, you can move all NLMSA
>> files into your destination directory and then update .seqDictP like this:
>>
>> You can open a genome using seqdb.SequenceFileDB (should use an absolute path)
>> or from worldbase.
>> hg18 = seqdb.SequenceFileDB('hg18')
>> or
>> hg18 = worldbase.Bio.Seq.Genome.HUMAN.hg18()
>>
>> genomeDict = {'hg18':hg18, ...} # supply all 44 genomes
>> genomeUnion = seqdb.PrefixUnionDict(genomeDict)
>> msa = cnestedlist.NLMSA('hg18_multiz44way', 'r', genomeUnion)
>> msa.save_seq_dict()
>>
>> Then .seqDictP will be updated and you can access the alignment without any
>> problems.
>>
>> chr1_slice = msa.seqDict['hg18.chr1'][1000:2000]
>> edges = msa[chr1_slice].edges()
>>
>> --
>> Namshin Kim
>>
>> On Thu, Sep 3, 2009 at 7:20 AM, Namshin Kim <deepr...@gmail.com> wrote:
>>
>>> Strange... The correct URL is
>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz
>>> The URL you used does not exist, so it gives a 404 error (HTML doc).
>>> Hmm... I never downloaded and built the hg18_multiz44way via XMLRPC. I
>>> will try that...
>>>
>>> Thanks,
>>> Namshin Kim
>>>
>>> On Thu, Sep 3, 2009 at 6:54 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>
>>>> Hi Namshin,
>>>> Downloading the 44way alignment was successful.
>>>> However, the persistent data (.pygr_data) seems to be unworkable. The
>>>> metabase lists Bio.MSA, etc., but it cannot be loaded.
>>>>
>>>> Also, I've attempted to download the genomes from the UCLA metabase, but
>>>> a genome might be corrupt. Specifically,
>>>>
>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz
>>>>
>>>> which gives the error message below. In fact, the file that is
>>>> downloaded (ponAbe2.tar.gz) is an HTML document!
>>>>
>>>> $ file ponAbe2.tar.gz
>>>> ponAbe2.tar.gz: HTML document text
>>>>
>>>> ....[error trace below]
>>>> ....
>>>> /home/dock/shared_libraries/lx64/pkgs/pythonsandbox/2.6.2/lib/python2.6/site-packages/pygr-0.8.0.beta1-py2.6-linux-x86_64.egg/pygr/downloader.pyc
>>>> in do_untar(filepath, mode, newpath, singleFile, **kwargs)
>>>>     105         newpath = filepath + '.out'
>>>>     106     import tarfile
>>>> --> 107     t = tarfile.open(filepath, mode)
>>>>     108     try:
>>>>     109         if singleFile: # extract to a single file
>>>>
>>>> /home/dock/shared_libraries/lx64/pkgs/pythonsandbox/2.6.2/lib/python2.6/tarfile.pyc
>>>> in open(cls, name, mode, fileobj, bufsize, **kwargs)
>>>>    1662             else:
>>>>    1663                 raise CompressionError("unknown compression type %r" % comptype)
>>>> -> 1664             return func(name, filemode, fileobj, **kwargs)
>>>>    1665
>>>>    1666         elif "|" in mode:
>>>>
>>>> /home/dock/shared_libraries/lx64/pkgs/pythonsandbox/2.6.2/lib/python2.6/tarfile.pyc
>>>> in gzopen(cls, name, mode, fileobj, compresslevel, **kwargs)
>>>>    1713                 **kwargs)
>>>>    1714         except IOError:
>>>> -> 1715             raise ReadError("not a gzip file")
>>>>    1716         t._extfileobj = False
>>>>    1717         return t
>>>>
>>>> ReadError: not a gzip file
>>>>
>>>> On Tue, Sep 1, 2009 at 9:55 PM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>>
>>>>> Well, we have time, storage and bandwidth =)
>>>>> I'll let you know how it goes. Maybe we can host an XMLRPC mirror
>>>>> someday too.
>>>>>
>>>>> Thanks,
>>>>> Paul
>>>>>
>>>>> On Tue, Sep 1, 2009 at 9:41 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>
>>>>>> Hi Paul,
>>>>>> I just checked the size of hg18_multiz44way and it is 167 GB for the
>>>>>> NLMSA alone. Including genome assemblies you may not have, it would be
>>>>>> ~250 GB. I think it would take a long time to download all the files.
>>>>>>
>>>>>> --
>>>>>> Namshin Kim
>>>>>>
>>>>>> On Wed, Sep 2, 2009 at 1:33 PM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Namshin,
>>>>>>> I'm running this overnight =) Has anyone successfully pulled and
>>>>>>> used this alignment?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Paul
>>>>>>>
>>>>>>> On Sun, Aug 2, 2009 at 4:40 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>>>
>>>>>>>> The downloadable resources are now available on the biodb2 XMLRPC
>>>>>>>> server.
>>>>>>>>
>>>>>>>> There are two ways to build the NLMSA.
>>>>>>>>
>>>>>>>> 1. metabase
>>>>>>>>
>>>>>>>> >>> import os
>>>>>>>> >>> os.environ['WORLDBASEPATH'] = '.,http://biodb2.bioinformatics.ucla.edu:5000'
>>>>>>>> >>> from pygr import metabase
>>>>>>>> >>> mdb = metabase.MetabaseList()
>>>>>>>> >>> hg18 = mdb('Bio.MSA.UCSC.hg18_multiz44way',download=True)
>>>>>>>>
>>>>>>>> 2. from text files
>>>>>>>>
>>>>>>>> Download the text files from
>>>>>>>> http://biodb.bioinformatics.ucla.edu/PYGRDATA/
>>>>>>>> and use the cnestedlist.textfile_to_binaries('hg18_multiz44way')
>>>>>>>> function to convert them from text to binaries.
>>>>>>>>
>>>>>>>> If you want to see the script used to add these resources, visit
>>>>>>>> this URL.
>>>>>>>>
>>>>>>>> http://github.com/deepreds/pygr/tree/d7ab9247dcd39b7d474029cb8749a53eb8582968/tests/biodb2_update
>>>>>>>>
>>>>> --
>>>>> Paul Rigor
>>>>> Graduate Student
>>>>> Institute for Genomics and Bioinformatics
>>>>> Donald Bren School of Information and Computer Sciences
>>>>> University of California, Irvine
>>>>> http://www.paulrigor.net/
>>>>> http://www.ics.uci.edu/~prigor
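
For a rough sense of the "long time to download" mentioned in the thread, here is a back-of-the-envelope sketch. The 167 GB and ~250 GB figures come from the messages above; the 100 Mbit/s sustained rate is purely an assumed value for illustration, GB is taken to mean 10**9 bytes, and the helper name download_hours is hypothetical.

```python
def download_hours(size_gb, mbit_per_s):
    """Estimate transfer time in hours for size_gb gigabytes (10**9 bytes)."""
    bits = size_gb * 1e9 * 8                 # total payload in bits
    seconds = bits / (mbit_per_s * 1e6)      # assumes a constant sustained rate
    return seconds / 3600.0


# 167 GB (NLMSA only) and ~250 GB (with genome assemblies), per the thread.
# The 100 Mbit/s link speed is an assumption, not a measured value.
for size_gb in (167, 250):
    hours = download_hours(size_gb, 100)
    print('%d GB at 100 Mbit/s: about %.1f hours' % (size_gb, hours))
```

Real transfers are usually slower than the nominal link speed, so these numbers are best read as lower bounds.
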