Hi Marc,

That script has this in it:

    ## For now just get data for the ones that we have traditionally supported
    ## I don't even know if the other species are available...
    speciesList = c("chipsrc_human.sqlite",
                    "chipsrc_rat.sqlite",
                    "chipsrc_chicken.sqlite",
                    "chipsrc_zebrafish.sqlite",
                    # "chipsrc_worm.sqlite",
                    # "chipsrc_fly.sqlite",
                    "chipsrc_mouse.sqlite",
                    "chipsrc_bovine.sqlite"
                    # "chipsrc_arabidopsis.sqlite" ## this is available and could be "activated"
                    ## But to activate arabidopsis, remember you have to pre-add the tables...
                    # "chipsrc_canine.sqlite",
                    # "chipsrc_rhesus.sqlite",
                    # "chipsrc_chimp.sqlite",
                    # "chipsrc_anopheles.sqlite"
                    )

And there is no mention of yeast anywhere. If I search all the scripts for,
say, 'INSERT INTO pfam', I get

    custom_anno/script/bindb.sql
    328:INSERT INTO pfam

    pfam/script/srcdb_pfam.sql
    202:-- INSERT INTO pfamb

    organism_annotation/script/bindb_yeast.sql
    441:-- INSERT INTO pfam

    yeast/script/bindb.sql
    241:-- INSERT INTO pfam

The first one is just doing all the metadata tables, and the other three are
in code blocks that are commented out. Is it possible that you used a script
that didn't make it into svn?

Jim

On Sun, Oct 4, 2015 at 2:36 PM, Marc Carlson <mrj...@gmail.com> wrote:

> Hi Jim,
>
> You asked me on Friday where the PFAM IDs for yeast came from and I
> couldn't recall because at the moment I was at Seattle Children's (and thus
> nowhere near my copy of my source code). But I also said I would look into
> it for you later (and I have). Here is what my code tells me:
>
> So ever since IPI shut down, we have been getting the PFAM and IPI data
> from UniProt. There is a script in the UniProt.ws package called
> processDataForBuild.R that is supposed to be called by the script
> "src_build.sh" (it's the last thing that script does). That code should
> get the pfam data from yeast for you. Please note that yeast required a
> lot of special code to get it processed. Nothing with yeast annotations is
> ever easy.
> It's like karmic accounting to compensate for all the bread and beer. ;)
>
> Let me know if you need any more explanations about what is in there.
> Because of the crazy timing, before I left I built and pushed into devel a
> fresh set of .DB0s and core packages (in late August) just in case it was
> too crazy to do a refresh right now. But it sounds like you won't need
> that.
>
> Marc
>
> On Sun, Oct 4, 2015 at 6:27 AM, James W. MacDonald <jmac...@uw.edu> wrote:
>
>> I am building the annotation db0 packages for the upcoming Bioconductor
>> release, which are used to generate all the orgDb and chip annotation
>> packages that we distribute. Up to the previous release we have always
>> included IPI identifiers (as part of the table containing the PROSITE and
>> PFAM IDs). Unfortunately, IPI <https://www.ebi.ac.uk/IPI> is no longer
>> maintained (since 2011), and UniProt, which is where we got data for the
>> last few releases, has now dropped support as well.
>>
>> Given that this annotation source is no longer maintained, I decided to
>> exclude these IDs from the current build of the following db0 packages:
>>
>> - rat.db0
>> - chicken.db0
>> - zebrafish.db0
>> - mouse.db0
>> - bovine.db0
>> - human.db0
>>
>> In addition, it is not clear to me (nor can Marc recall) where the data
>> for PFAM in the yeast.db0 package comes from. Given that we are pretty far
>> behind schedule for these packages, I have excluded that table as well.
>>
>> If this will break anybody's package, or if there are people who rely on
>> these IDs, I can just parse out of the last release and deprecate, so you
>> will have the IDs for one more release. However, if nobody cares about
>> such things, I will just go with what we have. Please speak up if this
>> will affect you.
>>
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> University of Washington
>> Environmental and Occupational Health Sciences
>> 4225 Roosevelt Way NE, # 100
>> Seattle WA 98105-6099
>>
>> [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> Bioc-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>
>

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099