I hope to get comments from the JESS community on the pros and cons of using JESS on a problem I face in bioinformatics.
I have used JESS in a few tiny pilot projects, with some success. But before I do serious development, I'd like to get a critique from those with greater experience.

Here's the background: I am involved in the Cytoscape project (http://www.cytoscape.org), an open-source Java tool for exploring molecular networks. We often face a data cross-reference problem. By this I mean that we work with entities (typically genes, mRNA, or proteins) for which we have identifiers, and we need to cross-reference them to other related data. The related data often uses a different set of identifiers. New kinds of data are appearing all the time. There are a number of laudable efforts in the biological community to standardize names, and I use them whenever I can, but these efforts don't seem to keep pace with the burgeoning kinds and quantities of biological data we are interested in.

To make this concrete, here is a recent example:

1) In a study of prostate cancer, we started with microarray measurements identified by UniGene cluster ID of human mRNA fragments.
2) I mapped UniGene cluster ID to LocusLink ID for most of the mRNA.
3) From LocusLink ID, I mapped to HUGO gene symbol and RefSeq protein identifier.
4) From LocusLink ID, I was also able to get the Enzyme Commission (EC) term of the related protein from KEGG. (KEGG's EC assignments were better than LocusLink's.)
5) From RefSeq protein ID, I was able to get the amino acid sequence.
6) From the amino acid sequence, I was able to BLAST against yeast sequences and determine the yeast orthologs of the human genes, which set the stage for inferring the possible 'interactome' context of the human genes.
7) From the EC number, I was able to map the human genes onto KEGG metabolic pathways.
8) From the RefSeq protein ID, I was able to get the IPI number, and thus the latest GeneOntology annotation from the GOA project at EBI.

This chain of reasoning and cross-referencing needs to be done for just about every data set I see.
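
To illustrate the shape of such a chain, here is a minimal Python sketch, not the actual tools used in the study. It assumes each cross-reference is available as a simple lookup table keyed by a (source ID type, target ID type) pair. The table contents, the type names, and the functions find_path and translate are all hypothetical, and every identifier is a made-up placeholder rather than a real database entry. The sketch searches for a route between two identifier types and then follows it one table at a time.

```python
from collections import deque

# Hypothetical mapping tables: (source ID type, target ID type) -> {source ID: target ID}.
# Every identifier here is a made-up placeholder, not a real database entry.
MAPPINGS = {
    ("unigene", "locuslink"): {"UG_PLACEHOLDER_1": "LL_PLACEHOLDER_1"},
    ("locuslink", "hugo"): {"LL_PLACEHOLDER_1": "SYMBOL_PLACEHOLDER_1"},
    ("locuslink", "refseq_protein"): {"LL_PLACEHOLDER_1": "NP_PLACEHOLDER_1"},
    ("locuslink", "ec"): {"LL_PLACEHOLDER_1": "EC_PLACEHOLDER_1"},
    ("refseq_protein", "ipi"): {"NP_PLACEHOLDER_1": "IPI_PLACEHOLDER_1"},
}


def find_path(source_type, target_type, mappings=MAPPINGS):
    """Breadth-first search over ID types; returns a list of types, or None."""
    queue = deque([[source_type]])
    seen = {source_type}
    while queue:
        path = queue.popleft()
        if path[-1] == target_type:
            return path
        for src, dst in mappings:
            if src == path[-1] and dst not in seen:
                seen.add(dst)
                queue.append(path + [dst])
    return None


def translate(identifier, source_type, target_type, mappings=MAPPINGS):
    """Follow the chain of tables from source_type to target_type.

    Returns None if no route exists or if any table lacks the identifier.
    """
    path = find_path(source_type, target_type, mappings)
    if path is None:
        return None
    current = identifier
    for src, dst in zip(path, path[1:]):
        current = mappings[(src, dst)].get(current)
        if current is None:
            return None
    return current
```

For example, translate("UG_PLACEHOLDER_1", "unigene", "ipi") walks unigene -> locuslink -> refseq_protein -> ipi. Steps that are computations rather than lookups (such as the BLAST search in step 6) do not fit this table model and would need a different kind of step.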
I have developed a bag of tools (in Java and Python) which partially automate the process. But extending, managing, and invoking these tools is not very easy. So I am thinking about adding JESS. I am drawn to JESS because my process seems to consist of the following steps:

1) defining rules that perform fairly simple operations
2) applying the rules as needed
3) adding new rules all the time, to accommodate the latest kind of experimental data and desired cross-reference
4) caching results to avoid repetitious labor

Is this a good project for JESS? I will be grateful for your replies!

Regards,

Paul Shannon
Software Engineer
Institute for Systems Biology
Seattle
