Re: Oracle Uniprot RDF data set and benchmarks

2006-02-08 Thread Eric Jain
Ian Wilson wrote: We will thus want to maintain a local copy of this extract (on the wiki?) so changes in the graph don't change the benchmarking results. The data in http://www.isb-sib.ch/~ejain/rdf/data/ is indeed updated every two weeks, but I could also provide some more stable data sets

Re: Fwd: Nature: A call for a public gene Wiki

2006-02-08 Thread Eric Jain
Pierre LINDENBAUM wrote: I agree, a wiki would be a great way for sharing knowledge as it would allow experts of a protein, of a gene to freely add, modify and share annotations. But I fear it could also be a problem for knowledge discovery because a wiki is not a "semantic web" source of informa

Re: Nature: A call for a public gene Wiki

2006-02-08 Thread Eric Jain
Matthew Cockerill wrote: Adding comments does not provide the same motivation as updating the core data. I guess it doesn't, though I think this would already be much better than a feedback form, as far as immediacy and motivation are concerned. The problem with expecting people to update t

Re: Nature: A call for a public gene Wiki

2006-02-08 Thread Eric Jain
Matthew Cockerill wrote: Again, important conceptual issues. But again, the proof of the pudding is in the eating. We'll be doing some tests with some "hard core" users sometime this year, so we'll see to what extent they are willing to eat (or is it cook?) the pudding :-)

Re: Oracle Uniprot RDF data set and benchmarks

2006-02-08 Thread Eric Jain
Ian Wilson wrote: That would be great. Since these graphs change over time, do archived annual snapshots make sense? Any thoughts on how you might derive these subgraphs? You are likely more familiar than anyone else with these graphs. The separate graphs you provide for distribution are alre

Re: Oracle Uniprot RDF data set and benchmarks

2006-02-08 Thread Eric Jain
Jim Hendler wrote: I love this idea, but I would go a bit further - be even nicer for us non-biologists if it also included some example queries to run (and maybe even the correct answer sets) - I think if that existed, we could push some of the triple store developers to use it as a benchmark

Re: [HCLSIG] Uniprot RDF data set and benchmarks

2006-02-18 Thread Eric Jain
Susie Stephens wrote: It seems the document with example queries is no longer available via Eric's site I have this unfortunate tendency to rearrange pages randomly :-) In any case, I created a page that could be used to collect this kind of information: http://esw.w3.org/topic/LifeScience

Re: [BioRDF] Meeting Notes Feb 27, 2006

2006-02-28 Thread Eric Jain
One note on the "conversion" issue: There do not seem to be too many data sets that are well "in tune" with the RDF philosophy. It may be easy to get something that looks like RDF, but then again you can also convert arbitrary flat text files to XML by adding a start and an end tag to each li

Re: [BioRDF] Meeting Notes Feb 27, 2006

2006-03-01 Thread Eric Jain
Tom Stambaugh wrote: It seems to me that RDF helps us describe and model the structure of our data. In my view, we'll then *use* this RDF-derived description and model to build relational databases that hold said data. In this worldview, the existence of the RDF description then helps us keep

Re: [BioRDF] RDF query languages

2006-03-09 Thread Eric Jain
Seaborne, Andy wrote: - It appears that ex9 has no answers on the 10e6 triple 7.0a UniProt extract Perhaps "2.7.7.-" and "3.1.3.16" should be replaced with more general enzyme categories, e.g. "2.-.-.-" (all transferases) and "3.-.-.-" (all hydrolases). - Similarly, none of the queries hav

Re: 44-52 That's the Number

2006-04-11 Thread Eric Jain
Phillip Lord wrote: TH> Background: The "info" URI scheme is a means of grandfathering TH> legacy namespaces onto the Web in their own right (e.g. PubMed TH> identifiers, ADS bibcodes, etc., etc.). Many Web applications TH> expect identifiers to be packaged as URIs (Uniform Resource TH

Re: The O'Reilly Filter?

2006-04-29 Thread Eric Jain
Eric Neumann wrote: Semantic Web (not on the list) has 396 references in the main area, and 191 within group/semweb-lifesci, placing it between Malaria and Medicine in popularity (509 documents within Connotea have some mention of Semantic Web). Even RDF has 236 references as well! Hmmm..

Re: proposal for standard NCBI database URI

2006-05-09 Thread Eric Jain
Alan Ruttenberg wrote: As far as I know there is no standard URI for a resource at NCBI. I would like to propose that there be one, since we will all need them to use when we refer to these resources in our RDF. (and I need one *now*) Next you'll probably also need standard URLs for all the

Re: proposal for standard NCBI database URI

2006-05-09 Thread Eric Jain
Xiaoshu Wang wrote: To propose a standard URI for a domain that we don't own is like proposing Iran to drop their nuclear program. It is wishful thinking. In this particular case it seems more like Iran proposing that the United States drop their nuclear program :-)

Re: [BioRDF] All about the LSID URI/URN

2006-07-31 Thread Eric Jain
Carole Goble wrote: However, we have problems with the implementation, specifically the use of SOAP within the resolution system, because: 1. it's not needed conceptually 2. it's costly 3. it's overkill which affects performance 4. the main implementation is Axis based - not suitable for phones, p

Re: [BioRDF] All about the LSID URI/URN

2006-07-31 Thread Eric Jain
Benjamin H Szekely wrote: The LSID Java Toolkit supports both SOAP and HTTP. The HTTP version is very simple and does not use Axis. I believe work is being done outside of the development team to implement a more lightweight, HTTP-only version of the LSID Java stack. As far as I am concerne

Re: [BioRDF] All about the LSID URI/URN

2006-07-31 Thread Eric Jain
Benjamin H Szekely wrote: The LSID Java Toolkit supports both SOAP and HTTP. The HTTP version is very simple and does not use Axis. I believe work is being done outside of the development team to implement a more lightweight, HTTP-only version of the LSID Java stack. As far as I am concerne

Re: Size estimates of current LS space

2006-08-03 Thread Eric Jain
Eric Neumann wrote: As per today's Telcon, does any person with genomics knowledge (that includes you too Carole) have estimates for the following numbers: 1. How many bio-molecular and organism-anatomical-functional entities and records (broad sense) are currently accessible through the web

Re: Prototype URL to Life Science Identifier (LSID) gateway now available

2006-10-13 Thread Eric Jain
Sean Martin wrote: Here is our first stab at the syntax of the mapping. Looks good, but as far as I am concerned you could go one step further and simply redirect http://lsid-info.org/urn:lsid:foo:bar:baz:1 to the actual site, e.g. http://foo.org/xyz/bar/baz?version=1, and leave it up to
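A minimal sketch of the redirect suggested in this message, assuming a gateway that keeps a per-authority URL template. The `TEMPLATES` table and the function name `gateway_redirect` are hypothetical; only the two example URLs come from the message, and the sketch requires the revision part of the LSID for simplicity.

```python
from urllib.parse import urlparse

# Hypothetical per-authority templates; the single entry mirrors the
# example in the message (authority "foo" served from foo.org).
TEMPLATES = {
    "foo": "http://foo.org/xyz/{namespace}/{object}?version={revision}",
}


def gateway_redirect(gateway_url):
    """Return the site URL that a gateway URL embedding an LSID redirects to."""
    # The LSID is the path of the gateway URL, minus the leading slash.
    lsid = urlparse(gateway_url).path.lstrip("/")
    parts = lsid.split(":")
    # Expect urn:lsid:authority:namespace:object:revision (revision required here).
    if len(parts) != 6 or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError("not an LSID with a revision: %r" % lsid)
    _, _, authority, namespace, obj, revision = parts
    if authority not in TEMPLATES:
        raise LookupError("no template registered for authority %r" % authority)
    return TEMPLATES[authority].format(
        namespace=namespace, object=obj, revision=revision
    )
```

A real gateway would answer with an HTTP redirect whose Location header carries the returned URL; that part is left out here.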

Custom Google Life Sciences Search Engine

2006-10-24 Thread Eric Jain
Completely unsemantic, but perhaps useful to some people here:

Re: OWL vs RDF

2006-10-25 Thread Eric Jain
Alan Ruttenberg wrote: Well it would be educational to get your view on what you can do with owl without a reasoner that's not easier to do without owl? owl:sameAs for example could be used as a standard way to express that two identifiers represent the same resource, wouldn't need a reas

Re: Hosted Triple Store Oracle RDF DM - and access to BioRDF examples in general

2007-01-22 Thread Eric Jain
William Bug wrote: Also - for loading remote LSID resources: I thought I remembered after having loaded Roderick Page's Firefox LSID extension, I could just paste the following in the Firefox URL text box, hit return, and the plug-in contacted an LSID Resource Resolution server at IBM to retu

Re: Hosted Triple Store Oracle RDF DM - and access to BioRDF examples in general

2007-01-22 Thread Eric Jain
Sean Martin wrote: The one valid one that Eric tried works fine for me, both for data and metadata. http://lsid-info.org/urn:lsid:ncbi.nlm.nih.gov.lsid.biopathways.org:pubmed:12441807? returns: Indeed, and it looks like the problem with the plug-in was that it has to connect to port 9090 o

Re: More inspiration for the WWW HCLSIG demo

2007-02-22 Thread Eric Jain
William Bug wrote: Is Eric Jain's RDF-ization of UniProt being used in this WikiProteins system? The Wiki part is based on OmegaWiki [http://www.omegawiki.org/] (previously known as WiktionaryZ), so importing data from RDF wouldn't be any easier than importing data from any other format -- a

UniProt RDF via HTTP

2007-05-11 Thread Eric Jain
For those of you who are interested in getting subsets of the UniProt data for demos etc, the following may be of use. Get a single database entry: ...

Re: UniProt RDF via HTTP

2007-05-11 Thread Eric Jain
Alan Ruttenberg wrote: Is there http access given a lsid. I have already figured out that I can get the html for urn:lsid:uniprot.org:uniprot:P00750 with: http://beta.uniprot.org/?dataset=uniprot&query=urn:lsid:uniprot.org:uniprot:P00750&sort=score&lucky=no&random=no Looks like you disco

Re: UniProt RDF via HTTP

2007-05-12 Thread Eric Jain
Alan Ruttenberg wrote: Would it be possible to add a service so that I can get from the lsid directly to rdf and xml versions at least? Would it be correct to assume that all lsids in uniprot have such versions? The only common format in UniProt is RDF (e.g. there is no XML representation of

Re: UniProt RDF via HTTP

2007-05-13 Thread Eric Jain
Alan Ruttenberg wrote: This is a reasonable choice. However I think that there ought to be a link based on the identifier. Ideally the identifier itself would be resolvable. The idea here was that the LSID would be a more abstract identifier for the "Thing", whereas the URL(s) identify specific

Re: UniProt RDF via HTTP

2007-05-13 Thread Eric Jain
Eric Neumann wrote: I'm not sure this is true-- why not set up your uniprot server to do (internal) redirection similar to how purl.org works? Then the url would be persistent and almost look like an lsid: e.g., http://purl.uniprot.org/uniprot/P12345 You're right, initially I was concerned a

Re: Advancing translational research with the Semantic Web

2007-05-16 Thread Eric Jain
Just catching up on reading papers :-) "It is also useful to know who believes something and why. However, there is no standard way of expressing such information about a statement [...]" Reification?

Re: Advancing translational research with the Semantic Web

2007-05-16 Thread Eric Jain
Phillip Lord wrote: "EJ" == Eric Jain <[EMAIL PROTECTED]> writes: EJ> Just catching up on reading papers :-) EJ> <http://www.biomedcentral.com/1471-2105/8/S3/S2> EJ> "It is also useful to know who believes something and EJ> why. However, ther

Re: Advancing translational research with the Semantic Web

2007-05-16 Thread Eric Jain
Pat Hayes wrote: Our paper suggests that the URI of the RDF/XML document be used as the name in this case. This works as long as you are consistent about it. ...and as long as the document is always opened from its official location (versus from a local copy), and you keep each item in a sepa

Re: Advancing translational research with the Semantic Web

2007-05-17 Thread Eric Jain
Chris Mungall wrote: Provenance is kind of important for science, and it doesn't do it any favours to mix provenance at the document and statement levels +1 I wouldn't mind if the two levels were supported through the same mechanism, however this risks complicating and slowing down things.

Re: Advancing translational research with the Semantic Web

2007-05-17 Thread Eric Jain
Alan Ruttenberg wrote: > prefix dc: > prefix sc: > select ?s ?p ?o ?by > from > where > { ?s ?p ?o. >?statement rdf:type rdf:Statement. >?statement rdf:subject ?s. >?

Re: Advancing translational research with the Semantic Web

2007-05-17 Thread Eric Jain
[EMAIL PROTECTED] wrote: To return to the original question: In many of the biomedical ontologies we are currently using or developing most of the biological relations that matter ARE already reified. For example, most current ontologies would not contain the statement " ", rather they would co

Re: Advancing translational research with the Semantic Web

2007-05-17 Thread Eric Jain
[EMAIL PROTECTED] wrote: How would you say e.g. "protein a is expressed in tissue b, according to source c"? through something like . . . The "protein expression process" class that needs to be introduced here does seem a bit like bending over backwards (I know in other cases such

Re: Advancing translational research with the Semantic Web

2007-05-17 Thread Eric Jain
[EMAIL PROTECTED] wrote: The choice of this design pattern is not arbitrary, it is based on the OBO Relation Ontology [1], BFO and the OWL version of the Gene Ontology, which are becoming widely accepted. Other foundational ontologies (like DOLCE) have similar relations and entities. In other wor

Re: Advancing translational research with the Semantic Web

2007-05-18 Thread Eric Jain
Alan Ruttenberg wrote: There is a subclass of gene expression processes, during each instance of which some instance of protein a is the participant which is "the thing produced", and which is located_in some instance of tissue b. Phew :-) I would then attach, as above, via an annotation p

Re: Advancing translational research with the Semantic Web

2007-05-18 Thread Eric Jain
Alan Ruttenberg wrote: Not if the classes are given logical expressions, as I did in my example - in that case a reasoner can infer that they are the equivalent, based on their logical definitions. Well, ideally :-) My main concern is that different people will come up with different "desig

Re: Advancing translational research with the Semantic Web

2007-05-18 Thread Eric Jain
Alan Ruttenberg wrote: If you want to say that the protein is found in some tissue, that's what should be said. However, in your email you wrote that the protein is expressed in the tissue. Sorry about that, should run a consistency checker on my outgoing mail :-) If it is known to be found

Re: Fwd: ISMB/ECCB2007 Demo Acceptance

2007-06-16 Thread Eric Jain
Eric Neumann wrote: We have been given a slot to show the HCLS demo there (see below), so for those of you planning to be at ISMB next month, please be sure to attend/participate. Great, I'll be there -- provided it doesn't overlap with my own demo :-) Is it the same demo that was presented

Re: Fwd: ISMB/ECCB2007 Demo Acceptance

2007-06-16 Thread Eric Jain
[EMAIL PROTECTED] wrote: If there is interest, I could try to organize an informal face to face meeting of people that are active in the surroundings of the HCLSIG and that are in Vienna at the time of ISMB. We could probably use some seminar or lecture rooms of the Vienna Medical University, if

Re: BioRDF Telcon

2007-06-19 Thread Eric Jain
Marc-Alexandre Nolin wrote: [...] As much as we would like to call uniprot:p19367 a concept, it is still a database number from the database uniprot. The real concept would be protein:Hexokinase. This is what the searcher would be looking for. The concept would then link to any database talking

Re: BioRDF Telcon

2007-06-20 Thread Eric Jain
Marc-Alexandre Nolin wrote: [...] What I wanted to point out with this is people work with concepts. P19367 is the identifier to access the information about the Hexokinase concept in the Uniprot database. P19367 doesn't have a sense in itself, it is only a string of numbers with a letter in pr

Re: BioRDF Telcon

2007-06-20 Thread Eric Jain
Jonathan Rees wrote: I really think we should put together a set of requirements and desiderata for proposed URI solutions, especially since we seem to have at least 4 proposals (LSID, Banff demo, yours / Bio2RDF, BioGUID) on the table just in this one sub-area (URI's for public database records

Re: [hcls] A map of the Semantic Web for life science and health care

2007-06-23 Thread Eric Jain
[EMAIL PROTECTED] wrote: After a lot of thinking I have finally decided on the style for the visualization. I chose a geographical metaphor, here is an example (with arbitrary connections): http://neuroweb.med.yale.edu/senselab/temporary_files/map-draft-2.png The metaphor is clearly flawed, o

Job!

2007-06-25 Thread Eric Jain
If you are a software developer (with some Java experience), and are following this mailing list, this position may be of interest: http://eric.jain.name/2007/06/26/hiring/ Feel free to forward...

URL +1, LSID -1

2007-07-10 Thread Eric Jain
In the latest release of UniProt (11.3), all URIs of the form: urn:lsid:uniprot.org:{db}:{id} have been replaced with URLs: http://purl.uniprot.org/{db}/{id} In general, these URLs can be resolved to a human readable web page (a few are still broken, will be fixed). Some of these web pag
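For anyone holding data that still uses the old identifiers, the rewrite announced above could be sketched as follows. Only the two URI templates come from the message; the function name `lsid_to_purl` and the regular expression are illustrative assumptions, not UniProt code.

```python
import re

# Old form quoted in the message: urn:lsid:uniprot.org:{db}:{id}
_UNIPROT_LSID = re.compile(r"^urn:lsid:uniprot\.org:([^:]+):([^:]+)$")


def lsid_to_purl(lsid):
    """Rewrite urn:lsid:uniprot.org:{db}:{id} as http://purl.uniprot.org/{db}/{id}."""
    match = _UNIPROT_LSID.match(lsid)
    if match is None:
        raise ValueError("not a UniProt LSID: %r" % lsid)
    db, entry_id = match.groups()
    return "http://purl.uniprot.org/%s/%s" % (db, entry_id)
```

For example, `lsid_to_purl("urn:lsid:uniprot.org:uniprot:P00750")` gives `http://purl.uniprot.org/uniprot/P00750`.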

Re: URL +1, LSID -1

2007-07-10 Thread Eric Jain
On Tue, 10 Jul 2007 10:13:48 -0700, Michel_Dumontier wrote: What if I have a semantic web application in which I would like to retrieve more information about this resource? Since the document is not an RDF document with machine understandable statements about it, it seems that my application wo

Re: URL +1, LSID -1

2007-07-10 Thread Eric Jain
Alan Ruttenberg wrote: Perhaps Eric would be so kind as to create http://purl.uniprot.org/rdf/uniprot/P12345 to link directly to the RDF document. In addition, there is a LINK REL mechanism to link the HTML version to RDF. If Eric was in a particularly good mood, maybe he would consider movi

Re: URL +1, LSID -1

2007-07-11 Thread Eric Jain
Matthias Samwald wrote: I would actually prefer the "commons/source/id.type" pattern, if it can be implemented with the current purl.org system. It is more intuitive to most developers/users and it brings the elements of the URI into a nice hierarchical order. id.type certainly seems like a

Re: URL +1, LSID -1

2007-07-11 Thread Eric Jain
Michel_Dumontier wrote: Unfortunately, this again demonstrates the problem in which the identifier for a biological entity - say mitochondrial Aspartate aminotransferase resolves to a nicely formatted HTML page. What if I have a semantic web application in which I would like to retrieve more inf

Re: URL +1, LSID -1

2007-07-11 Thread Eric Jain
Alan Ruttenberg wrote: from http://www.w3.org/2001/tag/doc/alternatives-discovery.html > [...] While I'm not always a fan of TAG findings, I think this one makes a TON of sense. I'll agree with that, but where do they talk about having canonical URIs for each specific representation? I

Re: URL +1, LSID -1

2007-07-11 Thread Eric Jain
Alan Ruttenberg wrote: On Jul 11, 2007, at 3:16 AM, Eric Jain wrote: http://purl.uniprot.org/uniprot/P12345 does not identify an RDF resource, it represents our concept of some protein. What concept would that be? What are instances of the class of proteins that this identifiers denotes

Re: URL +1, LSID -1

2007-07-11 Thread Eric Jain
Alan Ruttenberg wrote: I don't feel comfortable with showing a minimalistic page with some weird acronyms when someone enters e.g. http://purl.uniprot.org/uniprot/P12345. How is someone to know, then, that you mean that the name denotes a class of proteins, rather than a page of html? Does

Re: URL +1, LSID -1

2007-07-11 Thread Eric Jain
Alan Ruttenberg wrote: Some resources are quite simple and straightforward to understand, e.g. http://purl.uniprot.org/uniparc/UPI1328C5 represents a specific amino acid sequence, The instances are sequences of letters? Qualities of a class of molecules? The molecules themselves? I guess

Re: URL +1, LSID -1

2007-07-11 Thread Eric Jain
Mark Wilkinson wrote: My understanding is that the 303 could only redirect the agent to a single alternative URI. Am I wrong? You can either use the Accept header the client sent to choose the most appropriate representation to redirect to, or you can generate a proper response with a list
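The first option mentioned here, using the client's Accept header to pick the representation to redirect to, might look roughly like this. It is a simplified sketch: the `.rdf`/`.html` suffixes, the `REPRESENTATIONS` table, and both function names are assumptions, and the parsing ignores media-range wildcards other than `*/*`.

```python
# Hypothetical mapping from media type to the suffix of the URL redirected to.
REPRESENTATIONS = {
    "application/rdf+xml": ".rdf",
    "text/html": ".html",
}


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((fields[0].lower(), q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def redirect_target(base_url, accept, default="text/html"):
    """Choose the Location for a 303 response from the Accept header."""
    for media_type, q in parse_accept(accept):
        if q <= 0:
            continue
        if media_type in REPRESENTATIONS:
            return base_url + REPRESENTATIONS[media_type]
        if media_type == "*/*":
            break  # client accepts anything: fall through to the default
    return base_url + REPRESENTATIONS[default]
```

The second option from the message, answering with a list of the available representations, would replace the final fallback with a response body enumerating the entries of `REPRESENTATIONS`.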

Re: URL +1, LSID -1

2007-07-12 Thread Eric Jain
Roderic Page wrote: Lastly, I wrote a PHP client to test LSID servers, which is online at http://linnaeus.zoology.gla.ac.uk/~rpage/lsid/. I use this to debug both my client code, and test LSID servers. Nice! I noticed that none of the example LSIDs on the page (or the one you mention above,

Re: URL +1, LSID -1

2007-07-12 Thread Eric Jain
Xiaoshu Wang wrote: IMHO, I think it would be nicer and less confusing if you make "http://purl.uniprot.org/uniprot/P12345" a skeleton and 303 redirect to either "http://purl.uniprot.org/uniprot/P12345.html" or "http://purl.uniprot.org/uniprot/P12345.rdf" depends on the value of Accept he

Re: Sparql endpoints +1

2007-07-13 Thread Eric Jain
Matthias Samwald wrote: *) Most of the entities that have URIs on the Semantic Web are not documents, rather they are entities in the real world that cannot be 'resolved' in any meaningful way. What people mean when they talk about 'resolving' such non-information-entities is in fact the proce

Re: URL +1, LSID -1

2007-07-13 Thread Eric Jain
Xiaoshu Wang wrote: Oh, sorry, I didn't notice the difference between purl and beta. My bad. It will be easier to distinguish once the "beta" is dropped from http://beta.uniprot.org/ :-)

Re: URL +1, LSID -1

2007-07-14 Thread Eric Jain
Alan Ruttenberg wrote: Regarding "Don't quite see any benefit in having another set of redirectable identifiers for the actual representations". I have tried to explain this many times and I guess that I am just not good at it. Let me try again, by asking you to review some statements, and p

Re: URL +1, LSID -1

2007-07-15 Thread Eric Jain
Alan Ruttenberg wrote: The point of having the PURLs is to ensure that there is a mechanism for handling three cases that LSIDs were intended to address (but which can be addressed without the trouble of introducing a separate resolving mechanism) 1) To be immune from the "actual URL of the r

Re: 303 +1, WSDL -1

2007-07-15 Thread Eric Jain
Mark Wilkinson wrote: WSDL -1 if you wish, but that puts you in opposition to the majority of the world, where WSDL (thanks to Ajax) is finally starting to make its mark! Most Ajax libraries use REST? In any case, should we get bored of the PURL/LSID discussion, it looks like we could swit

Re: Immunity of SW statements to changes in location. Was: Re: URL +1, LSID -1

2007-07-16 Thread Eric Jain
Alan Ruttenberg wrote: Yes, but how will we handle the case where some set of people make statements with the subject being http://beta.uniprot.org/entry/P12345 and another set makes statements about http://uniprot.org/entry/P12345. They are really talking about the same subject, but our seman

Re: Time dependence/interaction with PURLs was: Re: URL +1, LSID -1

2007-07-16 Thread Eric Jain
Alan Ruttenberg wrote: Two points to make about this. First, the current locations are not hidden - the rewrite rules are accessible, and the agent that follows the redirect can note the actual location. You're right, and in fact the "Wayback Machine" seems to do that. However, if I wanted t

Re: Ambiguous names. was: Re: URL +1, LSID -1

2007-07-16 Thread Eric Jain
Alan Ruttenberg wrote: There are proteins, and there are records about proteins. Records come in different formats. If I make a statement using this url, is is about the record? or the protein? How should the agent come to know? The concept of "protein" is abstract enough that anything you mi

Re: Crawlers need content negotiation, not! was: Re: URL +1, LSID -1

2007-07-16 Thread Eric Jain
Alan Ruttenberg wrote: Except this isn't an issue. A link in the html suffices to let them know where the RDF is, and the extra retrieval isn't going to kill them. There are something like 30M RDF documents on http://beta.uniprot.org/ alone. If for each document you have to retrieve and pars

Re: Immunity of SW statements to changes in location - data integration use case

2007-07-16 Thread Eric Jain
M. Scott Marshall wrote: During data integration or "data reuse", we have to relate statements about 'biothings' to each other in order to be sure that we can properly use someone else's statements/data. In that case, it is extremely convenient if we have used the same identifier to refer to t

Re: Ambiguous names. was: Re: URL +1, LSID -1

2007-07-16 Thread Eric Jain
Marijke Keet wrote: and by analogy, then there is no real Eric Jain, just a webpage with that name, a blog, a URL http://eric.jain.name/, some database records in the uniprot HR systems with a string "Eric Jain" and related data, email trails in the hcls archive and, well,

Re: Immunity of SW statements to changes in location - data integration use case

2007-07-16 Thread Eric Jain
Matthias Samwald wrote: I think the benefit would be immense. I have seen so much confusion arising in Semantic Web projects that try to do it the 'smart' way and lump both together. The slightly increased complexity of the implementation is nothing compared to the long-term costs caused by this

Re: Immunity of SW statements to changes in location - data integration use case

2007-07-16 Thread Eric Jain
Khalid Belhajjame wrote: I am not sure whether the following issue has already been discussed. By using the identifiers to also locate where the RDFs statements describing the resource in question are, don’t we somehow dictate where the document(s) containing the RDFs statements describing a giv

Re: Ambiguous names. was: Re: URL +1, LSID -1

2007-07-16 Thread Eric Jain
Marijke Keet wrote: "...due to lack of knowledge...": and I presume it may be that biologists disagree also because of insufficient knowledge about the protein, and/or its (over-)simplification, that is, comparing apples and oranges at a too coarse level of granularity. Moreover, that we don't

Re: Ambiguous names. mapping URI's to Ontology

2007-07-16 Thread Eric Jain
M. Scott Marshall wrote: It should be possible for people to make statements specifically about the DNA, mRNA, amino acid sequence, (in organism human, mouse,..), NMR, MS(mass spec), etc. that is associated with a protein in addition to saying something general about the protein itself e.g. "P

Re: Crawlers need content negotiation, not! was: Re: URL +1, LSID -1

2007-07-16 Thread Eric Jain
Stian Soiland wrote: The Link header can be useful information for a browser that is not RDF aware, but know of some application that is. It would not normally involve this application to try different content type, but if the resource can list other representations it could put up a nice litt

Re: Ambiguous names. was: Re: URL +1, LSID -1

2007-07-16 Thread Eric Jain
Waclaw Kusnierczyk wrote: sure. how can you determine that *two* entities are *one* entity? (they may become one, but that's a different story.) You mean how we *decide* that they should be a single "entity"? I'm afraid I can't tell, not because it's our trade-secret, but because such decisi

Re: Ambiguous names. was: Re: URL +1, LSID -1

2007-07-16 Thread Eric Jain
Waclaw Kusnierczyk wrote: Oh, no. If there are two proteins out there, they are two, and you have nothing to *decide* about that. You may fuse them in that or another way, but this does not change the fact that at the previous time there were two. "Out there" you'll find all kinds of molec

Re: Ambiguous names. was: Re: URL +1, LSID -1

2007-07-16 Thread Eric Jain
Matthias Samwald wrote: Well, they might talk like database entries and physical objects would be the same, but this is not what they *think*. With the Semantic Web / ontologies we want to capture the semantics and the actual thinking, not the linguistic / textual surface representations. http

Re: Immunity of SW statements to changes in location - data integration use case

2007-07-16 Thread Eric Jain
M. Scott Marshall wrote: From the recent threads, I get the impression that we are trying to do combine two functions into the URI: 1) the unambiguous *identification* of a given concept in our own RDF 2) retrieve associated data records from the same URI Although 2) seems like a nice conven

Re: Ambiguous names. mapping URI's to Ontology

2007-07-16 Thread Eric Jain
M. Scott Marshall wrote: I like this form of ontology versioning (date in the URL): http://www.co-ode.org/ontologies/amino-acid/2006/05/18/amino-acid.owl Notice that you can easily adjust for a different version by changing the declared namespace. With a database like UniProt, only a small pa

Re: Immunity of SW statements to changes in location - data integration use case

2007-07-16 Thread Eric Jain
Matthias Samwald wrote: Can a database record have a molecular weight or be part of a protein complex? Can a protein be curated and added to a database? You're assuming there are pure and absolute concepts of specific proteins floating around somewhere, but as far as I am concerned, a "protei

Re: Ambiguous names. was: Re: URL +1, LSID -1

2007-07-16 Thread Eric Jain
Alan Ruttenberg wrote: I'm confused. I think we all would agree that there are instances of proteins and we have a good idea of what they are. We also know that there are groups of proteins that are built off the same template and share certain properties. If we define classes using such prope

Re: [hcls] ISMB Bioontologies SIG Poster

2007-07-16 Thread Eric Jain
Matthias Samwald wrote: * Add connections between RDF/OWL resources that I have overlooked so far, based on the criteria described on the wiki page: "Only make connections where there really are substantial connections on the RDF level, i.e. relations spanning the two ontologies. As a counter-ex

Re: Ambiguous names. was: Re: URL +1, LSID -1

2007-07-16 Thread Eric Jain
Alan Ruttenberg wrote: We've got a SW language for making definitions - it's called OWL. One thing I can say here is that there is the trend that curators create rules (and check the outcome) instead of adding data themselves directly. Unfortunately OWL is insufficient for the kind of ugly r

Re: Rules (was Re: Ambiguous names. was: Re: URL +1, LSID -1)

2007-07-16 Thread Eric Jain
Bijan Parsia wrote: Eric, I would be very much interested in some more details about the sort of rules used and how they are used. I personally tend to distinguish between the use of rules in modeling and the use of rules for data munging tasks. Obviously, where you draw this boundary can be a

Re: Rules (was Re: Ambiguous names. was: Re: URL +1, LSID -1)

2007-07-17 Thread Eric Jain
Alan Ruttenberg wrote: To clarify, no, I didn't mean this. I meant that the definition of Uniprot records are already broad in the sense that sometimes multiple splice variants are included in a single record, as are population and disease-causing variants, according to Eric. Basically I don't

Re: Rules (was Re: Ambiguous names. was: Re: URL +1, LSID -1)

2007-07-17 Thread Eric Jain
Chris Mungall wrote: We have also switched from talk of defining specific proteins to rules to automatically annotate protein records. You're right, small digression, hope it's of interest anyway :-) I read "broad classes of proteins" as being more inclusive than the class denoted by OPSD_H

Re: Rules (was Re: Ambiguous names. was: Re: URL +1, LSID -1)

2007-07-19 Thread Eric Jain
Alan Ruttenberg wrote: In that case, I would recommend that it is unwise to use Uniprot ids as identifiers of protein classes on the semantic web. Doing so would encourage exactly the kind of ambiguity that we need to avoid in order to write statements that will not confuse semantic web agent

Re: protein entities (was Re: Rules (was Re: Ambiguous names. was: Re: URL +1, LSID -1)

2007-07-19 Thread Eric Jain
Darren Natale wrote: We recently began a new Protein Ontology (PRO) effort geared precisely toward the formal definition of the "smaller entities" referred to by Alan. By "we" I mean the PRO Consortium, comprising the PIs Cathy Wu of PIR (which is also a member organization of the UniProt Con

Re: protein entities (was Re: Rules (was Re: Ambiguous names. was: Re: URL +1, LSID -1)

2007-07-19 Thread Eric Jain
Darren Natale wrote: We don't yet have formal definitions for many of the classes and relations (the effort only began in earnest a few months ago). But, basically, there is a distinction made between the full-length (in terms of amino acid sequence) protein and the sub-length parts of protei

Re: Ambiguous names. was: Re: URL +1, LSID -1

2007-07-20 Thread Eric Jain
Alan Ruttenberg wrote: Whose mission? Remember that one of the reasons this came up was the claim that the Uniprot URI identified the protein in the real world. Who claimed that? If we wanted to identify each protein in the real world we'd have to assign zillions of URIs just for the protein

ISMB 2007

2007-07-20 Thread Eric Jain
For those who are (or are going to be) in Vienna, when to meet where?

Re: ISMB 2007

2007-07-20 Thread Eric Jain
Eric Neumann wrote: There are a couple of possible times to meet: 1) Saturday evening at the opening reception Lots of people there? 2) At Matthias' poster on Sunday (2pm?) The poster session is in the evening, but I'll be glued to my own poster... 3) Monday during the demo: M

Re: Ambiguous names. was: Re: URL +1, LSID -1

2007-07-20 Thread Eric Jain
Alan Ruttenberg wrote: "Remember that one of the reasons this came up was the claim that the Uniprot URI should be used to identify a set of real things." OK, I think that describes my current point of view. I get confused when I read statements that sound like "x means the same thing in in

Re: Ambiguous names. was: Re: URL +1, LSID -1

2007-07-22 Thread Eric Jain
Phillip Lord wrote: Well, swissprot refers to isoforms I think. Push comes to shove, just use the sequence. Note that we do have stable identifiers for isoforms, for example in http://beta.uniprot.org/uniprot/P00750.rdf you can find URIs for the isoforms we describe, e.g. http://purl.unipro

Re: Paper: URI Identity Management for Semantic Web Data Integration and Linkage

2007-08-02 Thread Eric Jain
Eric Neumann wrote: I leave it to the group to discuss the possible value of this paper to our ongoing URI activity... The problem I guess is that what is sameAs depends on the context, but I don't quite agree that this means we should replace all sameAs statements. If you don't agree with

Re: the HCLS and URI schemes

2007-08-06 Thread Eric Jain
Michel_Dumontier wrote: An analysis into these motivations and summarizing existing mechanisms would surely be a useful contribution for current and future adopters. Agreed! I believe that's work in progress: http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/Tasks/URI_Best_Practices Maybe at o

Re: making statements on the semantic web

2007-08-08 Thread Eric Jain
Michel_Dumontier wrote: Great! I think the creation of a comprehensive registry that mints and publicizes URIs is well worth pursuing. Perhaps a few of us, including Bio2RDF, can forge ahead and do what needs to be done? Wasn't that the idea behind the HCLS PURL scheme? If you feel that some n

Re: Does follow-your-nose apply in the enterprise? was: RDF for molecules, using InChI

2007-08-08 Thread Eric Jain
Michel_Dumontier wrote: Sure, HTTP URIs can be used as identifiers, but why would I mint arbitrary HTTP URIs when I can use a scheme that has no resolution protocol implicitly or explicitly associated with it? Indeed, especially since at the moment HTTP URIs will cost you at least 5 cents a

Re: Does follow-your-nose apply in the enterprise? was: RDF for molecules, using InChI

2007-08-09 Thread Eric Jain
Xiaoshu Wang wrote: [...] why not designate a top domain name like "tmp" to signal this. For instance, use "http://example.com.tmp/doc" as the temporary URI for the eventual resource of "http://example.com/doc". There are in fact already several reserved TLDs such as .test and .invalid,
