Stefan,
thanks a lot.
The supplied first link does not seem to lead to the ER diagram but rather to
the help file.
Cheers,
Chris
Stefan Kuhn wrote:
Hi Claus,
thanks for your interest in our work. I will try to make the approach used in
NMRShiftDB a bit clearer and point to the relevant classes. If you need
detailed explanations of the code, please ask.
- The database: NMRShiftDB has a database which splits a molecule into tables
for molecule, atom, bond and their connections. The atom and bond tables have
pretty much the same purpose as the atom and bond arrays in the cdk molecule
object. You can find an ER diagram here:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/nmrshiftdb/nmrshiftdb/doc/nmrshiftdbhelp.pdf?rev=HEAD&content-type=application/pdf
- NMRShiftDB uses an OR mapper called Torque. This has objects for every
table. In NMRShiftDB their names start with DB, so there is a DBMolecule
object representing the molecule table.
- The save and load process: There is code (not in DBMolecule but in
SubmitingData, which is bad design and could be changed) which takes a cdk
molecule. It then checks for duplicates via SMILES and either returns the
existing DBMolecule or saves the data to the molecule, atom, bond etc. tables
and returns the new DBMolecule. When loading, you get a DBMolecule in some
way and call getAsCDKMolecule on it, which reads atom, bond etc. into a cdk
molecule and returns it.
- Searches: the exact/similarity/substructure search is done in
GeneralUtils.executeSearch. This method does all searches, so it looks
complicated. It uses SMILES and simple SQL for exact searches, fingerprints
and a UDF for similarity search, and performs an isomorphism check (via cdk
objects/methods) for exact substructure search.
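
The fingerprint part of the search bullet can be illustrated with a small
sketch. It is not NMRShiftDB code: the class and method names are
hypothetical, the Tanimoto coefficient is assumed as the similarity measure
(the mail does not say which measure the UDF computes), and the bit-subset
screen before the isomorphism check is likewise an assumption about how
fingerprints are commonly used. Fingerprints are plain java.util.BitSet
values here.

```java
import java.util.BitSet;

/** Hypothetical helper: fingerprint similarity and substructure pre-screen. */
final class FingerprintSketch {

    private FingerprintSketch() {}

    /** Tanimoto coefficient: bits set in both / bits set in either. */
    static double tanimoto(BitSet a, BitSet b) {
        BitSet both = (BitSet) a.clone();
        both.and(b);
        BitSet either = (BitSet) a.clone();
        either.or(b);
        int union = either.cardinality();
        // Convention chosen for this sketch: two empty fingerprints count as identical.
        return union == 0 ? 1.0 : (double) both.cardinality() / union;
    }

    /**
     * Pre-screen: the query can only be a substructure of the target if
     * every query bit is also set in the target. A true result still
     * needs a real isomorphism check afterwards.
     */
    static boolean mayContain(BitSet target, BitSet query) {
        BitSet missing = (BitSet) query.clone();
        missing.andNot(target);
        return missing.isEmpty();
    }

    /** Builds a fingerprint from bit positions (for the demo only). */
    static BitSet bits(int... positions) {
        BitSet set = new BitSet();
        for (int p : positions) {
            set.set(p);
        }
        return set;
    }

    public static void main(String[] args) {
        BitSet query = bits(1, 4, 7);
        BitSet target = bits(1, 4, 7, 9);
        System.out.println(tanimoto(query, target));   // 3 shared / 4 total = 0.75
        System.out.println(mayContain(target, query)); // true
        System.out.println(mayContain(query, target)); // false
    }
}
```

The screen only rules candidates out cheaply; molecules that pass it would
still go through the isomorphism check mentioned above.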
I hope this makes things clear. NMRShiftDB really contains code for
maintaining a structure database. The code is also used on
http://www.chemistry-development-kit.org/ for the database, but no searches
are implemented there. That site uses a Torque library a bit newer than the
one on nmrshiftdb.org and does not have the spectrum part. It could easily be
extended with some search code from nmrshiftdb to form a structure database
library.
If you think this code could be helpful, please let me know and we should
start to extend the cdkweb code with searches.
Stefan
On Thursday 29 December 2005 11:28, Claus Stie Kallesøe wrote:
Hi Christoph,
and thank you for the answer. Happy to hear that you would like to help us
out. My idea actually was to reuse some code from NMRshiftdb.
Here is the story:
Today we receive the user input structure as a molfile from Marvin. We
then pass that on to a JChemSearch object. This object (together with an
updatehandler) pretty much takes care of everything related to substructure
searching, duplicate checks etc. JChem therefore of course has requirements
for the structure table.
We would now like to try to port first to CDK and then to JChemPaint to make
the chemicalinventory truly open source. I read the journal articles about
the design of CDK as well as NMRshiftdb, and my plan was to get inspiration
for the structure tables as well as code for searching and storing.
I have found the sql statements for the tables but I really do have a hard
time finding the code. So if you could point me in the right direction that
would be a great help.
I do understand that there is going to be more coding from our hands in
order to perform searches and inserts using CDK.
All we basically want is to do a duplicate check during insert of new
structures (to keep the structure table unique (in 2D)) and be able to
perform exact match, similarity and substructure searches depending on user
choices.
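
A minimal sketch of such a duplicate check on insert, assuming each
structure has already been reduced to a canonical SMILES string that serves
as its 2D identity. The class and method names are hypothetical, and an
in-memory map stands in for the structure table; in a real database the
same role would typically be played by a unique index on the SMILES column.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a structure table keyed by SMILES. */
final class StructureStoreSketch {

    private final Map<String, Integer> idBySmiles = new HashMap<>();
    private int nextId = 1;

    /**
     * Returns the id of the already stored structure if this canonical
     * SMILES is known, otherwise stores it and returns the new id.
     */
    int saveOrGet(String canonicalSmiles) {
        Integer existing = idBySmiles.get(canonicalSmiles);
        if (existing != null) {
            return existing; // duplicate: reuse the existing row
        }
        int id = nextId++;
        idBySmiles.put(canonicalSmiles, id);
        return id;
    }

    /** Number of distinct structures stored so far. */
    int size() {
        return idBySmiles.size();
    }

    public static void main(String[] args) {
        StructureStoreSketch store = new StructureStoreSketch();
        System.out.println(store.saveOrGet("CCO"));      // 1 (new)
        System.out.println(store.saveOrGet("c1ccccc1")); // 2 (new)
        System.out.println(store.saveOrGet("CCO"));      // 1 (duplicate)
        System.out.println(store.size());                // 2
    }
}
```

The check is only as good as the canonicalisation: two different SMILES
strings for the same molecule (for example "CCO" and "OCC") would be
stored twice unless both are canonicalised first.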
Also I think I now (after reading through most of the API yesterday)
understand the concept of ChemObjects, AtomContainer, Bond and Atoms. But I
still don't see how I then break up a drawn molecule (molfile) in order to
store the fingerprints, smiles, bonds and atoms etc.
Again, if you are able to point me to the code so I can see examples in
order to fully understand I think we can handle the hard work.
Another option is of course if any of you would like to join us. I have
made a new CVS module (chemicalinventory-cdk), where I will start to edit
the current code to use CDK. Just let me know and I will register you.
Thank you
claus
Christoph Steinbeck <[EMAIL PROTECTED]> wrote:
Those classes are indeed "prehistoric material" by CDK standards. I guess
they have not been used or maintained for years.
Claus, if you are interested in writing structures to and from databases,
we should discuss the issues.
As you will know, we have written quite some code for doing this within the
NMRShiftDB project. The code is part of the NMRShiftDB code base, which
uses a lot of CDK code.
Cheers,
Chris
Egon Willighagen wrote:
On Wednesday 28 December 2005 16:22, Claus Stie Kallesøe wrote:
I have been reading through the cdk library here:
http://cdk.sourceforge.net/api/
in order to find out how I can use the library.
As we use a MySql database to store the structures I wanted to use the
DBReader.class, as the description tells me: "Reader that can read from a
relational database that can be accessed through JDBC."
But I can't find the class in the cdk-20050826.jar file I have?
The DBReader, DBWriter and DBAdmin classes are in the CDK module
'orphaned', meaning that no one was maintaining them and they did not
seem to be used.
The sources, however, can be found in CVS at:
http://cvs.sourceforge.net/viewcvs.py/cdk/cdk/src/org/openscience/cdk/database/
Under org.openscience.cdk.database I only see the XindiceReader.
Can you tell me where I can find the DBReader?
Note that the DBReader and DBWriter are closely tied together, and store
the molecule's CML string in the 'molecules' table:
ps = con.prepareStatement("INSERT INTO molecules VALUES('', ?)");
This likely does not match the setup you are working with.
Nevertheless, the source should give you some insight in how I used MySQL
in the past.
Egon
--
Priv. Doz. Dr. Christoph Steinbeck ([EMAIL PROTECTED])
Head of the Research Group for Molecular Informatics
Cologne University BioInformatics Center (http://almost.cubic.uni-koeln.de)
Zülpicher Str. 47, 50674 Cologne
Tel: +49(0)221-470-7426 Fax: +49 (0) 221-470-7786
What is man but that lofty spirit - that sense of enterprise.
... Kirk, "I, Mudd," stardate 4513.3..
_______________________________________________
Cdk-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/cdk-user